
An Application of Geographic Information System (GIS) to Model Commercial Land Use Development

Mohd Faris, Ruslan Rainis
faris@upsi.edu.my


Introduction
Generally, a land use model is a simplified representation of the planner’s understanding of the land development process in a particular planning situation, i.e., a particular planning jurisdiction, over a particular time period, for a particular planning problem. Models are idealized in the sense that they are considerably less complex and subtle than the situation being represented; only the most relevant factors and processes of land use change are included. Land use models can also be useful in the problem analysis phase. By simulating urban behavior, they help the planner anticipate where future problems will arise. They also provide a systematic statement of the relationships among elements of the problem situation and thereby improve the planner’s understanding of the forces of change and the relative importance of various factors in the change process. Urban planners have been using computer models since the early 1960s, when computers became available for public use, and numerous planning models have been developed since then. However, it is only recently that planning models have become more commonly used. Recent developments in computer technology such as geographic information systems (GIS) have created new opportunities for model development and use. GIS provides various capabilities for the input, management, analysis and display of spatially referenced data. It provides a framework for integrating spatial and non-spatial data from different formats, sources and time periods. More importantly, GIS provides special capabilities for spatial analysis. With these capabilities, it is relatively easy to generate and incorporate various spatial variables in urban land use modeling. However, these capabilities have not been fully utilised to date, especially in the context of developing countries.

The purpose of this paper is to describe an initial attempt to develop a spatial commercial land use model with special reference to Seberang Perai Tengah, Penang (Malaysia). To the knowledge of the authors, there have been very few attempts to develop land use development models specific to the Malaysian environment. This paper is therefore intended to pave the way for subsequent studies, development and use of urban land use models in the country. The model was developed using discriminant analysis. It is an extension of the model developed earlier by the authors (Ruslan and Narimah, 1996). GIS was used to generate the spatial data required for developing and testing the model.

The first section of the paper gives a conceptual development of the model. This is followed by a brief discussion on the methodology. The resulting model and evaluation of the predictive model performance are then discussed.

The Conceptual Framework of The Proposed Model
The present study attempts to model commercial land use development. This activity was chosen because commercial land use is a dynamic phenomenon, changing over space and time. The approach used to develop the commercial land use model is to design the proposed model and then test its effectiveness.


Figure 1: Schematic diagram of the cellular automata modeling approach
  1. The formulation of the proposed model
    For this study, the proposed model was developed based on the cellular geography model (Tobler, 1979). This model was selected because it provides an effective way of simulating the process of land use change as well as a means of evaluating alternative planning scenarios. Figure 1 shows how the model evolves, starting from one cell and then expanding to an area. The development of a cell located near other developed cells is controlled by transition rules. Therefore, the important task in this modeling process is to determine the function that will be used as the transition rule controlling changes in commercial land use development. The model was developed using discriminant analysis, a technique selected for its capability to discriminate objects into different groups based on the characteristics of each group. The core of the model is therefore to formulate a function for the transition rules. The basic form of the model is as follows:

    $$A = a + a_1 b_1 + a_2 b_2 + a_3 b_3 + \dots + a_N b_N = a + \sum_{i=1}^{N} a_i b_i \qquad (1)$$

    A = dependent variable
    a = constant
    $a_i$ = coefficient of the i-th independent variable
    $b_i$ = i-th independent variable
    N = number of independent variables
    So the basic form of the land use development model is as follows:

    $$\Delta G_{t-t_1} = a + a_1 b_{t1} + a_2 b_{t2} + a_3 b_{t3} + \dots + a_N b_{tN} = a + \sum_{i=1}^{N} a_i b_{ti} \qquad (2)$$

    namely:
    $\Delta G_{t-t_1}$ = commercial land use change from time t to t1
    a = constant
    $b_{t1} \dots b_{tN}$ = factors that determine development, especially commercial land use, at time t
    N = number of factors considered
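The linear form shared by equations (1) and (2) can be sketched in a few lines of Python. This is only an illustration: the function name `discriminant_score` and the sample numbers are assumptions made for the example, not values from the study.

```python
def discriminant_score(constant, coefficients, factors):
    """Return a + sum(a_i * b_i) for one land parcel."""
    if len(coefficients) != len(factors):
        raise ValueError("need exactly one coefficient per factor")
    return constant + sum(a * b for a, b in zip(coefficients, factors))


# Hypothetical example with N = 3 factors (all values are made up).
score = discriminant_score(-1.0, [0.5, -0.2, 0.1], [2.0, 5.0, 10.0])
print(score)  # -1.0 + 1.0 - 1.0 + 1.0
```

The same function applies to any number of factors, as long as the coefficients and the factor values are listed in the same order.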

    The proposed model attempts to identify areas with the potential to be developed at time t+m based on the supporting factors available at time t. Since commercial land use is only one of the major land uses that usually exist in an urban area, the model needs to classify the study area into three development groups:

    Group 1 - Undeveloped areas
    Group 2 - Areas (potentially to be) developed for commercial development, and
    Group 3 - Areas (potentially to be) developed for other land use development.
    Therefore, the model requires a technique that is capable of classifying objects. Several statistical techniques can be used for this purpose. Since the dependent variable, development type at a given year, is categorical and has more than two categories, discriminant function analysis (DFA) was selected to produce the predictive model. This technique discriminates objects into different groups based on the characteristics of each group (Hair et al., 1992). It is therefore expected to discriminate areas with the potential for commercial development from areas of other development types based on the changing factors.

  2. Factors Influencing Commercial Land use Development
    The development of commercial land use is influenced by numerous factors. These factors can be classified into two categories, namely physical characteristics and spatial characteristics (Landis and Zhang, 1997; Landis, 1994; MPSP Structure Plan, 1993; Ruslan, 1991a; Lillesand and Keifer, 1987; Chapin and Kaiser, 1979). Among the physical factors commonly studied in commercial land use development are topography, geology and soils. Areas with slopes exceeding 10 percent are not suitable for commercial development. The most suitable areas for commercial land use development have slopes of 2 to 6 percent (Lillesand and Keifer, 1987). A strong geologic foundation ensures that the selected areas are not susceptible to landslides and similar hazards.

    Sites selected for commercial land use development should also have a good drainage system, so that they are not susceptible to flooding. Apart from these factors, land availability is also important to ensure that planned development can be implemented as scheduled.

    Commercial land use development is also determined by spatial characteristics. These include neighborhood, site suitability, good economic status, very high population densities, infrastructure availability and accessibility. We would expect site land uses to be strongly affected by the pattern of neighboring or adjacent uses. We would expect, for example, that a vacant site surrounded by residential uses would be more likely to be developed into commercial use. Likewise, we would expect that a vacant site surrounded by commercial uses would more likely be developed into commercial use. Sites selected for commercial land use development should also have favourable environmental qualities and should be located away from conflicting land uses such as wetlands. Distance or proximity factors are also very important in selecting suitable areas for commercial land use development. To provide maximum accessibility to supporting activities such as residential and industrial areas, commercial areas are usually located in close proximity to major roads.
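The neighborhood effect described above is often expressed as the share of a parcel's neighbors in each land use class. The following is a minimal sketch of that idea, assuming a hypothetical adjacency list and land use lookup; the parcel IDs, the data layout and the use of fractions rather than percentages are all assumptions, and this is not the program used in the study.

```python
from collections import Counter


def neighbor_shares(parcel_id, adjacency, land_use):
    """Share of a parcel's neighbors in each land use class (0.0 to 1.0)."""
    neighbors = adjacency.get(parcel_id, [])
    if not neighbors:
        return {}
    counts = Counter(land_use[n] for n in neighbors)
    return {use: count / len(neighbors) for use, count in counts.items()}


# Hypothetical parcels: P1 is vacant and touches four neighbors.
adjacency = {"P1": ["P2", "P3", "P4", "P5"]}
land_use = {
    "P1": "vacant",
    "P2": "commercial",
    "P3": "commercial",
    "P4": "residential",
    "P5": "vacant",
}

shares = neighbor_shares("P1", adjacency, land_use)
print(shares.get("commercial", 0.0))   # 0.5: two of four neighbors
print(shares.get("residential", 0.0))  # 0.25: one of four neighbors
```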

  3. Model Evaluation
    The effectiveness of the proposed model needs to be tested to evaluate its predictive capability. There are a number of approaches to model evaluation. One method is to divide a data set into two parts, one used for model development and the other for evaluating the accuracy of the proposed model in predicting commercial development. The other approach is to use two different data sets, one for model building and the other for model evaluation. However, the second method requires a large data set (Norusis, 1993). This study uses the first method.
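The first approach, dividing one data set into a model-building part and an evaluation part, can be sketched as a simple random split. The function name `split_sample`, the 50/50 fraction and the fixed seed are assumptions for illustration; the paper does not state how such a split would be made.

```python
import random


def split_sample(parcel_ids, build_fraction=0.5, seed=42):
    """Randomly split parcel IDs into model-building and evaluation sets."""
    shuffled = list(parcel_ids)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = int(len(shuffled) * build_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical parcel IDs 0..99.
build_set, evaluation_set = split_sample(range(100))
print(len(build_set), len(evaluation_set))  # 50 50
```

Keeping the two parts disjoint matters: a parcel used to fit the functions should not also be used to judge their accuracy.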

Table 1: Factors influencing commercial land use development and their data sources
Name of Variable | Required Data Elements
Proximity to urban centers | Town maps
Proximity to commercial areas | Land use
Proximity to primary roads | Roads
Proximity to secondary roads | Roads
Proximity to industrial areas | Land use
Proximity to residential areas | Land use
Proximity to agricultural areas | Land use
Spatial index of neighborhood development | Land use and land parcels
% of neighbors developed as commercial areas | Land use and land parcels
% of neighbors developed as residential areas | Land use and land parcels
% of neighbors developed as industrial areas | Land use and land parcels


Methodology
  1. Study Area
    Seberang Perai Tengah, Penang (Malaysia) was selected as the site for developing and testing the proposed model (Figure 2). This area was selected for several reasons. Firstly, Seberang Perai Tengah has grown quite rapidly since the early 1970s and is expected to continue growing in the near future. It is located in the centre of the Northern Region of Peninsular Malaysia and has the potential to become an important centre in the region. Secondly, spatial data (planning permit records) are available for the area, which saves the time and cost of data capture for the study. The area is part of a pilot area for the Penang State GIS (PEGIS), and several digital data sets such as parcel lots, public facilities and road networks were available from that project.


    Figure 2: Location of the study area – Seberang Perai Tengah, Penang

  2. Factors Influencing Commercial Development Considered in the Model
    In this study, ten spatial factors and one vacant land use factor are included in the model (Table 1). The ten spatial factors are as follows:
    • proximity to urban centres
    • proximity to commercial areas
    • proximity to residential areas
    • proximity to primary roads
    • proximity to secondary roads
    • proximity to industrial areas
    • percent of neighbors developed as commercial areas
    • percent of neighbors developed as residential areas
    • percent of neighbors developed as industrial areas
    • spatial index of neighborhood development
    Physical factors such as topography and geology are not relevant in this study because the average elevation is less than 50 metres and almost 95% of the area has slopes of 1 to 5 percent. The entire study area is covered by fluvial formations (Seberang Perai Municipal Council, 1993).

  3. Spatial Database
    This study covers an area of about 23679.6 hectares consisting of 80662 land parcels. For the analysis, however, only 5823 land parcels were used, selected as a random sample. Unlike other models, the proposed model is based on land parcels. Fundamental to this study are temporal data on land uses, infrastructure and related amenities. Information on existing land use is useful in determining land potentially available for development. Information on past land use, when compared with existing land use, is useful for detecting land use changes. It can also indicate growth trends and directions as well as the development that has occurred within a given time period. Two sets of data were used in this study: 1992-1994 and 1995-1998. Figures 3, 4 and 5 show the basic land use maps for 1992, 1994 and 1998. The probability that a vacant site will be developed should also depend on roads, so all road information was digitised for database development. Figure 6 shows the basic map of major roads and urban centres of the study area.

    GIS spatial operations were used to generate the spatial data required for model development. The spatial analyses used in the study include distance and proximity analysis, buffer generation, selection/classification, map overlay and neighborhood analysis. Neighborhood analysis in vector format was carried out using a separate computer program because ArcView 3.1 (like many other vector-based GIS packages) has limited built-in operations for neighborhood analysis in vector format.
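As an illustration of the distance and proximity analysis mentioned above, the sketch below computes the distance from a parcel centroid to the nearest road polyline in planar coordinates. It is not the ArcView procedure used in the study; the function names, the use of centroids and the coordinates (assumed to be in metres) are assumptions for the example.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b (planar coordinates)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its end points.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def proximity_to_roads(centroid, roads):
    """Distance from a parcel centroid to the nearest road polyline."""
    return min(
        point_segment_distance(centroid, start, end)
        for road in roads
        for start, end in zip(road, road[1:])
    )


# Hypothetical road polylines.
roads = [
    [(0.0, 0.0), (100.0, 0.0)],                 # a road along y = 0
    [(50.0, 0.0), (50.0, 80.0), (90.0, 80.0)],  # a road with one bend
]
nearest = proximity_to_roads((20.0, 30.0), roads)
print(nearest)  # 30.0
```

The same routine could be run separately against primary roads, secondary roads or the boundaries of other land use classes to produce one proximity variable per class.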


    Figure 3: Land use of the study area – 1992


    Figure 4: Land use of the study area – 1994


    Figure 5: Land use of the study area – 1998

    As described earlier, the proposed model uses discriminant analysis to discriminate three groups of development: undeveloped areas, commercial development, and other types of development. The model was developed based on a sample of 5823 land parcels, of which 2973 (51.1%) belong to group 1, 793 (13.6%) to group 2 and 2057 (35.3%) to group 3. The "SPSS for Windows Release 10.5" statistical package was used to carry out this analysis. The package provides six different methods for calculating the discriminant score. The first method enters the independent variables together and the other five are step-wise methods, i.e. Wilks' Lambda, Unexplained Variance, Mahalanobis Distance, Smallest F-ratio and Rao's V (Norusis, 1993). In this study, the Smallest F-ratio step-wise method was used, in which variables are entered one at a time according to their F statistics. The significance level chosen for F was 0.05, which means that only variables significant at the 0.05 level enter the analysis. All the data required for the modeling were transferred from ArcView 3.1 to the SPSS statistical package in dBASE file format.

Figure 6: Roads and urban centres of the study area

Table 2: Coefficients of the unstandardised discriminant functions
Variable | Description | Function 1 | Function 2
X1 | Proximity to agricultural areas | .00029 | .00011
X2 | Proximity to primary roads | -.00015 | .00044
X3 | Proximity to secondary roads | -.00011 | -.00010
X4 | Proximity to urban centres | -.00015 | -.00001
X5 | Proximity to commercial areas | .00008 | -.00020
X6 | Proximity to residential areas | .00033 | .00146
X7 | Proximity to industrial areas | .00010 | .00009
X8 | Spatial index of neighborhood development | .00913 | -.09274
X9 | % of neighbors developed as residential areas | .34549 | -.14842
X10 | % of neighbors developed as commercial areas | -.36098 | -.75505
X11 | % of neighbors developed as industrial areas | .49056 | -.75505


Resulting Model
From the step-wise discriminant analysis, it was found that all factors were important in discriminating the three development groups. Table 2 shows the coefficients of the unstandardised discriminant functions of the original model. These coefficients, together with the constants, can be substituted into equation (2) and used to calculate the discriminant score for each function. The equations can be stated as follows:

Function 1 = -.72608 + .00029X1 + .00015X2 - .00011X3 - .00015X4 + .00008X5 + .00033X6 + .00010X7 + .00913X8 + .34547X9 - .36098X10 + .49506X11 (4.1)

Function 2 = -2.05831 + .00011X1 - .00044X2 - .00010X3 - .00001X4 - .00020X5 + .00146X6 + .00009X7 + .09274X8 - .14842X9 - 1.44904X10 - .75505X11 (4.2)
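To show how the two functions are applied to a single parcel, here is a minimal Python sketch. The constants and coefficients are copied from Eq. 4.1 and 4.2 as printed above; a few of them differ in sign or digits from Table 2, so the numbers should be treated as illustrative. The parcel values, their units (metres for proximities, fractions for neighbor shares) and the function name `score` are assumptions.

```python
# Constants and coefficients for X1..X11, copied from Eq. 4.1 and 4.2 as printed.
FUNCTION_1 = (-0.72608, [0.00029, 0.00015, -0.00011, -0.00015, 0.00008,
                         0.00033, 0.00010, 0.00913, 0.34547, -0.36098, 0.49506])
FUNCTION_2 = (-2.05831, [0.00011, -0.00044, -0.00010, -0.00001, -0.00020,
                         0.00146, 0.00009, 0.09274, -0.14842, -1.44904, -0.75505])


def score(function, x):
    """Discriminant score: constant plus the weighted sum of X1..X11."""
    constant, coefficients = function
    if len(x) != len(coefficients):
        raise ValueError("expected one value for each of X1..X11")
    return constant + sum(c * v for c, v in zip(coefficients, x))


# Hypothetical parcel (X1..X11); every value here is made up.
parcel = [1500.0, 200.0, 100.0, 3000.0, 400.0, 150.0, 2500.0,
          0.6, 0.5, 0.25, 0.0]
f1 = score(FUNCTION_1, parcel)
f2 = score(FUNCTION_2, parcel)
print(round(f1, 5), round(f2, 5))
```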

Table 3: Group means derived from the discriminant analysis
Group | Function 1 | Function 2
1 | .89696 | .02073
2 | -.82942 | -.98905
3 | -.97662 | .35134


The discriminant scores derived from the two functions can be used to predict the grouping of unclassified cases. There are several ways to determine the grouping. One way is to compare the discriminant scores with the group means. Table 3 shows the group means for the original model. From the table, it can be seen that Function 1 is useful for discriminating group 1 from groups 2 and 3, because the mean for group 1 is positive while the means for groups 2 and 3 are negative. On the other hand, Function 2 is useful for discriminating group 2 from groups 1 and 3, because the means for groups 1 and 3 are positive while the mean for group 2 is negative. Cases with a positive discriminant score on Function 1 (score > 0.0) are therefore classified as belonging to group 1. Leaving out the cases in group 1, the cases with a negative discriminant score on Function 1 are classified as belonging to group 2 or group 3.
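Two simple ways of turning a pair of discriminant scores into a group can be sketched as follows. The group means are copied from Table 3; the nearest-mean rule and the sign rule are simplified illustrations of the comparison described above, not the exact procedure of the statistical package, which would typically also account for prior probabilities. The function names and the sample scores are assumptions.

```python
import math

# Group means on (Function 1, Function 2), copied from Table 3.
GROUP_MEANS = {
    1: (0.89696, 0.02073),
    2: (-0.82942, -0.98905),
    3: (-0.97662, 0.35134),
}


def classify_by_nearest_mean(f1, f2):
    """Assign the group whose mean is closest in discriminant space."""
    return min(
        GROUP_MEANS,
        key=lambda g: math.hypot(f1 - GROUP_MEANS[g][0], f2 - GROUP_MEANS[g][1]),
    )


def classify_by_sign(f1, f2):
    """Sign rule following Table 3: F1 > 0 is group 1; otherwise F2 < 0 is group 2."""
    if f1 > 0.0:
        return 1
    return 2 if f2 < 0.0 else 3


# Hypothetical discriminant scores for three parcels.
for f1, f2 in [(1.2, 0.1), (-0.9, -1.1), (-1.0, 0.4)]:
    print(f1, f2, classify_by_nearest_mean(f1, f2), classify_by_sign(f1, f2))
```

For scores close to the group means the two rules agree; they can differ for parcels whose scores lie near zero on either function.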

Table 4: Classification accuracy of the original model
Actual Group | No. of cases | Predicted Group 1 | Predicted Group 2 | Predicted Group 3
Group 1 | 2973 | 2586 (87.0 %) | 186 (6.3 %) | 201 (6.8 %)
Group 2 | 793 | 146 (18.4 %) | 467 (58.9 %) | 180 (22.7 %)
Group 3 | 2057 | 431 (21.0 %) | 315 (15.3 %) | 1311 (63.7 %)
Percentage of group correctly classified: 74.9 %.

Table 4 shows the accuracy of the original model. The accuracy for group 1 is high (87.0%) compared with group 3 (63.7%) and group 2 (58.9%). For group 2, 22.7% of cases were misclassified as group 3, which suggests that the factors identified do not clearly differentiate commercial land use development from other development. The accuracy for group 3 does not differ much from that for group 2: 63.7% of cases were correctly classified and 15.3% were classified as group 2. The overall accuracy of the original model derived from discriminant function analysis is relatively high, i.e. 74.9%. This means that, overall, the factors identified are able to discriminate between the groups.

Evaluation of Commercial Land Use Model
The original commercial land use development model was developed using the land use changes between 1992 and 1994 as input. To evaluate its effectiveness, the model was used to predict the commercial development (as well as the other two groups) that occurred between 1995 and 1998 based on the factors available in 1994. A total of 5636 undeveloped lots were used in the evaluation. It was known that by 1998, 450 lots (8.0%) had been developed for commercial use (Group 2), 2753 lots (48.8%) for other urban development (Group 3) and the rest were still undeveloped. To predict the type of development for an unknown land parcel using the original DFA model, the 1994 data values of the independent variables were used to calculate the discriminant scores with the two functions (Eq. 4.1 and 4.2) calibrated earlier. As mentioned earlier, Function 1 is useful for discriminating non-developed areas (Group 1) from urban development (Groups 2 and 3), while Function 2 is useful for discriminating Group 2 from Groups 1 and 3. Following the signs of the group means in Table 3, land parcels with a positive score on Function 1 were assigned to Group 1 (non-developed areas), and Function 2 was then used to assign the remaining unclassified samples to Group 2 or Group 3: land parcels with a negative score on Function 2 were assigned to Group 2 (commercial development) and the rest to Group 3.

Table 5: Classification accuracy of the model evaluation
Actual Group | No. of cases | Predicted Group 1 | Predicted Group 2 | Predicted Group 3
Group 1 | 2433 | 2403 (98.8 %) | 30 (1.2 %) | 0 (0 %)
Group 2 | 450 | 4 (0.9 %) | 446 (99.1 %) | 0 (0 %)
Group 3 | 2753 | 46 (1.7 %) | 0 (0 %) | 2707 (98.3 %)
Percentage of group correctly classified: 98.6 %.

Table 5 shows the result of the evaluation. The model accurately predicted about 98.6% of the development in the sample area between 1995 and 1998, considerably better than could be achieved by chance (50%). Further examination of the classification results reveals that the accuracy of the group classification was not in line with the original model: in the evaluation, the accuracies for Groups 1, 2 and 3 were all high, namely 98.8%, 99.1% and 98.3% respectively.
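The per-group and overall accuracies reported for Table 5 follow directly from the classification counts. The short sketch below recomputes them; the counts are copied from Table 5, while the dictionary layout and function names are assumptions for the example.

```python
# Rows: actual group; columns: predicted group (counts copied from Table 5).
TABLE_5 = {
    1: {1: 2403, 2: 30, 3: 0},
    2: {1: 4, 2: 446, 3: 0},
    3: {1: 46, 2: 0, 3: 2707},
}


def group_accuracy(matrix, group):
    """Percentage of a group's cases that were predicted as that group."""
    row = matrix[group]
    return 100.0 * row[group] / sum(row.values())


def overall_accuracy(matrix):
    """Percentage of all cases lying on the diagonal of the matrix."""
    correct = sum(matrix[g][g] for g in matrix)
    total = sum(sum(row.values()) for row in matrix.values())
    return 100.0 * correct / total


for g in TABLE_5:
    print(g, round(group_accuracy(TABLE_5, g), 1))  # 98.8, 99.1, 98.3
print(round(overall_accuracy(TABLE_5), 1))          # 98.6
```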

Discussion and conclusions
This paper has described an initial attempt to provide a framework for the development and evaluation of a commercial land use development model, especially in the context of developing countries. Models have been used in land use planning since the early 1960s. However, their use and development have become more viable in recent years due to rapid advances in information technology such as geographic information systems (GIS), which provide additional capabilities for spatial data handling. In this study, a spatial commercial land use development model was developed using discriminant function analysis. The model was developed based on land use changes from 1992 to 1994 and the factors that existed in 1992. The overall accuracy of the original model is quite good, i.e. about 74.9%. The same model accurately predicted 98.6% of overall development from 1995 to 1998. For commercial development in the original model, however, the accuracy was only 58.9% (Table 4), with high confusion with other types of urban development. The relatively low accuracy may be due to factors not included in the model, such as land ownership and land value, which are important in determining the availability of land for development. Therefore, it is suggested that future attempts include other relevant variables in the model. Another possible reason may be changes in government policy in the 1980s that affected the land development structure of the study area but were not accounted for in the original model, such as the construction of a major highway passing near the study area. Such changes were not captured in the model because of data availability problems. It is hoped that such developments will encourage urban planners to use and further develop urban models in the future.

References
  • Hair, Joseph F. Jr., Rolph E. Anderson, Ronald L. Tatham, and William C. Black (1992). Multivariate Data Analysis with Readings, 3rd ed., New York: Macmillan Publishing Company.

  • Landis, J.D, and Ming, Zhang (1997). Modelling Urban Land Use: The Next Generation of the California Urban Futures Model. http://ncgia.ucsb.edi/conf/landuse97/papers/landis_john/paper.html

  • Landis, J.D. (1994). “The California Urban Model: a New Generation of Metropolitan Simulation Models”. Environment and Planning B: Planning and Design, 21, 399-420.

  • Lillesand, Thomas M. and Ralph W. Keifer (1987). Remote Sensing and Image Interpretation. Second Edition, New York: John Wiley & Sons.

  • Norusis, M.J. (1985) SPSSX: Advanced Statistics Guide, Chicago: SPSS Inc.

  • Ruslan Rainis et. al (1995). "The Development of Neighborhood Analysis in Vector-Based GIS". Paper presented at the Seminar on the Integration of GIS and Remote Sensing for Applications in ASEAN Region, March, Kuala Lumpur.

  • Ruslan Rainis and Narimah Samat (1996). “Modelling Urban Residential Landuse Development using GIS”. Journal HBP, USM, 3, 10-18.

  • Seberang Perai Municipal Council (1993). Seberang Perai Structure Plan. Penang.

  • Tobler, W.R. (1979). "Cellular Geography". In S. Gale and G. Olsson (eds.), Philosophy in Geography, Dordrecht, Holland: D. Reidel Publishing Company.
© GISdevelopment.net. All rights reserved.