Artificial neural networks for improvement of classification accuracy in Landsat ETM+ images



Meysam Argany
Department of Surveying Engineering
Faculty of Engineering
University of Tehran
Tehran, Iran.
margany@geomatics.ut.ac.ir

Jalal Amini
Department of Surveying Engineering
Faculty of Engineering
University of Tehran
Tehran, Iran.

M. Reza Saradjian
Department of Surveying Engineering
Faculty of Engineering
University of Tehran
Tehran, Iran.


Abstract
Remote sensing data are often used in land cover and land use applications. However, classes of interest are often imperfectly separable in the feature space provided by the spectral data. Neural networks (NNs) are increasingly applied to the classification of satellite images. Because they make no assumption about the underlying probabilistic model, the networks are capable of forming highly non-linear decision boundaries in the feature space. Training plays an important role in NN performance. The objective of this paper is to develop an artificial neural network (ANN) for classifying Landsat ETM+ images into various types of land use, especially in urban areas. The land-use classes defined for this purpose include city, water, bare rock, vegetation, and various types of soil. The test area is an image of Karaj, Iran.

1 Introduction
Image classification is a main component of remote sensing applications. It is the process of creating thematic maps from satellite imagery. A thematic map is an informational representation of an image that shows the spatial distribution of a particular theme. An example of a theme is vegetation type, consisting of trees, crops, grasslands, etc. Finer sub-themes can also be defined inside a theme to refine the classification, such as classifying trees as deciduous or evergreen. Image classification relies on the spectral distinctness of classes or on their spectro-temporal variability. It also depends on the context of classification. For example, two features with nearly identical spectral signatures for vegetation could be assigned to the classes 'forest' and 'crops' depending on whether the area in the image has irregular or straight boundaries. Various studies have reported very high accuracy in image classification (Ritter and Hepner, 1990; Chen et al., 1995; Tsai, 2002). However, it is vital to extend these techniques to further improve remote sensing image classification accuracy, so that dependable land cover information can be derived for vegetation, land-use, and other applications.

The maximum likelihood algorithm with Gaussian probability distribution functions is often considered the best classifier in the sense of obtaining an optimal classification rate. However, neural networks are increasingly applied to the classification of satellite images. Because they make no assumption about the probabilistic model, the networks are capable of forming highly non-linear decision boundaries in the feature space, and they therefore have the potential to outperform a parametric Bayes classifier when the feature statistics deviate significantly from the assumed Gaussian statistics.
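For reference, the Gaussian maximum likelihood rule mentioned above can be written in its standard textbook form. The symbols used here (pixel vector x, class mean μ_i, class covariance Σ_i, and prior P(ω_i)) are notation introduced for illustration and do not come from the paper:

```latex
g_i(\mathbf{x}) = \ln P(\omega_i)
  - \tfrac{1}{2}\ln\lvert\Sigma_i\rvert
  - \tfrac{1}{2}(\mathbf{x}-\boldsymbol{\mu}_i)^{\mathsf{T}}\,\Sigma_i^{-1}\,(\mathbf{x}-\boldsymbol{\mu}_i)
```

A pixel is assigned to the class with the largest g_i(x). The resulting decision boundaries are at most quadratic surfaces, which is the restriction a neural network does not share.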

Benediktsson et al. (1990) compared neural network and statistical approaches to multispectral data classification. They noted that conventional multivariate classification methods cannot be used in processing multisource spatial data, because of the different distribution properties and measurement scales involved. Heermann and Khazenie (1992) compared neural networks with more classical statistical methods. Their study emphasized the analysis of larger data sets with a back-propagation method, in which the error is distributed back through the network. They concluded that the back-propagation network could be easily modified to accommodate more features or to include spatial and temporal information. Hepner et al. (1990) compared neural network back propagation with a supervised maximum likelihood classification method using a minimum training set. The results showed that a neural network classification with a single training site per class was comparable to a conventional classification with four training sites per class. This demonstrated that the neural network method offered a potentially more robust approach to land cover classification than the conventional image classification methods.

In this paper, a multilayer perceptron (MLP) network trained with the back-propagation (BP) algorithm is used to classify Landsat ETM+ images and to improve the classification accuracy.

2 Artificial neural network and image classification
In human visual image interpretation, the criteria used for classification can be broadly defined by the tone or color, size, shape, shadow, pattern, texture, and spatial relationships of the ground targets. An interpreter's knowledge, experience, and familiarity with a study area also contribute to the classification process. The powerful capabilities of the human brain for knowledge acquisition, recall, synthesis, and problem solving have inspired scientists from different disciplines to attempt to model its operations. Based on the biological theory of the human brain, artificial neural networks are models that attempt to parallel and simulate the functionality and decision-making processes of the human brain. In general, a neural network is referred to as a mathematical model of theorized mind and brain activity (Simpson, 1990). The neural network features corresponding to the synapses, neurons, and axons of the brain are input weights, processing elements, and output paths. In an artificial neural network, the processing element (PE) is the analog of the human brain's biological neuron. A processing element has many input paths, analogous to the brain's dendrites, and the information transferred along these paths is combined by one of a variety of mathematical functions, most commonly simple summation. The result of these combined inputs is some level of internal activity for the receiving processing element. The combined input contained within the processing element is modified by a transfer function before being passed to other connected processing elements, whose input paths are usually weighted by the perceived synaptic strength of neural connections. A transfer function is required to avoid saturation of a processing element, caused by extremely large positive or negative internal summations. Commonly, either a sigmoid or hyperbolic tangent function is applied. Both are smooth, monotonic transformations of a processing element's internal value.
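The behaviour of a single processing element described above can be sketched in a few lines of Python. This is only an illustration: the function names, the bias term, and the example values are assumptions made here, not details from the paper.

```python
import math

def sigmoid(s):
    """Logistic transfer function; output lies in [0, 1]."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)  # avoids overflow for large negative s
    return e / (1.0 + e)

def processing_element(inputs, weights, bias=0.0, transfer=math.tanh):
    """Combine weighted inputs by simple summation, then apply a transfer function."""
    internal_activity = sum(w * x for w, x in zip(weights, inputs)) + bias
    return transfer(internal_activity)

# Hypothetical scaled brightness values and weights for one pixel.
pixel = [0.2, 0.4, 0.1]
weights = [0.5, -0.3, 0.8]
out_tanh = processing_element(pixel, weights)
out_sigmoid = processing_element(pixel, weights, transfer=sigmoid)
```

Both transfer functions squash an arbitrarily large internal sum into a bounded range, which is the saturation control the text refers to.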

A neural network consists of organized topological interconnections among the PEs, learning rules, and knowledge recall. The topological structure establishes the frame of the network, the learning paradigm trains the network by presenting example input data patterns and the corresponding desired outputs, and the recall applies the pattern recognition knowledge learned in the training step to process and, in this case, classify the raw data. The most popular forms of neural networks typically consist of three or more layers: an input layer, an output layer, and one or more hidden layers. The input layer consists of one or more processing elements which present the training data, and the output layer consists of one or more processing elements which store the results of the network. In the case of remote sensing data classification, the inputs often represent the vector of brightness values for the multispectral data. Hence, for single-date Landsat data, there would be seven input nodes, each corresponding to a band of the Thematic Mapper sensor. The input patterns could also consist of ancillary data (e.g., multitemporal spectral patterns, image texture, elevation and its derivatives, etc.). Since the learning and recall depend on linear and nonlinear combinations of data patterns instead of the statistical parameters of the input data, neural networks offer the opportunity to analyze spatial data with different origins and properties simultaneously, without a priori assumptions about the distribution of each data type. In fact, neural networks have the ability to learn those distributions, if they exist, in the input data. Therefore, a neural network can be trained with data of different types. The one, two, or perhaps more hidden layers consist of a number of processing elements which enable the translation of input data into output information, which, in the present context, is the land cover classification corresponding to an input pattern.
Ideally, each data type will make a unique contribution to the discrimination of land cover class patterns, therefore, enabling the neural network to learn the spectral, spatial, and temporal signature of each class.
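The layered structure described above can be illustrated with a forward pass through a network with one hidden layer. This is a minimal sketch: the layer sizes, the random weights, the tanh hidden layer with a linear output layer, and the example pixel are all hypothetical choices made for illustration.

```python
import math
import random

def init_layer(n_in, n_out, rng):
    """One row of weights per PE; the last entry in each row is a bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]

def layer_forward(layer, inputs, transfer):
    """Weighted summation followed by the transfer function, for every PE in the layer."""
    extended = list(inputs) + [1.0]  # constant input that multiplies the bias
    return [transfer(sum(w * x for w, x in zip(row, extended))) for row in layer]

def mlp_forward(hidden, output, pixel):
    """Input layer -> hidden layer (tanh) -> output layer (linear)."""
    h = layer_forward(hidden, pixel, math.tanh)
    return layer_forward(output, h, lambda s: s)

def classify(hidden, output, pixel):
    """Index of the output PE with the largest activation."""
    scores = mlp_forward(hidden, output, pixel)
    return max(range(len(scores)), key=scores.__getitem__)

rng = random.Random(0)
hidden = init_layer(7, 5, rng)   # 7 band inputs, 5 hidden PEs (hypothetical size)
output = init_layer(5, 3, rng)   # 3 land cover classes (hypothetical)
pixel = [0.31, 0.28, 0.25, 0.60, 0.45, 0.52, 0.20]  # made-up scaled brightness values
label = classify(hidden, output, pixel)
```

With untrained random weights the label is meaningless; the point is only to show how an input pattern flows through the layers to a class decision.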

3 Methodology and Experimental Results
In order to demonstrate the application of NNs to remotely sensed data, a Landsat ETM+ image (Fig 1) of an area in Karaj, Iran was used, and a neural network was used to classify the spectral information into land cover classes. The classification was based on three types of input to the network. In the first type, the input vector contained six elements, each being the value of the pixel in one band of the image. In this case, the input vector to the network for a pixel consisted of the six band values of that pixel.



Fig 1. Landsat ETM+ image.


In the second type, the input vector contained twelve elements: the first six were the band values of the pixel, and the second six were the means of a 3 × 3 window around the pixel in each band of the image. In this case, the input vector to the network for a pixel consisted of the six band values followed by the six window means.



In the last type, the input vector contained eighteen elements: the first six were the band values of the pixel, the second six were the means of a 3 × 3 window around the pixel in each band, and the last six were the standard deviations of the same window in each band. In this case, the input vector to the network for a pixel consisted of the six band values, the six window means, and the six window standard deviations.



The output vector, which encoded the desired land-use category, had eleven elements.



So, three networks with 6, 12, and 18 neurons in the input layer, one hidden layer, and 11 neurons in the output layer were developed.
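The three input vectors described above can be sketched as follows. This is an illustration under stated assumptions: the image is held as a list of per-band 2-D lists, windows are clipped at the image border, and the population standard deviation is used. None of these choices is specified in the paper, and the function names are hypothetical.

```python
import math

def window_values(band, row, col):
    """Values in the 3 x 3 window centred on (row, col), clipped at the image border."""
    rows, cols = len(band), len(band[0])
    return [band[r][c]
            for r in range(max(0, row - 1), min(rows, row + 2))
            for c in range(max(0, col - 1), min(cols, col + 2))]

def window_mean(values):
    return sum(values) / len(values)

def window_std(values):
    """Population standard deviation of the window (an assumption)."""
    m = window_mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))

def input_vector(bands, row, col, n_groups):
    """n_groups = 1: band values; 2: plus window means; 3: plus window standard deviations."""
    vector = [band[row][col] for band in bands]
    if n_groups >= 2:
        vector += [window_mean(window_values(b, row, col)) for b in bands]
    if n_groups >= 3:
        vector += [window_std(window_values(b, row, col)) for b in bands]
    return vector

# Hypothetical six-band, 3 x 3 pixel image filled with made-up values.
bands = [[[b + 3 * r + c for c in range(3)] for r in range(3)] for b in range(6)]
v6 = input_vector(bands, 1, 1, 1)
v12 = input_vector(bands, 1, 1, 2)
v18 = input_vector(bands, 1, 1, 3)
```

For a six-band image the three calls give vectors of 6, 12, and 18 elements, matching the three input layers.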

The ENVI software was used to prepare the multispectral data and to select the training and testing data for neural network development, evaluation, and refinement. It was also used to build the topologically structured neural network and to run the final classification.

The training parameters used for the three networks were as follows:

Number of Training Iterations: 10000,
Training RMS Exit Criteria: 0.1,
Initial learning rate: 0.01,
Training Threshold Contribution: 0.9,
Training Momentum: 0.9 and
Minimum Output Activation Threshold: 1.00e-8.
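To make the role of these parameters concrete, the following is a generic sketch of back-propagation with momentum for a network with one tanh hidden layer and a linear output layer. It is not ENVI's implementation, whose internals are not described here. The defaults mirror the iteration limit, RMS exit criterion, learning rate, and momentum listed above; the threshold contribution and minimum output activation threshold are not modelled. The function name, weight initialisation, and toy data are assumptions.

```python
import math
import random

def train_mlp(samples, targets, n_hidden, rate=0.01, momentum=0.9,
              rms_exit=0.1, max_iter=10000, seed=0):
    """Per-sample gradient descent with momentum; stops on RMS error or iteration limit."""
    rng = random.Random(seed)
    n_in, n_out = len(samples[0]), len(targets[0])
    # One weight row per PE, bias stored last.
    w_h = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_o = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)] for _ in range(n_out)]
    v_h = [[0.0] * (n_in + 1) for _ in range(n_hidden)]   # previous weight changes
    v_o = [[0.0] * (n_hidden + 1) for _ in range(n_out)]
    rms = float("inf")
    for iteration in range(1, max_iter + 1):
        sq_err = 0.0
        for x, t in zip(samples, targets):
            xe = list(x) + [1.0]
            h = [math.tanh(sum(w * a for w, a in zip(row, xe))) for row in w_h]
            he = h + [1.0]
            y = [sum(w * a for w, a in zip(row, he)) for row in w_o]
            d_o = [yk - tk for yk, tk in zip(y, t)]        # output error (linear layer)
            sq_err += sum(d * d for d in d_o)
            # Error propagated back to the hidden layer, using the current output weights.
            d_h = [(1.0 - h[j] ** 2) * sum(d_o[k] * w_o[k][j] for k in range(n_out))
                   for j in range(n_hidden)]
            for k in range(n_out):
                for j in range(n_hidden + 1):
                    v_o[k][j] = momentum * v_o[k][j] - rate * d_o[k] * he[j]
                    w_o[k][j] += v_o[k][j]
            for j in range(n_hidden):
                for i in range(n_in + 1):
                    v_h[j][i] = momentum * v_h[j][i] - rate * d_h[j] * xe[i]
                    w_h[j][i] += v_h[j][i]
        rms = math.sqrt(sq_err / (len(samples) * n_out))
        if rms <= rms_exit:
            break
    return w_h, w_o, rms, iteration

# Toy two-class problem with made-up data, only to exercise the loop.
toy_x = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
toy_t = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
w_h, w_o, rms, iteration = train_mlp(toy_x, toy_t, n_hidden=3)
```

Training ends either when the RMS error over one pass falls to the exit criterion or when the iteration limit is reached, which is how the two stopping parameters interact.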

Two activation functions, the tan-sigmoid and a linear transfer function, were used in the hidden and output layers, respectively. The hidden layer was responsible for the internal representation of the data and for the information transformation between the input and output layers. If there are too few neurons in the hidden layer, the network may not contain sufficient degrees of freedom to form a representation. If too many neurons are defined, the network may become overtrained (Heermann and Khazenie, 1992). Therefore, a suitable choice for the number of neurons in the hidden layer is important. The ENVI software algorithm can automatically structure the hidden layer and prune unnecessary processing elements. This is accomplished by an initial over-allocation of PEs to the hidden layer and a subsequent examination and elimination of irrelevant elements (i.e., ones with little weighting). This approach was used in this research to determine the structure of the hidden layer by submitting all of the 6, 12, or 18 channels of input data and the corresponding output class to the network and allowing it to self-organize. After pruning, the PEs that remained in the hidden layer transformed information from the input data patterns into the output land cover category types.
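The idea of eliminating hidden PEs "with little weighting" can be illustrated with a simple magnitude-based rule. The criterion below (drop a hidden PE when all of its outgoing weights are smaller in magnitude than a threshold) is an assumption made for illustration and is not claimed to be the rule the software applies. The weight layout, with the bias stored last in each row, is also hypothetical.

```python
def prune_hidden(w_hidden, w_output, threshold):
    """Remove hidden PEs whose largest outgoing weight magnitude is below threshold.

    w_hidden: one row per hidden PE (input weights, bias last).
    w_output: one row per output PE (one weight per hidden PE, bias last).
    """
    keep = [j for j in range(len(w_hidden))
            if max(abs(row[j]) for row in w_output) >= threshold]
    pruned_hidden = [w_hidden[j] for j in keep]
    # Keep the surviving hidden-to-output weights and the output bias.
    pruned_output = [[row[j] for j in keep] + [row[-1]] for row in w_output]
    return pruned_hidden, pruned_output, keep

# Made-up weights: the second hidden PE barely influences either output.
w_hidden = [[0.1, 0.2, 0.0], [0.3, -0.1, 0.0], [0.5, 0.5, 0.0]]
w_output = [[0.9, 0.001, -0.4, 0.1], [-0.7, 0.002, 0.6, 0.0]]
pruned_hidden, pruned_output, kept = prune_hidden(w_hidden, w_output, threshold=0.01)
```

Starting from a deliberately oversized hidden layer and pruning in this way leaves only the PEs that contribute to the output.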

Data for training and testing the networks were selected using two unsupervised clustering methods, ISODATA and K-means, each producing 11 clusters on the original image. Fig 2-a and Fig 2-b show the 11 clusters obtained with ISODATA and K-means, respectively.
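As an illustration of the clustering step, a basic K-means loop on pixel vectors is sketched below. It is a textbook version written for this note, not the software's implementation. The fixed iteration count, the random initialisation from the data, and the toy pixels are assumptions. ISODATA, which additionally splits and merges clusters, is not shown.

```python
import random

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(pixels, k, iterations=20, seed=0):
    """Assign each pixel vector to the nearest centre, then recompute the centres."""
    rng = random.Random(seed)
    centres = [list(p) for p in rng.sample(pixels, k)]
    labels = [0] * len(pixels)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: squared_distance(p, centres[c]))
                  for p in pixels]
        for c in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == c]
            if members:  # leave an empty cluster's centre unchanged
                centres[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centres

# Two made-up, well separated groups of two-band pixels.
toy_pixels = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
toy_labels, toy_centres = kmeans(toy_pixels, k=2)
```

In the setting of the paper, k would be 11 and each pixel vector would hold the band values of one image pixel.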


Fig 2. Clustered image: (a) ISODATA and (b) K-means.


Based on the clustering results, 11 training areas were selected from the K-means result for training the networks. The training areas are depicted in Fig 3.


Fig 3. Original image with training areas.


After the training areas were selected, the networks were trained with the parameters listed above. Fig 4 shows the training curves for each network. The figure shows that training converges faster when the input layer has 12 neurons than in the other cases. The network with 18 neurons in the input layer, in contrast, had not yet trained successfully.




Fig 4. Convergence curves for the networks with six inputs (top left), 12 inputs (top right), and 18 inputs (bottom).


Therefore, a network with 12 neurons in the input layer, one hidden layer, and 11 neurons in the output layer is suitable for classification of Landsat ETM+ images (Fig 5a). To assess the result of this network, the image was also classified with two common methods, maximum likelihood and minimum distance (Fig 5b and 5c), and the results were compared with those of the network (Table 1).
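For comparison, the minimum distance rule is simple enough to sketch directly: each class is represented by the mean of its training pixels, and a pixel is assigned to the class with the nearest mean. The Euclidean distance, the function names, and the toy data below are assumptions made for illustration.

```python
def class_means(samples, labels):
    """Mean vector of the training samples of each class."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

def minimum_distance(pixel, means):
    """Label of the class whose mean is closest (squared Euclidean distance)."""
    return min(means, key=lambda y: sum((a - b) ** 2 for a, b in zip(pixel, means[y])))

# Made-up two-band training pixels for two classes.
train_x = [[0, 0], [2, 0], [10, 10], [12, 10]]
train_y = ["water", "water", "city", "city"]
means = class_means(train_x, train_y)
near_water = minimum_distance([3, 1], means)
near_city = minimum_distance([9, 9], means)
```

Unlike the neural network, this rule yields only linear boundaries between class means, which is one reason it serves as a baseline.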


Fig 5. Results of the classification methods: (a) neural network, (b) maximum likelihood, (c) minimum distance.


Table 1. Overall accuracy and kappa coefficient of the three classification methods.

Method               Overall accuracy (%)   Kappa (%)
Neural networks      98.3                   97
Maximum likelihood   91.6                   89
Minimum distance     87                     82
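The two measures in the table are conventionally derived from a confusion matrix of reference classes against predicted classes. The sketch below uses the standard definitions of overall accuracy and Cohen's kappa. The two-class matrix is made up for illustration and is not data from this study.

```python
def accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    confusion[i][j] = number of pixels of reference class i labelled as class j.
    Both values are returned as fractions; multiply by 100 for percentages.
    """
    n = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(len(confusion))) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa

# Hypothetical two-class confusion matrix.
example = [[45, 5],
           [10, 40]]
overall, kappa = accuracy_and_kappa(example)
```

Kappa discounts the agreement that would occur by chance, which is why it is lower than the overall accuracy in every row of the table.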


4 Conclusions
One important advantage of a neural network is its ability to learn the internal information in the data and to recall the knowledge acquired in the learning stage to conduct the classification. In this paper, three different networks with the same training parameters were used. The results show that when the standard deviation is included as an element of the training vector, the classification results are adversely affected. This is attributed to correlation between the bands. A method such as principal component analysis (PCA) could therefore be used to remove the dependent bands before the standard deviation is used as an attribute in the training vectors. It can be concluded from the research reported here that neural networks can improve classification accuracy for Landsat ETM+ images over conventional statistically based classification, here by about 7 to 11 percentage points in overall accuracy (Table 1).
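The band correlation referred to above can be checked before deciding whether a decorrelating transform such as PCA is worthwhile. The sketch below computes a Pearson correlation matrix between bands. The band values are made up, the function names are hypothetical, and the PCA step itself is not shown.

```python
import math

def pearson(a, b):
    """Pearson correlation between two equally long sequences of pixel values."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)  # undefined for a constant band

def correlation_matrix(bands):
    """Pairwise correlations; bands is a list of flattened per-band pixel lists."""
    return [[pearson(a, b) for b in bands] for a in bands]

# Three made-up bands: the first two vary together, the third does not.
toy_bands = [[10, 20, 30, 40], [12, 19, 33, 41], [40, 10, 30, 20]]
corr = correlation_matrix(toy_bands)
```

Off-diagonal entries close to 1 or -1 indicate bands that carry largely redundant information.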

References
  1. Benediktsson J.A., Swain P.H., and O.K. Ersoy, 1990, "Neural Network Approaches Versus Statistical Methods on Classification of Multisource Remote Sensing Data", IEEE Trans. Geosci. Remote Sensing, vol. 28, pp. 540-552, July 1990
  2. Bischof H., Schneider W., and Pinz A., 1992, Multispectral classification of Landsat images using neural networks. IEEE Trans. On geoscience and remote sensing, Vol. 30, pp: 482-490.
  3. Chen, Y. Q., Nixon, M. S., and Thomas, D. W., 1995. Statistical geometrical features for texture classification, Pattern Recognition 28 (4), 537-552.
  4. Giraudel, J. L., 2001. A comparison of self-organizing map algorithm and some conventional statistical methods for ecological community ordination, Ecological Modelling 146 (1-3), 329-339.
  5. Heermann P., and Khazenie N., 1992, Classification of multispectral remote sensing data using a backpropagation neural network, IEEE Trans. On geoscience and remote sensing, Vol 30, pp: 81-88.
  6. Hepner G., Logan T., Ritter N., and Bryant N., 1990, Artificial neural network classification using a minimum training set.
  7. Lee, J., Weger, R. C., Sengupta, S. K., and Welch, R. M., 1990. A neural network approach to cloud classification, IEEE Trans. Geosci. Remote Sensing 28 (5), 846-855.
  8. Luo, X., Jayas, D. S., and Symons, S. J., 1999. Comparison of statistical and neural network methods for classifying cereal grains using machine vision, Trans. of the ASAE 42 (2), 413-419.
  9. Mather, P. M., 1990. Theoretical problems in image classification, In: Clark, J. A. and Sevens, M. D. (Eds.), Application of Remote Sensing in Agriculture, London, Butterworths, pp. 127-135.
  10. McClelland J.L., and Rumelhart D.E., 1986, Parallel Distributed Processing, Vol. 1. Cambridge, MA : MIT Press, 1986.
  11. Moshou, D., Vrindts, E., De Ketelaere, B., De Baerdemaeker, J., and Ramon, H., 2001. A neural network based plant classifier, Comput. Electron. Agric. 31 (1), 5-16.
  12. Pao Y.H., 1989, Adaptive Pattern Recognition and Neural Networks, Addison-Wesley Publishing Company, Inc., 1989
  13. Tsai, F., 2002. A derivative-aided hyperspectral image analysis system for land-cover classification. IEEE Trans. Geosci. Remote Sensing 40 (2), 416-425.
  14. Topouzelis, K., Karathanassi, V., Pavlakis, P., Rokos, D., 2003. A Neural Network Approach to Oil Spill Detection Using SAR Data, 54th International Astronautical Congress, Bremen, Germany, 29.Sept - 03.Oct
  15. Ritter, N. D. and Hepner, G. F., 1990. Application of an artificial neural network to land-cover classification of thematic mapper imagery, Computers and Geosciences 16 (6), 873-880.
  16. Salu, Y. and Tilton, J., 1993. Classification of multispectral image data by the binary diamond neural network and by nonparametric, pixel-by-pixel methods, IEEE Trans. Geosci. Remote Sensing 31 (3), 606-617.
  17. Zhang, J. and Foody, G. M., 2001. Fully-fuzzy supervised classification of sub-urban land cover from remotely sensed imagery: statistical and artificial neural network approaches, Int. J. Remote Sensing 22 (4), 615-628.