Classification of Satellite Images by using Self-organizing map and Linear Support Vector Machine Decision tree

Mehmet I. Saglam
Istanbul Technical University, Informatics Institute
Advanced Technologies in Engineering
Satellite Communications and Remote Sensing Program, Maslak
Istanbul-Turkey
Email: misaglam@be.itu.edu.tr

Bingül Yazgan
Istanbul Technical University, Faculty of Electrical and Electronic Engineering
Electronics and Communication Dept., Maslak
Istanbul-Turkey
Email: yazgan@ehb.itu.edu.tr

Okan K. Ersoy
Purdue University, School of Electrical and Computer Engineering
West Lafayette, Indiana, USA
Email: ersoy@purdue.edu
Abstract
High-resolution imaging sensors are very important in modern remote sensing technology.
These sensors produce multispectral data, yielding one image per wavelength.
As dimensionality grows and spectral resolution increases, a large number of classes
can be identified. When pattern recognition methods are applied to remote sensing problems,
the small size of the training set used to design the classifier is an inherent
problem. Furthermore, the complex statistical distribution of a large number of classes
constitutes another important problem.
The main purpose of developing a special classifier in this thesis was to design a system
that performs better than similar classifiers, such as existing support vector machines
(SVMs), specifically on complex remote sensing classification problems. A new
support vector learning algorithm, called the Linear Support Vector Machine Decision
Tree with Self-Organizing Map (SOM-LSVMDT), was developed for remote sensing
to address these problems. The SOM-LSVMDT consists of a clustering part and a binary tree
structure with linear support vector machines in all tree nodes. The SOM-LSVMDT simplifies
the model selection problem inherent in SVM design. In addition, the SOM-LSVMDT has built-in
properties for dealing with classes that can be considered rare events. There are two
reasons for the occurrence of rare events. The first is natural: a class may have a low
probability of occurrence. The second is due to the structure of the SOM-LSVMDT, in which
classification in the decision tree nodes leads to rare event classes. To handle the rare event
problem, the method of randomly adding vectors is used to prevent all data vectors from lying on
only one side of the hyperplane during training. This problem naturally occurs in the deeper
tree nodes. In the SOM-LSVMDT, the SOM part divides the remote sensing data into a
number of partitions. As a consequence, small decision trees are generated, and the
problem of rare events is reduced in scope.
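
The partitioning step described above can be illustrated with a minimal sketch. It assumes a one-dimensional SOM with a Gaussian neighborhood and linearly decaying learning rate and radius; the function names `train_som`, `best_unit`, and `partition`, and all parameter values, are hypothetical choices for illustration and are not taken from the original work.

```python
import math
import random


def best_unit(weights, x):
    """Return the index of the map unit whose weight vector is closest to x."""
    return min(
        range(len(weights)),
        key=lambda j: sum((wj - xj) ** 2 for wj, xj in zip(weights[j], x)),
    )


def train_som(data, n_units, epochs=50, lr0=0.5, seed=0):
    """Train a 1-D SOM (assumed topology) and return its weight vectors."""
    rng = random.Random(seed)
    # Initialize the units from randomly chosen training vectors.
    weights = [list(x) for x in rng.sample(data, n_units)]
    sigma0 = max(n_units / 2.0, 1.0)
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs          # linear decay schedule (assumption)
        lr = lr0 * decay                      # learning rate
        sigma = max(sigma0 * decay, 1e-2)     # neighborhood radius
        order = list(data)
        rng.shuffle(order)
        for x in order:
            bmu = best_unit(weights, x)
            for j, w in enumerate(weights):
                # Gaussian neighborhood around the best-matching unit.
                h = math.exp(-((j - bmu) ** 2) / (2.0 * sigma ** 2))
                for d in range(len(w)):
                    w[d] += lr * h * (x[d] - w[d])
    return weights


def partition(data, weights):
    """Split the data into one subset per SOM unit."""
    parts = [[] for _ in weights]
    for x in data:
        parts[best_unit(weights, x)].append(x)
    return parts
```

Each subset returned by `partition` would then be used to train its own, smaller decision tree, so that fewer classes and samples reach any single tree.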
Before training with the LSVM, the training data are partitioned using the SOM. Then, multiple
linear hyperplanes are constructed with the LSVM while traversing down the tree. In the
testing phase, the output of the current node is selected by the votes of the binary
classifications. This process is repeated until a leaf is reached. The SOM-LSVMDT stores the
final class label at the leaf, and this label is assigned to the data vector reaching the leaf.
Computer experiments show that the SOM-LSVMDT can achieve better performance than the linear
support vector machine decision tree (LSVMDT). The eight-, ten-, and thirteen-class Colorado data
sets are used to obtain the experimental results. They are well-known and very complex remote
sensing classification problems. The number of samples in each data set is shown in Table 1.
Table 1. Number of samples in each Colorado data set.
| | 8-Class Colorado | 10-Class Colorado | 13-Class Colorado |
| Training | 1600 | 1188 | 1008 |
| Test | 3000 | 831 | 1011 |
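
The testing phase of the decision tree described above, in which a data vector descends from node to node until it reaches a leaf, can be sketched as follows. This is a simplified illustration: it assumes a single hand-set hyperplane per node and omits the voting step, and the names `Node`, `Leaf`, `classify`, and `example_tree`, as well as the class labels and hyperplane values, are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence, Union


@dataclass
class Leaf:
    label: str  # final class label stored at the leaf


@dataclass
class Node:
    w: Sequence[float]        # normal vector of the node's linear hyperplane
    b: float                  # offset of the hyperplane
    neg: Union["Node", Leaf]  # child followed when w.x + b < 0
    pos: Union["Node", Leaf]  # child followed when w.x + b >= 0


def classify(tree, x):
    """Descend the binary tree until a leaf is reached; return its label."""
    node = tree
    while isinstance(node, Node):
        score = sum(wi * xi for wi, xi in zip(node.w, x)) + node.b
        node = node.pos if score >= 0 else node.neg
    return node.label


# Hypothetical three-class tree: the root splits on the first feature,
# its negative child splits on the second feature.
example_tree = Node(
    w=(1.0, 0.0), b=-5.0,
    neg=Node(w=(0.0, 1.0), b=-5.0, neg=Leaf("A"), pos=Leaf("B")),
    pos=Leaf("C"),
)
```

In a trained tree the values of `w` and `b` at each node would come from a linear SVM fitted to the training vectors reaching that node, rather than being set by hand as in `example_tree`.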
Table 2. Classification error (%) of the LSVMDT and the SOM-LSVMDT on the Colorado data sets. II, III, and IV denote two, three, and four SOM clusters.
| | LSVMDT | SOM-LSVMDT II | SOM-LSVMDT III | SOM-LSVMDT IV |
| 8-Class Colorado | 7.17 | 3.4 | 0.97 | 1.90 |
| 10-Class Colorado | 48.62 | 35.02 | 30.69 | 35.98 |
| 13-Class Colorado | 29.48 | 22.64 | 20.08 | 18.40 |
For example, the best previous result offered by support vector machines for the ten-class
Colorado problem was an accuracy of around 51% (49% error). The results are shown in Table 2.
The experiment is repeated three times. First, the Colorado data are divided into two clusters
(SOM-LSVMDT II). For the 10-class Colorado data set, the testing error decreases from 48.62% to
35.02%. The classification errors of the 8-class and 13-class Colorado sets decrease as well.
Next, the self-organizing map divides the Colorado data sets into three clusters
(SOM-LSVMDT III). Again, the classification errors decrease for all three Colorado data sets.
With four clusters (SOM-LSVMDT IV), the SOM-LSVMDT still performs better than the LSVMDT, but
the classification errors increase relative to three clusters for every data set except the
13-class Colorado data set.
Whenever the Colorado data sets are clustered, the statistical distribution of the classes
changes. As more clusters are obtained with the SOM, the number of classes and samples in each
cluster decreases. Furthermore, the classes are not linearly separable to begin with. For
example, in the 10-class Colorado data set the Douglas fir/Ponderosa pine/Aspen class has only
25 samples, while water has 408 samples. Two or three clusters are appropriate for a data set
with at most 10 classes. To obtain more clusters with the SOM, the number of samples and classes
should be larger. The classification error of the 13-class Colorado data set does still decrease
with four clusters. The most important point, however, is that every SOM-LSVMDT variant performs
better than the LSVMDT.
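
As a small worked check of the comparison above, the snippet below takes the error rates reported in Table 2 and computes, for each data set, the best SOM-LSVMDT variant and its error reduction relative to the LSVMDT. The dictionary layout and the function name `summarize` are illustrative assumptions; only the numbers come from Table 2.

```python
# Classification errors (%) as reported in Table 2.
ERRORS = {
    "8-class": {"LSVMDT": 7.17, "SOM-LSVMDT II": 3.4,
                "SOM-LSVMDT III": 0.97, "SOM-LSVMDT IV": 1.90},
    "10-class": {"LSVMDT": 48.62, "SOM-LSVMDT II": 35.02,
                 "SOM-LSVMDT III": 30.69, "SOM-LSVMDT IV": 35.98},
    "13-class": {"LSVMDT": 29.48, "SOM-LSVMDT II": 22.64,
                 "SOM-LSVMDT III": 20.08, "SOM-LSVMDT IV": 18.40},
}


def summarize(errors):
    """Return {data set: (best variant, reduction in points, relative reduction in %)}."""
    summary = {}
    for name, row in errors.items():
        baseline = row["LSVMDT"]
        # Best variant = lowest error among the SOM-based columns.
        best = min((k for k in row if k != "LSVMDT"), key=row.get)
        drop = baseline - row[best]
        summary[name] = (best, round(drop, 2), round(100.0 * drop / baseline, 1))
    return summary


if __name__ == "__main__":
    for name, (best, drop, rel) in summarize(ERRORS).items():
        print(f"{name}: best = {best}, reduction = {drop} points ({rel}% relative)")
```

The summary makes the pattern in the text explicit: three clusters give the lowest error for the 8-class and 10-class sets, while four clusters give the lowest error for the 13-class set.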