Land cover classification by principal component histogram method
Katsunori Furuya
Graduate School of Science and Technology
M. Akasaka, R. Tateishi
Remote Sensing and Image Research Center
H. Ishii
Faculty of Horticulture Chiba University
1-33 Yayoi-cho, Chiba city, Chiba, 260 Japan
Study area
The study area is located about 40 km northeast of Tokyo, Japan. It extends about 18 km from north to south and about 27 km from east to west, and contains urban areas, agricultural areas, a river and a lake. We used five categories: forest, paddy, urban, water and golf course.
Methodology
Figure 1 shows the flow of processing, the details of which are described below.
- Geometric registration
The TM data were geometrically registered and resampled. The registered TM image of the study area is 900 by 600 pixels, 540,000 pixels in total.
- Ground truth data
Ground truth data were collected from an enhanced TM image and maps. The ground truth data used for classification comprise 1,399 pixels. The check data used to evaluate the results comprise 14,971 pixels, including the ground truth data used for classification.
Figure 1. Flow of processing
- PC Histogram method
- Steps of PC Histogram method
The PC Histogram method is a supervised classification method consisting of the following five steps. Figure 2 shows the flow of the PC Histogram method.
Step 1: Principal component analysis
To reduce the dimensionality of the data, principal component analysis is applied, and the 1st, 2nd and 3rd components are used for classification.
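The reduction in Step 1 can be sketched with NumPy as follows. This is a minimal illustration, not the authors' implementation: the function name `principal_components`, the use of the covariance matrix (rather than the correlation matrix), the six-band input and the synthetic pixel values are all assumptions made for the example.

```python
import numpy as np

def principal_components(pixels, n_components=3):
    """Project pixels of shape (n_pixels, n_bands) onto the leading principal axes."""
    mean = pixels.mean(axis=0)
    centered = pixels - mean
    cov = np.cov(centered, rowvar=False)           # band-by-band covariance (assumed choice)
    eigvals, eigvecs = np.linalg.eigh(cov)         # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    axes = eigvecs[:, order]                       # columns are the 1st, 2nd, 3rd principal axes
    return centered @ axes, mean, axes

# Synthetic stand-in for TM data: 1000 pixels, 6 correlated bands (hypothetical values).
rng = np.random.default_rng(0)
pixels = rng.normal(size=(1000, 6)) @ rng.normal(size=(6, 6))
scores, mean, axes = principal_components(pixels)
```

Here `scores` holds the three component values per pixel; `mean` and `axes` would be reused to project any other pixel onto the same components.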
Step 2: Three-dimensional histogram
For each category, a three-dimensional histogram of the ground truth data is produced from their 1st, 2nd and 3rd principal components. The sampling interval of each principal component used to build the histograms is determined from the variance of the principal components and the frequency of the ground truth data.
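A per-category three-dimensional histogram of this kind might be built as in the sketch below. The text does not give the exact rule for the sampling interval, so the rule used here (bin width equal to half a standard deviation of each component, set by the hypothetical parameter `width_in_std`) is only an assumption, as are the function names and the synthetic training samples.

```python
import numpy as np

def make_edges(scores, width_in_std=0.5):
    """Bin edges per component; the width rule is an assumed placeholder."""
    edges = []
    for c in range(scores.shape[1]):
        col = scores[:, c]
        width = width_in_std * col.std()
        n_bins = int(np.floor((col.max() - col.min()) / width)) + 1  # always covers the maximum
        edges.append(col.min() + width * np.arange(n_bins + 1))
    return edges

def class_histogram(scores, edges):
    """Count the training samples of one category in each 3-D bin."""
    hist, _ = np.histogramdd(scores, bins=edges)
    return hist

# Hypothetical principal component values of ground truth pixels for two categories.
rng = np.random.default_rng(1)
forest = rng.normal(loc=[-20, 5, 0], scale=[8, 4, 2], size=(300, 3))
water = rng.normal(loc=[40, -10, 3], scale=[5, 3, 2], size=(200, 3))

edges = make_edges(np.vstack([forest, water]))     # one shared grid for all categories
hists = {"forest": class_histogram(forest, edges),
         "water": class_histogram(water, edges)}
```

All categories share one set of bin edges so that their histograms can later be compared bin by bin.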
Step 3: Interpolation of histogram frequency data into continuous three-dimensional grid data
The histograms produced in Step 2 have discontinuous frequency distributions: the frequency in some intervals may be zero even though the surrounding intervals have non-zero frequencies. Interpolation is applied to produce a continuous frequency distribution for the next step.
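The interpolation scheme is not specified in the text. As one possible illustration, the sketch below fills every empty bin with the mean of its 26 neighbours in the surrounding 3 by 3 by 3 block and leaves non-empty bins unchanged. The function name `fill_empty_bins` and this neighbourhood-mean rule are assumptions chosen for simplicity, not the authors' procedure.

```python
import numpy as np

def fill_empty_bins(hist):
    """Replace each zero bin by the mean of its 26 neighbours (assumed rule).

    Bins beyond the border of the grid are treated as zero.
    """
    hist = hist.astype(float)
    padded = np.pad(hist, 1)                       # zero padding on every side
    nx, ny, nz = hist.shape
    total = np.zeros(hist.shape)
    for dx in range(3):
        for dy in range(3):
            for dz in range(3):
                total += padded[dx:dx + nx, dy:dy + ny, dz:dz + nz]
    neighbour_mean = (total - hist) / 26.0         # exclude the bin itself
    return np.where(hist > 0, hist, neighbour_mean)

# Toy histogram: an empty bin lying between two occupied bins.
hist = np.zeros((3, 3, 3))
hist[0, 1, 1] = 13
hist[2, 1, 1] = 13
filled = fill_empty_bins(hist)
```

In this toy case the central bin, which was empty, receives (13 + 13) / 26 = 1.0, while the two occupied bins keep their original counts.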
Step 4: Division of the three-dimensional space into regions assigned to categories
To compare the distributions of all categories, the frequencies from Step 3 are converted to normalized frequencies. Normalization here means scaling the frequencies so that their sum is equal for every category. The normalized frequencies of the different categories are compared at every sampling interval, and the category with the highest normalized frequency is selected. In this way the three-dimensional space is divided into regions assigned to categories.
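The normalization and comparison in Step 4 can be sketched as follows. Each category's histogram is divided by its own total so that every category sums to one, and the winning category is taken per bin. The function name `divide_space`, the label -1 for bins where every category is zero, and the toy counts are assumptions for illustration.

```python
import numpy as np

def divide_space(hists):
    """Return (names, label_grid) from a dict of same-shaped 3-D histograms.

    label_grid holds the index into names of the category with the highest
    normalized frequency, or -1 where all categories are zero (assumed convention).
    Every histogram must contain at least one non-zero count.
    """
    names = list(hists)
    stack = np.stack([hists[n] / hists[n].sum() for n in names])
    labels = stack.argmax(axis=0)
    labels[stack.max(axis=0) == 0] = -1
    return names, labels

# Toy example: forest has many more training pixels than water.
forest = np.zeros((2, 2, 2))
forest[0, 0, 0] = 90
forest[0, 0, 1] = 10
water = np.zeros((2, 2, 2))
water[0, 0, 1] = 1
water[1, 1, 1] = 1

names, label_grid = divide_space({"forest": forest, "water": water})
```

In bin (0, 0, 1) forest has the larger raw count (10 against 1) but the smaller normalized frequency (0.1 against 0.5), so the bin is assigned to water. This shows why normalization matters when the categories have different numbers of ground truth pixels.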
Step 5: Classification of image data
For every pixel of the TM data, the values of the 1st, 2nd and 3rd principal components are calculated. Each pixel is then classified by looking up these values in the divided three-dimensional space from Step 4.
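The lookup in Step 5 amounts to finding the bin that contains each pixel's three component values and reading the category stored there. The sketch below assumes the component values are already computed and that the grid is described by one array of bin edges per component. The function name `classify` and the label -1 for pixels outside the grid are assumptions.

```python
import numpy as np

def classify(scores, edges, label_grid):
    """Look up a category index for each row of scores (n_pixels, 3).

    Returns -1 for pixels that fall outside the grid (assumed convention).
    """
    n = scores.shape[0]
    idx = np.empty((n, 3), dtype=int)
    inside = np.ones(n, dtype=bool)
    for c in range(3):
        # Bin i covers edges[c][i] <= value < edges[c][i + 1].
        i = np.searchsorted(edges[c], scores[:, c], side="right") - 1
        inside &= (i >= 0) & (i < len(edges[c]) - 1)
        idx[:, c] = np.clip(i, 0, len(edges[c]) - 2)
    out = np.full(n, -1)
    out[inside] = label_grid[idx[inside, 0], idx[inside, 1], idx[inside, 2]]
    return out

# Toy 2 x 2 x 2 grid: category 0 where the 1st component is below 1, category 1 above.
edges = [np.array([0.0, 1.0, 2.0])] * 3
label_grid = np.zeros((2, 2, 2), dtype=int)
label_grid[1, :, :] = 1

scores = np.array([[0.5, 0.5, 0.5],
                   [1.5, 0.2, 1.9],
                   [5.0, 0.0, 0.0]])
result = classify(scores, edges, label_grid)
```

For these three pixels the result is category 0, category 1, and -1 (the last pixel lies outside the grid).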
- Maximum likelihood method
To verify the PC Histogram method, maximum likelihood classification was performed with the same TM data and ground truth data.
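For comparison, a conventional Gaussian maximum likelihood classifier can be sketched as below. The text does not state which features or prior probabilities were used, so this example assumes equal priors and one multivariate normal distribution per category, and uses synthetic three-feature training data. The function names are hypothetical.

```python
import numpy as np

def fit_gaussians(train):
    """train maps category name -> (n_samples, n_features) array of training pixels."""
    params = {}
    for name, x in train.items():
        cov = np.cov(x, rowvar=False)
        _, logdet = np.linalg.slogdet(cov)         # log of the covariance determinant
        params[name] = (x.mean(axis=0), np.linalg.inv(cov), logdet)
    return params

def classify_ml(pixels, params):
    """Assign each pixel to the category with the highest Gaussian log-likelihood."""
    names = list(params)
    scores = []
    for name in names:
        mean, inv_cov, logdet = params[name]
        d = pixels - mean
        maha = np.einsum("ij,jk,ik->i", d, inv_cov, d)   # squared Mahalanobis distance
        scores.append(-logdet - maha)              # equal priors assumed; constants dropped
    return np.array(names)[np.argmax(scores, axis=0)]

# Synthetic training samples for two categories (hypothetical values).
rng = np.random.default_rng(0)
train = {"forest": rng.normal([20, 40, 30], 2.0, size=(200, 3)),
         "water": rng.normal([10, 5, 8], 1.0, size=(200, 3))}

params = fit_gaussians(train)
result = classify_ml(np.array([[20.0, 40.0, 30.0], [10.0, 5.0, 8.0]]), params)
```

Unlike the histogram approach, this classifier summarizes each category by a mean vector and a covariance matrix, so it assumes the ground truth data of each category are approximately normally distributed.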