  • ACRS 2000


    Image Processing

    A Synergistic Automatic Clustering Technique (Syneract) for Multispectral Image Analysis

    Kai-Yi Huang
    Associate Professor, Department of Forestry,
    National Chung-Hsing University,
    250 Kuo-Kuang Road, Taichung, Taiwan 402
    Tel: +886-4-2854663; Fax: +886-4-2873628
    E-mail: kyhuang@dragon.nchu.edu.tw

    Keywords: Remote Sensing, Clustering Algorithm, Unsupervised Classification

    Abstract
    ISODATA has been widely used in unsupervised and supervised classification. However, it requires the analyst to have a priori knowledge of the data in order to specify its input parameters, and a beginner will spend much time determining those parameters by trial and error. Its iterative process also makes it time-consuming. This study aimed to develop a synergistic automatic clustering technique (SYNERACT) that combines the hierarchical descending and K-means clustering procedures to avoid these limitations of ISODATA. The two methods were compared in terms of classification accuracy and efficiency using two-date video images. An inappropriate seed assignment for ISODATA was shown to reduce accuracy significantly. In contrast, SYNERACT required the user to specify only two parameters and determined the locations of the initial clusters automatically from the data, thereby avoiding those limitations. Because SYNERACT made far fewer passes through the data set than ISODATA, it was much faster. SYNERACT also matched ISODATA in accuracy. Accordingly, SYNERACT is a suitable alternative to ISODATA in multispectral image analysis.

    1. Introduction
    Multispectral image classification is one of the most frequently used techniques for extracting information from remotely sensed data of the Earth, and it can be performed as unsupervised classification using clustering (Jensen, 1996). Clustering performs the valuable function of identifying unique classes of very small areal extent that might not be initially apparent to an analyst applying a supervised classifier. Therefore, clustering seems to be a much more practical approach for information extraction (Viovy, 2000). There are two main families of clustering methods: the Iterative Self-Organizing Data Analysis Techniques (ISODATA) clustering and hierarchical clustering approaches (Viovy, 2000).

    The ISODATA (or K-means) algorithm is a widely used clustering method that partitions the image data in multispectral space into a number of spectral classes (Jensen, 1996). This type of clustering algorithm, however, suffers from several limitations. The first is that ISODATA requires the user to specify the number of clusters beforehand. The second is that it requires the user to specify the starting positions of these clusters through an educated guess. The clustering starts with a set of arbitrarily selected pixels as cluster centers, with the exception that no two may be the same (Swain, 1978). The choice of the initial locations of the cluster centers is said not to be critical, although it will evidently affect the time it takes to reach a final, acceptable result (Richards, 1993). It is also said not to matter where the initial cluster centers are located, as long as enough iterations are allowed (ERDAS, 1997). Because no general guidance is available, a logical procedure is adopted in LARSYS (Richards, 1993): the initial cluster centers are distributed evenly along the diagonal axis of the multidimensional feature space, that is, along the line from the origin to the point corresponding to the maximum digital number in each spectral component. Similar procedures for ISODATA have been proposed by Jensen (1996) and ERDAS (1997). However, few studies in recent years have investigated how a random choice of the initial cluster centers, or the procedures proposed by ERDAS (1997) and Jensen (1996), may affect the final classification results. This study therefore examines this very important but apparently neglected problem.
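
    The diagonal seeding procedure described above can be illustrated with a minimal Python sketch. The function name diagonal_seeds and the exact spacing rule (midpoints of k equal segments of the diagonal) are assumptions made for illustration; the text only states that the centers are evenly distributed along the line from the origin to the per-band maximum digital numbers.

```python
def diagonal_seeds(max_dn, k):
    """Return k initial cluster centers spaced evenly along the diagonal
    from the origin to the point of per-band maximum digital numbers.

    max_dn: list with the maximum digital number of each spectral band.
    k:      number of cluster centers requested by the analyst.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    seeds = []
    for i in range(k):
        # Assumed spacing rule: take the midpoint of each of k equal
        # segments, so no seed lies on the origin or the far corner.
        t = (i + 0.5) / k
        seeds.append([t * m for m in max_dn])
    return seeds


# Four seeds in a four-band feature space with 8-bit data.
seeds = diagonal_seeds([255, 255, 255, 255], 4)
```

    Because every seed lies on the same line, all bands of a seed share one value when the band maxima are equal; clusters that lie far from the diagonal therefore start without a nearby seed.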

    The third limitation associated with ISODATA is its processing speed. ISODATA is computationally intensive when processing large data sets since, at each iterative step, all pixels in the whole data set must be checked against every cluster center. Furthermore, this method tends to suffer from performance degradation as the number of bands, the number of pixels, or the number of clusters increases (Richards, 1993; Viovy, 2000).
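
    The cost of a single iterative step can be made concrete with a small Python sketch of one nearest-center assignment pass. It is an illustration of the general K-means assignment step, not the ISODATA implementation used in the study; the function name assign_pixels and the sample values are hypothetical.

```python
def assign_pixels(pixels, centers):
    """Assign every pixel to its nearest center (squared Euclidean distance).

    Returns the list of labels and the number of distance evaluations,
    which equals len(pixels) * len(centers) for each pass.
    """
    labels = []
    evaluations = 0
    for p in pixels:
        best, best_d = 0, float("inf")
        for j, c in enumerate(centers):
            # One distance evaluation costs one operation per band.
            d = sum((a - b) ** 2 for a, b in zip(p, c))
            evaluations += 1
            if d < best_d:
                best, best_d = j, d
        labels.append(best)
    return labels, evaluations


pixels = [(10, 12), (11, 10), (200, 190), (198, 205)]
centers = [(0, 0), (255, 255)]
labels, evaluations = assign_pixels(pixels, centers)
```

    Each pass performs one distance evaluation per pixel per center, and each evaluation touches every band, so the work per iteration grows with the product of the three quantities named above.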

    The second type of clustering method, hierarchical clustering, does not require the analyst to specify the number of clusters beforehand (Richards, 1993). As pointed out by Richards (1993), a divisive hierarchical clustering (or hierarchical descending) algorithm has been developed in which the data are initialized as a single cluster that is progressively subdivided. This method is considered more computationally intensive and is rarely used in remote sensing image analysis, since a large number of pixels is usually involved.

    The goal of this study was to develop a synergistic automatic clustering technique (SYNERACT) that combines the hierarchical descending and K-means approaches to avoid the aforementioned limitations of ISODATA. The study pursued three specific objectives using two-date video image data. The first was to develop SYNERACT on the basis of the hyperplane, dynamical (iterative optimization) clustering principles, and the binary tree. The second was to compare SYNERACT with ISODATA in terms of classification accuracy and efficiency. The third was to show that SYNERACT is very fast and well suited to act as a substitute for ISODATA in remote sensing applications, contrary to the view suggested by Richards (1993).

    2. Study Area and Materials
    The study area was located near Weslaco in Hidalgo County, Texas. It was a field experiment with a completely randomized block design, consisting of plots of the following surface features: (1) cotton, (2) cantaloupe, (3) sorghum, (4) johnsongrass, (5) pigweed, and (6) bare soil (Figure 1). Each of the 24 plots (six treatments and four replications) measured 7.11 m by 9.14 m, making the total site dimensions 42.67 m by 36.56 m (Richardson et al., 1985). However, the fourth row (drawn with a dashed line) was excluded from the study because that portion of the video data file was damaged.

    The two-date video image data were acquired on 31 May and 24 July 1983, near noon on moderately sunny days, from an altitude of 900 m. The video imaging system used to collect the data is described in detail in Richardson et al. (1985). Spectral bands 1-4 were acquired on 24 July; spectral bands 5-8 were acquired on 31 May. Channels 1 and 5 were blue bands (420 to 430 nm), channels 2 and 6 were red bands (640 to 670 nm), channels 3 and 7 were yellow-green bands (520 to 550 nm), and channels 4 and 8 were near-infrared bands (850 to 890 nm). Because atmospheric effects probably affected the pixel brightness values of the two-date video image data, multiple-date image normalization using regression (Jensen, 1996) was performed to radiometrically correct the data set. Accuracy assessment was performed over the 18 plots in rows 1 to 3 of the experimental field; row 4 was excluded.
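
    As a rough illustration of regression-based normalization between two dates, the following Python sketch fits a straight line that maps brightness values of one date onto the other and then applies it. The choice of sample pixels, the choice of reference date, and the names fit_line and normalize_band are assumptions made here for illustration; the study's actual procedure follows Jensen (1996) and is not detailed in this section.

```python
def fit_line(subject, reference):
    """Least-squares fit of reference = gain * subject + offset."""
    n = len(subject)
    mean_s = sum(subject) / n
    mean_r = sum(reference) / n
    sxx = sum((s - mean_s) ** 2 for s in subject)
    sxy = sum((s - mean_s) * (r - mean_r) for s, r in zip(subject, reference))
    gain = sxy / sxx
    offset = mean_r - gain * mean_s
    return gain, offset


def normalize_band(values, gain, offset):
    """Apply the fitted line to every brightness value of the subject band."""
    return [gain * v + offset for v in values]


# Hypothetical brightness values of the same targets on the two dates.
subject_samples = [20, 60, 100]
reference_samples = [50, 130, 210]
gain, offset = fit_line(subject_samples, reference_samples)
normalized = normalize_band([30, 80], gain, offset)
```

    In practice one such line would be fitted per band pair (for example channel 1 against channel 5), using targets assumed to have unchanged reflectance between the dates.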

    3. Method and Rationale
    SYNERACT combines the concepts of the hyperplane, iterative optimization clustering, and the binary tree. The hyperplane divides a cluster into two smaller clusters, whose means are then computed. The iterative optimization clustering procedure (IOCP) is based on estimating some reasonable assignment of the pixel vectors to candidate clusters and then moving them from one cluster to another in such a way that an objective function is minimized (Richards, 1993). The binary tree is a data structure that stores the clusters successively generated by each split. SYNERACT initially treats all of the image pixels as a single cluster, which is placed at the root of the binary tree and then progressively subdivided. At each split, the algorithm attempts to divide each cluster defined at the previous split into two smaller clusters and to compute their centers, after which IOCP is performed. Each split is tested for relevance a posteriori. The two smaller clusters generated from a split by a hyperplane are placed at the left and right subtree nodes, respectively. If the test fails, the cluster is not split any further and is called stabilized. This process continues until all clusters are stabilized.
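
    The overall control flow described above can be sketched in Python as follows. This is only a skeleton under stated assumptions, not the author's implementation: the split rule (a hyperplane perpendicular to one band through the cluster mean), the stand-in for IOCP (reassignment to the nearer child mean), the relevance test (a minimum distance between child means), and the parameter min_separation are all hypothetical placeholders, since this section does not specify them.

```python
class Node:
    """Binary-tree node holding one cluster of pixel vectors."""
    def __init__(self, pixels):
        self.pixels = pixels
        self.left = None
        self.right = None


def mean(pixels):
    n = len(pixels)
    return [sum(p[i] for p in pixels) / n for i in range(len(pixels[0]))]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def split_on_band(pixels, band=0):
    """Placeholder split: hyperplane through the cluster mean, normal to one band."""
    threshold = mean(pixels)[band]
    s1 = [p for p in pixels if p[band] - threshold > 0]
    s2 = [p for p in pixels if p[band] - threshold <= 0]
    return s1, s2


def refine(s1, s2, passes=5):
    """Placeholder for IOCP: move each pixel to the nearer of the two child means."""
    for _ in range(passes):
        if not s1 or not s2:
            break
        c1, c2 = mean(s1), mean(s2)
        new1, new2 = [], []
        for p in s1 + s2:
            (new1 if sq_dist(p, c1) < sq_dist(p, c2) else new2).append(p)
        s1, s2 = new1, new2
    return s1, s2


def is_relevant(s1, s2, min_separation):
    """Placeholder a posteriori test: both children non-empty and far enough apart."""
    if not s1 or not s2:
        return False
    return sq_dist(mean(s1), mean(s2)) ** 0.5 >= min_separation


def grow(node, min_separation):
    """Split a node recursively until every cluster is stabilized."""
    s1, s2 = refine(*split_on_band(node.pixels))
    if not is_relevant(s1, s2, min_separation):
        return  # the cluster is stabilized and stays a leaf
    node.left, node.right = Node(s1), Node(s2)
    grow(node.left, min_separation)
    grow(node.right, min_separation)


def leaves(node):
    """Collect the stabilized clusters stored at the leaves of the tree."""
    if node.left is None:
        return [node.pixels]
    return leaves(node.left) + leaves(node.right)


pixels = [(10, 10), (12, 11), (11, 9), (200, 205), (198, 202), (203, 199)]
root = Node(pixels)
grow(root, min_separation=50)
clusters = leaves(root)
```

    In this sketch the final clusters are simply the leaves of the tree, so the number of clusters is an outcome of the relevance test rather than an input supplied by the analyst.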

    3.1 Definition of a Hyperplane
    To appreciate the development of SYNERACT, one must understand the concept of a hyperplane. The family of linear discriminant functions (Nilsson, 1965) can be expressed in the following form:

    f (X) = W1 * X1 + W2 * X2 + . . . + Wn * Xn + Wn+1, (1)

    where W1, W2, . . . , Wn, Wn+1 are weighting coefficients. f is a linear function of the components of an augmented column vector X.
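
    As a small worked example of Equation (1), the Python sketch below evaluates f(X) as the dot product of the weight vector with the pixel vector extended by a constant 1. The function name discriminant and the numeric values are hypothetical.

```python
def discriminant(weights, x):
    """Evaluate f(X) = W1*X1 + ... + Wn*Xn + Wn+1.

    weights: the n + 1 weighting coefficients W1 .. Wn+1.
    x:       the n components of the pixel vector (not yet augmented).
    """
    if len(weights) != len(x) + 1:
        raise ValueError("weights must have one more element than x")
    augmented = list(x) + [1]  # the constant 1 pairs with Wn+1
    return sum(w * a for w, a in zip(weights, augmented))


# Two-band example: f(X) = 2*4 + (-1)*5 + 3
value = discriminant([2, -1, 3], [4, 5])
```

    Appending the constant 1 lets the offset term Wn+1 be handled as just another weight, which is why the function can be written as a single dot product.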

    A simple linear separation is performed by a linear discriminant function that partitions the feature space into two regions. The linear discriminant function can be viewed as a separating surface, the simplest form of which is a hyperplane (Nilsson, 1965). A hyperplane partitions the feature space into two regions defined as:

    f (X) = W·X > 0 and f (X) = W·X ≤ 0, (2)

    where W = [W1, W2, . . . , Wn, Wn+1], Xt = [X1, X2, . . . , Xn, 1], and n = dimension of the feature space.

    Assume that there is a cluster in the feature space. A weight vector W can be viewed as implementing a linear separating surface (hyperplane) that divides the cluster into two smaller clusters (children). The augmented pixel vectors of one child cluster have a positive dot product with W, while the other child cluster consists of the pixels, lying on the other side of the hyperplane, that have a zero or negative dot product with W. The former set is denoted S1 and the latter S2. The center of each set is computed from all of the pixels in that set. The parent cluster (S) and the two child clusters (S1 and S2) can be defined as:

    S1 = {X ∈ S | W·X > 0}; S2 = {X ∈ S | W·X ≤ 0}; S = S1 ∪ S2.
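
    The partition of a parent cluster S into S1 and S2 and the computation of the two centers can be sketched in Python as follows. The function names and the sample weight vector are hypothetical; how SYNERACT actually chooses W is not covered in this section.

```python
def split_cluster(cluster, weights):
    """Partition a cluster by the sign of the dot product of W with augmented X.

    Pixels with a positive dot product go to S1; pixels with a zero or
    negative dot product go to S2.
    """
    s1, s2 = [], []
    for x in cluster:
        f = sum(w * a for w, a in zip(weights, list(x) + [1]))
        (s1 if f > 0 else s2).append(x)
    return s1, s2


def center(pixels):
    """Mean vector of a non-empty set of pixels."""
    n = len(pixels)
    return [sum(p[i] for p in pixels) / n for i in range(len(pixels[0]))]


S = [(1, 1), (2, 3), (6, 5), (7, 8)]
W = [1, 1, -8]  # hypothetical hyperplane X1 + X2 - 8 = 0
S1, S2 = split_cluster(S, W)
center1, center2 = center(S1), center(S2)
```

    Every pixel of S lands in exactly one of the two sets, so S1 and S2 together reproduce S, in agreement with the definition above.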