  • ACRS 2000


    Image Processing


    A Synergistic Automatic Clustering Technique (Syneract) for Multispectral Image Analysis

    3.2 Generation of a Hyperplane
    An augmented pixel vector is defined as:

    $X = [V^{t}, 1]^{t}$, where $V^{t} = [V_1, V_2, \ldots, V_n]$. (3)

    The process of generating a weight vector is illustrated in Figure 2. Let the pixels be augmented and expressed by the general column vector $X$, and let $C$ be a position vector in the feature space, taken here to be the grand mean vector computed from all of the pixels in a single cluster. One pixel $P$ is chosen from this cluster so that $|P - C| > |X - C|$ for all other $X$. Under this condition, $P$ and $C$ define a line from the grand mean $C$ to the farthest pixel $P$. Assume that this single cluster in the two-dimensional feature space consists of two spectrally distinct, smaller clusters represented by $\{U\}$ and $\{V\}$, so that $\{X\} = \{U\} \cup \{V\}$ and $P \in \{X\}$. Assume further that $P$ comes from $\{U\}$ and is denoted by $U^*$, that is, $U^* = P$.

    The next step is to find all $V$ such that $(V - C) \cdot (U^* - C) \le 0$, with $\{V\} \subset \{X\}$. Then find $V_1^*$ or $V_2^*$ from $\{V\}$: $V_1^*$ must satisfy $|(V_1^* - C) \cdot (U^* - C)| = 0$, and $V_2^*$ must satisfy $|(V_2^* - C) \cdot (U^* - C)| < |(V - C) \cdot (U^* - C)|$ for every other $V$. Because the brightness value of each pixel in remotely sensed data is recorded as an integer, $V_1^*$ cannot always be found. In that case $V_2^*$, the pixel whose negative dot product is closest to zero, is used in place of $V_1^*$. Lines perpendicular to $(U^* - C)$ and passing through $V_1^*$ or $V_2^*$ are hyperplanes (weight vectors) separating $\{U\}$ and $\{V\}$, as shown in Figure 2. According to the concept of the single-sided decision surface proposed by Lee and Richards (1984), these two hyperplanes (weight vectors) are defined as follows:

    $W_1 = [(U^* - C)^{t}, \; V_1^* \cdot (U^* - C)]^{t}$ (4)
    $W_2 = [(U^* - C)^{t}, \; V_2^* \cdot (U^* - C)]^{t}$ (5)
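
    The construction in section 3.2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' program: the function names (`split_hyperplane`, `side`) and the six sample pixels are hypothetical, and the sign of the last weight component is chosen here so that the augmented product W·X is exactly zero at V*, whereas equations (4) and (5) print that term without an explicit sign.

```python
from math import fsum

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return fsum(x * y for x, y in zip(a, b))

def sub(a, b):
    """Component-wise difference a - b."""
    return [x - y for x, y in zip(a, b)]

def grand_mean(pixels):
    """Grand mean vector C of all pixels in the cluster."""
    return [fsum(band) / len(pixels) for band in zip(*pixels)]

def split_hyperplane(pixels):
    """Return (W, U*, V*) for one cluster, or None if all pixels coincide."""
    c = grand_mean(pixels)
    # U* = P: the pixel farthest from the grand mean C.
    u_star = max(pixels, key=lambda x: dot(sub(x, c), sub(x, c)))
    normal = sub(u_star, c)  # (U* - C), the normal of the hyperplane
    if not any(normal):
        return None  # degenerate cluster: no direction to split along
    # {V}: pixels whose dot product with (U* - C) is zero or negative.
    candidates = [x for x in pixels if dot(sub(x, c), normal) <= 0]
    # V1* if a zero dot product exists, otherwise V2* (closest to zero).
    v_star = min(candidates, key=lambda x: abs(dot(sub(x, c), normal)))
    # Assumed sign convention: W . [V*, 1] == 0 on the hyperplane.
    w = normal + [-dot(v_star, normal)]
    return w, u_star, v_star

def side(w, pixel):
    """Signed value of W . X for the augmented pixel X = [V, 1]."""
    return dot(w, list(pixel) + [1.0])

# Hypothetical two-band cluster made of two visibly distinct groups.
cluster = [(10, 10), (11, 10), (10, 12), (1, 1), (2, 1), (1, 2)]
w, u_star, v_star = split_hyperplane(cluster)
```

    With this convention, pixels on the U* side give a positive value of `side`, V* gives zero, and the remaining pixels of {V} give negative values.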

    3.3 Test for Relevance of Splitting
    SYNERACT splits each cluster formed at the previous separation into two smaller clusters. In theory, the splitting process continues until each cluster contains only one pixel. The process is therefore controlled by two input parameters: the maximum number of clusters to be considered (Cmax) and the minimum percentage of pixels allowed in a single cluster (P%). Each split is tested against these two parameters a posteriori so that a homogeneous cluster is not split inappropriately. Since each cluster is the basis for an information class, Cmax becomes the maximum number of classes to be formed. Clusters containing less than P% of the pixels can be eliminated, leaving fewer than Cmax clusters. The pixels in these discarded clusters are reassigned to an alternative cluster.
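
    The two controls can be pictured with the short sketch below. The text does not spell out the exact order or form of the tests, so the helper names (`can_split`, `too_small`, `surviving_clusters`), the cluster labels, and the sizes are all assumptions made for illustration.

```python
def can_split(num_clusters, c_max):
    """A further split is allowed only while fewer than Cmax clusters exist."""
    return num_clusters < c_max

def too_small(cluster_size, total_pixels, min_percent):
    """True if the cluster holds less than P% of all pixels."""
    return 100.0 * cluster_size / total_pixels < min_percent

def surviving_clusters(cluster_sizes, min_percent):
    """Drop clusters below P%; their pixels would be reassigned elsewhere."""
    total = sum(cluster_sizes.values())
    return {
        label: size
        for label, size in cluster_sizes.items()
        if not too_small(size, total, min_percent)
    }

# Hypothetical cluster sizes after a few splits (1000 pixels in total).
sizes = {"a": 500, "b": 480, "c": 20}
kept = surviving_clusters(sizes, min_percent=5.0)
```

    In this example cluster "c" holds 2% of the pixels, so it falls below a 5% threshold and is dropped.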

    4. Results and Discussion
    The two-date video image data used for testing the two clustering algorithms had the eight bands described in section 2. The total number of band combinations, $\binom{8}{1} + \binom{8}{2} + \cdots + \binom{8}{8} = 2^{8} - 1$, was 255. The combination of bands 2, 3, 4, and 8, which had the best separability, was chosen for the test using the program module Separability \ Euclidean Distance Measure in ERDAS Imagine 8.3.1 software.
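
    The count of 255 is easy to check by enumeration. The snippet below is a small verification using only the Python standard library; the variable names are illustrative.

```python
from itertools import combinations
from math import comb

bands = range(1, 9)  # the eight bands, numbered 1 to 8

# Sum of 8C1 + 8C2 + ... + 8C8.
total = sum(comb(8, k) for k in range(1, 9))

# Explicit list of every non-empty band subset.
subsets = [c for k in range(1, 9) for c in combinations(bands, k)]
```

    Both `total` and `len(subsets)` equal 255, and the chosen combination `(2, 3, 4, 8)` is one of the listed subsets.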

    4.1 Influences of Initial Seed Assignment
    Table 1 presents the classification results of ISODATA using six sets of randomly generated initial cluster centers. Overall accuracies ranged from 65% to 85% across the six sets. Clearly, the choice of the initial locations of the cluster centers was critical, because it affected the final classification results. This outcome was contrary to the ideas proposed by Richards (1993) and ERDAS (1997).

    This study also applied the procedure of initial seed assignment proposed by Jensen (1996) and ERDAS (1997) to specify the initial locations of the cluster centers required by ISODATA. Table 1 also presents the classification results generated from ISODATA using five initial seed allocations. In this case, μ ± 1σ had the best overall accuracy (93%), while μ ± 4σ and μ ± 5σ had the lowest (75%), where μ is the mean vector and σ is the standard deviation. However, the analyst would have to spend considerable time on trial and error to determine an optimal number of standard deviations for other cases.
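
    One common reading of this seeding procedure is that the initial cluster means are spaced evenly along the line from the mean minus k standard deviations to the mean plus k standard deviations in every band. The sketch below assumes that reading; the function name `diagonal_seeds` and the sample statistics are hypothetical, and the exact ERDAS implementation may differ.

```python
def diagonal_seeds(mean, std, k, n_clusters):
    """Evenly spaced seeds from (mean - k*std) to (mean + k*std), per band."""
    if n_clusters < 2:
        raise ValueError("need at least two clusters to span the diagonal")
    seeds = []
    for i in range(n_clusters):
        t = i / (n_clusters - 1)  # position along the diagonal, 0 to 1
        seeds.append([m + (2 * t - 1) * k * s for m, s in zip(mean, std)])
    return seeds

# Hypothetical two-band statistics, k = 1 standard deviation, three seeds.
seeds = diagonal_seeds(mean=[100.0, 50.0], std=[10.0, 5.0], k=1, n_clusters=3)
```

    Increasing k spreads the seeds farther from the mean, which is the parameter varied from 1 to 5 in Table 1.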

    4.2 Efficiency
    4.2.1 Processing Speed
    Table 2 presents the computation times of the two clustering algorithms as a function of the number of pixels in the data set, with the number of clusters set to 16. ISODATA took 6 to 39 times as long as SYNERACT (ratio TI/TS) as the number of pixels increased from 3,400 to 17,640. Table 3 presents the computation times of the two algorithms for different band combinations. ISODATA took 6 to 12 times as long as SYNERACT (ratio TI/TS) as the number of bands varied between two and eight.
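
    The TI/TS column of Table 2 can be reproduced from the reported times. The snippet below copies the pixel counts and timings from that table and rounds each ratio to the nearest integer; the names `TABLE_2` and `speed_ratios` are illustrative.

```python
# (number of pixels, TS in seconds, TI in seconds), copied from Table 2.
TABLE_2 = [
    (3400, 3.1, 17.2),
    (6460, 5.1, 55.1),
    (9499, 7.0, 84.6),
    (12284, 8.8, 258.4),
    (17640, 12.3, 478.7),
]

def speed_ratios(rows):
    """Ratio of ISODATA time to SYNERACT time, rounded to an integer."""
    return [round(ti / ts) for _, ts, ti in rows]

ratios = speed_ratios(TABLE_2)
```

    The rounded ratios are 6, 11, 12, 29, and 39, so the gap between the two methods widens as the data set grows.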

    4.2.2 Ease of Use
    A sophisticated ISODATA algorithm described by Jensen (1996) normally requires the analyst to specify seven parameters. The ISODATA program of ERDAS Imagine software requires the user to specify four parameters, which match some of those listed by Jensen, and to initialize cluster means along a diagonal axis or principal axis, but it omits the splitting and merging parameters. In contrast, SYNERACT, developed in this study, required the analyst to specify only the two parameters described in section 3. Moreover, it eliminated the need for a priori estimates of the locations of the initial clusters. Accordingly, SYNERACT was more user-friendly for the beginner than ISODATA.

    4.3 Classification Accuracy
    Table 4 presents the overall accuracies and the accuracies of individual categories for the two clustering approaches. The overall accuracy of SYNERACT (92%) was only one percentage point lower than that of ISODATA (93%); this difference was probably due to chance. Accordingly, SYNERACT and ISODATA were equally matched in classification accuracy.

    5. Conclusions
    SYNERACT required a minimum of user input, with only two parameters, and thereby spared the beginner from specifying other complex parameters by trial and error. By comparison, ISODATA was not user-friendly, since it required the analyst to specify many input parameters, particularly the starting positions of the initial clusters. This study showed that an inappropriate choice of these starting positions significantly reduced the final classification accuracy of ISODATA. This outcome was contrary to the ideas stated by Richards (1993) and ERDAS (1997). In contrast, SYNERACT determined the starting positions automatically from the data set itself and thus avoided this adverse effect on the final clustering. SYNERACT was 6 to 39 times faster than ISODATA in the tests reported here, and its overall classification accuracy (92%) was comparable to that of ISODATA (93%). In sum, this study showed that SYNERACT was efficient and well suited to serve as an alternative to ISODATA for remote sensing image analysis involving a large data set, which was contrary to the views of Richards (1993).

    References
    • ERDAS, 1997. ERDAS Field Guide. ERDAS, Inc., Atlanta, Georgia, pp. 225-229.
    • Jensen, J. R., 1996. Introductory Digital Image Processing--A Remote Sensing Perspective. Prentice Hall, Inc., New Jersey, pp. 197-256.
    • Lee, T., and J. A. Richards, 1984. Piecewise linear classification using seniority logic committee methods, with application to remote sensing. Pattern Recognition, 17(4), pp. 453-464.
    • Nilsson, N. J., 1965. Learning Machines. McGraw-Hill Book Co., New York, pp. 15-27.
    • Richards, J. A., 1993. Remote Sensing Digital Image Analysis: An Introduction. Springer-Verlag, Berlin, Germany, pp. 229-244, 265-291.
    • Richardson, A. J., R. M. Menges, and P. R. Nixon, 1985. Distinguishing weed from crop plants using video remote sensing. Photogrammetric Engineering & Remote Sensing, 51(11), pp. 1785-1790.
    • Swain, P. H., 1978. Fundamentals of Pattern Recognition in Remote Sensing. In: Remote Sensing: The Quantitative Approach, edited by P. H. Swain and S. M. Davis, McGraw-Hill Book Co., New York, pp. 136-187.
    • Viovy, N., 2000. Automatic classification of time series (ACTS): a new clustering method for remote sensing time series. International Journal of Remote Sensing, 21(6), pp. 1537-1560.

    Table 1. Classification Accuracies of ISODATA Using Six Sets of Randomly Generated Initial Seeds and Five Sets of ERDAS's Initial Seed Assignment.

    Random Seed Assignment            ERDAS's Initial Seed Assignment
    Set    Overall Accuracy (%)       No. of Standard Deviations (σ)    Overall Accuracy (%)
    1      83                         1                                 93
    2      80                         2                                 88
    3      70                         3                                 80
    4      76                         4                                 75
    5      65                         5                                 75
    6      85                         -                                 -


    Table 2. Processing Time Spent by SYNERACT and ISODATA for Different Numbers of Pixels.

                                  SYNERACT                     ISODATA
    No. of     No. of             TS          No. of           TI          No. of           TI/TS
    Pixels     Clusters           (Second)    Iterations       (Second)    Iterations
    3400       16                 3.1         6                17.2        24               6
    6460       16                 5.1         5                55.1        42               11
    9499       16                 7.0         10               84.6        45               12
    12284      16                 8.8         7                258.4       105              29
    17640      16                 12.3        7                478.7       138              39


    Table 3. Processing Time Spent by SYNERACT and ISODATA for Seven Sets of Band Combinations.

                          SYNERACT                     ISODATA
    Band                  TS          No. of           TI          No. of           TI/TS
    Combination           (Second)    Iterations       (Second)    Iterations
    2 3 4 8               7.0         10               84.7        45               12
    2 3 4 5 8             8.0         4                74.8        33               9
    1 2 3 4 5 8           10.6        8                93.2        35               9
    1 2 3 4 5 6 8         12.1        8                69.0        22               6
    1 2 3 4 5 6 7 8       13.3        8                92.0        26               7


    Table 4. Classification Accuracies of the SYNERACT and ISODATA Methods.

    Land-cover Type      Accuracy (%) of SYNERACT    Accuracy (%) of ISODATA
    Cotton               90                          93
    Soil                 97                          97
    Johnsongrass         86                          93
    Cantaloupe           95                          95
    Pigweed              91                          85
    Sorghum              90                          93
    Overall Accuracy     92                          93




    Figure 1. Plot Identification Map of the Study Area (Not Drawn to Scale).




    Figure 2. A Two-dimensional, Hypothetical Case with Two Clusters Illustrates the Process of Generating the Weight Vector W1 or W2 that Implements a Hyperplane of Hierarchical Descending Clustering.

