Classification of Remotely Sensed Data using Gravitational Symbolic Clustering
D.S.Vinod
Research scholar, Department of Computer Science and Engineering
Sri Jayachamarajendra College of Engineering Manasagangothri, Mysore 570 006 Karnataka, India
Email:-ds_vinod@mailcity.com, vinod@sjce.ac.in
Tel:-0821-512568(college), 481314(Res.)
T. N. Nagabhushana
Asst. Prof., Department of Computer Science and Engineering Sri Jayachamarajendra College of Engineering Manasagangothri, Mysore 570 006,Karnataka, India. Email:-tnn@sjce.ac.in
Tel:-0821-512568(college)
Introduction
Symbolic objects are extensions of classical data types. In conventional data sets, the objects are individualized, whereas in symbolic data sets they are more unified by means of relationships (2-4,16-17).
Gowda and Diday (2-3) have presented an agglomerative clustering algorithm for symbolic objects. They form composite symbolic objects (CSO) using a Cartesian join operator whenever a mutual pair of symbolic objects is selected for agglomeration based on minimum dissimilarity and maximum similarity.
The combined usage of similarity and dissimilarity measures for agglomerative and divisive clustering of symbolic objects is presented by Gowda and Ravi(5-6).
To manage time and memory requirements, the clustering method uses data reduction techniques (11-12). Here, the data reduction technique proposed by K. C. Gowda and S. K. Prakash (15) is modified to handle the task without dimensionality reduction.
We have also incorporated gravitational clustering techniques to cluster multispectral satellite images (7,10), and we compare our results using cluster indices (4). Both agglomerative and disaggregative gravitational symbolic approaches are considered. The concept of mutual pairs is used to merge or split the samples.
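As an illustration of the mutual-pair idea, the following minimal sketch assumes that a mutual pair is a pair of objects that are each other's nearest neighbour under a given dissimilarity matrix. That definition, the function name `mutual_pairs`, and the small matrix are assumptions made for the example; the symbolic dissimilarity measure itself is not shown.

```python
def mutual_pairs(dissim):
    """Return pairs (i, j), i < j, whose members are each other's nearest neighbour.

    dissim is a symmetric n x n dissimilarity matrix (list of lists), n >= 2.
    """
    n = len(dissim)
    nearest = []
    for i in range(n):
        others = [k for k in range(n) if k != i]
        # Index of the object with the smallest dissimilarity to object i.
        nearest.append(min(others, key=lambda k: dissim[i][k]))
    return [(i, j) for i, j in enumerate(nearest) if i < j and nearest[j] == i]


# Hypothetical dissimilarities between four objects.
dissim = [
    [0, 1, 4, 5],
    [1, 0, 3, 6],
    [4, 3, 0, 2],
    [5, 6, 2, 0],
]
print(mutual_pairs(dissim))  # [(0, 1), (2, 3)]
```

In an agglomerative pass, each such pair would be a candidate for merging.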
Data Reduction Techniques
Data reduction is applied before clustering. The technique uses bin arrays to store the useful information of the reduced image: the m samples of d dimensions are mapped onto bin arrays. Two cases arise:
1) When dimension d ≥ 2 and
2) d = 1.
After the reduced data are clustered, the results must be mapped back onto the original data. This requires a label for each of the m samples that identifies the nonempty bin holding that sample. These labels are used to produce the classification output map once the reduced data have been clustered.
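The mapping back from bins to samples can be sketched as a simple lookup. The names `sample_bin` and `bin_cluster` are hypothetical, and the example assumes that the nonempty bins have been numbered consecutively from 0.

```python
def map_bin_labels_to_samples(sample_bin, bin_cluster):
    """Return one cluster label per original sample.

    sample_bin[i]  : index of the nonempty bin that holds sample i
    bin_cluster[b] : cluster label given to bin b by the clustering step
    """
    return [bin_cluster[b] for b in sample_bin]


# Six pixels fell into three nonempty bins; clustering grouped bins 0 and 2.
sample_bin = [0, 1, 0, 2, 2, 1]
bin_cluster = [0, 1, 0]
print(map_bin_labels_to_samples(sample_bin, bin_cluster))  # [0, 1, 0, 0, 0, 1]
```

Reshaping the returned list to the image dimensions would give the classification output map.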
1 When dimension d ≥ 2
The method assigns each of the m d-dimensional samples to a bin of a two-dimensional array. A set of d two-dimensional R × R bin arrays (R being a positive integer chosen as required) stores the useful information of all the features. Before reduction, the data must be normalized to the range 1 to R. The normalized features are then assigned to the R × R arrays.
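The text does not state how the normalization to the range 1 to R is carried out. A minimal sketch, assuming per-feature min-max scaling followed by rounding to the nearest integer, could look as follows; the function name `normalize_to_bins` is hypothetical.

```python
def normalize_to_bins(samples, R):
    """Scale every feature of every sample to an integer in 1..R.

    Assumption: per-feature min-max scaling, rounded half up.
    """
    d = len(samples[0])
    lows = [min(s[j] for s in samples) for j in range(d)]
    highs = [max(s[j] for s in samples) for j in range(d)]
    normalized = []
    for s in samples:
        row = []
        for j in range(d):
            span = highs[j] - lows[j]
            # A constant feature carries no spread; place it at 1.
            frac = (s[j] - lows[j]) / span if span else 0.0
            row.append(1 + int(frac * (R - 1) + 0.5))
        normalized.append(row)
    return normalized


print(normalize_to_bins([[0, 10], [5, 20], [10, 30]], 5))  # [[1, 1], [3, 3], [5, 5]]
```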
The idea is that the first feature determines the column position of the bin to which a sample is assigned, while the combination of the remaining feature values determines the row position of the bin.
Suppose the first sample has the feature values f11, f12, f13, ..., f1d, normalized between 1 and R. This sample is assigned to a bin with a column value of f11. As this is the first sample, the row value of its bin is 1.
Suppose the second sample has the feature values f21, f22, f23, ..., f2d. If f11 = f21, then the bin to which this sample is assigned also has a column value of f11, and its bin has a row value of 1 if the following condition is satisfied:
Where T is a user-defined limit. If the condition is not satisfied, the row value of the bin is 2. In general, for a given column position, a new bin with a higher row value is opened only when the present sample cannot be assigned to any existing bin with a lower row value.
The steps are as follows:
Let S1, S2, S3, ..., Sd be the R × R two-dimensional arrays that store the updated information of each feature, and let W be another R × R bin-weight array that records the number of samples assigned to each corresponding bin.
- Normalize the m d-dimensional samples between 1 and R.
- The first feature gives the column position of the bin.
- The row value of the bin is determined as described above.
- As each sample is assigned to a bin, the arrays S1, S2, S3, ..., Sd are updated.
- The corresponding entry of the bin-weight array W is also updated, which keeps track of the nonempty bins.
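The steps above can be sketched in code under two explicit assumptions that are not taken from the text: the arrays S1, ..., Sd are taken to hold running sums of each feature, and the row test is taken to be "the sum of absolute differences between the sample's remaining features and the bin's current mean does not exceed T". The function name `assign_to_bins` is hypothetical, and the samples are assumed to be already normalized to integers in 1 to R.

```python
def assign_to_bins(samples, R, T):
    """Assign normalized samples (integer features in 1..R) to R x R bins.

    Returns (S, W, labels). S[j][row][col] is the running sum of feature j
    (assumed form of the stored information), W[row][col] is the sample
    count of the bin, and labels[i] is the 1-based (row, col) bin of sample i.
    """
    d = len(samples[0])
    # Index 0 is unused so that rows and columns run from 1 to R.
    S = [[[0.0] * (R + 1) for _ in range(R + 1)] for _ in range(d)]
    W = [[0] * (R + 1) for _ in range(R + 1)]
    labels = []
    for x in samples:
        col = x[0]  # the first feature gives the column position
        for row in range(1, R + 1):
            n = W[row][col]
            if n == 0:
                break  # first empty bin in this column: open it
            # Assumed row test: L1 distance of remaining features to bin mean.
            dist = sum(abs(x[j] - S[j][row][col] / n) for j in range(1, d))
            if dist <= T:
                break  # the sample fits an existing bin with a lower row value
        else:
            raise ValueError("column %d is full; increase R or T" % col)
        for j in range(d):
            S[j][row][col] += x[j]
        W[row][col] += 1
        labels.append((row, col))
    return S, W, labels


samples = [[2, 3, 3], [2, 3, 4], [2, 9, 9], [5, 1, 1]]
S, W, labels = assign_to_bins(samples, R=10, T=2)
print(labels)  # [(1, 2), (1, 2), (2, 2), (1, 5)]
```

The returned `labels` list plays the role of the per-sample labels needed to map the clustering result back onto the original data.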