Graphical techniques for component type: A case study
Dr. Omar Mohd. Rijal
Dept. of Mathematics, University of Malaya
Ismail Harun
Forest Research Institute of Malaysia
Norliza Mohd. Noor
Malaysian Centre for Remote Sensing, Ministry of Science, Technology and the Environment
Malaysia
Abstract
The arbitrary nature of applying statistical data analysis methods in studying remotely sensed images creates “uncertainty”. By “uncertainty” we mean any “activity” that could bias our final conclusion; for example, assuming a unimodal histogram when the histogram is in fact a mixture. The following study attempts to reduce particular forms of “uncertainty”.
This paper illustrates the problem of determining component type. A test area has been chosen to be studied using conventional software employing band 1, band 2 and band 3 of SPOT data. Statistically, mixture distributions are of interest, and particular attention is paid to the determination of the number of components. This paper will emphasize informal graphical techniques. We take a critical look at the usage of histograms of gray levels. A brief review of other (graphical) techniques is given. Particular practices (for example, assuming normality) in remote sensing will be discussed.
Introduction
- Image Data
Agricultural land in Pasir Gudang, Johor (our test area) is being converted to industrial use at a rapid pace. Identification of the type of land use is required for development of the Pasir Gudang area.
The digital image of Pasir Gudang was obtained from SPOT using band 1, band 2 and band 3. The area covered is about 3095 x 3095 pixels. Statistical classification of histograms will be used to identify types of land use.
- Classification using histograms:
The histogram of gray levels (0-255 for our data) can be used to indicate (or perhaps suggest) the number of component types. If each component is represented by a hump (a distribution), several components will be represented by several humps (a mixture distribution); see for example Everitt and Hand (1981), where we may have a linear combination of three normal distributions. Visually, a hump is seen to exist by the presence of a mode (or peak). Unfortunately, unimodality does not imply that the distribution is not a mixture, so the histogram can be deceptive.
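The point that a unimodal histogram may still hide a mixture can be illustrated numerically. The following is a minimal sketch, not part of the original study: the component weights, means and standard deviations are hypothetical, and the helper names (`normal_pdf`, `mixture_pdf`, `count_modes`) are introduced here only for illustration. It counts the local maxima of a two-component normal mixture density on the 0-255 gray-level scale.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a Normal(mu, sigma^2) distribution at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def mixture_pdf(x, components):
    """Density of a finite normal mixture; components = [(weight, mu, sigma), ...]."""
    return sum(w * normal_pdf(x, mu, s) for w, mu, s in components)

def count_modes(components, lo=0.0, hi=255.0, steps=2550):
    """Count strict local maxima of the mixture density on a fine grid."""
    xs = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    ys = [mixture_pdf(x, components) for x in xs]
    return sum(1 for i in range(1, steps) if ys[i - 1] < ys[i] > ys[i + 1])

# Hypothetical two-component mixtures (weight, mean, standard deviation).
close = [(0.5, 100.0, 20.0), (0.5, 130.0, 20.0)]  # means 1.5 sigma apart
apart = [(0.5, 80.0, 20.0), (0.5, 160.0, 20.0)]   # means 4 sigma apart

print(count_modes(close))  # 1: two components, but only one visible hump
print(count_modes(apart))  # 2: the components show up as separate humps
```

In the first case the two components are genuinely present but overlap so heavily that only one peak is visible, which is the situation in which a histogram alone would be deceptive.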
Our study concentrates on investigating the “information provided” by the three histograms (one for each band) of the same test area (Pasir Gudang); see Figure 1.


Figure 1
Clearly the histogram for band 1 (henceforth referred to as HIST 1) and HIST 2 visually suggest unimodality, whilst HIST 3 suggests the existence of 2 modes. The inference could therefore be that HIST 1 and HIST 2 suggest some form of uniformity for the test area whilst HIST 3 suggests otherwise.
We propose the following procedure to investigate unimodality.
- Hypothesis testing of histograms
- Using the techniques of Bhattacharya
- Density estimation techniques.
Data Analysis
(i) Hypothesis testing
Let us represent the histograms by
{ X
j’ f
i (X
j); J = 1, …, k; I = 1,2,3 }
Where X
j is the midpoint of each of k classes ( or bandwidth ), and F
i (X
j) the corresponding coll-frequency. The subscript I denote the it Band.
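As a concrete reading of this representation, the sketch below builds the pairs of class midpoints and cell frequencies for one band. It assumes one class per integer gray level (so that k = 256 and each midpoint is the gray level itself); the function name `gray_level_histogram` and the sample pixel values are hypothetical and are not taken from the study.

```python
def gray_level_histogram(pixels, k=256):
    """Return (midpoints, frequencies) for integer gray levels 0..k-1.

    Assumes one class per gray level, so the class midpoint X_j is the
    gray level itself and f(X_j) is the number of pixels at that level.
    """
    freqs = [0] * k
    for value in pixels:
        if not 0 <= value < k:
            raise ValueError(f"gray level {value} outside 0..{k - 1}")
        freqs[value] += 1
    midpoints = list(range(k))
    return midpoints, freqs

# Hypothetical pixel values standing in for one SPOT band.
band = [12, 12, 13, 200, 200, 200, 255]
x, f = gray_level_histogram(band)
print(f[12], f[13], f[200], f[255], sum(f))  # 2 1 3 1 7
```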
Comparison of two histograms involves considering
$$Df_{lm}(X) = f_l(X) - f_m(X); \quad l, m = 1, 2, 3,\ l \neq m,$$
where the non-parametric Sign Test is applied to $Df_{lm}(X)$.
Let $N(+)$ = number of times $Df_{lm}(X)$ is positive.
Let $\theta$ = probability that $f_l(X) > f_m(X)$.
Clearly $N(+) \sim \mathrm{Bin}(k, \theta)$, where $k = 256$.
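The count N(+) can be obtained directly from two lists of cell frequencies. The following is a small sketch with hypothetical frequencies; the function name `count_positive_differences` is introduced here for illustration. It assumes that classes with equal frequencies count as non-positive and stay in the total k, which is how the text appears to treat them (a textbook sign test would often drop ties instead).

```python
def count_positive_differences(f_l, f_m):
    """Return N(+): the number of classes where f_l(X_j) - f_m(X_j) > 0."""
    if len(f_l) != len(f_m):
        raise ValueError("histograms must have the same number of classes")
    # Ties (difference of zero) are counted as non-positive: an assumption.
    return sum(1 for a, b in zip(f_l, f_m) if a - b > 0)

# Hypothetical cell frequencies for two four-class histograms.
hist_a = [5, 9, 4, 0]
hist_b = [3, 9, 6, 0]
print(count_positive_differences(hist_a, hist_b))  # 1
```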
The hypothesis to be tested then is:
$$H_0 : \theta = 0.5 \quad \text{vs} \quad H_1 : \theta \neq 0.5$$
Accepting $H_0 : \theta = 0.5$ implies equality of the histograms (i.e. samples from $f_l(X) = f_m(X)$). We use the normal approximation to obtain the critical cut-off points, i.e.
$$N(+) \approx \mathrm{Normal}\big(k\theta,\ k\theta(1 - \theta)\big)$$
if $H_0$ is true.
Result:
- Hist 1 and Hist 2: N(+) = 215
- Hist 2 and Hist 3: N(+) = 154
- Hist 1 and Hist 3: N(+) = 156
In all three cases N(+) lies outside the 95% acceptance region for $H_0$ (roughly 112 to 144 when k = 256), so we reject $H_0$: all three histograms are significantly different.
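The decision above can be checked with a short calculation. This is a minimal sketch under stated assumptions: a two-sided test at the 5% level using the normal approximation with z = 1.96 and no continuity correction. The function names `sign_test_region` and `rejects_h0` are hypothetical; only k = 256 and the three N(+) values come from the text.

```python
import math

def sign_test_region(k, theta0=0.5, z=1.96):
    """Two-sided acceptance region for N(+) under H0 (normal approximation)."""
    mean = k * theta0
    sd = math.sqrt(k * theta0 * (1.0 - theta0))
    return mean - z * sd, mean + z * sd

def rejects_h0(n_plus, k, theta0=0.5, z=1.96):
    """True if N(+) falls outside the acceptance region."""
    lower, upper = sign_test_region(k, theta0, z)
    return n_plus < lower or n_plus > upper

lower, upper = sign_test_region(256)
print(round(lower, 2), round(upper, 2))  # 112.32 143.68

# N(+) values reported in the text for the three comparisons.
reported = [("Hist 1 vs Hist 2", 215), ("Hist 2 vs Hist 3", 154), ("Hist 1 vs Hist 3", 156)]
for pair, n_plus in reported:
    print(pair, n_plus, rejects_h0(n_plus, 256))  # True for all three pairs
```

With k = 256 and a null probability of 0.5, the mean is 128 and the standard deviation is 8, so all three reported counts fall above the upper cut-off.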