Digital Image Processing
Development of a NOAA image database with feature-based retrieval functions
Feature Extraction of NOAA-AVHRR Images
To extract features from NOAA-AVHRR images, we must first classify the raw images. The classification algorithm used in this system must be fast; absolute accuracy is less crucial because the classified images only need to support a global feature description. An automatic classification method has been proposed in [1][2], but it relies mainly on absolute brightness temperature thresholds obtained empirically. It cannot be applied to this system because we use uncalibrated data, and because regional differences make those temperature thresholds inapplicable to our case. Moreover, given the huge quantity of NOAA-AVHRR images, the classification techniques proposed so far, which are fundamentally based on pixel-by-pixel processing, are not practical here. We therefore select histogram-based approaches to classify NOAA-AVHRR images. Many histogram-based approaches have been proposed for image binarization; although most of them can be extended to multilevel quantization, the number of classes must be specified in advance. For NOAA-AVHRR images, however, the number of classes is not constant, because the images vary with weather conditions. Consequently, a peak detection technique is selected to determine both the number of classes and the thresholds.
- Peak Detection Method Based on Histograms
The peak detection technique proposed in [5][6] uses the image cumulative distribution function (cdf) to locate the peaks of the histogram. The peaks are located using the zero-crossings and local extrema of a peak detection signal generated from the cdf. For an image represented by M gray levels, the cdf c(n) can be derived from the gray-level histogram, and from c(n) a new function c_N(n) can be obtained by (1).
$$c_N(n) = c(n) \otimes w_N(n) \tag{1}$$
where ⊗ denotes the convolution operation and the uniform rectangular window w_N is defined as in (2).
$$w_N(m) = \frac{1}{N}, \quad -\frac{N-1}{2} \le m \le \frac{N-1}{2} \tag{2}$$
Then a peak detection signal r_N(n) can be defined as in (3).
$$r_N(n) = c(n) - c_N(n) \tag{3}$$
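As a rough illustration of (1) to (3), the following Python sketch computes a detection signal from a gray-level histogram. It rests on several assumptions that are not stated in the text: (3) is read as the difference between the cdf and its smoothed version, r_N(n) = c(n) - c_N(n); N is odd; the cdf is normalized to end at 1; and the cdf is extended by 0 below the first gray level and by 1 above the last one before smoothing. The function name `detection_signal` is hypothetical.

```python
def detection_signal(hist, N):
    """Return r_N(n) = c(n) - c_N(n) for a gray-level histogram.

    Assumptions (not from the paper): N is odd, the cdf is normalized,
    and the cdf is padded with 0 on the left and 1 on the right.
    """
    if N < 1 or N % 2 == 0:
        raise ValueError("N must be a positive odd integer")
    total = float(sum(hist))
    if total == 0:
        raise ValueError("histogram is empty")

    # Cumulative distribution function c(n).
    cdf, running = [], 0.0
    for count in hist:
        running += count
        cdf.append(running / total)

    half = (N - 1) // 2
    M = len(cdf)
    r = []
    for n in range(M):
        # c_N(n): convolution of c with the uniform window w_N of (2).
        smoothed = 0.0
        for m in range(-half, half + 1):
            k = n - m
            if k < 0:
                value = 0.0      # cdf below the first gray level
            elif k >= M:
                value = 1.0      # cdf above the last gray level
            else:
                value = cdf[k]
            smoothed += value / N
        r.append(cdf[n] - smoothed)
    return r
```

For a histogram with a single spike, the signal produced this way goes negative just before the spike and returns to positive values at the spike, which matches the start and maximum rules described in the text.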
The following principles are applied to the detection signal r_N to estimate the start, maximum, and end points of the peaks. (1) A zero-crossing of the detection signal to negative values indicates the start of a peak, denoted by s_i for the ith peak. A zero-crossing of the detection signal to positive values following a negative crossover estimates the gray level at which the peak attains its maximum, and this gray level is denoted by m_i. Similarly, s_{i+1} and m_{i+1} can be obtained. (2) The gray level between two successive negative crossovers at which the detection signal attains its local maximum is defined to be the end point of the peak. For the ith peak, this gray level is denoted by e_i. Each peak is represented by the three parameters (s_i, m_i, e_i) in the rest of this paper.
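The two principles above can be sketched in Python as follows. This is only one possible reading: the function name `find_peaks`, the tolerance `eps` used to decide the sign of the signal, the choice to discard a trailing peak that never crosses back to positive values, and the use of the end of the signal as the search limit for the last peak are all assumptions made for illustration.

```python
def find_peaks(r, eps=1e-12):
    """Extract (s_i, m_i, e_i) triples from a detection signal r."""
    pairs = []           # [start, maximum] for each detected peak
    in_negative = False
    for n, value in enumerate(r):
        if not in_negative and value < -eps:
            # Zero-crossing to negative values: start of a peak.
            pairs.append([n, None])
            in_negative = True
        elif in_negative and value > eps:
            # Zero-crossing back to positive values: peak maximum.
            pairs[-1][1] = n
            in_negative = False

    # Assumption: drop a trailing peak that has a start but no maximum.
    pairs = [p for p in pairs if p[1] is not None]

    peaks = []
    for i, (s, m) in enumerate(pairs):
        # Search up to the next negative crossover (or the end of r).
        stop = pairs[i + 1][0] if i + 1 < len(pairs) else len(r)
        # End point: where r is largest between successive negative crossovers.
        e = max(range(m, stop), key=lambda k: r[k])
        peaks.append((s, m, e))
    return peaks
```

The input is any sequence of detection-signal values indexed by gray level; the output is a list of index triples in increasing gray-level order.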
Obviously, the sensitivity of the above peak detection signal depends on the parameter N in (2), which is referred to as the peak-detection parameter. When this technique is applied to real image histograms, e.g. NOAA-AVHRR image histograms, it is very difficult to determine the value of N, because the detected peaks vary with N. We therefore propose an adjustment method that derives the optimal number of peaks regardless of the value specified for N. The adjustment method uses the square of the Fisher distance (FD^2), shown in (4), and the Mahalanobis generalized distance (MD) to check two successive peaks under the hypothesis of a Gaussian-like distribution (i.e. a bimodality check).
$$\mathrm{FD}^2 = \frac{n\,(m_1 - m_2)^2}{n_1 s_1^2 + n_2 s_2^2} \tag{4}$$
where n_1 and n_2 are the sample numbers of the two distributions, and m_1, m_2 and s_1, s_2 are the corresponding means and standard deviations.
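A direct transcription of (4) into Python might look like the sketch below. The text does not define n; the code assumes n = n_1 + n_2, the total sample number, and the function name `fisher_distance_sq` is hypothetical.

```python
def fisher_distance_sq(n1, m1, s1, n2, m2, s2):
    """Squared Fisher distance of (4) from the statistics of two distributions.

    Assumption: n in (4) is the total sample number n1 + n2.
    """
    n = n1 + n2
    denom = n1 * s1 ** 2 + n2 * s2 ** 2
    if denom == 0:
        raise ValueError("both distributions have zero variance")
    return n * (m1 - m2) ** 2 / denom
```

For example, two distributions of 10 samples each, with means 2 and 6 and unit standard deviations, give 20 * 16 / 20 = 16.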
As pointed out in [7], FD^2 attains its maximum at the point farthest from the center of a Gaussian distribution. We calculate FD^2 and MD for two successive peaks denoted by (s_i, m_i, e_i) and (s_{i+1}, m_{i+1}, e_{i+1}), respectively. If FD^2 attains its maximum outside (m_i, m_{i+1}), we combine the two peaks into a cluster denoted by (s_i, max(m_i, m_{i+1}), e_{i+1}), and repeat this process until no more peaks can be combined. MD is used to calculate the percentages of the two peaks as another adjustment criterion, in order to suppress noise and exclude very small peaks.
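The merging step can be sketched in Python under a number of assumptions, since the text leaves several details open. Here FD^2 is evaluated for every threshold t that splits the gray levels from s_i to e_{i+1} into a lower and an upper class, n in (4) is taken as n_1 + n_2, the merged maximum is taken literally as max(m_i, m_{i+1}), and the MD criterion is omitted entirely. The names `fd2_at` and `merge_peaks` are hypothetical.

```python
def fd2_at(hist, lo, t, hi):
    """FD^2 between gray levels lo..t-1 and t..hi-1 (assumes n = n1 + n2)."""
    def stats(a, b):
        n = sum(hist[a:b])
        if n == 0:
            return 0, 0.0, 0.0
        mean = sum(g * hist[g] for g in range(a, b)) / n
        var = sum(hist[g] * (g - mean) ** 2 for g in range(a, b)) / n
        return n, mean, var

    n1, m1, v1 = stats(lo, t)
    n2, m2, v2 = stats(t, hi)
    if n1 == 0 or n2 == 0:
        return 0.0                      # one class is empty: no separation
    denom = n1 * v1 + n2 * v2
    if denom == 0:
        return float("inf")             # two zero-variance classes
    return (n1 + n2) * (m1 - m2) ** 2 / denom


def merge_peaks(hist, peaks):
    """Repeatedly merge successive peaks that fail the bimodality check."""
    peaks = list(peaks)
    merged = True
    while merged:
        merged = False
        for i in range(len(peaks) - 1):
            (s1, m1, e1), (s2, m2, e2) = peaks[i], peaks[i + 1]
            lo, hi = s1, e2 + 1
            # Threshold at which FD^2 is largest over the joint span.
            best_t = max(range(lo + 1, hi),
                         key=lambda t: fd2_at(hist, lo, t, hi))
            # Inside (m_i, m_{i+1}] means the split separates the two maxima.
            if not (m1 < best_t <= m2):
                peaks[i:i + 2] = [(s1, max(m1, m2), e2)]
                merged = True
                break                   # restart the scan after a merge
    return peaks
```

With this reading, two well-separated humps keep their own peaks because FD^2 is largest in the valley between them, while a single hump with long thin tails is merged because FD^2 is largest at a split near one of the tails.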