Hyperspectral Image Compression Using Three-Dimensional Wavelet Transformation
The multiresolution WT can be implemented using a two-channel perfect reconstruction filter bank as shown in (10) and (11). The first channel is a low-pass filter h and the second channel is a high-pass filter g. The WT of a signal can be obtained by repeatedly applying the two-channel filter bank to the signal in a pyramidal scheme. A separable 3-D WT can be computed by extending the pyramidal algorithm. Suppose the spatial dimensions are labeled x and y, and z denotes the spectral dimension. The decomposition can be performed sequentially, by first filtering along the x direction and then along the y and z directions. Figure 2 shows a one-level 3-D WT.
Figure 3 shows the eight data blocks resulting from a one-level 3-D WT of an image cube. Each data block carries a three-letter label denoting the filter type applied in the x, y, and z directions. L denotes a low-pass filter and H denotes a high-pass filter. The block in the top-left corner (LLL) is the low-frequency portion of the image cube. The other blocks are filtered at least once with a high-pass filter and therefore contain high-frequency components in at least one of the directions. The low-frequency block can be further decomposed into eight more blocks. The WT yields a good representation of the original image for compression purposes. The low-frequency component of the WT contains about 90% of the total energy in most cases. Different blocks and different levels of resolution can be coded in different ways to improve compression results.
Fig. 2. One level of 3-D decomposition.
Fig. 3. The eight data blocks resulting from the first level image cube 3-D WT.
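The one-level separable decomposition described above can be sketched as follows. This is only an illustration under stated assumptions: it uses the orthonormal Haar filter pair as a stand-in for the filters h and g (the paper's actual filters are not reproduced here), it assumes every dimension of the cube has even length (a 145 x 145 x 220 cube would need padding first, which is not handled), and the function names haar_split and dwt3_one_level are hypothetical.

```python
import numpy as np

def haar_split(a, axis):
    """Filter and downsample `a` along `axis` with the orthonormal Haar pair.

    Returns (low, high), each half as long as `a` along `axis`.
    Assumes the length along `axis` is even.
    """
    a = np.moveaxis(a, axis, 0)
    even, odd = a[0::2], a[1::2]
    low = (even + odd) / np.sqrt(2.0)   # low-pass channel (stand-in for h)
    high = (even - odd) / np.sqrt(2.0)  # high-pass channel (stand-in for g)
    return np.moveaxis(low, 0, axis), np.moveaxis(high, 0, axis)

def dwt3_one_level(cube):
    """One level of a separable 3-D transform: x first, then y, then z.

    Returns a dict of eight blocks keyed by three-letter labels such as
    'LLL' or 'HLH', one letter per direction in x, y, z order.
    """
    blocks = {"": np.asarray(cube, dtype=float)}
    for axis in (0, 1, 2):
        next_blocks = {}
        for label, data in blocks.items():
            low, high = haar_split(data, axis)
            next_blocks[label + "L"] = low
            next_blocks[label + "H"] = high
        blocks = next_blocks
    return blocks

# Example on a small synthetic cube (not the AVIRIS data).
rng = np.random.default_rng(0)
cube = rng.normal(size=(8, 8, 8))
blocks = dwt3_one_level(cube)
lll_share = np.sum(blocks["LLL"] ** 2) / np.sum(cube ** 2)
```

Applying dwt3_one_level again to blocks["LLL"] would give the next level of the pyramid. Because the Haar pair is orthonormal, the energies of the eight blocks sum to the energy of the input, so lll_share gives the fraction of energy retained by the low-frequency block for that input.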
3. Optimal Scalar Quantization
The purpose of quantization is to reduce data entropy by decreasing the data precision. A quantization scheme maps a large number of input values into a smaller set of output values. This implies that some information is lost during the quantization process, so the original data cannot be recovered exactly after quantization. The design of a quantization strategy must balance the compression achieved against the information lost. One criterion for optimal quantization is minimizing the mean square error (MSE) for a given quantization scale. The quantization scale is not necessarily uniform, and for most data sets the optimal quantization scale is not uniform. In this study, wavelet coefficients were compressed using the optimal scalar quantization scheme proposed by Lloyd (1957) and Max (1960), which is commonly called Lloyd-Max quantization.
The data blocks resulting from the 3-D WT are represented by floating point values and consist of two types of data: the LLL block, which preserves most of the energy, and the other high-resolution data blocks, which contain the sharp edge information. Figure 4 shows the histograms of a typical LLL data block and a typical high-resolution data block. The LLL block has a much larger data range than the high-resolution data blocks. It is therefore reasonable to use more quantization intervals for the LLL block and relatively few for the other blocks. In this study, we used 256 intervals for the LLL block and fewer than 32 intervals for the other blocks.
Fig. 4. The histograms of (a) a typical LLL data block and (b) a typical high-resolution data block.
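A minimal sketch of a Lloyd-Max quantizer design is given below to make the MSE criterion concrete. It is an empirical version that works on samples rather than on a probability density, and several details are assumptions of this sketch rather than the paper's procedure: the quantile-based initialization, the iteration limit and tolerance, and the function names lloyd_max and quantize.

```python
import numpy as np

def lloyd_max(samples, n_levels, n_iter=100, tol=1e-10):
    """Design a scalar quantizer that locally minimizes MSE on `samples`.

    Alternates two steps: decision thresholds are set midway between
    neighboring output levels, and each output level is moved to the mean
    of the samples that fall in its interval.
    Returns (levels, thresholds).
    """
    x = np.asarray(samples, dtype=float).ravel()
    # Assumed initialization: output levels at evenly spaced quantiles.
    levels = np.quantile(x, (np.arange(n_levels) + 0.5) / n_levels)
    for _ in range(n_iter):
        thresholds = (levels[:-1] + levels[1:]) / 2.0
        idx = np.searchsorted(thresholds, x)
        counts = np.bincount(idx, minlength=n_levels)
        sums = np.bincount(idx, weights=x, minlength=n_levels)
        # Empty intervals keep their previous level.
        new_levels = np.where(counts > 0, sums / np.maximum(counts, 1), levels)
        converged = np.max(np.abs(new_levels - levels)) < tol
        levels = new_levels
        if converged:
            break
    thresholds = (levels[:-1] + levels[1:]) / 2.0
    return levels, thresholds

def quantize(data, levels, thresholds):
    """Map each value to its interval index and to the reconstructed value."""
    idx = np.searchsorted(thresholds, data)
    return idx, levels[idx]

# Example on synthetic, sharply peaked data (not actual wavelet coefficients).
rng = np.random.default_rng(0)
coeffs = rng.laplace(size=10000)
levels, thresholds = lloyd_max(coeffs, n_levels=16)
indices, approx = quantize(coeffs, levels, thresholds)
mse = np.mean((coeffs - approx) ** 2)
```

Following the text, such a design would be run once per data block, for instance with n_levels=256 for the LLL block and a smaller value for the others; the value 16 above is only an illustrative choice below 32. The integer indices are what a subsequent entropy coder would store.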
4. Huffman Coding
The final step in the compression process is coding. Huffman coding (Huffman, 1952) is a minimum-redundancy coding. It assigns fewer bits to values that occur more frequently and more bits to values that occur less frequently. Based on the occurrence frequency of each quantization level, a hierarchical binary coding tree is built by repeatedly taking the two lowest frequencies as tree branches and adding their sum as a new node for the next level. Huffman coding thus allows the data to be stored in less space than the original representation.
Because the data blocks are quantized with different numbers of quantization levels, the coding process should be performed for each data block separately. Compared with equal-length coding, Huffman coding can easily save more than 50% of the required memory space without losing any information.
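The tree construction described above can be sketched with a priority queue. This is an illustrative implementation, not the authors' code; the function names and the tie-breaking rule between equal frequencies are assumptions, and the example frequencies are made up.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a prefix code {symbol: bit string} from symbol frequencies."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): "0"}
    # Heap entries: (frequency, unique tie-breaker, {symbol: partial code}).
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent nodes into a new parent node.
        f1, _, codes1 = heapq.heappop(heap)
        f2, _, codes2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (f1 + f2, next_id, merged))
        next_id += 1
    return heap[0][2]

def encode(symbols, code):
    """Concatenate the code words of all symbols into one bit string."""
    return "".join(code[s] for s in symbols)

def decode(bits, code):
    """Invert `encode`; valid because no code word is a prefix of another."""
    inverse = {c: s for s, c in code.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return out

# Example: four quantization indices with made-up, skewed frequencies.
indices = [0] * 50 + [1] * 25 + [2] * 15 + [3] * 10
code = huffman_code(indices)
bits = encode(indices, code)
```

In this example an equal-length code needs 2 bits per symbol, whereas the Huffman code words have lengths 1, 2, 3, and 3, so the most frequent index costs a single bit. The saving grows as the histogram becomes more sharply peaked. A separate code table would be built for each data block, as the text recommends.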
5. Test Data and Evaluation
5.1 Test Data
The test data is an AVIRIS image downloaded from the web site of Purdue University (http://dynamo.ecn.purdue.edu/~biehl/MultiSpec/). The image size is 145 × 145 pixels with 220 bands, and the pixels are stored as 16-bit words. The storage size is about 9.3 megabytes. Figure 5 shows a perspective picture of the image cube. The image was acquired in 1992 over Indian Pine Site 3, an agricultural area. The ground truth is also available for image classification evaluation. There are 16 land-cover classes, some of which may be grouped into a single land-use type. For example, corn, corn-min, and corn-notill belong to the corn land-use type, but because their crop canopies differ they are categorized into three different land-cover classes. Classes in the same group tend to possess similar spectral properties, so it is usually difficult to differentiate them in a multispectral image. This data set was provided specifically to show the application potential of hyperspectral images.
Fig. 5. A perspective picture of image cube of the test data.
5.2 Evaluations of Image Compression
There are two ways to evaluate the quality of decompressed images: objective and subjective evaluation. The objective method measures the amount of information lost or preserved by comparing the decompressed image with the original one. The amount of information lost may be defined as the mean square error (MSE), and the amount of preserved information can be described by the cross correlation (CC) or the signal-to-noise ratio (SNR). Although MSE and CC objectively gauge the image difference, they are sensitive to strong signals. For most images with textures, SNR is an ideal assessment of how much of the original signal is preserved after an image is recovered from compression, and it is commonly applied in the evaluation of image compression. Therefore, in this paper, SNR is used as the objective measure to evaluate the performance of the developed algorithm.
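The objective measures can be computed as in the sketch below. The passage does not state the exact SNR formula used in the paper, so the common definition as the ratio of signal energy to error energy, expressed in decibels, is assumed here. The function names are hypothetical.

```python
import numpy as np

def mse(original, reconstructed):
    """Mean square error between two arrays of the same shape."""
    a = np.asarray(original, dtype=float)
    b = np.asarray(reconstructed, dtype=float)
    return float(np.mean((a - b) ** 2))

def snr_db(original, reconstructed):
    """Assumed definition: 10 * log10(signal energy / error energy)."""
    a = np.asarray(original, dtype=float)
    b = np.asarray(reconstructed, dtype=float)
    noise = np.sum((a - b) ** 2)
    if noise == 0:
        return float("inf")  # perfect reconstruction
    return float(10.0 * np.log10(np.sum(a ** 2) / noise))

def cross_correlation(original, reconstructed):
    """Pearson correlation coefficient between the two flattened arrays."""
    a = np.asarray(original, dtype=float).ravel()
    b = np.asarray(reconstructed, dtype=float).ravel()
    return float(np.corrcoef(a, b)[0, 1])
```

Under this definition, SNR is normalized by the signal energy, whereas MSE is reported in squared data units and therefore grows with the signal amplitude.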
The subjective evaluation measures the performance of decompressed images in a particular application, for example visualization, feature extraction, or classification. Because image classification is a common application of remotely sensed images, the classification results of the original and decompressed images are compared to obtain subjective measures. Supervised classification using a maximum-likelihood classifier is applied for the classification test. All samples of the ground truth data are used as training data, and they are also used as reference data for accuracy assessment. Theoretically, all spectral bands of the image should be involved in the classification process. However, because the training samples are insufficient, classification fails if all spectral bands are used. To reduce the dimensionality of the feature space, only the first 30 principal components of the image are extracted and used for classification. When the original image is classified, the resulting overall accuracy is 89.39%.
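The classification test can be outlined as follows. This is a sketch under several assumptions that the text does not specify: principal components are taken from the eigenvectors of the band covariance matrix, each class is modeled as a Gaussian with equal prior probability, and a small ridge term is added to each class covariance for numerical stability. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def pca_reduce(pixels, n_components):
    """Project (n_samples, n_bands) data onto its leading principal components."""
    centered = pixels - pixels.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(centered, rowvar=False))
    order = np.argsort(eigvals)[::-1][:n_components]  # largest variance first
    return centered @ eigvecs[:, order]

def fit_gaussian_ml(features, labels, ridge=1e-6):
    """Estimate a mean vector and covariance matrix for every class."""
    model = {}
    for c in np.unique(labels):
        x = features[labels == c]
        cov = np.cov(x, rowvar=False) + ridge * np.eye(x.shape[1])  # assumed ridge
        _, logdet = np.linalg.slogdet(cov)
        model[c] = (x.mean(axis=0), np.linalg.inv(cov), logdet)
    return model

def predict_gaussian_ml(model, features):
    """Assign each sample to the class with the highest Gaussian log-likelihood."""
    classes = list(model)
    scores = []
    for c in classes:
        mean, inv_cov, logdet = model[c]
        d = features - mean
        mahalanobis = np.einsum("ij,jk,ik->i", d, inv_cov, d)
        scores.append(-0.5 * (logdet + mahalanobis))
    best = np.argmax(np.stack(scores, axis=1), axis=1)
    return np.array(classes)[best]

# Example with two synthetic classes in 10 bands (not the AVIRIS data).
rng = np.random.default_rng(0)
pixels = np.vstack([rng.normal(0.0, 1.0, size=(200, 10)),
                    rng.normal(4.0, 1.0, size=(200, 10))])
labels = np.repeat([0, 1], 200)
features = pca_reduce(pixels, n_components=3)
model = fit_gaussian_ml(features, labels)
# As in the text, the training samples also serve as reference data.
overall_accuracy = np.mean(predict_gaussian_ml(model, features) == labels)
```

For the test described above, n_components would be 30, and the same procedure would be run on the original and on each decompressed image so that the overall accuracies can be compared. A class covariance estimated in 220 dimensions from too few training samples is singular or poorly conditioned, which is consistent with the text's reason for reducing the dimensionality first.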