Optimum Feature Selection for Classification of LIDAR Data Using Genetic Algorithms

F. Samadzadegan
Department of Geomatics Engineering, Faculty of Engineering, University of Tehran
samadz@ut.ac.ir

A. Javaheri
Department of Geomatics Engineering, Faculty of Engineering, University of Tehran
a.javahery@yahoo.com

Because of the low resolution of commercially available LIDAR systems, it is difficult to classify objects correctly from LIDAR range data alone. To improve classification performance, additional data should be considered, mainly the first and last LIDAR pulses, the intensity of the returned laser beam, and a color aerial image. Many features (pattern descriptors) have been developed from different combinations of this information. Nevertheless, there are no theoretical guidelines that suggest which features are appropriate for a specific classification situation. The presented method uses a genetic algorithm for feature selection. Genetic algorithms (GAs), a form of inductive learning strategy, are adaptive search techniques that have demonstrated substantial improvement over a variety of random and local search methods. We use the most popular classifier, the maximum likelihood classifier, to evaluate the quality of the output of the proposed optimum feature selection method. A framework for quality assessment, based on similarity measures between classified data and reference data, has been proposed and tested. The numerical investigation of the results demonstrates the high capability of the proposed method for determining the optimum features for classification of LIDAR data, and the results show that image classification with the optimum feature subset increases the overall accuracy.

1. Introduction
Recognition and reconstruction of real-world objects is a major goal in many fields of research, such as photogrammetry, machine vision and vision metrology.
In this field, an object can be described by its textural, structural and spectral properties. The textural property reflects the fact that images of real objects often do not exhibit regions of uniform intensity; image texture is defined as a function of the spatial variation in pixel intensities (Tuceryan 1998). Structural features describe the geometry of an object. Finally, the electromagnetic radiation reflected by objects of the same nature is similar overall, so these objects have similar spectral properties. The simplest way to classify an image is to use all extractable features simultaneously in the classification algorithm, but there are a number of inter-related reasons why feature selection is desirable.
Feature subset selection algorithms can be classified into two categories. If feature selection is done independently of the learning algorithm, the technique is said to follow a filter approach; otherwise, it is said to follow a wrapper approach (M. Sebban, 2001). The filter approach is computationally more efficient, but its major drawback is that an optimal selection of features may not be independent of the inductive and representational biases of the learning algorithm used to build the classifier. The wrapper approach, on the other hand, incurs the computational overhead of evaluating each candidate feature subset by executing the selected learning algorithm on the data set. Wrapper-based algorithms fall into three categories: sequential, exponential and randomized search. The genetic algorithm is a type of randomized search strategy. GAs apply naturally to the optimum feature subset selection problem, and there has been considerable interest in this area in the last decade. In this paper, genetic algorithms are applied to optimum feature subset selection.

2. Genetic Search
Genetic algorithms (GAs) are adaptive heuristic search algorithms based on the evolutionary ideas of natural selection and genetics. They are designed to simulate the processes in natural systems that are necessary for evolution. The main operators a genetic algorithm uses to search the pool of possible solutions are crossover, mutation and elitism. The usual approach to using GAs for feature selection encodes a set of d features as a binary string of d elements, in which a 0 indicates that the corresponding feature is omitted and a 1 that it is included. This coding scheme represents the presence or absence of each feature of the feature space (Figure 1). The length of the chromosome equals the dimension of the feature space.

Figure 1. Designed chromosome for subset selection

3.
Maximum Likelihood Classifier
Classification can be defined as the association of a land use/land cover attribute with every pixel of an image (Duda and Hart, 1973). Supervised maximum likelihood (ML) image classification begins by computing statistics for user-selected training feature vectors of the land cover classes and then uses this statistical summary to classify the image. To classify the image, the probability that each feature vector belongs to each class is calculated, and the image pixel is assigned to the class for which this probability is highest. The computation of the probabilities is given by:

[Equation (1): discriminant function G_i(x)]

where μ is the mean vector and Σ is the covariance matrix of class i. If G_i(x) > G_j(x) for every j ≠ i, then pixel x belongs to class i.

4. Data set
The airborne LIDAR data used in the experimental investigations were recorded over a city in Germany. The pixel size of the range images is one meter, so the point density is one point per m². Intensity images for the first and last pulse data were also recorded, with the intention of using them in the experimental investigations as well. Furthermore, a color aerial image was available to describe the spectral properties of objects. The feature space has 8 members, which constitute a subcategory. The pool of possible features, or feature space, contains:
[Equation (2): members of the feature space]

Figure 2. First-pulse range image and last-pulse range image
Figure 3. Color aerial image and NDDI image
Figure 4. Intensity image and last-pulse intensity image

5. Our work
The classification process is composed of the following steps:
Step 1: Data preparation, comprising co-registration of the LIDAR data and the aerial image, noise reduction and filtering of the LIDAR data.
Step 2: Generation of the pool of possible solutions.
Step 3: Optimum feature selection. The optimum feature subset selection is illustrated in the following diagram:
[Diagram of the optimum feature subset selection process]
Step 4: After the selection process, the whole image is classified with the optimum feature subset using the ML classifier.

5.1. Objective Function
The goal of supervised image classification is to assign image pixels to classes with the highest possible accuracy, and the optimal feature set is the one that achieves this. Accordingly, the fitness evaluation is the mechanism used to determine how close a candidate solution comes to this highest accuracy. In this work we use the confusion matrix for accuracy assessment and extract from it the kappa parameter, which is computed as follows:

[Equation (3): kappa parameter]

The maximum value of the kappa parameter is 1. Because this type of genetic algorithm minimizes the fitness value, the fitness function is defined as:

[Equation (4): fitness function]

5.2. Parameter Setting
Our experiment used the following parameter setting for the genetic algorithm.
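The chromosome encoding of Section 2, the ML decision rule of Section 3 and the kappa-based objective of Section 5.1 can be tied together in a minimal Python sketch. This is not the authors' MATLAB implementation, and several details are assumptions made only for illustration: the discriminant is the standard Gaussian form with equal class priors, kappa is Cohen's kappa computed from the confusion matrix, the fitness is taken to be 1 - kappa, selection is a binary tournament, crossover is single-point, and candidate subsets are scored on a held-out sample. All function names and default parameter values (`pop_size`, `generations`, `p_cross`, `p_mut`, `n_elite`) are hypothetical placeholders, not the settings used in the paper.

```python
import numpy as np

def fit_ml(X, y):
    """Training statistics of the ML classifier: per-class mean and covariance."""
    stats = {}
    for c in np.unique(y):
        Xc = X[y == c]
        # Small ridge keeps the covariance invertible (assumption, not from the paper).
        cov = np.atleast_2d(np.cov(Xc, rowvar=False)) + 1e-6 * np.eye(X.shape[1])
        stats[c] = (Xc.mean(axis=0), cov)
    return stats

def predict_ml(stats, X):
    """Assign each row to the class with the largest Gaussian discriminant."""
    classes = sorted(stats)
    scores = []
    for c in classes:
        mu, cov = stats[c]
        diff = X - mu
        maha = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
        # Assumed form with equal priors: G_i(x) = -ln|S_i| - (x-mu_i)^T S_i^-1 (x-mu_i)
        scores.append(-np.linalg.slogdet(cov)[1] - maha)
    return np.array(classes)[np.argmax(scores, axis=0)]

def kappa(y_true, y_pred, classes):
    """Cohen's kappa from the confusion matrix."""
    idx = {c: i for i, c in enumerate(classes)}
    cm = np.zeros((len(classes), len(classes)))
    for t, p in zip(y_true, y_pred):
        cm[idx[t], idx[p]] += 1
    n = cm.sum()
    po = np.trace(cm) / n                                  # observed agreement
    pe = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n**2    # chance agreement
    return 0.0 if pe == 1.0 else (po - pe) / (1.0 - pe)

def fitness(mask, X_tr, y_tr, X_te, y_te):
    """Value to minimise; assumed to be 1 - kappa."""
    if not mask.any():
        return 1.0  # empty subset is scored as chance-level agreement (kappa = 0)
    cols = np.flatnonzero(mask)
    pred = predict_ml(fit_ml(X_tr[:, cols], y_tr), X_te[:, cols])
    return 1.0 - kappa(y_te, pred, np.unique(y_tr))

def ga_select(X_tr, y_tr, X_te, y_te, pop_size=20, generations=30,
              p_cross=0.8, p_mut=0.05, n_elite=2, seed=0):
    """Binary-chromosome GA: bit j is True when feature j is included."""
    rng = np.random.default_rng(seed)
    d = X_tr.shape[1]
    pop = rng.integers(0, 2, size=(pop_size, d)).astype(bool)
    pop[0] = True  # seed with the all-features chromosome as a baseline
    score = lambda m: fitness(m, X_tr, y_tr, X_te, y_te)
    for _ in range(generations):
        fit = np.array([score(m) for m in pop])
        pop = pop[np.argsort(fit)]                          # best chromosome first
        children = [pop[i].copy() for i in range(n_elite)]  # elitism
        while len(children) < pop_size:
            # Binary tournament: the lower index is the fitter parent.
            a = pop[rng.integers(0, pop_size, 2).min()]
            b = pop[rng.integers(0, pop_size, 2).min()]
            child = a.copy()
            if d > 1 and rng.random() < p_cross:            # single-point crossover
                cut = rng.integers(1, d)
                child[cut:] = b[cut:]
            child ^= rng.random(d) < p_mut                  # bit-flip mutation
            children.append(child)
        pop = np.array(children)
    fit = np.array([score(m) for m in pop])
    return pop[np.argmin(fit)], 1.0 - fit.min()             # best mask and its kappa
```

Calling `ga_select(X_train, y_train, X_val, y_val)` on arrays with one column per candidate feature returns a Boolean mask over the features together with the kappa it achieves on the validation sample. Because the all-features chromosome is placed in the initial population and the elite individuals are copied unchanged, the returned kappa in this sketch can never be lower than the kappa obtained with all features on the same split.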
6. Experiment and Result
The main goal of these experiments is to optimize the feature set presented in Section 4, in order to reduce the complexity of the pattern recognition problem and increase its overall accuracy. The concept described above was implemented in MATLAB 7.1. The classification results are shown in Figures 5 and 6.

Figure 5. Recognition result for the Tree and Road classes
Figure 6. Recognition result for the Grassland and Building classes

During the analysis of the classification results, quality assessment was performed by comparing overall accuracy and the kappa coefficient. In general, classification with the optimum features leads to an overall accuracy of about 0.928 (92.8%). This is 3% higher than the overall accuracy obtained using all of the features. Furthermore, the accuracy improvement for the Building class is larger than for the Grassland class. Table 1 shows the confusion matrix of the image classification with the optimum feature subset.

[Table 1]

7. Conclusion
We have presented the results of applying the ML classification technique to LIDAR data and an aerial image for 3D and 2D object recognition. The results show the benefit of using these data sets simultaneously, and they show that the optimum feature subset leads to an improvement in classification accuracy.

8. References