Multi-View Technique for 3-D Robotic Object Recognition System using Neuro-Fuzzy Method

D. Naga jyothi
Lecturer, School of Computing and Information Technology
Inti College Malaysia, Jalan BBN 12/1, Bandar baru nilai, 71800, Negeri Sembilan, Malaysia
Fax: (6)06-799 7531/13, Tel: (6)06-798 2000 (Ext: 2451)
E-mail: jyothi@intimal.edu.my
Abstract

Object recognition is one of the most challenging goals in robotic vision systems, and the difficulty increases when the objects are three-dimensional (3-D). Many researchers have proposed their own solutions to this problem. In this paper, a multi-view 3-D object recognition system based on a neuro-fuzzy system is proposed. The system consists of two stages. The first stage is feature extraction. In practice, many techniques are available for feature extraction, such as moment invariants, Fourier transform coefficients and the M-transform. We use moment invariants for feature extraction. The second stage is recognition, for which we use a Multiple Adaptive Network based Fuzzy Inference System (MANFIS) as the classifier. The method has been tested on five different objects and reached a recognition accuracy of up to 84.44%.

I. Introduction

An object recognition system finds objects in the real world from an image of the world, using object models that are known a priori [9]. Recognition is one of the hardest problems in computer vision. Although humans perform object recognition effortlessly and instantaneously, an algorithmic description of this task for implementation on a machine has been very difficult, especially for 3-D objects. In robotics applications such as object grasping or manipulation, efficient 3D object recognition assists in faster identification and localization for real-time dynamic arm motion control. In general, most model-based object recognition systems consider the problem of recognizing objects from the image of a single view [2][5][22]. However, a single view may not contain sufficient features to recognize the object. In addition, single-view recognition requires complex feature sets, which makes the recognition process time consuming [7][22].
To overcome this problem, some researchers have proposed modeling 3D objects using multiple 2D views, which summarise the set of possible 2D appearances of a 3D object. An early example is the aspect graph proposed by Koenderink and van Doorn [13]. An aspect graph represents all stable 2D views of a 3D object. However, the extraordinarily large size and complexity of aspect graphs, even for simple objects, has hindered the use of this method. Edelman and Bulthoff [4] found a strong and stable correlation between recognition performance and viewpoint variation, and suggested representing objects by multiple viewpoint-specific 2D representations. Murase and Nayar [17] and Nene [18] developed a parametric eigenspace method to recognize 3D objects directly from their appearance. This technique, however, is not robust to occlusion and does not indicate how to optimize the size of the database with respect to the types of objects considered for recognition and their respective eigenspace dimensionality.

Recently, some papers have proposed effective recognition algorithms using neural networks. The advantage of this model is its ability to learn from a training data set and make predictions on other data sets. Lin et al. [15] and Nasrabadi and Li [19] used Hopfield neural networks for their 3D object recognition systems. Compared with conventional 3D object recognition, this provides a more general and parallel implementation paradigm. Lu et al. [16] recognized 3D objects using a back-propagation algorithm, which has been commonly used in pattern recognition applications. Among other works using neural networks, Foresti and Pieroni [6] used a neural tree (NT), Ham and Park [7] used a hidden Markov model-based system combined with neural networks, and Carpenter and Ross [3] used ART-EMAP networks. However, neuro-fuzzy systems, which combine a neural network with a fuzzy system, are not widely used in 3D object recognition.
In this paper, we use a type of neuro-fuzzy system called Multiple Adaptive Network based Fuzzy Inference System (MANFIS) to perform 3D object recognition.

II. System Overview

In this section, a methodology for image acquisition and data extraction is presented. A 3D object recognition system using multiple views was developed. The system aims to recognize 3D objects that stand alone, are separated, and are independent of each other. The object and camera set-up for the proposed system is illustrated in Fig. 1. The object is placed on a turntable. Three B/W CCD cameras capture images simultaneously from different viewpoints (different angles). These cameras are fixed at the same height (y coordinate), at 45° from the center of the turntable. The cameras must have the same focal length and the same distance from the center of the turntable. The angles separating camera 1 from camera 2, and camera 2 from camera 3, are fixed at 45°. We take the location of camera 1 as the reference point (scene at 0°). In the first condition, camera 1 views the scene at 0°, camera 2 views the scene at 45° and camera 3 views the scene at 90°. Fig. 6 shows an example of images taken from the cameras at the reference point. Next, the object is rotated 5° clockwise to obtain the second condition. In this condition, camera 1 views the scene at 5°, camera 2 views the scene at 50° and camera 3 views the scene at 95°. For image acquisition, each object is rotated through 360° and images are captured at every 5° of rotation. Hence, each object has 72 conditions after a complete 360° rotation.

Captured images are then digitized by the DT3155 frame grabber from Data Translation Inc. and sent to the pre-processing and feature extraction stage. In this study, moment invariants are used as features because they are invariant to changes in position, orientation and scale. The algorithm has been commonly used in pattern recognition because it captures geometrical properties of an object.
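To make the invariance property concrete, the following is a minimal sketch, assuming a binary silhouette image and plain NumPy, of Hu's first moment invariant, phi1 = eta20 + eta02, where eta_pq are the normalised central moments. The names `first_hu_invariant` and `rectangle`, and the synthetic test shapes, are hypothetical and chosen only for illustration; they are not the pre-processing pipeline used in this work.

```python
import numpy as np


def first_hu_invariant(image):
    """Return Hu's first moment invariant, phi1 = eta20 + eta02."""
    img = np.asarray(image, dtype=float)
    ys, xs = np.mgrid[:img.shape[0], :img.shape[1]]  # row and column indices
    m00 = img.sum()                                   # zeroth-order moment (area)
    if m00 == 0:
        raise ValueError("image has no foreground pixels")
    x_bar = (xs * img).sum() / m00                    # centroid
    y_bar = (ys * img).sum() / m00
    mu20 = (((xs - x_bar) ** 2) * img).sum()          # central moments
    mu02 = (((ys - y_bar) ** 2) * img).sum()
    # Normalised central moments: eta_pq = mu_pq / m00 ** (1 + (p + q) / 2),
    # so for p + q = 2 the divisor is m00 ** 2.
    return (mu20 + mu02) / m00 ** 2


def rectangle(shape, top, left, height, width):
    """Hypothetical helper: binary image containing one filled rectangle."""
    img = np.zeros(shape)
    img[top:top + height, left:left + width] = 1.0
    return img


base = first_hu_invariant(rectangle((64, 64), 5, 5, 10, 20))
shifted = first_hu_invariant(rectangle((64, 64), 30, 25, 10, 20))   # translated
rotated = first_hu_invariant(rectangle((64, 64), 5, 5, 20, 10))     # rotated by 90 degrees
scaled = first_hu_invariant(rectangle((128, 128), 10, 10, 20, 40))  # scaled by 2
print(base, shifted, rotated, scaled)
```

On a pixel grid, the translated and rotated values agree with the base value to floating-point precision, while the scaled value agrees only approximately because of discretisation.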
Figure 1: Image acquisition set-up

Figure 2: System configuration

Moment invariants also take little processing time because the algorithm is simple. Some works using moment descriptions and their properties can be found in [7, 14, 21]. All the features extracted from the various viewpoints are presented as input to the recognition stage. Fig. 2 depicts the overall proposed system.

The invariance properties of moments of 2-D and 3-D shapes have received considerable attention in recent years. They are useful because they define an easily calculated set of region properties that can be used for shape classification and part recognition. Hu [8] derived a set of invariants based on combinations of regular moments using algebraic invariants. These quantities are invariant under changes of size, translation and rotation. In this work, the first moment invariant is used, as suggested in [23].

III. Neuro-fuzzy system

A neuro-fuzzy system is a combination of a neural network and a fuzzy system in which neural network learning algorithms are used to determine the parameters of the fuzzy system [20]. ANFIS is a neuro-fuzzy model proposed by Jang [11]. The structure of ANFIS with five layers is shown in Fig. 3, where x and y are the inputs. Note that the input layer is not counted as an ANFIS layer.

Figure 3: ANFIS Architecture

As the learning rule of ANFIS, a hybrid learning algorithm [4,5], which combines gradient descent and the least-squares method, is used to find a feasible set of parameters. Table 1 shows the hybrid learning procedure for ANFIS. Further information can be obtained from [10, 11, 12, 20].
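The five-layer structure and the least-squares half of the hybrid learning rule can be sketched as follows. This is an illustrative sketch only, not the implementation used in this work: it assumes generalised bell membership functions, a grid partition of the input space (one rule per combination of MFs) and first-order Sugeno consequents. All function names are hypothetical, and the gradient-descent update of the premise parameters is omitted.

```python
import itertools
import numpy as np


def bell_mf(x, a, b, c):
    """Generalised bell membership function: 1 / (1 + |(x - c) / a| ** (2b))."""
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2 * b))


def normalised_firing(X, premise):
    """Layers 1 to 3: membership grades, rule firing strengths, normalisation.

    X has shape (n_samples, n_inputs); premise has shape (n_inputs, n_mf, 3)
    and holds the (a, b, c) parameters of every membership function.
    """
    n_inputs, n_mf, _ = premise.shape
    # Layer 1: mu[i][:, j] is the grade of input i in its j-th MF.
    mu = [np.stack([bell_mf(X[:, i], *premise[i, j]) for j in range(n_mf)], axis=1)
          for i in range(n_inputs)]
    # Layer 2: firing strength of each rule is the product of its MF grades.
    rules = list(itertools.product(range(n_mf), repeat=n_inputs))
    w = np.stack([np.prod([mu[i][:, j] for i, j in enumerate(rule)], axis=0)
                  for rule in rules], axis=1)
    # Layer 3: normalise so that the firing strengths of each sample sum to 1.
    return w / w.sum(axis=1, keepdims=True)


def design_matrix(X, w_bar):
    """Layer 4 regressors: w_bar_r * [x_1, ..., x_n, 1] for every rule r."""
    X1 = np.hstack([X, np.ones((X.shape[0], 1))])
    return (w_bar[:, :, None] * X1[:, None, :]).reshape(X.shape[0], -1)


def fit_consequents(X, y, premise):
    """Least-squares estimate of the consequent parameters (premise fixed)."""
    A = design_matrix(X, normalised_firing(X, premise))
    theta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return theta


def predict(X, premise, theta):
    """Layer 5: overall output as the sum of the weighted rule outputs."""
    return design_matrix(X, normalised_firing(X, premise)) @ theta


# Toy data (assumed for illustration): two inputs, two MFs per input, linear target.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(50, 2))
y = 2.0 * X[:, 0] - X[:, 1] + 0.5
premise = np.array([[[0.5, 2.0, 0.0], [0.5, 2.0, 1.0]]] * 2)
theta = fit_consequents(X, y, premise)
max_error = np.max(np.abs(predict(X, premise, theta) - y))
print(theta.shape, max_error)
```

Because the normalised firing strengths sum to one, a linear target is exactly representable by the consequents, which makes it a convenient sanity check for the least-squares step.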
However, ANFIS by itself is only suitable for single-output systems. For a system with multiple outputs, several ANFIS are placed side by side to form a Multiple ANFIS (MANFIS) [12]. The number of ANFIS required is determined by the number of required outputs. Fig. 4 shows a MANFIS with five outputs. Since the input data are the same for each ANFIS, all of them also share the same initial parameters, such as the initial step size, the membership function (MF) type and the number of MFs.

Figure 4: MANFIS with five outputs

IV. Experimental results

To examine the performance of the system, we selected five 3D objects for recognition. Some examples are shown in Fig. 5. As mentioned earlier, each object has 72 conditions. We use the odd conditions (1, 3, 5, …, 71) as training data and the even conditions (2, 4, 6, …, 72) for testing, so that the views in the testing images never appear in the training process. Hence, for five objects, we have 180 training samples (5 objects × 36 conditions) and 180 testing samples. A MANFIS with five outputs was used to perform this task.

Figure 5: Example of objects used in the experiment

We analyzed the MANFIS performance using different initial parameter sets. First, we ran the system using MF=2 with initial step sizes of 0.01, 0.05, 0.10, 0.25 and 0.35. Increasing the initial step size increases the learning rate of the ANFIS. However, if the step size is set too large (e.g. 0.35), the system fails to learn properly. Table 2 summarizes the system performance.
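The acquisition schedule and the odd/even split described above can be sketched as follows. The names are hypothetical, and wrapping the viewing angles at 360° is an assumption made for illustration; the sketch only enumerates conditions and does not touch image data.

```python
CAMERA_OFFSETS_DEG = (0, 45, 90)    # cameras 1, 2, 3 relative to the reference point
STEP_DEG = 5                        # turntable rotation between conditions
N_CONDITIONS = 360 // STEP_DEG      # 72 conditions per object


def camera_angles(condition):
    """Viewing angles (degrees) of the three cameras for a 1-based condition."""
    rotation = (condition - 1) * STEP_DEG
    # Assumption: angles are wrapped into [0, 360).
    return tuple((rotation + offset) % 360 for offset in CAMERA_OFFSETS_DEG)


def split_conditions(n_objects=5):
    """Odd conditions go to training, even conditions go to testing."""
    train, test = [], []
    for obj in range(n_objects):
        for condition in range(1, N_CONDITIONS + 1):
            target = train if condition % 2 == 1 else test
            target.append((obj, condition))
    return train, test


train_set, test_set = split_conditions()
print(camera_angles(1), camera_angles(2))
print(len(train_set), len(test_set))
```

Each (object, condition) pair corresponds to one input vector built from the three camera views, so the two sets are disjoint by construction.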
We also analyzed the system performance with MF=3 and MF=4. MF=5 and above are not suitable for the analysis, since the number of training data is then smaller than the number of adjustable parameters in the network. Tables 3 and 4 summarize the results for each number of MFs.
The results show that the number of MFs and the initial step size affect the system performance, so both must be selected properly. The system produces the best result at MF=4 with an initial step size of 0.10, reaching 84.44% recognition accuracy. However, MF=2 is adequate for good and fast recognition, with a slightly lower accuracy of 82.78%.
V. Conclusion

A multiple-view 3D robotic object recognition system using a neuro-fuzzy system is proposed in this paper. Our experiments show that 3D objects can be modeled and represented by a set of multiple 2D views. In addition, the approach does not require complex feature sets for 3D object modeling, which reduces the processing time of the feature extraction stage. Our experiments also show that a neuro-fuzzy system can perform well in a 3D object recognition task even with a simple feature. While we use a simple feature for the purpose of illustration, one may use or combine other features such as edges, Zernike moments, texture or corners to improve the performance of the system. Future work will compare the approach with other neural networks and/or neuro-fuzzy systems, and implement the system in robotic arm object handling and motion planning applications.

VI. References
Figure 6: Image scenes from different views at the reference point