



Improving classification accuracy using knowledge based approach


In this equation, the left-hand term is the probability that the measurement vector takes on the values Xi, given that the object measured is a member of class wk.

This probability could be determined by sampling a population of measurement vectors for observations known to be from class wk. In practice, however, the distribution of such vectors is usually assumed to be Gaussian. Thus, we can assume that P{Xi | wk} is acceptably estimated by the multivariate density function Fk(Xi) and rewrite Equation (6) as


Rearranging the equation

Thus, the numerator of Equation (5) can be evaluated as the product of the multivariate density function Fk(Xi) and the prior probability of occurrence of class wk. To evaluate the denominator of Equation (5), we use the fact that the conditional probabilities over all k classes must sum to 1:


This equation provides the basis for the decision rule which includes prior probabilities. Since the denominator remains constant for all classes, the observation is simply assigned to the class for which Fk*(Xi), the product of Fk(Xi) and P{wk}, is a maximum. Taking -2 times the natural logarithm of this product turns the maximization into a minimization, so in its simplest form the decision rule can be stated as R2: choose the k which minimizes F2,k(Xi), which is F1,k(Xi) augmented by the prior term -2 ln P{wk}.
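The equation images belonging to this derivation are not reproduced in the text. The following is a hedged reconstruction from the surrounding definitions, not the authors' typeset equations. It assumes that Equation (5) is the posterior written as a joint probability over P{Xi}, that Equation (6) is the corresponding conditional form of P{Xi | wk}, and that F1,k(Xi) (defined on an earlier page) is the Gaussian log-likelihood distance ln|Dk| + (Xi-mk)'Dk^-1(Xi-mk).

```latex
% Assumed forms of Equations (5) and (6) from the preceding page:
%   (5)  P{w_k | X_i} = P{w_k, X_i} / P{X_i}
%   (6)  P{X_i | w_k} = P{X_i, w_k} / P{w_k}
\begin{align}
  % Equation (6) with the Gaussian density F_k substituted for P{X_i | w_k}
  F_k(X_i) &= \frac{P\{X_i, w_k\}}{P\{w_k\}} \\
  % Rearranged: the numerator of Equation (5)
  P\{X_i, w_k\} &= F_k(X_i)\, P\{w_k\} \\
  % Denominator of Equation (5), since the posteriors sum to 1 over all classes
  P\{X_i\} &= \sum_{j} F_j(X_i)\, P\{w_j\} \\
  % Posterior probability including the priors
  P\{w_k \mid X_i\} &= \frac{F_k(X_i)\, P\{w_k\}}{\sum_{j} F_j(X_i)\, P\{w_j\}} \\
  % Decision rule R2: -2 ln of the numerator, additive constants dropped
  F_{2,k}(X_i) &= \ln\lvert D_k\rvert + (X_i - m_k)'\, D_k^{-1}\, (X_i - m_k) - 2 \ln P\{w_k\}
\end{align}
```

Under these assumptions, maximizing Fk(Xi) P{wk} and minimizing F2,k(Xi) select the same class, because the logarithm is monotonic and the factor -2 reverses the ordering.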


It is important to understand how this decision rule behaves with different prior probabilities. If the prior probability P{wk} is very small, then its natural logarithm will be a large negative number; when multiplied by -2, it will become a large positive number, and thus F2,k for such a class will never be minimal. Therefore, setting a very small prior probability will effectively remove a class from the output classification. Note that this effect will occur even if the observation vector Xi is coincident with the class mean vector mk. In such a case, the quadratic product distance function (Xi-mk)'Dk^-1(Xi-mk) goes to zero, but the prior probability term -2 ln P{wk} can still be large. Thus, it is entirely possible that the observation will be classified into a different class, one for which the distance function is quite large.

As the prior probability P{wk} becomes large and approaches 1, its logarithm will go to zero and F2,k will approach F1,k for that class. Since this probability and all the others must sum to one, however, the prior probabilities of the remaining classes will be small numbers and their values of F2,k will be greatly augmented. The effect will be to force classification into the class with the high prior probability. Therefore, the more extreme the values of the prior probabilities, the less important the actual observation vector Xi becomes.
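The behaviour described in the two paragraphs above can be checked numerically. The following is a minimal sketch, not the authors' software. It assumes F1,k(Xi) = ln|Dk| + (Xi-mk)'Dk^-1(Xi-mk), whose definition lies on an earlier page, and it uses two hypothetical classes whose means, covariances and priors are illustrative values only. The function names `discriminant_f2` and `classify` are likewise hypothetical.

```python
import numpy as np


def discriminant_f2(x, means, covs, priors):
    """Return F2,k(x) for every class k.

    Assumed form: ln|D_k| + (x - m_k)' D_k^-1 (x - m_k) - 2 ln P{w_k}.
    """
    scores = []
    for m, D, p in zip(means, covs, priors):
        diff = x - m
        _, logdet = np.linalg.slogdet(D)          # ln|D_k|
        quad = diff @ np.linalg.solve(D, diff)    # quadratic distance term
        scores.append(logdet + quad - 2.0 * np.log(p))
    return np.array(scores)


def classify(x, means, covs, priors):
    """Rule R2: choose the class index k that minimizes F2,k(x)."""
    return int(np.argmin(discriminant_f2(x, means, covs, priors)))


# Two hypothetical classes in a two-band feature space (illustrative values).
means = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]
covs = [np.eye(2), np.eye(2)]

# Observation coincident with the mean vector of class 0.
x = means[0].copy()

# Equal priors: the distance term decides, so class 0 is chosen.
label_equal = classify(x, means, covs, [0.5, 0.5])

# Very small prior for class 0: the -2 ln P{w_k} term outweighs the
# zero distance, and the observation is assigned to class 1 instead.
label_skewed = classify(x, means, covs, [1e-6, 1.0 - 1e-6])

print(label_equal, label_skewed)
```

With these illustrative numbers, the squared distance to class 1 is 18, while -2 ln(1e-6) is about 27.6, so the prior term alone is enough to push the observation out of the class whose mean it coincides with.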

Experimental Work
Training data were collected for each class, and the image was then classified using the maximum likelihood approach. The a priori probabilities of all classes were assumed to be equal. Figure 3 shows the classified image.


Figure 3. Classified image produced by the maximum likelihood approach with equal a priori probabilities.

The overall accuracy of this approach is 52%. At this stage, rule maps for the 8 crops can be calculated; these form the basis for the software's decision making. For example, Table 1 shows the rule matrix for class W (the probability that each pixel belongs to class W).

Table 1. Rule matrix for class W.

Since the rule matrices must sum to one across all classes, Table 1 is modified to give Table 2:

Table 2. Sum of rule matrices for 8 classes.
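One way to enforce the sum-to-one constraint is to divide each class's rule map by the per-pixel total over all classes. The paper does not state how its software performs this step, so the sketch below is an assumption. The function name `normalize_rule_maps`, the array layout (class, row, column) and the sample values are hypothetical.

```python
import numpy as np


def normalize_rule_maps(rule_maps, eps=1e-12):
    """Scale per-class rule maps so that they sum to 1 at every pixel.

    rule_maps: array of shape (n_classes, rows, cols) with non-negative values.
    Pixels whose total over all classes is zero are left as zeros.
    """
    rule_maps = np.asarray(rule_maps, dtype=float)
    total = rule_maps.sum(axis=0, keepdims=True)   # per-pixel sum over classes
    return rule_maps / np.maximum(total, eps)      # eps avoids division by zero


# Hypothetical rule maps for 3 classes on a 2 x 2 image (illustrative values).
raw = np.array([
    [[0.2, 0.1], [0.4, 0.3]],
    [[0.2, 0.3], [0.4, 0.1]],
    [[0.4, 0.1], [0.2, 0.2]],
])

normalized = normalize_rule_maps(raw)
print(normalized.sum(axis=0))  # every pixel now sums to 1 across classes
```

In the paper's setting the first axis would hold the 8 crop classes rather than 3.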
