  • ACRS 2000


    Landuse
    Machine Learning Methods to Identify Mislabeled Training Data and Appropriate Features for Global Land Cover Classification



    Data
    The training data set is obtained by sampling from the AVHRR 1km global vegetation cover product (Hansen et al., 2000). A total of 37340 cases were extracted. The formation of the 41 metrics is described in detail in Hansen et al. (2000). The training data are thoroughly shuffled to prevent any correlation between training and testing.
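The shuffling step can be sketched as follows. This is a minimal illustration only: the helper name `shuffle_split`, the 25% test fraction and the fixed seed are assumptions made for the example and are not stated in the paper.

```python
import random

def shuffle_split(cases, test_fraction=0.25, seed=0):
    """Shuffle the cases and split them into (train, test) lists.

    test_fraction and seed are hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)       # fixed seed so the split is repeatable
    shuffled = list(cases)          # copy, so the caller's data is untouched
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# 37340 case indices stand in for the sampled training cases.
train, test = shuffle_split(range(37340))
```

Shuffling before splitting avoids a split that follows the sampling order, which could otherwise place neighbouring pixels systematically in one partition.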

    Discussions and Conclusions
    Table 1. Number of cases identified as mislabels
    Method              Number of cases
    Expert knowledge    11732
    LVQ                 9058
    DTC                 7438
    IB                  5376


    Table 2. Agreements between individual data modelers
            IB      DTC
    LVQ     3725    4389
    DTC     3200


    Filtering methods to identify mislabeled data
    An improved version of the training data, revised using expert opinion, is available for comparison. With expert knowledge, 11732 cases were identified as mislabels. Table 1 shows the number of mislabels identified by each individual classifier. For each learning algorithm, the number of mislabels is simply its number of classification errors. IB achieved the highest accuracy of the three classifiers and hence identified the fewest mislabels. LVQ identified the most mislabels, but still about 2700 cases fewer than the expert. Table 2 shows the agreements among filters; LVQ and DTC have the highest agreement. Since LVQ flags the most mislabels, it also agrees slightly more with each of the other filters than they agree with each other.
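A minimal sketch of how such counts could be derived, assuming each classifier's predictions on the training cases are available and that a case counts as a mislabel when the prediction differs from the given label. The function names and the toy labels are hypothetical and are not the paper's data.

```python
from itertools import combinations

def flag_mislabels(labels, predictions):
    """Return the indices of cases whose predicted class differs from the label."""
    return {i for i, (y, p) in enumerate(zip(labels, predictions)) if y != p}

def pairwise_agreement(flags):
    """Count cases flagged by both filters, for every pair of filters.

    flags: dict mapping filter name -> set of flagged case indices.
    """
    return {
        (a, b): len(flags[a] & flags[b])
        for a, b in combinations(sorted(flags), 2)
    }

# Toy example (hypothetical labels and predictions).
labels = ["forest", "crops", "crops", "bare", "forest", "crops"]
predictions = {
    "LVQ": ["forest", "bare", "forest", "bare", "crops", "crops"],
    "DTC": ["forest", "crops", "forest", "bare", "crops", "bare"],
    "IB":  ["forest", "crops", "forest", "bare", "forest", "crops"],
}
flags = {name: flag_mislabels(labels, pred) for name, pred in predictions.items()}
agreement = pairwise_agreement(flags)
```

Here `flags` plays the role of Table 1 (one count per filter) and `agreement` the role of Table 2 (one count per pair of filters).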

    Table 3. Number of mislabeled cases identified using consensus/voting filtering and the agreements with expert knowledge
    Number of votes    Number of cases    Cases agreeing with expert knowledge
    One Vote           13074              7080 (54.2%)
    Two Votes          6280               4115 (65.5%)
    Three Votes        2517               1863 (74.0%)



    Table 4. List of the 41 features used for the 1km map. The 18 features in bold fonts are those selected by the feature subset selection algorithm
    1. Maximum NDVI value
    2. Minimum NDVI value of 8 greenest months
    3. Mean NDVI value of 8 greenest months
    4. Amplitude of NDVI over 8 greenest months
    5. Mean NDVI value of 4 warmest months
    6. NDVI value of warmest month
    7. Maximum Channel 1 value of 8 greenest months
    8. Minimum Channel 1 value of 8 greenest months
    9. Mean Channel 1 value of 8 greenest months
    10. Amplitude of Channel 1 over 8 greenest months
    11. Channel 1 value from month of maximum NDVI
    12. Mean Channel 1 value of 4 warmest months
    13. Channel 1 value of warmest month
    14. Maximum Channel 2 value of 8 greenest months
    15. Minimum Channel 2 value of 8 greenest months
    16. Mean Channel 2 value of 8 greenest months
    17. Amplitude of Channel 2 over 8 greenest months
    18. Channel 2 value from month of maximum NDVI
    19. Mean Channel 2 value of 4 warmest months
    20. Channel 2 value of warmest month
    21. Maximum Channel 3 value of 8 greenest months
    22. Minimum Channel 3 value of 8 greenest months
    23. Mean Channel 3 value of 8 greenest months
    24. Amplitude of Channel 3 over 8 greenest months
    25. Channel 3 value from month of maximum NDVI
    26. Mean Channel 3 value of 4 warmest months
    27. Channel 3 value of warmest month
    28. Maximum Channel 4 value of 8 greenest months
    29. Minimum Channel 4 value of 8 greenest months
    30. Mean Channel 4 value of 8 greenest months
    31. Amplitude of Channel 4 over 8 greenest months
    32. Channel 4 value from month of maximum NDVI
    33. Mean Channel 4 value of 4 warmest months
    34. Channel 4 value of warmest month
    35. Maximum Channel 5 value of 8 greenest months
    36. Minimum Channel 5 value of 8 greenest months
    37. Mean Channel 5 value of 8 greenest months
    38. Amplitude of Channel 5 over 8 greenest months
    39. Channel 5 value from month of maximum NDVI
    40. Mean Channel 5 value of 4 warmest months
    41. Channel 5 value of warmest month


    Table 3 shows the results of filtering by voting. If we take all cases flagged by any filter (One Vote), a total of 13074 cases are identified, and 54.2% of them agree with expert knowledge. Majority vote filtering (at least two votes) yields 6280 cases, with a higher agreement with expert knowledge (65.5%). If consensus filtering (Three Votes) is performed, only 2517 cases are chosen, and the agreement with expert knowledge is the highest (74.0%). Our findings echo the observations of Brodley and Friedl (1999) that consensus filtering is conservative in terms of throwing away good data, whereas majority voting is better at detecting bad data but at the risk of throwing away good data. Our results show that filtering by consensus or by voting can be used to identify mislabeled training data, with agreement with expert opinion of at least 54.2%. Since retaining mislabels would degrade performance, we suggest using majority vote, since more mislabels can be identified and discarded. While we have shown that a certain portion of expert opinion can be modeled by machine learning methods, more studies are needed to address the disagreements between filtering by data modelers and expert knowledge.
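The voting scheme can be sketched as below, assuming each filter returns the set of case indices it flags. The function names and toy sets are hypothetical; only the counts in the last three lines are taken from Table 3.

```python
from collections import Counter

def vote_filter(flag_sets, min_votes):
    """Return the cases flagged by at least min_votes of the filters."""
    votes = Counter(i for flagged in flag_sets for i in flagged)
    return {i for i, v in votes.items() if v >= min_votes}

def agreement_rate(flagged, expert):
    """Fraction of flagged cases that the expert also marked as mislabels."""
    return len(flagged & expert) / len(flagged) if flagged else 0.0

# Toy example (hypothetical case indices).
lvq, dtc, ib = {1, 2, 3, 5, 8}, {2, 3, 5, 9}, {3, 8}
expert = {2, 3, 7, 8}

one_vote = vote_filter([lvq, dtc, ib], 1)    # flagged by any filter
majority = vote_filter([lvq, dtc, ib], 2)    # flagged by at least two filters
consensus = vote_filter([lvq, dtc, ib], 3)   # flagged by all three filters

# Percentages in Table 3, recomputed from the reported counts.
pct_one = round(100 * 7080 / 13074, 1)
pct_two = round(100 * 4115 / 6280, 1)
pct_three = round(100 * 1863 / 2517, 1)
```

In the toy data, as in Table 3, raising the vote threshold shrinks the flagged set while raising its agreement with the expert.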

    Feature Subset Selection
    The size of the input dimension is extremely important when dealing with a global data set, since the task involves handling millions of pixels. Our last experiment performed feature subset selection over the improved training data. We chose to discard the mislabeled cases identified by majority voting (Two Votes). It should be noted that 7612 mislabels picked by the expert are not identified by the machine learning procedures. The feature subset selection algorithm identified a subset of 18 features from the original 41 features. The input dimension is thus reduced by more than 50%, which significantly reduces the processing time.
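This section does not restate the search strategy used for feature subset selection. The sketch below illustrates one common variant only, a wrapper in the sense of Kohavi and John (1998): greedy forward selection scored by the leave-one-out accuracy of a 1-nearest-neighbour classifier. The choice of classifier, the stopping rule, the function names and the toy data are all assumptions for illustration.

```python
def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier
    restricted to the given feature indices."""
    correct = 0
    for i, xi in enumerate(X):
        best_j, best_d = None, float("inf")
        for j, xj in enumerate(X):
            if i == j:
                continue  # hold out the case being classified
            d = sum((xi[f] - xj[f]) ** 2 for f in features)
            if d < best_d:
                best_j, best_d = j, d
        correct += y[best_j] == y[i]
    return correct / len(X)

def forward_select(X, y):
    """Greedy forward selection: repeatedly add the feature that gives the
    highest accuracy; stop when no candidate strictly improves it."""
    remaining = set(range(len(X[0])))
    selected, best = [], 0.0
    while remaining:
        scored = [(loo_accuracy(X, y, selected + [f]), f) for f in sorted(remaining)]
        score, feat = max(scored, key=lambda t: t[0])  # ties go to the lowest index
        if score <= best:
            break
        selected.append(feat)
        remaining.remove(feat)
        best = score
    return selected, best

# Toy data: feature 0 separates the classes, feature 1 is noise, feature 2 is constant.
X = [[0.0, 5.0, 1.0], [0.1, 1.0, 1.0], [0.2, 4.0, 1.0],
     [1.0, 1.2, 1.0], [1.1, 4.9, 1.0], [1.2, 0.9, 1.0]]
y = [0, 0, 0, 1, 1, 1]
selected, best = forward_select(X, y)
```

A wrapper evaluates each candidate subset with the classifier itself, which is costly but ties the selected features directly to classification accuracy. For the reduction reported above, 1 - 18/41 is about 0.56, i.e. roughly 56% fewer inputs.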

    Table 4 lists the optimal features chosen by the feature subset selection (FSS) algorithm. The inclusion of all NDVI metrics shows the utility of the normalized ratio in mapping global vegetative land cover. This agrees with the expert-modified tree classification, where NDVI metrics were useful in mapping most cover types. Minimum annual red reflectance was also found to be important in mapping tree cover, and the FSS includes it as well. Temperature bands, particularly minimum Channel 3, are useful in discriminating tree leaf type and leaf longevity classes. Compared to the expert-modified tree classification, an important metric missing from the FSS is the mean of the four warmest months for Channel 4 and/or 5 (features 33/40 in Table 4). These metrics are useful in delineating tropical forest from woodland based on dry season land surface temperature, but are not included in the FSS output. In general, the FSS clearly retains the most important metrics, those which best generalize the data set. However, some metrics which act regionally, such as those just mentioned, are not included.

    Map output/agreement
    The overall agreement between the map generated using the training data set improved by machine learning and the expert-modified tree classification is 53.1%. Classes with the best agreement include needleleaf evergreen and deciduous forests, crops and bare ground (all > 70%). The most poorly performing classes include wooded grassland, closed shrubs and grassland (all < 25%). Preserving classes with high intraclass variability, such as wooded grassland, can be problematic. This is highlighted here, as the filtered data and FSS tree outputs resulted in an area of mapped wooded grassland less than half that of the expert-modified map. Hansen et al. (2000) state that most training errors and confusion occur between classes consisting of mixtures. An example of such confusion is areas of partial tree cover such as wooded grasslands and croplands, which often exist in mosaics with naturally occurring land cover types. Objectively finding the appropriate thresholds to best depict mixed cover types such as wooded grassland is a challenge, and further examination is required.
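A minimal sketch of an agreement computation between two maps, assuming both are co-registered and given as equal-length sequences of class labels, one per pixel. The paper does not state how per-class agreement was computed; here it is taken relative to the reference (expert-modified) map, which is an assumption, as are the function name and the toy labels.

```python
from collections import defaultdict

def map_agreement(reference, candidate):
    """Return (overall agreement, per-class agreement) between two label maps.

    Per-class agreement is the share of each reference class's pixels
    that carry the same label in the candidate map.
    """
    if len(reference) != len(candidate):
        raise ValueError("maps must cover the same pixels")
    total = defaultdict(int)
    match = defaultdict(int)
    for r, c in zip(reference, candidate):
        total[r] += 1
        match[r] += (r == c)   # adds 1 when the labels agree, 0 otherwise
    overall = sum(match.values()) / len(reference)
    per_class = {k: match[k] / total[k] for k in total}
    return overall, per_class

# Toy maps (hypothetical pixels).
reference = ["crops", "crops", "bare", "wooded grassland",
             "wooded grassland", "wooded grassland", "crops", "bare"]
candidate = ["crops", "crops", "bare", "crops",
             "wooded grassland", "grassland", "crops", "bare"]
overall, per_class = map_agreement(reference, candidate)
```

In this toy case the mixed class (wooded grassland) agrees least, mirroring the pattern described above.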

    Acknowledgements
    This research is supported by NASA grants NAG56970, NAG56004, NAS596060 and NAG56364.

    References
    • Aha, D.W., 1992. Tolerating noisy, irrelevant, and novel attributes in instance-based learning algorithms. International Journal of Man-Machine Studies, 36:267-287.
    • Aha, D.W., 1998. Feature weighting for lazy learning algorithms. In Feature Extraction, Construction and Selection: A Data Mining Perspective, edited by Liu, H., and Motoda, H., Kluwer Academic.
    • Brodley, C.E., and Friedl, M.A., 1999. Identifying mislabeled training data. Journal of Artificial Intelligence Research, 11, pp.131-167.
    • DeFries, R., Hansen, M.C., Townshend, J.R.G. and Sohlberg, R., 1998. Global land cover classifications at 8 km spatial resolution: The use of training data derived from Landsat imagery in decision tree classifiers. International Journal of Remote Sensing, 19 (16), pp. 3141-3168.
    • Ginsberg, M., 1993. Essentials of Artificial Intelligence. Morgan Kaufmann.
    • Hansen, M.C., DeFries, R.S., Townshend, J.R.G. and Sohlberg, R., 2000. Global land cover classifications at 1 km spatial resolution using a classification tree approach. International Journal of Remote Sensing, 21 (6&7), pp. 1331-1364.
    • John, G.H., Kohavi, R., and Pfleger, K., 1994. Irrelevant features and the subset selection problem. Machine Learning: Proceedings of the Eleventh International Conference, pp. 121-129.
    • Kohavi, R. and John, G. H., 1998. The wrapper approach. In Feature Extraction, Construction and Selection: A Data Mining Perspective, edited by Liu, H. and Motoda, H., Kluwer Academic.
    • Kohonen, T., 1995. Self-Organizing Maps. Berlin: Springer.
    • Quinlan, J.R., 1993. C4.5: Programs for Machine Learning. Morgan Kaufmann.
    • Sellers, P.J., Dickinson, R.E., Randall, D.A., Betts, A.K., Hall, F.G., Mooney, H.A., Nobre, C.A., Sato, N., Field, C.B., and Henderson-Sellers, A., 1997. Modeling the exchanges of energy, water, and carbon between continents and the atmosphere. Science, 275, pp. 502-509.
    • Weisberg, S., 1985. Applied Linear Regression. John Wiley & Sons.
    • Wettschereck, D., 1994. A study of distance-based machine learning algorithms. Doctoral dissertation, Oregon State University, Department of Computer Science.



