Primary Validation
- Logical cartographic consistency
  - Closed polygons
  - One label for each polygon
  - No duplicate arcs
  - No overshoot arcs (dangles)
  - Similar features use similar symbols
- Logical attribute consistency
  - Values within logical range (look for illegal values)
    - Dates (e.g. month less than or equal to 12)
    - Time of day less than 24:00 hours
    - Nominal data illegally re-sampled into ratio data
    - Precipitation values equal to or greater than zero
  - Linkage of features with attribute fields
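Several of the primary validation checks listed above can be automated with very little code. The following is a minimal sketch in Python, not a reference implementation: the function names, the record fields (`month`, `hour`, `precip_mm`) and the representation of arcs and rings as lists of `(x, y)` tuples are assumptions made for illustration only.

```python
def ring_is_closed(ring):
    """A polygon ring is closed when its first and last vertices coincide."""
    return len(ring) >= 4 and ring[0] == ring[-1]


def duplicate_arcs(arcs):
    """Return indices of arcs that repeat an earlier arc, in either direction."""
    seen = set()
    dupes = []
    for i, arc in enumerate(arcs):
        # Use one canonical key so that A->B and B->A count as the same arc.
        key = min(tuple(arc), tuple(reversed(arc)))
        if key in seen:
            dupes.append(i)
        else:
            seen.add(key)
    return dupes


def attribute_errors(record):
    """List the illegal values found in one attribute record (hypothetical fields)."""
    errors = []
    if not 1 <= record["month"] <= 12:
        errors.append("month out of range")
    if not 0 <= record["hour"] < 24:
        errors.append("hour out of range")
    if record["precip_mm"] < 0:
        errors.append("negative precipitation")
    return errors
```

A record such as `{"month": 13, "hour": 24, "precip_mm": -1.0}` would be reported with all three errors, while a record with month 12, hour 23 and zero precipitation would pass.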
Secondary Validation
- Logical query and statistical tests of the spatial and attribute data (look for unlikely values)
  - Points placed in distant locations on the map
  - Elevations with reasonable values
- Ground truth or comparison to known standards
  - Sample ground areas and compare to database
  - Evaluate spatial accuracy
  - Evaluate attribute accuracy
- Completeness of data (“model type” completeness, relative to user needs): in a map of roads, are all the roads important to the user included?
- Sensitivity analysis
  - Change the data, and see if those changes affect the results for your application.
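Two of the secondary checks above lend themselves to a short illustration: a statistical test for unlikely values and a comparison of database positions with ground truth. The sketch below is one possible approach, not the method prescribed by this paper. It flags unlikely values with a modified z-score based on the median and the median absolute deviation (the 3.5 cut-off is a common convention and is an assumption here), and summarises positional accuracy as a root mean square error. All function names are hypothetical.

```python
import math
import statistics


def unlikely_values(values, threshold=3.5):
    """Return indices whose modified z-score (median/MAD based) exceeds threshold."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        return []  # no spread to measure against, so nothing can be flagged
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold]


def positional_rmse(db_points, truth_points):
    """Root mean square distance between database and ground-truth positions."""
    squared = [(x1 - x2) ** 2 + (y1 - y2) ** 2
               for (x1, y1), (x2, y2) in zip(db_points, truth_points)]
    return math.sqrt(sum(squared) / len(squared))
```

For the elevations `[102, 98, 105, 99, 101, 9999]`, only the last value is flagged as unlikely. Flagged values still need review by an analyst, since an unlikely value is not necessarily an incorrect one.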
Quality Assurance Plans
Quality assurance plans can broadly be classified into two categories, visual QA and automated QA, which are discussed below.
Visual QA: Visual QA is meant to detect not only random errors, such as a misspelled piece of text, but also systematic errors, such as an overall shift in the data caused by an unusually high RMS (root mean square) error. The existence or absence of data, as well as positional accuracy, can only be checked by visual inspection. Hard-copy plotting of the data is the best method for checking for missing features, misplaced features and registration to the original source. On-screen views are an excellent way to verify that edits to the database were made correctly. Visual inspection should occur during initial data capture, at feature attribution, and again at final data delivery. At initial data capture the data should be inspected for missing or misplaced features, as well as for alignment problems that could point to a systematic error. In either case, each error type needs to be evaluated along with the process that created the data in order to determine the root cause and an appropriate solution.
Automated QA: Visual inspection of GIS data is reinforced by automated QA methods. GIS databases can be automatically checked for adherence to the database design, attribute accuracy, logical consistency and referential integrity. Automated QA must occur in conjunction with visual inspection. The goal of automated quality assurance is to inspect very large amounts of data quickly and to report inconsistencies in the database that may not appear during visual inspection. Both random and systematic errors can be detected using automated QA procedures. Once again, the feedback loop has to be short so that any flawed data conversion process can be corrected.
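As an illustration of the automated checks described above, the following minimal sketch tests referential integrity between a feature table and an attribute table, and adherence of rows to a simple database design. It assumes tables are held as lists of dictionaries. The field names (`id`, `feature_id`) and the schema format (field name mapped to expected Python type) are hypothetical and stand in for whatever the actual database design specifies.

```python
def check_referential_integrity(features, attributes):
    """Report features lacking attribute rows and attribute rows lacking features."""
    feature_ids = {f["id"] for f in features}
    attribute_ids = {a["feature_id"] for a in attributes}
    return {
        "features_without_attributes": sorted(feature_ids - attribute_ids),
        "orphan_attribute_rows": sorted(attribute_ids - feature_ids),
    }


def check_schema(rows, schema):
    """Return (row_index, field, problem) for rows that deviate from the schema."""
    problems = []
    for i, row in enumerate(rows):
        for field, expected_type in schema.items():
            if field not in row:
                problems.append((i, field, "missing"))
            elif not isinstance(row[field], expected_type):
                problems.append((i, field, "wrong type"))
    return problems
```

Checks of this kind run quickly over very large tables and produce a list of inconsistencies that can be fed back to the data conversion team.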
Data Acceptance
Defining acceptance criteria is probably one of the most troublesome parts of a GIS project, because there are no standards for acceptable error or for rejection criteria. Since GIS coverages are application specific, acceptance criteria are best defined on the basis of the existing data model and database design, as well as the user needs and application requirements. Project schedule, budget and human resources all play a role in determining data acceptance. Furthermore, accepting data can be confusing without strict acceptance rules. A GIS data set may have ‘m’ features with ‘n’ attributes each. A feature with a single incorrect attribute may be counted in different ways, for example:
- 1 error, if it does not affect the other (m - 1) features in any way.
- m errors, if it affects all the other features.
Each attribute should be reviewed to determine if it is a critical attribute and then weighted accordingly. Additionally, the cartographic aspect of data acceptance should be considered. A feature’s position, rotation and scaling must also be taken into account when calculating the percentage of error, not only its existence or absence.
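One way to make the weighting concrete is to compute a weighted percentage of error over all checked attributes. The sketch below is an assumption about how such a scheme could work, not a scheme defined in this paper: each feature is a dictionary mapping an attribute name to `True` (correct) or `False` (incorrect), and the weights and attribute names are hypothetical.

```python
def weighted_error_percent(features, weights):
    """Weighted share of incorrect attribute values, as a percentage.

    features: list of dicts mapping attribute name -> True if the value is correct.
    weights:  dict mapping attribute name -> weight (unlisted attributes weigh 1.0).
    """
    total = 0.0
    wrong = 0.0
    for feature in features:
        for attr, ok in feature.items():
            w = weights.get(attr, 1.0)
            total += w
            if not ok:
                wrong += w
    return 100.0 * wrong / total if total else 0.0
```

With a critical attribute `road_class` weighted 3.0 and `name` weighted 1.0, a single wrong `name` among two features gives 12.5 percent, whereas a single wrong `road_class` would give 37.5 percent. Positional, rotation and scaling checks could be added as further weighted entries in the same way.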
Once the acceptable percentage of error and the weighting scheme have been chosen, methods of error detection should be established. The methods of error detection for data acceptance are the same as those employed during the data conversion phase. Check plots should be compared to the original sources and automated database checking tools should be applied to the delivered data. Very large databases may require random sampling for data acceptance.
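For very large databases, acceptance by random sampling can be sketched as follows. This is an illustrative outline under stated assumptions: `is_correct` is a hypothetical callback that stands in for whichever check (check plot comparison or automated test) decides whether a sampled feature is acceptable, and the sample size and error threshold are project decisions rather than values taken from this paper.

```python
import random


def sample_acceptance(feature_ids, is_correct, sample_size, max_error_percent, seed=0):
    """Check a random sample of features and compare its error rate to a threshold.

    Returns (accepted, error_percent) for the sample that was drawn.
    """
    ids = list(feature_ids)
    if not ids:
        raise ValueError("no features to sample")
    rng = random.Random(seed)  # fixed seed so the same sample can be re-drawn for audit
    sample = rng.sample(ids, min(sample_size, len(ids)))
    errors = sum(1 for fid in sample if not is_correct(fid))
    error_percent = 100.0 * errors / len(sample)
    return error_percent <= max_error_percent, error_percent
```

The error rate of the sample is only an estimate of the error rate of the full delivery, so the sample size should be chosen with the required confidence in mind.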
Data Maintenance
Maintenance involves additions, deletions and updates to the database in a tightly controlled environment, in order to retain the database’s integrity. It provides the user with only one point of entry into the database, thus improving the consistency and security of the database. Maintenance applications are usually supported by a database management system consisting of permanent and local (temporary) storage. Data is checked out from permanent storage into local storage for update and then posted back to permanent storage to complete the update. Pre-posting QA checks are required to ensure database integrity. The database schema is maintained so that table structures and spatial data topologies are not destroyed. Automated validation of attribute values, as well as visual check plots when large amounts of data are added or deleted, are also useful. Periodic validation of large multi-user databases can identify some very important and potentially costly errors. Errors or last-minute changes in business rules, bugs in the maintenance application and inconsistent editing methods can all be detected during periodic validation.
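The check-out, edit and post-back cycle with pre-posting QA can be sketched as below. This is a toy in-memory model intended only to show the control flow. The class name, the validator signature (a function returning a list of error messages) and the `width_m` field are assumptions, and a real maintenance application would rely on the transaction and locking facilities of its database management system.

```python
import copy


class MaintainedDatabase:
    """Single point of entry: edits reach permanent storage only through post()."""

    def __init__(self, records, validators):
        self._permanent = dict(records)
        self._validators = list(validators)

    def check_out(self, key):
        """Copy a record into local (temporary) storage for editing."""
        return copy.deepcopy(self._permanent[key])

    def post(self, key, edited):
        """Run pre-posting QA checks, then write the edit to permanent storage."""
        errors = [msg for validate in self._validators for msg in validate(edited)]
        if errors:
            raise ValueError("; ".join(errors))
        self._permanent[key] = copy.deepcopy(edited)

    def get(self, key):
        return copy.deepcopy(self._permanent[key])


def non_negative_width(record):
    """Example pre-posting check on a hypothetical road-width attribute."""
    return [] if record["width_m"] >= 0 else ["negative road width"]
```

An edit that sets `width_m` to a negative value is rejected at posting time and the permanent record remains unchanged, while a valid edit replaces it.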
Conclusions
The main purpose of this paper is to present an overview of some methods that are especially suited to assessing the quality of GIS databases and digital maps or coverages. The various quality parameters of a GIS and their measurement in a real-life environment are presented. A set of guidelines, to be adhered to by all projects and users and intended to establish the minimum acceptable level of accuracy, is also highlighted. Finally, the issues involved in the development and implementation of an integrated GIS quality assurance plan are discussed.
Acknowledgement
The author is indebted to Sri. S Adiga, Director, NNRMS/RRSSC, Bangalore, for his kind approval. Thanks are due to Dr. A Jeyram, Head, RRSSC, Kharagpur, for his views and suggestions for preparation of the paper.