Quality assurance in data conversion: data end users can trust
Ron E. Cromer, Eric M. Hughes, PMP
SpatialAge Solutions Division, Byers Engineering Company
6285 Barfield Road, Atlanta, GA 30328-4303
Introduction
You are undertaking a major Geographic Information System (GIS) project and have selected a
conversion vendor to perform the work. For a successful implementation, you must provide
quality data that satisfies the end users and the applications that support mission-critical
business functions. To ensure good data quality, you must properly plan, execute, and check the
progress of your project. Because a conversion vendor is performing the work, the conversion
specification must be complete and comprehensive. Planning starts with producing a conversion
specification that documents what work must be performed and how. You must also determine your data
quality requirements and establish acceptance criteria so that you can properly check the
progress of the conversion vendor.
Essential components of a conversion specification
Requirements are communicated to the conversion vendor through a conversion specification
document. This document must address not only which data features will be converted but also
the placement and business rules needed to convert them. The individuals who produce the
specification must be intimately familiar with the data and knowledgeable about the new
Geographic Information System database design and business applications. The specification
must be detailed and explicit enough to ensure that the conversion vendor can accurately
convert the data into the required digitized format.
It is extremely useful, in the beginning stages of the project, to review the conversion vendor's
work to ensure that they fully understand the source data and the specified rules for data
conversion. Changes to the specification should be kept to a minimum and tightly controlled.
A comprehensive conversion specification must address:
- Rules for Facility Geospatial Positioning. Plan how you want your features and text to be displayed in your business applications. Real-world positioning is not always feasible. You will need to establish spacing rules to avoid overstrikes.
- Business Rules for Converting Sources into Digital Format. Before establishing your rules, determine what actually needs to be converted. Ensure that the data requirements of your business applications are met without converting unnecessary data.
- Listing and Description of Each Source. Your vendor must be able to understand each of your sources, especially digital sources. Provide a description of all fields, data code lists, and acceptable values.
- Detailed Definition of all GIS Data Structures. Your target database must be accurately defined. All data field structures and validation rules must be clearly understood by the conversion vendor.
- Source Data Matrix. This identifies all of your data sources, by priority, for each feature and attribute, and specifies the critical and non-critical fields in the new GIS database design. It is critical that you identify the priority source when multiple sources contain conflicting data. The source data matrix will also show you any gaps between your existing sources and the new database design.
- Procedures for Handling Reformatting/Re-symbolizing. Your goal here is to standardize your data features in your target database.
- Rules for Handling Data Source Errors. Establish default rules for missing data and data source errors in non-critical fields. Defaulting allows you to correct source errors at a later time.
- Problem Resolution Procedures. The conversion vendor will discover data errors that need resolution, so plan for them. Your procedures should address how communication between the vendor and client will take place and the expected turnaround time.
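As an illustration of the source data matrix described in the list above, the following is a minimal sketch rather than a prescribed format. It assumes the matrix can be held as a mapping from each target field to its sources in priority order. The field names, source names, and the functions find_gaps and resolve are hypothetical and chosen only for this example.

```python
from typing import Dict, Mapping, Optional, Tuple

# Hypothetical source data matrix: target field -> sources, highest priority first.
# All field and source names are illustrative assumptions.
SOURCE_MATRIX: Dict[str, Tuple[str, ...]] = {
    "pole.height_ft": ("work_order", "paper_map"),
    "pole.install_year": ("asset_db", "paper_map"),
    "pole.owner": (),  # no existing source feeds this field: a gap
}


def find_gaps(matrix: Mapping[str, Tuple[str, ...]]) -> list:
    """Return target fields that no existing source can populate."""
    return sorted(field for field, sources in matrix.items() if not sources)


def resolve(
    field: str,
    values_by_source: Mapping[str, Optional[str]],
    matrix: Mapping[str, Tuple[str, ...]] = SOURCE_MATRIX,
) -> Tuple[Optional[str], Optional[str]]:
    """Pick the value from the highest-priority source that supplies one.

    Returns (value, source_name), or (None, None) when no listed source
    has a usable value for the field.
    """
    for source in matrix[field]:
        value = values_by_source.get(source)
        if value not in (None, ""):
            return value, source
    return None, None


if __name__ == "__main__":
    # Two sources disagree; the matrix decides which one wins.
    print(resolve("pole.height_ft", {"paper_map": "45", "work_order": "40"}))
    print(find_gaps(SOURCE_MATRIX))
```

Keeping the priority order in one table means a conflict between sources is settled the same way every time, and fields with an empty source list surface as gaps before conversion starts.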
Accuracy Requirements
The goal of quality in GIS databases is to ensure that business applications function properly
so that key operational and strategic decisions are made properly. It is essential to determine
what accuracy level is required to support these applications. Accuracy, or quality, comes at a
cost: checking data requires a substantial investment in time and resources. You must consider
which fields and attributes are critical and non-critical, what can be checked easily, what is
considered vital to the database design, and what are considered aesthetic requirements.
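One way to turn accuracy requirements into acceptance criteria is to compute the accuracy of checked items for each class of field and compare it with a required level. The sketch below assumes accuracy is measured as the share of checked items without errors. The threshold values and the names THRESHOLDS, accuracy, and accept_delivery are hypothetical; real levels must come from your own requirements.

```python
from typing import Dict, Mapping, Tuple

# Hypothetical required accuracy per field class. These numbers are
# placeholders for illustration, not recommended values.
THRESHOLDS: Dict[str, float] = {"critical": 0.99, "non_critical": 0.95}


def accuracy(checked: int, errors: int) -> float:
    """Share of checked items that contained no error."""
    if checked <= 0:
        raise ValueError("at least one item must be checked")
    if not 0 <= errors <= checked:
        raise ValueError("errors must be between 0 and checked")
    return (checked - errors) / checked


def accept_delivery(
    results: Mapping[str, Tuple[int, int]],
    thresholds: Mapping[str, float] = THRESHOLDS,
) -> Tuple[bool, Dict[str, Tuple[float, bool]]]:
    """Compare measured accuracy with the required level for each class.

    results maps a field class to (items_checked, errors_found).
    Returns (accepted, report), where report maps each class to
    (measured_accuracy, passed).
    """
    report = {}
    for field_class, (checked, errors) in results.items():
        rate = accuracy(checked, errors)
        report[field_class] = (rate, rate >= thresholds[field_class])
    accepted = all(passed for _, passed in report.values())
    return accepted, report


if __name__ == "__main__":
    accepted, report = accept_delivery(
        {"critical": (1000, 5), "non_critical": (500, 40)}
    )
    print(accepted, report)
```

Separating critical from non-critical fields lets you spend checking effort where errors are most costly while still setting a floor for everything else.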
The accuracy of your GIS database also depends on the quality of your original sources.
You must determine which types of errors will require correction through an Error Transmittal
Process and which types you will allow to be defaulted and corrected at a later time.
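The decision between transmitting an error and defaulting it can be sketched as a simple routing rule. This is an illustration under stated assumptions, not the paper's procedure: it assumes that a missing value in a critical field, or in any field without an agreed default, goes to the transmittal list, and that other missing values are defaulted and logged for later correction. The field names, the default value, and the names CRITICAL_FIELDS, DEFAULTS, TriageResult, and triage are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical configuration; a real project would take these from the
# conversion specification.
CRITICAL_FIELDS = {"pole.height_ft"}
DEFAULTS = {"pole.install_year": "UNKNOWN"}


@dataclass
class TriageResult:
    value: Optional[str]
    action: str  # "converted", "defaulted", or "transmittal"


def triage(
    field: str,
    source_value: Optional[str],
    transmittals: List[str],
    deferred: List[str],
) -> TriageResult:
    """Route one field value according to the assumed rules above."""
    if source_value not in (None, ""):
        return TriageResult(source_value, "converted")
    if field in CRITICAL_FIELDS or field not in DEFAULTS:
        # Needs an answer from the client before conversion can proceed.
        transmittals.append(field)
        return TriageResult(None, "transmittal")
    # Non-critical: load the default now and record it for later cleanup.
    deferred.append(field)
    return TriageResult(DEFAULTS[field], "defaulted")


if __name__ == "__main__":
    transmittals: List[str] = []
    deferred: List[str] = []
    print(triage("pole.height_ft", "", transmittals, deferred))
    print(triage("pole.install_year", None, transmittals, deferred))
    print(transmittals, deferred)
```

Logging defaulted fields separately keeps a record of what still has to be corrected after conversion, so defaults do not silently become permanent data.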