GITA 2000


Data Development and Evolution


Data integrity and quality - How do you get there?

Bob Britton
US WEST, 700 W. Mineral Ave., Littleton, CO 80120

Darrell Rhodes
Analytical Surveys, Inc., 941 N. Meridian St.
Indianapolis, IN 46204


Quality Control and Quality Assurance (QA/QC) Processes
Data integrity and quality are critical to the success of any geospatial program implementation. They are achieved through effective QA/QC and data validation processes, especially during the data conversion phase of the project. The importance of these processes becomes more evident as project stakeholders recognize that the initially converted database will be the foundation for all future business applications and functions requiring geospatial data and analysis. Additional layers or levels of data are typically added on top of the originally converted data as the geospatial system is enhanced, which makes the initial quality and integrity of the primary system data even more important.

Typically, conversion vendors will utilize detailed quality assurance and quality control steps within their conversion process to ensure the data specifications are met prior to delivering the data to the client. With the increasing focus on ISO 9000 implementation and certification, quality assurance and control procedures are beginning to adhere to a basic set of standards recognized throughout the GIS community as well as other industries, resulting in improved QA/QC processes.

During data conversion, quality control is focused on "inspection". It includes full manual and automated checks of the converted data against the source information and specifications at defined checkpoints within the conversion process. Manual checks may include a one-to-one comparison of the source document with a hard copy plot, as well as digital consistency checks performed on screen. Automated checks typically involve very specific validation routines that are run at the completion of each task. These routines check feature-level and model-level requirements and report any violations to an operator for correction. The process is generally cyclical: the checks are repeated until all errors have been identified and corrected, and only then does work move to the next task.

Quality assurance is focused on the "process" as well as "validation". The conversion process must be engineered so that the quality of the data is "built in" rather than "inspected in". This involves building various types of validation into the conversion software and process to minimize the effect of human error on the data.
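The automated validation routines described above can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Feature` record, the `REQUIRED_ATTRIBUTES` rules, and the `attached_to` relationship are hypothetical and do not come from any particular conversion package or data model.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    feature_id: str
    feature_type: str
    attributes: dict


# Hypothetical feature-level rule: attributes that must be populated.
REQUIRED_ATTRIBUTES = {
    "cable": ["size", "material"],
    "pole": ["height"],
}


def run_validation(features):
    """Return (feature_id, message) violations for an operator to correct."""
    violations = []
    known_ids = {f.feature_id for f in features}
    for f in features:
        # Feature-level check: every required attribute has a value.
        for name in REQUIRED_ATTRIBUTES.get(f.feature_type, []):
            if f.attributes.get(name) in (None, ""):
                violations.append((f.feature_id, f"missing attribute '{name}'"))
        # Model-level check (assumed rule): a cable must reference an
        # existing feature through its 'attached_to' attribute.
        if f.feature_type == "cable":
            ref = f.attributes.get("attached_to")
            if ref not in known_ids:
                violations.append((f.feature_id, f"attached_to '{ref}' not found"))
    return violations


features = [
    Feature("P1", "pole", {"height": 40}),
    Feature("C1", "cable", {"size": "100", "material": "copper", "attached_to": "P1"}),
    Feature("C2", "cable", {"size": "", "material": "fiber", "attached_to": "P9"}),
]
report = run_validation(features)
for feature_id, message in report:
    print(feature_id, message)
```

In the cyclical process the paper describes, an operator would correct the reported features and the routine would be rerun until the report comes back empty.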

For example, data model requirements can be incorporated into the conversion software, enabling the data entered by an operator to be validated "on the fly". In other words, if an operator is capturing attributes for a cable feature, the software would only allow the operator to key in legal values for the attributes as defined by the data model. For system-defined attributes, the software would populate the fields automatically and eliminate operator intervention, thereby reducing the risk of error. In addition to process engineering, quality assurance also involves data validation using random sampling techniques at major points within the conversion process. This generally occurs on the final software platform and consists of running QA scripts and checking reports, as well as reviewing hard copy check plots and performing on-screen integrity checks. Generally, better results are achieved if the conversion vendor can replicate the random sampling technique used by the client.
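As a rough illustration of "on the fly" validation, the sketch below accepts only legal values for operator-entered cable attributes and fills in system-defined attributes itself. The attribute names, the legal values in `CABLE_DOMAINS`, and the helper functions are all hypothetical; a real data model would supply its own.

```python
# Hypothetical data model: legal values for each operator-entered attribute.
CABLE_DOMAINS = {
    "material": {"copper", "fiber"},
    "placement": {"aerial", "buried", "underground"},
}

# Hypothetical system-defined attributes, populated by the software only.
SYSTEM_ATTRIBUTES = {"feature_id", "captured_by"}


def new_cable_record(feature_id, operator_id):
    """Create a record with the system-defined attributes already filled in."""
    return {"feature_id": feature_id, "captured_by": operator_id}


def set_attribute(record, name, value):
    """Store an operator-entered value only if the data model allows it."""
    if name in SYSTEM_ATTRIBUTES:
        raise ValueError(f"'{name}' is system defined and cannot be keyed in")
    if name not in CABLE_DOMAINS:
        raise ValueError(f"'{name}' is not an attribute of the cable feature")
    if value not in CABLE_DOMAINS[name]:
        legal = ", ".join(sorted(CABLE_DOMAINS[name]))
        raise ValueError(f"illegal value '{value}' for '{name}'; legal values: {legal}")
    record[name] = value


record = new_cable_record("C-0001", "op17")
set_attribute(record, "material", "fiber")  # legal value, accepted
try:
    set_attribute(record, "placement", "submarine")  # not in the data model
    rejected = False
except ValueError as err:
    rejected = True
    print(err)
```

Rejecting the value at entry time, rather than flagging it in a later inspection pass, is what distinguishes quality that is "built in" from quality that is "inspected in".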

The client must fully understand the conversion vendor's QA/QC processes in order to establish its own checks and be assured that the data will comply with the specifications and meet the desired quality levels. In this regard, the client and the conversion vendor must work together and share the QA/QC responsibility. Providing the conversion vendor with concise requirements and targets is the first step in setting up good quality methods. This is generally accomplished through the development of acceptance criteria, which are discussed in the last section of this paper.

Most clients try to minimize the resources required to perform the QA/QC analysis of the data delivered by the conversion vendor. Knowledgeable resources are not always available, and when they are available they are often needed for other programs within the company. In addition to resource issues, there are time constraints that must be managed as part of the data review: the contract generally defines specific time periods for data review and acceptance. With this in mind, the client usually relies on statistical sampling to make the best use of limited resources and time when validating data coming from the vendors. It is also important to note that the client is counting on much of the detailed quality control having taken place before the data is delivered.

To establish a proper statistical sampling method, the client must adhere to an established set of criteria such as those presented in the ANSI Sampling Procedures and Tables for Inspection by Attributes. It is critical that once the inspection criteria are established, the staff assigned to perform the inspection follow the criteria to the letter. A common difficulty with this type of analysis is convincing the users that it works. The people performing the inspection are usually ex-records staff, who are reluctant to ignore "other errors" that they notice outside the sample set of data. For random sampling to work, however, errors must be recorded only on the items randomly selected for the sample.
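The sampling rule above (draw a random sample, count errors only on the sampled items, then accept or reject the delivery) can be sketched as follows. The sample size and acceptance number used here are placeholders chosen for illustration; in practice they would be read from the ANSI tables for the lot size and quality level in use. The function names and the `has_error` callback are likewise hypothetical.

```python
import random


def draw_sample(item_ids, sample_size, seed=None):
    """Randomly select the items to inspect, without replacement."""
    rng = random.Random(seed)
    return rng.sample(list(item_ids), sample_size)


def inspect_delivery(item_ids, sample_size, acceptance_number, has_error, seed=None):
    """Accept the delivery if defective sampled items do not exceed the limit."""
    sample = draw_sample(item_ids, sample_size, seed)
    # Errors are counted only on the randomly selected items.
    defects = [item for item in sample if has_error(item)]
    accepted = len(defects) <= acceptance_number
    return accepted, sample, defects


# Illustrative delivery of 500 items, three of which contain errors.
bad_items = {7, 42, 300}
accepted, sample, defects = inspect_delivery(
    item_ids=range(1, 501),
    sample_size=50,        # placeholder, not taken from the ANSI tables
    acceptance_number=1,   # placeholder, not taken from the ANSI tables
    has_error=lambda item: item in bad_items,
    seed=2000,
)
print("accepted:", accepted, "defective sampled items:", defects)
```

Fixing the seed makes a draw repeatable, which may help when the conversion vendor needs to replicate the client's sampling technique.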

One method to help mitigate this natural reluctance to look past additional errors is to use a separate error tally sheet to record the "other errors" found during the inspection. The project team will then need to decide whether action is required to correct these errors before the data is turned over to the end user. If the data were accepted based on the sample, typically the "other errors" would not be corrected before the data is turned over to the client. If the data were rejected based on the sample, typically the "other errors" are returned to the conversion vendor for correction as part of the normal rework cycle for the rejected delivery.

A typical random sampling process will include four components: inspection conditions, characteristics to be inspected, inspection methods, and the consolidation of results. With these components in place, one should be able to ensure that the converted data is complete and accurate. Again, to establish proper statistical sampling processes, one should follow accepted standards such as ANSI's Guide to Inspection Planning, and Sampling Procedures and Tables for Inspection by Attributes.
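The separate tally sheet for "other errors" discussed earlier in this section might be kept as in the sketch below. The `InspectionRecord` class and its method names are hypothetical, and the decision rule assumes that defective items, rather than individual errors, are counted against the acceptance number.

```python
from dataclasses import dataclass, field


@dataclass
class InspectionRecord:
    sample_ids: set
    acceptance_number: int
    sample_errors: list = field(default_factory=list)  # count toward the decision
    other_errors: list = field(default_factory=list)   # separate tally sheet

    def log_error(self, item_id, description):
        """File the error on the sheet that matches where it was found."""
        if item_id in self.sample_ids:
            self.sample_errors.append((item_id, description))
        else:
            self.other_errors.append((item_id, description))

    def accepted(self):
        """Decide on sampled items only (assumption: count defective items)."""
        defective_items = {item_id for item_id, _ in self.sample_errors}
        return len(defective_items) <= self.acceptance_number

    def errors_for_vendor_rework(self):
        """On rejection, 'other errors' go back with the normal rework cycle."""
        if self.accepted():
            # Left on the tally sheet for the project team to decide on.
            return []
        return self.sample_errors + self.other_errors


rec = InspectionRecord(sample_ids={3, 8, 15}, acceptance_number=0)
rec.log_error(8, "wrong cable size")         # item 8 is in the sample
rec.log_error(21, "misspelled street name")  # noticed outside the sample
rework = rec.errors_for_vendor_rework()
print("accepted:", rec.accepted(), "returned for rework:", rework)
```

Keeping the two lists apart lets inspectors write down everything they notice without letting errors outside the sample influence the accept or reject decision.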
