Quality assurance in data conversion: data end users can trust
Ron E. Cromer, Eric M. Hughes, PMP
SpatialAge Solutions Division, Byers Engineering Company, 6285 Barfield Road, Atlanta, GA 30328-4303

Introduction

You are undertaking a major Geographic Information System (GIS) project and have selected a conversion vendor to perform the work. For a successful implementation you must provide quality data that will satisfy the end user and the applications used by mission-critical business functions. To ensure good data quality you must properly plan, execute, and check the progress of your project. Because a conversion vendor is performing the work, the conversion specification must be complete and comprehensive. Planning starts by producing a conversion specification that documents what work must be performed and how. You must also determine your data quality requirements and establish acceptance criteria so that you can properly check the progress of the conversion vendor.

Essential components of a conversion specification

Requirements are communicated to the conversion vendor through a conversion specification document. This document must address not only which data features will be converted but also the placement and business rules needed to convert the data. Individuals responsible for producing the specification must be intimately familiar with the data and knowledgeable about the new GIS database design and business applications. The specification must be detailed and explicit enough to ensure that the conversion vendor can accurately convert data into the required digitized format. It is extremely useful, in the beginning stages of the project, to review the conversion vendor's work to ensure that they fully understand the source data and the specified rules for data conversion. Changes to the specification should be kept to a minimum and tightly controlled. A comprehensive conversion specification must address:
The goal of quality in GIS databases is to ensure that business applications function properly so that key operational and strategic decisions can be made correctly. It is essential to determine what accuracy level is required to support these applications. Accuracy, or quality, comes at a cost: checking data requires a huge investment in time and resources. You must consider which fields and attributes are critical and which are non-critical, what can be checked easily, what is vital to the database design, and what is merely an aesthetic requirement. The accuracy of your GIS database also depends on the quality of your original sources. You must determine what type of error correction will be necessary through an Error Transmittal Process and what types of errors you will allow to be defaulted and corrected at a later time.

Acceptance Criteria

Based on your budget, you must determine what is satisfactory for your business applications and document your acceptance criteria. Acceptance criteria are established by defining standards for running automated validation tests and by defining a method of computing errors during your visual inspection. These standards are the rules of the game; they should be known up front and not changed unless everyone is in agreement. As a contractual requirement, you may make the conversion vendor responsible for performing the automated validation tests and providing output reports as part of each deliverable.

During your visual inspection, an Error Tally Sheet is helpful in determining compliance with your conversion specification. The Error Tally Sheet shows the number of errors by type and indicates the error value associated with each type of error. The error value for a feature is associated with the number of attributes it has (e.g., location, symbology, length, height). For example, in Exhibit 1, if an operator missed digitizing a feature with weighted attributes, the error value associated with missing the feature would be 16.
Exhibit 1. Extract of Error Tally Sheet for Feature with Weighted Values
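The weighting idea behind the Error Tally Sheet can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the attribute weights below are hypothetical (they are not the values of Exhibit 1 and were simply chosen to total 16, like the example above), and the accuracy formula, error-free share of the total possible attribute value, is one plausible way to compute sample accuracy rather than a formula given in the text.

```python
from collections import Counter

# Hypothetical attribute weights for one feature type (NOT the values of Exhibit 1).
ATTRIBUTE_WEIGHTS = {"location": 8, "symbology": 4, "length": 2, "height": 2}


def missing_feature_error_value(weights):
    """A missed feature loses every attribute, so its error value is the sum of all weights."""
    return sum(weights.values())


def tally_error_value(error_counts, weights):
    """Total error value = sum over attribute error types of (error count x weight)."""
    return sum(count * weights[attr] for attr, count in error_counts.items())


def sample_accuracy(total_error_value, features_inspected, weights):
    """Assumed formula: percentage of the total possible attribute value that is error-free."""
    possible = features_inspected * sum(weights.values())
    return 100.0 * (1 - total_error_value / possible)


# Hypothetical tally from inspecting 125 features.
tally = Counter({"location": 3, "symbology": 5, "length": 2})
errors = tally_error_value(tally, ATTRIBUTE_WEIGHTS)
accuracy = sample_accuracy(errors, features_inspected=125, weights=ATTRIBUTE_WEIGHTS)
print(f"error value: {errors}, sample accuracy: {accuracy:.2f}%")
```

Keeping the weights in one table makes it easy to agree them with the vendor up front, in line with the point that acceptance criteria should be known in advance.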
Remember that you must consider Error Transmittals (ETs) submitted by the conversion vendor when computing the number of errors. To do this, you need a procedure for tracking ETs so that they can be referenced quickly. The best way to do this is to create a database of all the ETs with their associated source map or file.

Quality acceptance process

You have developed a complete and comprehensive specification, determined your data quality requirements, and established your acceptance criteria. You must now confirm that your conversion vendor provides the required data quality. You will need to establish processes to verify the accuracy and completeness of the new GIS database against your conversion specification. Verification is achieved through a combination of automated validation testing and visual inspection procedures. Although you will be able to run automated validation reports to check a large portion of your GIS database, you still need to perform a visual inspection of a sample of your data. This can be time-consuming and expensive, but it must be done to fully confirm data quality. The focus of this discussion is on the importance of acceptance sampling during this visual quality assurance process.

Importance of Acceptance Sampling

The common impulse is to do a 100% inspection of your conversion vendor's deliverables, but a random sample will provide a more efficient and cost-effective determination of data quality. The use of sampling techniques raises several critical questions. How much should be sampled to ensure your data quality requirement is met? Should re-sampling be performed if a certain criterion is not met? And for a particular sample size, what criteria should be used to accept or reject the deliverable?

When you employ sampling techniques, you assume a certain risk that the sample may not be representative of the rest of the work. However, the inspection process for conversion projects can be particularly tedious and monotonous, and through sampling you will be able to produce results as good as or better than 100% inspection. To reduce this risk and minimize the amount of visual inspection that must be performed, you must develop an acceptance sampling plan.

Acceptance sampling is a statistical sampling technique used to test a batch of work that has already been digitized, such as a division, circuit, or wire center. The purpose of acceptance sampling is to recommend a specific action; it is not an attempt to estimate quality or to control quality directly. The basic action recommendation is to accept or reject the items represented by the sample. Acceptance sampling involves establishing a sampling plan indicating the number of digitized features that need to be inspected (sample size) and the criteria for determining the acceptance or rejection of the deliverable.

The American National Standard for Sampling Procedures and Tables for Inspection by Attributes (ANSI/ASQC Z1.4 1993) provides sampling procedures and reference tables for use in establishing single, double, or multiple sampling plans. A single sampling plan is a specific type of sampling plan in which a decision to accept or reject a batch of work is based on a single sample.

Acceptance Sampling Procedures

Inspection procedures and reference tables have been extracted from the American National Standard for Sampling Procedures and Tables for Inspection by Attributes (ANSI/ASQC Z1.4 1993) and are shown here for a single sampling plan.

Step 1: Determine Batch Size. The batch size is the number of data features in your deliverable (division, circuit, wire center, etc.).

Step 2: Determine Sample Size. The sample size code is first determined from Table I - Sample Size Code Letters, as shown in Exhibit 2. Find the appropriate range, in the first column "Lot or Batch Size", for the number of features in the deliverable, and find the corresponding sample size code letter. Next, find that code letter in the first column of Table II - Single Sampling Plans, as shown in Exhibit 3, and read off the sample size for that letter.
Exhibit 2. Extract of Table I - Sample Size Code Letters (for General Inspection Level II)
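The lookup described in Step 2 can be sketched as a small Python function. The batch-size ranges, code letters, and sample sizes below are transcribed from memory of the standard's Table I (General Inspection Level II) and Table II-A, so treat them as assumptions and verify them against the published standard before use. The function and variable names are illustrative only.

```python
import bisect

# Upper bound of each "Lot or Batch Size" range, General Inspection Level II.
# ASSUMPTION: recalled from Table I of ANSI/ASQC Z1.4; verify against the standard.
_BATCH_UPPER_BOUNDS = [8, 15, 25, 50, 90, 150, 280, 500,
                       1200, 3200, 10000, 35000, 150000, 500000]
_CODE_LETTERS = "ABCDEFGHJKLMNPQ"  # the last letter covers batches above 500,000

# Sample size per code letter for single sampling, normal inspection (Table II-A).
SAMPLE_SIZE = {"A": 2, "B": 3, "C": 5, "D": 8, "E": 13, "F": 20, "G": 32, "H": 50,
               "J": 80, "K": 125, "L": 200, "M": 315, "N": 500, "P": 800, "Q": 1250}


def sample_size_code(batch_size):
    """Return the sample size code letter for a batch of `batch_size` features."""
    if batch_size < 2:
        raise ValueError("the table starts at a batch size of 2")
    # bisect_left finds the first range whose upper bound is >= batch_size.
    return _CODE_LETTERS[bisect.bisect_left(_BATCH_UPPER_BOUNDS, batch_size)]


# Hypothetical deliverable containing 2,000 features.
letter = sample_size_code(2000)
print(letter, SAMPLE_SIZE[letter])
```

Note that for some combinations of code letter and AQL the standard's Table II points to a neighboring plan with a different sample size; this sketch does not model that.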
Step 3: Select Sample Maps and Conduct Inspection. Randomly select a map or proof plot and inspect each feature according to your specification. Continue selecting maps until you have inspected the minimum sample size. Tally attribute errors and compute the sample accuracy in accordance with the feature attribute values established as part of your acceptance criteria.
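A minimal sketch of the map selection in Step 3, assuming you have a count of features per map (the map identifiers and counts here are hypothetical). Maps are drawn in random order without replacement until the inspected feature count reaches the minimum sample size.

```python
import random


def select_sample_maps(features_per_map, min_sample_size, seed=None):
    """Randomly pick maps until their combined feature count reaches the sample size.

    features_per_map: dict mapping a map identifier to its number of features.
    Returns (selected map ids in draw order, total features to inspect).
    """
    rng = random.Random(seed)  # pass a seed to make the draw reproducible for audits
    map_ids = list(features_per_map)
    rng.shuffle(map_ids)

    selected, inspected = [], 0
    for map_id in map_ids:
        if inspected >= min_sample_size:
            break
        selected.append(map_id)
        inspected += features_per_map[map_id]

    if inspected < min_sample_size:
        raise ValueError("deliverable has fewer features than the required sample size")
    return selected, inspected


# Hypothetical deliverable: six maps of 30 features each, sample size 125.
maps = {f"map_{i:02d}": 30 for i in range(1, 7)}
chosen, total = select_sample_maps(maps, min_sample_size=125, seed=42)
print(chosen, total)
```

Because every feature on a selected map is inspected, the number of features actually inspected will usually exceed the minimum sample size slightly.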
Exhibit 3. Extract of Table II-A - Single Sampling Plan for Normal Inspection (in %)
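Exhibit 3 expresses the acceptance and rejection criteria as accuracy percentages rather than as the defect counts used in the ANSI tables. The following sketch shows that conversion and the accept/reject comparison described in Step 4. The defect counts of 5 and 6 used in the usage example are inferred from the article's own figures for a sample size of 125 (96.00% and 95.20%); the function names are illustrative.

```python
def accuracy_pct(sample_size, defects):
    """Sample accuracy (in %) when `defects` of `sample_size` inspected items are in error."""
    return 100.0 * (sample_size - defects) / sample_size


def accept_or_reject(sample_accuracy, sample_size, ac_defects, re_defects):
    """Apply the Step 4 rule: reject at or below the Re percentage, otherwise accept.

    A sample accuracy that falls between the Re and Ac percentages counts as
    accepted, as stated in the text.
    """
    ac_pct = accuracy_pct(sample_size, ac_defects)
    re_pct = accuracy_pct(sample_size, re_defects)
    decision = "reject" if sample_accuracy <= re_pct else "accept"
    return decision, ac_pct, re_pct


# Sample size 125, AQL 1.5%: Ac = 5 defects, Re = 6 defects.
decision, ac_pct, re_pct = accept_or_reject(97.6, 125, ac_defects=5, re_defects=6)
print(f"Ac = {ac_pct:.2f}%, Re = {re_pct:.2f}%, decision: {decision}")
# prints "Ac = 96.00%, Re = 95.20%, decision: accept"
```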
Step 4: Accept or Reject Determination. As shown in Exhibit 3, the Acceptance Percentage, Ac, and Rejection Percentage, Re, are first determined from Table II based on the Acceptable Quality Level (AQL) and the sample size. The Acceptable Quality Level is the maximum percent error that, for the purpose of sampling inspection, can be considered satisfactory as a process average. Compare the sample accuracy with the Ac and Re, and accept or reject the deliverable. If the sample accuracy is between the Ac and the Re, the deliverable is considered accepted.

Example: Exhibit 3 shows that for a sample size of 125 and an AQL of 1.5%, the Ac (Acceptance %) = 96.00% and the Re (Rejection %) = 95.20%. (Note: In the ANSI Sampling Plan Tables, acceptance and rejection criteria are shown as the number of defects in a sample. In Exhibit 3 this number has been converted to show acceptance and rejection criteria as sample accuracy percentages.)

Step 5: Take Corrective Action if Necessary. Review the audit results and provide feedback to the vendor on systemic errors, if appropriate. Where a deliverable was rejected, provide the sample accuracy to the vendor along with the error tally sheets and proof plots for each map in the sample. Ensure that the vendor corrects all errors found during the inspection and inspects the remaining work before resubmitting the deliverable for follow-up inspection.

The ANSI sampling procedures employ switching rules to further reduce risk and minimize the amount of sampling. There are three individual inspection levels: Normal, Tightened, and Reduced. All inspections start at the Normal inspection level and may switch to either the Tightened or the Reduced level based on the switching rules. As an example, after 10 consecutive accepted batches you would switch to the Reduced inspection level, which requires a smaller sample size for a given batch. For a complete description of the switching rules for individual inspection levels, refer to the American National Standard for Sampling Procedures and Tables for Inspection by Attributes (ANSI/ASQC Z1.4 1993).

Conclusion

You must plan for the quality of the data that is expected to support your critical business applications. After determining your database design requirements, you must document all your conversion specifications to ensure that these requirements are communicated properly to the conversion vendor and ultimately met. To achieve your data quality criteria you must be able to check that work is being performed to standards. While automated validation testing is extremely valuable, you must still plan to perform some visual or manual checking of the deliverables from the conversion vendor. Using acceptance sampling will reduce the effort involved in this process. There is an ANSI standard available for acceptance sampling, and you should use it. It is also a good idea to conduct a client/vendor walk-through for the first delivery to ensure that the vendor understands the conversion specification and that both client and vendor are interpreting the acceptance criteria in the same manner. Your investment in quality assurance planning and control will pay off in providing GIS application data that end users can trust.

References

American National Standard, Sampling Procedures and Tables for Inspection by Attributes (ANSI/ASQC Z1.4 1993).
© GISdevelopment.net. All rights reserved.