Application Integration in Data Conversion Quality Measurement
Robert G. Stengle, P. E.
Stoner Associates, Inc., 1170 Harrisburg Pike
Carlisle, PA 17013
Overview
Implementation of a Geographic Information System involves a complex mixture of computer
hardware specification and procurement, software selection, software development, and data
conversion. As with any complex system implementation, quality assurance plans are essential
in providing a means to gauge the progress of the project in meeting its objectives. An effective
quality assurance program provides timely identification of design or process deficiencies.
Creating new data, and converting and loading existing enterprise data into the GIS, together
form an extensive and costly segment of the GIS setup. The source data used to populate the GIS will
have varying degrees of currency, completeness, attribute accuracy, and spatial accuracy. In
some cases, the "pedigree" of the source data will be known, while others may be difficult to
assess. It is not unusual to have different accuracies within one source data set. As an example,
facilities mapped before a certain date or in a certain area may be more suspect than other data.
User confidence in the GIS data is essential when responding to customer, contractor, and other
inquiries or when conducting network analysis. For example, in responding to an excavation plan, the
GIS user needs to understand the completeness of the data as well as the accuracy of its spatial
presentation. Depending on the data characteristics, the contractor's request may or may not
warrant a field check by the utility.
This presentation will explore ways to establish the all-important "data about data" (metadata)
characteristics of a GIS.
Approaches to Quantifying Data Accuracy
There are a number of factors that may affect the accuracy of the source data for the GIS,
including:
- Source maps which have inadequate topographical references
- Inaccuracies in the landbase of the source data
- Inaccuracies in the landbase used in the GIS
- Inaccuracies in the field sketches for as-built facilities
- Distortions while incorporating the field sketches onto the master drawing
- Facility changes that are not incorporated in the master drawing
- Attributes which are missing from the source maps, such as material type
- Errors made in recording or transferring attribute data
- Shrink, stretch, or other distortion of the source map
- Source maps that are of a smaller scale than the normal scale used by the GIS operators and map users
- Data blurring during reproduction, due to a defective master map or a process problem that creates an erroneous pipe break or improper connection in the copy
- Unavailability or improper interpretation of detail drawings, causing a congested area to be misinterpreted during data capture
- Lack of data on the interior conditions of the equipment (e.g. valve trim, pipe interior diameter and/or wall coating thickness)
- Data sets from existing systems in which some of the information is either not validated or maintained at all, or not held to the accuracy level needed by the target GIS application(s)
Most, if not all, utilities must deal with one or more of the above data issues. The research
and/or field checking effort to fully validate the accuracy of a GIS data set would be prohibitive
for the vast majority of utilities and other distribution companies. Validation of transmission
facilities is more practical, but would still involve considerable time and expense.
Facilities verification can be carried out through field checks, record checks and/or indirect
checks. For field checking of buried facilities, magnetic sensors or other special equipment may
be used to locate the route of the facilities. Some excavation may also be required. Record
checks involve reviewing more accurate records such as individual as-built drawings or service
cards. Indirect checking involves measuring parameters in the system and comparing them with
those reported by a model built from the GIS data. Typically, the parameters involved in indirect
checking are pressures and flows in fluid systems and power, current, and voltage in electrical
systems.
Field and record checks generally involve statistical sampling and extrapolation of the sample
results to geographical or equipment groups. For some facility attributes, such as the spatial
accuracy of buried equipment, this is the only practical method of data verification. While direct
checks are costly, there is no acceptable substitute for validating certain types of data. However,
indirect checking, either alone or coordinated with field and record checks, can provide the
enterprise with improved data quality at a lower cost than would be the case if only direct checks
were conducted.
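The sampling-and-extrapolation step described above can be illustrated with a minimal sketch. It assumes a simple random sample drawn from one geographical or equipment group and uses the Wilson score interval for a binomial proportion, which is a swapped-in textbook technique and not a method stated in this paper. The function name, its parameters, and the 95 percent confidence level (z = 1.96) are hypothetical choices made for illustration, and no finite-population correction is applied.

```python
import math


def extrapolate_sample(defects_found, sample_size, population_size, z=1.96):
    """Extrapolate a sampled defect rate to a whole facility group.

    Assumes a simple random sample. The interval is the Wilson score
    interval for a binomial proportion; z=1.96 corresponds to roughly
    95 percent confidence. All names here are illustrative.
    """
    if not 0 < sample_size <= population_size:
        raise ValueError("sample_size must be in 1..population_size")
    if not 0 <= defects_found <= sample_size:
        raise ValueError("defects_found must be in 0..sample_size")

    p = defects_found / sample_size          # observed defect rate
    z2 = z * z
    denom = 1.0 + z2 / sample_size
    centre = (p + z2 / (2.0 * sample_size)) / denom
    half = (z / denom) * math.sqrt(
        p * (1.0 - p) / sample_size + z2 / (4.0 * sample_size ** 2)
    )
    low = max(0.0, centre - half)
    high = min(1.0, centre + half)

    return {
        "rate": p,
        "rate_low": low,
        "rate_high": high,
        # Scale the rate and its bounds up to the full group.
        "expected_defects": p * population_size,
        "expected_low": low * population_size,
        "expected_high": high * population_size,
    }


if __name__ == "__main__":
    # Hypothetical survey: 5 defective records in a sample of 500,
    # extrapolated to a group of 20,000 facilities.
    result = extrapolate_sample(5, 500, 20_000)
    for key, value in result.items():
        print(f"{key}: {value:.4f}")
```

The width of the interval shows how much confidence a given sample size buys. A wide interval for a suspect group, such as facilities mapped before a certain date, suggests either a larger sample or a follow-up indirect check.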
We noted above that spatial attributes are best validated through field and record checks. An
example where indirect checking would apply is valve status. A combination of field and
indirect checking may be the most effective approach for checking valves. Consider a situation
where a statistical field survey of a gas system finds that one percent of the noncritical
("convenience") valves are partly or fully shut. If there are 20,000
valves in the system, it would be a major undertaking to locate and manually check every valve.
A viable alternative would be to create a hydraulic model of the system and then investigate
areas where there are unexplained differences between the model results and the field pressures.
The model results could then be used to direct the field investigation in specific sections of the
system. An out-of-position valve could cause a disparity between the model and the field, so
valve position checks in the vicinity of pressure/flow discrepancies would be included in the
field investigation.
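The model-directed investigation described above can be sketched as follows. This is a minimal illustration, not the implementation of any particular hydraulic modeling package: the node and valve identifiers, the pressure unit (psi), the 2.0 psi tolerance, and the node-to-valve lookup are all hypothetical. It assumes the model results and field readings are already available as simple mappings keyed by node.

```python
def flag_pressure_discrepancies(model_psi, field_psi, tolerance_psi=2.0):
    """Return (node_id, model - field) pairs that exceed the tolerance.

    Only nodes present in both mappings are compared. Results are
    sorted with the largest absolute discrepancy first.
    """
    flagged = []
    for node_id, measured in field_psi.items():
        if node_id not in model_psi:
            continue  # no model result to compare against
        diff = model_psi[node_id] - measured
        if abs(diff) > tolerance_psi:
            flagged.append((node_id, diff))
    flagged.sort(key=lambda item: abs(item[1]), reverse=True)
    return flagged


def valves_to_check(flagged, valves_near_node):
    """List valves near flagged nodes, worst node first, no repeats."""
    ordered = []
    seen = set()
    for node_id, _diff in flagged:
        for valve_id in valves_near_node.get(node_id, []):
            if valve_id not in seen:
                seen.add(valve_id)
                ordered.append(valve_id)
    return ordered


if __name__ == "__main__":
    # Hypothetical model results and field readings, in psi.
    model = {"N1": 30.0, "N2": 28.0, "N3": 25.0}
    field = {"N1": 29.5, "N2": 21.0, "N3": 28.5}
    # Hypothetical lookup of valves in the vicinity of each node,
    # as might be derived from the GIS connectivity data.
    nearby = {"N2": ["V17", "V18"], "N3": ["V18", "V40"]}

    flagged = flag_pressure_discrepancies(model, field)
    print(flagged)
    print(valves_to_check(flagged, nearby))
```

Ordering the work list by discrepancy size sends crews first to the sections where the model and the field disagree most, rather than to all 20,000 valves.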