Foundation of the quality assurance odyssey
Kevin Peters
Stoner Associates, P.O. Box 86
Carlisle, PA 17013-0086
Introduction
With limited resources and compressed project schedules, it is difficult for utilities and
municipalities to perform quality assurance/quality control on 100 percent of converted data.
Due to these constraints, quality assurance/quality control is typically performed via
sampling techniques. Considering the massive investment that is tied to data, sampling
often seems inadequate and can leave the client with little confidence in the converted data.
To help build confidence that data can be validated via sampling, it is important that the
client and vendor work together to build quality into the conversion process at the beginning
of a project. To ensure this solid foundation for quality control is established, it is critical that
technical conversion specifications be developed that:
- Define how the source data is to be interpreted and put into the target system
- Define what types of source data are acceptable (and hierarchy of multiple sources)
- Define how the end product (converted data) should graphically appear in the target system
- Define the precise criteria upon which the data will be accepted
- Define how the data will be tested
- Define how errors will be tracked and recorded
These types of details pertaining to technical conversion specifications are typically not
addressed in the RFP/Proposal process; therefore, it is important that they be addressed fully
with the organization performing the conversion. Whether used by a conversion vendor or an in-house
department, it is important that technical specifications be established prior to the
commencement of conversion.
Source to target conversion specifications
It is important that both the vendor and customer have a clear understanding of how source
data is to be interpreted, put into the target GIS system, and displayed in the GIS. Detailed
specifications covering these issues can be mutually developed and addressed through a Data
Source Matrix, Source Interpretation Document, and Map Standards.
Data Source Matrix
The source data for any given project can come in a variety of formats,
including hard copy maps, digital CAD files, tabular data, legacy AM/FM/GIS data, or, in
some cases, data generated via software. In addition, different sources often depict the same
entities or attributes in different ways; and quite commonly, the 'same' information on two
different sources varies. For example, it is not uncommon to find the same entity located in
two different geographic locations on two different sources or a given attribute for the same
entity with two different values. To help organize how the source data is to be used, a Data
Source Matrix should be developed. The creation of a Data Source Matrix not only defines
how the source data is to be used, it also establishes a hierarchy describing which source
takes priority over another. Specifically, the source matrix should:
- List and describe all source data involved in conversion
- Define which entities/attributes in the physical data model are to be addressed during conversion
- Define the rules that are to be applied for those entities/attributes that are programmatically derived
- Define which source should be used to determine the existence and geographic location of each entity in the data model
- Define which source should be used in populating each attribute in the data model
- Define the source priority that should be used for each entity/attribute where source data overlap
Figure 1 shows an example of a Data Source Matrix.
The creation of the Data Source Matrix should be a mutual exercise between the
organizations that are familiar with the source data and the conversion team. In order to
complete the source matrix, it is important to understand not only what data exists on a
source, but also how it is depicted on the source. Therefore, it should be created in
conjunction with the Source Interpretation Document.
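The priority rule that a Data Source Matrix encodes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the entity, attribute, and source names, the `MatrixRow` structure, and the `resolve` helper are all hypothetical and are not taken from Figure 1 or from any particular conversion toolset.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class MatrixRow:
    """One row of a (hypothetical) Data Source Matrix."""
    entity: str
    attribute: str
    sources: Tuple[str, ...]  # source names, highest priority first


# Illustrative rows only; a real matrix is agreed between client and vendor.
MATRIX = [
    MatrixRow("Valve", "location", ("Hard copy map", "CAD file")),
    MatrixRow("Valve", "diameter", ("Tabular data", "Hard copy map")),
]


def resolve(entity: str, attribute: str, observed: dict) -> Optional[tuple]:
    """Return (source, value) from the highest-priority source that has a value.

    `observed` maps source name -> value found on that source (or None).
    Returns None when no listed source supplies a value.
    """
    for row in MATRIX:
        if (row.entity, row.attribute) == (entity, attribute):
            for source in row.sources:
                value = observed.get(source)
                if value is not None:
                    return source, value
            return None
    raise KeyError(f"{entity}.{attribute} is not covered by the matrix")


# Two sources disagree on a valve diameter; the matrix decides which one wins.
winner = resolve("Valve", "diameter", {"Hard copy map": 6, "Tabular data": 8})
```

Keeping the priority as an ordered list per entity/attribute means that a conflict between sources is settled by the matrix itself rather than by the judgment of whoever happens to be converting the map.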
Source Interpretation Document
Whether source data exists as hard copy maps, tabular formats or legacy GIS data,
specifications must exist for defining how the source data is to be interpreted against the
target data model. The Source Interpretation Document will do this, and in doing so will
demonstrate to the customer, before any data is converted, that the conversion team has a
solid understanding of their source data. The methods for creating the Source Interpretation
Document and determining its content vary depending upon the type of source data;
however, it must include a full listing of the entities, attributes, and values that exist in the
target system.
For Hard Copy Source Maps
If all hard-copy source maps contained detailed legends covering every symbol and piece of
text that exists on every source map, the need for a Source Interpretation Document would be
diminished. However, hard-copy source data typically contains a variety of graphic
representations (symbols, lines and text) for the same entity/attribute. For example, an open
valve may be represented in two ways:
Open
-----X---- or -----O-----
It should be no surprise to find varying symbology on hard copy source data; after all, one of
the goals of a GIS/AM/FM system is to standardize mapping symbology. The variations may
be due to changes made to the map standards (legend) over time, or simply due to
'cartographers' not adhering to the map standards. While variations within a given set of data
may be minimal, the situation is compounded when data for an organization has been
maintained by separate departments or record offices, and is even further compounded when
it is from a utility company that was acquired at some point in the past.
The main purpose of the Source Interpretation Document is to compile all varying source
symbologies and align them to a common entity, attribute, and value (thus symbol) in the
target GIS/AM/FM system. This process begins by identifying sets of source maps that
display unique sets of symbology. For example, all maps from a given record office may
have used a common symbol set or legend.
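As a rough sketch of this alignment, the lookup below maps a source symbol, qualified by the symbol set it came from, to a single entity, attribute, and value in the target system. The symbol set names ("Record Office A", "Record Office B"), the assignment of the two open-valve symbols to those sets, and the `interpret` helper are assumptions made purely for illustration.

```python
# Hypothetical interpretation table:
# (source symbol set, source symbol) -> (target entity, attribute, value)
INTERPRETATION = {
    ("Record Office A", "X"): ("Valve", "status", "Open"),
    ("Record Office B", "O"): ("Valve", "status", "Open"),
}


def interpret(symbol_set: str, symbol: str) -> tuple:
    """Translate a source symbol into its target entity, attribute, and value.

    Unknown symbols raise an error instead of being guessed, so that gaps in
    the interpretation table surface before conversion begins.
    """
    key = (symbol_set, symbol)
    if key not in INTERPRETATION:
        raise LookupError(f"No interpretation defined for {symbol!r} in {symbol_set!r}")
    return INTERPRETATION[key]


# Both source representations resolve to the same target value.
a = interpret("Record Office A", "X")
b = interpret("Record Office B", "O")
```

Keying the table on the symbol set as well as the symbol allows the same mark to carry different meanings on maps from different record offices.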