Foundation of the quality assurance odyssey
A sample of maps from each unique set should be fully reviewed. If map standards or a
legend exist, they provide a good starting point. Where a unique symbol, line, text or
combination of these elements exists, it should be noted, copied and matched to an
entity/attribute/value in the target data model. In theory, the sample maps should be reviewed
to the point where no new unique symbology is being found. It is important to use a
photocopy from the source map itself when conveying the symbology. This ensures the full
context of the source symbology is conveyed. For example, an 'X' along a Water Main might
mean Closed Valve but represent an entirely different entity along a Sewer Main. In addition
to a photocopy of the source symbology, any specifications or comments describing how the
asset should be interpreted and converted should be included.
Figure 2 shows an example of a Source Interpretation Document for a Hard Copy Map.
Side Note: There is an additional benefit of performing this exercise: It can assist in
determining the map standards for the GIS/AM/FM system. Determining what standardized
symbology should exist in the new system begins with knowing exactly what is
currently being used for manual mapping.
For Tabular Source Data
Interpretation of tabular data, whether from hard copy, a database, or comma-delimited
files, while less complicated than that of maps, is still important and should be
documented. Evaluating tabular data typically consists of a column mapping from the tabular
data to the target entities, attributes, and values along with any specifications or migration
rules, such as parsing of fields that might pertain to the target system. Depending on the
amount of data that is coming from tabular sources, it could be included in the mapping
document or source matrix. For example,
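A column mapping of the kind described above can be sketched in a few lines of Python. This is a minimal illustration, not a mapping from any real project: the source column names (VLV_NO, VLV_SIZE, VLV_TYPE), the target attribute names, and the parsing rule in parse_diameter are all assumptions made for the example.

```python
def parse_diameter(raw: str):
    """Assumed migration rule: '8 IN' or '8"' becomes 8; a blank field becomes None."""
    text = raw.strip().upper().replace('"', "").replace("IN", "").strip()
    return int(text) if text else None


# Hypothetical mapping: source column -> (target entity, target attribute, parsing rule)
COLUMN_MAP = {
    "VLV_NO": ("Valve", "valve_id", str.strip),
    "VLV_SIZE": ("Valve", "diameter_in", parse_diameter),
    "VLV_TYPE": ("Valve", "valve_type", lambda v: v.strip().title()),
}


def convert_row(row: dict) -> dict:
    """Apply the mapping to one source row.

    Columns without a mapping are reported rather than silently dropped,
    so they can be added to the specification or raised as a query.
    """
    values, unmapped = {}, []
    for column, raw in row.items():
        if column not in COLUMN_MAP:
            unmapped.append(column)
            continue
        entity, attribute, rule = COLUMN_MAP[column]
        values[(entity, attribute)] = rule(raw)
    return {"values": values, "unmapped": unmapped}
```

Keeping the parsing rule next to the column it applies to means the mapping document and the migration logic can be reviewed together.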
For Legacy GIS Source Data
Where source data exists in a legacy GIS/AM/FM system, the source interpretation document
typically evolves into a data mapping document where the entities, attributes and values from
the legacy system are mapped to the target system. However, the source interpretation goes
much further and also must cover the migration of graphic elements, including x and y
coordinates, as well as connectivity.
Map Standards Document
Prior to starting conversion, it is important to have a clear understanding as to what the
converted data should look like graphically. Specifically, there is a need to define the
symbols, lines, and text that will be used to represent the various combinations of features,
attributes, and values. The size of the symbols/text and the thickness of lines must also be
defined. In addition, the frequency of annotation placement must be specified. The sum of
these specifications results in a Map Standard, much of which is typically determined by or
becomes the target system's rulebase. A Map Standard such as the one in Figure 3 compiles
the graphic information and allows for an easier understanding of what the final map product
should look like.
Figure 3: Map Standards Example
Mechanism for Resolving Source/Specification Queries
No matter how much detail is established through the source matrix and source interpretation
document, the unknown always arises: there are bound to be unique instances of
symbology on the source data that are not covered by, or perhaps conflict with, the
specifications in the Source Interpretation Document. Another typical scenario occurs when a
source document is simply not legible. These conflicts essentially represent occurrences
outside the reasonable constraints of the Source Matrix/Source Interpretation Documents.
The customer will feel more comfortable with the delivered data knowing that source
anomalies are not handled by 'guessing' but are instead queried and resolved.
Therefore, a mechanism for resolving these source conflicts, and updating the specifications
with the resolution, should be part of the planning process.
Source conflicts are raised by the team performing the conversion.
That team provides queries to the experts at the utility who best understand the source data.
Those queries are answered by the source expert and returned to the conversion team. In this
process, it is important that all parties use the common terminology of the target data model
since the use of 'local' terminology often leads to confusion. For example, if the model
contains specific types of valves such as Flow Regulating Valve, Pressure Regulating Valve,
or Control Valve, it is important that both the query and resolution address exactly what type
of valve is in question. Otherwise, the answer might refer to an 'isolation valve' leaving the
recipient guessing as to exactly which valve is being discussed.
Ideally, all specification and source queries should be logged in a database by
entity, attribute, and value. This helps ensure the same query is not asked twice (perhaps
from a different map or source material), and it organizes the queries/resolutions in a manner
that facilitates updates to future versions of the conversion specifications.
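One possible shape for such a query log is sketched below using Python's built-in sqlite3 module. The table name, the column names, and the raise_query helper are hypothetical, chosen for illustration only. The sketch treats a query as a repeat when the entity, attribute, value, and question text all match an existing entry.

```python
import sqlite3


def open_log(path=":memory:"):
    """Create (or open) the query log. The schema is an assumption for this sketch."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS query_log (
               id INTEGER PRIMARY KEY,
               entity TEXT NOT NULL,
               attribute TEXT NOT NULL,
               value TEXT NOT NULL,
               question TEXT NOT NULL,
               source_ref TEXT NOT NULL,
               resolution TEXT,
               UNIQUE (entity, attribute, value, question)
           )"""
    )
    return conn


def raise_query(conn, entity, attribute, value, question, source_ref):
    """Return (query_id, is_new). A matching existing query is reused, not asked twice."""
    row = conn.execute(
        "SELECT id FROM query_log "
        "WHERE entity = ? AND attribute = ? AND value = ? AND question = ?",
        (entity, attribute, value, question),
    ).fetchone()
    if row:
        return row[0], False
    cur = conn.execute(
        "INSERT INTO query_log (entity, attribute, value, question, source_ref) "
        "VALUES (?, ?, ?, ?, ?)",
        (entity, attribute, value, question, source_ref),
    )
    conn.commit()
    return cur.lastrowid, True
```

For example, raising "Size not shown on source" for a Valve diameter from one map and again from a second map would return the same query id the second time, with is_new set to False.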
Specifications for addressing source queries should also specify a time frame for their
resolution. Many a project schedule has suffered due to delays in answering queries. In the
event turn-around time becomes a project risk, it might be wise to consider implementing
defaults for specific types of queries that can help control production if a fixed response
period has expired. These pre-defined, mutually agreed defaults for specific types of queries
could be determined by examining past queries and their associated resolutions. For example,
if a common query was that the size of a valve was not shown on the source, and the answer
to that query nine out of ten times was "set it to 4", then that answer could be used as the
default. The ability to track and identify these query/resolution trends is yet another benefit
to managing them through a database.
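The default rule described above might be derived from logged resolutions along the following lines. This is a sketch under stated assumptions: the 90 percent threshold mirrors the "nine out of ten" example, while the minimum of ten past answers, the function names, and the day-based response period are illustrative choices that the customer and the conversion team would need to agree on.

```python
from collections import Counter


def propose_default(resolutions, min_share=0.9, min_count=10):
    """Return the most common past resolution if it accounts for at least
    min_share of at least min_count answers; otherwise return None."""
    if len(resolutions) < min_count:
        return None
    answer, count = Counter(resolutions).most_common(1)[0]
    return answer if count / len(resolutions) >= min_share else None


def value_to_use(answer, days_waiting, response_period_days, default):
    """Use the expert's answer when there is one. Fall back to the agreed
    default only after the response period has expired."""
    if answer is not None:
        return answer
    if days_waiting > response_period_days and default is not None:
        return default
    return None  # still inside the response period: keep waiting
```

With nine past answers of "4" and one of "6", propose_default returns "4". An evenly split history returns None, so no default would be proposed for that type of query.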
Acceptance Criteria
A lot of attention is focused on defining the acceptance criteria with regard to the percent
accuracy that should be met. However, many RFPs and Proposals make references to
meeting different categories of accuracy but often do not fully define them or discuss the
means by which they should precisely be measured. A set of acceptance criteria should be
established that defines not only the error percentage by category, but also defines how errors
in the category will be determined as well as how they will be measured.
The methodology for defining how an error percentage should be measured depends heavily
on the category of accuracy that is being measured. Accuracy requirements for categories
that are set at 100 percent typically only need a definition. For example, if the expectation is
that connectivity and programmatically derived data must always be correct, then measuring
those against the number of attributes or entities is not critical. However, if an accuracy
requirement for a category is less than 100 percent it becomes critical for the measurement
approach to be defined. For example, completeness and attribute and location accuracy
might be measured as follows:
Attribute Accuracy
This measurement criterion is based on the total number of attribute errors found divided by
the total number of possible attributes for those entities that are sampled. An attribute error is
recorded when any attribute value for a given entity is not correct as reflected in the source
and conversion specifications. X percent of all attributes must correspond to the value that is
reflected in the source, conversion specifications, etc.
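The attribute accuracy measurement above can be expressed as a short calculation. In this sketch each sampled entity is represented as a pair of dictionaries, the converted values and the values expected from the source and conversion specifications. That representation and the function name are assumptions made for illustration.

```python
def attribute_accuracy(sampled_entities):
    """Return the share of correct attributes in a sample.

    sampled_entities: list of (converted, expected) dict pairs, one per sampled
    entity. Every attribute in `expected` counts as one possible attribute, and
    a missing or different converted value counts as one attribute error.
    """
    possible = 0
    errors = 0
    for converted, expected in sampled_entities:
        for attribute, expected_value in expected.items():
            possible += 1
            if converted.get(attribute) != expected_value:
                errors += 1
    if possible == 0:
        raise ValueError("sample contains no attributes")
    return 1 - errors / possible
```

The result is compared with the agreed threshold (the X percent above). Two sampled valves with four attributes each and a single wrong value give 1 - 1/8, or 87.5 percent.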
Positional Accuracy
This measurement criterion is based on the number of graphic features whose symbols or
lines have positional errors divided by the total number of graphic features having symbols or
lines that are contained in a given sample. A positional accuracy error will be recorded when
any symbol or line is not correctly located as reflected in the position shown on the
source or as reflected in the Map Standard placement rules. X percent of all sampled graphic
features with symbols or lines must correspond to the locational placement rules as defined in
the Map Standards and/or the relative location reflected on the source document or material.
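A positional accuracy check along these lines could be sketched as follows. The text does not say how "correctly located" is decided, so this example makes two simplifying assumptions: each sampled feature is reduced to one representative point, and a feature is in error when that point lies farther than an agreed tolerance from its reference position. The function name and the tolerance parameter are hypothetical.

```python
import math


def positional_accuracy(features, tolerance):
    """Return the share of sampled graphic features that are correctly located.

    features: list of (converted_xy, reference_xy) coordinate pairs.
    tolerance: assumed maximum allowed distance, in the same units as the
    coordinates, between the converted and the reference position.
    """
    if not features:
        raise ValueError("sample contains no graphic features")
    errors = sum(
        1 for converted_xy, reference_xy in features
        if math.dist(converted_xy, reference_xy) > tolerance
    )
    return 1 - errors / len(features)
```

In practice the reference position would come from the source document or from the Map Standard placement rules, and lines would need a comparison that accounts for their full geometry rather than a single point.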
Entity Completeness Accuracy
This measurement criterion is based on the number of entities that are found to be missing (or
extra) divided by the total number of entities that were sampled. A completeness error will be
recorded when any entity depicted on the source document or material is not reflected in the
data, or when the data contains an entity that is not depicted on the source. The count of
the entities captured must be X percent of the number of entities in the
sample.
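The completeness measurement can be sketched as a comparison of two sets of entity identifiers. This sketch assumes that entities in the sample can be matched by a shared identifier and that the denominator is the number of entities depicted on the sampled source. Both points, along with the function name, are assumptions rather than part of the criteria as stated.

```python
def completeness(source_ids, converted_ids):
    """Compare entities depicted on the sampled source with entities in the data.

    Returns the missing and extra entities plus the resulting accuracy, where
    each missing or extra entity counts as one completeness error.
    """
    source = set(source_ids)
    converted = set(converted_ids)
    if not source:
        raise ValueError("sample contains no source entities")
    missing = source - converted   # on the source, not in the data
    extra = converted - source     # in the data, not on the source
    error_rate = (len(missing) + len(extra)) / len(source)
    return {
        "missing": sorted(missing),
        "extra": sorted(extra),
        "accuracy": 1 - error_rate,
    }
```

Listing the missing and extra entities, not just the percentage, gives the conversion team something specific to correct when a sample falls short of the threshold.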