Avoiding data de-evolution
Data Evolution
Geospatial data undergoes an evolution from an initial conversion stage where data is
converted to a digital format, to an early maintenance period, and eventually to a routine
maintenance mode. At any point along that evolutionary path, data is subject to mutation via
potential migration to new systems. In addition, data quality is subjected to forces all along
the evolutionary path that may degrade it.
Initial Conversion
During the initial conversion stage, geospatial data is converted into a digital format from
unintelligent hard-copy or CAD source information or in some cases via field collection.
Utilities typically establish acceptance criteria for the initial digital conversion effort. The
criteria are typically expressed as percentage thresholds by category, e.g., 98% of all converted
attributes match the source data and 100% of all assets meet connectivity rules. Converted
data is then measured against the acceptance criteria/standard during the QA/QC check.
Assuming these two prerequisites (documented acceptance criteria and a QA/QC measurement
against them) are in place, the quality of data is known immediately when it’s ‘born’ into the
GIS. For example, the attribute accuracy was measured to be 98.5%, 100% of assets complied
with connectivity rules, etc. Once converted and measured, maintaining that known accuracy
post conversion becomes the challenge.
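
The acceptance check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the category names, the record counts, and the `Criterion` and `evaluate` names are hypothetical and do not come from any particular GIS product. Only the 98% and 100% thresholds are taken from the example in the text.

```python
from dataclasses import dataclass


@dataclass
class Criterion:
    """One acceptance criterion (hypothetical structure for illustration)."""
    name: str
    min_pass_rate: float  # fraction between 0 and 1


def pass_rate(passed: int, total: int) -> float:
    """Fraction of checked records that passed."""
    if total == 0:
        raise ValueError("no records were checked")
    return passed / total


def evaluate(results: dict[str, tuple[int, int]],
             criteria: list[Criterion]) -> dict[str, bool]:
    """Map each criterion name to True if its measured rate meets the threshold."""
    return {
        c.name: pass_rate(*results[c.name]) >= c.min_pass_rate
        for c in criteria
    }


# Thresholds from the text: 98% attribute match, 100% connectivity.
criteria = [Criterion("attribute_match", 0.98), Criterion("connectivity", 1.00)]

# Assumed QA/QC counts as (passed, total); 1970 of 2000 is the 98.5% in the text.
results = {"attribute_match": (1970, 2000), "connectivity": (500, 500)}

report = evaluate(results, criteria)
print(report)
```

Recording the measured rates alongside the pass/fail result gives a baseline against which later maintenance audits can be compared.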
Early Maintenance
Once converted, geospatial data is updated by a maintenance process in its infancy. There
is typically a feeling-out period for the new system, tools, and records maintenance process.
This early maintenance mode is a particularly dangerous period where, if not watched, data
quality could begin to regress. Even with the best-established processes and training, it
should be recognized that new records maintenance technicians are getting used to the new
tools and procedures. Where this is a utility’s initial GIS implementation, the GIS
itself is a new concept. Records maintenance staff, especially those with a CAD background,
may not grasp that geospatial data is more than a picture; it is also a supporting database with
network intelligence, etc. They may not grasp the importance of snapping connected pipes to
gain network connectivity or that annotation is generated by the database and should not be
placed or maintained in a free-form text feature. Even those familiar with a legacy system
will need to become familiar with the new system’s data model, business rules, and
tools. Any of these circumstances has the potential to degrade the data quality that was
known when the data had just been converted.
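
As an illustration of the snapping problem, the following Python sketch flags pipes whose endpoints nearly touch but are not coincident, which is the typical signature of a pipe that was drawn to look connected without being snapped. It is a simplified example under stated assumptions: the pipe IDs, coordinates, and tolerance are made up, the function name `unsnapped_pairs` is hypothetical, and only endpoint-to-endpoint gaps are examined (a real connectivity check would also consider pipe ends that fall along another pipe).

```python
from itertools import combinations
from math import hypot

Point = tuple[float, float]


def unsnapped_pairs(pipes: dict[str, tuple[Point, Point]],
                    tolerance: float) -> list[tuple[str, str]]:
    """Return pairs of pipe IDs whose endpoints are close but not coincident."""
    suspects = []
    for (id_a, ends_a), (id_b, ends_b) in combinations(pipes.items(), 2):
        # Pipes sharing an exactly coincident endpoint are treated as snapped.
        if any(pa == pb for pa in ends_a for pb in ends_b):
            continue
        # A small non-zero gap suggests the pipes were meant to connect.
        if any(0 < hypot(pa[0] - pb[0], pa[1] - pb[1]) <= tolerance
               for pa in ends_a for pb in ends_b):
            suspects.append((id_a, id_b))
    return suspects


# Hypothetical pipes: P1 and P2 are snapped; P3 starts 0.05 units short of P2.
pipes = {
    "P1": ((0.0, 0.0), (10.0, 0.0)),
    "P2": ((10.0, 0.0), (20.0, 0.0)),
    "P3": ((20.05, 0.0), (30.0, 0.0)),
}

suspects = unsnapped_pairs(pipes, tolerance=0.1)
print(suspects)
```

Running a report like this periodically during early maintenance can surface training gaps before the unsnapped features accumulate.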
Routine Maintenance
At some point the maintenance staff becomes comfortable with the new system and data
maintenance leaves the infancy stage and becomes a routine endeavor. While data appears to
be relatively safe during this stage, quality can still be in jeopardy. Records maintenance staff
are resourceful and will likely develop methods that are not covered in documented
procedures or specifications. Some of those ‘workarounds’ will be helpful, some will be
harmless, and some will have a negative effect.
Some utilities outsource maintenance during this stage, and introducing new people to the
process can be particularly dangerous to data quality, especially if there are undocumented
specifications. A typical scenario might look like this:
John the engineer always symbolizes valves with a circle on his sketches and then jots
down a note about the kind of valve. Unfortunately, a standard symbol is not part of
the official documentation and most engineers show valves with bow tie symbols
(circles are used for something else). Jane the record maintenance technician, who
inputs all of John’s work, has worked with John long enough that she knows what he
means. But Jane is the only one who knows this, and when a new technician
or vendor gets involved, that ‘undocumented’ specification is typically not conveyed.
Migration
Often, just as maintenance becomes routine, new and better technology comes along and
data must undergo a migration to a new system. If done properly, migration typically does
not jeopardize quality as the core data should not change, i.e., if your assets were in the right
location, with the right attributes, and proper connectivity in the old system, then they should
migrate with the same characteristics into the new system. While the migration effort itself
shouldn’t have a big impact, problems that already exist in the legacy data may be carried
over or amplified by the new system. For example, the legacy data/system
may have been subject to a less stringent rulebase where a pipe was not required to have a
pipe size in order to be posted to the database. If the new system puts such a requirement in
place, then much of the legacy data would be in violation of the new standard. Ideally the
problematic data would be fixed as part of the migration process, but in practice these
fixes are often left out of, or overlooked in, the migration scope and are therefore
left to be handled post migration.
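
One way to size this problem before migration is to run the new system's required-field rules against the legacy records. The Python sketch below is a minimal illustration, assuming legacy records are available as dictionaries. The field names (`id`, `pipe_size`, `material`), the sample values, and the function name `missing_required` are hypothetical.

```python
def missing_required(records: list[dict],
                     required_fields: list[str]) -> dict[str, list[str]]:
    """Map each record ID to the required fields that are absent or empty."""
    report = {}
    for rec in records:
        missing = [f for f in required_fields if rec.get(f) in (None, "")]
        if missing:
            report[rec["id"]] = missing
    return report


# Hypothetical legacy pipes; the old rulebase did not require pipe_size.
legacy_pipes = [
    {"id": "P-1", "pipe_size": 8, "material": "PVC"},
    {"id": "P-2", "pipe_size": None, "material": "DI"},
    {"id": "P-3", "material": "CI"},
]

# Assumed new-system rule: every pipe must carry a pipe size.
violations = missing_required(legacy_pipes, ["pipe_size"])
print(violations)
```

The resulting list of violating records gives the project a concrete count to include in, or explicitly exclude from, the migration scope.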
Safeguarding the records maintenance process
With the multitude of evolutionary forces that geospatial data is subjected to, it is
important to ensure that safeguards are in place from the inception of a maintenance process.
Fortunately, there are several safeguards a utility can put in place to help protect the
integrity of data and prevent it from degrading. The most obvious is the establishment of a
QA/QC process whereby updated data is checked prior to being ‘posted’ to the live system.
Additional safeguards, above and beyond QA/QC, can be built into the GIS and records
maintenance process from its inception. If these techniques weren’t applied during
implementation, they likely can be incorporated into an existing system. Like any other
process, the maintenance process itself changes over time and should be open to process
improvement.
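
The QA/QC step described above, in which updates are checked before being posted, can be pictured as a gate that runs a set of rule checks over each edited feature and holds back anything that fails. The following Python sketch is illustrative only: the feature fields, the two example rules, and the function names are assumptions, not the behavior of any specific GIS.

```python
from typing import Callable, Optional

# A check inspects one edited feature and returns an error message, or None if it passes.
Check = Callable[[dict], Optional[str]]


def require_pipe_size(feature: dict) -> Optional[str]:
    """Hypothetical rule: every pipe must have a pipe size."""
    if feature.get("type") == "pipe" and not feature.get("pipe_size"):
        return "pipe is missing pipe_size"
    return None


def require_valve_type(feature: dict) -> Optional[str]:
    """Hypothetical rule: every valve must record its valve type."""
    if feature.get("type") == "valve" and feature.get("valve_type") is None:
        return "valve is missing valve_type"
    return None


def qa_qc_gate(edits: list[dict],
               checks: list[Check]) -> tuple[list[dict], dict[str, list[str]]]:
    """Split edits into those safe to post and a map of rejected IDs to errors."""
    accepted, rejected = [], {}
    for feature in edits:
        errors = [msg for check in checks if (msg := check(feature)) is not None]
        if errors:
            rejected[feature["id"]] = errors
        else:
            accepted.append(feature)
    return accepted, rejected


# Hypothetical batch of edits awaiting posting.
edits = [
    {"id": "P-10", "type": "pipe", "pipe_size": 6},
    {"id": "P-11", "type": "pipe"},
    {"id": "V-3", "type": "valve", "valve_type": "gate"},
]

accepted, rejected = qa_qc_gate(edits, [require_pipe_size, require_valve_type])
print([f["id"] for f in accepted], rejected)
```

Keeping each rule as a separate function makes it straightforward to add a new check when an undocumented convention is discovered and written down.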