Providing data to the masses in stages
Tracking the Work
When the conversion effort is being done off-site, the conversion group will need to work in
a current copy of the live GIS database. The subset of data that has been specified as a batch
(or delivery area) is the only data that will be delivered back to the user. A method for
tracking edits, moves, deletes, and additions against the original source data can be used so
that only data that has been changed or added is delivered to the live database. One way of
doing this is to reserve or save a copy of the original live database as a static database and
compare it against the conversion database. Using SQL statements to run through the
attribute fields in the database tables, a list of features that have been modified can easily be
compiled and used to extract the data for appending into the live environment. The table in
Exhibit 3, below, shows an example of a database table that was built to track changes in the
database. It shows a comparison of old and new values at the attribute level.
Exhibit 3 – Results of a Database Comparison
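The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the method used on any particular project: it assumes both copies of the data sit in one SQLite database, and the table names (`static_pipes`, `conv_pipes`), the key column (`feature_id`), and the attribute fields are all hypothetical.

```python
import sqlite3

# Hypothetical attribute fields shared by the static and conversion tables.
FIELDS = ["material", "diameter_mm"]


def compare_snapshots(conn, static_table="static_pipes",
                      conv_table="conv_pipes", fields=FIELDS):
    """Return (feature_id, change_type, field, old_value, new_value) rows."""
    changes = []
    # Table and field names are interpolated into the SQL text, so they
    # must come from trusted code, never from user input.
    for field in fields:
        rows = conn.execute(
            f"SELECT s.feature_id, s.{field}, c.{field} "
            f"FROM {static_table} s JOIN {conv_table} c "
            f"ON s.feature_id = c.feature_id "
            # IS NOT is SQLite's NULL-safe inequality test.
            f"WHERE s.{field} IS NOT c.{field} ORDER BY s.feature_id"
        ).fetchall()
        changes += [(fid, "modified", field, old, new) for fid, old, new in rows]

    # Features present only in the conversion copy were added.
    added = conn.execute(
        f"SELECT c.feature_id FROM {conv_table} c LEFT JOIN {static_table} s "
        f"ON s.feature_id = c.feature_id "
        f"WHERE s.feature_id IS NULL ORDER BY c.feature_id"
    ).fetchall()
    changes += [(fid, "added", None, None, None) for (fid,) in added]

    # Features present only in the static copy were deleted.
    deleted = conn.execute(
        f"SELECT s.feature_id FROM {static_table} s LEFT JOIN {conv_table} c "
        f"ON s.feature_id = c.feature_id "
        f"WHERE c.feature_id IS NULL ORDER BY s.feature_id"
    ).fetchall()
    changes += [(fid, "deleted", None, None, None) for (fid,) in deleted]
    return changes


def demo_compare():
    """Build two small in-memory tables and compare them."""
    conn = sqlite3.connect(":memory:")
    for name in ("static_pipes", "conv_pipes"):
        conn.execute(
            f"CREATE TABLE {name} "
            "(feature_id INTEGER PRIMARY KEY, material TEXT, diameter_mm INTEGER)"
        )
    conn.executemany("INSERT INTO static_pipes VALUES (?, ?, ?)",
                     [(1, "PVC", 100), (2, "steel", 150), (3, "PVC", 200)])
    conn.executemany("INSERT INTO conv_pipes VALUES (?, ?, ?)",
                     [(1, "PVC", 100), (2, "steel", 200), (4, "copper", 50)])
    return compare_snapshots(conn)
```

The list of feature IDs in the result can then drive the extract for the append, so that unchanged features are never touched in the live database.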
Managing the data append
Answering the following questions before each delivery will make appending data much easier.
- Is the converted data independent of data in the live database?
- Does the batch being appended share a common border with the data in the live
database?
- Were updates made to data that was already in the live database?
Each of these conversion scenarios requires a different method of appending data into the
live environment.
Appending Data Without Connectivity
When the batch of data does not share connectivity with the live data and conversion work is
carried out independent of record updates, then the only consideration is timing the data
append so it minimizes disruption to the user.
Appending Data With Common Borders
Converting a batch of data that has a common 'edge' with data in the live environment
requires that edge to remain locked to users during conversion. As each batch is delivered,
the shared edge is unlocked and the data is appended into the database. The data that will
border the next batch then has to be relocked. Exhibit 4, below, shows an example of batches
that share an edge and how the data is locked (gray area) after each batch is appended into the
live database. This method of appending data continues until all of the data is converted.
Exhibit 4 – Locking Data on a Shared Edge
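The unlock-and-relock cycle can be pictured with a small sketch. It assumes locks are tracked as a simple set of feature IDs. The class name, method names, and IDs are all hypothetical, and a production GIS would normally rely on its own locking or versioning mechanism instead.

```python
class EdgeLocks:
    """Tracks live features that are locked because they border an unconverted batch."""

    def __init__(self, locked_ids=()):
        self.locked = set(locked_ids)

    def can_edit(self, feature_id):
        """Users may edit a feature only while it is not locked."""
        return feature_id not in self.locked

    def deliver_batch(self, delivered_edge_ids, next_edge_ids):
        """Move the lock from the edge just delivered to the edge of the next batch."""
        # Unlock the edge shared with the batch that was just appended.
        self.locked -= set(delivered_edge_ids)
        # Lock the features that will border the next batch.
        self.locked |= set(next_edge_ids)


def demo_locks():
    # Hypothetical feature IDs: P10 and P11 border batch 1, P20 and P21 border batch 2.
    locks = EdgeLocks(locked_ids={"P10", "P11"})
    editable_before = locks.can_edit("P10")
    locks.deliver_batch(delivered_edge_ids={"P10", "P11"},
                        next_edge_ids={"P20", "P21"})
    return (editable_before, locks.can_edit("P10"),
            locks.can_edit("P20"), sorted(locks.locked))
```

In this sketch the lock follows the conversion front: features on a delivered edge become editable again, while the features bordering the next batch are protected until that batch arrives.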
Appending Data With Updates
As discussed earlier, converting data for batches that involve updates to data and attributes of
features that still exist in the live data environment requires carefully tracking all updates
made to the live data. The live environment must be 'prepared' before appending this data.
One method of doing this is to select all the features in the live environment that correspond
to the changed features in the conversion environment, and then delete the updated assets
from the live environment before appending the subset that was converted, keeping in mind
that the user's dataset should have been locked. Using this approach for appending data
ensures that connectivity is maintained and that updates performed by the records group do
not get overwritten.
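A minimal sketch of this delete-then-append step is shown below. It assumes the live and conversion data are tables with identical columns in one SQLite database, and that the affected features were already locked by some other means. The table names, the `feature_id` key, and the sample rows are hypothetical.

```python
import sqlite3


def append_with_updates(conn, changed_ids,
                        live_table="live_pipes", conv_table="conv_pipes"):
    """Replace changed features in the live table with their converted versions."""
    ids = sorted(set(changed_ids))
    if not ids:
        return
    marks = ",".join("?" * len(ids))
    # The connection context manager commits on success and rolls back on
    # error, so the live table is never left with the deletes but not the append.
    with conn:
        # Prepare the live environment: remove the old versions of changed features.
        conn.execute(
            f"DELETE FROM {live_table} WHERE feature_id IN ({marks})", ids)
        # Append only the converted subset; features deleted during conversion
        # have no row in the conversion table and so are not re-inserted.
        conn.execute(
            f"INSERT INTO {live_table} "
            f"SELECT * FROM {conv_table} WHERE feature_id IN ({marks})", ids)


def demo_append():
    conn = sqlite3.connect(":memory:")
    for name in ("live_pipes", "conv_pipes"):
        conn.execute(
            f"CREATE TABLE {name} "
            "(feature_id INTEGER PRIMARY KEY, material TEXT, diameter_mm INTEGER)"
        )
    # Feature 5 stands for an update made by the records group that the
    # conversion group never touched.
    conn.executemany("INSERT INTO live_pipes VALUES (?, ?, ?)",
                     [(1, "PVC", 100), (2, "steel", 150),
                      (3, "PVC", 200), (5, "clay", 300)])
    conn.executemany("INSERT INTO conv_pipes VALUES (?, ?, ?)",
                     [(1, "PVC", 100), (2, "steel", 200), (4, "copper", 50)])
    # IDs taken from the change list: 2 modified, 3 deleted, 4 added.
    append_with_updates(conn, changed_ids=[2, 3, 4])
    return conn.execute(
        "SELECT * FROM live_pipes ORDER BY feature_id").fetchall()
```

Because only the IDs in the change list are deleted and re-appended, rows such as feature 5 in the demo survive the append untouched.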
Concluding Remarks
The purpose of this discussion was to introduce issues that might encourage utilities to
consider converting data in stages as an alternative to the commonly used approach of
converting all of the data prior to a GIS going live. Getting the data into the hands of the
business as quickly as possible will build confidence in the conversion effort, provide an
early learning opportunity, and deliver an early win on a high-risk project. A utility should
begin by identifying the benefits that will be realized at both the corporate and user level.
Then, by defining the needs of the business and establishing an understanding of critical data,
the project team will be able to develop a conversion strategy that fits those needs. An
important factor in making staged conversion work to a utility's advantage is defining
the process for transitioning the datasets into a live GIS. This should include
determining how the datasets interact with each other, defining each 'batch' of data to be
delivered, and deciding on a schedule for each data append. Designing and implementing a
sound plan will make it possible to realize the benefits of providing data to a utility in stages.