A pilot project for landbase migration
It was agreed that the grouping of objects and attributes would support all departments in AGL.
It was also decided that data delivery would be in Geographic projection, using NAD83 and decimal
degrees. For purposes of the pilot project, the data would also be published in UTM zone 16, NAD83, in
meters. This would allow the data to be overlaid onto USGS DOQQs for a spatial “sanity check”.
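The relationship between the two delivery projections can be illustrated with a short sketch. It assumes the GRS80 ellipsoid (the ellipsoid on which NAD83 is defined), the standard UTM scale factor and false easting, and the widely published Snyder series for the Transverse Mercator projection. The function names and the sample coordinates are illustrative only and were not part of the pilot project.

```python
import math

# GRS80 ellipsoid (used by NAD83) and standard UTM constants.
A_GRS80 = 6378137.0
F_GRS80 = 1 / 298.257222101
K0 = 0.9996
FALSE_EASTING = 500000.0


def utm_zone(lon_deg):
    """Return the UTM zone number (1-60) that contains a longitude."""
    return int((lon_deg + 180) // 6) % 60 + 1


def to_utm(lat_deg, lon_deg, zone):
    """Project decimal degrees to UTM easting/northing in meters
    (northern hemisphere only, Snyder series truncated at sixth order)."""
    e2 = F_GRS80 * (2 - F_GRS80)      # first eccentricity squared
    ep2 = e2 / (1 - e2)               # second eccentricity squared
    lat = math.radians(lat_deg)
    lon0 = math.radians(zone * 6 - 183)   # central meridian of the zone
    n = A_GRS80 / math.sqrt(1 - e2 * math.sin(lat) ** 2)
    t = math.tan(lat) ** 2
    c = ep2 * math.cos(lat) ** 2
    a = (math.radians(lon_deg) - lon0) * math.cos(lat)
    # Meridian arc length from the equator to the latitude.
    m = A_GRS80 * (
        (1 - e2 / 4 - 3 * e2**2 / 64 - 5 * e2**3 / 256) * lat
        - (3 * e2 / 8 + 3 * e2**2 / 32 + 45 * e2**3 / 1024) * math.sin(2 * lat)
        + (15 * e2**2 / 256 + 45 * e2**3 / 1024) * math.sin(4 * lat)
        - (35 * e2**3 / 3072) * math.sin(6 * lat)
    )
    easting = FALSE_EASTING + K0 * n * (
        a
        + (1 - t + c) * a**3 / 6
        + (5 - 18 * t + t**2 + 72 * c - 58 * ep2) * a**5 / 120
    )
    northing = K0 * (
        m
        + n * math.tan(lat) * (
            a**2 / 2
            + (5 - t + 9 * c + 4 * c**2) * a**4 / 24
            + (61 - 58 * t + t**2 + 600 * c - 330 * ep2) * a**6 / 720
        )
    )
    return easting, northing


# Illustrative point only; longitudes between 90W and 84W fall in zone 16.
zone = utm_zone(-84.3)
print(zone, to_utm(33.77, -84.3, zone))
```

In production work a maintained projection library would normally be preferred over hand-coded series; the sketch is only meant to show what the conversion from decimal degrees to zone 16 meters involves.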
For purposes of the pilot project, ArcView® GIS format was used to display the data. For full rollout the
data delivery would be in the form of a Geodatabase or ArcSDE™ files. A specific RDBMS was not
determined at the outset of the project.
Conflation
(Alan Witmer, 2001)
GDT acquired the DeKalb County GIS coverage from GSDC and conflated the horizontal control network
into its core database. The process of the vector conflation activity is described below. GDT’s approach
was to modularize the software and the process for each step listed below, allowing for individual
development, tuning, independent operation, and quality assurance.
Correlating features in two landbases (conflation) requires these steps:
Prepare the databases for conflation processing.
Analyze the incoming data’s quality and usability, and convert as necessary to a common format.
Build a common representation
GDT builds a topological representation from the selected features to be matched in each landbase in order
to filter out unwanted detail and form two congruous data sets, to organize the remaining data into chains
and their intersection/end nodes, and to generate units of geography that can be meaningfully compared.
This process also provides software-generated attribution or information to guide the correlation process
past ambiguities such as tight multiple-lane highway representations.
The topological model aggregates the remaining linear features to make meaningful entities or “chains”.
For example, a chain of arcs representing a street centerline, running uninterrupted from one intersection to
the next, might be considered an aggregate. The operator defines the aggregation rules for each conflation
so that the model can avoid aggregating wherever a significant attribute – such as name or feature type –
changes. This can help matching when both data sets reliably record a given attribute. For example, if
both landbases record street name with a high degree of accuracy, then a name change along a street, even
if it is not at an intersection, should be considered as a node between two distinct chains.
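The aggregation rule described above can be sketched as follows. This is a minimal illustration, not GDT's software: it assumes each arc records its two end nodes and a street name, and it merges two arcs only where exactly two arcs meet at a node and the operator-defined attribute test passes. The function name build_chains and the arc layout are hypothetical.

```python
from collections import defaultdict


def same_name(arc_a, arc_b):
    """Example aggregation rule: arcs may be joined only if the names agree."""
    return arc_a["name"] == arc_b["name"]


def build_chains(arcs, same_attributes=same_name):
    """Group arcs into chains.

    arcs: dict of arc_id -> {"nodes": (node1, node2), "name": str}
    Returns a sorted list of chains, each a sorted list of arc ids.
    """
    parent = {arc_id: arc_id for arc_id in arcs}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    incident = defaultdict(list)
    for arc_id, arc in arcs.items():
        for node in arc["nodes"]:
            incident[node].append(arc_id)

    for arc_ids in incident.values():
        # Only a node where exactly two arcs meet can be interior to a chain;
        # three or more arcs indicate an intersection, one arc a dead end.
        if len(arc_ids) == 2:
            a, b = arc_ids
            if same_attributes(arcs[a], arcs[b]):
                parent[find(a)] = find(b)

    chains = defaultdict(list)
    for arc_id in arcs:
        chains[find(arc_id)].append(arc_id)
    return sorted(sorted(chain) for chain in chains.values())


arcs = {
    1: {"nodes": ("A", "B"), "name": "North St"},
    2: {"nodes": ("B", "C"), "name": "North St"},
    3: {"nodes": ("C", "D"), "name": "Main St"},  # name changes at node C
    4: {"nodes": ("D", "E"), "name": "Main St"},
    5: {"nodes": ("D", "F"), "name": "B St"},     # node D is an intersection
}
print(build_chains(arcs))
```

With this sample data, arcs 1 and 2 form one chain, the name change at node C starts a new chain even though C is not an intersection, and the three-way intersection at node D separates arcs 3, 4, and 5.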
GDT also builds additional information at this time. In particular, the software locates and marks multiple-lane
roads per user-supplied criteria. It assigns a directional flag to indicate on which side the
counterpart is found. This prevents ambiguity later, eliminating the possibility that the wrong lanes will be
matched.
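One simple way to derive such a side flag, offered here as an assumption rather than as GDT's method, is the sign of a 2D cross product between the chain's overall direction and the vector to the counterpart's centroid. The function name and the "L"/"R" encoding are hypothetical.

```python
def counterpart_side(chain, counterpart):
    """Return "L" or "R" for the side of `chain` (in its digitized
    direction) on which `counterpart` lies, or None if it is collinear.

    Both arguments are lists of (x, y) vertices in a planar projection.
    """
    (x0, y0), (x1, y1) = chain[0], chain[-1]
    # Centroid of the counterpart's vertices.
    cx = sum(x for x, _ in counterpart) / len(counterpart)
    cy = sum(y for _, y in counterpart) / len(counterpart)
    # Positive cross product: the centroid is to the left of the direction
    # of travel from the chain's first vertex to its last.
    cross = (x1 - x0) * (cy - y0) - (y1 - y0) * (cx - x0)
    if cross > 0:
        return "L"
    if cross < 0:
        return "R"
    return None


eastbound = [(0, 0), (10, 0)]
westbound = [(10, 1), (0, 1)]
print(counterpart_side(eastbound, westbound), counterpart_side(westbound, eastbound))
```

Two opposing carriageways digitized in their direction of travel each see the other on the same side, so a flag mismatch between candidate chains is a strong hint that the wrong lanes are being paired.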
Matching
Identify common elements

Figure 1 above illustrates the basic challenge of matching. We see a view of two overlaid street centerline
databases. At first glance, it seems that they represent the same area. We see a major road in each
database, with a common route number and similar heading. There is a development to the northeast of
that road in each case, with some similarity in names and geography. Even the crook in North St below the
highway (label 1) bears enough similarity to that of the unnamed road to prompt a mental match, despite
the difference in detail. But there are significant issues for software: roads are more angular in one
database, lengths and proportions vary significantly, and streets that should match are often not nearest
neighbors (label 1, North St, is a case in point). The following labeled areas illustrate other common
challenges:
- Corresponding streets meet in differing intersection configurations like the North St/Unnamed
intersection with Route 16.
- The names are similar, but not exact: “Alton Hgts Ln” versus “Afton Ln”.
- Two stretches of road in one database (North St.) match to only one in the other, and the single item
must be conceptually split in order to build a one-to-one relationship.
- The B St/Unnamed match continues further in one database than the other, and conflation must decide
how much of the more-complete street should be matched.
The process begins with node matching. Nodes are the confluence of a great deal of information, and are
thus the places where pivotal matches can be assured. As with most other conflation software developers,
GDT uses iterative matching, choosing the strongest node matches in an early pass, and then conceptually
rubber sheeting and using neighborhood information to match in repeated passes, continuing as long as new
matches can be found.
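The iterative idea can be sketched in a few lines. The example below is a deliberately crude stand-in: it assumes nodes are plain (x, y) points, uses distance alone as the match strength, and approximates "conceptual rubber sheeting" by shifting each unmatched node by the average displacement of its nearest already-matched neighbors. The names match_nodes and local_offset, the tolerance, and the neighbor count k are all hypothetical.

```python
import math


def local_offset(pt, matches, nodes_a, nodes_b, k=3):
    """Average displacement of the k matched nodes nearest to pt."""
    if not matches:
        return 0.0, 0.0
    nearest = sorted(matches, key=lambda a: math.dist(pt, nodes_a[a]))[:k]
    dx = sum(nodes_b[matches[a]][0] - nodes_a[a][0] for a in nearest) / len(nearest)
    dy = sum(nodes_b[matches[a]][1] - nodes_a[a][1] for a in nearest) / len(nearest)
    return dx, dy


def match_nodes(nodes_a, nodes_b, tol):
    """Match node ids of nodes_a to node ids of nodes_b in repeated passes."""
    matches = {}
    while True:
        used_b = set(matches.values())
        candidates = []
        for a, pt in nodes_a.items():
            if a in matches:
                continue
            # Predict where this node should fall in the other landbase.
            dx, dy = local_offset(pt, matches, nodes_a, nodes_b)
            predicted = (pt[0] + dx, pt[1] + dy)
            for b, qt in nodes_b.items():
                if b not in used_b:
                    d = math.dist(predicted, qt)
                    if d <= tol:
                        candidates.append((d, a, b))
        if not candidates:
            return matches  # no new matches can be found
        # Accept the strongest (closest) candidates first, one per node.
        taken_a, taken_b = set(), set()
        for d, a, b in sorted(candidates):
            if a not in taken_a and b not in taken_b:
                matches[a] = b
                taken_a.add(a)
                taken_b.add(b)


nodes_a = {"a1": (0, 0), "a2": (100, 0), "a3": (200, 0)}
nodes_b = {"b1": (1, 0), "b2": (104, 0), "b3": (207, 0)}
print(match_nodes(nodes_a, nodes_b, tol=5))
```

In the sample data, a3 lies 7 units from b3 and fails the first pass; once a1 and a2 are matched, their average offset brings the predicted position within tolerance and the pair is picked up in the second pass.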
Node matching uses two match agents. One agent analyzes the candidate nodes’ rubber-sheeted offset and
area density of nodes. A second attempts to build an optimal “test match” of all the feature chains that are
incident at the node pair, to determine the similarity of the local features at the nodes.
Following node matching, the GDT process uses the matched nodes as a guide to matching the topological
chains. Chain match criteria include agents that weigh:
- overall orientation of the line or significant shape
- convexity/concavity
- overall length
- neighboring node topology and match status
- affine transformation of both lines based on calculated trend, and
- the overall quality of all other characteristics if one or the other chain were split.
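A weighted combination of agents can be illustrated with two of the simpler criteria above, overall orientation and overall length. The scoring functions and the weights below are assumptions chosen for illustration; the source does not describe how GDT's agents compute or combine their scores.

```python
import math


def bearing(chain):
    """Overall direction (radians) from a chain's first vertex to its last."""
    (x0, y0), (x1, y1) = chain[0], chain[-1]
    return math.atan2(y1 - y0, x1 - x0)


def length(chain):
    """Sum of segment lengths along the chain."""
    return sum(math.dist(p, q) for p, q in zip(chain, chain[1:]))


def orientation_agent(c1, c2):
    """1.0 for parallel chains, near 0.0 for perpendicular ones.
    Taking the absolute cosine ignores the digitizing direction."""
    return abs(math.cos(bearing(c1) - bearing(c2)))


def length_agent(c1, c2):
    """Ratio of the shorter length to the longer, in the range 0.0 to 1.0."""
    l1, l2 = length(c1), length(c2)
    return min(l1, l2) / max(l1, l2) if max(l1, l2) > 0 else 0.0


def chain_score(c1, c2, weights=(0.6, 0.4)):
    """Weighted sum of agent scores; the weights are hypothetical."""
    scores = (orientation_agent(c1, c2), length_agent(c1, c2))
    return sum(w * s for w, s in zip(weights, scores))


street = [(0, 0), (10, 0)]
parallel = [(0, 1), (5, 1), (10, 1)]
crossing = [(0, 0), (0, 10)]
print(chain_score(street, parallel), chain_score(street, crossing))
```

Keeping each agent as an independent function returning a normalized score makes it straightforward to tune weights per conflation or to switch an agent off when its input is unreliable.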
In addition, the following attribute-based match agents may be enabled if the associated attributes are
available and reliable:
- name (using a tunable fuzzy text match algorithm)
- feature classification
- multiple-lane side
- polygonal boundary coding (such as presence in an incorporated city boundary)
- other attributes, such as permanent ID
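As an illustration of a tunable fuzzy name agent, the sketch below uses the SequenceMatcher class from Python's standard difflib module. This is a substitute technique chosen for the example; the source does not say which text-matching algorithm GDT uses, and the normalization step and the 0.6 threshold are assumptions.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase, drop periods, and collapse runs of whitespace."""
    return " ".join(name.lower().replace(".", "").split())


def name_agent(name_a, name_b, threshold=0.6):
    """Return (score, is_match) for two street names.

    score is the SequenceMatcher similarity ratio in the range 0.0 to 1.0;
    threshold is the tunable cut-off above which the names are accepted.
    """
    score = SequenceMatcher(None, normalize(name_a), normalize(name_b)).ratio()
    return score, score >= threshold


for pair in [("Alton Hgts Ln", "Afton Ln"), ("North St.", "north st"), ("B St", "Afton Ln")]:
    print(pair, name_agent(*pair))
```

Raising the threshold makes the agent stricter about near-misses such as “Alton Hgts Ln” versus “Afton Ln”; lowering it admits more candidates for the geometric agents to confirm or reject.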