A Pilot Project for Landbase Migration
Dave Magee, Account Manager, Utilities
Jay Clark, Product Manager, Utilities
Bart Guetti, Software Engineer

Geographic Data Technology, Inc.
11 Lafayette St
Lebanon NH 03766
Telephone: 800 331-7881 ext. 1112
Fax: 603 653-0249
E-mail: jay_clark@gdt1.com, bart_guetti@gdt1.com

Abstract

This presentation examines data modeling, topological maintenance, and alignment of multiple spatial data sets in a project for Atlanta Gas Light Company. Processes include:
This presentation discusses a pilot project undertaken by Geographic Data Technology, Inc. (GDT) for Atlanta Gas Light Company (AGL), the eighth largest natural gas distribution utility in the United States. Faced with advances in software and GIS technology since the development of their last GIS, AGL decided that it was time both to replace their GIS software environment and to migrate to a more accurate and up-to-date landbase.

When GDT met with AGL in January 2001, discussions of the landbase migration process centered on three key issues.

Data Quality
AGL wanted the new landbase to be spatially accurate enough to overlay an aerial image of equal or better quality than a United States Geological Survey digital orthophoto quarter quadrangle (USGS DOQQ). This would result in an assumed horizontal accuracy of +/- 5 to 7 meters from “ground truth”. AGL also stipulated that the attribution available with the data needed to be current and complete enough for service request and Customer Information System (CIS) purposes.

Management of Facilities Data Conversion
AGL wanted to manage the alignment of their facility data to the new landbase in the most accurate and efficient manner possible.

Cost
AGL sought to balance the level of effort needed to do the job correctly against a moderate budget.

AGL contracted a pilot project for DeKalb County, Georgia, in order to demonstrate the feasibility of a solution from a commercial vendor such as GDT. DeKalb is one of the heavily urbanized counties that make up the metropolitan Atlanta region.

![]()

The New Landbase

Data Quality Criteria
The following criteria were developed for the new landbase GIS:
Spatial accuracy within the stated tolerance would be obtained using digital vector data from the Georgia State Data Clearinghouse (GSDC) or, where vectors were unavailable, data created from USGS DOQQs. The following data model was selected.

Landbase GIS Transportation Data Model

![]()

It was agreed that the grouping of objects and attributes would support all departments in AGL. It was also decided that data delivery would be in geographic coordinates, using NAD83 and decimal degrees. For purposes of the pilot project, the data would also be published in UTM zone 16, NAD83, in meters. This would allow the data to be overlaid onto USGS DOQQs for a spatial “sanity check”.

For purposes of the pilot project, ArcView® GIS format was used to display the data. For full rollout, data delivery would be in the form of a Geodatabase or ArcSDE™ files. A specific RDBMS had not been determined at the outset of the project.

Conflation (Alan Witmer, 2001)

GDT acquired the DeKalb County GIS coverage from GSDC and conflated the horizontal control network into its core database. The vector conflation process is described below.

GDT’s approach was to modularize the software and the process for each step listed below, allowing for individual development, tuning, independent operation, and quality assurance. Correlating features in two landbases (conflation) requires these steps:

Prepare the Databases for Conflation Processing
Analyze the incoming data’s quality and usability, and convert as necessary to a common format.

Build a Common Representation
GDT builds a topological representation from the selected features to be matched in each landbase. The purpose is to filter out unwanted detail and form two congruous data sets, to organize the remaining data into chains and their intersection/end nodes, and to generate units of geography that can be meaningfully compared.
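To make the idea of chains and their intersection/end nodes concrete, here is a minimal Python sketch. It is an illustration only, not GDT's software. It assumes a hypothetical input format in which each arc is an `(arc_id, from_node, to_node)` tuple, treats any node touched by exactly two arcs as an interior point of a chain, and assumes there are no self-loop arcs.

```python
from collections import defaultdict

def build_chains(arcs):
    """Group arcs into chains that run between intersection or end nodes.

    arcs: list of (arc_id, from_node, to_node) tuples (hypothetical format).
    Returns a list of chains, each a list of arc_ids in traversal order.
    """
    incident = defaultdict(list)          # node -> arc_ids touching it
    for arc_id, a, b in arcs:
        incident[a].append(arc_id)
        incident[b].append(arc_id)
    ends = {arc_id: (a, b) for arc_id, a, b in arcs}

    def is_chain_node(node):
        # Dead ends (1 arc) and intersections (3 or more arcs) bound a chain.
        return len(incident[node]) != 2

    visited = set()
    chains = []

    def walk(start_arc, start_node):
        chain = []
        arc, node = start_arc, start_node
        while arc is not None and arc not in visited:
            visited.add(arc)
            chain.append(arc)
            a, b = ends[arc]
            node = b if node == a else a  # step to the far end of the arc
            if is_chain_node(node):
                break
            # Interior node: continue along the one other incident arc.
            others = [x for x in incident[node] if x != arc]
            arc = others[0] if others else None
        return chain

    # Start a chain at every intersection or end node.
    for node in list(incident):
        if is_chain_node(node):
            for arc_id in incident[node]:
                if arc_id not in visited:
                    chains.append(walk(arc_id, node))

    # Anything left belongs to closed loops with no intersection node.
    for arc_id, a, _ in arcs:
        if arc_id not in visited:
            chains.append(walk(arc_id, a))
    return chains


# Example: a street A-B-C that forks at C into C-D and C-E.
example = [(1, "A", "B"), (2, "B", "C"), (3, "C", "D"), (4, "C", "E")]
chains = build_chains(example)
```

In the example, arcs 1 and 2 merge into a single chain because node B is not an intersection, while arcs 3 and 4 each form their own chain starting at the fork C. A production model would add further break rules and attribution beyond pure topology.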
This process also provides software-generated attribution or information to guide the correlation process past ambiguities such as tight multiple-lane highway representations.

The topological model aggregates the remaining linear features to make meaningful entities or “chains”. For example, a chain of arcs representing a street centerline, running uninterrupted from one intersection to the next, might be considered an aggregate. The operator defines the aggregation rules for each conflation so that the model can avoid aggregating wherever a significant attribute, such as name or feature type, changes. This can help matching when both data sets reliably record a given attribute. For example, if both landbases record street name with a high degree of accuracy, then a name change along a street, even if it is not at an intersection, should be considered a node between two distinct chains.

GDT also builds additional information at this time. In particular, the software locates and marks multiple-lane roads per user-supplied criteria. It assigns a directional flag to indicate on which side the counterpart is found. This prevents ambiguity later, eliminating the possibility that the wrong lanes will be matched.

Matching
Identify common elements

![]()

Figure 1 above illustrates the basic challenge of matching. We see a view of two overlaid street centerline databases. At first glance, it seems that they represent the same area. We see a major road in each database, with a common route number and similar heading. There is a development to the northeast of that road in each case, with some similarity in names and geography. Even the crook in North St below the highway (label 1) bears enough similarity to that of the unnamed road to prompt a mental match, despite the difference in detail.
But there are significant issues for software: roads are more angular in one database, lengths and proportions vary significantly, and streets that should match are not often nearest neighbors (label 1, North St, is a case in point). The following labeled areas illustrate other common challenges:
Node matching uses two match agents. One agent analyzes the candidate nodes’ rubber-sheeted offset and the area density of nodes. A second attempts to build an optimal “test match” of all the feature chains that are incident at the node pair, to determine the similarity of the local features at the nodes. After node matching, the GDT process uses the matched nodes as a guide for matching the topological chains. Chain match criteria include agents that weigh:
With match information, the process can generate a rubber sheet mapping for use in realigning associated facility data. It can also identify points where the mathematical model breaks because the topology has changed significantly or wherever there is a large amount of “shear” in the warp model.

The conflation process described above was also used to correlate the existing AGL landbase to GDT’s spatially improved landbase data. Using the software correlation process, a Control Vector data set was produced.

Control Vectors

Control vectors are a linking data set that correlates the nodes of the existing AGL landbase with those of the new landbase. Seen as a text file, the data set looks like this:

-97479453, 32956206, -322, 799

Each record is a longitude/latitude pair in decimal degrees (with an implied decimal point giving six decimal places), followed by an offset equal to the horizontal change, expressed in the same implied-decimal units. The same file can be converted to a “link” data type for Arc by adding the offset values to the coordinates:

(-97479453 + -322) = -97479775
(32956206 + 799) = 32957005

Warping

AGL selected the following items for warping:
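As a worked illustration of the control vector arithmetic in the Control Vectors section above, the following Python sketch parses one record and computes the "from" and "to" points of the corresponding link. The function names are hypothetical, and the record layout (longitude, latitude, longitude offset, latitude offset, all as integers with six implied decimal places) is assumed from the single sample record shown in this paper.

```python
SCALE = 1_000_000  # six implied decimal places, per the sample record

def control_vector_to_link(record):
    """Return ((from_lon, from_lat), (to_lon, to_lat)) as scaled integers.

    record: one text line such as "-97479453, 32956206, -322, 799".
    """
    lon, lat, d_lon, d_lat = (int(field) for field in record.split(","))
    return (lon, lat), (lon + d_lon, lat + d_lat)

def to_degrees(point):
    """Convert a scaled-integer (lon, lat) pair to decimal degrees."""
    lon, lat = point
    return lon / SCALE, lat / SCALE

# The sample record from the text.
from_pt, to_pt = control_vector_to_link("-97479453, 32956206, -322, 799")
# to_pt is (-97479775, 32957005), matching the sums shown above.
from_deg = to_degrees(from_pt)
to_deg = to_degrees(to_pt)
```

Keeping the values as integers until the final conversion avoids floating-point rounding while the offsets are being applied.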
The point features were merged into one feature named “Fittings”. The line (arc) features were left as the pipeseg coverage. The original text data file was also converted to an Arc Link file as discussed earlier in this paper.

The facilities data was warped using ARCEDIT™ software. A watch file was created to demonstrate the procedure; it is shown below. Note that in the watch file, text in capitals indicates items the user must supply.

General Procedure for Warping:

Arc: |> ae <|
Copyright (C) 1982-2001 Environmental Systems Research Institute, Inc.
All rights reserved.
ARCEDIT (COGO) 8.1 (Fri Mar 16 11:31:29 PST 2001)
Arcedit: |> ec arcpipes <|
The edit coverage is now ENTER LOCATION OF COVERAGE TO BE WARPED HERE
WARNING the Map extent is not defined
Defaulting the map extent to the BND of COVERAGE TO BE WARPED
Arcedit: |> de arcs links <|
Arcedit: |> ef links <|
Adding the extreme boundary points as hull points
Please wait...
8 element(s) for edit feature LINKS
Arcedit: |> nodesnap closest .00001 <|
Arcedit: |> weedtolerance .00001 <|
Arcedit: |> grain .00001 <|
Arcedit: |> get delinks <|
Copying the links from ADD LINK DATA COVERAGE LOCATION HERE into COVERAGE TO BE WARPED
13396 link(s) copied
Arcedit: |> draw <|
Please wait...
Arcedit: |> select poly <|
Define the polygon
HERE YOU WILL NEED TO DEFINE THE EXTENTS OF THE COVERAGE THAT YOU WANT TO WARP BY DRAWING A POLYGON
<1,2 to enter, 4 to remove last point, 5 to remove polygon, 9 to end>
12196 element(s) now selected
Arcedit: |> nsel <|
1208 element(s) now selected
Arcedit: |> delete <|
1208 link(s) deleted
Arcedit: |> save <|
Saving changes for COVERAGE TO BE WARPED
Saving arcs...
** NOTE ** Arc(s) unchanged
Reopening arcs...
Please wait...
Saving links...
12196 link records(s) written to LINK COVERAGE from the original 0 link, 12196 added and 0 deleted
Reopening links...
Please wait...
BND replaced into COVERAGE TO BE WARPED
Saving set tolerances to TOL file...
Re-establishing edit feature LINK
Arcedit: |> limitadjust poly <|
Define the polygon
HERE YOU WILL AGAIN TRACE THE EXTENTS OF THE AREA TO BE WARPED
<1,2 to enter, 4 to remove last point, 5 to remove polygon, 9 to end>
Limiting polygon has an area: 8762.961 and a perimeter: 39758.033
Deleting all links falling outside of limiting area...
Adding the perimeter of the limiting area as identity links...Please wait...
Arcedit: |> adjust bivariate <|
Adjusting coverage COVERAGE TO BE WARPED
Building the adjustment structure from the links for the first pass...
Proximal tolerance set to 0.000...
Removing duplicate points within tolerance...
Within tolerance 0. Remaining 6193...
Proximal tolerance set to 0.000...
adjusting ARCs...
Please wait...
adjusting LABELs...
Updating the adjustment structure for the second pass...
adjusting ARCs...Please wait...
Please wait...
adjusting LABELs...
Arcedit: |> save <|
Saving changes for COVERAGE TO BE WARPED
Saving arcs...
5740 arc attribute record(s) written to COVERAGE TO BE WARPED
5740 arc(s) written to COVERAGE TO BE WARPED from the original 5740, 5740 added and 5740 deleted
Reopening arcs...
Please wait...
Saving labels...
** NOTE ** Label(s) unchanged
Reopening labels...
Please wait...
Saving links...
6193 link records(s) written to COVERAGE TO BE WARPED from the original 12196 link, 6192 added and 12195 deleted
Reopening links...
Please wait...
BND replaced into COVERAGE TO BE WARPED
Saving set tolerances to TOL file...
Re-establishing edit feature LINK
Arcedit: |> &w &off <|

The results of the warping can be observed in the illustration below.

![]()

Overall results in the data set show that the warping process has made a good start at facilities realignment. There are several issues with each warping method that need to be addressed:
Conclusion

With the results of the pilot project in hand, AGL studied the warped facilities data carefully. Both GDT and AGL staff members warped the data independently and obtained very similar results. As a result of this pilot project, AGL decided to move forward with a similar process for their new landbase migration project. The current schedule calls for completion of the entire migration effort by early 2003.

Acknowledgement

Witmer, A., 2001, “The Best of Both Worlds: Vector Conflation of Database Segments and Attributes”, presentation at ESRI User Conference 2001.