GISdevelopment.net ---> GITA 1999 ---> Data Development and Evolution

Centerline Magic

Tim Sosinski
M.J. Harden Associates, Inc., 1019 Admiral Boulevard
Kansas City, MO 64106


Introduction
For many years the intelligent street centerline file (or address range guide) has been one of the most utilitarian tools in a Geographic Information System (GIS). The benefits of a street centerline file go well beyond creating a nice-looking street map. Street centerline files provide quick references for address location, incident reporting and aggregation, geocoding, vehicle routing and address matching. They are found in planning offices, utility companies and emergency dispatch centers. These files are versatile and useful because the design is simple and practical and has withstood the test of time.

Although using this tool is simple, building a file is no easy matter. Many communities have constructed digital versions of their centerline files, but have not propagated the address ranges because of time and cost constraints. The process described in the following text provides a way to efficiently and easily construct address ranges for a street centerline. It avoids costly field collection and provides a method to audit the address ranges for files that were built using traditional means.

Basic Street Centerline Model
The basic model for an intelligent street centerline file consists of a graphic data set and a descriptive tabular data set. Generally, each record in the data file is a graphic and tabular pair that corresponds to a single street segment between intersecting streets. For example, the segment of Main St. between 1st Ave. and 2nd Ave. would be represented by one record in the centerline file. The tabular data record comprises several descriptive fields. Some of the fields describe the segment as a whole, and other fields describe characteristics of either the left or the right side of the street segment. Typically the fields that describe the segment as a whole are its direction, name and type (such as N - Main - St.). The fields that describe the left-side and right-side characteristics are generally the left-side address range and the right-side address range. The left-side and right-side address information is often listed as a range consisting of the low and high addresses.

The best way to describe a centerline address record is with a real-world example. Suppose someone were compiling data for Main Street between 1st and 2nd Avenue. If the compiler "stood" at the intersection of 1st and Main and faced 2nd Ave., he would look to his left and record the address nearest the intersection in the left low address field. He would similarly record the address number of the structure on his right in the right low address field. The compiler would then proceed down Main Street. At 2nd Ave. he would look to his left, determine the house number of the last structure before the intersection and record it in the left high address field. Then, looking to his right, he would likewise record the number for the right high address field. The resulting record would look something like this:

Table 1: Standard Street Segment Address Format
Record Id  Dir.  Name  Type  Lft_Low  Lft_High  Rgt_Low  Rgt_High
132        N     Main  St    100      136       101      145

The record in the example above is based on a couple of assumptions. The first assumption is that the graphic record (the coordinate record) and the tabular record both have "directionality." The graphic record consists of a starting coordinate (an x,y pair) and an ending coordinate. Referring again to the example above, the starting coordinate (commonly called the "from node") for the segment on Main Street would be the coordinate pair at 1st and Main. It then follows that the coordinates at the 2nd and Main intersection would be recorded as the ending coordinate, or "to node."
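The record structure described above can be sketched as a small data type. This is only an illustration: the field names follow Table 1, while the class name, the `full_name` helper and the coordinates are assumptions made for the example and do not come from any particular GIS package.

```python
from dataclasses import dataclass
from typing import Tuple

Coordinate = Tuple[float, float]  # an (x, y) pair


@dataclass
class CenterlineSegment:
    """One graphic/tabular pair: a street segment between two intersections."""
    record_id: int
    direction: str
    name: str
    street_type: str
    left_low: int
    left_high: int
    right_low: int
    right_high: int
    from_node: Coordinate  # starting coordinate, e.g. 1st Ave. and Main
    to_node: Coordinate    # ending coordinate, e.g. 2nd Ave. and Main

    def full_name(self) -> str:
        # Join the non-empty name components, e.g. "N Main St".
        parts = (self.direction, self.name, self.street_type)
        return " ".join(p for p in parts if p)


# Record 132 from Table 1; the coordinates are invented for illustration.
main_st = CenterlineSegment(132, "N", "Main", "St", 100, 136, 101, 145,
                            from_node=(0.0, 0.0), to_node=(0.0, 500.0))
```

Left and right are only meaningful relative to the from node and to node, which is why the two coordinates are stored in order.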

The second assumption deals with how the addresses are recorded. There are two schools of thought regarding the recording of address ranges. In the example presented above, the recorded address ranges represent the "actual" or "field verified" address ranges. The other possibility is to record the "theoretical" or "hundred block" range. The hundred block approach records addresses as complete one-hundred block ranges. Below is a comparison between the two approaches.

Table 2: Actual and Theoretical Range Comparison

Actual
Dir.  Name  Type  Lft_Low  Lft_High  Rgt_Low  Rgt_High
N     Main  St    100      136       101      145

Theoretical
Dir.  Name  Type  Lft_Low  Lft_High  Rgt_Low  Rgt_High
N     Main  St    100      198       101      199
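A theoretical range can be derived from any single address on the block. The sketch below assumes that a block spans exactly one hundred numbers and that each side of the street carries a single parity; the function name is hypothetical, and communities with other numbering schemes would need a different rule.

```python
def hundred_block_range(address: int) -> tuple:
    """Return the theoretical (low, high) range for the block and parity of `address`."""
    block_start = (address // 100) * 100  # e.g. 136 -> 100
    parity = address % 2                  # 0 for even, 1 for odd
    return block_start + parity, block_start + 98 + parity


left = hundred_block_range(136)   # an even address from the actual left range
right = hundred_block_range(145)  # an odd address from the actual right range
```

Applied to the actual high addresses in the comparison, this yields the theoretical left range 100 to 198 and right range 101 to 199.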

The differences between the two approaches may be observed by comparing the corresponding high addresses. The theoretical method accommodates a broader range of possible addresses. The choice also has operational impacts beyond how the data are stored:

Table 3: Match Rate Comparison
Task                 Actual  Theoretical
Maintenance Costs    Higher  Lower
Match Rate           Lower   Higher
Location Estimation  Higher  Lower

Which strategy to use for recording address ranges depends on the needs of the user, and different agencies will have different requirements. Some users find it beneficial to store and maintain both types of address range within their centerline file.
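The match rate and location estimation trade-off can be illustrated with a simple address matcher. The sketch assumes linear interpolation between the low and high addresses, a common geocoding technique that the text does not itself prescribe; the function name `locate` is hypothetical.

```python
from typing import Optional


def locate(house_number: int, low: int, high: int) -> Optional[float]:
    """Return the fractional position along the segment (0.0 = from node,
    1.0 = to node), or None when the house number does not match the range."""
    if house_number % 2 != low % 2:
        return None  # wrong parity for this side of the street
    if not low <= house_number <= high:
        return None  # outside the recorded range
    if high == low:
        return 0.5   # single-address range: place at the midpoint
    return (house_number - low) / (high - low)


actual = (100, 136)       # left side, actual range
theoretical = (100, 198)  # left side, theoretical range

at_end = locate(136, *actual)          # last house lands at the to node
shifted = locate(136, *theoretical)    # same house lands about 37% along
unmatched = locate(150, *actual)       # no match against the actual range
matched = locate(150, *theoretical)    # matches the theoretical range
```

House number 136 is placed at the end of the block by the actual range but only about a third of the way along by the theoretical range, while house number 150 matches only the theoretical range.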

Recalling the example above, collecting centerline street names and address ranges has traditionally meant sending collectors into the field. They would “walk” each street segment, recording (or filming) the addresses along the way. Although accurate, this method can be slow. Data collection errors are costly to correct, often requiring another trip to the field. With the proliferation of intelligent and spatially referenced data sets of all kinds, it is now possible to draw on this wealth of digital information to improve both the data collection for and the construction of street centerline files.

Automatic Address Range Assignment
It is now possible to automatically assign address ranges. The method described below is applicable in the following instances:
  • The user wishes to record actual address ranges
  • A spatially-referenced address point file is available for use as a substitute for field verification
  • Centerlines lacking address range information must contain accurate street names that are consistent with the street naming conventions of the source point address data
System Inputs
Suitable input data must be available for the process to work properly. Appropriate source point address information is usually a parcel centroid file containing a “situs” address. A situs address is the actual address of the property, as opposed to a mailing address. A parcel centroid may be stored as a graphic digital centroid file or as x and y coordinates recorded in a data file. Centroids of all types of parcels may be used, including both vacant and improved property. Another type of source point address data is an address point file that contains an x,y coordinate and an address record for every address in a community.

The addresses associated with the centroids or the address point file must be standardized to a degree. The address components should be logically formatted, such as one field for each of the following: house number, street direction, street name and street type. The street name portion of the record must be consistent from record to record. For example, all address points or centroids located on Dearborn Street must have Dearborn spelled the same way, and not Dearborne or Deerborn. Most of these naming errors can be cleaned up by creating a standard street name table and comparing the file against it. The file may also be parsed using one of the many commercially available software packages. All records with partial information must be corrected or purged from the file. For example, records with a blank house number or blank street type will create false address ranges if allowed to remain in the file.

The centerline file must meet similar requirements before input. Principally, the street name portion of the data record must have the same format as the street name portion in the centroid or address point file. If one file defines the street name item as 20 bytes of character type, then the other file must define it the same way. The centerline file must also be scrubbed for completeness. Every effort must be made to assign a street direction, name and type to each graphic element. Any street segments that will not have abutting addressable features, such as interstates, limited access highways and access ramps, may be temporarily removed from the file. If all conditions are met for both files, then processing may begin for automatic address range assignment.
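The clean-up rules above can be sketched as a simple audit pass. The standard name table, the field names and the `audit` function are all hypothetical; a production workflow would more likely rely on a commercial address parser and would route rejected records to an operator for correction.

```python
# Hypothetical standard street name table for the community.
STANDARD_NAMES = {"DEARBORN", "ELM", "ASH", "MAIN"}

# Fields that must be present; direction is optional on many streets.
REQUIRED_FIELDS = ("house_number", "name", "street_type")


def is_complete(record: dict) -> bool:
    """True when every required field is present and non-blank."""
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            return False
    return True


def audit(records):
    """Split records into (usable, rejected) lists."""
    usable, rejected = [], []
    for record in records:
        name = str(record.get("name") or "").strip().upper()
        if is_complete(record) and name in STANDARD_NAMES:
            usable.append({**record, "name": name})  # store the normalized name
        else:
            rejected.append(record)  # needs correction or purging
    return usable, rejected


records = [
    {"house_number": "110", "direction": "N", "name": "Dearborn", "street_type": "St"},
    {"house_number": "112", "direction": "N", "name": "Dearborne", "street_type": "St"},
    {"house_number": "", "direction": "N", "name": "Dearborn", "street_type": "St"},
]
usable, rejected = audit(records)
```

Here the misspelled street name and the blank house number are both rejected, leaving one usable record.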

System Process
The first step is to create a table of unique street direction, name and type combinations from each of the input files. The two tables may then be compared to identify any obvious no-assignment conditions. For example, if there are 12 parcels with addresses on Elm Street but no street segments by that name in the street centerline file, those addresses cannot be matched. The centroid file and centerline file are then read simultaneously. If the first street to be processed is Ash Street, then all the street segments named Ash Street and all the parcel centroids on Ash Street are extracted and written to two temporary files.
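The comparison of unique street combinations amounts to a set difference. A minimal sketch, assuming each record is a dictionary with hypothetical `direction`, `name` and `street_type` keys and invented sample data:

```python
def street_keys(records):
    """Unique (direction, name, type) combinations found in a file."""
    return {(r["direction"], r["name"], r["street_type"]) for r in records}


centroids = [
    {"direction": "N", "name": "ELM", "street_type": "ST", "house_number": 102},
    {"direction": "", "name": "ASH", "street_type": "ST", "house_number": 12},
]
centerlines = [
    {"direction": "", "name": "ASH", "street_type": "ST", "segment_id": 7},
    {"direction": "", "name": "OAK", "street_type": "AVE", "segment_id": 9},
]

point_keys = street_keys(centroids)
line_keys = street_keys(centerlines)

addresses_without_street = point_keys - line_keys   # obvious no-match conditions
streets_without_addresses = line_keys - point_keys  # segments that receive no range
to_process = sorted(point_keys & line_keys)         # streets present in both files
```

Only the streets in `to_process` need to be extracted into the temporary files; the two difference sets can be reported for review before processing starts.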


A proximity command is then performed (in Arc/Info this is the NEAR command) which calculates the distance from the centroid to the nearest arc segment. This information is recorded in a third file.

Table 4: Process Input Record
Seg-ID  Point-ID  Address
154     1257      110 N Elm St
154     1256      102 N Elm St
154     1258      105 N Elm St
154     1265      101 N Elm St
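The proximity step that pairs each point with a segment can be approximated in a few lines. This is not the Arc/Info NEAR implementation; it is a sketch that assumes every segment is a straight line between two nodes (real arcs may contain intermediate vertices), with hypothetical function names and invented coordinates.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the straight segment a-b; all are (x, y) tuples."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return math.hypot(px - ax, py - ay)  # degenerate segment
    # Project p onto the segment and clamp the projection to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def nearest_segment(point, segments):
    """segments maps a segment id to (from_node, to_node).
    Returns (segment id, distance) for the closest segment."""
    return min(
        ((seg_id, point_segment_distance(point, a, b))
         for seg_id, (a, b) in segments.items()),
        key=lambda pair: pair[1],
    )


segments = {
    154: ((0.0, 0.0), (0.0, 100.0)),
    155: ((0.0, 100.0), (0.0, 200.0)),
}
seg_id, dist = nearest_segment((-20.0, 30.0), segments)
```

Because the segments and centroids have already been filtered to a single street name, a centroid can only be paired with a segment of the street it is addressed on.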

Once this information has been collected, each point is analyzed to determine the side of the segment on which it lies, its relative position along the segment, and the parity of its address. The address range record is then generated.

ID      Street    Even Low  Even High  Odd Low  Odd High
000154  N Elm St  102       110        101      105
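Generating the range record from the paired points is a matter of grouping by segment and parity and then taking the minimum and maximum. The sketch below uses the four records from Table 4; the function name and the output layout are assumptions.

```python
from collections import defaultdict

# (segment id, point id, house number) taken from Table 4.
near_records = [
    (154, 1257, 110),
    (154, 1256, 102),
    (154, 1258, 105),
    (154, 1265, 101),
]


def build_ranges(records):
    """Return {segment id: {"even": (low, high), "odd": (low, high)}}.
    A parity with no addresses on the segment is reported as None."""
    numbers = defaultdict(lambda: {"even": [], "odd": []})
    for seg_id, _point_id, house_number in records:
        parity = "even" if house_number % 2 == 0 else "odd"
        numbers[seg_id][parity].append(house_number)
    return {
        seg_id: {
            parity: (min(values), max(values)) if values else None
            for parity, values in by_parity.items()
        }
        for seg_id, by_parity in numbers.items()
    }


ranges = build_ranges(near_records)
```

For segment 154 this reproduces the record shown above: even 102 to 110 and odd 101 to 105.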

Having collected the necessary information, it is now possible to determine all address points that abut the street segment. We can also determine:
  • The high and low ranges for each side of each segment
  • The parity of each side of each segment
  • Whether the direction of the line segment is consistent with the direction of addresses
One side benefit of this process is the creation of a tally sheet of addresses per segment. This may not appear to be much of a benefit, but it provides a strong indication of the reliability of the resulting address ranges. Another benefit is the identification of anomalous addresses: those located within the wrong hundred block or on the wrong side of the street. Some post-processing must be performed to “fill in” the blanks where only one abutting street address exists along a street segment. This occurs in sparsely populated areas on the urban fringe. It also occurs in heavily developed urban areas, where a single structure or parcel may occupy an entire city block. An additional post-processing step corrects the direction of line segments identified as running contrary to the direction of the addresses.
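The side, position and direction tests can be sketched with a cross product and a projection. The approach below is an assumption about how such a test might be written, not a description of the author's software. It treats each segment as a straight line of non-zero length, and it judges direction only by comparing the first and last house numbers along the segment, which is a deliberately crude heuristic.

```python
def side_and_position(point, from_node, to_node):
    """Return (side, t): side is "left", "right" or "on" when facing the to node,
    and t is the fractional position along the segment (0.0 at the from node)."""
    ax, ay = from_node
    bx, by = to_node
    px, py = point
    dx, dy = bx - ax, by - ay
    cross = dx * (py - ay) - dy * (px - ax)  # sign gives the side of the line
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    side = "left" if cross > 0 else "right" if cross < 0 else "on"
    return side, t


def runs_against_addresses(points, from_node, to_node):
    """points is a non-empty list of ((x, y), house_number).
    True when house numbers decrease from the from node to the to node."""
    placed = sorted(
        (side_and_position(xy, from_node, to_node)[1], number)
        for xy, number in points
    )
    return placed[-1][1] < placed[0][1]


# Invented coordinates for the four Elm St addresses; the segment runs north.
points = [((-20.0, 10.0), 102), ((-20.0, 90.0), 110),
          ((20.0, 15.0), 101), ((20.0, 80.0), 105)]
start, end = (0.0, 0.0), (0.0, 100.0)

west_side = side_and_position((-20.0, 10.0), start, end)[0]
as_drawn = runs_against_addresses(points, start, end)
reversed_line = runs_against_addresses(points, end, start)
```

A segment flagged by `runs_against_addresses` would be corrected by swapping its from node and to node, after which the left and right sides also exchange.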

The end result of this process is an automated tool that mimics the field collection techniques used for so long in building street centerlines. Instead of sending field personnel onto the streets, a software module does the looking and encoding. The automated tool “looks” at existing data sets using a consistent process based on high school trigonometry and Boolean logic. With this automated approach, intelligent street range graphic files may be constructed in less time and with less field work, resulting in a file that is more accurate and more defendable.

Conclusion
An automated address assignment process has many advantages over traditional field or map compilation methods:
  • A significant time and cost reduction
  • Defendable and repeatable address range construction
  • Graphic and tabular editing in one automated job stream
  • Address statistics for each centerline segment
  • Periodic recompilation is an option
  • Subjective operator interpretation is removed from the process