GISdevelopment.net ---> GITA 2000 ---> The best of the rest

Automated QA/QC procedures for Cablevision's Landbase database

Somkid Nimnual
Sr. GIS Programmer/Analyst, Cablevision Systems Corporation
1111 Stewart Ave., Bethpage, NY 11714-3581, Office: (516) 803-3907
Fax: (516) 803-3796

Timothy G. McKee
Sr. Engagement Manager, Navigant Consulting, Inc.
2100 Wharton Street, Pittsburgh, PA 15203
Office: (412) 390-2050
Fax: (412) 390-2049


Introduction
Cablevision Systems Corporation is one of the nation's leading telecommunications and entertainment companies. Its operations range from high-speed Internet access and robust cable television packages to professional sports teams and national television program networks. The corporation was founded in 1973 as a cable television operator serving 1,500 households in New York's Long Island suburbs. Today, from its headquarters on Long Island, Cablevision serves more than 3.4 million subscribers, primarily in the New York, New Jersey, Connecticut, Boston, and Cleveland metropolitan areas.

Cablevision has been implementing the Smallworld and Model.it applications since 1998 to replace its existing non-GIS AutoCAD and Magic design applications. The goal is to provide operational support system (OSS) applications for end-to-end design and management of fiber networks, from inside plant (rack-mounted equipment), fiber plant, and coax plant to outside plant design and the building of network elements and supporting structures. The application will also be integrated with the OSI NetExpert system for network fault conditions, the Remedy system for ticketing, and the field mobile application.

The initial implementation included:
  • Documenting the fiber network in the Long Island area.
  • Installing a central data server at the Long Island headquarters office, with a WAN connection to local persistent cache servers and workstations in engineering offices located in the Bronx, Islandia, Hicksville, New Jersey, and other locations.
  • Installing the Smallworld and Model.it applications and building images.
  • Converting the Landbase data.

Landbase Source Data
The Enhanced Dynamap/2000 from Geographic Data Technology, Inc. (GDT) was selected as the base map for the Model.it application. GDT produced the enhancement by rectifying the existing Dynamap/2000 against 1-meter ortho-rectified aerial photography, known as Digital Orthophoto Quads (DOQ), acquired through the National Digital Orthophoto Program, which is managed by the US Geological Survey. The orthophotography meets the National Map Accuracy Standards (NMAS), which specify that 90 percent of the well-defined points tested must fall within 1/30 inch on the map, which corresponds to 33.3 feet on the ground at 1:12,000 scale. Measurements against differential GPS (DGPS) ground coordinates for existing intersections yielded a horizontal coordinate accuracy ranging between 4.2 and 7.8 meters RMSE (root mean square error).
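The NMAS figure quoted above is a unit conversion: a tolerance measured on the map, multiplied by the scale denominator, gives the tolerance on the ground. The sketch below reproduces that arithmetic and shows one common way to compute a horizontal RMSE from paired map and DGPS coordinates. The function names and the RMSE definition (root mean square of the point-to-point horizontal distances) are assumptions for illustration; the paper does not state which formula was used.

```python
import math


def nmas_ground_tolerance_feet(scale_denominator, map_tolerance_inches=1 / 30):
    """Convert a map-distance tolerance (inches) to ground distance (feet)."""
    ground_inches = map_tolerance_inches * scale_denominator
    return ground_inches / 12.0


def horizontal_rmse(map_points, gps_points):
    """Root mean square of horizontal distances between paired (x, y) points.

    Assumed definition: sqrt(mean(dx^2 + dy^2)), with both point lists in the
    same linear unit and the same order.
    """
    if len(map_points) != len(gps_points) or not map_points:
        raise ValueError("need two non-empty point lists of equal length")
    squared = [
        (mx - gx) ** 2 + (my - gy) ** 2
        for (mx, my), (gx, gy) in zip(map_points, gps_points)
    ]
    return math.sqrt(sum(squared) / len(squared))


# 1/30 inch at 1:12,000 scale is 400 inches on the ground, about 33.3 feet.
print(round(nmas_ground_tolerance_feet(12_000), 1))
```

The same two functions can be reused for other scales or tolerances by changing the arguments.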

The Enhanced Dynamap/2000 was delivered county by county, in decimal degrees and in ArcView shapefile format. The map is then projected to the Lambert Conformal Conic projection (preserving shape for east-west extents), based on two standard parallels, the NAD 83 datum, and the GRS80 spheroid, with units in feet. The Enhanced Dynamap/2000 comes in a nationwide directory (USA) with state and county subdirectories and layers. The layers of interest used in Cablevision's Model.it data model are Streets, Railroad, Water Segments, Water Polygon, Park, Institutions, Large Area Landmarks, Recreational Areas, Retail Centers, Transportation Terminal, Airports, Zip Code, and Minor and Sub Civil Divisions (MCDs). Each layer is composed of spatial records classified by Feature Class Codes (FCC). The FCC series, which is based on the USGS classification codes in the DLG-3 file, provides more detailed information on the classification of each line segment, such as the class of road or the class of stream. For example, A11 is assigned to limited-access undivided highways and A15 to limited-access divided highways. These FCC assignments are used to separate each type of record into different classes of Landbase objects in the Model.it data model. The Landbase source data are shown in ArcView in Figure 1.a.
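As a rough illustration of how FCC values can drive the split into Landbase classes, the sketch below groups records by a lookup on the code. Only the A11 and A15 descriptions come from the text above; the prefix-to-class table, the record layout, and the function names are hypothetical and would need to be checked against GDT's FCC documentation and the actual Model.it data model.

```python
from collections import defaultdict

# Descriptions taken from the text above.
FCC_DESCRIPTIONS = {
    "A11": "limited access, undivided highway",
    "A15": "limited access, divided highway",
}

# Hypothetical rule: the leading letter of the FCC selects the Landbase class.
# These three entries are illustrative assumptions, not the real mapping.
PREFIX_TO_LANDBASE_CLASS = {
    "A": "Street",
    "B": "Transportation Route",
    "H": "Waterway",
}


def landbase_class_for(fcc):
    """Return the Landbase class for an FCC, or 'Unclassified' if unknown."""
    return PREFIX_TO_LANDBASE_CLASS.get(fcc[:1].upper(), "Unclassified")


def split_by_landbase_class(records):
    """Group records (dicts with an 'fcc' key) by their Landbase class."""
    groups = defaultdict(list)
    for record in records:
        groups[landbase_class_for(record["fcc"])].append(record)
    return dict(groups)


sample = [
    {"id": 1, "fcc": "A11"},
    {"id": 2, "fcc": "A15"},
    {"id": 3, "fcc": "H11"},
    {"id": 4, "fcc": "Z99"},
]
groups = split_by_landbase_class(sample)
```

Keeping the rule in a lookup table rather than in code means the classification can be reviewed and corrected without touching the grouping logic.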


Figure 1.a. GDT's Enhanced Dynamap/2000 in the ArcView application.

Figure 1.b. The Landbase data model in the Model.it application.

Model.it Landbase Data Model
In Smallworld's Model.it application, the database models are grouped into categories according to the application's modules. These categories are outside plant equipment, drafting and drawing, Landbase, DXF object, rack-mounted equipment, circuit, copper, microwave, conduit, relationship table, and miscellaneous.

In the Landbase category, the data model is sub-classed into manifolds, mainly address, building, campus, complex, count, lot, non-political boundary, political boundary, road edge, street, street annotation, transportation boundary, transportation route, waterbody, and waterway (Figure 1.b). Each manifold is further subdivided into classes depending on the geometry type (point, line, or polygon) and the real-world objects it represents. For example, the political boundary manifold has polygon geometry and represents state, city, province, town, zip code, and school boundaries. The Model.it Landbase data model is summarized in Table 1.

Table 1. Geometry types and real-world objects in the Model.it Landbase data model.
Landbase Manifolds           | Geometry Type | Real World Objects
Street                       | Line          | Street center line
Street Annotation            | Text          | Street name, prefix, suffix, designation
Road Edge                    | Line          | Street edge
Political Boundary           | Polygon       | State, city, town, province, zip code, school
Non-Political Boundary       | Polygon       | Park, golf course, cemetery, worship, hospital, stadium
Transportation Route         | Line          | Railroad, subway, tunnel, bridge
Transportation Boundary      | Polygon       | Airport, bus terminal, train station
Waterway                     | Line          | River, stream, canal, creek, ditch
Other: Address, count        | Point         | Address, counts
Other: Building, campus, lot | Polygon       | Building, campus, lot

QA/QC Standard Checklist
Cablevision performed a preliminary study on converting GDT's Enhanced Dynamap/2000 source data into the Smallworld environment, using ArcView and the Data Automation Kit from ESRI for data manipulation and Safe Software's Feature Manipulation Engine (FME) to convert and project the data into the Model.it Landbase database. The data manipulation included:
  • Selecting records of a particular type, such as schools, from the Institutions and Large Area Landmarks layers.
  • Reducing segments and checking connectivity for line geometry, such as Streets, Water Segments, and Railroad.
  • Creating Road Edge objects from Street centerlines.
  • Checking the number of original objects, duplicate objects, object names, and name/attribute type compatibility.
  • Checking for overlapping polygon geometry, such as park, golf course, zip code, city, and county boundaries.
  • Creating polygons from point geometry, such as schools, churches, cemeteries, stadiums, and hospitals.
The conversion from ESRI shapefiles to the Model.it data model was performed through a one-to-one mapping file that maps each shapefile layer to the Landbase objects. The projection was also scripted in the mapping file.
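Two of the checks listed above, comparing object counts and detecting duplicate objects, can be sketched in a few lines. This is an illustration only: the record layout (dictionaries with name, type, and geometry keys) and the function names are assumptions, and the study itself used ArcView, the Data Automation Kit, and FME rather than this code.

```python
from collections import Counter


def count_check(source_records, converted_records):
    """Compare the number of objects before and after conversion."""
    return {
        "source": len(source_records),
        "converted": len(converted_records),
        "difference": len(converted_records) - len(source_records),
    }


def duplicate_keys(records, key_fields=("name", "type", "geometry")):
    """Return the key tuples that occur more than once.

    Geometry is assumed to be stored as a hashable tuple of coordinates so
    that two records with identical attributes and shape compare equal.
    """
    counts = Counter(tuple(r[f] for f in key_fields) for r in records)
    return [key for key, n in counts.items() if n > 1]


park = {"name": "Oak Park", "type": "park",
        "geometry": ((0, 0), (1, 0), (1, 1))}
school = {"name": "Elm School", "type": "school",
          "geometry": ((5, 5), (6, 5), (6, 6))}

source = [park, school]
converted = [park, school, dict(park)]  # one object converted twice

summary = count_check(source, converted)
duplicates = duplicate_keys(converted)
```

A non-zero difference or a non-empty duplicate list would point the reviewer at the layer that needs a closer look.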

After the conversion was completed, the Landbase objects were checked in the Smallworld environment using:
  • The Object Browser Engine, to visually check the number of converted objects, name/type compatibility, and duplicate objects.
  • The Network Follower, to check connectivity, edge matching, and manifold identity.
  • A comparison against a reference point, to check the projection.
The QA/QC standard checklist was then created from this preliminary study and is summarized in Table 2.
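The connectivity check can be pictured as a graph traversal: segments that share an endpoint belong to the same network, and any small isolated group is a candidate error. The sketch below is a stand-in for that idea, not the Smallworld Network Follower itself. It assumes segments are given as pairs of endpoint coordinates and that connected segments share exactly equal endpoints; the function name is hypothetical.

```python
from collections import defaultdict, deque


def connected_components(segments):
    """Group segment ids into sets connected through shared endpoints.

    segments maps a segment id to a (start_xy, end_xy) pair of tuples.
    """
    # Index: endpoint coordinate -> ids of the segments touching it.
    by_node = defaultdict(set)
    for seg_id, (start, end) in segments.items():
        by_node[start].add(seg_id)
        by_node[end].add(seg_id)

    seen = set()
    components = []
    for seg_id in segments:
        if seg_id in seen:
            continue
        # Breadth-first walk from this segment across shared endpoints.
        component = set()
        queue = deque([seg_id])
        seen.add(seg_id)
        while queue:
            current = queue.popleft()
            component.add(current)
            for node in segments[current]:
                for neighbour in by_node[node]:
                    if neighbour not in seen:
                        seen.add(neighbour)
                        queue.append(neighbour)
        components.append(component)
    return components


streets = {
    "s1": ((0, 0), (1, 0)),
    "s2": ((1, 0), (2, 0)),
    "s3": ((5, 5), (6, 5)),  # does not touch s1 or s2
}
components = connected_components(streets)
```

In this sample the third segment forms its own component, which is the kind of result a reviewer would then inspect on the map.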

Table 2. QA/QC standard checklist used for the Landbase conversion.
Real World Object | Smallworld Object | QA/QC Checks
Street centerline | Street | Segment reduction; Network connectivity; Name, annotation; Edge matching; Contained by Road Edge
Road Edge | Road Edge | Width dependent (20', 50', and 60'); Connectivity; Closed at the end; Manifold identity
Zip Code Boundary | Political Boundary; Type = Zip code | Boundary overlapping; Name, type, annotation; Number of objects; Object duplication
Zip Code Object | Zip Code Object | No geometry; Number of objects
State, City, Town | Political Boundary; Type = State, City, Town | Boundary overlapping; Name, type, annotation; Number of objects; Object duplication
School | Political Boundary; Type = School | Point to Polygon; Name, type, annotation; Number of objects; Object duplication
Park, Golf Course, Cemetery | Non-political Boundary; Type = Park, Golf Course, Cemetery | Point to Polygon; Name, type, annotation; Number of objects; Object duplication
Hospital, Church, Stadium | Non-political Boundary; Type = Hospital, Church, Stadium | Point to Polygon; Name, type, annotation; Number of objects; Object duplication
Water Segments | Waterway; Type = River, Stream, Ditch, Canal, Creek, etc. | Segment reduction; Network connectivity; Name, annotation; Edge matching; Manifold identity
Water Polygon | Waterbody; Type = Lake, Pond, Reservoir, Sea, etc. | Boundary overlapping; Name, type, annotation; Number of objects; Object duplication
Railroad, Bridge, Tunnel, Subway | Transportation Route; Type = Railroad, Bridge, Tunnel, Subway | Segment reduction; Network connectivity; Name, annotation; Edge matching; Manifold identity
Airport, Bus Terminal, Train Station | Transportation Boundary; Type = Airport, Bus Terminal, Train Station | Point to Polygon; Name, type, annotation; Number of objects; Object duplication
Retail Center | Complex | Point to Polygon; Name, type, annotation; Number of objects; Object duplication

Data Conversion
The Landbase data conversion was performed by a selected vendor with a proven ability to convert data into the Smallworld data format within a critical schedule. The data model was delivered to the vendor (via FTP or CD-ROM) using Smallworld's "Extract Process". The data were then converted county by county according to the requested schedule. Each county took about 3 weeks to convert and was then returned to Cablevision. GIS personnel and engineers then performed a visual check of the returned data using ArcView, the Object Browser Engine, and the Network Follower, based on the QA/QC standard checklist. All objects were checked for names, annotation, missing objects, duplicate objects, edge matching, overlapping objects, unset or missing attribute fields, and name/type compatibility, and were compared against the original data source. If the error rate exceeded 5%, Cablevision requested that the vendor resubmit a replacement for the converted data.
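The acceptance rule can be sketched as a simple threshold test. The paper does not say what the 5% is measured against, so the sketch assumes it is the share of checked objects that have at least one error; the function names are hypothetical.

```python
def error_rate(objects_checked, objects_with_errors):
    """Fraction of checked objects that have at least one error (assumed)."""
    if objects_checked <= 0:
        raise ValueError("objects_checked must be positive")
    return objects_with_errors / objects_checked


def delivery_decision(objects_checked, objects_with_errors, threshold=0.05):
    """Return 'resubmit' when the error rate is above the threshold."""
    rate = error_rate(objects_checked, objects_with_errors)
    return "resubmit" if rate > threshold else "accept"


# Example: 60 flawed objects out of 1,000 checked is a 6% error rate.
decision = delivery_decision(1000, 60)
```

Making the threshold a parameter keeps the 5% figure in one place if the acceptance criterion is ever renegotiated with a vendor.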

QA/QC Tool
The QA/QC tool was developed by Navigant Consulting, Inc. based on the Landbase QA/QC standard checklist. A new object, called the "Error Flag Object", was added to the Landbase data model. An error flag geometry is created at the location of each detected error, with a status of unresolved, resolved, in progress, or unknown. The nature of the problem, the script that found it, and its geographic location are all stored in the error flag object's attributes, allowing users to review and resolve the problem, or to decide whether to accept the database or request that the vendor resubmit it. The QA/QC tool is menu driven (Figure 2) and allows the user to manually run selected QA scripts on an entire database or on a specific geographic area.
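A minimal sketch of what such an error flag record might hold is shown below. The four status values and the stored attributes (problem, script, location) come from the description above; the class and field names, and the use of a simple x/y pair for the location, are assumptions rather than the actual Smallworld object definition.

```python
from dataclasses import dataclass
from enum import Enum


class FlagStatus(Enum):
    UNRESOLVED = "unresolved"
    RESOLVED = "resolved"
    IN_PROGRESS = "in progress"
    UNKNOWN = "unknown"


@dataclass
class ErrorFlag:
    x: float            # location of the detected error (assumed x/y pair)
    y: float
    problem: str        # nature of the problem encountered
    script: str         # name of the QA script that found it
    status: FlagStatus = FlagStatus.UNRESOLVED


def open_flags(flags):
    """Return the flags that still need attention (anything not resolved)."""
    return [f for f in flags if f.status is not FlagStatus.RESOLVED]


flags = [
    ErrorFlag(10.0, 20.0, "duplicate object", "duplicate_check"),
    ErrorFlag(30.0, 40.0, "boundary overlap", "overlap_check",
              FlagStatus.RESOLVED),
]
pending = open_flags(flags)
```

Filtering on status, as `open_flags` does, is one way a reviewer could list what remains before deciding to accept or reject a delivery.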

Two script options were implemented in the QA/QC tool:
  • Find and Report: the script finds each error and creates an error flag at the location where it was discovered. The error flag's status is set to "unresolved" or "unknown".
  • Find and Fix: the script finds each error and automatically fixes it within a pre-defined tolerance. An error flag is generated with its status set to "resolved".
The QA menu interface also allows the user to select the extent of data on which the automated tools are run. For instance, the current GIS view, any polygon object, or the entire data set can be selected. Other user-defined options include various parameters, which are accessed through the selected script. These may include running the tools on objects within the selected boundary, or within and touching the selected boundary; setting search or snapping tolerances; or selecting tables.
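The difference between the two options, together with the extent and tolerance settings, can be sketched with a single hypothetical check that looks for line endpoints lying close to each other without coinciding. Everything here is an assumption made for illustration: the check itself, the rectangular extent (treated as "within and touching"), the dictionary-based flags, and the function names. It is not the Navigant implementation.

```python
import math


def in_extent(point, extent):
    """True if the point lies within or on the rectangular extent.

    extent is (xmin, ymin, xmax, ymax); None means the entire data set.
    """
    if extent is None:
        return True
    xmin, ymin, xmax, ymax = extent
    return xmin <= point[0] <= xmax and ymin <= point[1] <= ymax


def run_endpoint_check(endpoints, snap_tolerance, extent=None, fix=False):
    """Flag pairs of endpoints that nearly, but not exactly, coincide.

    endpoints maps an id to an (x, y) tuple. Returns (endpoints_out, flags).
    With fix=False (Find and Report) the data is left unchanged.
    With fix=True (Find and Fix) the second endpoint is snapped to the first.
    """
    result = dict(endpoints)
    flags = []
    ids = sorted(result)
    for i, first in enumerate(ids):
        for second in ids[i + 1:]:
            p1, p2 = result[first], result[second]
            if not (in_extent(p1, extent) and in_extent(p2, extent)):
                continue
            gap = math.dist(p1, p2)
            if 0 < gap <= snap_tolerance:
                if fix:
                    result[second] = p1
                flags.append({
                    "location": p2,
                    "problem": f"endpoints {first} and {second} "
                               f"are {gap:.2f} apart",
                    "script": "endpoint_check",
                    "status": "resolved" if fix else "unresolved",
                })
    return result, flags


endpoints = {"a": (0.0, 0.0), "b": (0.3, 0.0), "c": (10.0, 10.0)}
reported, report_flags = run_endpoint_check(endpoints, snap_tolerance=0.5)
fixed, fix_flags = run_endpoint_check(endpoints, snap_tolerance=0.5, fix=True)
```

Both modes produce a flag at the same location; only the status and whether the data changes differ, which mirrors the two options described above.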

Finally, the QA/QC tool provides standard error reporting for all the QA scripts. The generated report can be printed or saved as an ASCII file for further analysis.
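As an illustration of the reporting step, the sketch below writes error flags as tab-separated ASCII text. The column layout, the dictionary keys, and the function names are assumptions; the paper does not describe the actual report format.

```python
def format_error_report(flags):
    """Return a tab-separated text report, one line per error flag."""
    lines = ["status\tscript\tx\ty\tproblem"]
    for flag in flags:
        lines.append(
            f"{flag['status']}\t{flag['script']}\t"
            f"{flag['x']:.1f}\t{flag['y']:.1f}\t{flag['problem']}"
        )
    return "\n".join(lines) + "\n"


def save_error_report(flags, path):
    """Write the report as an ASCII file (non-ASCII characters become '?')."""
    with open(path, "w", encoding="ascii", errors="replace") as handle:
        handle.write(format_error_report(flags))


report = format_error_report([
    {"status": "unresolved", "script": "duplicate_check",
     "x": 1.0, "y": 2.0, "problem": "duplicate object"},
])
```

A delimited text layout like this can be opened directly in a spreadsheet or parsed by another script for further analysis.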


Figure 2. The QA/QC tool menu.

Conclusion
The QA/QC tool provides an efficient and flexible environment for running QA or data cleanup routines on Cablevision's GIS database. The tool was developed from a standard checklist and provides an easy-to-use interface for applying QA processes. Its error reporting, which places error flags at the location of each error and keeps a history of the errors found in the database, provides useful information for future analysis. In day-to-day operation, the QA/QC tool plays an important role in deciding whether to accept or reject converted data from vendors, helping to assure the accuracy of the database. It significantly reduces QA/QC time for the Landbase development project.