A Recent technology direction: Storing new data types in an RDBMS
Paul J. Yarka Director of Product Management, Convergent Group 6200 South Syracuse Way, Englewood, CO 80111 Phone: 303-741-8484, Fax: 303-741-8456 Abstract Vendors and customers are gaining experience in the implementation of AM/FM/GIS systems that allow users to store spatial and aspatial data in a relational database management system (RDBMS). Today’s latest commercial AM/FM/GIS product offerings enable the storage of point, line, polygon, complex polygon, topology, connectivity and other spatial primitives in either an extended or object-relational database. The technology advance of efficiently storing and retrieving spatial data from an RDBMS, along with the ANSI and ISO standards efforts of the SQL MM and SQL 3 organizations enhances the long-term viability and attractiveness of relational data stores that are now being used to store and manage vector and raster data. Introduction Imagine calling your gas utility company because you want to install a gas fireplace in your home and need to have a new service line installed. The customer service representative who answers your call asks for your address and accesses your account. Using a customer information system connected to an AM/FM/GIS, he or she constructs and views a map of your home, a nearby main, and the local gas distribution network. Using this information, the two of you discuss a preferred route for the required gas line, taking into account such things as the location of your driveway, landscape, other utility services, and the existing gas utility easement. The customer service representative quickly designs the new gas line on a computer screen, provides a cost estimate, and schedules the installation with you and the utility’s construction crew. An installment payment plan is arranged for your monthly billing, and preliminary design plans, appropriate work orders, and a permit request are generated. From your perspective, with one simple telephone call you have set into motion the installation of your gas fireplace. From the utility company’s perspective, only one trip to your house is required, your request has been satisfactorily serviced, a digital engineering design with required gas equipment has been stored in an organization-wide database, and the construction crew has updated the same database with a field-checked and modified version of required equipment in its as-built state. Eliminating Technology Boundaries with Extended Relational Spatial Databases Clearly, GIS software technology has automated manual processes to help multiple users and departments orchestrate the planning, design, construction, and maintenance of infrastructure. However, with most competitive AM/FM/GIS products, spatial and aspatial data are stored in different database management technologies and, consequently, are isolated. Aspatial or tabular data are typically stored in a commercial RDBMS, while spatial data are frequently stored in a proprietary binary file structure. Recent software developments now make it possible for entities with spatial and aspatial attributes to be stored entirely in an RDBMS. Once restricted to specialized functions, users and departments, spatial information, and manipulation tools are now “mainstream-accessible”; in other words, they can be seamlessly integrated with aspatial corporate data repositories. Extended relational spatial databases are typically layered on top of commercial RDBMSS (Fig. 1), enabling easy access and manipulation of spatial data from the same data store used for storing and managing aspatial corporate data. Effective spatial data management capabilities are crucial to implementing spatial information systems, automated mapping, facilities management, geographic information systems, and computer-aided design. With extended relational technology, a single database management approach can be used to store and manage data for enterprise users with diverse application interests. Extended relational databases increasingly are clientkerver-based providing transparent access to the underlying commercial RDBMS. In some cases, applications may be developed using an API that is designed on key database and software industry standards, such as Microsoft’s Open Database Connectivity (ODBC) standard. Products in this niche also offer spatial extensions to Structured Query Language (SQL). Using SQL with spatial extensions, users can add various spatial operators to their SQL queries and qualify entities using aspatial and spatial constraints. One key advantage is that through APIs that are based on key standards and SQL with spatial extensions, widespread enterprise access for spatial data is now possible, thereby eliminating the required use of proprietary formats for spatial data storage. ![]() Figure 1. Typical architecture of an extended relational spatial database platform. Key features to look for in extended relational spatial databases include the following:
Large long-term investments have been made in recent years by many software and hardware companies in relational technology research and development. As a consequence, RDBMS technology is quite mature, providing a highly optimized environment for the storage, retrieval, and manipulation of large databases. The relational model is a simple model, wherein an entity and associated attributes are modeled through the use of multiple tables with common keys. Queries are typically constructed using SQL statements, and searches are conducted by using data in specified columns to find target entities and additional data. Other benefits of using commercial RDBMS technology in the storage, retrieval, and manipulation of spatial data include scalability across multiple CPUS; support for distributed queries across multiple processors; connectivity across multiple operating system and hardware platforms; distributed, yet user transparent, databases; support for row-level locking; and deadlock detection and facilities for flexible and comprehensive data backup and recovery. Commercial RDBMS technology integrates a spatial database environment with transaction processing capabilities, forms and report writing tools, and other RDBMSS. Combining many RDBMS technology benefits with the ability to store spatial data in a relational database also eliminates a number of long-term problems that have typically plagued AM/FM/GIS vendors. Spatial and aspatial data can now be made readily accessible to an enterprise using a single data storage technology. Large numbers of concurrent spatial database users, as well as multiple levels of access and security, can now be supported. Cumbersome tiling schemes that create inherent data integration, boundary, complex topology, and usage problems are eliminated. Entities with spatial attributes can be stored following established relational database optimization schemes, such as clustering, without complete focus on location-based optimization schemes. Compatibility with the Existing RDBMS Systems that are implemented using an extended relational database and a commercial RDBMS are fully capable of providing integrated and rapid retrieval of entities with aspatial and spatial attributes. No distinctions or performance compromises are made between a relation operated on by an extended relational database and another operated on by the underlying RDBMS. Spatial information systems employing an extended relational database can access traditional corporate databases, while data created or modified using the extended relational technology can be used in other systems. Solutions retain capabilities offered as part of a standard RDBMS product line, such as forms and report writing tools, data entry and query, security mechanisms, and transaction journaling. The Extended Relational Model Progressive spatially extended relational software packages are obviously based upon an extended relational data model. Functionality of the commercial RDBMS is extended by providing the ability to store and manipulate spatial data and the ability to maintain multiple versions of information. Specific extensions to the relational model include spatial data primitives to record spatial information in relational database tables, spatial data qualifiers for query and access, and spatial data integrity constraints. Spatial Data Primitives Standard commercial RDBMS technology provides support for various data types, such as numerics, character strings, and datetimes. Extended relational databases add a number of spatial or geometric data types treating the geometric description of an entity as a spatial attribute of the entity. Spatial data primitives enable spatial attributes of an object, such as location, extent, and topological relationships, to be recorded. In addition, existing RDBMS-provided data types or primitives are used to store aspatial data. A spatial attribute can be declared as a geometric data type chosen from a number of forms. In addition, advanced extended relational databases incorporate a spatial data library providing geometric as well as topologic representation of entities. Spatial data primitive types include point, rectangle, line segment, circular arc, line string, link, directed link, chain, complete chain, area chain, network chain, ring, polygon, unitary complex polygon, general polygon, and Binary Large Object (BLOB). Spatial entities may be held directly in the database as sets of coordinates in direct representations. In other cases, geometric description of an entity maybe assembled from a set of components as a linked representation. For example, a parcel boundary along a street may be appropriately stored as a boundary segment that is shared by the parcel and the edge of pavement. Describing an entity by its topology is advantageous when a change to a common segment, such as the parcel boundary and the edge of pavement, must be reflected in both entities. Spatial Data Qualifiers RDBMS data maybe created, retrieved, and updated using SQL statements, such as Insert, Select, Update, and Delete. Aspatial constraints are applied in SQL statements to specify a row or rows in which to operate. Normal relational operators, such as Less Than, Greater Than, and Equals, may be combined with spatial and topological qualifiers. In addition, SQL with spatial extensions (i.e., spatial and topological qualifiers) is typically implemented to take advantage of spatial indexes that provide high-performance access to spatial data. Spatial and Aspatial Data Integrity Constraints Integrity constraints are crucial to and ensure the reliability of data managed in an RDBMS. Spatial and aspatial data integrity issues must be considered in an 486.extended relational database technology. Database integrity is maintained through commit and rollback mechanisms and data integrity constraints, including geometric integrity, entity integrity, referential integrity, and check constraints. Geometric integrity constraints, which are specific to each supported geometric type, ensure that every spatial entity, either in director topological form, is validated before the entity is committed to the database. Entity integrity ensures that each row (i.e., defining an entity) in a table is unique. This is implemented through the use of a primary key. Referential integrity is essential for joining tables. In a pure relational context, referential integrity ensures that when two tables are joined by a foreign key in the main table, which is also a primary key in the joined table, the joined table contains a row for each foreign key occurrence in the main table. From a spatial data perspective, referential integrity also ensures that all components of a topological entity exist within the database and that their combined geometric form does not produce an invalid entity. Check constraints may also be specified to check if values chosen for a particular attribute are appropriate. Spatial Indexing and Performance The objective of spatial indexing is similar to indexing facilities provided by the commercial RDBMS. In both cases, optimized retrieval occurs when qualifications are applied during a data selection. Leading-edge extended relational databases incorporate a high-performance spatial indexing scheme that is applied to each entity. As a consequence, each entity stored in a conventional RDBMS table may be assigned one or more locational keys that are derived from its spatial position and extent. In addition, the locational keys may be “further” indexed using standard indexing facilities provided by the host RDBMS that handle spatial indexes in the same way as aspatial indexes. Be aware when evaluating extended relational database performance that a number of factors must be considered to simulate times observed by a user. The true test of performance is the actual retrieval and display time, including time required to search a database, retrieve selected entities, render the results, transfer the results through the network, and display them on a screen. Long Transaction Support Few extended relational databases icorporate an entity version management system or long transaction mechanism. Long transactions typically require days, weeks, months, or longer to complete. Tools should be provided to bring database subsets under change management control without impacting the master database. Under version control, all proposed changes to the database should remain private to the current job and should be made public (i.e., published) after approval. When proposed changes have been executed, data is then committed or posted to the database. In some products, preposted data 487.can be removed using a cancel process. If other proposals (i.e., proposed changes to the database) are dependent upon data contained in a proposal to be canceled, long transaction management prevents cancellation until all dependencies have been removed. Long transaction management controls all creation, modification, and deletion of data, while also providing a capability to undo changes. Key Criteria to Differentiate Extended Relational Spatial Database Products In summary, when searching for a superior and appropriate extended relational spatial database management technology for your project or problem, the following key features should be kept in mind:
| ||
|
|