GISdevelopment.net ---> GITA 2001 ---> System Architecture

Using Object-Relational Database Management Systems to Enable Enterprise GIS

Tom Helmer
Convergent Group
6399 South Fiddler's Green Circle, Suite 600
Greenwood Village, CO 80111


System Architecture

Features of the system architecture pictured below (Figure 1):
  • Supports all of today's GIS desktop clients used within the enterprise
  • Supports web applications
  • Supports MapX-enabled applications
  • Correlates data from all GIS systems and DBMS systems
  • Uses one storage representation for both GIS and Map servers
  • Uses the latest DBMS spatial storage structures
  • Takes advantage of the concurrency control, scalability, backup, recovery, replication, and performance of the leading DBMS vendors
  • Allows all desktop applications to access all data via declarative SQL
  • Makes enterprise-wide data visible within departmental edit sessions
  • Leverages all investments in heterogeneous GIS data maintenance applications
  • Uses a highly layered architecture
  • Is N-tiered to support scalability
  • Software components provide load balancing and component pooling
The components of the architecture are:
  • An RDBMS with spatial data types: This is the persistent storage mechanism for all data. It allows all map servers, application servers, and desktop applications to access the same information and provides a declarative access mechanism that lowers application development effort. It provides the concurrency control, backup, recovery, and performance scalability needed to support enterprise usage requirements.
  • A Distribution Server: This service provides the mechanism to distribute the data from data maintenance environments to the enterprise Spatial Data Warehouse and supports keeping N copies of the Spatial Data Warehouse in synch. The data distribution portion of this paper will go into detail on why replication servers were implemented to support both performance and reliability.
  • A GIS Server: This service provides the spatial analysis applications that may be needed by the map server or other desktop applications.
  • A Map Server: This service serves maps on demand to the web server.
  • A Spatial Data Translator: This service provides the data-loading and data-extraction services required to integrate heterogeneous GIS data maintenance environments.
  • A Heterogeneous DBMS Gateway: This service provides the access and correlation power to other spatial databases in the enterprise.

Figure 1. Enterprise Spatial Data Warehouse Architecture


Supporting MapX-enabled applications goes a long way toward spatially enabling all desktop applications, since MapX is an OLE custom control that offers true object linking and embedding.

Providing web access gives the enterprise a very inexpensive data distribution mechanism. An architecture that correlates data from both GIS and DBMS systems lets end-user applications be developed much more quickly and provides a stronger decision support resource for answering more complex questions.
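
As a minimal sketch of what correlating GIS and DBMS data through declarative SQL could look like, the following uses Python's built-in sqlite3 module. The table names, the columns, and the plain x/y columns standing in for a true spatial data type are assumptions made for illustration; they are not taken from the architecture described here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical GIS-maintained table; x/y stand in for a spatial data type.
    CREATE TABLE transformers (id INTEGER PRIMARY KEY, x REAL, y REAL);
    -- Hypothetical table owned by a non-spatial business system.
    CREATE TABLE work_orders (id INTEGER PRIMARY KEY, transformer_id INTEGER, status TEXT);
    INSERT INTO transformers VALUES (1, 10.0, 10.0), (2, 55.0, 40.0), (3, 12.0, 8.0);
    INSERT INTO work_orders VALUES (100, 1, 'open'), (101, 2, 'open'), (102, 3, 'closed');
""")


def open_orders_in_window(conn, xmin, ymin, xmax, ymax):
    """Return (work order id, transformer id) for open orders inside a map window."""
    return conn.execute(
        """
        SELECT w.id, t.id
        FROM work_orders AS w
        JOIN transformers AS t ON t.id = w.transformer_id
        WHERE w.status = 'open'
          AND t.x BETWEEN ? AND ?
          AND t.y BETWEEN ? AND ?
        ORDER BY w.id
        """,
        (xmin, xmax, ymin, ymax),
    ).fetchall()


result = open_orders_in_window(conn, 0, 0, 20, 20)
```

The point of the sketch is that one declarative statement answers a question spanning both kinds of data, without the application knowing how either table is maintained.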

Using the latest DBMS spatial storage structures supports scalable performance, open architecture, access for CAD and mapping applications, GIS-viewing desktop tools, and ad hoc user requirements.

The architecture provides persistent storage of facility network connectivity and topology relationships. Users can therefore request analysis on demand, and the relationships do not have to be queried and rebuilt on the fly from the native spatial data warehouse package in order to support a suite of heterogeneous GIS analysis applications, CAD applications, and MapX-enabled applications.
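
To illustrate why persisted connectivity makes on-demand analysis cheap, here is a small sketch that stores network segments in a table and traces everything downstream of a node with a recursive SQL query. It uses Python's sqlite3 module; the `connectivity` table, its columns, and the node names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical connectivity table: one row per directed network segment.
    CREATE TABLE connectivity (from_node TEXT, to_node TEXT);
    INSERT INTO connectivity VALUES
        ('substation', 'switch_a'),
        ('switch_a', 'transformer_1'),
        ('switch_a', 'transformer_2'),
        ('substation', 'switch_b'),
        ('switch_b', 'transformer_3');
""")


def downstream_of(conn, start):
    """Return every node reachable from `start`, sorted by name."""
    rows = conn.execute(
        """
        WITH RECURSIVE reach(node) AS (
            SELECT to_node FROM connectivity WHERE from_node = ?
            UNION
            SELECT c.to_node
            FROM connectivity AS c JOIN reach AS r ON c.from_node = r.node
        )
        SELECT node FROM reach ORDER BY node
        """,
        (start,),
    ).fetchall()
    return [row[0] for row in rows]


trace = downstream_of(conn, "switch_a")
```

Because the relationships are already stored, any client that can issue SQL can run the trace; none of them needs to rebuild topology from raw geometry first.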

Data Distribution Architecture
A key part of any system design is the data distribution architecture. User scenarios, network architecture, data currency requirements, and site autonomy requirements drive the data distribution architecture. The design allows each site that maintains data to become the GIS master publisher site for that information. All other sites should subscribe to the master site to get the latest information. This provides for a distributed spatial data warehouse that has a high degree of availability designed into it because of the numerous sites that can handle any of the end users' requests for data viewing, data extraction, or data analysis.

There are many replication options supported by today's RDBMS vendors.
  • A Publisher is the site that initiates a change.
  • A Subscriber is a site that is interested in the articles published.
A Subscriber may need to be kept in near lockstep with the Publisher, to have full control over when copies of articles are updated, to be able to update copies of articles and in turn publish them internally, and to subscribe over the Internet because it is a remote or firewall site. All of these subscriber needs are supported by the various RDBMS vendors. The drivers and tradeoffs for each type of subscription will be outlined.

A subscriber can be either a push- or pull-type of subscriber. Push subscriptions are the easiest for the spatial data warehouse administrator to manage, because they keep the replicated sites better in synch with the publishing site. They do, however, place the most processing requirements on the data distribution server.
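
The difference between push and pull subscribers can be sketched in a few lines of Python. This is an illustrative model only: the class and method names are assumptions, and the publisher and distribution server are collapsed into one object for brevity.

```python
class Publisher:
    """Holds the current articles and delivers them to push subscribers."""

    def __init__(self):
        self.articles = {}          # article name -> current data
        self.push_subscribers = []

    def subscribe_push(self, subscriber):
        self.push_subscribers.append(subscriber)

    def publish(self, name, data):
        self.articles[name] = data
        # Push: the publishing side does the delivery work for every subscriber.
        for subscriber in self.push_subscribers:
            subscriber.receive(name, data)


class Subscriber:
    def __init__(self):
        self.copies = {}

    def receive(self, name, data):
        self.copies[name] = data

    def pull(self, publisher, name):
        # Pull: the subscriber decides when its copy is refreshed.
        self.copies[name] = publisher.articles[name]


pub = Publisher()
pushed, pulled = Subscriber(), Subscriber()
pub.subscribe_push(pushed)
pub.publish("parcels", "v1")
pub.publish("parcels", "v2")
stale_before_pull = pulled.copies.get("parcels")  # nothing until it asks
pulled.pull(pub, "parcels")
```

The push subscriber is current after every publish at the cost of work on the publishing side, while the pull subscriber stays out of date until it chooses to refresh.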

Subscriptions can also be snapshot, transactional, or merged. A snapshot subscription receives a complete shipment of the data each time the publisher pushes the articles. A transactional subscription receives only the changes made since the last publishing event.
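
A rough sketch of the snapshot and transactional behaviors follows. The `PublishedArticle` class, its change log, and the position counter are hypothetical simplifications; real replication servers track changes through the database transaction log.

```python
class PublishedArticle:
    """A published table that also keeps an ordered log of its changes."""

    def __init__(self):
        self.rows = {}   # key -> value
        self.log = []    # ordered (key, value) changes; value None means delete

    def write(self, key, value):
        self.rows[key] = value
        self.log.append((key, value))

    def delete(self, key):
        self.rows.pop(key, None)
        self.log.append((key, None))

    def snapshot(self):
        # Snapshot: ship a complete copy every time.
        return dict(self.rows)

    def changes_since(self, position):
        # Transactional: ship only the changes after `position`.
        return self.log[position:], len(self.log)


def apply_changes(copy, changes):
    for key, value in changes:
        if value is None:
            copy.pop(key, None)
        else:
            copy[key] = value


article = PublishedArticle()
article.write("pole_1", "wood")
article.write("pole_2", "steel")

txn_copy, position = {}, 0
changes, position = article.changes_since(position)
apply_changes(txn_copy, changes)

article.write("pole_2", "concrete")
article.delete("pole_1")
changes, position = article.changes_since(position)   # only the two new changes
apply_changes(txn_copy, changes)
snapshot_copy = article.snapshot()
```

Both copies end up identical; the transactional subscriber simply received less data on the second publishing event.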

Merged subscriptions allow updates to occur at each node and enable mobile applications to synchronize their updates within the enterprise environment.

The inherent benefit of transactional subscriptions is potentially lower demand on the network. Because transactions are written first to the distribution server and then to the subscribers, the distribution server can become a bottleneck. This is one of the drawbacks when one site is the master publisher for all articles. The design supports two ways to alleviate the bottleneck. The first is to move the distribution server to its own machine. The second is to support multiple publishers, so that the replication load is spread over all of the sites that make up the enterprise spatial data warehouse.

Local data maintenance environments are best supported in the data distribution architecture as transactional push subscriptions. This keeps all copies of the enterprise spatial data warehouse in synch when edits are posted or committed to the master version.
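
For the merged subscriptions mentioned above, where several nodes accept updates, a toy merge step might look like the following. The conflict rule used here (the highest version number wins) is an assumption made purely for illustration; the paper does not specify how conflicting edits are resolved, and RDBMS vendors offer their own resolution policies.

```python
def merge(*node_edits):
    """Merge per-node edits of the form {key: (version, value)}.

    Conflict rule (an illustrative assumption): the highest version wins.
    """
    merged = {}
    for edits in node_edits:
        for key, (version, value) in edits.items():
            if key not in merged or version > merged[key][0]:
                merged[key] = (version, value)
    return merged


# Hypothetical edits made independently at an office node and on a field laptop.
office = {"valve_7": (1, "open"), "valve_9": (1, "open")}
field_laptop = {"valve_7": (2, "closed")}
merged = merge(office, field_laptop)
```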

Another type of subscriber is the anonymous pull subscriber, which supports sharing data over an intranet or the Internet with remote locations. Pull subscriptions support site autonomy because each site can decide how often to refresh its subscriptions. They do, however, add a system administration cost that grows linearly with the number of sites in the enterprise GIS, since each site has to manage its own replication catalog definitions.

The Enterprise GIS has inherent reliability and scalability designed into its architecture because any application server or desktop application can use any of the replicated spatial data warehouses (HUB data servers) to meet their application requirements (Figure 2).
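
The reliability argument can be sketched as a client that tries each replicated HUB in turn. The function names, the exception type, and the stand-in hub functions below are hypothetical; a real client would connect to database servers rather than call Python functions.

```python
class HubUnavailable(Exception):
    """Raised when a HUB data server cannot answer a request."""


def query_any_hub(hubs, request):
    """Try each replicated HUB in order and return the first successful answer."""
    for hub in hubs:
        try:
            return hub(request)
        except HubUnavailable:
            continue  # fall through to the next replica
    raise HubUnavailable(f"all {len(hubs)} hubs failed")


def down_hub(request):
    raise HubUnavailable("local hub offline")


def healthy_hub(request):
    return f"result for {request}"


answer = query_any_hub([down_hub, healthy_hub], "parcels near site 12")
```

Because every HUB holds the same replicated data, the order of the list only affects latency, not the answer.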


Figure 2. Data Distribution Architecture


Benefits of the data distribution architecture are:
  • Spatial data warehouse with replication server supports fault tolerance
  • All user queries happen on LAN to ensure acceptable response times
  • Replication of spatial data warehouse to minimize dependence on WAN
  • Global work can be performed even when WAN is down
  • Replication can be limited to a LAN's or a mobile user's area of interest (a reduced data set)
  • End users see LAN speeds for all access
  • Leverages current investments in existing GIS systems
  • Spatial data warehouse to support enterprisewide applications
  • Data translation occurs once for posting
  • Reuse of existing maintenance applications
  • Supports remote user performance requirements
  • HUB supports modem access for remote users
  • Minimizes WAN bandwidth and reliability requirements
  • Persistent storage of topology and network connectivity
  • Supports node recovery and synchronizations
  • Highly scalable design
  • Supports mobile applications
Cons of the data distribution architecture are:
  • System administration
  • Software licenses
  • Disk space

Logical Data Independence
The design of the enterprise spatial data warehouse and its associated data distribution architecture can go a long way toward providing logical data independence. Logical data independence removes from the application developer and the ad hoc end user the need to know where data is stored, what it is called on system X, and where it was moved during the last IT upgrade.

Many mechanisms exist to help provide logical data independence. RDBMS vendors, and now to some extent the GIS vendors, provide views and stored procedures. Views buffer applications from the physical table naming conventions. Tuning may cause tables to be split or merged, and views allow this tuning work to be performed without recoding applications. Stored procedures provide the same level of independence and offer a better software engineering approach to interfacing with the potential set of heterogeneous data sources, namely a routine name and a set of parameters. With the runtime support of a Java VM in the RDBMS, stored procedures can now be developed in Java, so true business objects with their behavior can be implemented inside the RDBMS. This provides the best logical data independence via an object request broker (ORB).

Replication mechanisms within the RDBMS vendor community are very good at replicating either a single transaction's worth of changes or a complete table's worth, as discussed above.
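
The way a view shields applications from physical tuning changes can be shown with a short sketch using Python's sqlite3 module. The `poles` table and the way it is split are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table that applications query by name.
    CREATE TABLE poles (id INTEGER PRIMARY KEY, material TEXT, height_m REAL);
    INSERT INTO poles VALUES (1, 'wood', 12.0), (2, 'steel', 18.0);
""")

# The application's query never changes.
APP_QUERY = "SELECT id, material, height_m FROM poles ORDER BY id"
before = conn.execute(APP_QUERY).fetchall()

# Tuning step: split the table in two and keep the old name alive as a view.
conn.executescript("""
    CREATE TABLE poles_core (id INTEGER PRIMARY KEY, material TEXT);
    CREATE TABLE poles_dims (id INTEGER PRIMARY KEY, height_m REAL);
    INSERT INTO poles_core SELECT id, material FROM poles;
    INSERT INTO poles_dims SELECT id, height_m FROM poles;
    DROP TABLE poles;
    CREATE VIEW poles AS
        SELECT c.id, c.material, d.height_m
        FROM poles_core AS c JOIN poles_dims AS d ON d.id = c.id;
""")

after = conn.execute(APP_QUERY).fetchall()
```

The same query text returns the same rows before and after the split, which is the independence the view is meant to provide.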

Standard data dictionary support exists for location transparency. Nodes with whole databases may come and go and applications can look up the location at runtime without hard coding the locations of their databases.
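
A minimal sketch of runtime location lookup follows. The `DataDictionary` class and the database and node names are hypothetical; in practice this role is played by the RDBMS's own catalog or directory services.

```python
class DataDictionary:
    """Maps logical database names to the node that currently hosts them."""

    def __init__(self):
        self._locations = {}

    def register(self, database, node):
        self._locations[database] = node

    def locate(self, database):
        return self._locations[database]


dictionary = DataDictionary()
dictionary.register("landbase", "hub-a")
first = dictionary.locate("landbase")

# The database moves to another node; applications are not recoded.
dictionary.register("landbase", "hub-b")
second = dictionary.locate("landbase")
```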

Summary
Advances in RDBMS spatial data types and data replication technology support the demanding requirements of an Enterprise Spatial Data Warehouse. The proposed architecture leverages an enterprise's investment in potentially multiple GIS systems: current data maintenance systems continue to operate autonomously, while spatial data transformation technology and RDBMS replication technology publish the departmental data to the Enterprise Spatial Data Warehouse. The architecture supports the development of web-based spatial data warehouse applications and allows users to select from a wealth of desktop spatial viewers on the market. It gives all users local LAN-speed access for both data maintenance and viewing client applications and minimizes dependence on the enterprise WAN from both a performance and a reliability point of view.