Using Object Relational Database Management Systems to Enable Enterprise GIS
Data Distribution Architecture
A key part of any system design is the data distribution architecture. User scenarios, network
architecture, data currency requirements, and site autonomy requirements drive the data
distribution architecture. The design allows each site that maintains data to become the GIS
master publisher site for that information. All other sites should subscribe to the master site to
get the latest information. The result is a distributed spatial data warehouse with a high
degree of availability designed into it, because many sites can handle any of the
end users' requests for data viewing, data extraction, or data analysis.
There are many replication options supported by today's RDBMS vendors.
- Publisher is the site that initiates the change.
- Subscriber defines those who are interested in the articles published.
A Subscriber may need to stay in near lockstep with the Publisher, to have full control over
when its copies of articles are updated, to update its copies of articles and in turn publish
them internally, or to subscribe over the Internet because it is a remote site or sits behind a
firewall. The various RDBMS vendors support all of these subscriber needs. The drivers and
tradeoffs for each type of subscription are outlined below.
A subscriber can be either a push or a pull subscriber. Push subscriptions are the easiest
for the spatial data warehouse administrator to manage, because they keep the replicated sites
more closely in sync with the publishing site. They do, however, place the heaviest processing
load on the data distribution server.
Subscriptions also come in three types: snapshot, transactional, and merged.
Snapshot subscriptions receive a complete shipment of the data each time the publisher pushes
the articles. Transactional subscriptions receive only the changes made since the last
publishing event. Merged subscriptions allow updates to occur at each node and enable mobile
applications to synchronize their updates within the enterprise environment. The inherent
benefit of transactional subscriptions is potentially lower demand on the network.
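The difference in network demand between snapshot and transactional subscriptions can be illustrated with a minimal sketch. The `Publisher` class, the `apply_changes` helper, and the parcel keys are hypothetical stand-ins invented for this illustration, not any vendor's replication API; deletes and conflict handling are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Publisher:
    """Hypothetical in-memory publisher of one article (a keyed table)."""
    rows: dict = field(default_factory=dict)  # current article contents
    log: list = field(default_factory=list)   # ordered (key, value) changes

    def update(self, key, value):
        self.rows[key] = value
        self.log.append((key, value))

    def snapshot(self):
        # Snapshot subscription: ship a complete copy every time.
        return dict(self.rows)

    def changes_since(self, position):
        # Transactional subscription: ship only changes made after `position`.
        return self.log[position:], len(self.log)


def apply_changes(replica, changes):
    # Replay the changes in commit order on the subscriber's copy.
    for key, value in changes:
        replica[key] = value


pub = Publisher()
for i in range(1000):
    pub.update(f"parcel-{i}", "v1")

replica = pub.snapshot()   # initial load ships all 1000 rows
position = len(pub.log)    # subscriber remembers where it left off

pub.update("parcel-7", "v2")
pub.update("parcel-42", "v2")

changes, position = pub.changes_since(position)
apply_changes(replica, changes)

print(len(pub.snapshot()), "rows in a snapshot refresh")
print(len(changes), "rows in a transactional refresh")
```

In this toy case a snapshot refresh reships every row, while the transactional refresh ships only the two edited rows and still leaves the replica identical to the publisher.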
Because transactions are written first to the distribution server and then to the subscribers,
the distribution server can become a bottleneck. This is a particular drawback when one site is
the master publisher for all articles. The design supports two ways to alleviate the
bottleneck. The first is to move the distribution server to its own machine. The second is to
support multiple publishers, so that the replication load is spread over all of the sites that
make up the enterprise spatial data warehouse. Local data maintenance environments are best
supported in the data distribution architecture as transactional push subscriptions, which keep
all copies of the enterprise spatial data warehouse in sync when edits are posted or committed
to the master version.
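A rough back-of-the-envelope sketch shows why multiple publishers relieve the distribution server. The site names and daily edit counts are invented for illustration, and the model assumes every edit is delivered once to every other site; real replication engines batch and compress changes, so treat the numbers as relative only.

```python
def deliveries_per_distributor(edits_by_site, single_master=None):
    """Count change deliveries each distribution server makes per day.

    Every edit is delivered to all other sites. If `single_master` is given,
    that site's distributor handles every edit; otherwise each site
    distributes only the edits it publishes.
    """
    subscribers = len(edits_by_site) - 1
    if single_master is not None:
        total = sum(edits_by_site.values())
        return {single_master: total * subscribers}
    return {site: n * subscribers for site, n in edits_by_site.items()}


# Hypothetical daily edit volumes for a four-site enterprise.
edits = {"north": 400, "south": 250, "east": 150, "west": 200}

one_master = deliveries_per_distributor(edits, single_master="north")
multi = deliveries_per_distributor(edits)

print("single master:", one_master)
print("multiple publishers:", multi)
print("busiest distributor:", max(one_master.values()), "vs", max(multi.values()))
```

The total number of deliveries is the same in both configurations; what changes is that no single distribution server has to carry all of them.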
Another type of subscriber is the anonymous pull subscriber. This type supports sharing data
over an intranet or the Internet with remote locations. Pull subscriptions support site autonomy
because each site can decide how often to refresh its subscriptions. They do, however, add a
system administration cost that grows linearly with the number of sites in the enterprise GIS,
since each site has to manage its own replication catalog definitions.
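Site autonomy under pull subscriptions can be sketched as each site owning its own refresh schedule. The `PullSubscription` class, the site names, and the hourly intervals below are hypothetical and only model the scheduling decision, not the data transfer itself.

```python
from dataclasses import dataclass


@dataclass
class PullSubscription:
    site: str
    refresh_every_h: int     # chosen by the subscribing site, not the publisher
    last_refresh_h: int = 0

    def due(self, now_h):
        return now_h - self.last_refresh_h >= self.refresh_every_h


def simulate(subscriptions, hours):
    """Return how many times each site pulls over the given number of hours."""
    pulls = {s.site: 0 for s in subscriptions}
    for now_h in range(1, hours + 1):
        for s in subscriptions:
            if s.due(now_h):
                s.last_refresh_h = now_h
                pulls[s.site] += 1
    return pulls


subs = [
    PullSubscription("field-office", refresh_every_h=24),
    PullSubscription("regional-hub", refresh_every_h=1),
    PullSubscription("partner-agency", refresh_every_h=6),
]
pulls = simulate(subs, hours=24)
print(pulls)
```

Each entry in `subs` is one more definition that some site has to maintain, which is the linear administration cost described above.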
The enterprise GIS has reliability and scalability designed into its architecture because
any application server or desktop application can use any of the replicated spatial data
warehouses (HUB data servers) to meet its application requirements (Figure 2).

Figure 2. Data Distribution Architecture