Transparent access to distributed Geographic Information Systems
Cy Smith
State of Oregon - Geospatial Enterprise Office
955 Center Street, Room 470, Salem, OR 97301
Mike Walls
PlanGraphics, Inc. 112 East Main Street Frankfort, KY 40601
Abstract
The State’s data is distributed among a large variety of data repositories, managed by
various agencies. Utilization of data from different networks generally requires agencies
to download copies of the data. This leads to multiple copies, outdated versions of
information, and data stewardship issues. Oregon has tested a network-based tool that
will integrate data from several sources, and demonstrated that web-based applications
can utilize this tool for data access without knowing data locations or formats.
The Logistical Bottleneck
The State of Oregon recently contracted with PlanGraphics, Inc., and its technical
partners Xmarc and ESRI to build a proof of concept application, which was completed
successfully. The application is named
DIMOND - Digitally Integrated Mining of Oregon Networked Data. The multiple
technologies involved worked together to provide a novel way of getting around the
logistical bottlenecks of consolidating data from multiple sources in the enterprise.
Utilization of data from different networks or Internet locations generally has required
agencies to manually, or via automated FTP, download copies of the data to local
network locations to be used by applications. This leads to multiple copies of the same
information at different places, possible issues surrounding use of outdated versions of
this information by applications, and a number of issues regarding data stewardship,
ownership, and management. Common problems encountered when attempting to use
data in distributed locations include:
- Different agencies use different software and hardware to manage the data.
- The data are not uniformly accessible over the network.
- A crucial data set is locked away behind security firewalls or is too slow to access
because of poor network connectivity to the agency.
- Data has to be gathered via ‘sneaker net’ and hand massaged to get a consistent data
set for analysis.
- Data processing (the familiar ‘Extract-Transform-Load’ or ETL process) can require
significant effort and time.
As the number of data repositories grows and they become more widely distributed across
network and Internet locations, it becomes more complex to develop, implement, and
maintain applications that need access to enterprise-wide data and provide enterprise-wide
services. Existing applications critical to most agencies’ business functions
presently use data at these distributed locations. This base of existing applications needs
to be able to continue to do so as long as these agencies do not choose to alter aspects of
their data architecture.
Many agencies and organizations have considered placing their collective enterprise data
in a centralized data set in advance of queries, so the data set would be ready to support
the agencies’ fast-breaking, critical operational needs. Many agency operational
functions require that data and information be shared across agency system boundaries in
a rapid, transparent manner. Repositories, operational data stores, and data warehouses
are each a different technical approach to meeting this need. The DIMOND proof of
concept introduces another.
Traditionally, one of the biggest obstacles to building a centralized repository of data for
an enterprise is the logistical effort involved in updating the contents and keeping them
internally consistent. Finding a solution to this problem is often feasible for a one-time
policy study, but has proven difficult to set up as an ongoing operation. Recent
advancements in network oriented data management technologies have provided some
options not previously available. For this project, the technical team was able to
implement a proof of concept for a “virtual data warehouse” in which Web-based
middleware eliminated the need for centralizing a data repository, along with the
associated updating and consistency problems.
The DIMOND Project
The key to the DIMOND project was the elimination of the requirement to physically
centralize the data. Instead, a virtual repository was created using middleware tools from
Xmarc, Inc., with Oracle 8i Spatial used to manage metadata. The Web was used to display
the answers to a query that users execute from their desktop machines, thus eliminating the
need for a physically centralized repository.
The proof of concept that was developed for this project had to meet the following
requirements:
- Show that a network-based layer of software can be applied that will integrate data
elements from several sources (locations, as well as databases), such that all of the
underlying data elements can be accessed as if a single data source is being accessed.
- Demonstrate that network- and web-based applications can be applied that utilize this
data integration layer for access of data rather than having to understand the true data
locations and underlying data formats themselves.
- Demonstrate that the original data owners or stewards can make changes to the
contents of data sources accessed by this data integration layer without affecting the
network- and web-based applications that use the data integration layer for their data
access/source.
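The three requirements above can be illustrated with a minimal Java sketch. It is not the DIMOND implementation: the names DataSource, IntegrationLayer, register, and query are hypothetical, the sources are in-memory stand-ins for remote agency databases, and the code uses current Java syntax rather than the JDK 1.1.8 dialect used in the project. It shows an application asking for a data set by logical name only, and a steward swapping the underlying source without any change to the application code.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical contract for any data source behind the integration layer. */
interface DataSource {
    List<String> fetch(String query);
}

/** Hypothetical integration layer: maps logical data set names to sources. */
class IntegrationLayer {
    private final Map<String, DataSource> registry = new HashMap<>();

    /** Called by a data steward; re-registering a name replaces its source. */
    void register(String logicalName, DataSource source) {
        registry.put(logicalName, source);
    }

    /** Called by applications, which never see locations or formats. */
    List<String> query(String logicalName, String query) {
        DataSource source = registry.get(logicalName);
        if (source == null) {
            throw new IllegalArgumentException("Unknown data set: " + logicalName);
        }
        return source.fetch(query);
    }
}

class IntegrationLayerDemo {
    public static void main(String[] args) {
        IntegrationLayer layer = new IntegrationLayer();

        // Stand-in for a data set hosted at one agency.
        layer.register("roads", q -> List.of("roads@agencyA:" + q));
        System.out.println(layer.query("roads", "county=Marion"));

        // The steward moves the data; the application call is unchanged.
        layer.register("roads", q -> List.of("roads@agencyB:" + q));
        System.out.println(layer.query("roads", "county=Marion"));
    }
}
```

The application-side call, layer.query("roads", ...), is identical before and after the steward's change, which is the property the third requirement asks the proof of concept to demonstrate.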
Technical Architecture
Users start a session by pointing their Web browser to the URL of the opening screen.
This downloads the core of the Proof of Concept application -- a custom-written Java
applet. As the user navigates through the application, it appears to them that they are
querying a set of GIS tables and layers in a traditional client-server application. Instead,
they are interacting with a middleware servlet that looks up, in Oracle tables, the
accessible metadata for each data set being used. The middleware then initiates
connections to a series of translator servlets located at the data provider sites,
which pass the query through and package the response. The middleware servlet
consolidates all of the responses into a Web page that the user sees. Each of these
interactions was coded and managed using standard Java techniques. Figure 1
indicates the overall technical architecture used in the DIMOND project.

Figure 1: System Architecture—DIMOND Proof of Concept
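The request flow described above (metadata lookup, fan-out to translators, consolidation) can be sketched in plain Java. This is an illustrative model under stated assumptions, not the project's servlet code: Translator, Middleware, addDataSet, and answer are hypothetical names, in-memory maps stand in for the Oracle metadata tables and for the network connections to provider sites, and the consolidated "page" is plain text rather than HTML.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a translator servlet at one data provider site. */
interface Translator {
    String handle(String query);
}

/** Hypothetical middleware: metadata lookup, fan-out, consolidation. */
class Middleware {
    // Stand-in for the Oracle metadata tables: data set name -> provider site.
    private final Map<String, String> metadata = new LinkedHashMap<>();
    // Stand-in for network connections: provider site -> translator.
    private final Map<String, Translator> translators = new LinkedHashMap<>();

    void addDataSet(String dataSet, String site, Translator translator) {
        metadata.put(dataSet, site);
        translators.put(site, translator);
    }

    /** Resolves each data set, forwards the query, and merges the replies. */
    String answer(List<String> dataSets, String query) {
        List<String> responses = new ArrayList<>();
        for (String dataSet : dataSets) {
            String site = metadata.get(dataSet);          // metadata lookup
            if (site == null) {
                responses.add(dataSet + ": no metadata");
                continue;
            }
            String reply = translators.get(site).handle(query); // pass-through
            responses.add(dataSet + ": " + reply);
        }
        return String.join("\n", responses);              // consolidation
    }
}

class MiddlewareDemo {
    public static void main(String[] args) {
        Middleware middleware = new Middleware();
        middleware.addDataSet("parcels", "siteA", q -> "A(" + q + ")");
        middleware.addDataSet("wells", "siteB", q -> "B(" + q + ")");
        System.out.println(middleware.answer(List.of("parcels", "wells"), "bbox"));
    }
}
```

In this sketch the client only ever calls answer; which site holds which data set is known solely to the metadata map, mirroring the role the paper assigns to the Oracle tables.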
Applets
Xmarc’s Enterprise Spatial Java Map product (ESJMap) was utilized in this project to
produce the interface for the virtual data warehouse. The interface was created using
applets coded in Java and compiled using Sun’s JDK 1.1.8 product for maximum browser
compatibility. The applets can be viewed in a standard browser window with Java
enabled, such as Internet Explorer or Netscape.
The Xmarc ESJMap Application Programming Interface (API) consists of hundreds of
Java classes bundled in one JAR file (ESJMap.jar), the standard Java means of
packaging pre-defined class definitions for distribution and reuse. The main components
of the API are a map window for displaying spatial data, a legend for displaying
information about layers in the map window, and a toolbar for manipulating data in the
map window, each of which was used in the proof of concept project.