Spatial data warehousing "What business intelligence is There?"
Spatial Data
While traditional warehouses incorporate the spatial dimension in the form of sales
regions, stores, and customers, the relationships between them are not typically exploited.
Powerful spatial query techniques found in GIS systems such as finding customers or
network equipment within an area can be used to map disparate operational data onto
dimensions not previously possible. In addition, thematic mapping techniques can be
used to plot metrics on maps, resulting in the ability to easily spot previously unrevealed
geographically related trends.
Examples of thematic mapping appear below.

Figure 1 -- Thematic Mapping: Sales Data

Figure 2 -- Thematic Mapping: Population Density
Spatial Data Warehousing for analysis of enterprise metrics
Traditional data warehousing can be applied to spatial data, simply by tracking
operational metrics about the GIS data. Examples would be storing inspections, trouble
calls, disconnects, leaks, and other network metrics in fact tables. The GIS can be used to
classify each fact into the correct region or regions, using “within polygon” query
capabilities. Attributes about the network elements could be stored in “product” type
dimension tables, such as transformer, conductor, pipe, or customer. By integrating the
functionality of a GIS with the reporting and analysis capabilities of a data warehouse,
regions can be more easily changed or updated, and facts can be driven from the
application functionality for storage in the warehouse.
Some examples of potential analysis and reporting capabilities are:
- Plot leaks or outages by region and timeframe thematically using colored regions.
- Report number of inspections against problems found by inspector zone over time
on a chart within each region.
- Display sales against marketing initiatives on a demographic map.
- Show customer complaints by region against number of outages.
Another possibility is the ability to combine multiple GIS and other operational system
data into a single browser. Middleware can be used to pass messages from source
systems to update dynamic information in real-time, resulting in an enterprise asset
viewer and reporting tool. Layer controls can be used to show operational states, such as
outages, or any other report available in the data warehouse. Thematic mapping
capabilities of spatial viewers can be implemented to color regions based on the results of
the reports.
Spatial Data Warehousing for System Integration
Spatial Data Warehousing is a powerful system integration paradigm that can pave the
way forward for companies facing difficult integration problems involving multiple
spatial platforms, and diverse data sets and formats. The warehouse is built alongside
existing systems, and implemented and maintained in parallel with existing initiatives so
as not to disrupt production. The warehouse is designed to provide immediate business
intelligence and traditional warehousing benefits, while at the same time aligning diverse
data sets on a common robust enterprise platform. Eventually, applications are enhanced
to use the warehouse as the source of record, allowing direct updates with multiple
application concurrency control and version management. This approach provides many
short and long term benefits, including improved management, integration, performance,
automation, stability, and security.
Implementation Specifies
First, a Spatial Data Warehouse (SDW) is created, using one of many GIS server storage
products. (Oracle Spatial, ESRI SDE, Autodesk AGDS, etc.) The SDW allows read only
access (view / query) to applications, and is not the source of record for any dataset
except the operational metrics. Many off the shelf tools can be used to create simple data
mover routines for many data formats, and can be customized and enhanced as needed.
Unique dataset identifiers are created on the initial load of each dataset, and those IDs are
stored as part of the primary key for each row in the SDW. Alternatively, a composite
primary key can be built from the GIS key and a dataset id. The dataset id allows each
dataset to be refreshed, by deleting each row having the key and importing a new copy of
the source data.
Initially, the data movers for each dataset format simply delete all warehoused data for
the dataset, and re-import it. For example, each file (dwg, shp, etc.) is a single dataset, as
is each continuous database dataset. Refreshing is controlled by an ongoing server
scheduler process, or by a manual “as-needed” trigger.
This approach works as long as refreshes are done infrequently, since each full refresh
involves moving every data element. As more accurate and up-to-date information is
desired, "modified" flags are built into the source apps, so that refresh only has to move
modified data, and can be done more frequently. This could also be implemented using
triggers, time stamps, or many other similar constructs.