GITA 1997


Advanced Technical Topics


The Data Warehouse


William R. Donaldson
Senior Consultant, Convergent Group, 6200 South Syracuse Way, Englewood, Colorado 80111


Abstract
“A data warehouse is a subject-oriented, integrated, non-volatile, time-variant collection of data suited to the needs of management.” [Inmon, 1994] This paper will take that definition apart and describe how the data warehouse can provide decision support information and contribute to the integrated solution of AM/FM/GIS in the enterprise data environment. The data warehouse is more than a central copy of system data and less than a cure-all. The data warehouse can be an effective and efficient means of delivery of the information created and maintained by the AM/FM/GIS. It can be the solution that makes the difference between success and failure of a project. The paper will illustrate the technical and economic factors to be examined when considering a data warehouse in the project architecture. It examines the details of the 12 criteria defining what the data warehouse is (and is not), explains the five elements of data warehouse architecture, and concludes with a review of the potential hazards and measures of success.

The Enterprise Environment
“The biggest mistake any company can make is to promise that one of the GIS benefits will be that it will help reduce personnel.” [Muench, 1996] In a large utility company, AM/FM/GIS may not be cost effective as a stand-alone application. This assertion was considered almost heretical a few years ago, but more and more we are seeing projects falling short of projected benefits. The most commonly seen justification for implementing an AM/FM/GIS is that it will reduce cost by eliminating people. Planners project improvements in productivity and related savings to be 30 percent or more for a stand-alone system. That makes for a pretty attractive return on investment. Yet, when we look at actuals from systems that have been installed for a year or more, we see few, if any, headcount reductions.

On the other hand, the case is being made that the information kept in the AM/FM/GIS is incredibly valuable, not only to facilities engineers, but to the entire company, provided it is available in a useful form to everybody who needs it. It’s not the AM/FM/GIS application that has value to the company; it’s the information the system maintains that has value. Or to paraphrase a catch-phrase from the 1992 presidential campaign, “It’s the data, stupid.” Experts are coming to realize how the value of the information multiplies many times over when it can be shared across the company.

There are four architectures that can be used to satisfy the needs of enterprise data:
  • Central Database
  • Distributed Databases
  • Federated Databases
  • Data Warehouse
Each of these architectures has advantages and disadvantages. System architects should understand them thoroughly so that the architectures can be used, individually or in combination, to create the optimal company-wide system environment.

Most systems today use the central database architecture, either as a single corporate database or as multiple, stand-alone databases located in districts. Distributed databases are defined as multiple, homogeneous databases connected by a network and having data services that direct the application to the correct location for specific data. Few AM/FM/GIS systems today use this architecture.

Federated databases are similar to distributed databases except that they are heterogeneous and thus more suited to integration with legacy systems. However, the difficulty and cost of integrating federated databases are enormous.
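
The role of the data services in a distributed architecture, directing the application to the correct location for specific data, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names Site, DataService, register, and fetch, and the sample district tables, are hypothetical and do not come from any particular product.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """One database on the network (hypothetical stand-in for a real DBMS)."""
    name: str
    tables: dict = field(default_factory=dict)  # table name -> list of row dicts

    def query(self, table):
        # Return a copy so callers cannot alter the site's own rows.
        return list(self.tables.get(table, []))


class DataService:
    """Directs each request to the site that holds the requested table."""

    def __init__(self):
        self._directory = {}  # table name -> Site

    def register(self, table, site):
        self._directory[table] = site

    def fetch(self, table):
        site = self._directory.get(table)
        if site is None:
            raise KeyError(f"no site registered for table {table!r}")
        return site.query(table)


# Two homogeneous district databases (sample data is invented).
north = Site("north_district", {"poles": [{"id": 1, "height_ft": 40}]})
south = Site("south_district", {"valves": [{"id": 7, "size_in": 6}]})

service = DataService()
service.register("poles", north)
service.register("valves", south)

# The application asks for data by name, not by location.
rows = service.fetch("valves")
```

In a federated arrangement each site would also need its own translation layer, because the databases are heterogeneous. That extra layer per source is one plausible reading of why integration cost grows so quickly.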

Pair-wise interfaces between systems and databases do not constitute distributed or federated databases and are a poor substitute for enterprise data architectures. Enterprise data is that which meets the following criteria:
  • Data is separate from applications
  • Redundancy is managed
  • Data is shareable
  • Data is independent from supplier products
  • Data is stewarded
  • Access is open and documented
The data warehouse has the advantage of being relatively easy to steward and to manage redundancy. It is inherently independent of applications, and the data is shareable. Moreover, the data warehouse is (relatively) easy to integrate with legacy systems. Perhaps its greatest advantage, however, is that it can be designed and implemented in stages, with each stage returning its benefit as soon as it is implemented. Unlike other components of a major system, it does not have to be complete before it starts returning value. The data warehouse must be carefully designed, however, as once built it tends to be inflexible.
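
A minimal Python sketch can make the staged, non-volatile, time-variant character of the warehouse concrete. It assumes an append-only store in which each load is stamped with a snapshot date and nothing is ever updated in place. The Warehouse class, its method names, and the sample facilities and customers records are hypothetical, chosen only for illustration.

```python
from datetime import date


class Warehouse:
    """Append-only store: rows are added with a snapshot date, never updated."""

    def __init__(self):
        self._rows = []  # each entry: (subject, snapshot_date, record)

    def load(self, subject, snapshot_date, records):
        """Load one subject area as it stood at one point in time."""
        for record in records:
            self._rows.append((subject, snapshot_date, dict(record)))

    def subjects(self):
        return sorted({subject for subject, _, _ in self._rows})

    def as_of(self, subject, when):
        """Return the latest snapshot of a subject taken on or before `when`."""
        dates = [d for s, d, _ in self._rows if s == subject and d <= when]
        if not dates:
            return []
        latest = max(dates)
        return [r for s, d, r in self._rows if s == subject and d == latest]


wh = Warehouse()

# Stage 1: the facilities subject area is loaded and is usable immediately.
wh.load("facilities", date(1997, 1, 1), [{"id": 1, "type": "pole"}])
wh.load("facilities", date(1997, 2, 1),
        [{"id": 1, "type": "pole"}, {"id": 2, "type": "transformer"}])

# Stage 2, added later: a second subject area, without touching the first.
wh.load("customers", date(1997, 2, 1), [{"id": 100, "feeder": "F12"}])

january = wh.as_of("facilities", date(1997, 1, 15))
february = wh.as_of("facilities", date(1997, 2, 15))
```

Because earlier snapshots are never overwritten, a query for mid-January still returns the January picture after the February load. The fixed row layout also hints at the inflexibility noted above: changing it later would mean reworking everything already loaded.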

The Data Warehouse Defined
The definition of a data warehouse seems to be quite simple, yet, as Hackathorn [1993] says, “there has been much confusion and even controversy over what constitutes a warehouse.” Hackathorn attempts to define the subject as “a collection of data objects that have been packaged and inventoried for distribution to a business community.” This definition seems too generic and ambiguous in describing a warehouse and how it differs from other data stores. For example, his definition could describe a simple database extract.

We are inclined to ask, so what’s the big deal with definitions so long as it stores my data? In fact, the data warehouse is as different from other data stores as a database is from a flat file. It’s critical to understand the differences, at least at the conceptual level, if we are to communicate requirements and design options among users, designers, and suppliers, and to make good choices in our project planning.

Bill Inmon, the self-proclaimed “Father of the Data Warehouse,” describes 12 rules of the data warehouse that pin it down in greater detail and clarity.
