GISdevelopment.net ---> GITA 2000 ---> System Architecture

Evaluating GIS Vendor Templates

Gary R. Graybill
Stoner Associates, Inc.


Introduction
This paper focuses on the process of performing a gap analysis on the standard templates packaged with GIS software for distribution utilities. The scope of this review covers both relational and object-oriented vendor offerings.

The GIS Software Selection Process
Many utilities are incorporating an evaluation of vendor supplied templates as a part of the software selection process. To properly evaluate the usefulness of a template, some level of understanding is required about the information the GIS will manage and how it will be stored.

Types of Data Modeling
What is a GIS data model? A GIS data model can be thought of in the same context as a traditional information system data model, which can simply be defined as “a specification of the data structures and business rules needed to support a business area”. A true data model should be more than just a structural diagram, and should convey complete information about how entities are managed by the information system. Data models should start at a low level of detail and then expand the detail as they reach completion.

The Logical Data Model (LDM) is a representation of the system at the abstract level, and contains none of the constraints of the actual physical implementation. The Physical Data Model (PDM) is the expression of the logical design into a working application architecture.

Entity-Relationship Data Models
Entity Relationship (ER) data models have been used for many years to assist with the implementation of relational database systems. The current accepted standard for ER modeling is called IDEF1X, and is supported by most modeling software vendors.

Special terminology is associated with data modeling and some of the commonly used terms are defined below:
  • An entity is a thing of significance to the business, whether real or conceptual, which holds information for the business or system being modeled. Each defined entity must have more than one occurrence and each occurrence must be uniquely identifiable. For example, a transformer is an entity in an electrical distribution system. Transformers can have many types or occurrences and each type or occurrence can be uniquely identified. An entity relationship diagram not only shows data entities but also their relationships with others.

    Data entities are represented using the following notation:

    [Figure: entity notation, labeled “Data Entity”]

  • An attribute is an individual piece of descriptive data about an entity. Attributes of an entity are identical for all the specific instances of the entity. Attributes for a super-entity are applied to all of its sub-entities.
  • A relationship is the nature of the interaction between two entities. Sample entity relationships are shown below.

    [Figure: sample entity relationships]

    An optional relationship description can also be shown using text in the ER diagram.
  • A primary key is an entity attribute that uniquely identifies each instance of an entity from other instances. It is the primary key that provides the relationship between two entities.
  • A domain is a common database format and set of discrete values that can be used repeatedly for similar attributes. All attributes must either have a domain or their own unique definition.
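
The terms above can be made concrete with a small sketch. The Python below is illustrative only: the `Domain`, `Attribute` and `Entity` classes, the `PhaseCode` domain and the transformer attribute names are assumptions introduced here, and are not part of IDEF1X or of any vendor template.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Domain:
    """A reusable set of discrete values shared by similar attributes."""
    name: str
    values: frozenset

    def allows(self, value) -> bool:
        return value in self.values


@dataclass(frozen=True)
class Attribute:
    """One descriptive item of an entity, optionally bound to a domain."""
    name: str
    domain: Optional[Domain] = None


@dataclass
class Entity:
    """An entity, its attributes, and the attribute used as primary key."""
    name: str
    primary_key: str
    attributes: dict = field(default_factory=dict)

    def add(self, attribute: Attribute) -> None:
        self.attributes[attribute.name] = attribute

    def validate(self, row: dict) -> list:
        """Return a list of problems found in one occurrence of the entity."""
        problems = []
        if row.get(self.primary_key) is None:
            problems.append(f"missing primary key {self.primary_key}")
        for name, value in row.items():
            attr = self.attributes.get(name)
            if attr is None:
                problems.append(f"unknown attribute {name}")
            elif attr.domain and not attr.domain.allows(value):
                problems.append(f"{name}={value!r} not in domain {attr.domain.name}")
        return problems


# Hypothetical example data, for illustration only.
phase = Domain("PhaseCode", frozenset({"A", "B", "C", "ABC"}))
transformer = Entity("Transformer", primary_key="transformer_id")
transformer.add(Attribute("transformer_id"))
transformer.add(Attribute("phase", phase))

good = {"transformer_id": 101, "phase": "A"}
bad = {"transformer_id": None, "phase": "X"}
print(transformer.validate(good))
print(transformer.validate(bad))
```

The first occurrence passes; the second is reported twice, once for the missing primary key and once for a phase value outside the domain.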

Object Models
Object models have the advantage of dealing with entities at a completely abstract level. Since an object can contain both the attributes and code to control its behavior, it is more intelligent than a relational entity. These attributes and behaviors are called properties and methods in object terminology. Another important characteristic of object oriented systems is their support of inheritance. An object can inherit the properties and methods of another object, and only focus on differences which make it unique from its parent. Several object notations existed before the mid-1990s, but most of them have been replaced with the standard Unified Modeling Language (UML). Some of the basic terminology and notation of the UML standard can be found below:
  • A class is a description of a set of objects that share the same attributes, operations, methods, relationships and semantics. A class may use a set of interfaces to specify collections of operations it provides to its environment. Classes can also be categorized into abstract and coclass types depending on their ability to create new objects and specify subclasses.

    Object classes are represented using the following notation:

    [Figure: class notation showing “Object Class” and “attributes”]

  • A method is the implementation of a service that is requested from an object. It specifies the algorithm or procedure that effects the results of the service.
  • Inheritance is the mechanism by which more specific elements incorporate structure and behavior of more general elements.
  • A property is a named value denoting a characteristic of an object. Properties of an object are shared by all the specific instances of the object.
  • A relationship defines the nature of the object class interaction. An n-ary association can exist between three or more classes. Some sample relationships are shown below:

    [Figure: sample object class relationships]

This information is intended only as an introduction to the object modeling process. More information on object modeling and the complete UML specification can be found on the World Wide Web at http://www.rational.com/uml. [OMG, 1999]
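
As a minimal sketch of these object concepts, the Python below shows an abstract class, a concrete subclass, inherited properties and methods. The `Device` and `Transformer` classes and their members are hypothetical names chosen for illustration; treating the concrete class as the counterpart of a coclass is also an assumption made here.

```python
from abc import ABC, abstractmethod


class Device(ABC):
    """Abstract class: it cannot create objects, only specify subclasses."""

    def __init__(self, device_id: int, rated_kva: float):
        self.device_id = device_id  # properties shared by every subclass
        self.rated_kva = rated_kva

    @abstractmethod
    def describe(self) -> str:
        """Method that each concrete subclass must implement."""

    def is_overloaded(self, load_kva: float) -> bool:
        """Behavior inherited unchanged by all subclasses."""
        return load_kva > self.rated_kva


class Transformer(Device):
    """Concrete class: inherits from Device and adds only what is unique."""

    def __init__(self, device_id: int, rated_kva: float, phase: str):
        super().__init__(device_id, rated_kva)
        self.phase = phase

    def describe(self) -> str:
        return f"Transformer {self.device_id} ({self.rated_kva} kVA, phase {self.phase})"


t = Transformer(101, 50.0, "A")
print(t.describe())
print(t.is_overloaded(62.5))
```

`Transformer` defines only its own `phase` property and `describe` method; `device_id`, `rated_kva` and `is_overloaded` come from the parent class.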

The Gap Analysis
A Gap Analysis is a procedure to find and quantify the differences between the vendor’s template and an existing design or system requirements document. A number of exercises have been laid out to assist with the evaluation of two data models. All differences should be reviewed carefully within the context of how they will affect the system implementation. The number and types of discrepancies will determine the effort that is required to customize the template. Some of these exercises are relatively simple to perform for completely relational systems, but become very tedious whenever object oriented systems with complex entities are involved.

Comparing Models
It is strongly recommended that system planners create their own data model. Although some vendors supply the logical data model that accompanies the template, that diagram is unlikely to represent the organization’s operating business rules well. A logical design may not even exist for the vendor template data model. In this situation, Computer Aided Software Engineering (CASE) tools can help construct a logical representation by “reverse engineering” the physical data model specification into a higher-level, abstract Logical Data Model (LDM). Once two data models exist for the proposed system, the differences between the LDM and the GIS vendor’s template can be evaluated.

Entity Cross-reference Exercise
Once a logical design model has been constructed, a comparison can be made between the template and the LDM.
  • Find entities in a vendor’s template data model that do not exist in the LDM. These should be considered potential omissions, and may need to be added to the LDM.
  • Find entities in the LDM that do not exist in the vendor’s data model. These will be a source of customization in the GIS to create the needed entities.
  • Determine the level where entities match between the two systems. Analogous entities may exist in both systems, but be considered a sub-type in one system and a super-type in the other.
All of the entity mismatches should be listed, and recommendations made to deal with the differences.
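
The cross-reference above amounts to a few set comparisons. The sketch below assumes each model has been reduced to a mapping from entity name to its super-type (`None` for a top-level entity); the function names and the sample entities are hypothetical, and matching is done on exact names, which real models rarely allow without a manual mapping step.

```python
def cross_reference(template_entities, ldm_entities):
    """Split entity names into template-only, LDM-only and shared lists."""
    template, ldm = set(template_entities), set(ldm_entities)
    return {
        "only_in_template": sorted(template - ldm),  # potential LDM omissions
        "only_in_ldm": sorted(ldm - template),       # customization needed
        "in_both": sorted(template & ldm),
    }


def level_mismatches(template_parents, ldm_parents):
    """Entities present in both models whose super-type differs."""
    shared = set(template_parents) & set(ldm_parents)
    return sorted(e for e in shared if template_parents[e] != ldm_parents[e])


# Hypothetical models: entity name -> super-type (None for a top-level entity).
template_parents = {
    "Device": None,
    "Transformer": "Device",
    "Regulator": "Device",
    "Streetlight": None,
}
ldm_parents = {
    "Device": None,
    "Transformer": None,
    "Regulator": "Device",
    "Capacitor": "Device",
}

print(cross_reference(template_parents, ldm_parents))
print(level_mismatches(template_parents, ldm_parents))
```

Here `Streetlight` is a candidate addition to the LDM, `Capacitor` would have to be created in the GIS, and `Transformer` exists in both models but at a different level.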

Attribute Evaluation Exercise
Vendor templates typically strive to include as many attributes as possible, without becoming overburdened or redundant. The list of attributes has been selected to be appropriate for most utilities intending to use the system. Unless the logical design model (LDM) has been carried out to a significant level of detail, attributes will not be available in the LDM for comparison to the vendor’s template. The following list of questions is intended to assist the designer in the evaluation of attributes.
  • Are all attributes from the template necessary? If not, they are good candidates for removal.
  • Will attributes be used at some point in the future? If they are likely to be used in the future, keep them as part of the logical design, but exclude them from the physical design.
  • Are any attributes missing from the template? Adding attributes is usually a simple matter unless there are relationships to other entities, other systems or complex functionality required to support them.
  • Are attributes in the template associated with a different entity in the LDM? Moving attributes from one entity to another can be very simple. At other times it can be a very complex matter, especially when the attribute moves to an entity of a different geometry type or is tied to built-in application functionality.
Unused and long-term future attributes should be removed from the database if it is expected to grow to any appreciable size. Unused entities and attributes can waste significant space in the physical database if not removed.
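
When the LDM does carry attributes, the first pass over these questions can be automated. The following sketch assumes each model is a mapping from entity name to a set of attribute names, and that an attribute name means the same thing wherever it appears; both are simplifying assumptions, and the sample entities and attributes are hypothetical.

```python
def attribute_gaps(template, ldm):
    """Compare two {entity: set_of_attribute_names} mappings."""

    def owner_index(model):
        # Map each attribute name to the set of entities that carry it.
        index = {}
        for entity, attrs in model.items():
            for attr in attrs:
                index.setdefault(attr, set()).add(entity)
        return index

    t_owner, l_owner = owner_index(template), owner_index(ldm)
    shared = set(t_owner) & set(l_owner)
    return {
        "unused_in_template": sorted(set(t_owner) - set(l_owner)),
        "missing_from_template": sorted(set(l_owner) - set(t_owner)),
        "on_different_entity": sorted(a for a in shared if t_owner[a] != l_owner[a]),
    }


# Hypothetical attribute lists, for illustration only.
template = {
    "Transformer": {"kva", "phase", "install_date", "manufacturer"},
    "Pole": {"height", "material"},
}
ldm = {
    "Transformer": {"kva", "phase", "owner"},
    "Pole": {"height", "material", "install_date"},
}

print(attribute_gaps(template, ldm))
```

The output lists candidates only. Whether `manufacturer` should be dropped, or how hard it is to move `install_date` to another entity, still has to be judged by the designer.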

Business Rules Evaluation
Comparing business rules is one of the most demanding parts of a Gap Analysis. It requires a thorough understanding of business processes and GIS application architecture. The template can be expected to have some business rules built in for rudimentary support, but they will not be extensive. The built-in rules usually provide only basic functionality and serve as examples for constructing your own.

If a business rules library exists where complex rules can be constructed from lower level rules, then an evaluation of their suitability can be done. Most implementations will require custom business rules that must be coded by a software developer who is familiar with the GIS application development environment.
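
To illustrate what building complex rules from lower-level rules can look like, here is a minimal sketch in Python. It does not represent any vendor’s rules library: the `all_of`, `has_value`, `in_range` and `one_of` helpers, and the transformer rule built from them, are hypothetical.

```python
from typing import Callable

Rule = Callable[[dict], bool]


def all_of(*rules: Rule) -> Rule:
    """Combine lower-level rules; the result passes only if every rule passes."""
    return lambda feature: all(rule(feature) for rule in rules)


def has_value(name: str) -> Rule:
    return lambda feature: feature.get(name) is not None


def in_range(name: str, low: float, high: float) -> Rule:
    return lambda feature: feature.get(name) is not None and low <= feature[name] <= high


def one_of(name: str, allowed: set) -> Rule:
    return lambda feature: feature.get(name) in allowed


# A higher-level rule assembled from the low-level ones (hypothetical limits).
valid_transformer = all_of(
    has_value("transformer_id"),
    in_range("kva", 5, 500),
    one_of("phase", {"A", "B", "C", "ABC"}),
)

print(valid_transformer({"transformer_id": 1, "kva": 50, "phase": "A"}))
print(valid_transformer({"transformer_id": 1, "kva": 5000, "phase": "A"}))
```

A library organized this way is easier to evaluate, because each low-level rule can be checked against the organization’s requirements on its own.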

The following list of questions is intended to assist the evaluation of business rules:
  • Can the template’s existing business rules be enhanced?
  • How do business rules interface with the system’s architecture? Are the business rules well documented?
  • Can business rules be created with common software tools such as Visual Basic?

Life Cycle Transition Management
An effective means to understand the life cycle requirements of the system is to perform a CRUD analysis. (CRUD stands for the Create, Read, Update and Delete phases of the entity life cycle.) This is typically done by creating a CRUD matrix, which is a two dimensional cross-reference of processes and entities. This exercise becomes valuable when the GIS system is integrated with other information systems within an organization.

This type of analysis can help to answer the following questions:
  • Are there any entities that are not created?
  • Are there any entities that are not updated?
  • Is any entity created/updated by multiple processes? (ownership issues)
  • Are there any entities that are not used at all?
  • Are there entities not read by any processes?
Sample CRUD diagram for GIS update process:

Entity Name   Create   Read   Update   Delete
Transformer   X        X      X
Regulator     X        X      X        X
Parcel                 X      X
County                 X
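
Once the matrix is held as data, several of the questions above can be answered mechanically. In the sketch below, the “GIS update” rows are taken from the sample matrix; the second process, the function name and the data layout are assumptions added for illustration.

```python
def crud_findings(matrix):
    """matrix maps process name -> {entity name: string of C/R/U/D letters}."""
    entities = sorted({e for ops in matrix.values() for e in ops})

    def processes_doing(entity, letters):
        # Processes that perform at least one of the given operations.
        return sorted(
            process
            for process, ops in matrix.items()
            if any(letter in ops.get(entity, "") for letter in letters)
        )

    return {
        "never_created": [e for e in entities if not processes_doing(e, "C")],
        "never_updated": [e for e in entities if not processes_doing(e, "U")],
        "never_read": [e for e in entities if not processes_doing(e, "R")],
        "multiple_writers": [e for e in entities if len(processes_doing(e, "CU")) > 1],
    }


matrix = {
    "GIS update": {
        "Transformer": "CRU",
        "Regulator": "CRUD",
        "Parcel": "RU",
        "County": "R",
    },
    # Hypothetical second process, added to show an ownership conflict.
    "Work order system": {"Transformer": "RU"},
}

print(crud_findings(matrix))
```

Parcel and County are never created by these processes, which suggests they are maintained elsewhere, and Transformer is written by two processes, which raises the ownership question. Finding entities that are not used at all would also require the full entity list from the data model.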

The following list of questions can assist the designer in the evaluation of Life Cycle management issues:
  • Does the template have adequate detail to manage life cycle requirements? Can the life cycle state names be changed if needed? Can additional life cycle states be added if needed?
  • Does the template require more detail in the life cycle states than is needed? Can the unneeded states be dropped?
  • Do life cycle issues carry over to other systems that can be linked or integrated with the GIS?

Connectivity/Ownership Model
Connectivity and asset ownership are complex issues that are difficult to express using conventional ER diagram notation. Most vendor offerings support a network data model of some sort. If it is present and documented, its suitability can be determined. This decision must be guided by evaluating the applications that will use the data. If, for example, an outage analysis application is desired, the network model must support tracing all the way from the feeder or source to the customer service connections.
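
The tracing requirement can be pictured with a small sketch. It assumes the network has been exported as a simple adjacency mapping from each device to the devices downstream of it; the device names and the `trace_downstream` function are hypothetical, and a real network model would also carry phases, switch states and loops.

```python
from collections import deque


def trace_downstream(network, start, open_devices=frozenset()):
    """Breadth-first trace from start that does not pass through open devices."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in network.get(node, ()):
            if neighbor not in seen and neighbor not in open_devices:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


# Hypothetical feeder: each key lists the devices directly downstream of it.
network = {
    "Feeder1": ["Switch1"],
    "Switch1": ["Transformer1", "Transformer2"],
    "Transformer1": ["Service1", "Service2"],
    "Transformer2": ["Service3"],
}
services = {"Service1", "Service2", "Service3"}

energized = trace_downstream(network, "Feeder1", open_devices={"Transformer2"})
print(sorted(services - energized))
```

With `Transformer2` out of service, the trace no longer reaches `Service3`. If the template’s network model cannot support a trace of this kind from source to service, an outage analysis application cannot be built on it without customization.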

Asset ownership issues are another complex area to be considered. An owning entity usually supports or houses the owned entity in some way. Ownership status can be enforced by several different mechanisms. They are usually inferred from the network model or explicitly related through key identifiers. An important issue for consideration is whether the owned entity has, or does not have, spatial characteristics of its own. Although ownership can be represented in relational systems, complex objects provide an elegant mechanism to construct this type of relationship in object oriented systems.

The following list of questions is intended to assist the designer in the evaluation of connectivity and ownership issues:
  • Does a connectivity/ownership model exist?
  • Can connectivity/ownership relationships be easily reported or displayed?
  • Can the relationships be modified if needed?
  • If no connectivity model is shipped with the template, how is the connectivity managed?
  • Does the connectivity model meet the requirements for data management and other applications?
  • Is ownership managed separately from connectivity? Are “complex objects” used to manage owner relationships?

Fundamental Architecture Differences
Some fundamental data structures are required by the system application architecture. If a design conflict of this type is found, there is no choice but to rework the design to accommodate the vendor’s requirements. It is important to understand these requirements at an early stage of the system design. Application requirements can frequently be incorporated into the transition from the logical model to the physical model.

Conclusions
Vendor templates can serve as an excellent means to get a GIS implementation off to a fast start. Even if they are not used, they provide an excellent example of a working system, which can be an invaluable resource for those unfamiliar with them. In this era of Enterprise Resource Planning and “company in a box” software applications, it is not surprising to find organizations willing to utilize a vendor’s template as is. It is potentially easier to make slight changes to an organization than it is to extensively reconfigure a complex software system.

After working through the exercises for evaluating vendor templates, implementation strategies should become clear. An evaluation can effectively be summarized as a list of what is missing from the template and what needs to be removed from it. All necessary additions should be evaluated in terms of the time and effort they require. The closer any two designs are to one another, the less time (and cost) it will take to modify one to look like the other.

New GIS systems are rapidly moving toward object-oriented environments. These new systems present additional challenges to designers who are unfamiliar with object methodologies. The ability to define the system from an abstract, real-world perspective offers significant advantages over the previous relational technology. However, the power and flexibility of object-oriented systems do not come without a price. Evaluating an object-oriented template can take significantly more effort than evaluating a conventional template. Experience in object-oriented system design and software development is required to construct any new objects that are missing from the template.

References
  • [OMG, 1999] OMG Unified Modeling Language Specification. Version 1.3, Object Management Group, 1999