GITA 2002


System Architecture

Version Management Revisited
Peter M. Batty
GE Smallworld
5600 Greenwood Plaza Blvd.
Englewood, CO 80111


Abstract
This paper discusses version management, which is now widely accepted in the industry as an essential technique for managing the design lifecycle. All major vendors now claim to have version management as part of their solution. Despite this, version management and the long transaction problem that it solves are still not widely understood.

This paper summarizes the fundamentals of version management and looks at two different approaches to implementing it: deep and shallow version management. It then explains how basic version management, while an essential prerequisite, is just the beginning of a full solution for managing the design process. Other important issues that need to be addressed include design versus as-built views, future views of the network, partial job completion, jobs built on jobs, handling historical information, and support for detached design work in the field. Each of these is explained, and approaches to implementing them are discussed.

The Long Transaction Problem
For a detailed discussion of the long transaction problem, see Newell and Easterfield, 1990, and Newell and Batty, 1994.

The basic technical requirement for a long transaction is the ability to lay out a design (performing inserts, updates and deletes) in such a way that the changes being made are not visible to other users of the system until the design reaches a stage where it is appropriate to share it with others. Since the user has, in some sense, a private copy of the data, concurrency control needs to be addressed: what happens if two people want to update something in the same area? Since these transactions can remain open for weeks or months, it is generally unacceptable to insist that all data in an area be locked for that time. Therefore, most approaches use an optimistic form of concurrency control in which data is not locked, but any conflicting updates are identified and resolved at some point before the transaction is completed.
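To make the optimistic approach concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular product works: the names MasterDatabase, LongTransaction and ConflictError are hypothetical, and the transaction takes a full snapshot of the data purely to keep the example short. Nothing is locked; conflicts are looked for only when a transaction is committed.

```python
class ConflictError(Exception):
    """Raised at commit time when another user changed the same objects."""


class MasterDatabase:
    def __init__(self):
        self.records = {}    # object id -> value
        self.revisions = {}  # object id -> count of committed changes

    def begin(self):
        return LongTransaction(self)

    def commit(self, txn):
        # Optimistic check: an object conflicts if someone else committed
        # a change to it after this transaction started.
        conflicts = sorted(
            key for key in txn.changes
            if self.revisions.get(key, 0) != txn.start_revisions.get(key, 0)
        )
        if conflicts:
            raise ConflictError(conflicts)
        for key, value in txn.changes.items():
            if value is None:          # None marks a delete
                self.records.pop(key, None)
            else:
                self.records[key] = value
            self.revisions[key] = self.revisions.get(key, 0) + 1


class LongTransaction:
    def __init__(self, db):
        # Assumption for brevity: a full snapshot stands in for the
        # user's private view of the data.
        self.snapshot = dict(db.records)
        self.start_revisions = dict(db.revisions)
        self.changes = {}  # private inserts, updates and deletes

    def read(self, key):
        if key in self.changes:
            return self.changes[key]
        return self.snapshot.get(key)

    def write(self, key, value):
        self.changes[key] = value


db = MasterDatabase()
setup = db.begin()
setup.write("pole-1", "wood")
db.commit(setup)

first, second = db.begin(), db.begin()
first.write("pole-1", "steel")
second.write("pole-1", "concrete")
db.commit(first)
try:
    db.commit(second)
except ConflictError as err:
    print("conflicting objects:", err.args[0])
```

In this sketch the second commit is rejected as a whole and reports pole-1, so the second user can inspect the conflict and decide how to resolve it before trying again.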

Checkout
Checkout has been the most commonly used approach to the long transaction problem: a small geographic area is copied to a separate database or file where the work is done, and the changes are passed back to the master database later. This has a number of drawbacks. The time taken to create the checked-out dataset is often significant (minutes rather than seconds in many cases). Since only a restricted area is checked out, it is hard to run an analysis of how the design affects a broader area of the network. With any reasonably sophisticated data model, it can be very hard to determine exactly what data should be checked out – there are many difficult issues regarding the handling of data that is related to objects that are geographically within the selected area.
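The difficulty of deciding what to check out can be illustrated with a small Python sketch. Everything here is hypothetical (the NetworkObject class, the object names and the rectangle-based selection); it simply shows that a purely geographic extract can contain objects whose related objects were left behind in the master database.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class NetworkObject:
    ident: str
    x: float
    y: float
    related: list = field(default_factory=list)  # ids of related objects


def check_out(master, xmin, ymin, xmax, ymax):
    """Copy the objects that lie inside the rectangle into a separate dataset."""
    return {
        ident: copy.deepcopy(obj)
        for ident, obj in master.items()
        if xmin <= obj.x <= xmax and ymin <= obj.y <= ymax
    }


def dangling_references(extract):
    """Ids that the extract refers to but does not contain."""
    return sorted({
        rel
        for obj in extract.values()
        for rel in obj.related
        if rel not in extract
    })


master = {
    "transformer-1": NetworkObject("transformer-1", 5, 5, ["feeder-1"]),
    "feeder-1": NetworkObject("feeder-1", 50, 5, ["substation-1"]),
    "substation-1": NetworkObject("substation-1", 90, 5),
}

extract = check_out(master, 0, 0, 10, 10)
print(sorted(extract))               # objects inside the selected area
print(dangling_references(extract))  # related objects that were not copied
```

Here the transformer is checked out but the feeder it is related to is not, so the extract either carries a broken reference or has to be widened, and widening raises the same question again for the feeder's own related objects.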

Version Management
Another approach to handling long transactions is to use version management. With this approach it is possible to create different “versions” or “alternatives” of the database. Each alternative is logically equivalent to a replica of the whole database; a user can make changes within an alternative that are not seen by any other users, and the user in the alternative does not see changes made by other users (until she asks to see them). The data is not physically replicated to create an alternative; only the changes relative to the parent version are stored in the alternative. There are significantly different approaches to implementing version management, which will be discussed in the next section.

Changes are propagated between versions in a controlled way. This paper uses the terminology that a “merge” operation propagates changes down from a parent to a child alternative, and a “post” operation propagates changes up from a child to a parent. Note, however, that Oracle’s new version management technology, known as workspace management, uses different terminology: “refresh” for propagating changes down, and “merge” for propagating changes up. In general, version management uses an optimistic approach to concurrency control, and any conflicts are detected and corrected when a merge is done.

It is also possible to have a tree structure of alternatives, so in addition to handling simple long transactions, this approach also provides an elegant mechanism for handling alternative designs. Version management overcomes all the problems with checkout mentioned above: there is no initial retrieval time, no copying of data is required, and the user has access to the whole database at all times.
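The behaviour described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the architecture of any product discussed in this paper: the Alternative class and its methods are hypothetical, each alternative keeps its own changes in a dictionary, and a simple change log in the parent lets a child keep seeing the parent as it was at the child's last merge. The terms merge and post follow this paper's usage.

```python
class Alternative:
    """A version that stores only its changes relative to its parent."""

    def __init__(self, parent=None):
        self.parent = parent
        self.delta = {}  # own changes: key -> value (None marks a delete)
        self.log = []    # (key, previous visible value), oldest first
        self.synced = len(parent.log) if parent else 0

    def child(self):
        return Alternative(self)

    def get(self, key):
        if key in self.delta:
            return self.delta[key]
        if self.parent is None:
            return None
        return self.parent._get_as_of(key, self.synced)

    def _get_as_of(self, key, position):
        # The first change logged after `position` holds the value that
        # was visible at `position`; otherwise the current value applies.
        for changed_key, previous in self.log[position:]:
            if changed_key == key:
                return previous
        return self.get(key)

    def put(self, key, value):
        self.log.append((key, self.get(key)))
        self.delta[key] = value

    def merge(self):
        """Bring parent changes down; return keys changed differently on both sides."""
        parent = self.parent
        changed = {key for key, _ in parent.log[self.synced:]}
        conflicts = sorted(
            key for key in changed & self.delta.keys()
            if parent.get(key) != self.delta[key]
        )
        # Log newly visible parent changes so that this alternative's own
        # children keep their isolated view.
        for key in changed - self.delta.keys():
            self.log.append((key, self.get(key)))
        self.synced = len(parent.log)
        return conflicts  # this alternative's values stay until resolved

    def post(self):
        """Push this alternative's changes up to the parent."""
        if self.synced != len(self.parent.log):
            raise RuntimeError("merge before posting")
        for key, value in self.delta.items():
            self.parent.put(key, value)
        self.delta.clear()
        self.synced = len(self.parent.log)


top = Alternative()
top.put("valve-7", "open")
design_a, design_b = top.child(), top.child()

design_a.put("valve-7", "closed")
design_a.put("main-12", "150mm")
design_a.merge()
design_a.post()

design_b.put("valve-7", "throttled")
print(design_b.get("main-12"))  # not visible until design_b merges
print(design_b.merge())         # keys changed in both top and design_b
print(design_b.get("main-12"))
```

Creating an alternative copies nothing, reads fall through to the parent for unchanged data, and a conflict on valve-7 is reported only when design_b merges, which matches the optimistic approach described above.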

While this technology has been available for 10 years now, only recently has the superiority of this approach been widely acknowledged. It is now accepted as the industry standard approach, with all the major GIS vendors and Oracle announcing support for version management. Recent implementations of version management use a significantly different underlying architecture from the longer-established one, and the differences are discussed in the following sections.

