GISdevelopment.net ---> GITA 1998 ---> Data Distribution

Taking your enterprise AM/FM/GIS on the road

Peter Scheffler
Applications Product Manager
Enghouse Systems Limited
80 Tiverton Court, Suite 800
Markham, Ontario L3R 0G4 Canada


Maturing Computer Users in Your Corporation
The increasing awareness of computing technology and the recent explosion of interest in the Internet and distributed computing have had a positive effect on computer process implementations over the past several years. With the masses learning how to use increasingly powerful and ever smaller computer systems, the information age has begun to move out of the centralized and sterilized environments of the 'computer rooms' and engineering/design groups and into the corporation's user base. For most database types, this has meant the increased use of client/server technology and multi-tiered development environments to meet this growing need.

In the world of AM/FM/GIS, getting the corporation to use the data stored in your Spatial Database¹ has tended to be a double-edged sword.

On the one hand, the data stored within your Spatial Database tends to contain many times the information stored within other database formats, and the inter-relationships and special data types tend to be cumbersome when replicated or referenced from many different locations throughout the enterprise. Still, users wish to be able to 'browse' information stored within the database, refer to digital maps and perform timely queries against this data.

On the other hand, from the AM/FM/GIS Project Manager's perspective, this means that not only does the project get the all-important 'corporate exposure' forever needed to continue receiving funding (especially in today's increasingly competitive environment), but with users performing their own 'data mining,' your staff becomes available to do more data refining and data collection.

¹ Spatial Database is defined as a mixture of graphic information, for determining an item's geographic or schematic location, and meta or attribute data, stored in an enterprise-available database.

What are our options?
With today's ever-advancing technology, there are many different options available to the enterprising AM/FM/GIS project manager to get data out to 'the masses.' These options vary in technical complexity, data concurrency, cost to implement and maintain, and the application requirements on the end-user side. We'll begin by looking at these options in detail.

The matrix below (Table 1) lists, in general terms, the groups of options available with today's technology.

The options listed here are the broad-brush groups available to us as implementers of the system. I discuss the actual implementation of each below.
  1. Disk Image Replication
    What this essentially implies, given today's technology, is the use of CD-ROM and other write-once-read-many (WORM) technology. With the advent of new technology, such as Digital Video Disks (DVD), there may be changes in how these are implemented. But currently these technologies are in their infancy and require some time to be proven.

    Using this media allows for the creation of a series of 'CD-Sets' that mimic the old method of generating plots and microfiche. This technology has several unique and compelling reasons for its use.

    Firstly, the cost is typically the easiest to bear in the initial stages of an implementation. CD-ROM write hardware (a "CD-ROM writer") currently costs in the range of US$500, with the media priced as low as a few dollars per CD. This puts the cost of creating the media on par with the processes that most likely already exist within the records management departments of your organization.

    The second reason for this implementation, especially early in a project, is the 'don't-rock-the-boat' scenario. Most companies will, in their manual process flow, generate a series of drawings and/or paper records on a fairly regular basis. This cycle varies based on the data displayed and by whom it is used. The drafting department or records management groups already produce this information, and the change to 'burning CDs' is a logical extension for both themselves and the data consumers. The important consumers of the information can then be at ease with the process and gain an understanding of what the AM/FM/GIS is actually doing for them.

    The down side of this type of process, however, is more a long-term issue: the number of CDs within your organization can become unwieldy. Quickly, the question of which CD is the latest version becomes a problem, one that is usually resolved through semi-manual processes. The cause is that the data within your Spatial Database is forever changing and evolving, so every issued CD-ROM begins to go out of date. And, since these changes are typically small in comparison to the total volume of information stored therein, you end up re-writing much more information than is really needed. If, for example, one CD-ROM contained several central offices or pressure districts, then when one is outdated, the entire CD-ROM needs to be rewritten.

    The end-user application requirements for this type of replication depend mainly on the technical competence of the users and their business needs. Typically, users have used Viewer technology: small-footprint applications allowing querying, plotting and reporting functionality on a desktop, laptop and/or palmtop PC. With the advent of the CD-ROM replication process, users can then be located outside of the normal local area network (LAN) or wide area network (WAN) environment.

    This leads into the next level of replication, which tends to be on-line access to a Spatial Database server.


  2. Server Based Querying
    At first glance this would appear to be exclusively Internet and Intranet solutions. But actually, this can use either a thin or a thick client¹ architecture. In the previous section on Disk Image Replication, the user transported a complete copy of the data with them wherever they went, either in the office or in the field. With the use of a server, users can request information without needing the entire dataset locally.

    Typically, this is the second tier of an implementation schedule. After users and administrators alike become more familiar with the technology and what it can do for them, people start to become attuned to such buzzwords as "latency" and "data currency." As mentioned earlier, the Disk Image Replication method tends to become unruly once a large implementation begins. This is the point at which users need and request data on a larger scale, and want that data to be more up to date.

    Server Based Querying can, as mentioned, come in many different forms. Most likely, the Records or Drafting Departments maintaining the Spatial Database use this concept themselves. Here, users use the 'Cadillac' of interfaces to input and revise data, typically using client/server mechanisms to retrieve and store data from a local server.

    But, desktop users such as Engineers, Designers and Planners can use systems to request data from the server as well. In these cases, thick clients tend to be the choice. The local interface is based on a subset of the drafting and maintenance tools used on the design seats. With specific and targeted user interfaces, this class of application is deemed the main driver of the corporate Spatial Database.

    Finally, there are the casual users out in distributed sites or connecting via temporary phone lines who need to query the system. These users typically need to request specific information as well, such as the location of a particular switch, valve or customer service connection. They can make use of a variety of thick or thin clients that request information from a server and display the required information on their screen automatically. In some instances, specialized decision support software aids them in choosing the correct response to a situation happening before them in the field.

    The distinct advantage of this solution, in relation to the others, is that it tends to be the least expensive to maintain. The initial implementation costs will tend to be higher than those of the more "primitive" Disk Image Replication methods, which is why most managers will opt for this only after the initial roll-out of a project. After the initial phases, however, this tends to be the more cost-effective method, provided the applications are intuitive and the users are trained properly in their use.

    The data in these implementations is the most current available today. Here users are looking directly into the corporate Spatial Database during their queries. This really shines in high-risk situations. Line staff climbing poles and connecting services need the latest information, and by using a laptop in the truck, or even in the garage before leaving for a job, they can be assured that the data displayed is the latest available. In most instances, the past practice has been circuit plans printed out monthly in books of 11 x 17" paper. This can now be replaced with a computer in a boom truck or in a foreman's office in the service garage.

    The one drawback to this method is its reliance on the network. This is adequate, even desirable, in the office or where reliable and fast telephone lines are available, but in other situations hybrid implementations are optimal.

    ¹ Thick and thin client are terms used to describe the amount of local resources used by an application. The classic thin client is a Web browser with no additional software, used to view simple text on screen. However, in the context of a Spatial thin client, there will be some minor additional software downloaded to the workstation to view the content. A thick client is typically a stand-alone application that runs in its own memory space and uses its own interface.


    Table 1: Replication options available

    Option                  Technical   Data         Relative cost     Cost to          Application
    description             complexity  concurrency  to implement      maintain         requirement
    ----------------------  ----------  -----------  ----------------  ---------------  -----------
    Disk image replication  Low         Low          Low               Moderate         High
    Server based querying   Moderate    High         Moderate          Low              Varies
    Database replication    High        High         Moderate to high  Moderate to low  Minimal

  3. Database Replication
    Over the past several years, the major database vendors have been developing Replication Strategies that allow tabular information to be "automatically" sent to the locations where it is needed. There are several different strategies to consider when looking into Replication, and each has its own unique complexities and benefits.

    In general, database replication is the "nirvana" of data storage. Here, users are freed from the need to load their own CDs, to ensure that a reliable network connection is available, or to check that the data stored locally during the last database load is still valid. This automation comes at the price of implementation complexity, however. The database vendors continue to work towards making this process easier, as do several new niche-market vendors that target this specific need.

    The advantage, however, is that the user can 'seamlessly' receive updates through some combination of a Push and a Pull process. In a push, the server determines when a user needs to receive a refresh of the data that has changed. Typically, this is done via a middleware piece which 'watches' all transactions into and out of the server, logging all downloads and all uploaded changes. When a user connects to the system, the middleware determines whether a refresh is needed (based on any number of criteria, from the importance of the data required to a simple time-based cycle) and sends only the changed data down to the replica.

    A Pull process is one initiated by the user. Here, the user requests a new image of the data, either a complete reload or a subset of the data. This is still a semi-automated process, however, which keeps the learning curve small. Typically, a Pull is used in conjunction with a Push process, which will track changes to the user's replica and automatically update the remote data.

    There are many reasons that the Replication process is the most favorable for the AM/FM/GIS community. The most obvious is that application developers can now develop interfaces to the system which are uniform, regardless of the client machine's location. Also, the training requirements are vastly reduced, as the applications used internally on the corporate local area network (LAN) can be used on the corporate wide area network (WAN) or even over the Internet.

    The Replication strategies tend to be more complex in implementation, although this is changing, but once they are implemented they tend to manage themselves. Today, there are several vendors on the market that offer solutions that meet the needs of the typical tabular databases on the market. Utilizing N-Tiered¹ designs, they offer tools which act as middleware to manage the replication. Even some of the major client/server databases offer this technology directly in their database servers.

    These middleware pieces, whether part of the server package or an add-on, will typically allow client workstations to operate completely disconnected from the database server, with a client 'server' component running on the distributed computer. In some offerings, the client 'server' software is a fraction of the cost and needs a fraction of the computing power that the true server requires. These client applications will typically make use of both the Push and Pull techniques. Here, the user will pull information down to their client workstation, disconnect from the office LAN or WAN, and spend the day working remotely. When the connection is reestablished (now, maybe by dialing a local Internet Service Provider, or ISP), the database client connects to the server and publishes any changes or receives any updates to the local image of the database.

    This agent process can be accomplished without the user even knowing it is happening or even understanding what is going on! This allows your corporate MIS people or your AM/FM/GIS project team to package up information to be sent out to client workstations, and then the automated processes ensure that the users are actually using the latest possible information.

    Finally, some of the middleware packages log all of the transactions that occur. Design teams can then use this log as an auditing tool to learn who is using what data, extrapolate how they're using it (or, heaven forbid, ask them!), and tune the data and applications even further to meet your internal customers' needs.

    ¹ N-Tiered designs are systems that use multiple applications (or tiers) to manage different stages of a process's life cycle. In this case, the middleware component is a tier that can reside anywhere on the network and manages the communication from the client to the server. This allows for distributed computing, even more so than the original two-tiered client/server models.
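
The Push and Pull processes described under Database Replication can be illustrated with a small sketch. This is not any vendor's middleware; it is a minimal in-memory model, and every name in it (`Server`, `Replica`, `changes_since`, the feature key "valve-17") is a hypothetical chosen for illustration. It assumes a simple "last writer wins" rule when the same item is changed in two places, which a real product would likely handle more carefully.

```python
from dataclasses import dataclass, field


@dataclass
class Change:
    version: int   # position in the server's change log
    key: str       # hypothetical feature id, e.g. "valve-17"
    value: dict    # attribute data for that feature
    author: str    # who made the change


@dataclass
class Server:
    data: dict = field(default_factory=dict)
    log: list = field(default_factory=list)     # every change, in order
    audit: list = field(default_factory=list)   # (user, action, item count)

    def apply(self, key, value, author):
        """Record a change in the master copy and in the change log."""
        self.data[key] = value
        self.log.append(Change(len(self.log) + 1, key, value, author))

    def changes_since(self, version, user):
        """Push: hand over only the changes the replica has not yet seen."""
        pending = [c for c in self.log if c.version > version]
        self.audit.append((user, "push", len(pending)))
        return pending

    def snapshot(self, user, keys=None):
        """Pull: a complete reload, or a subset when keys are given."""
        if keys is None:
            wanted = dict(self.data)
        else:
            wanted = {k: self.data[k] for k in keys if k in self.data}
        self.audit.append((user, "pull", len(wanted)))
        return wanted, len(self.log)


@dataclass
class Replica:
    user: str
    data: dict = field(default_factory=dict)
    version: int = 0                             # last log entry seen
    outbox: list = field(default_factory=list)   # edits made while disconnected

    def pull(self, server, keys=None):
        self.data, self.version = server.snapshot(self.user, keys)

    def edit_offline(self, key, value):
        self.data[key] = value
        self.outbox.append((key, value))

    def sync(self, server):
        """On reconnect: publish local edits, then receive pending changes."""
        for key, value in self.outbox:
            server.apply(key, value, self.user)
        self.outbox.clear()
        for change in server.changes_since(self.version, self.user):
            self.data[change.key] = change.value
            self.version = change.version
```

In this sketch a field laptop would call `pull` once in the office, record edits with `edit_offline` during the day, and call `sync` when a connection is available again. The `audit` list stands in for the transaction log that design teams can review to see who is using what data.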
What is your solution?
Well, this becomes the complex question. Somewhat akin to asking the length of a piece of string, how can you best implement one or a combination of these in your enterprise? There are several different factors that need to be weighed in the decision.
  1. LAN/WAN Infrastructure
    Are you a utility that has an abundance of inter-office network bandwidth? In some installations, I've seen customers who have no use for replicating data, because they have such high throughput between their offices that it rivals some intra-office LANs. If this is your configuration, then you likely wouldn't need to have a replica in different offices.


  2. Server Loading
    The idea behind Client/Server technology is that you can have many low total-cost-of-ownership desktop computers connecting to a high performance server. But, all servers eventually top out. Is that at 100 users? 500 users? Or 10,000 users?

    This depends upon the tasks that these clients will be undertaking. Are they all doing conversion of paper maps? If so, the limit is probably fewer than a few hundred per computer.

    In one example of server loading, the enterprise consists of 32 divisional servers, each with its own Spatial Database for the region. Each of these regional servers is local to the clients and serves roughly 100-150 design seats, typically the most task-intensive.


  3. Occasionally Connected Users
    For these users, the replication strategy is a must. They can refer to their local copy of the database when disconnected, yet enjoy a data refresh during off-peak hours.

    Typical of this implementation is the user who arrives home after being on the road all day and connects their PC to the Internet via a local ISP; the replication agent then begins downloading and uploading the required information. Once completed, the agent can disconnect from the ISP unattended and be ready for the next day's selling, maintenance or designs.

    Further, the user could even request new information from the server through several means. Today, the two most common are FTP (file transfer protocol) and email. Through these transports, the user sends their request to the server, and the server responds through some protocol (not necessarily the same one) to update the local Spatial Database replica.


  4. Permanently Disconnected Users
    These are the people who work in extremely remote locations, or in places where it isn't convenient to connect to a network. In these instances, either a replica through database agents or the use of CD-ROM database images would be acceptable.


  5. How recent is recent in your Enterprise?
    Typically, this "recent" concept is a function of the AM/FM/GIS implementation itself. In a recent implementation review, the customer was asked, "How current do you need your data to be, at the engineer's desktop?"

    The answer surprised those in the room. A project team member replied that the current system of paper records, computer aided drafting and mainframe databases tended to be at least six to eight weeks out of date. The concept that the digital world and replication could mean a delay of a day, hours or even minutes was beyond their expectations!

    All of these factors, and more, need to be weighed in your decision as to which to implement. In most cases, the implementation is a hybrid of two or three. There are enterprises that use internal WANs to convey data between offices, yet use this same WAN to write CDs for field staff to use in their disconnected operations.
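
The factors above can be sketched as a rough rule of thumb. The following is only an illustration of how the questions might be combined, not a selection method taken from any product. The function names, the connection labels ("fast_wan", "occasional", "none") and the 30-day CD reissue cycle are all assumptions made for this example.

```python
import math

# Assumed reissue cycle for CD sets, in days. This figure is an
# assumption for illustration, not a recommendation.
CD_CYCLE_DAYS = 30


def choose_option(connection, max_data_age_days):
    """Suggest a replication option for one group of users.

    connection: "fast_wan" (abundant bandwidth between offices),
                "occasional" (occasionally connected users), or
                "none" (permanently disconnected users).
    max_data_age_days: how out of date the data may be for this group.
    """
    if connection == "fast_wan":
        # High throughput between offices: query the server directly.
        return "server based querying"
    if connection == "occasional":
        # Work from a local replica, refresh during off-peak hours.
        return "database replication"
    if connection == "none":
        # CD images suffice only if the group tolerates the reissue cycle.
        if max_data_age_days >= CD_CYCLE_DAYS:
            return "disk image replication"
        return "database replication"
    raise ValueError(f"unknown connection type: {connection!r}")


def servers_needed(seats, seats_per_server):
    """Estimate how many servers a given number of client seats requires."""
    return math.ceil(seats / seats_per_server)
```

Taking the server loading example above at its upper figure of 150 design seats per server, `servers_needed(4800, 150)` gives 32. In practice most enterprises would apply `choose_option` per user group and end up with the hybrid of two or three options described above.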
On the Road Into the Sunset
Packaging up your data and moving it around your enterprise was where you knew you were headed when you decided to initiate the AM/FM/GIS project. Today's technology and applications are making this job easier, faster and cheaper than ever before. As your user base increases, so does the validity of the data in the system. As recently as the late 1980s, these projects were being challenged by the established experts on your network. These people, who'd worked with the network for as long as 20 or 30 years, thought they knew more about the system and could work faster than anyone else. Now, we're seeing these people joining the ranks of the converted, seeing how well the AM/FM/GIS implementations are progressing and adding their own requirements and enhancements. One of my customers tells a story in which six such people had to do valve closure work by hand, simply because they thought the system couldn't do it right. By the seventh or eighth time they did this, however, all of them were using the system themselves to determine which gas valves to close. Now, with the replication strategies I've presented here, they can do these tasks and more directly from their desktops, laptops or home office computer systems. If these people can be converted, then I'm sure that all of your corporation can be persuaded to use the data that you've so carefully collected.