GITA 2001


System Architecture


High availability GIS: Beyond the application to the operating environment


High Availability enables recovery from server failure
While recovery from disk failure and network failure must be an integral part of any High Availability solution, this paper focuses specifically on recovery from server failure. Consider the impact of losing a GIS application server or data server: every client accessing the spatial applications and files on that server is affected, and all users’ activities are suspended until the server problem is diagnosed and resolved. High Availability solutions are designed for computing installations that require critical applications to be restarted automatically and seamlessly in the event of server failure, ensuring that data remains accessible and that applications keep running, even during a prolonged server failure in the second or third tier of a client/server implementation.

When considering a High Availability solution, one often thinks of expensive and custom hardware, and sophisticated, costly application design. But the more practical and robust High Availability GIS implementation is a combination of commercial off-the-shelf (COTS) hardware and software, and multiple instances of hardware (redundant standard servers, storage, and network adapters) and shared data (RAID). Such a combination enables recovery from failure of any critical component.

Consider the following analogy: A farmer has a sled that must be pulled across the country, and can choose between using one horse or several dogs to do the pulling. Assuming either choice would provide sufficient power to do the job, is either preferable? If one of the farmer’s requirements is that the sled always keep moving, then the “multiple dog” choice is better. If one dog becomes disabled, the remaining ones can work a bit harder, even though the sled may move a bit more slowly until the one dog recovers or is replaced. On the other hand, if the one horse becomes disabled, the sled will go nowhere for the duration. One less than the optimum number of dogs is better than no horse!

When configuring a computing environment to run an enterprise’s set of applications, a similar decision must be made regarding appropriate systems: one “large” one, or multiple “smaller” ones. While there are valid reasons for both approaches, a High Availability environment is strongly biased toward the “multiple smaller server” approach. Even when one smaller system fails, the application environment is still available to the users as long as the surviving servers cooperate and take on the work of the failed one.

Indeed, the multiple servers can work both independently and cooperatively. Under normal circumstances, each runs its own set of applications or serves its own set of users; when one server fails, the survivors take on the workload of their failed counterpart. When a failure occurs, the software component of a High Availability solution detects it and automatically imposes the cooperation: the applications that were running on the failed server are started up on (“failed over” to) the surviving servers, and the users connected to those applications are switched over as well.
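
The failover behavior described above can be illustrated with a minimal sketch. It is not the mechanism of any particular clustering product: the `Node` class, the `fail_over` function, the node and application names, and the "fewest applications first" placement rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One server in the cluster (hypothetical model, not a vendor API)."""
    name: str
    apps: list = field(default_factory=list)
    up: bool = True


def fail_over(nodes, failed_name):
    """Mark a node as failed and restart its applications on the survivors.

    Each application goes to whichever surviving node currently runs the
    fewest applications (an assumed placement rule). Returns a mapping of
    application name -> name of its new host.
    """
    failed = next(n for n in nodes if n.name == failed_name)
    failed.up = False
    survivors = [n for n in nodes if n.up]
    if not survivors:
        raise RuntimeError("no surviving node to fail over to")

    moved = {}
    for app in failed.apps:
        target = min(survivors, key=lambda n: len(n.apps))
        target.apps.append(app)
        moved[app] = target.name
    failed.apps = []
    return moved


# Under normal circumstances each node runs its own (hypothetical) workload.
cluster = [
    Node("node-a", ["gis-app"]),
    Node("node-b", ["billing"]),
    Node("node-c", ["work-mgmt"]),
]
moved = fail_over(cluster, "node-a")
print(moved)
```

In this sketch the nodes run independent workloads until `fail_over` is called, at which point the GIS application is restarted on a survivor. A real cluster would also have to redirect the connected clients, which the sketch does not model.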

The multiplicity of servers, and the software that enables the cooperation among them, is known as clustering, and should be implemented at both the middle and back tiers of a three-tier client/server configuration for complete High Availability. Rather than implement a mission-critical GIS on a hardware platform that incurred the engineering cost of fault-tolerant design, an organization can much more economically configure its GIS on a cluster of two (or more) standard, inexpensive and cooperating servers. In the event of failure on one node, the surviving nodes can take on the workload of the failed system until the problem is diagnosed and resolved.

Since the nodes of a cluster are typically configured to run independent workloads under normal circumstances, the nodes are not merely redundant. Thus an additional benefit of clustering, besides providing the basis of an affordable High Availability solution, is scalability: an organization can add members to a cluster over time to run the enterprise’s various tasks and meet its computing needs. (A farmer can add more dogs to pull a heavier sled, or to pull the same sled faster, or temporarily to pull the same sled up a steep hill!)

The benefits of scaling also compound: the more nodes in a cluster, the better the performance of the cluster after a system failure. When one system fails in a two-node configuration, the survivor must take on, in addition to its own normal workload, 100% of the load of those applications on the failed server that have been set up as highly available. In a four-node configuration, the three survivors can share the additional load of the highly available applications from the failed node, each taking on one-third of it. Thus as the computing environment scales, the burden of cooperation is shared, and the performance degradation of any survivor during the outage of the failed server is minimized. Indeed, the additional burden on each survivor is inversely proportional to the number of surviving servers: one failed node in a cluster of N servers leaves N - 1 survivors, each taking on 1/(N - 1) of the failed node’s highly available load. (The more dogs pulling the sled, the less the absence of one will be noticed!)
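
The two-node and four-node figures above follow from simple arithmetic, sketched here under two assumptions: exactly one node fails, and the survivors split its highly available load evenly. The function name `extra_load_per_survivor` is hypothetical.

```python
from fractions import Fraction


def extra_load_per_survivor(nodes_in_cluster):
    """Share of the failed node's highly available load per surviving node.

    Assumes exactly one node fails and the survivors split its load evenly,
    so a cluster of N nodes leaves N - 1 survivors taking 1/(N - 1) each.
    """
    if nodes_in_cluster < 2:
        raise ValueError("failover needs at least two nodes")
    return Fraction(1, nodes_in_cluster - 1)


for n in (2, 3, 4, 8):
    print(n, "nodes:", extra_load_per_survivor(n))
```

For two nodes the result is 1 (the survivor absorbs the entire load), and for four nodes it is 1/3, matching the examples in the text.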

A key component in the design of High Availability clustering is its ability to detect individual system failure, and automatically and seamlessly start up the affected applications on surviving nodes. If the GIS server fails, the GIS application can be automatically started up on a cooperating server, and client processes transparently switched over with it. The simplest High Availability configuration requires two clustered servers; to be sure, any discussion pertaining to a two-node cluster also applies to clusters of more than two servers. Under normal circumstances, one server within a cluster could run the GIS application (and possibly additional applications as well), and other servers could run other unrelated applications. Alternatively, both servers could run the GIS application, with each instance serving a subset, or partition, of the GIS data, or a subset of the clients.
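
The paper does not say how a cluster detects that a server has failed. One common approach, assumed here purely for illustration, is a heartbeat timeout: each node reports periodically, and a node that stays silent for too long is treated as failed. The function name, node names, timestamps, and timeout value below are all hypothetical.

```python
def find_failed_nodes(last_heartbeat, now, timeout):
    """Return the names of nodes whose last heartbeat is older than `timeout`.

    last_heartbeat maps node name -> time (in seconds) at which the most
    recent heartbeat from that node was received.
    """
    return sorted(
        name for name, seen in last_heartbeat.items() if now - seen > timeout
    )


# Hypothetical timestamps in seconds: node-b last reported at t=100.
last_heartbeat = {"node-a": 118.0, "node-b": 100.0}
failed = find_failed_nodes(last_heartbeat, now=120.0, timeout=10.0)
print(failed)  # ['node-b']
```

Once a node appears in the returned list, the clustering software would start the affected applications on a surviving node. Choosing the timeout is a trade-off: too short and a briefly busy server is declared dead, too long and users wait before failover begins.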

Where is the High Availability?
High Availability can either be built in at the operating environment level or be engineered by the application software vendor into the GIS itself. High Availability as an integral component of the operating environment benefits both the user (most applications running on the cluster can easily take advantage of all the High Availability features) and the application software vendor (who avoids the incremental cost and complexity of engineering High Availability into the application). GIS developers can focus on their core expertise, developing GIS solutions, and need not be concerned with the details of detecting system failure or of implementing seamless and transparent failover among cooperating servers.

While UNIX systems provide a solid foundation, meeting the availability requirements of mission-critical applications demands a more comprehensive and dependable clustering solution. And to be truly affordable, such a solution must require no unique system configurations, specialized operating system variants, or proprietary storage components: the clustering environment must use the same standard server platforms, operating environment, storage architecture, fibre channel, disk controllers, and network adapters as other systems.

One model for High Availability designed into the operating environment is Compaq Computer Corporation’s Tru64™ UNIX operating system and TruCluster™ Available Server, which provides a complete and robust High Availability environment on the Compaq AlphaServer™ platform.
