Scalability - The forgotten dimension
Dan Lemkow
Director, Project Management
VISION* Solutions
50 O'Connor Street, Suite 501
Ottawa, ON K1P 6L2, Canada
PH: (613) 236-9734, ext. 1694, FAX: (613) 567-5433
E-Mail: dan@gis.shl.com

Introduction

Ignored in system design, then overlooked in operation, scalability is often not appreciated in an online transaction processing (OLTP) environment until the system becomes overloaded, unstable, and inefficient. In hindsight, system managers wish they had designed for scalability, which is the ability to add users without adversely affecting system performance for any one user. Performance is a measure of the user's ability to complete tasks in a timely manner.

Why is scalability so often forgotten? Sometimes it's because systems start off with a few users and fail to anticipate the effects of growth. Also, scalability is difficult to measure and predict: many administrators simply buy a big server with lots of capacity, and assume that they can expand it if necessary. Unfortunately, the combination of a huge database with many distributed users and multiple concurrent transactions creates a complex, dynamic structure that may not respond predictably to changes. When the system gets overloaded, it is not always feasible to simply acquire faster networks, more servers, and more system administrators to cope with recoveries.

The first step along the path of successfully managing this challenge is to understand the factors that affect scalability. This paper identifies these factors and proposes some methods of building scalability into a system as a permanent feature.

System Administration

In a perfect world, systems would never go down, software would be defect-free, and databases would never be corrupted. Reality is quite different. These risks must be managed. Increasing the number of users increases the significance and complexity of coping with such problems. For a system to be scalable, the system administration must be responsive.
A system will not be scalable if it requires an army of specialized database administrators, if it is not operational when backups are being done, or if it cannot guarantee transaction integrity. Consider the typical GIS technology that uses a hybrid data store: spatial data in a proprietary database and non-spatial data in a commercial database. If a system failure occurs during a transaction that involves both spatial and non-spatial data, will the database be left in an incomplete state? The system administration solution is to fully leverage commercial databases. Database vendors specialize in tackling this problem by
Data Access

Business applications that display spatial information require access to large amounts of data. This is due to the unique nature of spatial data: multiple coordinates are needed to describe a single object, and many such objects are required on screen to provide adequate context (Figure 1). In relational database terms, each object is a row in a primary table; coordinate data and related non-spatial data are multiple rows in multiple indirect tables, which are related using foreign keys (Figure 2). The non-spatial data is typically used to dynamically drive the rendering. For example, the type of cable might determine the color, the terminal types might control the symbols used at each end of the displayed cable, and the assignment of strands could be labeled midway along the cable (Figure 2).

Contrast this with the typical administrative application (Figure 3), in which each screen of information consists of perhaps fifty pieces of data that collectively represent one primary row and perhaps several rows from a couple of related tables.

The need for large amounts of data creates two key bottlenecks: access load on the database server and transmission across the network. Because so much data is required for displaying spatial information, graphics caching becomes a very effective solution. In this approach, a distinction is made between a database object and its displayed appearance. An object such as a cable, for example, would be rendered using eight graphic primitives: two symbols, one line joining the symbols, and five pieces of displayed text (Figure 2). The display engine records these primitives in a local display list structure. Each primitive is tagged with the same key that uniquely identifies the cable object in the database. In this way, all the primitives can be treated as a unit. When the user clicks any one of them, the display engine 'tells' the application the object identifier.
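The display list idea can be illustrated with a minimal Python sketch. It is not the implementation of any particular product: the names Primitive, DisplayList, add_object, and pick, the bounding-box hit test, and the cable key 42 are all assumptions made for illustration. It shows only the point made above, that every primitive carries the database key of its object, so a click on any primitive resolves to that key.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Primitive:
    """One drawable item; object_id is the database key of the owning object."""
    kind: str        # "symbol", "line", or "text" (hypothetical labels)
    x: float         # lower-left corner of the bounding box
    y: float
    width: float
    height: float
    object_id: int

    def contains(self, px: float, py: float) -> bool:
        # Simple bounding-box hit test (an assumption; real engines vary).
        return (self.x <= px <= self.x + self.width
                and self.y <= py <= self.y + self.height)


@dataclass
class DisplayList:
    """Local list of primitives, each tagged with its object's key."""
    primitives: List[Primitive] = field(default_factory=list)

    def add_object(self, object_id: int,
                   shapes: List[Tuple[str, float, float, float, float]]) -> None:
        # Tag every primitive of the object with the same key.
        for kind, x, y, w, h in shapes:
            self.primitives.append(Primitive(kind, x, y, w, h, object_id))

    def pick(self, px: float, py: float) -> Optional[int]:
        # Search the most recently drawn primitives first.
        for prim in reversed(self.primitives):
            if prim.contains(px, py):
                return prim.object_id
        return None

    def primitives_for(self, object_id: int) -> List[Primitive]:
        # All primitives of one object, so they can be treated as a unit.
        return [p for p in self.primitives if p.object_id == object_id]


# A cable drawn as two symbols, one joining line, and five text labels.
cable_shapes = [("symbol", 0, 0, 10, 10),
                ("symbol", 100, 0, 10, 10),
                ("line", 10, 4, 90, 2)]
cable_shapes += [("text", 20 + 15 * i, 12, 12, 6) for i in range(5)]

display = DisplayList()
display.add_object(42, cable_shapes)          # 42 is a made-up cable key
selected_object_id = display.pick(105, 5)     # click on the right-hand symbol
```

In this sketch the click never touches the database; the application receives only the key and decides for itself whether it needs to fetch attributes.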
The application can obtain the attributes for the cable object by going back to the database with the following request:

    SELECT * FROM CABLE WHERE CABLE_ID=&selected_object_id;

The basis for the caching is the storage of this display list of graphics primitives in a file that can be retrieved and displayed later without going back to the database. This approach is viable as long as
An application's architecture must recognize the finite capacity of both the network and the server. The underlying software products must support task decomposition, which includes both processing and data access. Often called multi-tiered architecture, partitioning allows the processing load to be distributed among multiple platforms. The most common approach involves three tiers: database access, application logic, and user interface. The evolution to distributed objects is removing any fixed limit on the number of tiers. The sharing of processing load between client and server is no longer dictated by the hardware configuration, but by current application needs and architecture. In techno-jargon, we could say that clients and servers are not permanently 'fat' or 'thin,' but can change their profile according to job requirements. For example, if geographic survey data is used only for background reference, it could be most efficient to store it on a networked drive on each regional LAN rather than continually sending it out across the WAN from a central database.

An excellent example is British Telecom's Plant and Records Modernization (BT-PRM) project. Most utility solutions must display large amounts of data as graphics, but always having to 'draw from the database' precludes scalability. In the case of BT-PRM, the data for the entire country is held in one central database. Engineers located in the various regional offices need timely online access to the data. The infrastructure is a national WAN connected to regional office LANs. Scalability is achieved through a combination of application partitioning and display cache management (Figure 5). At any time, hundreds of users can be working with the data, and several users can be working on data for the same geographic area. The central server does not attempt to notify every user immediately about every change in the national data.
Instead, it keeps track of who is working where, and notifies users only of changes to data in their current area of operations. This common-sense 'need to know' strategy saves CPU time and makes such a large-scale solution feasible. The solution also avoids the messy and expensive administration task of trying to synchronize distributed databases that anyone can change.

Robust Data Modeling

The system must provide data modeling capabilities that allow the enterprise to be described without compromise. Forced fits lead to complex application code, extra traffic, and risky maintenance of system integrity. If flexibility to try different approaches is built in, database access can be optimized. The following examples highlight this requirement:
An essential component of scalable systems is flexible system architecture. As users are added, the sensitive points in the system change. It must be possible to analyze this dynamic information and perform suitable load balancing. For example, multiple servers provide an effective means of scaling systems. Coupled with three-tier application partitioning, they allow large organizations to adapt to support growing user communities.

The workflow of an organization follows very specific patterns. For example, at the beginning of the work day, hundreds of employees may need to start the same application. This burst of activity may saturate the network. Should one size the network to handle this peak? Software can simulate system loading, and such simulation is a reasonable starting point for assessing orders of magnitude. However, the variability of staff work habits makes precise prediction difficult.

In practice, systems are usually expanded in stages. Not all of the projected user community will be online from day one. The reality, therefore, is that as users are added, system activity rises through a series of plateaus. The key flexibility provided by an open, partitionable environment can resolve many capacity problems and respond to changing conditions. A commercial database, open communication standards, and distributed processing are all key in tuning and adapting to an ever-changing environment.

Conclusion

Because information systems have become essential to the work processes of many organizations, they must live by the rules of efficiency in both financial and operational terms. As we have discussed, systems must be scalable to meet these criteria. We have identified the key strategies for achieving scalability as follows: