GITA 2001


System Architecture


Using middleware for GIS integration and factors for evaluating technologies


Overview of some middleware technologies
Characteristics common to all middleware technologies provide a context for understanding them. Since communication is the heart of middleware, we discuss communication methods first. We then review how that communication is implemented - whether by a programming language or by other means. Finally, we introduce one possible format for the contents of each communication: XML.
  1. Communication Methods

    Components can communicate through at least one of four means: sockets, remote procedure calls, remote method invocations, and messages. Sockets have been used for decades and are the lowest-level means of connecting one computer to another (Comer & Stevens, 1996). A client and server must encode and decode data through a socket stream, so a great deal of programming effort goes into protocol specification. Socket programming also means dealing explicitly with such complexities as multithreading, deadlock, synchronization, and network problems. A middleware technology is easier to implement to the degree that it hides these details.
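
    To illustrate the encoding and decoding work that raw sockets demand, here is a minimal sketch of one possible protocol: each message is prefixed with its length. The function names, the 4-byte length prefix, and the sample payloads are assumptions made for illustration; they are not taken from any product discussed in this paper.

```python
import socket
import struct


def send_message(sock, payload):
    """Prefix the payload with its length as a 4-byte big-endian integer."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock, count):
    """Read exactly `count` bytes; a stream socket may return fewer per call."""
    chunks = []
    while count > 0:
        chunk = sock.recv(count)
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        chunks.append(chunk)
        count -= len(chunk)
    return b"".join(chunks)


def recv_message(sock):
    """Decode one length-prefixed message from the stream."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)


# socketpair() returns two connected endpoints in one process; they stand in
# for a client and a server that would normally run on different machines.
client, server = socket.socketpair()
with client, server:
    send_message(client, "parcel 1047".encode("utf-8"))
    send_message(client, "valve 22".encode("utf-8"))
    received = [recv_message(server).decode("utf-8") for _ in range(2)]

print(received)
```

    Even this toy protocol needs explicit framing and partial-read handling, and it does not yet address threading, timeouts, or reconnection.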

    Remote procedure calls (RPC) let function A call function B as if B were local, even when it is not. An ideal RPC product would completely hide all socket-level details - effectively "wrapping" the remote calls in functions or methods. Remote method invocations (RMI) are analogous to RPCs but apply to objects (java.sun.com, 1999). An object, which is an instance of a class, has both data and methods (functions that access the data).
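
    The idea of a remote call that reads like a local one can be sketched with the XML-RPC modules in Python's standard library. This is only an illustration of the RPC style, not one of the products compared in this paper; the function `pipe_length` and its sample data are hypothetical, and the server runs on the loopback interface of the same machine.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def pipe_length(segment_lengths):
    """Function B: runs in the server and sums hypothetical segment lengths."""
    return sum(segment_lengths)


# Port 0 asks the operating system for any free port on the loopback interface.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(pipe_length)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# Function A: the proxy makes the remote call read like a local one.
with ServerProxy(f"http://127.0.0.1:{port}") as remote:
    total = remote.pipe_length([120, 80, 45])

server.shutdown()
server.server_close()
print(total)
```

    The caller never touches a socket: argument encoding, transport, and decoding of the result all happen inside the proxy.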

    Message exchange is the last major communication method used by middleware technologies. While RPCs and RMIs focus on calling remote functions or objects, message exchange concerns only data transfer between components. Message-oriented middleware (MOM) works much like electronic mail, using store-and-forward queuing provided by a shrink-wrapped product that runs separately from the components themselves. The sending component "knows" how to package data so that the receiver understands it, and the receiving component "knows" what to do with a message once it arrives. Two MOM products are IBM's MQSeries (MQS) and Microsoft's MSMQ (Microsoft Message Queuing). The theory behind both products is nearly identical, but MQS works on many more platforms than MSMQ (which is for Windows computers only [Lewis, 2000]), so we discuss only MQS.

  2. Programming Languages and Interface Standards

    Various programming languages, and middleware technologies that wrap languages, make good choices for a middleware solution. A few dozen languages advertise themselves as middleware enablers; Java is the dominant one. Distributed object systems such as Microsoft's DCOM (Distributed Component Object Model [Sessions, 1998]) and OMG's CORBA (the Object Management Group's Common Object Request Broker Architecture [Siegel, 2000; corba.org, 2000]) provide standard interfaces for applications to register, discover, and use components*. CORBA stubs "wrap" programs written in other languages with a pseudo-language called IDL (interface definition language), while DCOM provides components with a Microsoft-specific interface definition that allows other consumer programs to use COM objects. Both DCOM and CORBA protect developers from socket details, but neither inherently offers error recovery.
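
    The register, discover, and use cycle can be sketched in a few lines. This is an in-process toy, not CORBA or DCOM: the abstract class plays the role that an interface definition would, and the names `Geocoder`, `TableGeocoder`, and `Registry`, along with the sample address, are hypothetical.

```python
from abc import ABC, abstractmethod


class Geocoder(ABC):
    """Public, contractual interface; stands in for an interface definition."""

    @abstractmethod
    def locate(self, address):
        """Return an (x, y) coordinate pair for a street address."""


class TableGeocoder(Geocoder):
    """Hidden implementation: a lookup table stands in for a real system."""

    def __init__(self):
        self._table = {"12 Main St": (1500.0, 2250.0)}

    def locate(self, address):
        return self._table[address]


class Registry:
    """Toy directory service: components register, consumers discover."""

    def __init__(self):
        self._components = {}

    def register(self, name, component):
        self._components[name] = component

    def discover(self, name):
        return self._components[name]


registry = Registry()
registry.register("geocoder", TableGeocoder())

# The consumer knows only the registered name and the Geocoder interface.
service = registry.discover("geocoder")
position = service.locate("12 Main St")
print(position)
```

    A real distributed object system adds what this sketch leaves out: the registry and the component live in other processes or on other machines, and generated stubs carry each call across the network.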

  3. Extensible Markup Language (XML)

    The Standard Generalized Markup Language (SGML) and its derivatives - such as the Extensible Markup Language (XML) and the Hypertext Markup Language (HTML) - provide a means for different computers to understand, parse, or format text streams based on standardized "tags" that are embedded in the text. XML is particularly notable for its flexibility: developers can create their own tags and their own validity rules, on top of the well-formedness rules that every XML document must follow (Walsh, 1998). XML is discussed below because of its extensibility and its usefulness for providing self-describing data to components regardless of the other middleware technologies being used.
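
    As a small illustration of self-describing data, the sketch below parses an XML message whose tags name each value it carries. The `workOrder` vocabulary and its values are invented for this example; any pair of components could agree on a different set of tags.

```python
import xml.etree.ElementTree as ET

# A hypothetical message; the tag and attribute names are made up here.
document = """<workOrder id="WO-17">
  <asset type="valve">V-22</asset>
  <location x="1500.0" y="2250.0"/>
  <status>open</status>
</workOrder>"""

root = ET.fromstring(document)

# The receiver finds each value by name rather than by position in the stream.
order = {
    "id": root.get("id"),
    "asset": root.findtext("asset"),
    "asset_type": root.find("asset").get("type"),
    "x": float(root.find("location").get("x")),
    "y": float(root.find("location").get("y")),
    "status": root.findtext("status"),
}
print(order)
```

    Because values are located by tag name, a sender can add new elements without breaking a receiver that does not look for them.
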
Middleware factors to consider
The communication methods, implementation options, and data format summarized above are realized in many different middleware technologies. These technologies vary with respect to several important factors that managers and developers must consider when choosing among them. For each factor we first give its definition and scope, then briefly discuss how example middleware technologies rank against one another on that factor.

* One definition of a component: a unit of software with a public, contractual interface and a hidden implementation. Components are often incapable of doing useful work by themselves, in which case they are relied upon by "master" applications that use components' services to do their job. However, components can also be larger grained, complete applications or systems.
  1. Performance

    The fastest programs have no communication with remote systems and are already compiled into machine code native to the computer's platform. In contrast, a completely abstract middleware solution may run on many different computers in different nations on different platforms. Nothing may be precompiled, every call may incur long network latency, and locating applications requires lengthy run-time lookups (of host IP addresses, interface details, etc.). Achieving a balance between these extremes - local and tightly bound components vs. remote and loosely bound ones - is the key to performance and to several other factors.

    RPC-based solutions have the best performance because they are closest to the fast model described above. CORBA follows closely, since components can be natively compiled; only communications between components require abstractions. (Each component with public interfaces has a corresponding stub, written in IDL, that tells other components how to interact with it. Locating the component itself also requires a lookup function, i.e., a directory service. Both of these capabilities require run-time work and communications between systems.)

    DCOM and Java are slower, all things being equal, than CORBA or RPCs. DCOM wraps functionality in separate modules (DLLs), and Java has an entire virtual machine operating between the code and the machine's native instructions, which slows execution of all code. However, implementation decisions have a dramatic effect on speed, so this ranking will vary. Message-passing systems such as MQS are selected for asynchronicity rather than speed, though they may perform very fast given favorable network and CPU configurations and load.
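
    The asynchronicity that motivates message passing can be sketched with an in-process queue. This is not the MQS programming interface: `queue.Queue` merely stands in for a separately running queue manager, and the sensor readings are invented. The point is that the sender finishes before any receiver exists.

```python
import queue
import threading

# Stands in for the store-and-forward queue a MOM product would provide.
broker = queue.Queue()


def sender():
    """Enqueue messages; each put() returns without waiting for a receiver."""
    for reading in ("pressure=61", "pressure=64", "pressure=58"):
        broker.put(reading)
    broker.put(None)  # sentinel telling the receiver there is nothing more


received = []


def receiver():
    """Drain the queue until the sentinel arrives."""
    while True:
        message = broker.get()
        if message is None:
            break
        received.append(message)


sender()                 # every message is stored before a receiver starts
queued = broker.qsize()  # three readings plus the sentinel

worker = threading.Thread(target=receiver)
worker.start()
worker.join()
print(queued, received)
```

    With an RPC the caller would have blocked until the remote side answered; here the two sides never need to be running at the same moment.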

  2. Platforms

    The types of computers used by a company are typically predetermined by existing equipment or dictated by budgetary constraints. Most middleware technologies distinguish among only three basic platforms: Windows, Unix, and IBM mainframe. Java provides a "write once, test everywhere, run most places" paradigm. Since Java compiles into bytecode that is then interpreted by each machine's native Java Virtual Machine, the same bytecode is (theoretically) executable without change on any supported computer, which includes Unix and Windows computers. CORBA enables component interactions between nearly any systems, since its IDL abstracts the interface from the underlying implementation regardless of language, operating system, or hardware. MQS supports over 35 different platforms, including mainframe systems (unlike Java), and so provides a key means of data marshalling with many legacy systems. DCOM runs specifically on Microsoft operating systems.

  3. Development Ease

    Most middleware products provide their own APIs. Development ease is determined by the granularity and availability of API functions and by the number of additional, non-API layers required to implement the technology.

    Pure RPC is the most difficult, as it involves direct coding of all details (often in C), with no APIs provided. The Distributed Computing Environment (DCE) standards specify a common infrastructure for the development of distributed systems. DCE is a layer above RPC that simplifies RPC work. Neither RPC nor DCE is a product that can be purchased; each is merely a description of the type of work done by developers (low-level for DCE, and extremely low-level for RPC).

    CORBA is somewhat less difficult than either: it allows developers to tie together programs that are already written in many different languages. But implementing CORBA requires additional work to support added abstractions. These abstractions include the IDL and directory services (so components can find each other), among others. DCOM may be easier than CORBA, since its components and interfaces are more tightly integrated (owing to its diminished cross-platform capabilities). Java RMI may be the easiest, as it provides extensive API support for virtually all common middleware functionality, and the virtual machine concept enables true platform independence.