
Spatial ETL Tools: The Bottom Line of an Enterprise GIS


S.Raghavendran
Product Manager – GIS
PIXEL SOFTEK PVT. LTD.
318 III Floor Plaza Centre,
129 GN Chetty Road,
Chennai 600 006, India
srg_gis@yahoo.com, raghavendran@pixelsoftek.com

PRELUDE
Enterprise GIS is the catchword of the GIS industry today.
In recent years, many large organizations that use GIS, or are in the process of deploying it, have moved from independent, stand-alone desktop GIS systems to more integrated approaches that share resources and applications: Enterprise GIS. The basic idea of an Enterprise GIS is to address the needs of user departments collectively instead of individually. Developing one comprehensive system minimizes potential conflicts, which can result in significant cost savings and performance improvements for the organization. An Enterprise GIS is a mix of tightly and loosely coupled systems. Individual departments can have tightly coupled systems, software and hardware, but across departments, field units and public interfaces the coupling has to be loose so that users are free to choose their own systems. It is an integration of spatial data, non-spatial data and technology across the organization, coupling centralized management with the freedom of decentralized use.

Can you imagine minding mice at a crossroads, the Irish equivalent of trying to herd cats? Just as cats have minds of their own, so do most GIS users in an organization when it comes to Enterprise GIS. Combining technology, data and processes is the key to any Enterprise GIS: easily underestimated, and more easily stated than implemented and realized! The challenges begin the moment an organization starts conceptualizing an Enterprise GIS for inter- or intra-departmental use, and many of them are significant enough to bring the effort to a grinding halt! There are, however, different schools of thought as to what the bottom line of an Enterprise GIS is. For some, it is the end user adopting the Enterprise GIS; for others, it is combining the technology and processes of various departments; and for many, it is combining the geospatial (spatial and non-spatial) data held in multifarious formats by various departments, which is GIS interoperability: the subject matter of discussion here.

OBJECTIVES
In an Enterprise GIS situation, various schools of thought are practiced to tackle GIS interoperability, each with its respective merits and demerits. To name a few popular approaches: the Common Format, the Central Database, the Software Interfaces and the Multi-Format Direct Read/Translate/Transform approach (Swiss Army Knife).

The objectives of this paper are:
  • To discuss how GIS interoperability could be a major roadblock to an Enterprise GIS.
  • To discuss brands of GIS interoperability, their merits and demerits.
  • To conclude how GIS interoperability could be achieved using a Spatial ETL tool like FME® through a Multi-Format Direct Read/Translate/Transform approach.

WHAT IS GIS INTEROPERABILITY?
GIS interoperability simply means the ability to integrate or exchange information between different components of a geospatial solution, even though the components may have been developed by different organizations or departments on different GIS platforms: a situation typical of an Enterprise GIS. The components here would typically be CAD/GIS files, spatial/attribute databases and other related electronic documents from different business intelligence systems or operations and from multiple organizations. This could be as simple as plain GIS format A to GIS format B interoperability, or as complicated as a data model transformation into the structure the user or the application needs. In other words, true GIS interoperability means having the right tools and technologies to exchange or transform geospatial data into a structure in which the end user, Enterprise GIS application or solution can take full advantage of the capabilities of the enterprise geospatial solution being implemented or used.

SPATIAL ETL TOOLS
ETL tools are "information pipes" that connect two systems. They provide an economical mechanism for replacing disjointed databases with windows that are open to all business-related information and operations, by quickly combining information from very large data repositories. This seamless interconnectivity gives organizations the opportunity to improve productivity, better leverage the information they already have and, ultimately, make more informed business decisions. Bottom line: ETL tools help companies extract dollars from data.

Spatial ETL, a term coined by Safe Software, Canada, simply means ETL for spatial data. Spatial ETL is to ETL what GIS is to MIS. Spatial ETL tools are Extract, Transform and Load (ETL) tools that can also read, write and manipulate spatial data. The Extract function reads data from a specified source data store, extracting the desired data. The Transform function processes the acquired data, transforming it and perhaps even combining it with other data to package it into the correct structure for the destination data store. Finally, the Load function writes the resulting data to a target data store. While an ETL tool must have processing capabilities for the various column types found in a non-spatial database or system, a spatial ETL tool must also have spatial operations (geoprocessing capabilities) for changing the structure, projection, format, style or representation of spatial data as requested, in order to move data from one spatial database or GIS to another. Again, the resulting data could be for temporary use or part of a permanent data migration/translation project.
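
The three functions described above can be sketched in a few lines of Python. This is only an illustrative sketch using the standard library, not FME® or any vendor's API: the function names, the CSV column names (name, lon, lat) and the sample rows are all assumptions made for the example. The Transform step combines an attribute change (renaming and tidying a column) with a spatial operation (reprojecting WGS 84 longitude/latitude to spherical Web Mercator metres).

```python
import csv
import io
import math

EARTH_RADIUS_M = 6378137.0  # WGS 84 semi-major axis used by spherical Web Mercator


def extract(csv_text):
    """Extract: read the desired rows from the source store (a CSV string here)."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def to_web_mercator(lon_deg, lat_deg):
    """Reproject WGS 84 longitude/latitude in degrees to Web Mercator metres."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


def transform(rows):
    """Transform: rename attributes, tidy values and reproject the geometry."""
    records = []
    for row in rows:
        x, y = to_web_mercator(float(row["lon"]), float(row["lat"]))
        records.append({"asset_name": row["name"].strip().title(), "x": x, "y": y})
    return records


def load(records):
    """Load: write the records to the target store (CSV text with a WKT column)."""
    target = io.StringIO()
    writer = csv.writer(target)
    writer.writerow(["asset_name", "wkt"])
    for rec in records:
        writer.writerow([rec["asset_name"], f"POINT ({rec['x']:.2f} {rec['y']:.2f})"])
    return target.getvalue()


# Hypothetical source data: asset names with WGS 84 coordinates.
SOURCE = "name,lon,lat\nchennai depot,80.27,13.08\nfield office,77.59,12.97\n"

if __name__ == "__main__":
    print(load(transform(extract(SOURCE))))
```

A plain (non-spatial) ETL tool would stop at the attribute handling in transform; the reprojection is the kind of geoprocessing step that makes the pipeline "spatial".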

ENTERPRISE GIS: THE BOTTLENECKS
Often Enterprise GIS users employ data translators to migrate or translate one particular data store to another. Like translators, Spatial ETL tools restructure specific datasets or an entire data store into another data store. However, unlike traditional data translators, which are typically used when the source system is being abandoned, Spatial ETL systems can create a mirror of the data on both systems, allowing organizations to use both systems during the migration process. This feature is particularly relevant for organizations with legacy systems, which are quite often encountered when implementing an Enterprise GIS.

As it is rightly said, “one size never fits all”. Blame it on the technology or blame it on the file formats: even when geospatial data are readily available, they come in a multitude of file formats. Many a time this is a very serious bottleneck in getting an Enterprise GIS tested or running successfully. Common industry CAD/GIS data translators attack the problem at the level of format syntax, unlike Spatial ETL tools, which take a semantic approach to the problem of GIS interoperability. By virtue of this semantic approach, the typical bottlenecks to GIS interoperability in an Enterprise GIS, such as handling multi-vendor solutions, multi-format geospatial datasets, legacy systems and data replication between disparate systems, are addressed.

BRANDS OF GIS INTEROPERABILITY
Every GIS file format has its own advantages and limitations. With a multitude of GIS formats and multi-vendor solutions in use for years, GIS users and organizations across the globe continue to maintain data locked up in some proprietary format: a key impediment to a successful Enterprise GIS.

Achieving true GIS interoperability means having:
  • A system or set of tools where the geospatial data is interoperable
  • Where there is no need for a data format conversion, or
  • A system that can accept data in all the available formats, with the capability to handle future data requirements

This is nothing less than a Herculean task. There are several approaches, or brands, of GIS interoperability. What follows is an honest attempt to discuss the pros and cons of the four major brands listed below, and why the Spatial ETL approach is a good one:

  • Brand 1: The Common Format
  • Brand 2: The Central Database
  • Brand 3: The Software Interfaces
  • Brand 4: The Multi-Format Direct Read/Translate/Transform (Swiss Army Knife)

THE COMMON FORMAT
In this approach to GIS interoperability, the premise is to use a common format for data interchange between the various components of the Enterprise GIS. Industry practices include, but are not limited to, formats like SDTS, SAIF, DIGEST and GML. Once a common format for data interchange has been decided upon (itself an act of herding cats), translators are built between that format and the other systems for data exchange between components of an Enterprise GIS. To be honest, although a good number of data interchange formats prevail in the industry, including OGC ones like GML, none has achieved remarkable success. Some key issues of this approach include:

  • Common format becomes very complex
  • Interoperating requires an extra stop over or staging
  • Very invasive – workflow changes needed
  • Who builds the translators?
  • Historically not very successful
  • Profiles that restrict the use of complex format features are required, which reduces interoperability between Enterprise GIS systems

THE CENTRAL DATABASE
This approach to GIS interoperability advocates getting every system that is part of the Enterprise GIS to use a common (spatial) database, such as, but not limited to, Oracle® Spatial, IBM® DB2® or Microsoft SQL Server®; then loading all data into the database; and finally using applications that directly read and edit the database. This approach is similar to the common format, except that the format is a database capable of handling spatial as well as non-spatial data. Some key issues in this brand of GIS interoperability include:

  • Very invasive – workflow changes needed
  • Diverse applications may have difficulty truly operating off the same database and data models
  • Database data model may not match each application’s needs
  • Requires the entire organization to agree on and adopt the same technology: is this not herding cats?

THE SOFTWARE INTERFACES
This approach achieves GIS interoperability in an Enterprise GIS situation by getting all software to communicate via well-known interfaces, typically agreed through a consensus process such as that of OGC or ISO; examples include SFCOM (Simple Features for COM), CORBA, WFS, WMS and WCS. Building software interfaces agreed by consensus is more easily said than done, and comes with the following challenges:

  • Requires all new software (by definition)
  • Very invasive – workflow changes needed
  • Specifications often are loose and open to interpretation
  • Things simply end up not working together
  • Building software interfaces to legacy systems could be quite challenging or impossible

MULTI-FORMAT DIRECT READ/TRANSLATE/TRANSFORM (SWISS ARMY KNIFE): THE SPATIAL ETL TOOL
In the Swiss Army Knife approach to true GIS interoperability in an Enterprise GIS, the idea is to attack the format and data model problem head on by providing tools and a format-neutral interface to deal with constantly changing formats. This is nothing but natively reading from, or writing to, the formats that constitute the geospatial data store of an Enterprise GIS solution. Rather than abandoning a system once data extraction is complete, Spatial ETL systems create a mirror of the data on both systems during the migration process, allowing organizations to use both new and legacy systems.
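
One way to picture a format-neutral interface is as a registry of readers and writers that meet at a neutral in-memory record, so that supporting a new format means adding one reader and/or one writer rather than a translator for every pair of formats. The sketch below is a hypothetical illustration in Python: the format names ("xy-csv", "json-points", "wkt"), the record layout and every function name are assumptions for the example and do not describe how FME® is implemented.

```python
import json

READERS = {}  # format name -> function(text) returning neutral records
WRITERS = {}  # format name -> function(records) returning text


def reader(fmt):
    """Decorator that registers a reader for the given format name."""
    def register(func):
        READERS[fmt] = func
        return func
    return register


def writer(fmt):
    """Decorator that registers a writer for the given format name."""
    def register(func):
        WRITERS[fmt] = func
        return func
    return register


@reader("xy-csv")
def read_xy_csv(text):
    """Read 'name,x,y' lines into neutral point records."""
    records = []
    for line in text.strip().splitlines():
        name, x, y = line.split(",")
        records.append({"name": name, "x": float(x), "y": float(y)})
    return records


@reader("json-points")
def read_json_points(text):
    """Read a JSON array of {name, x, y} objects into neutral point records."""
    return [{"name": p["name"], "x": float(p["x"]), "y": float(p["y"])}
            for p in json.loads(text)]


@writer("wkt")
def write_wkt(records):
    """Write each neutral record as a WKT point, one per line."""
    return "\n".join(f"POINT ({r['x']:g} {r['y']:g})" for r in records)


@writer("json-points")
def write_json_points(records):
    """Write the neutral records as a JSON array."""
    return json.dumps(records)


def translate(text, source_format, target_format):
    """Any registered source format can reach any registered target format."""
    return WRITERS[target_format](READERS[source_format](text))
```

With two readers and two writers registered, translate already covers four source/target combinations; each additional reader immediately works with every existing writer.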

The basic reason for failure with the brands of interoperability discussed earlier is that a common format, common database or software interface approach requires discontinuing legacy or unsupported systems. Further, embedding or extending direct read/translate/transform capabilities in all applications connected with the Enterprise GIS leads to true GIS interoperability. Data model transformation through semantic translation is the cornerstone of this approach.

FME®: THE ONLY TRUE SPATIAL ETL TOOL
FME®, the only true Spatial ETL (Extract, Transform and Load) tool, from Safe Software, Canada, is a Swiss Army Knife when it comes to cutting across CAD-GIS and related migration barriers in an Enterprise GIS situation and handling the multitude of CAD/GIS file formats in the industry. A Spatial ETL tool such as FME® addresses the complete spectrum of data interoperability challenges, including managing proprietary and evolving data formats, adapting to new schemas and a lack of standards, and difficulties in accessing, restructuring, integrating and distributing data, all of which must be resolved to overcome the typical GIS interoperability issues faced when evolving, implementing and successfully running an Enterprise GIS.

The ability to natively read and write data in over 200 formats is the apple of FME’s eye! The benefits are many, in spite of the challenge of continually upgrading to stay on top of format changes and the effort needed to set up data model conversions.

Spatial ETL systems such as FME® aim to resolve these issues by empowering a GIS package or similar spatial data management tool, on both the desktop and the web, with the inherent interoperability to provide seamless interaction with a multitude of data stores and to share that information across the enterprise in real time. FME® also integrates tightly with ETL solutions like IBM WebSphere DataStage, Informatica PowerCenter and Microsoft SQL Server Integration Services (SSIS), extending the Spatial ETL advantage enterprise-wide. Through its built-in graphical authoring environment, FME® Workbench, which features a graphical data-flow programming paradigm, FME® makes very sophisticated data model transformation tasks a breeze.

FME® employs semantic translation (thick pipe translation), which focuses on changing the view of the data into something that matches the needs of the end user or end system. Instead of viewing the problem as one of moving data from one format to another, FME® works entirely on generic features and concentrates on providing building blocks that enable users to manipulate the data into the desired representation. Semantic translation provides an engine that enables the redefinition of the data, either on input or on output. Underlying the engine is a rich data model, much richer than what proprietary systems support, allowing for a high degree of redefinition that is internally consistent and inherently extensible. As the central hub of an Enterprise GIS, a Spatial ETL tool such as FME® ensures efficient distribution of geospatial data to end users in the format, application and location they desire. There can be no doubt that Spatial ETL tools such as FME® can be the absolute bottom line of any Enterprise GIS.
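
The contrast between syntax-level and semantic translation can be illustrated with a small, hypothetical Python sketch. It assumes a generic feature (a type name, a geometry and a set of attributes) and a mapping rule that redefines a CAD-style layer into the feature type and attribute names a GIS data model expects. The Feature class, the SCHEMA_MAP contents (the "RD_CL" layer and its attribute codes) and the redefine function are invented for illustration and are not FME®'s data model or API.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    """A format-neutral feature: a type name, a geometry and its attributes."""
    feature_type: str
    geometry: tuple  # (geometry kind, list of coordinate pairs)
    attributes: dict = field(default_factory=dict)


# Hypothetical rule: how one CAD layer maps onto the target GIS data model.
SCHEMA_MAP = {
    "RD_CL": {
        "feature_type": "road_centreline",
        "rename": {"NM": "name", "LEN": "length_m"},
        "defaults": {"surface": "unknown"},
    },
}


def redefine(feature, schema_map):
    """Re-express a feature in the target data model, leaving geometry untouched."""
    rule = schema_map.get(feature.feature_type)
    if rule is None:
        return feature  # no rule for this type: pass the feature through unchanged
    attributes = dict(rule.get("defaults", {}))
    for key, value in feature.attributes.items():
        attributes[rule.get("rename", {}).get(key, key)] = value
    return Feature(rule["feature_type"], feature.geometry, attributes)


# A line from a CAD drawing, already read into the generic feature form.
cad_line = Feature(
    "RD_CL",
    ("LineString", [(0.0, 0.0), (120.0, 0.0)]),
    {"NM": "GN Chetty Road", "LEN": 120.0},
)
gis_road = redefine(cad_line, SCHEMA_MAP)
```

A syntax-level translator would copy the "RD_CL" layer and its cryptic attribute codes into the target format as they are; the redefinition step is what delivers the data in the structure the receiving application actually expects.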

DATA CONVERGENCE IS THE KEY
For years, the spatial data used to map and manage assets in most organisations has been the property of mapping departments. In countries like India it is uncommon to find mapping departments in agencies where asset management is a key activity. Spatial data continues to receive stepmotherly treatment and is not viewed as part of core IT assets, in spite of the NSDI (National Spatial Data Infrastructure), an initiative by the Government of India to bring together different kinds of geospatial data from various government and non-government organizations for use by the community.

Thankfully, the trend in the geospatial industry is changing. Both open databases and industry-standard databases have started embracing spatial data, and all major databases now have, or have announced, support for it. Location is now everywhere! Organizations across the globe are exploring ways and means to exploit their spatial data (assets) to make better decisions. Applications like NASA World Wind, Google Maps, Google Earth and Microsoft Virtual Earth serve as quick and easy geo-visualization tools for such organizations. The bottom line is that data convergence is the key to a successful Enterprise GIS in many ways, and with Spatial ETL tools like FME®, this can be child’s play.



THE CLIMAX
As the sayings go, a leopard cannot change its spots, nor can one teach an old dog new tricks. This is true of an Enterprise GIS, which is a complex beast; as the popular aphorism has it, “Big things are made up of lots of little things”. Implementing an Enterprise GIS feels like eating an elephant. Where do you begin? Will you ever finish? It can be overwhelming. However, a Spatial ETL tool like FME® holds the key, as in the old riddle: How do you eat an elephant? Answer: one bite at a time. Beware: sometimes you eat the elephant; sometimes the elephant eats you! With an innovative Spatial ETL tool like FME®, one can definitely teach old dogs new tricks!

ACKNOWLEDGMENT
The author expresses his gratitude to Mr. P V Rai, Managing Director, PIXEL SOFTEK PVT. LTD. for his encouragement and support to kindle the author’s GIS temper.

Special mention to his tutors at the Division of Urban Systems Development, Anna University, Chennai, for being responsible for his first step into the world of GIS.

© GISdevelopment.net. All rights reserved.