Integration across heterogeneous spatial data and applications within a large cyberinfrastructure project
Query rewriting and composite map generation services
XQuery formulation of a composite map
To present user requests against the spatial mediator in a standard, system-independent, declarative manner, we use two related XQuery-based abstractions: Spatial Integrated Views (S-Views) and Mediator Integrated Views (M-Views). A composite map configuration is described as an S-View, which is formed from a set of distributed mapping services and M-Views within a given initial spatial reference system and extent. An M-View can reference a single data service or integrate over several of them, based on spatial or attribute joins. To illustrate the concept, fragments of an S-View, and a sample M-View referenced from it, are shown in Table 1. These fragments demonstrate a simplified geologic map integration scenario generated in response to the user query "find all faults within a user-specified distance from geologic formations of a given age" [more details on the XQuery formulation of composite maps are in Zaslavsky et al., 2004a].
Table 1: Sample XQuery specification of a composite map (note the reference to geologic age ontology in the M-View specification on the right, which expands a given geologic age to include its child concepts)
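The ontology-based expansion mentioned in the caption can be illustrated with a minimal Python sketch. It does not reproduce the contents of Table 1 or the GEON ontology services: the hierarchy fragment and the names `AGE_CHILDREN`, `expand_age`, and `age_filter` are assumptions introduced only for illustration.

```python
# Toy fragment of a geologic age hierarchy (illustrative only; this is an
# assumption, not the GEON geologic age ontology).
AGE_CHILDREN = {
    "Phanerozoic": ["Paleozoic", "Mesozoic", "Cenozoic"],
    "Mesozoic": ["Triassic", "Jurassic", "Cretaceous"],
}


def expand_age(age, children=AGE_CHILDREN):
    """Return the given age followed by all of its descendant concepts."""
    expanded = [age]
    for child in children.get(age, []):
        expanded.extend(expand_age(child, children))
    return expanded


def age_filter(age):
    """Rewrite a single-age condition as a set of acceptable age terms."""
    return set(expand_age(age))
```

Under these assumptions, a request for formations of Mesozoic age would also match formations labeled Triassic, Jurassic, or Cretaceous, which is the effect the rewriting step is meant to achieve.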
Given the XQuery composite map specification, the mediator follows a sequence of steps that ends with generating an ArcIMS-based grid service instance and handing it to the map portal. Several of these steps, including GEON ontology-based query rewriting services, have been described earlier [e.g., Lin and Ludäscher, 2003]. Below we consider the grid implementation of map assembly services.
Anatomy and implementation of map assembly
Several possibilities exist for creating composite maps from distributed data resources [e.g., Zaslavsky & Memon 2004]. What we describe below is a general solution that involves generating a transient map service that combines and presents individual query results in an appropriate geographic context. The map assembly service is itself an ensemble of grid services (Figure 3). Among them, the File Transfer Service is responsible for transferring selected dataset fragments, via HTTP or GridFTP, from individual source wrappers to a staging area on a PoP node. Once the data fragments are received at the staging area, they are uncompressed as needed with the help of the Uncompress Service. The fragments may also need to be transformed, using the Data Conversion Service, into one of several raster or vector formats supported by ArcIMS (in the current version of the map assembly). Finally, the Image Assembly Service translates the initial XQuery specification of the map into a valid ArcIMS configuration file, replacing references to remote data sources with references to data fragments in the local staging area, and instantiates it as an ArcIMS image service.
Once the service is ready, it is made available to the mapping portal for further querying, so that users do not experience it as different from standard static map services. The map assembly workflow is controlled by the Command Module, which matches the original XQuery with a map assembly template and then orchestrates service calls guided by the template.
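The template-guided orchestration described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the Command Module itself: the step functions are local stand-ins for the grid services, the template is selected by name rather than by matching an XQuery, and all names (`STEPS`, `TEMPLATES`, `run_assembly`, the template names) are hypothetical.

```python
# Each step takes and returns a "job" dict. In the real system these would be
# calls to grid services; here they are local stand-ins (assumptions).

def transfer(job):
    # Stand-in for the File Transfer Service: copy fragments to a staging area.
    job["staged"] = ["staging/" + name for name in job["sources"]]
    return job


def uncompress(job):
    # Stand-in for the Uncompress Service: drop a ".gz" suffix where present.
    job["staged"] = [p[:-3] if p.endswith(".gz") else p for p in job["staged"]]
    return job


def convert(job):
    # Stand-in for the Data Conversion Service: ASCII text becomes a shapefile.
    job["staged"] = [p[:-4] + ".shp" if p.endswith(".txt") else p
                     for p in job["staged"]]
    return job


def assemble(job):
    # Stand-in for the Image Assembly Service: build a service description
    # that references only locally staged fragments.
    job["service"] = {"layers": list(job["staged"])}
    return job


STEPS = {"transfer": transfer, "uncompress": uncompress,
         "convert": convert, "assemble": assemble}

# Hypothetical assembly templates: ordered lists of step names.
TEMPLATES = {
    "point_overlay": ["transfer", "uncompress", "convert", "assemble"],
    "staged_as_is": ["transfer", "assemble"],
}


def run_assembly(template_name, sources):
    """Run the steps listed in the chosen template, in order."""
    job = {"sources": list(sources)}
    for step_name in TEMPLATES[template_name]:
        job = STEPS[step_name](job)
    return job
```

For instance, `run_assembly("point_overlay", ["incidents.txt.gz", "roads.shp"])` would stage both fragments, uncompress and convert the first one, and list both as layers of the assembled service. Keeping the step order in a template rather than in code is what lets one controller drive different assembly scenarios.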
As an example, consider an application of map assembly services in BorderSafe, a project focused on information sharing across law enforcement agencies. A fragment of the BorderSafe portal, with a composite map delivered by a dynamically generated ArcIMS service, is shown in Figure 4 (note: the figure illustrates a GIS application developed for non-operational, proof-of-concept purposes using synthetic data). In this scenario, a user enters search criteria to extract records of interest from several databases and map them. Organizing this as a dynamic map service allows users to further query the incident locations through the map interface, as they would query other layers in the service. This functionality is not available if the point locations are added to a static service. This integration model becomes especially useful given the different access control and data management policies at different law enforcement agencies. The Data Conversion Service (specifically, its ASCII2SHAPE component) is used to translate the point locations, which are part of the results returned to the portal, into the ESRI shapefile format. The shapefile is then included, as the top-most layer, in a newly generated ArcIMS configuration file (AXL file), which is derived from a pre-built AXL template. Finally, the Image Assembly Service uses the service configuration file to instantiate a new ArcIMS service and deliver a composite map to the portal. Thus, a new map service needs to be created for each new user query against the external database. To keep the ArcIMS server from eventually breaking down (i.e., when the number of image services reaches a limit), a Resource Scheduler Service is used. Each temporary service or staged dataset is registered with the resource scheduler along with the duration for which it needs to be maintained. The resource scheduler wakes up at regular intervals and cleans up all temporary resources that have outlived their allowed time period.
However, these additional operations do not make the system noticeably slower than static map services, and the transient map service supports the full range of familiar ArcXML-formatted requests.
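The register-and-sweep behavior of the resource scheduler can be sketched in Python as below. This is a minimal illustration, assuming an in-memory registry and a cleanup callback per resource; the class name, method names, and the injectable clock are assumptions made for the example and do not describe the actual Resource Scheduler Service.

```python
import time


class ResourceScheduler:
    """Tracks temporary resources and removes those past their lifetime."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock   # injectable so the sketch can be tested
        self._entries = {}    # resource id -> (deadline, cleanup callable)

    def register(self, resource_id, ttl_seconds, cleanup):
        """Record a temporary service or staged dataset and its lifetime."""
        deadline = self._clock() + ttl_seconds
        self._entries[resource_id] = (deadline, cleanup)

    def sweep(self):
        """Clean up every resource whose deadline has passed.

        Intended to be called at regular intervals; returns the ids removed.
        """
        now = self._clock()
        expired = [rid for rid, (deadline, _) in self._entries.items()
                   if deadline <= now]
        for rid in expired:
            _, cleanup = self._entries.pop(rid)
            cleanup(rid)  # e.g. delete the map service or the staged files
        return expired
```

In a deployment, `sweep` would be driven by a periodic timer, and the cleanup callbacks would remove the transient image service or delete the staged files, so that the number of live services stays bounded.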
Mapping line and area data from non-hosted WMS and WFS sources follows a similar approach. In particular, GetMap requests against remote WMS sources result in map images saved in the assembly service staging area, along with automatically generated TFW files, so that the image layers can be correctly added and positioned vis-à-vis other layers in the assembled service. For WFS servers, we use an XML2SHAPE component of the Data Conversion Service to convert WFS output into a shapefile, and then register it in the assembly service.
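To make the TFW step concrete, the following Python sketch derives the six world-file values from the bounding box and pixel dimensions of a map image request. It is an illustration under stated assumptions, not the project's code: it assumes a north-up image with no rotation and a bounding box ordered as (minx, miny, maxx, maxy); the function names are hypothetical.

```python
def world_file_values(bbox, width, height):
    """Compute the six world-file values for a north-up image.

    bbox is (minx, miny, maxx, maxy) in map units; width and height are
    the image dimensions in pixels.
    """
    minx, miny, maxx, maxy = bbox
    x_size = (maxx - minx) / width    # map units per pixel, x direction
    y_size = (maxy - miny) / height   # map units per pixel, y direction
    return [
        x_size,              # line 1: pixel size in x
        0.0,                 # line 2: rotation term (none assumed)
        0.0,                 # line 3: rotation term (none assumed)
        -y_size,             # line 4: pixel size in y, negative for north-up
        minx + x_size / 2,   # line 5: x of the center of the upper-left pixel
        maxy - y_size / 2,   # line 6: y of the center of the upper-left pixel
    ]


def world_file_text(bbox, width, height):
    """Format the values one per line, as expected in a .tfw file."""
    values = world_file_values(bbox, width, height)
    return "\n".join("{:.10f}".format(v) for v in values) + "\n"
```

The half-pixel offsets matter because a world file georeferences the center of the upper-left pixel, whereas a request bounding box describes the outer edges of the image.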
Conclusion
We presented spatial information integration abstractions and methods that proved flexible and efficient in a large cyberinfrastructure project, where heterogeneous data sets are hosted in the grid environment or accessed as external data resources. The techniques we describe rely on grid services for ontology-aware query rewriting and for dynamic configuration and generation of map services, supported by a range of conversion and data transport modules. This approach proved useful when the spatial data sets are under different access control and management constraints. Dynamic map assembly is quite different from merging XML fragments into a single XML tree, which is a common component of XML-based information mediation. However, dynamic map assembly is essential for presenting results of mediated queries against spatial data sources, where individual query results contain information of different types (ASCII text, pure XML/GML, shapefiles, known binary vector and raster formats, etc.), where the query results must be presented in a geographic context and output format determined at query time, and where the resultant composite map service should enable additional requests without re-querying individual data services.
Acknowledgements
Partial support under US National Science Foundation grant #0205049 "ITR: GEON: The Geosciences Network: A Research Project to Develop Cyberinfrastructure for the Geosciences", as well as partial funding by ESRI, is gratefully acknowledged. The Biomedical Informatics Research Network is supported by NIH grants RR08605-08S1 (BIRN-CC) and RR043050-S1 (Mouse BIRN). The BorderSafe project is part of a DHS-initiated integrated feasibility experiment funded under Cooperative Agreement No. NBCH2030002.
Bibliography
- Bishr, Y. (1998) Overcoming the semantic and other barriers to GIS interoperability. Int. J. Geographical Information Science 12, pp. 299-314.
- Boucelma, O., Essid, M., Lacroix, Z. (2002) A WFS-based Mediation System for GIS Interoperability. Tenth ACM International Symposium on Advances in GIS, pp. 23-28.
- Camara, G., Thome, R., Freitas, U., Monteiro, A.M.V. (1999) Interoperability In Practice: Problems in Semantic Conversion from Current Technology to OpenGIS. In Proc. Interoperating Geographic Information Systems, Second International Conference, INTEROP '99, Andrej Vckovski, Kurt E. Brassel, Hans-Jörg Schek (Eds.), Zurich, Switzerland, March 10-12, 1999, pp. 129-138.
- DeVogele, T., Parent, C., Spaccapietra, S. (1998) On spatial database integration. Int. J. Geographical Information Science 12, pp. 335-352.
- Foster, I., Kesselman, C. and Tuecke, S. (2001). The Anatomy of the Grid: Enabling Scalable Virtual Organizations. International J. Supercomputer Applications, 15(3).
- Foster, I., Kesselman, C., Nick, J. and Tuecke, S. (2002) The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration (www.globus.org/research/papers/ogsa.pdf).
- Gupta, A., Marciano, R., Zaslavsky, I., Baru, C. (1999). "Integrating GIS and Imagery through XML-Based Information Mediation". In P. Agouris and A. Stefanidis (Eds.) Integrated Spatial Databases: Digital Images and GIS, Lecture Notes in Computer Science, Vol. 1737, pp. 211-234.
- Lin, K., Ludäscher, B. (2003) A System for Semantic Integration of Geologic Maps via Ontologies. In Semantic Web Technologies for Searching and Retrieving Scientific Data (SCISW), Sanibel Island, Florida, October 20th, 2003.
- Shimada, S., and Fukui, H. (1999) Geospatial mediator functions and container-based fast transfer interfaces in Si3CO Test-bed. LNCS 1580, pp. 265-276.
- Visser, U. and Stuckenschmidt, H. (2002) Interoperability in GIS - Enabling Technologies. In: Proceedings of 5th AGILE Conference on Geographic Information Science (Eds. Ruiz, M., Gould, M. and Ramon, J.), Palma de Mallorca, Spain, pp. 291-297.
- W3C (2003a). Web Services Description Language (WSDL) Version 1.2. W3C Working Draft 24 January 2003
- W3C (2003b). Simple Object Access Protocol, W3C Proposed Recommendation, 07 May 2003.
- W3C (2004) OWL Web Ontology Language Overview. W3C Recommendation 10 February 2004.
- Wiederhold, G. (1992) Mediators in the Architecture of Future Information Systems. IEEE Computer, 25, 3, 38-49.
- Zaslavsky, I., Memon, A., Petropoulos, M., and Baru, C. (2003) Online Querying of Heterogeneous Distributed Spatial Data on a Grid. Proceedings of the 3rd International Symposium on Digital Earth, pp. 813-823.
- Zaslavsky, I. & Memon, A. (2004). GEON: Assembling Maps on Demand From Heterogeneous Grid Sources. In Proceedings of ESRI Users Conference, San Diego, CA, August 2004.
- Zaslavsky, I., Memon, A., Velikhov, P., Baru, C. (2004a). Mapping on the Grid: From Spatial Web Services to Mobile Clients. Proceedings of the First International Joint Workshop on Ubiquitous, Pervasive and Internet Mapping (UPIMap 2004), Tokyo, September 7-9, 2004, pp. 110-119.
- Zaslavsky, I., He, H., Tran, J., Martone, M.E., Gupta, A. (2004b). Integrating Brain Data Spatially: Spatial Data Infrastructure and Atlas Environment for Online Federation and Analysis of Brain Images. Proceedings of the 15th International Workshop on Database and Expert Systems Applications (DEXA 2004), Biological Data Management (BIDM'04), pp. 389-393.