A Framework for Sharing Heterogeneous Geo-Spatial Information using Spatial Data Modeling and Enterprise-GIS

Indira Mukherjee¹, P.S. Acharya² and S.K. Ghosh¹

¹ School of Information Technology, Indian Institute of Technology, Kharagpur 721302, India
² NRDMS, Department of Science & Technology, Technology Bhawan, New Delhi 110011, India

Email: indira.mukherjee.23@gmail.com, psa@nic.in, skg@iitkgp.ac.in

Abstract

With the increasing use of geospatial information by government and private organizations, there is a growing demand for access to these data. Geospatial information is also becoming an essential input for any socio-economic development process. The need for sharing has intensified with the easy availability of internet connectivity and advances in web technology. The major bottleneck in such sharing is the lack of interoperability between data providers. Hence there is a need to build a framework for seamlessly sharing these heterogeneous sources of geospatial information. The major hindrances to sharing data across organizations are proprietary formats and the resulting lack of syntactic interoperability. Further, most organizations have their own operational policies and proprietary data models. Standardizing these data models helps to integrate the diverse database structures and thus facilitates the resolution of users’ queries. Hence, to achieve seamless data sharing across organizations, both the proprietary data models and syntactic interoperability need to be addressed. In this paper, a framework is proposed to integrate various heterogeneous databases through geospatial data modeling and spatial web services. The data model is built using the Unified Modeling Language (UML). This model helps in integrating the service providers’ databases with the service consumers’ requirements.
The core of the system is an Enterprise GIS infrastructure that implements the OGC (Open Geospatial Consortium) geospatial web services. Data transformation is achieved through GML/XSD technology, and the transformed data are stored in a spatial database, which in turn is used by the Enterprise-GIS infrastructure. The UML model is used to generate the schema definitions (XSD) and the corresponding database structures. A case study is presented to demonstrate the efficacy of the proposed framework.

1 Introduction

The demand for geospatial information for various development and business activities has increased to a great extent in recent times. This geospatial information is collected and maintained by diverse organizations (in their proprietary formats) for their own organizational needs. On the other hand, users’ queries may involve information from one or more data repositories, so the data held by different providers need to be integrated. In this paper, a framework is proposed to share these heterogeneous sources of geospatial information. The proposed framework is model driven and attempts to relate each data provider’s model to a domain-specific base feature space, or base model. The data model describes the structure of the data and the relationships among them. Once the data model has been developed, the conceptual database system can be developed in conformance with it. Further, a proper data model also addresses the issues of data heterogeneity and data autonomy, and facilitates data sharing. Heterogeneity prevents successful integration of geospatial information. There are two types of heterogeneity problems, namely syntactic and semantic. In this work, we address syntactic heterogeneity through spatial data modeling. An appropriate data model helps in data integration without affecting the data autonomy of the individual repositories.
This allows individual data providers to continue to maintain and update their geospatial information. To increase usability, the data model of geospatial information needs to be standardized. This inherently standardizes the structure of geospatial data and thus leads to structural interoperability. The data model helps in managing huge volumes of complex, distributed geospatial information in a seamless manner. The proposed framework uses an object-relational model to integrate and query geospatial information. The proposed approach is implemented using an Enterprise-GIS (E-GIS) framework, and a case study is also presented.

The paper is organized as follows. Section 2 presents a brief overview of related work on geospatial modeling. The proposed framework is discussed in section 3. Section 4 presents a case study to show the efficacy of the proposed framework. Section 5 discusses the case study, and the conclusion is drawn in section 6.

2 Related work

Integrating heterogeneous systems requires appropriate data modeling. The repositories are maintained by diverse organizations in their own proprietary designs and formats. The variation in the conceptual design of the data hinders access to data during decision making, especially in the case of critical decision support [11]. Several international organizations are working towards standardizing geospatial computing mechanisms, including geospatial data formats and access mechanisms. The Open Geospatial Consortium (OGC) [1] and ISO/TC 211 [2] are the two major standards bodies in this area. According to [2], the data provider and the data receiver are supposed to agree on a so-called application schema [2,3]. In order to integrate different geospatial data, the individual organizations should follow common design and data modeling standards [4,5]. One such effective model is the object-relational model, and UML (Unified Modeling Language) is widely used for this purpose.
The ISO rules for application schema [2] prescribe how to build an application schema in UML. The geospatial data model acts as a logical interface between data consumers’ queries, the E-GIS and the data repositories [6]. For developers, it acts as a skeleton for developing an application; for users, it provides a description of the structure of the system, independent of specific data items or details of the particular application. In [7], the characteristics of GIS are considered and four design decisions are proposed, namely transactional mode (synchronous versus asynchronous), service granularity (fine-grained versus coarse-grained), delivery manner (chunk versus stream), and transmission format (GML versus binary). It has been shown in [8] that sharing of digital road map databases within and among organizations depends on translating user requirements into a data model that supports linear and non-linear location referencing systems. That paper examines the issues of creating such a data model with the aim of sharing digital road map databases, and suggests implementation choices that can accommodate a range of applications. A data model (MDLRS) has been developed to provide a framework for integrating multidimensional data [9]. Another model [10] has been developed to handle and visualize vector-format geo-data in a hierarchical triangulated domain. Once the object view of the geospatial features of a repository has been designed using UML¹, conversion to a GML (Geography Markup Language) schema, as specified by the Open Geospatial Consortium (OGC) [1], can be done by defining conversion rules [12]. The UML to GML Application Schema (UGAS) [13] mapping tool supports this conversion automatically. The representation of UML classes according to the GML application schema and the implementation of GML are discussed in the literature [14,15,16,17,18,19].
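The idea of converting a UML class into a GML application schema can be illustrated with a minimal sketch. It is not the rule set of [12] or the UGAS tool [13]; it only assumes the commonly used GML 3.1-style pattern in which a feature class becomes a global element substitutable for gml:_Feature, with a complex type extending gml:AbstractFeatureType. The function name uml_class_to_xsd, the TYPE_MAP table, the class "Road" and the namespace http://example.org/app are hypothetical and chosen for illustration only.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
GML = "http://www.opengis.net/gml"  # GML 3.1 namespace (assumed version)

# Assumed mapping from UML attribute types to XSD/GML property types.
TYPE_MAP = {
    "CharacterString": "xs:string",
    "Integer": "xs:integer",
    "Real": "xs:double",
    "GM_Point": "gml:PointPropertyType",
    "GM_Curve": "gml:CurvePropertyType",
    "GM_Surface": "gml:SurfacePropertyType",
}


def uml_class_to_xsd(class_name, attributes, target_ns, prefix="app"):
    """Build an XSD tree for one UML feature class (hypothetical helper).

    attributes is a list of (attribute_name, uml_type) pairs.
    """
    ET.register_namespace("xs", XS)
    schema = ET.Element(f"{{{XS}}}schema", {
        "targetNamespace": target_ns,
        "xmlns:gml": GML,
        f"xmlns:{prefix}": target_ns,
        "elementFormDefault": "qualified",
    })
    ET.SubElement(schema, f"{{{XS}}}import", {"namespace": GML})
    # The class becomes a global element that can stand in for gml:_Feature.
    ET.SubElement(schema, f"{{{XS}}}element", {
        "name": class_name,
        "type": f"{prefix}:{class_name}Type",
        "substitutionGroup": "gml:_Feature",
    })
    # Its content model extends gml:AbstractFeatureType.
    ctype = ET.SubElement(schema, f"{{{XS}}}complexType",
                          {"name": f"{class_name}Type"})
    content = ET.SubElement(ctype, f"{{{XS}}}complexContent")
    ext = ET.SubElement(content, f"{{{XS}}}extension",
                        {"base": "gml:AbstractFeatureType"})
    seq = ET.SubElement(ext, f"{{{XS}}}sequence")
    # Each UML attribute becomes one property element in the sequence.
    for attr_name, uml_type in attributes:
        ET.SubElement(seq, f"{{{XS}}}element",
                      {"name": attr_name, "type": TYPE_MAP[uml_type]})
    return schema


if __name__ == "__main__":
    road = uml_class_to_xsd(
        "Road",
        [("name", "CharacterString"), ("centerline", "GM_Curve")],
        "http://example.org/app",
    )
    print(ET.tostring(road, encoding="unicode"))
```

A complete conversion would also have to handle associations, inheritance between feature classes and multiplicities, which this sketch leaves out.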
3 Spatial Modeling for Geospatial Information Integration

In this section, the proposed framework for sharing heterogeneous geospatial information is presented. The overall conceptual model is shown in figure 1.

Fig. 1. Geospatial Information Integration Framework

¹ http://www.uml.org/

The Enterprise-GIS (E-GIS) is the core of the system and integrates the data providers and data consumers. The E-GIS employs OGC-compliant spatial services to achieve syntactic interoperability. The data providers maintain spatial information for their own organizational needs. Hence, the data are usually stored and maintained in proprietary storage structures and formats. Thus, to integrate these heterogeneous repositories, it is necessary to ensure both syntactic interoperability and an understanding of the data models. The proposed framework addresses the interoperability of heterogeneous repositories through spatial data modeling. Integrating the data basically involves defining the types of data available in the various repositories and establishing the relationships among them. The following subsections explain the steps in model-driven geospatial information integration for answering user queries.

3.1 Data description and relationship

The geospatial datasets may come from various data providers. Further, these datasets are large in volume and diverse in nature. Because of their dissimilar structures, the datasets may appear to be independent of each other, but conceptually they are related. The abstract relationships among the data can be captured by applying geospatial modeling techniques. The following steps are followed to model and integrate these diverse data:

o A generic geospatial data model is developed using the information in the base feature space and domain knowledge to define the relationships among the data. This can be achieved using UML.
o A global XSD (schema file) is developed from the geospatial data model by following the “UML to GML” conversion rules.
o The structures of the data in the various datasets are matched with the geospatial model. In the proposed framework, it is assumed that the structure of the data in each dataset is a subset of the generic model.
o Using the GML (the data of the datasets) and the global XSD, the data are mapped into a spatial database.

3.2 UML to GML schema (XSD) conversion

Once the object view of the geospatial features of a repository has been designed using UML, the UML model can be converted into a GML schema by using mapping rules.

3.3 Mapping GML/XSD to Object-Relational database

The GML/XSD generated from the application schema is preserved in an object-relational schema using GML to object-relational schema mapping rules. The generated GML and its schema are then mapped into an object-relational database. The database schema is developed from the GML schema (XSD), and the geospatial data and their feature types in GML are stored in object-relational tables. Storing the geospatial data in an object-relational database helps in organizing and managing geospatial data more effectively.

Fig. 2. Model driven integration framework

In the next section, a case study is presented to demonstrate the efficacy of the proposed approach.

4 Case Study: Implementation of proposed framework

This case study is based on the proposed framework. A generic object-oriented model (UML class diagram) has been developed based on GIS layers such as administrative boundary, roadway, railway, school and hospital, and on the operations that can be performed on those data (domain knowledge). Three separate data providers are considered: the first holds administrative boundary data, the second holds road data, and the third holds school data.
The structures (models) of the individual datasets have been integrated based on the global data model, and the data of those datasets are stored in an “Oracle Spatial 10g” database, which facilitates spatial operations. Users’ queries can be answered effectively using the integrated model and the spatial database. The integration of the heterogeneous data repositories through spatial modeling is achieved through the following steps.

4.1 Object oriented model with domain knowledge and Base Feature Space

An object-oriented model has been developed (see figure 3) based on some of the data available in the base feature space and the operations that can be performed on those datasets (domain knowledge). Based on this model, a global XSD has been developed. It is assumed that the structure of every dataset is a subset of the global XSD. Thus, the global XSD and the data of the different datasets are used to map the data into the Oracle Spatial 10g database.

Fig. 3. Conceptual data model

4.2 UML to GML mapping for generating global XSD

The GML schema (XSD) and the corresponding GML have been developed from the UML model following the UML to GML conversion rules (figure 4).

Fig. 4. Mapping of UML class to GML/XSD Schema

4.3 Extracting and mapping models of various datasets to global XSD

In this case study, three datasets are considered, namely administrative boundary data (polygon type), road data (polyline type), and school data (point type). Each dataset has two components:

o Spatial information, which includes geometrical and topological information such as coordinates and shape.
o Attribute (non-spatial) information related to the spatial data (such as the name of a road).

4.4 Mapping the datasets into object oriented model

The global XSD defines the structure of the data and their interrelationships. The data in XML format, along with their structure, are mapped into the Oracle 10g database using a parser named “GMLtoOracle”.
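The internals of the “GMLtoOracle” parser are not described here, so the following is only a rough sketch of the general idea of loading GML features into database tables. It assumes GML 3.1-style gml:coordinates encoding, uses Python's built-in sqlite3 as a stand-in for Oracle Spatial 10g, and stores geometries as WKT text instead of a native spatial type. The namespace http://example.org/app, the element names app:name and app:geometry, the sample features and the helpers geometry_to_wkt and load_features are all hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

GML = "http://www.opengis.net/gml"
APP = "http://example.org/app"  # hypothetical application namespace

# Made-up sample data: one road (polyline) and one school (point).
SAMPLE_GML = f"""<gml:FeatureCollection xmlns:gml="{GML}" xmlns:app="{APP}">
  <gml:featureMember>
    <app:Road>
      <app:name>Road A</app:name>
      <app:geometry><gml:LineString>
        <gml:coordinates>87.30,22.32 87.35,22.34</gml:coordinates>
      </gml:LineString></app:geometry>
    </app:Road>
  </gml:featureMember>
  <gml:featureMember>
    <app:School>
      <app:name>School B</app:name>
      <app:geometry><gml:Point>
        <gml:coordinates>87.31,22.33</gml:coordinates>
      </gml:Point></app:geometry>
    </app:School>
  </gml:featureMember>
</gml:FeatureCollection>"""


def geometry_to_wkt(geom):
    """Convert a GML Point, LineString or single-ring Polygon to WKT."""
    kind = geom.tag.split("}")[1]
    coords = geom.find(f".//{{{GML}}}coordinates").text.split()
    body = ", ".join(pair.replace(",", " ") for pair in coords)
    if kind == "Polygon":  # outer ring only; holes are ignored in this sketch
        body = f"({body})"
    return f"{kind.upper()} ({body})"


def load_features(gml_text, conn):
    """Create one table per feature type found in the data and insert rows."""
    counts = {}
    root = ET.fromstring(gml_text)
    for member in root.findall(f"{{{GML}}}featureMember"):
        feature = member[0]
        table = feature.tag.split("}")[1].lower()
        if not table.isidentifier():
            raise ValueError(f"unsafe table name: {table}")
        name = feature.findtext(f"{{{APP}}}name")
        geom = feature.find(f"{{{APP}}}geometry")[0]
        conn.execute(
            f"CREATE TABLE IF NOT EXISTS {table} (name TEXT, geom_wkt TEXT)")
        conn.execute(f"INSERT INTO {table} VALUES (?, ?)",
                     (name, geometry_to_wkt(geom)))
        counts[table] = counts.get(table, 0) + 1
    return counts


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(load_features(SAMPLE_GML, connection))
    print(connection.execute("SELECT * FROM road").fetchall())
```

Because tables are created only when a feature of that type occurs in the input, feature types that are defined in a schema but have no data produce no table in this sketch.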
Only the structures for which data are available are mapped from the global XSD to the database. For example, the global XSD describes spatial information about region, road, school, railway, forest, river, etc., whereas only the data for road, school and region are considered for mapping. Therefore, only the structures for road, region and school are created and populated in the database.

4.5 Mapping GML/XSD to object-relational database (Oracle spatial 10g)

The GML generated from the datasets and the global XSD generated from the UML class diagram are mapped into the object-relational database (Oracle Spatial 10g) by using the “GMLtoOracle” parser (as explained in the previous section). Once the data have been inserted into the database, answering users’ queries involving one or more datasets becomes easy. A few snapshots of the generated tables in the database, taken from the Oracle interface tool (TOAD), are given in figure 5.

Fig. 5. The generated tables in Oracle Spatial 10g database

The overall process, that is, the development of the UML model and the corresponding global XSD, the mapping of the storage structures of the various datasets to the UML model, the storage in the database (Oracle Spatial 10g), and the access to those data through the geo-service interface, is shown in figure 6.

Fig. 6. Overall architecture and data flow

5 Discussion

The above case study shows how the data of three different datasets are integrated based on a global object-oriented model (base feature space). This approach can be used to integrate any number of datasets. After integration, the data are accessed using a geospatial web service (through the E-GIS framework). Service consumers can query those data using the geospatial web service.

6 Conclusion and Future works

Geospatial information plays a vital role in various decision support systems: it is the information about 'where' (location), 'what' (buildings, roads, water masses, etc.), and so on.
All this information can be captured by representing the data along with their relationships in a conceptual way, that is, through geospatial data modeling (using UML). To increase sharability and usability, the data model of geospatial information needs to be standardized. This in turn requires standardizing the structure of geospatial data, and thus facilitates interoperability. A good data model can therefore resolve data heterogeneity, data autonomy and distribution efficiently. In this paper, a model-driven framework facilitating the sharing of geospatial information has been developed. A case study has also been presented to show the efficacy of the approach.

References

[1] Open Geospatial Consortium. OGC (2003). http://www.opengeospatial.org/, accessed on
[2] Rules for application schema, ISO (2001b), Final text of CD 19109, Geographic information, ISO/TC 211 N 1127, 19th of July, 2001.
[3] Information technology, Open distributed processing, Reference Model: Architecture, International Standard ISO/IEC, ISO (2000), ISO/IEC 10746-3(E), 2000, http://isotc.iso.ch/livelink/fetch/2000/2489/Ittf_Home/PubliclyStandards.htm.
[4] J Brodeur, Y Bédard, MJ Proulx, “Modelling geospatial application databases using UML-based repositories aligned with international standards in geomatics”, Proceedings of the 8th ACM international symposium on Advances in geographic information systems, pp. 39-46, 2000.
[5] J L Filho, C Iochpe, KAV Borges, “Analysis Patterns for GIS Data Schema Reuse on Urban Management Applications”, CLEI Electronic Journal 5, pp. 1, February 2002.
[6] A Hamilton, H Wang, A M Tanyer, Y Arayici, X Zhang, Y Song, “Urban Information model for city planning”, Journal of Information Technology in Construction, pp. 55-67, April 2005.
[7] S Tu, M Flanagin, Y Wu, M Abdelguerfi, E Normand, V Mahadevan, “Design strategies to improve performance of GIS Web services”, International Conference on Information Technology: Coding and Computing, vol. 2, pp. 444-448, April 2004.
[8] KJ Dueker, JA Butler, “GIS-T Enterprise Data Model with Suggested Implementation Choices”, Journal of the Urban and Regional Information Systems Association, vol. 10, pp. 12-36, 1998.
[9] NA Koncz, TM Adams, “A data model for multi-dimensional transportation applications”, International Journal of Geographical Information Science, vol. 16, pp. 551-569, September 2002.
[10] G Dutton, “Encoding and handling geospatial data with hierarchical triangular meshes”, Conference paper, Symposium on Spatial Data Handling, Delft, Holland, July 1996.
[11] MD Vries, “Recycling Geospatial Information in Emergency Situations: OGC Standards Play an Important Role, but More Work is Needed”, Directions Magazine, November 09, 2005.
[12] R Grønmo, I Solheim, D Skogan, “Experiences of UML-to-GML Encoding”, 5th AGILE Conference on Geographic Information Science, Palma, Balearic Islands, April 2002.
[13] C Portele, “Mapping UML to GML Application Schemas ShapeChange” - Architecture and Description, 2004.
[14] User Guide - GML Application Schema auto-documentation tool, Document: TRIM D2007-27160, v-1.0, 9 March 2007.
[15] M.E. de Vries, T.P.M. Tijssen, J.E. Stoter, C.W. Quak, P.J.M. van Oosterom, “The GML prototype of the new TOP10vector object model”, GISt Report No. 9, Delft, December 2001.
[16] Anders Friis-Christensen, “From Model to Data Transformation-One approach Mapping TeleAtlas Data to EuroRoadS”, JRC, CSL workshop, Ispra, Italy, October 12-14, 2005.
[17] A Skopeliti, L Tsoulos, M Spanaki, “GML Implementation on STATLAS”, STATLAS Consortium document, v-1.0, July 2003.
[18] G Myrind, “Application Schemas - From conceptual models to implementation”, 18th Nordic GIS Conference, Helsinki, Finland, 2006.
[19] D Nebert, “U.S. National Spatial Data Infrastructure, Update on FGDC-Coordinated Activities”, NSDIPA, Japan, 2003.