|
|
|
An Improved and Efficient Storage Technique for GIS Geometric Primitives Based on Minimum Bounding Rectangles
F.Sagayaraj Francis
Department of Computer Science & Engineering and Information Technology
Pondicherry Engineering College
Pondicherry, India
fsfrancis@rediffmail.com
P.Thambidurai
Department of Computer Science & Engineering and Information Technology
Pondicherry Engineering College
Pondicherry, India
B.S.Bharadwaj
Department of Computer Science & Engineering and Information Technology
Pondicherry Engineering College
Pondicherry, India
G.N.Balagei
Department of Computer Science & Engineering and Information Technology
Pondicherry Engineering College
Pondicherry, India
N.Hemant
Department of Computer Science & Engineering and Information Technology
Pondicherry Engineering College
Pondicherry, India
Abstract
Geographical Information Systems (GIS) objects such as rivers, roads, railway lines, regions and cities are generally represented using geometric primitives such as points, lines and polygons. These primitives have dimensionalities ranging from 0 to 3. A good number of methods are proposed in the literature and consequently used in commercial packages that require the GIS objects to be approximated to the above said geometrical primitives. These methods put a lot of constraints and require compromising at various fronts when it comes to representation and storage. Hence storing objects has always been a major concern in GIS. It is also obvious that efficiency in storage also improves efficiency in processing. This paper proposes a novel way to represent and store objects in GIS space (0 to 3 dimensions) using MBRs (Minimum Bounding Rectangles). The method is also scalable and extendable to n-dimensional space. Most of the commercial packages are good in handling objects, which have regular structure, but the discrepancies in these methods arise when it comes to storing objects that are distortions, extensions or multiplications of these regular structures. Moreover they also fall short when it comes to handling complex cases like self-crossing structures and disjoint structures. As a matter of fact GIS usually deals with objects that have the above said characteristics. The data structure and the storage method proposed, which is based upon the concept of MBRs follows a generalized procedure to store the GIS objects instead of ad hoc methods. The proposed data structure has been compared with the methods in the literature and commercial packages. It is very much evident that the data structure proposed was more flexible and efficient when encountered objects, which are highly complex.
1. Introduction
Spatial data is defined as the data that pertains to the space occupied by objects. Spatial data represents the essential location characteristics of real or conceptual objects as those objects relate to the real or conceptual space in which they exist [3]. Spatial data is conceptually defined as geometric data such as points, lines, rectangles, surfaces, volumes and etc. In GIS these conceptual objects (and hence geometric data) represent the GIS objects such as, cities, rivers, roads, states, crop coverage, mountain ranges etc [1] [2]. Applications like environmental monitoring, geographic information systems, earthquake research etc require GIS that are robust and efficient in terms of storage, retrieval, processing and good graphical reconstructing. The volume of data that is involved in such systems is huge and dynamic. Static querying (queries on spatial relationships, distance measurements, directions, etc.) and dynamic querying (Addition, deletion and updating) are frequent. Graphical reproduction of GIS objects is only of paramount importance. In this paper the ‘spatial data’ and ‘spatial objects’ are used interchangeably with ‘GIS data’ and ‘GIS objects’.
The reasons said above are the motivation behind developing a generalized method for representing and storing GIS data, the prime objective of this paper. Any good and generalized representation method must be able to accommodate all types of objects from simple to highly complex. The representation in turn must facilitate efficient storage, retrieval and querying. This paper gives (i) a new GIS object representation method in terms of Minimum Boundary Rectangles (MBRs), (ii) a GIS database Schema and (iii) arguments in support of the efficiency of the proposed methods and schemes.
The usage of MBRs is motivated by the effective usage of them in indexing multidimensional data in database applications [14] [12]. Further advantages of employing MBRs to represent GIS objects are listed later in this paper.
Much spatial data is made up of coordinates. A coordinate is a number that denotes a position that is relative to a point of reference. For example, in GIS degrees of latitude are coordinates that denote positions relative to the equator. Degrees of longitude are coordinates that denote positions relative to the Greenwich meridian [15] [16]. The simplest spatial data item consists of two coordinates that define the position of a single geographic feature in a two dimensional space. A more extensive spatial data item consists of several coordinates that define a linear path such as a road or river. A third kind consists of coordinates that define the boundary of an area such as the boundary of a land parcel or flood plain. The mapping between GIS objects and geometric objects is given below:
- Points - cities
- Line strings - streets, canals, and pipelines
- Polygons - welfare districts, forests, and wildlife
habitats
- Multipoint - an island chain
- Multi line strings - river systems and highway systems
- Multi polygons - the collective farmlands in a specific region,
or a system of lakes
- Irregular Objects - map of a country, map of a road, etc.
While the space or specifically, GIS space is 3-dimensional, the objects in the space may not always have 3-dimensions. They may vary from 0 to 3 dimensions or have combinations of the dimensions. Moreover the objects may be highly complex with distortions, disjoint ness, intersections, meetings etc, or combinations of these. Figure 1 gives examples of such images.
 Figure 1a. Object with branches
 Figure 1b. Object with combination of dimensions
 Figure 1c. Objects with holes and self-crossing
2. Literature Survey
There are wide ranges of commercial GIS software that use data structures that are very specialized and fulfill a particular objective. Commercial packages give ad hoc models and methods on to which the natural models and methods of the application space have to be mapped [15] [16]. There are quite a few models in the literature that are extensions of models used for other type of data or application [6] [4].
The common feature of the commercial software and packages is the usage of points in plotting the boundaries of objects. These points are used for representing, storage, retrieval, processing and reconstruction of the object. In the case of regular objects, the edges of the object are stored if the object is composed of straight lines. On the other hand, if the object is made up of a connected sequence of circular arcs then the starting point of the arc, the end point of the arc and some arbitrary point on the arc describe each arc on the object. In the case of irregular objects, they follow ad hoc procedures, which approximate the boundary of the irregular object by plotting points along the boundary and draw straight lines connecting the points.
Space partitioning techniques are also used for approximating the irregular objects [13] [16]. Partitioning is done by successively dividing the space of the object into hierarchical circles [11] or hierarchical rectangles [16] and the circles or rectangles that are occupied by portions of the object are considered as constituents of the object. A variation of this method fills the irregular object with varying sized circles [17] [18]. These circles collectively represent the object. Careful analysis of these methods reveal following shortcomings:
- Commercial packages and software are good in handling objects, which have regular structure, but the discrepancies in these methods arise when it comes to storing objects that are distortions, extensions or multiplications of these regular structures. Moreover they also fall short when it comes to handling complex cases like self-crossing structures and disjoint structures.
- Commercial packages and software are very good in storing regular structures but lag behind in handling irregular objects. They require more space and they need a lot of approximations to store the objects. Most of the GIS primitives are irregular in shape.
- Objects that are continuous and linear can be stored very easily and efficiently with the currently available methods. Continuous objects are defined as the objects that do not have branches as in figure 1a. In such cases lot of space is wasted and also the method of storage becomes tedious.
- The existing methods fall short in storing objects that are collection of constituents with variable dimensions as in figure 1b.
- Complete support for holey and self-crossing objects as in figure 1c is not available.
- Commercial software and packages are good in handling objects made up of regular circular arcs. Arcs of other varieties are seldom supported.
- The existing methods are seldom scalable.
- Developing indexing strategies are difficult.
- Inherent expression of the relationships with the adjoining objects of the object space.
- Lack of comprehensive support for the following categories of queries [5][7][8][9][10]:
- Exact Match Query
- Point Query
- Window Query or Range Query
- Intersection Query or Region Query or Overlap Query
- Enclosure Query
- Containment Query
- Adjacency Query
- Nearest Neighbor Query
3. An Improved Model
3.1. Representing Spatial Objects
This section introduces a new method of representing GIS objects. This method partitions the given GIS objects and each partition in represented by an MBR. Each MBR may be of any dimension from 0 to n in an n-dimensional space. The MBRs may overlap, cover the entire object or cover the individual constituents of the GIS object as shown in figure 1. The GIS object then, is the collection of MBRs. The best strategy for identifying MBRs for a GIS object is beyond the scope of this work and would be taken in future works. This method facilitates the storage all types of GIS objects including regular, non-regular, open and close of any shape and size. The storage space needed for this type of data structure is relatively very small as only the coordinates of the respective MBRs are stored and not of each and every point of the object. This new data structure supports every possible spatial object types such as points, line strings, polygons, multi points, multi line strings, multi polygons and irregular objects.
3.2. Database Schema
In the proposed model, the data are stored in many tables. But these tables belong to any one of the two table types defined in the model. The tables are structurally similar to relational tables. But, the constraint on the atomicity of data at the intersection of each row and column is waived. The first type of table is the Object_Table, which collects the objects of a single object space. Each individual object of the space makes up a row of the table and belongs to the type Individual_Object. Each attribute of Individual_Object makes a column of the table. The second type of table, Metadata_Table collects the data about the object spaces. Each row of the Metadata_table record data about one table of Object_Table type (and hence one object space) and belongs to the type Metadata_Table_row. Each attribute of Metadata_table_row makes a column of the table. Obviously for a given single GIS space, there can be only one table of Metadata_Table type and many tables of Object_Table type. Each table of Object_Table type may be a collection of objects that belong to the same semantic or syntactic group based on the requirements of the application. But the definition of Object_Table type does not put any constraints of the type of objects that are collected in the table of this type. It can accommodate all types of GIS objects in all dimensions. The definition of types is given below:
type Individual_Object
{ Object_Id generic,
Object_Name generic,
Non_Spatial_Attributes Set of Attribute_Pair,
Object Set of MBR }
type Attribute_Pair
{ Attribute_Name generic,
Attribute_Value generic }
type MBR
{ Dimension numeric,
Structure_Type generic,
BLPoint Ntuple,
TRPoint Ntuple }
type Ntuple { PArray Axis_Value[N]; }
type Axis_Value { Value generic; }
type Metadata_Table_Row
{ Object_Table_Id generic,
Object_Table_Name generic,
Object_Table_Dimension numeric,
Object_Table_Bounds_Limits Ntuple }
Object_Id provides a unique identification to each object in an object space. Object_Name is the geographical name of the object. Non_Spatial_Attributes column is used to specify the non-spatial attributes that the objects possess. Non-spatial attributes are mainly physical properties of an object rather than the topological characteristics. These physical characteristics include properties like color, texture etc. This is a set of pairs (Attribute_Name, Attribute_Value). Object is the actual set of MBRs that represent the geographical object. The object in the figure is described using MBRs with 0, 1 and 2 dimensions. Dimension gives the dimension of the object. Structure_Type tells the actual structure of the object, i.e., whether the object is a line, a string or an arc or in the case of two-dimensional objects upper triangular or lower triangular area, etc. Some of the predefined values are as follows: 0 - one-dimensional line string; 1 - Upper triangular area; 2 - Lower triangular area; 3 - Whole area; 4- Circular arc. 1, 2, and 3 are used to score over a disadvantage of the inability to store lines, which are not parallel to the axes. BLPoint and TRPoint are the ones, which actually store the lower left, and upper right coordinates of the MBRs. Metadata_Table defines the boundary limits of each object space and also stores the maximum dimension of the object space. Figure 2 gives a partial instant of a schema that contains the objects in figure 1.
Figure 2. Partial instance of the schema for the objects shown in figure 1
Object_Table_1
 Object_Table_2
 Metadata_Table_1
Advantages of the Proposed Method
- As evident, the storage of regular objects is straightforward. In case of irregular objects, the method only requires the object to be broken down to MBRs, although the effectiveness depends on the algorithm that is used for generating the MBRs and the purpose for which the method is adopted.
- Storing objects using MBRs is effective in the case of discontinuous objects, which are branchy in structure.
- This method is also effective in storing disjoint objects and objects, which are made up of multiple dimensions.
- The method allows the storage of any kind of curve and not only circular arc.
- The method is a generalized one to accommodate any type of GIS object represented as geometric primitives.
- The MBRs include the neighborhood region of the GIS objects, which always improves the efficiency of query processing in enclosure, containment, adjacency, and nearest-neighbor queries.
- Graphical reproduction of the GIS objects is very simple as shown in figure 3. The accuracy of the object may be improved by efficient MBR cover over the object.
- Since each MBR represents a portion of the GIS objects, methods can be devised to effect subjective treatment to the parts of the objects.
- Proven effective indexing techniques are already available for objects represented as MBRs.
- The proposed method is scalable to higher dimensions.
 Figure 3. Graphical reproduction of the 2-D object stored as MBRs
4. Comparisons and Conclusion
Qualitative comparisons of a few features and parameters between the proposed method and very popular commercial software were done on a set of fifty spatial objects. The objects shown in the figures of this paper are a few representative ones. Quantitative comparisons in terms of number of points required were also made. The results of the qualitative and quantitative comparisons with respect to the objects in figure 1 are tabulated in table 1. The contents of the tables are to be understood in conjunction with the commercial software representation which is given below:
Commercial software representation of object in figure 1a
MDSYS.SDO_GEOMETRY (2002, 8307, NULL,
MDSYS.SDO_ELEM_INFO_ARRAY(1,2,1,5,2,1,9,2,1,13,2,1,17,2,1,21,2,
1,25,2,1,29,2,1),
MDSYS.SDO_ORDINATE_ARRAY (2,4,4,5,2,4,5,2,5,4,10,6,7,5,8,8,
9,4,12,13,13,4,16,6,17,4,18,3,1,4,18,4))
Commercial software representation of object in figure 1b
Not Possible
Commercial software representation of object in figure 1c
(partial, another part cannot be stored)
MDSYS.SDO_GEOMETRY (2002, 8307, NULL,
MDSYS.SDO_ELEM_INFO_ARRAY (1, 3, 1, 11, 3, 1),
MDSYS.SDO_ORDINATE_ARRAY (0,4,5,4,4,0,1,0,0,4,1,1,1,3,4,3,4,1,1,1))
Table 1. Quantitative and Qualitative Comparison of Commercial Software and the proposed method

The fact that GIS objects are so rich in variety and the applications are domain specific has resulted in ad hoc procedures and methods for handling them. Attempts at providing a generic solution are few and wide apart. This paper has explored the weaknesses of the existing methods and used the findings to propose a wider solution that can accommodate a variety of spatial/GIS objects and applications. The validity and correctness of the solution provided was also verified qualitatively and quantitatively.
References
- F.Sagayaraj Francis, P.Thambidurai, and N.Sivakumar, “Performance enhancement in Geographical Information Systems by Efficient Indexing of Geometric Primitives represented as Sets”, 8th Annual International Conference, Map India 2005, Geomatics 2005, New Delhi, February 2005.
- F.Sagayaraj Francis, P.Thambidurai, and N.Sivakumar, “List and Set Based Directional and Topological Query Processing in Spatial Databases”, 11th International Conference on Advanced Computing, ADCOM 2003, Coimbatore.
- Hanan Samet, “Spatial Data structure”, Modern Database Systems: The Object Model, Interoperability, and Beyond, W. Kim, ed., Addison Wesley/ACM Press, Reading, MA, 1995, 361-385.
- Agnes Voisard and Benoit David, “A Database Perspective on Geospatial Data Modeling”, IEEE Transactions on Knowledge and Data Engineering, Vol 14, No 2, March/April 2002.
- Xuan Liu, Shashi Shekhar and Sanjay Chawla, “Object-Based Directional Query Processing in Spatial Databases”, IEEE Transactions in Knowledge and Data Engineering, Vol 15, No. 2, March/April 2003.
- L.Mohan and R.L.Kashyap, “An Object –Oriented Knowledge Representation for Spatial Information”, IEEE Transactions on Knowledge and Data Engineering, Vol 14, No 5, May 1988.
- D.Papadias, M.Egenhofer, and J.Sharma, “Hierarchical Reasoning about Directional Relations”, Proc. Fourth ACM Workshop Advances in Geographic Information Systems, pp. 105-112, 1996.
- Y.Theodoridis, E.Stefanakis, and T.K.Sellis, “Supporting Direction Relations in Spatial Database Systems”, Proc. Seventh Int’l Symp. Spatial Data Handling (SDH ’96), 1996.
- D.Papadias, Y.Theodoridis and T.K.Sellis, “The Retreival of Direction Relations Using R-Trees”, Proc. Fifth Conf Database and Expert Systems Applications (DEXA), 1994.
- D.Papadias and Y.Theodoridis, “Range Queries Involving Spatial Relations: A Performance Analysis”, Proc. Second Conf. Spatial Information Theory (COSIT), 1995.
- White D.A., Jain R., “Similarity indexing with the SS-tree”, Proceedings of the 12th International Conference on Data Engineering, New Orleans, LA, 1996, pp. 516-523
- Katayama N., Satoh S., “The SR-tree: An Index Structure for High-Dimensional Nearest Neighbor Queries”, Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 517-542, May 1997
- J. T. Robinson, “The k-d-B-Tree: A Search Structure for Large Multidimensional Dynamic Indexes”, Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 10-18, April 1981
- Guttman A., “R-trees: A Dynamic Index Structure for Spatial Searching”, Proc. ACM SIGMOD Int. Conf. on Management of Data, Boston, MA, 1984, pp. 47-57
- DB2 Spatial Reference Manual
- Oracle 9i Spatial Reference Manual
- Antoni Moore, et al. “The Storage and Representation of Polygon Boundaries Using Circles”, The 15th Annual Colloquium of the Spatial Information Research Center, University of Otago, Dunedin, New Zealand, Dec 2003.
- Antoni Moore, “The Circle Tree - a hierarchical structure for efficient storage, access and multi-scale representation of spatial data”, The 14th Annual Colloquium of the Spatial Information Research Center, University of Otago, Dunedin, New Zealand, Dec 2002.
|
|
|