Quadtree access method with Variable Threshold: Application to GIS



B. Bessaa
Image Processing and Radiation Laboratory
Electronic and Computer Science Faculty
USTHB - Algiers

A. Belhadj-Aissa
Image Processing and Radiation Laboratory
Electronic and Computer Science Faculty
USTHB - Algiers


Abstract:
The spatial queries in a geographical information system (GIS) uses the space attributes related to the geometry of the geographical objects. In this type of request, research is done according to the location of the object in space. In order to efficiently process these requests, one needs specific access method to quickly identify the objects according to their location. Among these methods one finds, the grid file, the R-trees, the quadtrees. In this paper, we propose an alternative of the quadtree with a variable threshold associated to the number of objects per node in order to minimize the depth of the tree in case of a bad distribution of the objects and a modification of the point query algorithm in case of a topological representation model. In Q-tree the initial rectangular space is decomposed in four quadrants along its axes of symmetry vertical and horizontal. Then each one of these quadrants is decomposed in same manner until the number of mbb (minimal bounding box) of objects in each quadrant is below a given threshold s. The nodes of the tree can be an interior node corresponding to a quadrant, or a leaf containing a whole of with the more s couples (mbb, oid: object identifier). The alternative that we propose, in this paper is opposed to the fast increase depth in case of a high density of objects in a given zone. Instead of having a fixed threshold of s objects by quadrant, we propose to vary this threshold according to the level of depth of the tree. Initially one starts with a threshold s0, then the threshold is calculated according to the level of depth nv with the depth of the root which is supposed to be null.

In the GIS the representation and the density of the space objects depend on the one hand on the scale and/or the resolution of the initial image and on the other hand model of representation of the space data to knowing the models spaghetti, network and topological. For the topological model, we propose an optimization of the point query algorithm.

. 1- Introduction
In SIGs, the research of the spatial attributes of the geographical objects on various scales of resolution is an essential stage in the implementation of the applications. Contrary to the nonspace requests which use only traditional attributes, the spatial queries use spatial attributes describing the geometry of a geographical object. To accelerate the identification of such object in space, the use of a method of space indexing is essential. Various methods are proposed to achieve this operation [ GAD98 ]. Among these methods we cite: Grid file, R-Trees, Q-Trees...

In this paper we are interested to the Q-Tree index and the optimization of the research of objects in a topological space.

2- The Q-Tree
The variant of Q-tree which we propose here [SCH96],[PEL96] is based on a regular decomposition of the universe into four subspaces. Initial rectangular space is decomposed into four subspaces along its axes of symmetry vertical and horizontal. The decomposition into subspaces is usually continued until the number of mbb of objects in each quadrant is below a given threshold s. A mbb can intersect several close quadrants. The nodes of the tree can be an interior node or a leaf node. Each interior node has four descendants, each corresponding to a quadrant, a leaf node contains a whole of with the more s couples (mbb,oid). Figure 1 shows the quadrants obtained on a collection C of numbered objects from 1 to 11 for a threshold s=3 and the corresponding tree.


Figure 1 : Space decomposition and the corresponding tree.


The insertion of the object number 12 creates an overtaking of the threshold on the quadrant referred by node E, which requires a decomposition of this last in four quadrants. Figure 2 shows the initial space decomposition after this insertion and the corresponding tree. The insertion of object 12 increases the depth of the tree on the level of the node E


Figure 2 : Space decomposition and the corresponding tree after the insertion.


3- Variation of the threshold
Quadtrees are not necessarily balanced; subtrees corresponding to densely populated regions may be deeper than others. The alternative that we propose here opposes to the fast increase depth in the case of a high density of objects in a given region. Instead of having a fixed threshold of s objects by quadrant, we propose to vary this threshold according to the level of depth of the tree. Initially one starts with a threshold s0, then the threshold is calculated according to the level of depth nv, for example s = s0 + f(nv).

The level of depth of the root is supposed to be null. The function must be an increasing function on the interval [0,N], where N is the maximum depth depending on the size of initial space. For our running example we chose a linear function: f(nv)=nv-1.

For s0=3, the threshold of level 1 remains equal to 3, but levels 2 and 3 will have as thresholds 4 and 5 respectively. For the preceding collection of objects from 1 to 11, the situation remains the same where the number of objects in the various quadrants of level 1 does not exceed 3 and those of level 2 does not exceed 4. In opposite of the figure 2, the insertion of object 12 does not induce the decomposition of the quadrant E, because its threshold is 4, and the node E does not need more to be burst, so the depth of the tree remains the same one. Figure 3 shows space division after insertion and the corresponding tree.


Figure 3 : Space decomposition and the corresponding tree after the insertion with variable s.


Algorithm of insertion
The algorithm of insertion is based on the same principle [SCH96], except that in this case we must calculate the threshold of the leaf concerned by the insertion. If the leaf quadrant contains a number of objects lower than the threshold s, the element is added with the leaf, if not the leaf quadrant is decomposed into four leaves. These elements are distributed in the new leaves. The algorithm is as follows:

Begin Insertion-Q(N,E,nv) 
// N : initially the root node
// E : entry (mbb,oid) to insert
// nv : level of depth initially  null
// Liste_E(N): set of couples of node N

IF N is a leaf
THEN 
    Liste_E(N) = Liste_E(N)?E
    s=s0+f(nv) 
    IF card(N)>s
    THEN
    Create 4 quadrants Q1, Q2, Q3, Q4  child(N)
    FOR  All Object O IN Liste_E   DO
             FOR Q IN { Q1, Q2, Q3, Q4} DO
             IF O.mbb overlaps Q
             THEN Liste_E(Q) ) = Liste_E(Q)?O ENDIF
             ENDDO
     ENDDO
ELSE // increase depth
     nv = nv + 1 
     FOR All Q child(N) DO
     IF E.mbb overlaps Q 
     THEN Insertion-Q(Q,E,nv) ENDIF
     ENDDO
ENDIF
4- Algorithm of research
In GIS, the number of space objects on a geographical space depends on the scale and the resolution of the initial image. The performance of research depends on the number of these objects also depends on the model of representation of the space data. Indeed according to whether the model allows the overlapping of the objects, such as the model spaghetti, or does not allow it such as the topological model [LAU93], [SCH96], the algorithm can be optimized.

In the topological model representation, the graph associated with the geometry with the layer is planar: plan partitioned in faces (polygons) adjacent and related arcs, so a point cannot belong to two distinct objects. One basing itself on this characteristic, we propose to modify the point query algorithm in order reduce the time of research, indeed when the path is followed from the tree root to the leaf that contains the point argument, one is not obliged to achieve the sequential test on all the objects (mbb, oid), but one stops research with the first found object. The algorithm is as follow:

Begin Pointe(N,P) 
// N: initially the root node  -   P: point  of space 
IF N is not leaf 
THEN
         WHILE NOT END OF LISTE(mbb,oid)
         DO IF POINT-IN-MBB(P,mbb)
               THEN  //save  then oid and break the research
                          S = oid
		EndOf LISTE(mbb,oid)
       	   ENDIFI
		Next of LISTE(mbb,oid)
         ENDDO
ELSE
	J = SUB-QUAD(P,N)
	S = POINTE(FILS(N,J),P)
ENDIF
6- Conclusion
In this paper we proposed an alternative of the quadtree method with a variable threshold according to the depth of the tree. The fact of varying the threshold according to the level of depth of the tree makes it possible to increase the number of objects in a node of the tree, and thus minimizes the quadrant division concerned. We have also proposed to modify the point query algorithm in case of topological representation of data. The solution that we proposed is a compromise between a raised threshold, which minimizes the access time to the secondary memory but which raises time CPU of the test of membership of the objects, and a weak threshold witch minimizes time CPU but increases the number of access to the secondary memory which takes much more time.

References
  • [GAD98] V. Gäde, O. Günther. Multidimensional Access Methods, ACM Computing Surveys, 1998
  • [LAU93] R. Laurini, D. Thompson. Fundamentals of Spatial Information Systems, The APIC series 1993.
  • [PEL96] J.-P.Peloux. Bases de Données Spatiales : Modélisation et Indexation. Thèse de doctorat, CNAM Paris, 1996.
  • [SCH96] M. Scholl, A Voisard, J.-P. Peloux, L. Raynal, P. Rigaux, SGBD Géographiques: spécificités, International Thomson Publishing France, Paris, 1996.