Data Management - The Evolution of Data
Large-scale Multidimensional Coverage Databases
RasDaMan as an example for multidimensional database services
Conceptual Model
The conceptual model of RasDaMan centers around the notion of an n-D array (in the programming
language sense) which can be of any dimension, spatial extent, and array cell type. Following
the relational database paradigm, RasDaMan also supports sets of arrays. Hence, a RasDaMan database
can be conceived as a set of tables where each table contains a single array-valued attribute, augmented
with an OID system attribute.
Arrays can be built upon any valid C/C++ type, be it atomic or composed ("struct"), based on
ODMG's (Cattell 1997) type definition language. Arrays are defined through a template marray
which is instantiated with the array base type b and the array extent (spatial domain) d, specified by
the lower and upper bound for each dimension. Thus, an unbounded colour digital orthophoto (DOP) can be defined by
typedef marray< struct{ char red, green, blue; }, [ *:*, *:* ] > RGBOrthoImg;
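To make the roles of base type and spatial domain more concrete, the following is a minimal Python sketch of such a type description. The class `MArrayType`, its fields, and the use of `None` for the `*` marker are hypothetical illustrations and not part of RasDaMan's type system or API.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Bounds of one dimension; None stands in for the "*" (unbounded) marker.
Bound = Tuple[Optional[int], Optional[int]]

@dataclass(frozen=True)
class MArrayType:
    """Hypothetical stand-in for an instantiated marray template."""
    name: str
    cell_fields: Tuple[str, ...]   # names of the struct components of a cell
    domain: Tuple[Bound, ...]      # (lower, upper) bound per dimension

    @property
    def dimension(self) -> int:
        return len(self.domain)

    def admits(self, point: Tuple[int, ...]) -> bool:
        """True if the point lies within the (possibly open) bounds."""
        if len(point) != self.dimension:
            return False
        for x, (lo, hi) in zip(point, self.domain):
            if lo is not None and x < lo:
                return False
            if hi is not None and x > hi:
                return False
        return True

# Counterpart of RGBOrthoImg: 2-D, three colour components, open on all sides.
rgb_ortho_img = MArrayType("RGBOrthoImg", ("red", "green", "blue"),
                           ((None, None), (None, None)))
# A bounded variant for contrast (extent chosen arbitrarily).
bounded_img = MArrayType("BoundedImg", ("red", "green", "blue"),
                         ((0, 999), (0, 999)))
```

In this sketch the unbounded type accepts any 2-D coordinate, whereas the bounded variant rejects points outside its declared extent.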
Array Retrieval
Like SQL, a RasQL query returns a set of items (in this case, MDD objects).
Trimming produces rectangular cut-outs, specified through the corner coordinates.
Example 1: “A cut-out between (1000,1000) and (2000,2000) from all ortho images”:
SELECT OrthoColl[1000:2000,1000:2000]
FROM OrthoColl
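The effect of trimming on a collection can be imitated in a few lines of Python. This is only a sketch on nested lists; the helper names `trim` and `select_trim`, the explicit `origin` argument, and the tiny test images are assumptions made for illustration and say nothing about how the server evaluates the query.

```python
def trim(mdd, origin, lo, hi):
    """Return the cut-out of a 2-D array between corners lo and hi (inclusive).

    mdd    -- list of rows, each a list of cells
    origin -- coordinates of mdd[0][0], since lower bounds need not be 0
    """
    (x0, y0), (x1, y1) = lo, hi
    ox, oy = origin
    rows = mdd[x0 - ox : x1 - ox + 1]
    return [row[y0 - oy : y1 - oy + 1] for row in rows]

def select_trim(collection, lo, hi):
    """Apply the same trim to every array of the collection."""
    return [trim(mdd, origin, lo, hi) for mdd, origin in collection]

# Two tiny "images" whose cells simply record their own coordinates.
img_a = ([[(x, y) for y in range(0, 6)] for x in range(0, 6)], (0, 0))
img_b = ([[(x, y) for y in range(2, 8)] for x in range(2, 8)], (2, 2))
result = select_trim([img_a, img_b], lo=(3, 3), hi=(4, 5))
```

As with the query, the result holds one cut-out per array in the collection, each covering the same coordinate range regardless of where the source array starts.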
For each operation available on the cell (i.e., pixel) type, a corresponding induced operation is
provided which simultaneously applies the base operation to all MDD cells. Both unary (e.g., record
access) and binary operations (e.g., masking and overlaying) can be induced.
Example 2: “Topographical map bit layer 3 overlaid with the (grayscale) ortho image”:
SELECT Ortho overlay bit( Map, 3 ) * 255c
FROM Map, Ortho
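The idea of inducing a cell operation over a whole array can be sketched in Python as follows. Two points are assumptions rather than statements about RasQL: bits are numbered from 0 at the least significant end, and overlaying takes the upper operand's cell wherever it is non-zero. All function names are hypothetical.

```python
def induce_unary(op, mdd):
    """Apply a cell-level unary operation to every cell of a 2-D array."""
    return [[op(cell) for cell in row] for row in mdd]

def induce_binary(op, left, right):
    """Apply a cell-level binary operation to corresponding cells."""
    if len(left) != len(right) or any(
            len(a) != len(b) for a, b in zip(left, right)):
        raise ValueError("operands must share the same spatial domain")
    return [[op(a, b) for a, b in zip(row_l, row_r)]
            for row_l, row_r in zip(left, right)]

def bit(cell, n):
    # Assumption: bits are numbered from 0 at the least significant end.
    return (cell >> n) & 1

def overlay_cell(bottom, top):
    # Assumption: the top cell wins wherever it is non-zero.
    return top if top != 0 else bottom

ortho = [[10, 20], [30, 40]]
topo_map = [[0b1000, 0b0000], [0b0111, 0b1111]]

# Extract bit layer 3, scale it to full intensity, and put it on top.
layer = induce_unary(lambda c: bit(c, 3) * 255, topo_map)
combined = induce_binary(overlay_cell, ortho, layer)
```

Under these assumptions, cells whose map bit is set become 255 in the result, while all other cells keep the ortho image's grey value.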
In general, MDD expressions can be used in the SELECT part of a query and, if the outermost
expression is of type Boolean, also in the WHERE part. See (Baumann 1999) for further query constructs
such as condensers (the MDD counterpart to aggregates).
The expressiveness of RasQL enables a wide range of signal processing, imaging, and statistical
operations up to, e.g., the Fourier Transform. The expressive power has been limited to non-recursive
operations, thereby guaranteeing termination of any well-formed query.
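A condenser and its use in a Boolean WHERE condition can be pictured with the following Python sketch. The names `condense`, `some_cells`, and `select_where` are hypothetical and do not reproduce RasQL syntax; the sketch only illustrates folding all cells of an array into one scalar and filtering a collection with the outcome.

```python
def condense(op, mdd, initial):
    """Fold all cells of a 2-D array into one scalar with a binary operation."""
    acc = initial
    for row in mdd:
        for cell in row:
            acc = op(acc, cell)
    return acc

def some_cells(predicate, mdd):
    """Boolean condenser: true if the predicate holds for at least one cell."""
    return condense(lambda acc, c: acc or predicate(c), mdd, False)

def select_where(collection, condition):
    """Keep only the arrays for which the Boolean condition is true."""
    return [mdd for mdd in collection if condition(mdd)]

dark = [[1, 2], [3, 4]]
bright = [[1, 2], [3, 250]]

total = condense(lambda acc, c: acc + c, dark, 0)
hits = select_where([dark, bright],
                    lambda m: some_cells(lambda c: c > 200, m))
```

Since every condenser visits a finite number of cells exactly once, a sketch like this also terminates for any input, in line with the non-recursive design described above.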
Physical Array Storage
Raster objects are maintained in a standard relational database, based on the partitioning of an
MDD object into tiles (Furtado 1999). Aside from regular grids, any user or system generated partitioning
is possible (Fig. 1). A geo index is employed to quickly determine the tiles affected by a
query. Optionally, tiles are compressed with one of several lossless or lossy (wavelet)
algorithms; query results can be compressed for transfer to the client. Both tiling strategy and
compression serve as database tuning parameters.

Fig. 1: arbitrary 2-D tiling
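How a query box is matched against an arbitrary tiling can be sketched as follows. The tile layout, the tile ids, and the function names are invented for illustration, and a linear scan is swapped in for the index; an actual index would prune the search rather than test every tile.

```python
def intersects(box_a, box_b):
    """True if two boxes overlap; a box holds inclusive (lo, hi) per axis."""
    return all(a_lo <= b_hi and b_lo <= a_hi
               for (a_lo, a_hi), (b_lo, b_hi) in zip(box_a, box_b))

def affected_tiles(tiles, query_box):
    """Return the ids of all tiles overlapping the query box.

    A linear scan stands in for the index here.
    """
    return sorted(tile_id for tile_id, box in tiles.items()
                  if intersects(box, query_box))

# An irregular partitioning of the domain [0:99, 0:99] into four tiles.
tiles = {
    "t1": ((0, 59), (0, 29)),
    "t2": ((60, 99), (0, 29)),
    "t3": ((0, 39), (30, 99)),
    "t4": ((40, 99), (30, 99)),
}
needed = affected_tiles(tiles, ((50, 70), (10, 20)))
```

Only the tiles returned by such a lookup have to be fetched (and, if applicable, decompressed), which is why the choice of tiling influences query performance.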
Tiles and index are stored as BLOBs in a relational database which also holds the data dictionary
needed by RasDaMan’s dynamic type system. Adaptors are available for Oracle, IBM DB2, and IBM
Informix. A coupling with object-oriented O2 has been done earlier, showing the wide range of
DBMSs with which RasDaMan can interoperate.
Query Evaluation
Queries are parsed, optimised, and executed in the RasDaMan server. The parser receives the query
string and generates the operation tree. A number of optimisations are applied to the query tree prior to its
execution (Ritsch 1999). Of the 150 heuristic rewriting rules, 110 are
actually optimising while the other 40 serve to transform the query
into canonical form. All rules are based on the algebra.
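To give an impression of what an algebraic rewriting rule looks like, the following Python sketch pushes a trim operation below an induced binary operation. The rule is chosen for illustration only and is not claimed to be one of the rules cited above; it is valid because induced operations work cell by cell, so trimming first yields the same cells while touching less data. The tuple-based tree encoding is likewise an assumption.

```python
def push_trim_down(node):
    """Rewrite trim(binop(a, b), box) into binop(trim(a, box), trim(b, box)).

    Nodes are tuples: ("var", name), ("trim", child, box), or
    (binop_name, left, right) for an induced binary operation.
    """
    kind = node[0]
    if kind == "var":
        return node
    if kind == "trim":
        child, box = push_trim_down(node[1]), node[2]
        if child[0] not in ("var", "trim"):
            # The child is a binary operation: move the trim to its operands.
            op, left, right = child
            return (op,
                    push_trim_down(("trim", left, box)),
                    push_trim_down(("trim", right, box)))
        return ("trim", child, box)
    op, left, right = node
    return (op, push_trim_down(left), push_trim_down(right))

box = ((0, 9), (0, 9))
query = ("trim",
         ("add", ("var", "A"), ("mul", ("var", "B"), ("var", "C"))),
         box)
optimised = push_trim_down(query)
```

After rewriting, every trim sits directly above a variable, so each input array is cut down before any cell-wise arithmetic is performed.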
Execution of queries is parallelised. Right now, RasDaMan offers
inter-query parallelism: A dispatcher schedules requests into a pool of
server processes on a per-transaction basis. Current research work
involves intra-query parallelism, where the query tree is transparently
distributed across available CPUs or computers in the network (Hahn
2002). First performance results are promising, showing speed-up /
#CPU ratios of 95.5%.
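Inter-query parallelism on a per-transaction basis can be pictured with the following Python sketch. Threads from the standard library stand in for server processes, and `run_transaction` and `dispatch` are hypothetical names; the sketch does not reflect the actual dispatcher implementation.

```python
from concurrent.futures import ThreadPoolExecutor

def run_transaction(queries):
    """Execute all queries of one transaction in order, on one worker."""
    return [f"result of {q}" for q in queries]

def dispatch(transactions, pool_size=2):
    """Schedule whole transactions onto a fixed pool of workers.

    Queries inside a transaction stay sequential; only different
    transactions may run concurrently.
    """
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        # map() returns results in the order the transactions were submitted.
        return list(pool.map(run_transaction, transactions))

results = dispatch([["q1", "q2"], ["q3"], ["q4", "q5"]])
```

Intra-query parallelism differs in that a single query's operation tree would be split across workers, which this sketch deliberately does not attempt.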
For arrays larger than disk space, hierarchical storage management (HSM) support is being
developed (Reiner 2002).