Research on Metadata Management of the Resource and Environment Spatial Database
Yanrong Cao, Jiantao Bi, Hongqiao Wu, Jianbang He
Institute of Geographical Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, P.R. China

Yuxia Huang
Institute of Remote Sensing Application, CAS, Beijing 100101, P.R. China
The Resource and Environment Spatial Database (RESD) holds a large and rapidly growing volume of data, but sharing these data faces many problems. Metadata is becoming an important tool for managing and sharing them.
This paper discusses how to establish a metadata management system for the Resource and Environment Spatial Database. First, it develops the architecture of a metadata standard for the RESD and presents the design of the metadata framework as a Unified Modeling Language (UML) static structure diagram. Second, it provides a data dictionary with detailed element definitions, through which the metadata architecture is expressed. It then uses Extensible Markup Language (XML) to describe the metadata: the paper gives a metadata DTD (Document Type Definition) and discusses how to store the XML files and how to query records from the database. Finally, it addresses metadata management and the extensibility of the metadata. The system helps users query the RESD effectively and find resources quickly.

1 Foreword

The RESD brings together databases on nature, society, the economy, population and other subjects, along with the environment databases. Not only is the data volume huge, but the data are also spread across different categories of hardware platform and, being supported by different software, differ in structure and content. How to produce, manage, organize and share these data is a problem in urgent need of a solution. Data producers urgently need an effective method for managing and maintaining data. Data users likewise need prompt, secure, effective and comprehensive service from producers, so that they can quickly and accurately find, access, obtain and use the data they need within the massive volume of resource and environment data. However, several problems still stand in the way of sharing this information over the network:
It is generally held that research on the role of metadata in sharing spatial data warehouses must start from three aspects: 1) the metadata standard (profile); 2) the metadata management system (Spatial Metadata Management System, SMMS); 3) the information services provided on the basis of the metadata. This paper focuses on the first two, that is, on the metadata management side. With reference to the final submitted version of the ISO/TC211 geographic information metadata standard, and in light of the actual requirements of this project, the research group (Research Group on Geographic Information Sharing, RGIS) developed the project's spatial metadata standard. First, the metadata structure was designed with the Unified Modeling Language (UML) and supplemented with a corresponding metadata data dictionary, yielding a standard consistent with ISO/TC211. Second, on the software side, we used XML and industry-standard Java development tools to build a metadata management system that runs on the Internet.

2 The design of metadata standard

Geographic information metadata standards have long been a research focus of the international geographic information community. Organizations such as the United States Federal Geographic Data Committee (FGDC), the European geographic information standardization committee (CEN/TC287) and the International Organization for Standardization's geographic information committee (ISO/TC211) have all devoted sustained effort to them. At present, the ISO/TC211 standard has gone through nearly ten revisions and has entered the committee draft stage.
China has also invested considerable effort in standards research, and the national fundamental geographic information center has completed a national geographic information standard. The metadata standard for the national resource and environment database was therefore developed on the premise that the international and national standards would soon be issued. With the completeness, accuracy, structure and compatibility of the standard in mind, we consulted and followed the latest international standards, national standards and functional standards aligned with the international ones: for example, the final draft put forward by ISO/TC211 in August 2001, Final Text of CD 19115 Geographic Information - Metadata, and the metadata standard for fundamental geographic information digital products put forward in 2001 by the national fundamental geographic information center. Moreover, the RESD comprises data that are primarily spatial (vector, imagery, raster, etc.), data that are primarily attribute data but carry spatial location information, and non-spatial data collections such as book, document and archive catalogues and databases of laws and regulations. We worked out a standard suited to these concrete requirements.

2.1 Structure design of metadata

The standard comprises seven metadata classes and three classes of public data.
The seven classes are:
Identification, basic information about the data set;
Data Quality, an overall assessment of the quality of the data;
Spatial Data Organization, the way the spatial information of the data set is organized;
Spatial Reference, a description of the coordinate reference frame and encoding system of the data set;
Entity and Attribute, detailed information about the content of the data set;
Distribution, information on publishing and obtaining the data;
Metadata Reference, the current status of the metadata and the department responsible for it.

The three public data classes are:
Citation, concise information for citing and referencing a data set;
Date, the date and time of a relevant event;
Contact, the individuals or organizations referred to in the main subsets.

Public data are not used on their own; they are objects referenced by other elements. Because the standard conforms to the relevant international and national standards, and for reasons of space, the detailed content of the metadata is not repeated here. The major classes of metadata are linked by a complicated logical organization. Analyzed in object-oriented terms, the relationships include inheritance (single and multiple), composition, aggregation and association. The public data act as referenced objects and are frequently referenced by the other classes. The commonly used two-dimensional table alone can hardly express these relationships clearly, so a diagram is needed.

Figure 1. Definition of the class MD_metadata and its containment relationships with the other metadata classes

The Unified Modeling Language is a graphical notation for object-oriented structural design, adopted as a standard by the Object Management Group (OMG) and other organizations.
The series of geographic information standards being drawn up by ISO/TC211, including the final draft currently put forward, also uses UML throughout as its modeling language. We therefore use UML static structure diagrams to express the logical organization of, and the relationships among, the metadata classes. Classes and their attributes are described in the data dictionary. Together these form the complete metadata standard, which is consistent with the international standard. The relationships between classes in the metadata standard include generalization (corresponding to object-oriented inheritance), aggregation, composition and association. In Figure 1, MD_metadata aggregates MD_quality, MD_reference, MD_content, MD_spatialreference and MD_distribution, among others; this relationship is a one-way aggregation association. The numbers in the figure denote multiplicity. For example, 1..* between MD_metadata and MD_identification shows that a metadata record has one or more identification information entries. In addition, and importantly, the metadata can be extended: UML stereotypes allow new model elements to be constructed on top of the model elements already defined.

2.2 Data dictionary

The data dictionary describes the characteristics of the metadata designed in UML. Its units are the subset, the entity and the element, and it describes the structural relationships and attributes of entities and elements within this architecture. Each entry has the following attributes: name, short name, definition, obligation/condition, maximum occurrence, data type and domain. The data dictionary and the UML static structure diagram together define the complete metadata, with a clear logical organization that is easy to understand and easy to implement in a program.
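As an illustration of how a data dictionary entry with these attributes might be represented in a program, the following is a minimal Python sketch, not the system's actual implementation. The class name `DictionaryEntry`, the function `check_occurrences`, the obligation codes "M" and "O", the short names, and the optional, at-most-once setting for Data Quality are all assumptions made for the example; only the attribute list and the 1..* multiplicity of identification information come from the text above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class DictionaryEntry:
    """One data dictionary row, using the attributes listed in Section 2.2."""
    name: str
    short_name: str
    definition: str
    obligation: str                 # assumed codes: "M" mandatory, "O" optional
    max_occurrence: Optional[int]   # None stands for "unbounded" (the * in 1..*)
    data_type: str
    domain: str


def check_occurrences(entries: List[DictionaryEntry],
                      counts: Dict[str, int]) -> List[str]:
    """Compare how often each element occurs in a record with its dictionary entry."""
    problems = []
    for entry in entries:
        n = counts.get(entry.short_name, 0)
        if entry.obligation == "M" and n == 0:
            problems.append(f"{entry.short_name}: mandatory element is missing")
        if entry.max_occurrence is not None and n > entry.max_occurrence:
            problems.append(
                f"{entry.short_name}: {n} occurrences exceed maximum {entry.max_occurrence}"
            )
    return problems


# Hypothetical entries: identification is 1..* as in Figure 1; the data quality
# settings are illustrative assumptions only.
ENTRIES = [
    DictionaryEntry("Identification", "idinfo",
                    "Basic information about the data set",
                    "M", None, "Class", "MD_identification"),
    DictionaryEntry("Data Quality", "dataqual",
                    "Overall data quality information",
                    "O", 1, "Class", "MD_quality"),
]

if __name__ == "__main__":
    # A record with two identification sections and no data quality section passes.
    print(check_occurrences(ENTRIES, {"idinfo": 2}))
```

Keeping obligation and maximum occurrence as data rather than hard-coding them means that an extension of the metadata only adds dictionary entries and leaves the checking logic unchanged.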
3 Metadata Management

3.1 What's the problem?

The metadata structure described in UML, together with the data dictionary, tells users clearly how to describe a database with metadata. All data sets can be described with the metadata. But managing the metadata database, and helping users obtain data more effectively, requires a good metadata management system. The objective of metadata management is to acquire, check, store, process and apply geographic metadata. Geographic information sharing is built on the network, so the management problem is also a network problem: it involves the web browser, the web server and the metadata server, and relies on a series of requests and responses between these software components. Many metadata systems have been built both internationally and in China. For example, FGDC recommends I-Site's Freeware bundle as the software package for building a spatial information clearinghouse. The National Geographic Information Exchange Center (NGIEC) operates such a web site (www.nsii.gov.cn), where users can query, through a browser, the spatial information metadata published by each node. Other well-known systems include the commercial MetaStar series developed by Blue Angel Technologies and the Metadata DOCUMENT of ARC/INFO.
An analysis of these metadata management systems shows that, in terms of major functions, they all have the following modules:
The metadata browser: responsible for browsing spatial data and navigating the database; it provides the query interface and data preview functions.
The metadata editor: implements the editing functions for metadata, such as creating, inserting, deleting and updating.
The metadata server: manages the metadata database and publishes metadata on the network.

In a real application, beyond implementing the functions above, the following problems must also be considered carefully when building the RESD management system:
3.2.1 Mapping the UML static diagram using XML

Given the requirements above, the currently popular XML technology is a natural choice. Extensible Markup Language (XML) is a Web markup language that followed HTML. It gives users a flexible extension mechanism, so that resources with different content can be represented with well-formed, user-defined markup elements. In essence, XML is a meta-language, that is, a language used to describe other languages. It has the following characteristics: it is self-describing, with user-definable tags, and developers can define a Document Type Definition (DTD) to suit their own requirements; it is semi-structured and so well suited to describing hierarchical data; it is highly extensible; and it is platform independent, which suits transmission over the network. For developing the metadata management system, therefore, XML can map all the kinds of relationships in the metadata defined by UML, allows the metadata to be extended, and satisfies the requirements of network operation.

3.2.2 Using XML to describe metadata

Using XML, we can describe the RESD metadata. The concrete work is to define the DTD of the metadata standard. Part of the DTD is given below (mainly the identification information section, as an example; other parts are omitted with "...").

    <!-- RESD metadata's DTD -->
    <!ELEMENT metadata (idinfo, dataqual?, continfo?, distrib?, spatrep?, refsystem?, metainfo)>
    <!-- Identification, Data Quality, Content, Distribution, Spatial Data Organization, Spatial Reference, Metadata Reference -->
    <!-- Identification -->
    <!ELEMENT idinfo (cn_name, en_name, date, version, purpose, status, geobox … …)>
    <!ELEMENT chinesename (#PCDATA)>
    <!ELEMENT englishname (#PCDATA)>
    <!-- Date described in timeinfo -->
    <!ELEMENT version (#PCDATA)>
    <!ELEMENT purpose (#PCDATA)>
    <!ELEMENT status EMPTY>
    <!ATTLIST status
        progress (Complete | In work | Planned) "Planned"
        update (Continually | Daily | Weekly | Monthly | Annually | Unknown | As needed | Irregular | None planned) "Unknown">
    <!ELEMENT geobox EMPTY>
    <!ATTLIST geobox
        westbc CDATA #REQUIRED
        eastbc CDATA #REQUIRED
        northbc CDATA #REQUIRED
        southbc CDATA #REQUIRED>

Once the DTD is defined, XML can express the metadata described in UML accurately and without error. Users who need to extend the metadata can, while observing the extensibility rules of the metadata standard, customize the DTD through the interface the system provides, thereby meeting the requirement for metadata extension.

3.2.3 Storage and query of XML metadata

Because metadata expressed in XML is a text file, storing it only in the file system makes metadata queries inefficient, and no true XML database is yet available. The following tactic can be used: store the XML files in binary large object (BLOB) fields of a database. Storage is then very simple, but queries lose some efficiency, since only keyword-based full-text search can be used. The mainstream DBMSs, SQL Server 2000 and Oracle 9i, both support full-text search (full-text index) and can build an index on BLOB fields.

3.3 Metadata management

The RESD metadata server is centralized at the Institute of Geographical Sciences and Natural Resources Research, Chinese Academy of Sciences, where the data are also concentrated. The metadata management system as a whole therefore does not adopt the spatial data clearinghouse approach that is common internationally, but a centralized management model.

Figure 2. Metadata management system of the Resource and Environment Spatial Database

3.4 Conclusion

The metadata management system is built to enable sharing of the RESD.
It uses UML to design the metadata standard so as to remain consistent with ISO/TC211. For storage and transfer it uses XML, which can be extended and customized. Together these provide strong support for information sharing. However, some work remains to be done: