Study on Data Integration and Sharing Standard and Specification System for Earth System Science. Juanle Wang and Jiulin Sun, Information Sharing Center for Earth System Science, Institute of Geographic Sciences and Natural Resources Research, CAS. 2010-05-27, Hong Kong
Outline: Standard and specification requirements; Data Sharing Progress in China; Basic principles of data sharing standards and specifications; System Architecture and contents; Implementation and application; Future work
Standard and specification requirements
From Geosciences to Earth System Science: Geosciences evolved into the Earth System Science stage in the 21st century (Liu Dongsheng, 2004).
Data feature analysis: research data collections; reference data collections; resource or community data collections
Research data: Research data collections are the products of one or more focused research projects and typically contain data that are subject to limited processing or curation. They may or may not conform to community standards. They may vary greatly in size but are intended to serve a specific group. Because their budgets are usually small, there may be no intention to preserve the collection beyond the end of a project. Example: the Fluxes Over Snow Surfaces project.
Reference data: Reference data collections are intended to serve large segments of the scientific and education community. In these circumstances, conformance to robust, well-established, and comprehensive standards is essential, and the selection of standards often has the effect of creating a universal standard. Budgets supporting reference collections are often large. Example: GenBank of the National Institutes of Health.
Resource or community data: Resource or community data collections serve a single science or engineering community. These digital collections often establish community-level standards, either by selecting from among preexisting standards or by bringing the community together to develop new standards where they are absent or inadequate. The budgets are intermediate in size and are generally provided through direct funding from agencies. Example: the DAACs of NASA, NOAA and USGS.
Requirement: Integrating and sharing these data within the Earth System Science (ESS) research community is a challenge for ESS development. ESS data belong to the research data category. Most of these geoscience data are distributed across different research agencies, groups, or even individual scientists' databases.
Requirement: Concretely, three main problems need to be studied at present. How can an ESS data sharing mechanism be established? How can multidisciplinary data be integrated and shared? How can data be made easy for users to find and access?
Data Sharing Progress in China
Scientific Data Sharing in China: Progress [Figure: timeline (axis: time) marked 1988, 2001.12, 2002.12 and 2004. WDCs in China: Meteorology, Seismology, Oceanography, Geology, Astronomy, Space Science, Geophysics, Glaciology and Geocryology, Renewable Resources and Environment. Data sharing projects: Meteorology, Agriculture, Forestry, Water resources, Seismology, Mapping and Survey, Earth Science (DSNESS), Sustainable Development, Rural Techniques, Oceanography, Land & Resources, Medicine and Health, Energy, Environment. National Science and Technology Infrastructure.]
WDC system of ICSU
Scientific Data Sharing Program (SDSP) in China: SDSP was launched by MOST at the beginning of 2002. Nine pilot projects were launched in the first stage. According to the plan, there will be 40 national data centers covering 300 master databases in 6 fields by the end of 2020.
Framework of China SDSP [Diagram, top to bottom. Portal: Gateway to China Scientific Data Sharing Program. Fields/Disciplines: Resource and Environment; Agriculture; Population and Health; Basic and Frontier Sciences; Engineering and Technology; Regional Development. Data Centers / Networks: Meteorological Scientific Data Center; Agricultural Scientific Data Center; Rural Development Sci Data Center; Population Control Sci Data Center; Basic Medicine Scientific Data Center; Earth System Scientific Data sharing Net; Basic Sci Data Center. Master Database: about 300 master databases in 40 data centers.]
National Science & Technology Infrastructure (NSTI): Two years after SDSP, NSTI was launched in 2004 with support from MOST, the Ministry of Finance of China, the Ministry of Education, and the State Development and Reform Commission. Data sharing is the core of NSTI. SDSP is an important part of NSTI. Data sharing portal: http://www.escience.gov.cn
Data Sharing Network of Earth System Science (DSNESS): One of the 9 pilot projects in SDSP and NSTI. Long-term project, 2002-. Hosted by the Institute of Geographic Sciences and Natural Resources Research, CAS. Lead scientist: Prof. Sun Jiulin, IGSNRR. http://www.geodata.cn
Objectives: Integrate and share scientific data distributed among institutes, universities, individuals, government-funded research projects, and international organizations or data centers. Enhance and support Earth System Science research and science and technology innovation. At present, the main focus is the research field of land surface and human-land relationships.
Standard and specification concept model in DSNESS
Basic principles for data sharing standard and specification environment
Corresponding principle (1/4): Standards and specifications should be consistent both within their own system and with outside systems. First, all standards and specifications should avoid conflicts inside the Earth System Science data sharing system. At the same time, the standards and specifications of Earth System Science should be consistent with the systems outside it.
For example, Earth System Science data sharing is an important component of NSTI in China, so it should be consistent with the standards and specifications of the NSTI system. These standards and specifications should also be consistent with international standards, such as the ISO 19115 metadata standard.
Easy access principle (2/4): The final objective of Earth System Science data sharing is to provide an environment in which data are easy to collect, manage, and disseminate. Under this principle, Earth System Science data should first be presented to users through a complete and easy-to-understand data catalogue.
Reference model instruction principle (3/4): Geoscience data go through a complex process from collection to dissemination. A reference model should guide the data sharing standard and specification environment. For example, the ISO geographic information reference model is a good model for the Earth System Science standard and specification environment.
Software implementation principle (4/4): Data sharing needs information technology support. At present, distributed networks are a popular tool for data sharing. This principle helps software developers build data collection and management tools, which is a good way to implement the standards and specifications in the science community.
System Architecture and contents
The general architecture includes 4 standard and specification classes: mechanism and rules class; data management standard and specification class; platform development specification class; data service specification class.
Mechanism and rules class (1/4): Includes the data sharing constitution, platform implementation measures, platform operational management rules, and data sharing rules and guides. These rules are consistent with national data sharing laws and rules.
The data sharing constitution is the core mechanism of DSNESS. [Diagram labels: Data sharing Union; platform operational management rules; universal members; core members]
Data management standard and specification class (2/4): Includes the metadata standard, metadata editing specification, data document specification, data backup specification, international data collection and exchange specification, data quality control specification, database design specification for vector, raster and attribute data, etc.
The core metadata standard of DSNESS includes 188 metadata elements, of which 22 are core elements.
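A minimal sketch of how a core-element check might be automated, assuming a metadata record is a plain mapping from element name to value. The element names below are hypothetical placeholders, not the actual DSNESS core elements, and only five are shown rather than all 22.

```python
# Hypothetical core element names (assumption). The DSNESS standard defines
# 22 core elements out of 188, but the slides do not list them.
CORE_ELEMENTS = ("title", "abstract", "contact", "temporal_extent", "spatial_extent")


def missing_core_elements(record):
    """Return the core elements that are absent or empty in a metadata record."""
    missing = []
    for name in CORE_ELEMENTS:
        value = record.get(name)
        # Treat both a missing key and a blank value as "not filled in".
        if value is None or str(value).strip() == "":
            missing.append(name)
    return missing


# Illustrative record only; the values are made up for this example.
record = {
    "title": "Example land cover dataset",
    "abstract": "Illustrative record only.",
    "contact": "",
    "spatial_extent": "73E-135E, 18N-54N",
}
print(missing_core_elements(record))  # prints ['contact', 'temporal_extent']
```

A check of this kind could run when a provider submits metadata, so that records lacking core elements are returned before they reach the catalogue.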
Platform development specification class (3/4): Includes the data classification specification, software development and coding specification, software interface specification, etc.
Data Catalogue: Ecosystem background data; Social and economic development data; Coastal and delta resources and environment data; Frozen earth data; Arctic and Antarctic scientific expedition data; Typical regional thematic data (e.g. Tibetan Plateau, Loess Plateau, mountain areas, Yellow River Delta, Yangtze River Delta); Earth observation system data and data products; Solar-Earth environment data
Data service specification class (4/4): Includes the user service specification and data service guides. These specifications aim to ensure high-quality user services for data sharing, covering both online and offline service.
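As a rough illustration of the easy access principle, the top-level catalogue can be held as a simple list and searched by keyword. This is only a sketch: the function name `find_categories` is hypothetical, the category names are lightly normalized from the slide, and the real platform's catalogue structure is not described here.

```python
# Top-level catalogue categories, taken from the slide (wording normalized).
CATALOGUE = [
    "Ecosystem background data",
    "Social and economic development data",
    "Coastal and delta resources and environment data",
    "Frozen earth data",
    "Arctic and Antarctic scientific expedition data",
    "Typical regional thematic data",
    "Earth observation system data and data products",
    "Solar-Earth environment data",
]


def find_categories(keyword):
    """Return the catalogue categories whose name contains the keyword."""
    needle = keyword.casefold()  # case-insensitive match
    return [name for name in CATALOGUE if needle in name.casefold()]


for name in find_categories("environment"):
    print(name)
```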
Implementation and application
Geodata Sharing Union: more than 30 partners. [Diagram labels: Mongolian Academy of Sciences; 14 institutes in CAS; Arctic and Antarctic Research Center; 5 WDCs in China; 10 universities in China; Himalayan area]
Network Platform [Diagram labels: Search, Browse, Download; Data Submit; Management; Data Users; Data Providers; Web Portal; Background; Data Integrate; Metadata; Data Management System; Check; Publication; Requirement; Content Service; Data Manager]
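The platform diagram appears to describe a sequence in which providers submit data, a data manager checks it, and it is then published for users. The sketch below models that sequence as a small state machine. It is an interpretation of the diagram labels, not the platform's actual implementation: the `Status` names, the `REJECTED` state, and the `advance` function are all assumptions made for illustration.

```python
from enum import Enum


class Status(Enum):
    SUBMITTED = "submitted"   # data provider has submitted data and metadata
    CHECKED = "checked"       # data manager has checked the submission
    PUBLISHED = "published"   # dataset is visible to data users
    REJECTED = "rejected"     # assumed outcome of a failed check


# Allowed transitions (assumption based on the submit, check, publication labels).
TRANSITIONS = {
    Status.SUBMITTED: {Status.CHECKED, Status.REJECTED},
    Status.CHECKED: {Status.PUBLISHED},
    Status.PUBLISHED: set(),
    Status.REJECTED: set(),
}


def advance(current, target):
    """Move a dataset to the target status, or raise if the step is not allowed."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target


status = advance(Status.SUBMITTED, Status.CHECKED)
status = advance(status, Status.PUBLISHED)
print(status.value)  # prints published
```

Making the check step mandatory in the transition table means a dataset cannot be published without review, which is one way to enforce a data quality control specification in software.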
Data Resources in DSNESS: Under this standard and specification environment, more than 18 TB of data resources have been collected and shared in this network. The majority of the integrated data relate to geographic science, resources science, environmental science, ecological science, social science, and remote sensing.
www.geodata.cn Data Sharing Network of Earth System Science
Data services: By the end of 2009, almost 46,000 users had registered on the platform, and 24.58 TB of data had been downloaded by the public.
Future work
Earth System Science data sharing is a long-term process, and its standard and specification environment should be revised and completed continually.
In the DOWN direction, the basic concept model of the standard and specification environment should be studied in depth. In the UP direction, more attention should be paid to concrete standards and specifications for the different stages of the data sharing life cycle, especially data product specifications, dataset fusion and assimilation specifications, dataset service specifications, etc.
Thanks! E-mail: wangjl@igsnrr.ac.cn Institute of Geographic Sciences and Natural Resources Research, CAS