Automated Geodata Analysis and Metadata Generation Dirk Balfanz* ZGDV - Computer Graphics Center, Darmstadt, Germany

ABSTRACT

With the shift from a production society to an information society, a parallel development has taken place in the processing of geo information. Today, the focus is often more on intelligent and complex use and analysis of existing data than on data acquisition. Users must now find appropriate data, as well as appropriate analysis or mining methods, for their specific exploration goals. This paper first presents an integrated approach that uses metadata technology to guide users through data and method selection. Important prerequisites in the decision process are the user's correct understanding of geodata qualities and, to this end, the availability of metadata. Therefore, the core of the presented approach, i.e. metadata visualization and generation, is then described in detail. The visualization part aims to make the user aware of the goal-related geodata qualities. It consists of an automated semantic level-of-detail method, using abstraction hierarchies and linked visualization functions. The underlying metadata is provided via a repository-based generator, which creates descriptive metadata by analysis and interpretation of the original geodata. Finally, an outlook on the next steps in automated support for geodata mining is given.

Keywords: metadata, geodata, semantic LoD, metadata visualization, metadata generation

1. INTRODUCTION

Today's modern information acquisition systems, such as satellite-based earth observation or aerial photography, generate large amounts of raw data each day. In intelligent GIS (Geographic Information Systems), this data is reworked, enhanced with additional information, or transformed into digital maps on various topics.
Whereas in the early days of GIS the acquisition and handling of raw data for precisely defined purposes was a key task, work nowadays concentrates on complex use and analysis of existing data. Often, sophisticated methods from the areas of Data Mining or Knowledge Discovery are used to analyze data. Given a certain user-defined goal of exploration, typical problems for users are finding suitable geodata and using suitable analysis methods.

Finding suitable data, i.e. data retrieval, is supported by Metadata Information Systems (MIS) and Catalogue Systems (CS) [1], which allow users to define criteria for the kind of geodata they need. Examples are environmental MIS such as GCMD (Global Change Master Directory) [2] and the German UDK [3], or EOSDIS (Earth Observing System Data and Information System) [4] as a representative of MIS in the EO community.

But (metadata-driven) retrieval of geodata is only the first part. Data analysis is often a sophisticated task, for instance when using Data Mining methods or Knowledge Discovery processes. To obtain valid and expressive results, the use of these methods requires not only profound knowledge of the thematic geo domain, but also knowledge of the mining methods used. Our vision is to support the end user in this area with automated suggestion of suitable methods based upon metadata technology. The related approach provides suggestions of analysis methods based on the user's goals of exploration and on metadata describing the possible geodata [5]. Important steps towards this ultimate vision are repository-based systems for automatic analysis and interpretation of geodata in order to generate descriptive metadata, and a corresponding visualization of metadata, i.e. of geodata qualities.

*Dirk.Balfanz@zgdv.de; ZGDV (Zentrum für Graphische Datenverarbeitung) - Center of Computer Graphics, Rundeturmstrasse 6, D-64283 Darmstadt, Germany; http://www.zgdv.de

Visualization and Data Analysis 2002, Robert F. Erbacher, Philip C. Chen, Matti Gröhn, Jonathan C. Roberts, Craig M. Wittenbrink, Editors, Proceedings of SPIE Vol. 4665 (2002) © 2002 SPIE, 0277-786X/02/$15.00

Chapter 2 starts with the general approach of metadata-supported geodata analysis. The kernel of this approach is elaborated in chapters 3 and 4. Chapter 3 describes the repository-based automated geodata analysis aimed at metadata generation. Chapter 4 presents an automated semantic level-of-detail method, using abstraction hierarchies and linked representation functions. Chapter 5 highlights current and future work.

2. METADATA-SUPPORTED GEODATA ANALYSIS

The basic approach to metadata-supported geodata analysis corresponds to the understanding of Knowledge Discovery in Databases (KDD). In the literature, KDD and Data Mining are sometimes used synonymously. In this paper, KDD is understood as a process with several steps, of which Data Mining is only one. Possible processing steps are: understanding the domain, data collection / cleaning / enrichment, method selection, data mining, and evaluation of results [6]. These steps can also be understood as processing steps in geodata analysis.

A general order of the principal processing steps is given in Fig. 1. The user starts an exploration process by defining specific exploration goals. Data collection can be facilitated with the help of MIS, for instance; alternatively, the data is given by the task and the exploration goals are defined accordingly. The selection of analysis or mining methods then depends on the chosen exploration goals as well as on the specific qualities of the datasets. Based on the goal definition, a rule-based expert system can match the requirements of possible analysis methods with the goals on the one hand and the qualities of the chosen geodata on the other. Qualities of geodata are described via their metadata.

Fig. 1: General process steps in data exploration

After the mining or analysis methods have been determined and suggested, the exploration can be executed.
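
The rule-based matching of method requirements against exploration goals and geodata qualities can be sketched as follows. This is a minimal illustration, not the expert system of the paper: the rule format, the method names, the goal names, and the metadata element names (`geometry_type`, `object_count`, and so on) are all assumptions made for the example. Methods whose requirements cannot be checked because metadata is missing are reported separately, which is the situation that motivates metadata generation.

```python
from dataclasses import dataclass


@dataclass
class MethodRule:
    """Hypothetical rule: which goals a method serves and what it requires."""
    method: str
    goals: set        # exploration goals the method can serve
    requires: dict    # metadata element name -> predicate on its value


def suggest_methods(goal, metadata, rules):
    """Return (suggested, undecided).

    suggested: methods that serve the goal and whose requirements all hold.
    undecided: method -> list of metadata elements missing for a decision.
    """
    suggested, undecided = [], {}
    for rule in rules:
        if goal not in rule.goals:
            continue
        missing = [e for e in rule.requires if e not in metadata]
        if missing:
            undecided[rule.method] = missing
        elif all(check(metadata[e]) for e, check in rule.requires.items()):
            suggested.append(rule.method)
    return suggested, undecided


# Illustrative rule base (all names and thresholds are invented).
RULES = [
    MethodRule("clustering", {"find_groups"},
               {"geometry_type": lambda v: v == "point",
                "object_count": lambda v: v >= 100}),
    MethodRule("classification", {"predict_class"},
               {"has_class_labels": lambda v: v is True}),
    MethodRule("hot_spot_analysis", {"find_groups"},
               {"spatial_resolution_m": lambda v: v <= 50}),
]

metadata = {"geometry_type": "point", "object_count": 2500}
suggested, undecided = suggest_methods("find_groups", metadata, RULES)
print(suggested)
print(undecided)
```

In this run the hypothetical hot-spot method stays undecided because no resolution is recorded in the metadata, so a generation step would have to supply that element first.
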
Unless visualization is already an explicit part of these methods, result evaluation, including proper visualization of the outcome, is highly important. Beyond this, visual feedback on the estimated validity of the results is needed as well.

For any implementation of this general approach, the availability of suitable metadata is essential for evaluating the suitability of both the dataset and the individual analysis methods. In the domain of geodata, the availability of metadata is still inadequate [1]. Although metadata standardization has made good progress, in particular with the ISO standard 19115 ("Geographic information - Metadata") [7], most geodata, at least in Europe, is still described insufficiently. This lack of information becomes critical because the described workflow depends substantially on metadata.

To guarantee appropriate metadata, the three-step process is iterated as follows. After goal definition, a first pass analyzes the selected datasets in order to extract the required metadata. The required metadata elements result from the chosen examination goal, from already existing metadata sets, or from appropriate templates. Once the missing elements have been identified, generated, and possibly evaluated, a second pass can use the metadata to support the decision on the mining methods.

In the area of metadata generation, a large number of metadata editor systems are available today (overviews can be found at [9]). For some current GIS, partial solutions already exist for automatically extracting at least a small set of metadata from geodata. Examples are SMMS (Spatial Metadata Management System) from RTSe [8], which works together with GeoMedia (Intergraph), and solutions for ARC/INFO (ESRI). The latest version of ARC/INFO [10] has some embedded functionality to handle (proprietarily formatted) metadata and to extract some information automatically. An earlier version used an additional extension (ArcView Metadata Collector). However, current solutions basically retrieve a few fields directly from the related geodata (for instance, the geographic bounding coordinates) and leave the input of all other elements to the user. These implementations generally lack an appropriate visualization of metadata, i.e. a presentation of the related geodata qualities, and an appropriate framework that gives users automated help in generating metadata beyond mere extraction.

The next chapter discusses the approach of automated geodata analysis in order to generate descriptive metadata. Chapter 4 deals with aspects of metadata visualization.

3. METADATA GENERATION

The general workflow of metadata generation is the same as in the general approach given in Fig. 1. Thus, metadata generation starts with the "Goal Definition", which in this case is a metadata goal set. It declares which metadata elements are needed as the result of the generation. This definition can be inferred from the overarching geodata analysis goal, or it can be set by the user if the generation facility is not used in the context of data analysis. To infer the goal set, it must be known which mining methods shall or could be used in general, and which geodata qualities are crucial for them to work properly. Once these rules have been established, a rule-based expert system can provide the names of the metadata elements that are necessary to evaluate the different methods in relation to the exploration goals.

The step "Method Selection and Analysis" covers the actual metadata generation and is described in the following section. "Visualization and Evaluation" is dealt with in the next chapter.
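
Inferring the metadata goal set can be illustrated with a few lines of code. The sketch below assumes a hypothetical rule base that maps each candidate analysis method to the metadata elements it depends on; the method and element names are invented for the example. The goal set is the union of these requirements, and the elements still to be generated are those not already delivered by the preceding data retrieval.

```python
# Hypothetical rule base: analysis method -> metadata elements it depends on.
METHOD_REQUIREMENTS = {
    "clustering": {"geometry_type", "object_count", "spatial_extent"},
    "interpolation": {"spatial_extent", "spatial_resolution", "value_range"},
}


def metadata_goal_set(candidate_methods, requirements=METHOD_REQUIREMENTS):
    """Union of all metadata elements needed to evaluate the candidate methods."""
    goal = set()
    for method in candidate_methods:
        goal |= requirements[method]
    return goal


def missing_elements(goal_set, existing_metadata):
    """Elements of the goal set that are not yet available and must be generated."""
    return goal_set - set(existing_metadata)


# Metadata already known, e.g. from a catalogue entry (illustrative values).
existing = {"spatial_extent": (8.5, 49.8, 8.8, 50.0), "object_count": 2500}

goal = metadata_goal_set(["clustering", "interpolation"])
todo = missing_elements(goal, existing)
print(sorted(todo))
```

The second pass of the workflow would start once `todo` is empty, i.e. once every element of the goal set has been generated or entered.
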
When trying to extract or infer metadata from geodata in general, the following classification of metadata can be found:

  1. Metadata that is explicitly part of the geodata (e.g. the name of the geodata format).
  2. Metadata that is implicitly part of the geodata (for example, the spatial extent, which can be derived from the data coordinates) and may be subject to knowledge extraction methods.
  3. Metadata that is not contained in the geodata in any way (e.g. contact information of data providers).

The relative size of these categories varies widely depending on the examined geodata format. In general, compilation methods for metadata include a) key in, b) look-up, c) measured, d) computed and e) inferred [11, 12]. With regard to category 3 (metadata that cannot be extracted), it is obvious that automated generation / extraction of metadata from given geodata requires a complex mixture of all the above-mentioned compilation methods.

We therefore adopted a repository-based approach in combination with a graph-based optimization, as shown in Fig. 2. A repository contains a set of modules able to acquire and produce all metadata elements of the employed metadata model. These modules can use all the different input sources, such as extracting information from geodata, inferring from previously extracted information, using templates or, in the worst case, simply requesting user input. For each metadata element, there may be several alternative generation modules corresponding to the different compilation methods (e.g. key in, look-up, infer). Accordingly, there may be many ways to generate a specific element. The module registry contains descriptions of the qualities of all modules (input, output, execution time, etc.). The different alternatives for generating certain elements can be derived from this registry.
The result is a graph ("All Paths"), which represents all possible ways to generate metadata elements, declaring the registered modules used, their order of execution, and their respective input / output elements (basics of graph theory can be found in [13]). This complete set of alternatives can be pre-calculated and stays constant as long as no new modules are registered. Because of the preceding data retrieval process, some metadata elements of the required scope may already exist. Furthermore, some of the modules might not be executable with certain kinds of data. Therefore, the graph is reduced at runtime so that it contains only alternatives for the missing elements that are actually executable ("Executable Paths").
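
The interplay of module registry, alternatives, and runtime reduction can be made concrete with a small sketch. It is an illustration under stated assumptions, not the paper's implementation: the `Module` fields, the module names, the format filter, and the timing values are hypothetical, and a simple recursive search stands in for the graph-based optimization. It uses the execution time recorded in the registry as the selection criterion; any other registered quality could be used instead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    """Registry entry for one generation module (all fields are hypothetical)."""
    name: str
    output: str              # metadata element produced
    inputs: tuple = ()       # metadata elements needed beforehand
    seconds: float = 1.0     # estimated execution time
    formats: tuple = ("*",)  # geodata formats the module can handle


REGISTRY = [
    Module("extract_bbox", "bounding_box", seconds=2.0, formats=("shp",)),
    Module("ask_user_bbox", "bounding_box", seconds=60.0),
    Module("gazetteer_lookup", "place_name", inputs=("bounding_box",), seconds=1.0),
    Module("ask_user_place", "place_name", seconds=45.0),
]


def executable(registry, data_format):
    """Runtime reduction: keep only modules that can run on this kind of data."""
    return [m for m in registry if "*" in m.formats or data_format in m.formats]


def plan(element, registry, existing, _pending=frozenset()):
    """Return (cost, ordered modules) for the fastest way to generate `element`,
    or None if no registered alternative can produce it."""
    if element in existing:          # already delivered by data retrieval
        return 0.0, []
    if element in _pending:          # guard against cyclic dependencies
        return None
    best = None
    for module in registry:
        if module.output != element:
            continue
        steps = []
        for needed in module.inputs:
            sub = plan(needed, registry, existing, _pending | {element})
            if sub is None:
                break                # this alternative is not executable
            steps += [s for s in sub[1] if s not in steps]
        else:
            steps.append(module)
            cost = sum(s.seconds for s in steps)
            if best is None or cost < best[0]:
                best = (cost, steps)
    return best


modules = executable(REGISTRY, "shp")
cost, steps = plan("place_name", modules, existing={})
print([m.name for m in steps], cost)
```

With a format for which no extractor is registered, the same call falls back to the user-input module, which mirrors the "worst case" alternative of simply requesting user input.
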

Fig. 2: Repository-based metadata generation

Since the module description contains information concerning the quality of the process (for example, the execution time or some kind of validity measure), it is also possible to optimize the generation process, for instance with respect to execution speed ("Execution Paths"). Finally, the Execution Paths are executed and generate the scope of metadata elements that was defined as the metadata goal set. Key features of this approach are its extensibility with new modules and its ability to provide some evaluative information about the generated metadata.

4. METADATA VISUALIZATION

After execution of the metadata generation, the results have to be visualized. Visualization is an important part of the whole process [14]. It comprises both the depiction of the exact content of metadata elements and of metadata quality for evaluation purposes, and the representation of metadata qualities with regard to the user's goals of data examination. Visualization of specific metadata elements has to be seen within this context under two functional aspects: classification of geodata and evaluation of metadata.

The evaluation phase shall enable the user to check the result of metadata generation and to correct it if necessary. The focus of visualization in this phase is the exact presentation of single elements. Moreover, it is necessary to guide the user in navigating through metadata groups and elements in order to find specific elements. In other words, the metadata structure has to be shown.

During the classification phase, it is most important to enable the user to understand the qualities of the described geodata set, in particular in relation to his specific examination goals. Therefore, visualization of metadata should be as intuitive as possible and adaptive to the reviewed range of elements. The range comprises all levels from complete metadata sets to groups of metadata elements, down to single elements if necessary. The main point of visualization in this case is a comprehensible presentation of groups of metadata and of overviews.

4.1 ABSTRACTION AND VISUALIZATION

The size and complexity of the metadata models do not allow a direct and thereby complete presentation of contents that the user could absorb cognitively at once. To reduce the complexity and scope of the information to be presented, especially for the classification phase, one can either present a sufficiently small part of the data or limit the presentation to essential aspects by means of abstraction. Abstraction, meaning the reduction of complexity to the bare essentials, is always guided by an objective that defines what is essential. One must also consider that the user moves in his work process (in this case, classification or evaluation) along a granularity spectrum of required information, i.e. the user requires information at different levels of detail depending on the current stage of the work [15]. According to Beard and Sharma [12], the resource discovery process can be divided into the following phases:

  - Overview: to provide an overview of digital library content
  - Search: to enable comparison of multiple information items
  - Details: to provide a detailed description of individual items

The use of metadata, especially in the classification phase, occurs in similar steps. It is therefore appropriate to follow a level-of-detail concept. In the following section, such a concept, based on a hierarchical abstraction model, is introduced and its connection with the underlying metadata models is illustrated. The term level-of-detail is not to be interpreted in a purely graphic sense, but rather in a semantic, contextual sense [16, p.47]. In this paper, the concept of a semantic abstraction pyramid is proposed.
This LoD architecture divides the process of metadata visualization into "vertical" abstraction processes that form abstraction pyramids and "horizontal" representation functions that depict the elements of the abstraction pyramid and thereby shape the individual presentation levels (LoDs). This connection is presented in a simplified form in Fig. 3.

Fig. 3: LoD architecture used

The metadata visualization uses three fundamental granularity levels: overview, refinement, and details, in accordance with the division into phases described previously. The vertical abstraction functions $f_a$ are to be understood in this context as semantic abstractions on non-graphical data. They make the hierarchically prepared data available to the representation functions $f_b$, which define the actual visualization of the respective presentation level. The representation functions can also contain elements of graphic abstraction and can unite abstraction elements from several abstraction levels into one presentation.
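
The division of labor between abstraction and representation functions can be sketched in a few lines. The example is purely illustrative: the metadata content, the choice of a fill count as the semantic abstraction, and the text-only output are assumptions, chosen only to show that the abstraction step works on non-graphical data and the representation step decides what each presentation level shows.

```python
# Hypothetical metadata set: group name -> {element name: value}.
METADATA = {
    "identification": {"title": "Rivers of Hesse", "language": "de"},
    "extent": {"west": 7.8, "east": 10.2, "south": 49.4, "north": 51.7},
}


def abstract_group(elements):
    """Abstraction function: reduce a group to how many elements are filled."""
    filled = sum(1 for value in elements.values() if value is not None)
    return {"filled": filled, "total": len(elements)}


def represent(metadata, level, group=None):
    """Representation function: the lines shown at one presentation level."""
    if level == "overview":
        # One line per group, built from the abstracted data only.
        lines = []
        for name, elements in metadata.items():
            a = abstract_group(elements)
            lines.append(f"{name}: {a['filled']}/{a['total']} elements")
        return lines
    if level == "refinement":
        # Element names of one selected group.
        return sorted(metadata[group])
    if level == "details":
        # No further abstraction: every element is shown with its value.
        return [f"{key} = {value}" for key, value in metadata[group].items()]
    raise ValueError(f"unknown level: {level}")


print(represent(METADATA, "overview"))
print(represent(METADATA, "refinement", "extent"))
print(represent(METADATA, "details", "identification"))
```
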

Characteristics of this approach are:

  - At the detail level, metadata elements are visualized according to their data type. No additional abstraction occurs. For example, text is rendered as text and not transformed into a graphic or diagram.
  - The refinement and overview levels present representations of metadata groups or of the entire metadata set. The presentation on these granularity levels is abstracted to varying degrees; the refinement (specification) level can also be divided into several sublevels.
  - The abstract presentation of groups of metadata elements may contain less abstracted representations of the lower level.

In general, therefore, an individual picture element within the visualization of a certain level is the result of a representation function of one or more original metadata elements and/or abstraction levels of one or more additional elements. The abstraction levels themselves are the result of an abstraction function of one or more original metadata elements and/or abstraction levels of one or more additional elements. To put it another way:

$$be_i = f_{b_i}(Me', Ae') \qquad (1)$$

$$ae_j = f_{a_j}(Me'', Ae'') \qquad (2)$$

with:

  - $ae_j$: single abstracted element
  - $be_i$: single image element of the image level
  - $Ae$: set of all abstracted elements
  - $Be$: set of all image elements on all levels
  - $Me$: set of the original metadata elements
  - $f_{a_j}$: abstraction functions
  - $f_{b_i}$: representation functions
  - $Ae', Ae'' \subseteq Ae$; $Ae''$ does not contain $ae_j$ and should only contain elements $ae_k$ of the same or a lower abstraction level
  - $Me', Me'' \subseteq Me$

4.2 FORMING ABSTRACTIONS

In the case of an ISO 19115-based metadata model, decisive parts of a possible abstraction pyramid are already explicitly present in the metadata model. Therefore, during the metadata generation process, the generation of abstraction levels has already occurred in part. Additional abstraction levels that are not contained in the original data model can be added for the visualization.
Likewise, these additional abstraction levels can be introduced during the generation process through appropriate abstraction modules, by extending the module base. An example should clarify the connection between abstraction pyramids and metadata models. For this purpose, consider the description of the spatial and temporal extent of a geodata record according to ISO 19115. The class MD_DataIdentification [7] (a subclass of MD_Identification) contains several possibilities for describing the spatial extent of a data record described through metadata. The most fundamental is the provision of the bounding box geographicbox and of a geographically descriptive name geographicdescription. Further, in the assigned subclass EX_Extent (which can also occur multiple times), a detailed description is possible with the following descriptive forms:

  - Description: free-text area
  - GeographicElement: spatial extent (Bounding Box, Bounding Polygon, ...)
  - TemporalElement: temporal extent
  - VerticalElement: vertical extent (minimal / maximal peak values)

One possible abstraction hierarchy of Extent (exclusively for the spatial components) is shown in Fig. 4 below. The most abstract level is presented here as the textual indicator geographicdescription, $GD$. Below it comes the associated bounding box, $BBx$, and, where applicable, at the lowest level of the abstraction hierarchy, the individual bounding boxes $BBxO_i$ of, for example, individual object classes $O_i$ (e.g. rivers, streets, etc.).

Fig. 4: Abstraction pyramid of Extent

All of these elements can be represented directly in the above-mentioned standard elements. Beyond this, they can be generated during the process of metadata generation. Thus, in the prototypical realization of the system, the minimal / maximal coordinates of every object class can be selected from the original geodata and thus form the $BBxO_i$ (function $f_{a1}$). From these, $f_{a2}$ in turn can determine the comprehensive bounding box $BBx$. A geographic description $GD$ of $BBx$ can be obtained by using gazetteers, i.e. referencing between location descriptions and coordinates ($f_{a3}$).

4.3 REPRESENTATION FUNCTIONS

Representation functions are defined according to the abstract model for the individual levels of the abstraction pyramid. In Fig. 5, the simplified abstraction pyramid for the spatial and temporal Extent is illustrated. At the detail level, all elements are portrayed alphanumerically according to their data types in ISO 19115. At the specification (refinement) level, three separate representation functions depict the overall bounding box $BBx$, the vertical extent, and the temporal extent. At the overview level, various abstraction elements are integrated into one presentation via a representation function.
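
The abstraction pyramid of Extent described in section 4.2 can be sketched as three small functions plus one overview representation. This is a simplified illustration and not the prototype's code: features are reduced to points, the gazetteer is assumed to be a plain dictionary of named regions with rough, illustrative bounding boxes, and picking the smallest enclosing region is an assumption about how a description could be chosen.

```python
def bbox_per_class(features):
    """f_a1: min/max coordinates of every object class.

    `features` is an iterable of (object_class, x, y) tuples.
    Boxes are (west, south, east, north).
    """
    boxes = {}
    for cls, x, y in features:
        if cls in boxes:
            w, s, e, n = boxes[cls]
            boxes[cls] = (min(w, x), min(s, y), max(e, x), max(n, y))
        else:
            boxes[cls] = (x, y, x, y)
    return boxes


def overall_bbox(boxes):
    """f_a2: comprehensive bounding box BBx enclosing all class boxes."""
    west, south, east, north = zip(*boxes.values())
    return (min(west), min(south), max(east), max(north))


def describe(bbox, gazetteer):
    """f_a3: name of the smallest gazetteer region containing BBx, or None."""
    w, s, e, n = bbox
    hits = {name: box for name, box in gazetteer.items()
            if box[0] <= w and box[1] <= s and box[2] >= e and box[3] >= n}
    if not hits:
        return None
    return min(hits, key=lambda k: (hits[k][2] - hits[k][0]) * (hits[k][3] - hits[k][1]))


def overview(gd, bbox):
    """Overview-level representation: description and box in one line."""
    w, s, e, n = bbox
    return f"{gd}: {w:.2f}..{e:.2f} E, {s:.2f}..{n:.2f} N"


# Illustrative data; the gazetteer boxes are rough approximations.
features = [("river", 8.60, 49.85), ("river", 8.70, 49.95),
            ("street", 8.55, 49.90), ("street", 8.65, 49.80)]
gazetteer = {"Hesse": (7.77, 49.39, 10.24, 51.66),
             "Germany": (5.87, 47.27, 15.04, 55.06)}

boxes = bbox_per_class(features)
bbx = overall_bbox(boxes)
gd = describe(bbx, gazetteer)
print(overview(gd, bbx))
```

Each function corresponds to one step up the pyramid, so a coarser or finer presentation only has to pick the appropriate intermediate result (`boxes`, `bbx`, or `gd`).
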

Fig. 5: Single and combined visualizations of extent categories

The connection between the abstraction pyramid, the representation functions, and the definition of the various LoDs is presented in simplified form in Fig. 6.

Fig. 6: Simplified LoD architecture for extent categories

4.4 NAVIGATION

Visually supported navigation to individual metadata elements, as well as to the individual detail levels, is essential for evaluating and viewing metadata. It also makes the original geodata easier to access and understand. The presentation methods for navigation considered in this work are based on the well-established tree diagram and on space-filling tree-maps [17; 18 p.28]. Combining the tree diagram and the tree-map in one diagram enables navigation to individual elements and, through the broad, organized presentation of the tree-map, makes the metadata structure easier to recognize. If the tree-map (see Fig. 7a) is limited to the higher-level element groups and raises only the selected group to a higher detail level, the metaphor remains clear, concise, and implementable even for more complex data models. Fig. 7b shows a counter-example in which clarity is lost because all elements of the example model are displayed.

Fig. 7: Combined tree-map and tree-diagram (a); bad tree-map example (b)

5. CURRENT AND FUTURE WORK

The approaches highlighted above are currently being implemented. Fig. 8 shows the basic architecture, as well as some parts of a first demonstrator's GUI.

Fig. 8: Architecture (a) and GUI (b) of first implementations

The architecture uses the off-the-shelf GIS GeoMedia, which can operate in the manner of a data warehouse: it connects to and integrates data from several other standard GIS via so-called data servers. We provided some proprietary extensions to GeoMedia. These extensions ("Extractor Modules") are able to extract certain pieces of information from geodata sets. The generator part runs the mechanism described in chapter 2. The Extractor Modules are logically integrated within the module repository.

The navigation part of the system is currently based on a tree diagram with some modifications. The leaves of the tree branches present additional color-coded information on whether the related element exists (black indicates that the element has already been generated or input) and on whether it is a mandatory, conditional, or optional element according to the metadata standard used (ISO 19115-based).

The next step will be the integration of a more flexible visualization. It will enable navigation based on the tree-map metaphor and the definition of thematic viewpoints. These viewpoints will facilitate the dynamic definition of thematic areas of interest by the selection of certain groups or elements. Thematic viewpoints can be seen as a semantic magnifying glass, showing the most important qualities of a geodata set according to the user's interests.

Future work will try to integrate the illustrated concept of metadata generation and, in particular, visualization within the process of data analysis as described in chapter 2. One of the main tasks will be to define a set of rules that can be used to infer which qualities a dataset must have for certain types of data analysis or mining methods. Based on the user's goal definition, existing metadata, and metadata generated on the fly, this rule-based expert system could then give recommendations for the analysis methods to use. Metadata visualization will have the important role of making this expert system "black box" more transparent. The described method of abstraction pyramids and related representation functions can be used in this context as a semantic magnifying glass to focus on the essential and necessary geodata qualities.

6. SUMMARY

This paper has proposed a general approach for supporting geodata users in the selection of suitable analysis or mining procedures with the help of metadata technology. A basic workflow with three steps was proposed to generate descriptive metadata first and then to suggest appropriate exploration methods. In particular, the process of metadata generation and metadata visualization was illustrated, along with first implementation examples. Metadata generation was accomplished by repository-based automated geodata analysis and interpretation. Metadata visualization for evaluation and examination purposes used a semantic level-of-detail method with abstraction hierarchies and linked representation functions. Metadata generation and appropriate visualization can be seen as a key factor for further and intensified use of metadata in the area of geodata.
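As a rough illustration of the rule-based recommendation step envisaged in Section 5, the following minimal sketch matches a user's goal and a metadata record against a small rule set and returns the suggested methods together with the reason each rule fired. It is not the system described in this paper: the rule set, the metadata keys (`geometry_type`, `feature_count`, `has_numeric_attribute`, `temporal_extents`), the goal names, and all thresholds are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical flat metadata record: quality name -> value.
Metadata = Dict[str, Any]


@dataclass(frozen=True)
class Rule:
    method: str                            # analysis or mining method to recommend
    goal: str                              # user goal that the method serves
    requires: Callable[[Metadata], bool]   # qualities the dataset must have
    reason: str                            # explanation reported back to the user


# Illustrative rules only; the thresholds are assumptions, not results.
RULES: List[Rule] = [
    Rule("spatial clustering", "find groups",
         lambda m: m.get("geometry_type") == "point"
         and m.get("feature_count", 0) >= 100,
         "needs at least 100 point features"),
    Rule("spatial interpolation", "estimate surface",
         lambda m: m.get("geometry_type") == "point"
         and bool(m.get("has_numeric_attribute")),
         "needs point features with a numeric attribute"),
    Rule("change detection", "detect change",
         lambda m: m.get("temporal_extents", 0) >= 2,
         "needs at least two temporal extents"),
]


def recommend(goal: str, metadata: Metadata,
              rules: List[Rule] = RULES) -> List[Tuple[str, str]]:
    """Return (method, reason) for each rule that serves the goal
    and whose required qualities are present in the metadata."""
    return [(r.method, r.reason)
            for r in rules
            if r.goal == goal and r.requires(metadata)]


# Hypothetical metadata, e.g. partly generated on the fly.
EXAMPLE_METADATA: Metadata = {
    "geometry_type": "point",
    "feature_count": 250,
    "has_numeric_attribute": True,
    "temporal_extents": 1,
}
```

Returning the reason alongside each recommendation is one simple way to keep such a system from being a pure black box: the qualities that triggered a rule can then be highlighted in the metadata visualization.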
The results of this work may improve the availability of metadata and promote the acceptance of metadata in general. In particular, support for understanding geodata sets through a proper and flexible display of their metadata will be helpful in different application areas.

ACKNOWLEDGEMENTS

Realization examples have been worked out in co-operation with the Department for Graphic Information Systems, Fraunhofer Institute of Computer Graphics and GIStec GmbH, Darmstadt, Germany.

REFERENCES

1. D. Balfanz, S. Göbel, Bridging Geospatial Metadata Standards towards Distributed Metadata Information Systems, Miller, C.; Musick, R.; IEEE Computer Society: 3rd IEEE Metadata Conference, Bethesda, Maryland, April 1999
2. NASA's Global Change Master Directory, http://gcmd.gsfc.nasa.gov/
3. Umweltdatenkatalog (UDK), http://www.umweltdatenkatalog.de/
4. EOSDIS (Earth Observing System Data and Information System) Home Page, http://eosdismain.gsfc.nasa.gov/eosinfo/eosdis_site/index.html
5. U. Jasnoch, D. Balfanz, S. Göbel, Managing Distributed Heterogeneous Information Spaces, Proceedings of FAIM Conference 2000, University of Maryland, College Park, Maryland, USA, June 2000
6. R. Ferber, Data Mining und Information Retrieval, Technische Universität Darmstadt, 1999
7. ISO/TC 211, Geographic information/geomatics, CD 19115.3, Geographic information - Metadata, http://www.statkart.no/isotc211/
8. RTSe SMMS tool, http://www.rtseusa.com/, http://www.intergraph.com/gis/smms/
9. Metadata Tools Overviews: http://www.state.wi.us/agencies/wlib/sco/metatool/mtools.htm, http://www.fgdc.gov/metadata/toollist/metatool.html
10. ESRI: http://www.esri.com/software/arcgis/index.html; ArcView Metadata Collector v2.0 Extension: http://www.csc.noaa.gov/metadata/text/download.html
11. K. Beard, A Structure for Organizing Metadata Collection, 3rd International Conference / Workshop on Integration GIS and Environmental, 1996
12. K. Beard, V. Sharma, Multilevel and Graphical Views of Metadata, in "Research and Technology Advances in Digital Libraries", ADL '98 Proceedings, IEEE, 0-8186-8486-X/98, pp. 256-265, 1998
13. R. Diestel, Graph Theory, Springer Verlag, 2000
14. U. Jasnoch, S. Göbel, Visualization Techniques in Metadata Information Systems for Geospatial Data, Advances in Environmental Research, Vol. 5 (2001), No. 4, pp. 415-424, Pergamon Press, Amsterdam, Elsevier Science Ltd., 2001
15. C. Lagoze, From Static to Dynamic Surrogates - Resource Discovery in the Digital Age, D-Lib Magazine, ISSN 1082-9873, June 1997
16. T.A. Keahey, The generalized detail in-context problem, Proceedings of IEEE Symposium on Information Visualization 1998, pp. 44-51 & 152, 1998
17. B. Johnson, B. Shneiderman, Tree-Maps: A Space-Filling Approach to the Visualization of Hierarchical Information Structures, Proceedings IEEE Visualization '91, pp. 275-282, 1991
18. I. Herman, G. Melancon, M. S. Marshall, Graph Visualization and Navigation in Information Visualization: A Survey, IEEE Transactions on Visualization and Computer Graphics, Vol. 6, No. 1, January-March 2000

Citations: Balfanz, Dirk: Automated Geodata Analysis and Metadata Generation. In: Erbacher, Robert F. (Ed.) et al.; The International Society for Optical Engineering (SPIE): SPIE Conference on Visualization and Data Analysis 2002. Proceedings. 2002, pp. 285-295 (Proceedings of SPIE 4665).