Usability Testing of Map Designs

Linda Williams Pickle
National Cancer Institute, Bethesda, MD 20892-8317 (PICKLEL@MAIL.NIH.GOV)

Maps have the potential to display the geographic patterns of millions of statistical data points, something impossible using a tabular display. A poorly designed map, however, can fail to convey important underlying features in the data or can even distort their true geographic patterns. This paper summarizes results of a series of cognitive experiments by the National Center for Health Statistics and the National Cancer Institute to ascertain how to communicate geographic statistics more effectively. Examples of the application of these results to the design of maps for publication and the extension of the principles to usability tests of Web-based map design are presented.

Key words: cartography, spatial statistics, visualization of data

1. INTRODUCTION

A picture may be worth a thousand words, but a map can represent millions of data points. However, the designer's best efforts at data collection or analysis will be wasted by failure to communicate the results clearly. For example, Figure 1 shows four of many possible ways to show age-adjusted breast cancer mortality rates on a map. Regardless of which style may be preferred by different users, the patterns implied by these maps are clearly different; in particular, several show more clustering than the others. How can we design our maps so that they convey the underlying patterns in the data most accurately and effectively to the targeted user?

Figure 1. Breast cancer mortality rates, 1988-92: choropleth maps with rates categorized by quintiles (20% of places in each category), standard deviations (mean +/- 0.5, 1.0, 1.5, and 2.0 or more standard deviations), and equal intervals, and a dot density map. Panels: Quintiles, Standard deviation, Equal Interval, Dot density.
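The three classed schemes named in the Figure 1 caption can be sketched in a few lines. This is only an illustration under stated assumptions: the function names and the sample rates are hypothetical, the standard deviation cutpoints follow one reading of the caption (cuts at 0.5, 1.0, 1.5 and 2.0 standard deviations on each side of the mean), and the quintile cutpoints use the default interpolation of Python's statistics.quantiles, which may differ from the rule used for the published maps.

```python
import bisect
import statistics


def quintile_cutpoints(rates):
    """Four cutpoints that place roughly 20% of places in each class."""
    return statistics.quantiles(rates, n=5)


def equal_interval_cutpoints(rates, n_classes=5):
    """Cutpoints that split the observed range into equal-width classes."""
    lo, hi = min(rates), max(rates)
    width = (hi - lo) / n_classes
    return [lo + width * i for i in range(1, n_classes)]


def std_dev_cutpoints(rates, multiples=(0.5, 1.0, 1.5, 2.0)):
    """Cutpoints at the mean +/- each multiple of the standard deviation."""
    mean = statistics.mean(rates)
    sd = statistics.stdev(rates)
    below = [mean - m * sd for m in reversed(multiples)]
    above = [mean + m * sd for m in multiples]
    return below + above


def classify(rate, cutpoints):
    """Class index of a rate: the number of cutpoints at or below it."""
    return bisect.bisect_right(cutpoints, rate)


# Hypothetical rates per 100,000 for ten places (not data from the paper).
sample_rates = [18.2, 21.5, 22.0, 23.4, 24.1, 25.0, 26.3, 27.9, 31.2, 40.6]
for make_cuts in (quintile_cutpoints, equal_interval_cutpoints, std_dev_cutpoints):
    cuts = make_cuts(sample_rates)
    print(make_cuts.__name__, [classify(r, cuts) for r in sample_rates])
```

Running the loop on skewed data shows why the maps in Figure 1 look different: the same rates land in different classes depending on how the cutpoints are chosen.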

The design of maps of statistical rates is unlike that of way-finding maps or maps of demographic characteristics. Positional accuracy is most important in the former, and the latter are typically maps of observed counts with no associated (or important) variance. However, statistical maps must present values that are comparable place to place, such as directly age-adjusted rates (Pickle and White 1995), and should, where possible, indicate the different degrees of confidence (or variance) in the statistic. In section 2, we summarize the results of a series of cognitive experiments by the National Center for Health Statistics and the National Cancer Institute to ascertain how to communicate geo-referenced statistics more effectively. In section 3, we discuss the extension of these tests to the development of computer-based maps and Web-based data dissemination and illustrate several tools under development for the exploration and communication of the geographic patterns of cancer.

2. COGNITIVE EXPERIMENTATION

In preparation for the design of a mortality atlas at the National Center for Health Statistics (NCHS), a series of cognitive experiments was conducted to determine how readers extracted statistical information from maps. We drew upon the expertise of cognitive psychologists who had been studying questionnaire design at NCHS to start an interdisciplinary research program in the design of statistical rate maps. In addition to the psychologists, the research team included statisticians, geographers and epidemiologists, both on staff at NCHS and through contract to academic experts. Methods to test the usability of various designs included focus group discussions, designed experiments and think-alouds, where a study subject talked through his or her attempts to use the test maps. The use of designed experiments was more formal than what had generally been used in cartographic research previously. Think-alouds were especially useful in testing computer-based map tools.

The variety of possible visual representation schemes is considerable, but the accuracy and efficiency of map reading vary not only with qualities of the map, but also with characteristics of the task and the reader. We investigated the cognition underlying map reading and interpretation, building upon initial assumptions drawn from the fields of cartography, statistics, and psychology, and developed a multi-stage cognitive model of how people read statistical maps (Herrmann and Pickle 1996). This model suggests that the reading of any statistical map is achieved through the performance of four cognitive stages, each of which makes its own peculiar demands on cognition: (1) map orientation (generally, what is this map showing?), (2) legend comprehension, (3) integration of the map and the map legend, and (4) discernment of spatial patterns and relationships. Each of these stages involves a series of psychological processes, e.g., perception, memory, and problem solving. Although other cognitive models may also be valid, this model is supported by interviews and timed experiments with map readers. Acceptance of this model implied that a cognitive experiment could focus on one map element or design aspect at a time, i.e., there was little evidence of interactions among the subtasks involved in map reading. (For more detail on the cognitive research underlying this project, the reader is referred to Pickle and Herrmann (2000).)

Over a period of several years, we examined the following aspects of map reading:
- Questions asked of a statistical map
- Basic map style
- Legend design
- Color choices
- Indication of unreliable rates
- Map smoothing
- Classification of rates into color categories

2.1 Questions asked of a statistical map

Early focus groups with epidemiologists, the primary user group for the NCHS mortality atlas, identified three types of questions that they wanted to answer, closely following the work of Bertin (1973). The first is a very specific rate readout task: what is the mortality rate in a certain place? The second is a more general pattern recognition task: are there geographic trends in the data or clusters of high or low rate areas? The last and most general is a map comparison task: for example, are the mortality patterns similar for males and females, or for blacks and whites? Inconsistency of results by cartographic researchers in the past is due in part to drawing conclusions with only one of these questions in mind. Because we expected a diverse audience for our atlas, we wanted a map design and page layout that would allow all of these questions to be answered. Therefore, whenever possible, our map experiments required the reader to answer two or three of the types of questions above, depending on the purpose of that particular map.

2.2 Basic map style

A literature review indicated that map design experts had different opinions on what basic map style would be best for disease rates, so our initial studies compared preferences for each of a number of basic styles, including unclassed (proportional), classed and smoothed maps of various designs. In unclassed maps, a visual characteristic of the map, such as the darkness or saturation of a color, increases proportionally to the mapped rate, whereas in classed maps, the rates are classified into discrete ranges, each of which is represented by a single visual characteristic. Smoothed maps have had some generalizing algorithm applied to the actual data, such as a moving average of rates; these maps are useful for clarifying geographic patterns but are not suitable for reading a rate in a particular area.
For each of these styles, the mapped data are represented by lines of equal rates (isopleth maps), symbols, or shaded geographic areas (choropleth maps), as illustrated in Figure 2.

Figure 2. Examples of a classed choropleth (left), smoothed isopleth (middle; Source: National Park Service: www.aqd.nps.gov/ard/figure3.html; dots show locations of monitoring stations) and proportional symbol (right) map.

The first study in the series was a brief cognitive experiment comparing the map styles described above, followed by a focus group discussion. Of all map styles examined, epidemiologists liked the choropleth (classed, area shaded) maps and used them most accurately (Pickle et al. 1994). People who had some training in cartography preferred the more complex map designs, but they were equally accurate on all types. In a follow-up study, Lewandowsky found that monochrome classed choropleth maps led to the most consistent identification of clusters on a map, compared to dot density, pie chart, and double-ended choropleth maps (Lewandowsky et al. 1993). These results led us to choose the classed choropleth map style for our mortality rate atlas. The remainder of this paper will focus on design elements for choropleth maps of rates, although the theoretical approach presented here may be extended to other kinds of maps for rates or counts.

2.3 Legend design

We conducted an in-house study of various legend designs using a seven-category choropleth map (Pickle et al. 1995). Epidemiologists and statisticians answered rate readout and rate comparison questions and ranked eight different legends for ease of use. These legends included horizontal and vertical orientation, fixed box size and boxes whose size was proportional to the cutpoint rates, single or double labeling (left and right), and tick marks for cutpoints or range labels. Only one subject made one error in 16 questions, so all of these were usable designs. One epidemiologist summed up the consensus of the group by saying, "We could learn to use any of these legends given enough time, but we don't want to spend that time. We would rather get on to what is more interesting to us, i.e., reading the map, so we want easy to understand legends." Consistent with this comment was the ranking of the standard vertically-oriented, fixed-box legend as the most preferred. Therefore this design was used for the NCHS atlas.

2.4 Color choices

The choice of map colors may be the most controversial of all map design decisions because everyone has personal color preferences and may not understand the consequences of using them on a map. The fact that a color combination is pleasing to the eye does not guarantee adequate performance for the required tasks. Conventional cartographic wisdom recommends the use of a multi-hue color scheme for qualitative data and a sequential color scheme for quantitative data (Dent 1993). A sequential color scheme is a progression of a visual characteristic (e.g., a light to dark sequence of a single hue) to represent a progression of the mapped values.
A double-ended color scheme is a combination of a color gradient for each of two hues (one for high rates, one for low rates), with the most saturated or darkest shades used for the most extreme rates. The cognitive burden on the reader is less with the double-ended scheme than with a sequential scheme with the same number of categories because the reader need only remember and distinguish among half as many shades of the same hue. Double-ended schemes are favored when the designer wishes to draw attention to the extremes of the distribution.

In two cognitive experiments, Hastie found that very distinct colors (e.g., a rainbow palette) were best for reading a single rate off the map, but Lewandowsky found that a color gradient was best for cluster recognition (Hastie et al. 1996; Lewandowsky et al. 1993), consistent with the general color use recommendations of cartographers (Dent 1993). The superior accuracy of multi-hue schemes for reading a rate may be because they are more discriminable than the single hue schemes (Pickle and Herrmann 1994). Conversely, the similarity of monochrome shades may facilitate the visual identification of areas with similar rates (Dent 1993; Pickle et al. 1994). Because the double-ended scheme combines features of these two methods and because the typical atlas reader is interested in more extreme rates, the next color studies included this type as well as a sequential scheme.

The purpose of the next color study was to determine whether color conventions matter in devising the double-ended scales (Carswell 1995). That is, high rates are conventionally represented by either darker or warmer colors. As shown in Figure 3, the test maps were state maps using all pairwise combinations of red, blue and yellow; sometimes one hue represented high rates, sometimes the other. The sequential gray scale was the control. Color schemes including yellow had differences of lightness as well as hue, whereas the reds and blues were of equal saturation and lightness. All test maps included a legend specific to that map (not shown in Figure 3). Study subjects were asked to identify the most important cluster of high and low rates and were then timed in a rate readout test. For each task, the best double-ended color scale was better than the gray scale, although, consistent with expectations, the gray scale was only slightly worse than color for cluster identification. The best color scales were those varying both in hue and lightness (i.e., those that included yellow). Results showed the advantage of the use of color in reducing rate readout errors and also pointed to the importance of following cartographic conventions. More errors occurred using double-ended color schemes where color convention was violated (e.g., red represented low rates, blue represented high rates) or in conflict (darker blue and warm yellow: which is high?).

Figure 3. Sample test maps from the study of the importance of color convention.

For the next study of color, Brewer tested eight color scales for usability using full- and quarter-page test maps of aggregated counties (Brewer et al. 1997). This experiment judged the accuracy with which students could answer all three types of map reading questions: rate readout, pattern recognition, and map comparison. Double-ended, sequential, spectral, and gray scales were tested (Figure 4). These are not just arbitrarily chosen colors. Pairs of hues for the double-ended scales were chosen to avoid confusion by color-blind readers, and to avoid simultaneous contrast, where a color's appearance changes depending on the adjacent colors. Lightness and saturation changed in roughly equal steps from top to bottom or from the center to the extremes; these changes were the same for both hues in the pair.
The spectral scale was constructed of highly saturated hues with a dark to light to dark progression (top to bottom) similar to that of the double-ended scales. Tasks included rate readout, comparison of perceived regional rates, and perceived map clustering. There were few significant differences among the color scales, but the gray scale was perceived as less pleasant and more difficult to use than the color scales. This study showed that it is possible to choose hue pairs that avoid (a) confusion by color-blind readers and (b) simultaneous contrast. Also, there were few differences in results by size of map, probably because of the reduction from 7 to 5 rate classes between full- and quarter-page maps. Contrary to cartographic recommendations, the carefully chosen spectral scale performed as well as double-ended scales. Brewer has recently developed an interactive Web site (http://www.colorbrewer.org/) where these and other colors may be tested for suitability for different purposes (e.g., photocopying, color blindness) (Brewer 2003; Harrower and Brewer, in press). CMYK color specifications are available in print (Brewer et al. 2003) or from the Web site.

Figure 4. Colors tested by Brewer et al. for full-page maps.

2.5 Representing unreliability

Based on prior experience, we suspected that readers are unable to discern trends or clusters of similar rates if many of the areas with unreliable rates are blanked out or grouped together regardless of the level of the rate. The next set of studies compared methods of indicating rate reliability (or unreliability) on the map. Lewandowsky and Behrens, using a spectral color scale with quintile rate categories, tested the following methods of indicating that a rate was unreliable: the color saturation was cut to 40%, the rate color was overlaid with a white hatch pattern, or its shading was removed altogether (Lewandowsky 1995). The control map had no reliability indication at all. Undergraduates, epidemiologists and geographers were asked to draw around the most important high and low rate clusters on the maps and to comment on the certainty of their response. Prior to the test, study subjects were given either very explicit or very vague instructions as to the meaning and importance of rate reliability in drawing conclusions. This study found that blanking unreliable areas impairs cluster identification but that indicating unreliability either by reduced saturation or by hatching worked well. The professionals were more sensitive to the method of indicating unreliability and were less confident of their answers when they were given explicit instructions about unreliable data. These results made it clear that we had to flag unreliable rates in some way on the NCHS atlas maps.
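The reduced-saturation method tested by Lewandowsky and Behrens can be sketched as follows. Several things here are assumptions rather than details from the study: saturation is scaled to 40% of its original value in HSV space (one reading of "cut to 40%"), the reliability rule is a hypothetical minimum number of deaths, and the function names are invented for illustration.

```python
import colorsys


def desaturate(rgb, factor=0.4):
    """Scale the HSV saturation of an RGB color (floats in 0..1) by factor."""
    h, s, v = colorsys.rgb_to_hsv(*rgb)
    return colorsys.hsv_to_rgb(h, s * factor, v)


def area_color(class_color, deaths, min_deaths=20, factor=0.4):
    """Keep the class color for reliable rates and wash it out otherwise.

    The min_deaths threshold is a hypothetical reliability rule, not one
    taken from the studies described in the text.
    """
    if deaths < min_deaths:
        return desaturate(class_color, factor)
    return class_color


# A saturated red class color, drawn for a reliable and an unreliable area.
red = (1.0, 0.0, 0.0)
print(area_color(red, deaths=150))
print(area_color(red, deaths=6))
```

Because the hue is unchanged, an unreliable area still reads as belonging to its rate class, so it can still contribute to a perceived cluster instead of leaving a blank hole in the map.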
MacEachren and Brewer conducted the next study, which compared three more sophisticated ways of flagging unreliable rates (MacEachren 1998). The study design was similar to that of the color study described above, comparing methods using full-page maps with seven rate categories and quarter-page maps with five categories. The color scales used were a double-ended purple-green scale, a spectral scale, and a red-yellow sequential scale. Tasks included rate readout, regional rate estimation and cluster identification. Methods tested were color saturation reduction, a double white-black diagonal hatching, and separation of the rate and reliability information into two separate maps. For the latter method, the reader can use just one of the maps or may need to integrate the information on these two maps, depending on the task. This study showed that reliability can be indicated with minimal impact, i.e., there were few errors at all for the rate readout task, and whether reliability was indicated made no difference in estimating regional rates or identifying clusters. Map pairs and hatching performed better than the color change method for comparison of regional rates and selecting the most reliable region. The perceived ease of use and pleasantness were best for maps without reliability shown at all, for map pairs and for hatching.

2.6 Map smoothing

Another area of research during this project was smoothing methods for maps. Techniques that have been used in the past to smooth variation in time series have been extended to two dimensions and applied to maps (Kafadar 1994). Smoothing is an important tool for disease rate maps, where background noise due to small populations makes it difficult to discern the underlying patterns in the map. However, the first methods proposed for maps were not weighted. Since rate variances are heteroscedastic, depending on population size, it seemed important to include weights in the smoother. Mungiole (1999) added a weighting capability to a popular median-based smoothing algorithm (headbanging). Figure 5 illustrates the importance of weighting using mortality rates for HIV among white males. It can be seen from the original map that, as expected, HIV rates are much higher in urban areas than in the suburbs and rural areas. An unweighted smoother treats the highly reliable urban rates, based on large populations, as equal in importance to the less reliable rates based on sparse populations. The resulting map retains the broad patterns of the original map, but the isolated city excesses, e.g., St. Louis, MO, or Minneapolis, MN, have been smoothed away. When population (i.e., inverse variance) weights are included, both the regional and local features of the original pattern are retained. In focus group discussions, epidemiologists preferred these smoothed maps for pattern recognition tasks; they are not appropriate for rate readout tasks.

Figure 5. Age-adjusted HIV mortality rates, 1988-1992, among white males. Original data (top), unweighted smoothed (bottom left), weighted smoothed (bottom right). Data source: Pickle et al., 1996. Maps are color-coded by quintiles of the rates in each map. Legend classes (rate per 100,000), in the order printed with the three maps:
- 0-1.991, 1.991-3.57, 3.57-5.002, 5.002-7.824, 7.824-172.234
- 0.8-2.9, 2.9-3.7, 3.7-4.6, 4.6-6.4, 6.4-48.9
- 0.6-3.4, 3.4-4.7, 4.7-5.7, 5.7-8.2, 8.2-97.1
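The effect of weighting described in section 2.6 can be illustrated with a deliberately simplified smoother. The sketch below is not the headbanging algorithm or Mungiole's weighted version of it; it replaces each area's rate with the plain or population-weighted median over the area and its neighbors. All names and numbers are hypothetical.

```python
def weighted_median(values, weights):
    """Smallest value at which the cumulative weight reaches half the total."""
    pairs = sorted(zip(values, weights))
    half = sum(weights) / 2
    running = 0.0
    for value, weight in pairs:
        running += weight
        if running >= half:
            return value


def smooth(rates, populations, neighbors, weighted=True):
    """Median-smooth each area's rate over the area and its neighbors.

    With weighted=False every area counts equally; with weighted=True each
    area counts in proportion to its population.
    """
    smoothed = {}
    for area in rates:
        ids = [area] + list(neighbors.get(area, []))
        vals = [rates[i] for i in ids]
        wts = [populations[i] if weighted else 1.0 for i in ids]
        smoothed[area] = weighted_median(vals, wts)
    return smoothed


# Hypothetical city with a high, reliable rate surrounded by small rural areas.
rates = {"city": 40.0, "r1": 3.0, "r2": 4.0, "r3": 5.0}
populations = {"city": 900_000, "r1": 10_000, "r2": 12_000, "r3": 8_000}
neighbors = {"city": ["r1", "r2", "r3"], "r1": ["city", "r2"],
             "r2": ["city", "r1", "r3"], "r3": ["city", "r2"]}

print(smooth(rates, populations, neighbors, weighted=False)["city"])
print(smooth(rates, populations, neighbors, weighted=True)["city"])
```

In this toy example the unweighted median pulls the city down to the level of its rural neighbors, while the population-weighted median leaves the city's rate in place, which mirrors the loss and retention of isolated city excesses described for Figure 5. The sketch makes no claim about how the actual algorithm treats the surrounding areas.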

2.7 Classification of rates into color categories

In the last study in the series we examined how to choose cutpoints for a classed choropleth map. Brewer tested seven different methods using a cognitive experimental design similar to that of the color studies (Brewer and Pickle 2002). Students saw mortality maps of aggregated counties in random order using the following methods:
- Equal width intervals
- Equal percentile (quintile) intervals
- Natural breaks (Jenks method), which minimizes within-class variance
- Minimum boundary area method, where differences across boundaries were maximized between different classes
- Cutpoints defined by the quartiles and outlier definitions of a boxplot
- Cutpoints defined in terms of standard deviation: +/- 0.5, +/- 1.5, and beyond
- Shared area method, where a specified land area total is in each class

Performance using the quintile method was significantly better than the others, followed by the minimum boundary and natural breaks methods (although these two methods individually were not significantly different from the quintile method). This is a surprising result for cartographers, who seem to favor the Jenks (natural breaks) method. A second part of this study found that accuracy of comparing several maps was significantly better when a common legend was used. This is even more important for a series of maps, such as maps over time.

2.8 Recommendations for map design

Our experience with this project taught us that every map imposes some cognitive burden on the reader. The trick is to minimize this burden while facilitating the use of the map for certain types of questions.
Specific recommendations are:
- Design the map for a particular audience and purpose
- Use a standard legend design
- Colors should be chosen for the visually impaired and consistent with color conventions
- Identify unreliable rates, don't blank them out
- Multiple maps are often needed
  - to address different questions
  - to focus attention on different scales or regions
  - to compare modeled or smoothed values to observed values

For rate maps to be used by epidemiologists, quantile-categorized choropleth maps using a double-ended color scheme seem to work well. The NCHS Atlas of United States Mortality was designed using recommendations from this research; it subsequently won an international award for the best illustrated government publication in 1997 (Pickle et al. 1996; http://www.cdc.gov/nchs/products/pubs/pubd/other/atlas/atlas.htm).

3. EXTENSION OF RESEARCH TO COMPUTER-BASED MAPS

The research described in section 2 focused on the design of paper maps. Today there is a greater demand for computer-based maps, both for their interactivity and for the ease with which data can be disseminated over the Web. In this section we examine the extension of several graphics designed originally as static to a form more appropriate for Web use.

3.1 Linked micromap plots

Sometimes the map reader wishes to see more statistical detail than can be shown on a map, especially when the mapped values have been classified into broad groups. In addition, we often wish to see how the distribution of one statistic compares with the mapped patterns. The linked micromap plot, an extension of Carr's rowplot graphic design, serves both of these purposes (Carr 2000). This plot links several statistical graphs to maps by means of color. As illustrated in Figure 6, the graphic can be sorted by either statistical panel or alphabetically by names of the geographic units, which are not restricted to states and counties. The graph symbols are tied to geography by color linking to the maps on the left. Each small map represents only five states; these small perceptual groups facilitate accurate reading across the graphic. When the number of geographic units is odd, as is the number of U.S. states, the median value is shown in a small box in the middle of the page. Options include showing confidence intervals by lines through the statistical symbols and a vertical line for a referent value. In this example, it can be seen at a glance that the states with the highest predicted rates are in a band along the Mississippi and Ohio Rivers but that the states with the highest predicted number of cases are not among the highest rate states. The distribution of rates seems symmetric, with half of the states falling above the U.S. rate and half below.
Utah's lowest rate stands out at the bottom of the plot as being much lower than those of its neighboring states, although a classed choropleth map would group these states together.

As part of the design for a Web-based system to disseminate cancer statistics from the National Cancer Institute (NCI) to state and local health professionals, the linked micromap plot was one of the graphics extensively tested. Led by Sue Bell, the project director for NCI's State Cancer Profiles system, focus groups and initial tests were conducted at meetings of health professionals. More formal testing was then done in the NCI Usability Lab, a service of the Communication Technologies Branch of the NCI Office of Communications. This Branch has compiled research on Web site design and provides evidence-based recommendations on its Web site (http://usability.gov).

Many of the changes made to the linked micromap design as a result of testing are evident when Figures 6 and 7 are compared. First, the orientation is landscape on a computer monitor, whereas the plot had originally been designed in portrait orientation for hard copy. Interactivity requires a panel of selection lists and buttons on the left of the screen, and required headings and logos further reduce the screen real estate available for the actual graphic. The maps have been moved to the right of the statistical panels, leaving the state names on the left. Usability testing revealed that users preferred having maps next to the scroll bar when exploring potential clustering or when searching for a particular location using the map image. Triangular buttons at the top of each panel permit sorting by either statistic or by name. Because the 51 states now cannot be shown on a single screen image, the key at the bottom and titles at the top float so they are always visible. The U.S. statistic is shown in a separate box at the top of the state list, and a vertical line through the panels now indicates the target value from the Healthy People 2010 report (http://www.cdc.gov/nchs/hphome.htm). The reader is referred to the project Web site, http://statecancerprofiles.cancer.gov/micromaps, for more detail on this Web-based tool.

Figure 6. Linked micromap plot of the predicted number and rate of lung cancer incidence by state among males in 1999.
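The row layout described in Section 3.1 (units sorted by a statistic, split into perceptual groups of five, with the median unit set apart when the count is odd) can be sketched in a few lines of Python. This is only an illustrative sketch, not the code behind the State Cancer Profiles system; the function name micromap_panels, the group_size parameter and the made-up unit names and rates are assumptions introduced for the example.

```python
from typing import Dict, List, Optional, Tuple


def micromap_panels(
    values: Dict[str, float], group_size: int = 5
) -> Tuple[List[List[str]], Optional[str], List[List[str]]]:
    """Rank units by value (highest first) and split them into small panels.

    Returns (panels_above, median_unit, panels_below). With an odd number of
    units the middle-ranked unit is set apart, mirroring the median box
    described in the text; with an even number median_unit is None.
    """
    ranked = sorted(values, key=lambda name: values[name], reverse=True)
    half = len(ranked) // 2
    if len(ranked) % 2 == 1:
        upper, median_unit, lower = ranked[:half], ranked[half], ranked[half + 1:]
    else:
        upper, median_unit, lower = ranked[:half], None, ranked[half:]

    def chunk(names: List[str]) -> List[List[str]]:
        # Each chunk corresponds to one small map and its rows of symbols.
        return [names[i:i + group_size] for i in range(0, len(names), group_size)]

    return chunk(upper), median_unit, chunk(lower)


# Hypothetical data: 51 made-up units with made-up rates (not real cancer data).
rates = {f"U{i:02d}": 50.0 + i for i in range(51)}
above, median_unit, below = micromap_panels(rates)
print(len(above), median_unit, len(below))
```

Re-sorting by another panel or by name amounts to changing the sort key, which is what the sort buttons in the interactive version do from the user's point of view.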

Figure 7. Linked micromap plot as redesigned for the State Cancer Profiles interactive Web site (http://statecancerprofiles.cancer.gov/micromaps)

3.3 Conditioned choropleth maps

The linked micromap plots allow the comparison of patterns of several statistics, but often the user wants to look at subsets of the map, for example to explore the relationships between cancer rates and risk factors. The conditioned choropleth (CC) map, also designed by Dan Carr, is a useful interactive tool for this purpose (Carr 2000). As shown in Figure 8, a single three-class choropleth map is decomposed into stratified maps according to one or two other factors. Slider bars let the user choose the cutpoints for the mapped value and those for stratification. In this example, age-adjusted colorectal cancer mortality rates are color-coded into low, moderate and high values by the top slider, with values shown on the map as blue, gray and red, respectively. Sliders on the bottom and right allow the user to stratify this single map into nine maps, based on the values of colorectal cancer screening rates (columns) and socioeconomic status (SES) (rows). The low screening/low SES map is on the lower left, high-high on the upper right. Each stratified map shows only the cancer rates for those places that fall into that stratum's category of screening and SES. If we wanted to identify where better screening programs are needed, Figure 8 suggests that the Mississippi River valley could benefit, because it has high cancer rates and low screening rates.

Only informal usability tests have been conducted on this graphic. More testing is needed, although the user group for this exploratory tool is generally more technically sophisticated than the users of the communication tools of the State Cancer Profiles system.
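The stratification performed by the three sliders can be illustrated with a small Python sketch: each place is given a color class from the rate cutpoints and is assigned to one of nine cells from the screening and SES cutpoints. This is a sketch under stated assumptions, not Carr's implementation; the names three_class and conditioned_grid, the tie rule at a cutpoint, and all data values and cutpoints are hypothetical.

```python
from bisect import bisect_right
from typing import Dict, List, Sequence, Tuple

COLORS = ("blue", "gray", "red")  # low, moderate, high, as in the text

Record = Tuple[str, float, float, float]  # (place, rate, screening, SES)


def three_class(value: float, cuts: Tuple[float, float]) -> int:
    """Return 0 (low), 1 (middle) or 2 (high) for a pair of slider cutpoints.

    Assumption: a value equal to a cutpoint falls in the higher class.
    """
    return bisect_right(cuts, value)


def conditioned_grid(
    records: Sequence[Record],
    rate_cuts: Tuple[float, float],
    screening_cuts: Tuple[float, float],
    ses_cuts: Tuple[float, float],
) -> Dict[Tuple[int, int], List[Tuple[str, str]]]:
    """Assign each place to one of nine (SES row, screening column) cells.

    Row 0 / column 0 is the low SES / low screening cell, which the text
    places at the lower left; a display would draw row 2 at the top.
    """
    grid: Dict[Tuple[int, int], List[Tuple[str, str]]] = {
        (row, col): [] for row in range(3) for col in range(3)
    }
    for place, rate, screening, ses in records:
        cell = (three_class(ses, ses_cuts), three_class(screening, screening_cuts))
        grid[cell].append((place, COLORS[three_class(rate, rate_cuts)]))
    return grid


# Hypothetical places and values, for illustration only.
records = [("A", 25.0, 30.0, 0.2), ("B", 10.0, 70.0, 0.9), ("C", 18.0, 50.0, 0.5)]
grid = conditioned_grid(records, (15.0, 22.0), (40.0, 60.0), (0.33, 0.66))
print(grid[(0, 0)])  # places with low SES and low screening
```

Moving a slider corresponds to calling conditioned_grid again with new cutpoints; places colored red in the low screening column are the kind of candidates for better screening programs discussed above.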

Figure 8. Conditioned choropleth map of colorectal cancer mortality rates, 1995-99.

3.4 Exploratory spatial data analysis tools

Parallel coordinate plots (PCP) were developed specifically as a high-dimensional data analysis tool (Inselberg 1985; Wegman 1990). More recently these plots were linked to maps so that analysts could explore the spatial as well as statistical structure in multivariate data (Edsall 2001). These new geovisualization tools are highly interactive and incorporate such standard features as linking and brushing. For example, in Figure 9 counties in northwestern Colorado are highlighted in blue on both the map and the PCP; the user controls the order of axes by dragging variable names in the upper right box.

Currently under development as a collaboration between NCI and Alan MacEachren of Penn State University is an extension of the linked PCP which incorporates several data mining functions and time series graphs (MacEachren 2003). For example, a statistical variable clustering algorithm can run in the background to determine which of the many variables on the PCP are most important and what order of these variables would give the most distinct multivariate clustering. A prototype of this tool identified a multivariate cluster of places with a high per capita number of physicians but a low per capita number of hospitals. The corresponding places on the linked map were not geographically clustered but were the scattered locations of the major teaching hospitals in the US. This new tool will be used to examine spatio-temporal trends in cancer rates and their associations with various risk factors. This is the most complex cancer geovisualization tool developed to date and will serve as an exploratory tool for NCI statisticians and epidemiologists. Usability testing has not yet begun.
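The two basic operations behind a linked parallel coordinate plot can be sketched as follows: each variable is rescaled to a common axis height so that every place becomes one polyline, and brushing a range on one axis yields a set of record indices that a linked map can highlight. This is a minimal sketch, not the GeoVISTA Studio code; the function names pcp_polylines and brush, the min-max rescaling, and the sample values are assumptions made for illustration.

```python
from typing import Dict, List, Set


def pcp_polylines(table: Dict[str, List[float]], order: List[str]) -> List[List[float]]:
    """Rescale each variable to [0, 1] and return one polyline per record.

    `order` is the left-to-right axis order, which the user of an interactive
    PCP can rearrange. A constant variable is drawn at mid-height (0.5).
    """
    scaled = []
    for name in order:
        column = table[name]
        lo, hi = min(column), max(column)
        span = hi - lo
        scaled.append([(v - lo) / span if span else 0.5 for v in column])
    # Transpose: from one list per axis to one list per record.
    return [list(heights) for heights in zip(*scaled)]


def brush(table: Dict[str, List[float]], variable: str, lo: float, hi: float) -> Set[int]:
    """Return the indices of records whose value on `variable` lies in [lo, hi].

    The same indices would be highlighted on the linked map.
    """
    return {i for i, v in enumerate(table[variable]) if lo <= v <= hi}


# Hypothetical per capita values for three places, for illustration only.
table = {"physicians": [1.0, 2.0, 3.0], "hospitals": [10.0, 30.0, 20.0]}
lines = pcp_polylines(table, ["hospitals", "physicians"])
selected = brush(table, "physicians", 2.0, 3.0)
print(lines, selected)
```

Because brush returns record indices rather than drawn objects, the same selection can drive the highlight on both the plot and the map, which is the essence of linking.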

Figure 9. Parallel coordinate plot of covariates linked to a map: an example of new tools for exploratory analysis (http://www.geovistastudio.psu.edu/jsp/whatcanyoudo.jsp)

4. Summary

We learned a great deal about map design from our cognitive research program and continue to learn how features translate from hard copy maps to Web-based maps and interactive computer-based tools. It is clear that no single map design is optimal for all users or for all questions. Tailoring the map and other graphics to the task at hand and to the intended audience is a difficult but necessary task for the designer. Without usability testing and careful design, our maps will not accurately convey the patterns in the underlying data.

REFERENCES

Bertin, J. (1973), Semiologie Graphique (2nd ed.), The Hague: Mouton-Gautier. (English translation by William Berg and Howard Wainer, published as Semiology of Graphics, Madison, WI: University of Wisconsin Press, 1983).
Brewer, C. A. (2003), A Transition in Improving Maps: The ColorBrewer Example, in U.S. Report to the International Cartographic Association, special issue of Cartography and Geographic Information Science, 30(2), 155-158.
Brewer, C. A., Hatchard, G. W. and Harrower, M. A. (2003), ColorBrewer in Print: A Catalog of Color Schemes for Maps, Cartography and Geographic Information Science, 30(1), 5-32.
Brewer, C. A., MacEachren, A. M. and Pickle, L. W. (1997), Mapping mortality: Evaluating color schemes for choropleth maps, Annals of the Association of American Geographers, 87(3), 411-438.
Brewer, C. A. and Pickle, L. W. (2002), Comparison of Methods for Classifying Epidemiological Data on Choropleth Maps in Series, Annals of the Association of American Geographers, 92(4), 662-681.
Carr, D. B., Wallin, J. F. and Carr, D. A. (2000), Two new templates for epidemiology applications: Linked micromap plots and conditioned choropleth maps, Statistics in Medicine, 19, 2521-2538.
Carswell, C. M. (1995), Using color to represent magnitude in statistical maps: The case for double-ended scales, in: Pickle, L. W. and Herrmann, D. (eds) (1995), Cognitive Aspects of Statistical Mapping. NCHS Working Paper Series Report, No. 18, Hyattsville, MD: National Center for Health Statistics, p. 201-228.
Dent, B. D. (1993), Cartography: Thematic Map Design, Dubuque, Iowa: Wm. C. Brown Publishers.
Edsall, R., MacEachren, A. and Pickle, L. (2001), Case Study: Design and Assessment of an Enhanced Geographic Information System for Exploration of Multivariate Health Statistics, Proceedings of IEEE Symposium on Information Visualization (InfoViz 2001), K. Andrews, S. Roth, and P. C. Wong, eds., 22-23 October, 2001, San Diego, CA, 159-163.
Harrower, M. A. and Brewer, C. A. (in press), ColorBrewer: An Online Tool for Selecting Color Schemes for Maps, The Cartographic Journal.
Hastie, R., Hammerle, O., Kerwin, J., Croner, C. M. and Herrmann, D. J. (1996), Human performance reading statistical maps, Journal of Experimental Psychology: Applied, 2, 3-16.
Herrmann, D. and Pickle, L. W. (1996), A cognitive subtask model of statistical map reading, Visual Cognition, 3, 165-190.
Inselberg, A. (1985), The plane with parallel coordinates, The Visual Computer, 1, 69-91.
Kafadar, K. (1994), Choosing among two-dimensional smoothers in practice, Computational Statistics & Data Analysis, 18, 419-439.
Lewandowsky, S., Behrens, J. T., Pickle, L. W., Herrmann, D. J. and White, A. A. (1995), Perception of clusters in mortality maps: Representing magnitude and statistical reliability, in: Pickle, L. W. and Herrmann, D. (eds) (1995), Cognitive Aspects of Statistical Mapping. NCHS Working Paper Series Report, No. 18, Hyattsville, MD: National Center for Health Statistics, p. 107-132.
Lewandowsky, S., Herrmann, D. J., Behrens, J. T., Li, S.-C., Pickle, L. W. and Jobe, J. B. (1993), Perception of clusters in statistical maps, Applied Cognitive Psychology, 7, 533-551.
MacEachren, A. M., Brewer, C. A. and Pickle, L. W. (1998), Visualizing georeferenced data: Representing reliability in health statistics, Environment and Planning A, 30, 1547-1561.
MacEachren, A., Hardisty, F., Dai, X. and Pickle, L. (2003), Supporting visual analysis of federal geospatial statistics, Communications of the ACM, 46, 63-64.
Mungiole, M., Pickle, L. W. and Simonson, K. H. (1999), Application of a weighted headbanging algorithm to mortality data maps, Statistics in Medicine, 18, 3201-3209.
Pickle, L. W. and Herrmann, D. (1994), The process of reading statistical maps: The effect of color, Statistical Computing and Statistical Graphics Newsletter, 5(1), p. 1, 12-16.
Pickle, L. W. and Herrmann, D. J. (2000), Cognitive research for the design of statistical rate maps, Proceedings of the Survey Research Section of the 1999 Annual Meeting of the American Statistical Association, p. 185-191.
Pickle, L. W., Herrmann, D., Kerwin, J., Croner, C. M. and White, A. A. (1994), The impact of statistical graphic design on interpretation of disease rate maps, Proceedings of the Statistical Graphics Section, American Statistical Association 1993 Meeting, San Francisco, CA, p. 111-116.
Pickle, L. W., Herrmann, D. and Wilson, B. (1995), A legendary study of statistical map reading: The cognitive effectiveness of statistical map legends, in: Pickle, L. W. and Herrmann, D. (eds) (1995), Cognitive Aspects of Statistical Mapping. NCHS Working Paper Series Report, No. 18, Hyattsville, MD: National Center for Health Statistics, p. 233-248.
Pickle, L. W., Mungiole, M., Jones, G. K. and White, A. A. (1996), Atlas of United States Mortality. DHHS Publ. No. (PHS) 97-1015. Hyattsville, MD: National Center for Health Statistics.
Pickle, L. W. and White, A. A. (1995), Effect of the choice of age-adjustment method on maps of death rates, Statistics in Medicine, 14, 615-627.
Wegman, E. (1990), Hyperdimensional data analysis using parallel coordinates, Journal of the American Statistical Association, 85, 664-675.