Data Visualization (DSC 530/CIS )

Similar documents
Data Visualization (CIS 468)

Your Galactic Address

Lecture 26 Section 8.4. Mon, Oct 13, 2008

Nursing Facilities' Life Safety Standard Survey Results Quarterly Reference Tables

Analyzing Severe Weather Data

Sample Statistics 5021 First Midterm Examination with solutions

Use your text to define the following term. Use the terms to label the figure below. Define the following term.

Evolution Strategies for Optimizing Rectangular Cartograms

Multiway Analysis of Bridge Structural Types in the National Bridge Inventory (NBI) A Tensor Decomposition Approach

CIS 467/602-01: Data Visualization

EXST 7015 Fall 2014 Lab 08: Polynomial Regression

Parametric Test. Multiple Linear Regression Spatial Application I: State Homicide Rates Equations taken from Zar, 1984.

Smart Magnets for Smart Product Design: Advanced Topics

Statistical Mechanics of Money, Income, and Wealth

Annual Performance Report: State Assessment Data

Cluster Analysis. Part of the Michigan Prosperity Initiative

SAMPLE AUDIT FORMAT. Pre Audit Notification Letter Draft. Dear Registrant:

What Lies Beneath: A Sub- National Look at Okun s Law for the United States.

Class business PS is due Wed. Lecture 20 (QPM 2016) Multivariate Regression November 14, / 44

Drought Monitoring Capability of the Oklahoma Mesonet. Gary McManus Oklahoma Climatological Survey Oklahoma Mesonet

Appendix 5 Summary of State Trademark Registration Provisions (as of July 2016)

Summary of Natural Hazard Statistics for 2008 in the United States

2006 Supplemental Tax Information for JennisonDryden and Strategic Partners Funds

Module 19: Simple Linear Regression

Swine Enteric Coronavirus Disease (SECD) Situation Report June 30, 2016

Empirical Application of Panel Data Regression

C Further Concepts in Statistics

Forecasting the 2012 Presidential Election from History and the Polls

Scatterplots. STAT22000 Autumn 2013 Lecture 4. What to Look in a Scatter Plot? Form of an Association

Swine Enteric Coronavirus Disease (SECD) Situation Report Sept 17, 2015

Final Exam. 1. Definitions: Briefly Define each of the following terms as they relate to the material covered in class.

Swine Enteric Coronavirus Disease (SECD) Situation Report Mar 5, 2015

REGRESSION ANALYSIS BY EXAMPLE

AIR FORCE RESCUE COORDINATION CENTER

Regression Diagnostics

Kari Lock. Department of Statistics, Harvard University Joint Work with Andrew Gelman (Columbia University)

Combinatorics. Problem: How to count without counting.

The veto as electoral stunt

Analysis of the USDA Annual Report (2015) of Animal Usage by Research Facility. July 4th, 2017

Meteorology 110. Lab 1. Geography and Map Skills

Further Concepts in Statistics

Do Now 18 Balance Point. Directions: Use the data table to answer the questions. 2. Explain whether it is reasonable to fit a line to the data.

Estimating Dynamic Games of Electoral Competition to Evaluate Term Limits in U.S. Gubernatorial Elections: Online Appendix

Week 3 Linear Regression I

Further Concepts in Statistics

AFRCC AIR FORCE RESCUE COORDINATION CENTER

Some concepts are so simple

Draft Report. Prepared for: Regional Air Quality Council 1445 Market Street, Suite 260 Denver, Colorado Prepared by:

Discontinuation of Support for Field Chemistry Measurements in the National Atmospheric Deposition Program National Trends Network (NADP/NTN)

Test of Convergence in Agricultural Factor Productivity: A Semiparametric Approach

Effects of Various Uncertainty Sources on Automatic Generation Control Systems

$3.6 Billion GNMA Servicing Offering

Stat 101 Exam 1 Important Formulas and Concepts 1

Outline. Administrivia and Introduction Course Structure Syllabus Introduction to Data Mining

AC/RC Regional Councils of Colonels & Partnerships

Resources. Amusement Park Physics With a NASA Twist EG GRC

Time-Series Trends of Mercury Deposition Network Data

Graphing. LI To practice reading and creating graphs

How to Make or Plot a Graph or Chart in Excel

Lecture Notes 2: Variables and graphics

Blueline Tilefish Assessment Approach: Stock structure, data structure, and modeling approach

CS-11 Tornado/Hail: To Model or Not To Model

CS 361: Probability & Statistics

An area chart emphasizes the trend of each value over time. An area chart also shows the relationship of parts to a whole.

Graphing Data. Example:

2/25/2019. Taking the northern and southern hemispheres together, on average the world s population lives 24 degrees from the equator.

Exploratory Data Analysis

Oak Maple Beech Birch 1, Hickory Total 3,

CS 5630/6630 Scientific Visualization. Elementary Plotting Techniques II

Introduction to Mathematical Statistics and Its Applications Richard J. Larsen Morris L. Marx Fifth Edition

MATH 1150 Chapter 2 Notation and Terminology

Overview. 4.1 Tables and Graphs for the Relationship Between Two Variables. 4.2 Introduction to Correlation. 4.3 Introduction to Regression 3.

Intro to Info Vis. CS 725/825 Information Visualization Spring } Before class. } During class. Dr. Michele C. Weigle

Organization of 2D Space

Creative Data Mining

Vocabulary: Data About Us

The Observational Climate Record

2. Graphing Practice. Warm Up

Teachers Curriculum Institute Map Skills Toolkit 411

Alexander Klippel and Chris Weaver. GeoVISTA Center, Department of Geography The Pennsylvania State University, PA, USA

Analytical Graphing. lets start with the best graph ever made

CS444/544: Midterm Review. Carlos Scheidegger

Math 138 Summer Section 412- Unit Test 1 Green Form, page 1 of 7

9.7 Statistics and Data (Graphical)

Multiple Choice. Chapter 2 Test Bank

APRIL 1999 THE WALL STREET JOURNAL CLASSROOM EDITION Hurricanes, typhoons, coastal storms Earthquake (San Francisco area) Flooding

Relate Attributes and Counts

Business Statistics. Lecture 9: Simple Regression

A is one of the categories into which qualitative data can be classified.

Chapter 2: Tools for Exploring Univariate Data

8/4/2009. Describing Data with Graphs

Maths GCSE Langdon Park Foundation Calculator pack A

A Second Opinion Correlation Coefficient. Rudy A. Gideon and Carol A. Ulsafer. University of Montana, Missoula MT 59812

Biostatistics Presentation of data DR. AMEER KADHIM HUSSEIN M.B.CH.B.FICMS (COM.)

Clear Roads Overview. AASHTO Committee on Maintenance July 23, 2018 Charlotte, North Carolina

MULTIVARIATE HOMEWORK #5

What is statistics? Statistics is the science of: Collecting information. Organizing and summarizing the information collected

Visualizing and summarizing data

Experimental Design and Data Analysis for Biologists

Tutorial 8 Raster Data Analysis

Transcription:

Data Visualization (DSC 530/CIS 602-01) Tables Dr. David Koop

Visualization of Tables Items and attributes For now, attributes are not known to be positions Keys and values - key is an independent attribute that is unique and identifies item - value tells some aspect of an item Keys: categorical/ordinal Values: +quantitative Levels: unique values of categorical or ordered attributes Items (rows) Attributes (columns) Cell containing value Multidimensional Table Value in cell [Munzner (ill. Maguire), 2014] 2

Express Values: Scatterplots Prices in 1980 Fish Prices over the Years 450 400 350 300 250 200 150 100 50 0 0 50 100 150 200 250 300 350 400 450 Prices in 1970 Data: two quantitative values Task: find trends, clusters, outliers How: marks at spatial position in horizontal and vertical directions Correlation: dependence between two attributes - Positive and negative correlation - Indicated by lines Coordinate system (axes) and labels are important! 3

Log-Log Plot 4 0.001 0.0001 0.00001 100 1000 10000 n dist [Wickham, 2014]

Bubble Plot Bubbles FACTS TEACH ABOUT HOW TO USE Share! " # $ English years Color World Regions 80 70 Select Search... Afghanistan Life expectancy 60 50 2015 Albania Algeria Andorra Angola Antigua and Barbuda Size Population, total 40 2015Zoom 100% 1800 1900 2000 OPTIONS EXPAND PRESENT [Gapminder, Wealth & Health of Nations] 5

Project Proposal Information Due Next Monday, March 5 Choices: - [CreateVis] Create a complex, interactive visualization of a dataset - [Research] Research project that involves visualization (new technique, evaluation, etc.) Check with me if you're interested in the research project option Start: - [CreateVis] Looking for a dataset: Awesome Public Datasets, Kaggle, etc. - [Research] Surveying literature, identifying goals 6

Separate, Order, and Align: Categorical Regions Categorical: =,!= Spatial position can be used for categorical attributes Use regions, distinct contiguous bounded areas, to encode categorical attributes Three operations on the regions: - Separate (use categorical attribute) - Align - Order (use some other ordered attribute) Alignment and order can use same or different attribute 7

List Alignment: Bar Charts Data: one quantitative attribute, one categorical attribute Task: lookup & compare values How: line marks, vertical position (quantitative), horizontal position (categorical) What about length? Ordering criteria: alphabetical or using quantitative attribute Scalability: distinguishability - bars at least one pixel wide - hundreds 100 75 50 25 0 100 75 50 25 0 Animal Type Animal Type [Munzner (ill. Maguire), 2014] 8

Streamgraphs Include a time attribute Data: multidimensional table, one quantitative attribute (count), one ordered key attribute (time), one categorical key attribute + derived attribute: layer ordering (quantitative) Task: analyze trends in time, find (maxmial) outliers How: derived position+geometry, length, color Scalability: more categories than stacked bar charts fig 2 films from the summer of 2007 [Byron and Wattenberg, 2012] 9

Streamgraphs [Ebb and Flow of Movies, M. Bloch et al., New York Times, 2008] 10

Dot and Line Charts 20 15 10 5 0 20 15 10 5 0 Year Data: one quantitative attribute, one ordered attribute Task: lookup values, find outliers and trends How: point mark and positions Line Charts: add connection mark (line) Similar to scatterplots but allow ordered attribute Year [Munzner (ill. Maguire), 2014] 11

Proper Use of Line and Bar Charts 60 50 40 30 20 10 0 Female Lion Antelope Panda Male Female Lion Antelope Male 60 50 40 30 20 10 0 60 60 50 50 40 40 30 30 20 20 10 10 0 10-year-olds 12-year-olds 0 10-year-olds 12-year-olds [Adapted from Zacks and Tversky, 1999, Munzner (ill. Maguire), 2014] 12

Proper Use of Line and Bar Charts 60 50 40 30 20 10 0 Female Lion Antelope Panda Male Female Lion Antelope Male 60 50 40 30 20 10 0 What does the line indicate? Does this make sense? [Adapted from Zacks and Tversky, 1999, Munzner (ill. Maguire), 2014] 13

Aspect Ratio Trends in line charts are more apparent because we are using angle as a channel Perception of angle (and the relative difference between angles) is important Initial experiments found people best judge differences in slope when angles are around 45 degrees (Cleveland et al., 1988, 1993) 14

Multiscale Banking 706 IEEE TRANSACTIONS ON VISUALIZATIO Sunspot Cycles Aspect Ratio = 3.96 Aspect Ratio = 22.35 Power Spectrum [Heer and Agrawala, 2006] 15

Multiscale Banking ON AND COMPUTER GRAPHICS, VOL. 12, NO. 5, SEPTEMBER/OCTOBER 2006 PRMTX Mutual Fund Aspect Ratio = 4.23 Aspect Ratio = 14.55 Power Spectrum [Heer and Agrawala, 2006] 16

Expanding the Study Cleveland et al. did not study the entire space of slope comparisons and 45 degrees was at the low end of their study (blue marks on right) Talbot et al. compared more slopes and found that people do better with smaller slopes Baselines may aid with this Slope ratio (pij) 87% 65% 48% 35% 25% 17% 11% 9 18 30 45 60 72 81 Mid angle (θ m ) [Talbot et al., 2013] 17

Heatmaps Data: Two keys, one quantitative attribute Task: Find clusters, outliers, summarize How: area marks in grid, color encoding of quantitative attribute Scalability: number of pixels for area marks (millions), <12 colors Red-green color scales often used - Be aware of colorblindness! Fast-Pitch Softball Slugging Percentage [fastpitchanalytics.com] 18

Bertin Matrices Must we only use color? - What other marks might be appropriate? [C.Perrin et al., 2014] 19

Bertin Matrices Must we only use color? - What other marks might be appropriate? [C.Perrin et al., 2014] 19

Bertin s Encodings Text 0 1 2 3 4 5 6 7 8 9 10 11 text Grayscale QUANTITY OF INK ENCODINGS POSITIONAL ENCODING MEAN-BASED ENCODINGS Circle Dual bar chart Bar chart Line Black and white bar chart Average bar chart [C.Perrin et al., 2014] 20

Matrix Reordering [Bertin Exhibit (INRIA, Vis 2014), Photo by Robert Kosara] 21

Cluster Heatmap [File System Similarity, R. Musăloiu-E., 2009] 22

Cluster Heatmap Data & Task: Same as Heatmap How: Area marks but matrix is ordered by cluster hierarchies Scalability: limited by the cluster dendrogram Dendrogram: a visual encoding of tree data with leaves aligned 23

Sepal.Length 2.0 3.0 4.0 0.5 1.5 2.5 4.5 5.5 6.5 7.5 2.0 3.0 4.0 Sepal.Width Petal.Length 1 2 3 4 5 6 7 4.5 5.5 6.5 7.5 0.5 1.5 2.5 1 2 3 4 5 6 7 Petal.Width Iris Data (red=setosa,green=versicolor,blue=virginica) Scatterplot Matrix (SPLOM) Data: Many quantitative attributes Derived Data: names of attributes Task: Find correlations, trends, outliers How: Scatterplots in matrix alignment Scale: attributes: ~12, items: hundreds? Visualizations in a visualization: at high level, marks are themselves visualizations 24 [Wikipedia]

Spatial Axis Orientation So far, we have seen the vertical and horizontal axes (a rectilinear layout) used to encode almost everything What other possibilities are there for axes? [Munzner (ill. Maguire), 2014] 25

Spatial Axis Orientation So far, we have seen the vertical and horizontal axes (a rectilinear layout) used to encode almost everything What other possibilities are there for axes? - Parallel axes Math Physics Dance Drama 100 90 80 70 60 50 40 30 20 10 0 Parallel Coordinates [Munzner (ill. Maguire), 2014] 25

Spatial Axis Orientation So far, we have seen the vertical and horizontal axes (a rectilinear layout) used to encode almost everything What other possibilities are there for axes? - Parallel axes - Radial axes Math Physics Dance Drama 100 90 80 70 60 50 40 30 20 10 0 Parallel Coordinates [Munzner (ill. Maguire), 2014] 25

Radial Axes 120 90 60 150 30 180 0 210 330 240 270 300 26

Radial Axes Polar Coordinates (angle + position along the line at that angle) 150 120 90 60 30 What types of encodings are possible for tabular data in polar coordinates? 180 0 210 330 240 270 300 26

Radial Axes Polar Coordinates (angle + position along the line at that angle) 150 120 90 60 30 What types of encodings are possible for tabular data in polar coordinates? - Radial bar charts 180 0 - Pie charts - Donut charts 210 330 240 270 300 26

Part-of-whole: Relative % comparison? 40M 35M 30M Population 65 Years and Over 45 to 64 Years 25 to 44 Years 18 to 24 Years 14 to 17 Years 5 to 13 Years Under 5 Years 25M 20M 15M 10M 5M 0M CA TX NY FL IL PA OH MI GA NC NJ VA WA AZ MA IN TN MOMD WI MN CO AL SC LA KY OR OK CT IA MS AR KS UT NV NMWV NE ID ME NH HI RI MT DE SD AK ND VT DC WY [Stacked Bar Chart, Bostock, 2017] 27

Normalized Stacked Bar Chart 100% 90% 65 Years and Over 80% 70% 45 to 64 Years 60% 50% 40% 25 to 44 Years 30% 18 to 24 Years 20% 14 to 17 Years 10% 5 to 13 Years 0% UT TX ID AZ NV GA AK MSNM NE CA OK SD CO KS WYNC AR LA IN IL MN DE HI SCMO VA IA TN KY AL WAMDND OH WI OR NJ MT MI FL NY DC CT PA MAWV RI NH ME VT Under 5 Years [Normalized Stacked Bar Chart, Bostock, 2017] 28

Pie Chart 65 <5 45-64 5-13 14-17 18-24 25-44 [Pie Chart, Bostock, 2017] 29

Pie Charts vs. bar charts [Munzner's Textbook, 2014] - Angle channel is lower precision then position in bar charts What about donut charts? Are we judging angle, or are we judging area, or arc length? - "Arcs, Angles, or Areas: Individual Data Encodings in Pie and Donut Charts", D. Skau and R. Kosara, 2016 - "Judgment Error in Pie Chart Variations", R. Kosara and D. Skau, 2016 - Summary: "An Illustrated Study of the Pie Chart Study Results" 30

Study Setup Three studies 80-100 participants each Each answered ~60 questions Computed results using 95% Confidence Intervals 95% CI Mean [R. Kosara and D. Skau, 2016] 31

Arcs, Angles, or Areas? [R. Kosara and D. Skau, 2016] 32

Signed Error [R. Kosara and D. Skau, 2016] 33

Absolute Error [R. Kosara and D. Skau, 2016] 34

Absolute Error Relative to Pie Chart [R. Kosara and D. Skau, 2016] 35

Donut Charts Width [R. Kosara and D. Skau, 2016] 36

Pie Chart Variations [R. Kosara and D. Skau, 2016] 37

Pie Chart Variations [R. Kosara and D. Skau, 2016] 38

Conclusion: We do not read pie charts by angle [R. Kosara and D. Skau, 2016] 39

Pies vs. Bars but area is still harder to judge than position Screens are usually not round 40