Data Visualization (CIS 468) Tables & Maps Dr. David Koop
Discriminability What is problematic here? File vtkdatasetreader PythonSource vtkimageclip vtkimagedatageometryfilter vtkimageresample vtkimagereslice vtkwarpscalar vtkcolortransferfunction vtklookuptable vtkelevationfilter vtkoutlinefilter PythonSource vtkpolydatanormals vtkpolydatamapper vtkproperty vtkdatasetmapper vtkscalarbaractor vtkactor vtkcamera vtkactor vtklodactor vtkcubeaxesactor2d vtkrenderer VTKCell [Koop et al., 2013] 2
Separability [GOOD] 3
Separability [GOOD] 3
Visualizing Tables Express Values Separate, Order, Align Regions Separate Order Align 1 Key 2 Keys 3 Keys Many Keys List Matrix Volume Recursive Subdivision Axis Orientation Rectilinear Parallel Radial Layout Density Dense Space-Filling [Munzner (ill. Maguire), 2014] 4
Scatterplots Prices in 1980 Fish Prices over the Years 450 400 350 300 250 200 150 100 50 0 0 50 100 150 200 250 300 350 400 450 Prices in 1970 Data: two quantitative values Task: find trends, clusters, outliers How: marks at spatial position in horizontal and vertical directions Correlation: dependence between two attributes - Positive and negative correlation - Indicated by lines Coordinate system (axes) and labels are important! 5
Bar Charts Data: one quantitative attribute, one categorical attribute Task: lookup & compare values How: line marks, vertical position (quantitative), horizontal position (categorical) What about length? Ordering criteria: alphabetical or using quantitative attribute Scalability: distinguishability - bars at least one pixel wide - hundreds 100 75 50 25 0 100 75 50 25 0 Animal Type Animal Type [Munzner (ill. Maguire), 2014] 6
Dot and Line Charts 20 15 10 5 0 20 15 10 5 0 Year Data: one quantitative attribute, one ordered attribute Task: lookup values, find outliers and trends How: point mark and positions Line Charts: add connection mark (line) Similar to scatterplots but allow ordered attribute Year [Munzner (ill. Maguire), 2014] 7
Proper Use of Line and Bar Charts 60 50 40 30 20 10 0 Female Lion Antelope Panda Male Female Lion Antelope Male 60 50 40 30 20 10 0 What does the line indicate? Does this make sense? [Adapted from Zacks and Tversky, 1999, Munzner (ill. Maguire), 2014] 8
Streamgraphs [Ebb and Flow of Movies, M. Bloch et al., New York Times, 2008] 9
Multiscale Banking 706 IEEE TRANSACTIONS ON VISUALIZATIO Sunspot Cycles Aspect Ratio = 3.96 Aspect Ratio = 22.35 Power Spectrum [Heer and Agrawala, 2006] 10
Assignment 2 Link Use Tableau and D3 to create stacked bar charts of Citibike data Due Friday, Start now! 11
Heatmaps Data: Two keys, one quantitative attribute Task: Find clusters, outliers, summarize How: area marks in grid, color encoding of quantitative attribute Scalability: number of pixels for area marks (millions), <12 colors Red-green color scales often used - Be aware of colorblindness! Fast-Pitch Softball Slugging Percentage [fastpitchanalytics.com] 12
Bertin Matrices Must we only use color? - What other marks might be appropriate? [C.Perrin et al., 2014] 13
Bertin Matrices Must we only use color? - What other marks might be appropriate? [C.Perrin et al., 2014] 13
Bertin s Encodings Text 0 1 2 3 4 5 6 7 8 9 10 11 text Grayscale QUANTITY OF INK ENCODINGS POSITIONAL ENCODING MEAN-BASED ENCODINGS Circle Dual bar chart Bar chart Line Black and white bar chart Average bar chart [C.Perrin et al., 2014] 14
Matrix Reordering [Bertin Exhibit (INRIA, Vis 2014), Photo by Robert Kosara] 15
Cluster Heatmap [File System Similarity, R. Musăloiu-E., 2009] 16
Cluster Heatmap Data & Task: Same as Heatmap How: Area marks but matrix is ordered by cluster hierarchies Scalability: limited by the cluster dendrogram Dendrogram: a visual encoding of tree data with leaves aligned 17
Sepal.Length 2.0 3.0 4.0 0.5 1.5 2.5 4.5 5.5 6.5 7.5 2.0 3.0 4.0 Sepal.Width Petal.Length 1 2 3 4 5 6 7 4.5 5.5 6.5 7.5 0.5 1.5 2.5 1 2 3 4 5 6 7 Petal.Width Iris Data (red=setosa,green=versicolor,blue=virginica) Scatterplot Matrix (SPLOM) Data: Many quantitative attributes Derived Data: names of attributes Task: Find correlations, trends, outliers How: Scatterplots in matrix alignment Scale: attributes: ~12, items: hundreds? Visualizations in a visualization: at high level, marks are themselves visualizations 18 [Wikipedia]
Spatial Axis Orientation So far, we have seen the vertical and horizontal axes (a rectilinear layout) used to encode almost everything What other possibilities are there for axes? [Munzner (ill. Maguire), 2014] 19
Spatial Axis Orientation So far, we have seen the vertical and horizontal axes (a rectilinear layout) used to encode almost everything What other possibilities are there for axes? - Parallel axes Math Physics Dance Drama 100 90 80 70 60 50 40 30 20 10 0 Parallel Coordinates [Munzner (ill. Maguire), 2014] 19
Spatial Axis Orientation So far, we have seen the vertical and horizontal axes (a rectilinear layout) used to encode almost everything What other possibilities are there for axes? - Parallel axes - Radial axes Math Physics Dance Drama 100 90 80 70 60 50 40 30 20 10 0 Parallel Coordinates [Munzner (ill. Maguire), 2014] 19
Radial Axes 120 90 60 150 30 180 0 210 330 240 270 300 20
Radial Axes Polar Coordinates (angle + position along the line at that angle) 150 120 90 60 30 What types of encodings are possible for tabular data in polar coordinates? 180 0 210 330 240 270 300 20
Radial Axes Polar Coordinates (angle + position along the line at that angle) 150 120 90 60 30 What types of encodings are possible for tabular data in polar coordinates? - Radial bar charts 180 0 - Pie charts - Donut charts 210 330 240 270 300 20
Part-of-whole: Relative % comparison? 40M 35M 30M Population 65 Years and Over 45 to 64 Years 25 to 44 Years 18 to 24 Years 14 to 17 Years 5 to 13 Years Under 5 Years 25M 20M 15M 10M 5M 0M CA TX NY FL IL PA OH MI GA NC NJ VA WA AZ MA IN TN MOMD WI MN CO AL SC LA KY OR OK CT IA MS AR KS UT NV NMWV NE ID ME NH HI RI MT DE SD AK ND VT DC WY [Stacked Bar Chart, Bostock, 2017] 21
Normalized Stacked Bar Chart 100% 90% 65 Years and Over 80% 70% 45 to 64 Years 60% 50% 40% 25 to 44 Years 30% 18 to 24 Years 20% 14 to 17 Years 10% 5 to 13 Years 0% UT TX ID AZ NV GA AK MSNM NE CA OK SD CO KS WYNC AR LA IN IL MN DE HI SCMO VA IA TN KY AL WAMDND OH WI OR NJ MT MI FL NY DC CT PA MAWV RI NH ME VT Under 5 Years [Normalized Stacked Bar Chart, Bostock, 2017] 22
Pie Chart 65 <5 45-64 5-13 14-17 18-24 25-44 [Pie Chart, Bostock, 2017] 23
Pie Charts vs. bar charts [Munzner's Textbook, 2014] - Angle channel is lower precision then position in bar charts What about donut charts? Are we judging angle, or are we judging area, or arc length? - See "An Illustrated Study of the Pie Chart Study Results" by R. Kosara 24
Arcs, Angles, or Areas? [R. Kosara and D. Skau, 2016] 25
Absolute Error Relative to Pie Chart [R. Kosara and D. Skau, 2016] 26
Conclusion: We do not read pie charts by angle 27
Pies vs. Bars but area is still harder to judge than position Screens are usually not round 28
Geographic Visualization 29
Geographic Visualization Spatial data (have positions) Cartography: the science of drawing maps - Lots of history and well-established procedures - May also have non-spatial attributes associated with items - Thematic cartography: integrate these non-spatial attributes (e.g. population, life expectancy, etc.) Goals: - Respect cartographic principles - Understand data with geographic references with the visualization principles 30
Map Projection [P. Foresman, Wikimedia] 31
Flattening the Sphere? [USGS Map Projections] 32
Lambert Conformal Conic Projection [USGS Map Projections] 33
Standard Projections [J. P. Snyder, USGS] 34
Map Projections [xkcd] 35
Projection Classification Angle-preserving [J. van Wijk, 2008] 36
Search Tasks Location known Location unknown Target known Lookup Locate Target unknown Browse Explore [Munzner (ill. Maguire), 2014] 37
Lookup 38
Locate 39
Adding Data Discrete: a value is associated with a specific position - Size - Color Hue - Charts Continuous: each spatial position has a value (fields) - Heatmap - Isolines 40
Discrete Categorical Attribute: Shape [Acadia NP, National Park Service] 41
Discrete Categorical Attribute: Shape [Acadia NP, National Park Service] 41
Discrete Quantitative Attribute: Color Saturation 42
Discrete Quantitative Attribute: Size 43
Discrete Quantitative Attributes: Bar Chart [http://mis4gis.com/hgistr.org/] 44
Continuous Quantitative Attribute: Color Hue [http://tampaseo.com/2012/02/websites-heat-mapping-users/] 45
Time as the attribute [NYTimes] 46
Isolines [USGS via Wikipedia] 47
Isolines Scalar fields: - value at each location - sampled on grids Isolines use derived data from the scalar field - Interpret field as representing continuous values - Derived data is geometry: new lines that represent the same attribute value Scalability: dozens of levels Other encodings? 48