How to display data badly

Similar documents
2017 Source of Foreign Income Earned By Fund

Canadian Imports of Honey

Appendix B: Detailed tables showing overall figures by country and measure

Do Policy-Related Shocks Affect Real Exchange Rates? An Empirical Analysis Using Sign Restrictions and a Penalty-Function Approach

ˆ GDP t = GDP t SCAN t (1) t stat : (3.71) (5.53) (3.27) AdjustedR 2 : 0.652

Quick Guide QUICK GUIDE. Activity 1: Determine the Reaction Rate in the Presence or Absence of an Enzyme

Export Destinations and Input Prices. Appendix A

How Well Are Recessions and Recoveries Forecast? Prakash Loungani, Herman Stekler and Natalia Tamirisa

READY TO SCRAP: HOW MANY VESSELS AT DEMOLITION VALUE?

R/qtl workshop. (part 2) Karl Broman. Biostatistics and Medical Informatics University of Wisconsin Madison. kbroman.org

Introduction to QTL mapping in model organisms

Global Data Catalog initiative Christophe Charpentier ArcGIS Content Product Manager

Learning Objectives. Math Chapter 3. Chapter 3. Association. Response and Explanatory Variables

AD HOC DRAFTING GROUP ON TRANSNATIONAL ORGANISED CRIME (PC-GR-COT) STATUS OF RATIFICATIONS BY COUNCIL OF EUROPE MEMBER STATES

A Prior Distribution of Bayesian Nonparametrics Incorporating Multiple Distances

FORENSIC TOXICOLOGY SCREENING APPLICATION SOLUTION

Gravity Analysis of Regional Economic Interdependence: In case of Japan

International Standardization for Measurement and Characterization of Nanomaterials

IEEE Transactions on Image Processing EiC Report

Census 2004 Summary. Membership

TIGER: Tracking Indexes for the Global Economic Recovery By Eswar Prasad and Karim Foda

04 June Dim A W V Total. Total Laser Met

The Outer Space Legal Regime and UN Register of Space Objects

International Standard

USDA Dairy Import License Circular for 2018

DESKTOP STUDY ON GLOBAL IMPORTS OF HANDMADE CARPETS & FLOOR COVERINGS AT A GLANCE

Bilateral Labour Agreements, 2004

TIGER: Tracking Indexes for the Global Economic Recovery By Eswar Prasad, Karim Foda, and Ethan Wu

CONTINENT WISE ANALYSIS OF ZOOLOGICAL SCIENCE PERIODICALS: A SCIENTOMETRIC STUDY

[ OSTRO PASS-THROUGH SAMPLE PREPARATION PRODUCT ] The Simpler Way to Cleaner Samples

Calories, Obesity and Health in OECD Countries

Chapter 6 Scatterplots, Association and Correlation

USDA Dairy Import License Circular for 2018

North-South Gap Mapping Assignment Country Classification / Statistical Analysis

Corporate Governance, and the Returns on Investment

International and regional network status

Governments that have requested pre-export notifications pursuant to article 12, paragraph 10 (a), of the 1988 Convention

Cyclone Click to go to the page. Cyclone

[ Care and Use Manual ]

TIMSS 2011 The TIMSS 2011 Instruction to Engage Students in Learning Scale, Fourth Grade

St. Gallen, Switzerland, August 22-28, 2010

From Argentina to Zimbabwe: Where Should I Sell my Widgets?

STATISTICS Relationships between variables: Correlation

A COMPREHENSIVE WORLDWIDE WEB-BASED WEATHER RADAR DATABASE

protein interaction analysis bulletin 6300

Stochastic Analysis and Forecasts of the Patterns of Speed, Acceleration, and Levels of Material Stock Accumulation in Society

Parity Reversion of Absolute Purchasing Power Parity Zhi-bai ZHANG 1,a,* and Zhi-cun BIAN 2,b

USDA Dairy Import License Circular for 2018 Commodity/

DISTILLED SPIRITS - IMPORTS BY VALUE DECEMBER 2017

DISTILLED SPIRITS - IMPORTS BY VOLUME DECEMBER 2017

Measuring Export Competitiveness

Demography, Time and Space

Online Appendix to: Crises and Recoveries in an Empirical Model of. Consumption Disasters

PIRLS 2016 INTERNATIONAL RESULTS IN READING

Canadian Mapping. Grades 5-6. Written by Lynda Golletz Illustrated by S&S Learning Materials

International Student Enrollment Fall 2018 By CIP Code, Country of Citizenship, and Education Level Harpur College of Arts and Sciences

ORGANISATION FOR ECONOMIC CO-OPERATION AND DEVELOPMENT

ia PU BLi s g C o M Pa K T Wa i n CD-1576

Vocabulary: Data About Us

2017 Scientific Ocean Drilling Bibliographic Database Report

NOWCASTING REPORT. Updated: April 15, 2016

TIMSS 2011 The TIMSS 2011 Teacher Career Satisfaction Scale, Fourth Grade

NOWCASTING REPORT. Updated: May 20, 2016

ECON Introductory Econometrics. Lecture 13: Internal and external validity

Mathematics. Pre-Leaving Certificate Examination, Paper 2 Higher Level Time: 2 hours, 30 minutes. 300 marks L.20 NAME SCHOOL TEACHER

Multimedia on Nuclear Reactor Physics In order to improve education and training quality, a Multimedia on Nuclear Reactor Physics has been developed.

Complementary Use of Raman and FT-IR Imaging for the Analysis of Multi-Layer Polymer Composites

CropCast Weekly Feedgrains Report

Nigerian Capital Importation QUARTER THREE 2016

Obsolete Product(s) - Obsolete Product(s)

Introduction to QTL mapping in model organisms

2006 Supplemental Tax Information for JennisonDryden and Strategic Partners Funds

BSH Description. 2. Features. 3. Applications. 4. Pinning information. N-channel enhancement mode field-effect transistor

ECONOMICS SERIES SWP 2009/10. A Nonlinear Approach to Testing the Unit Root Null Hypothesis: An Application to International Health Expenditures

GLOBAL INFORMATION PROCESS IN THE FIELD OF ELECTROCHEMISTRY AND MOLDAVIAN ELECTROCHEMICAL SCHOOL. SCIENCEMETRIC ANALYSIS

HCF4532B 8-BIT PRIORITY ENCODER

PIRLS 2011 The PIRLS 2011 Students Motivated to Read Scale

CropCast Europe Weekly Report

IN QUALITATIVE ANALYSIS,

PIRLS 2011 The PIRLS 2011 Safe and Orderly School Scale

Forecast Million Lbs. % Change 1. Carryin August 1, ,677, ,001, % 45.0

Evaluating sensitivity of parameters of interest to measurement invariance using the EPC-interest

Unit of Study: Physical Geography & Settlement Patterns; Culture & Civilizations; and The Spread of Ideas

SIMPLE LINEAR REGRESSION STAT 251

Data Visualization Discover, Explore and be Skeptical. Gender & Math DATA. Seminar 1 Exploratory Data Analysis

ICC Rev August 2010 Original: English. Agreement. International Coffee Council 105 th Session September 2010 London, England

Measuring Inequality with Ordinal data

An applied statistician does probability:

Objectives: 5pts Graph the average ecological footprints of several countries. Select two countries with different sized footprints and research

Keysight Technologies Measuring Substrate-Independent Young s Modulus of Low-k Films by Instrumented Indentation. Application Note

Forecast Million Lbs. % Change 1. Carryin August 1, ,677, ,001, % 45.0

Keysight Technologies Young s Modulus of Dielectric Low-k Materials. Application Note

NOWCASTING REPORT. Updated: September 23, 2016

10.2 Fitting a Linear Model to Data

TIMSS 2011 The TIMSS 2011 School Discipline and Safety Scale, Fourth Grade

DISTILLED SPIRITS - EXPORTS BY VALUE DECEMBER 2017

2012 OCEAN DRILLING CITATION REPORT

New Multi-Collector Mass Spectrometry Data for Noble Gases Analysis

United Kingdom. Total MAP Caseload. Average time needed to close MAP cases (in months)

Chapter 9.D Services Trade Data

Transcription:

How to display data badly Karl W Broman Biostatistics & Medical Informatics University of Wisconsin Madison kbroman.org github.com/kbroman @kwbroman

Using Microsoft Excel to obscure your data and annoy your readers Karl W Broman Biostatistics & Medical Informatics University of Wisconsin Madison kbroman.org github.com/kbroman @kwbroman

Inspiration This lecture was inspired by H Wainer (1984) How to display data badly. 38(2): 137 147 American Statistician Dr. Wainer was the first to elucidate the principles of the bad display of data. The now widespread use of Microsoft Excel has resulted in remarkable advances in the field. 3

General principles The aim of good data graphics: Display data accurately and clearly. Some rules for displaying data badly: Display as little information as possible. Obscure what you do show (with chart junk). Use pseudo-3d and color gratuitously. Make a pie chart (preferably in color and 3d). Use a poorly chosen scale. Ignore sig figs. 4

Example 1 5

Example 1 6

Example 1 7

Example 1 8

Example 1 9

Example 1 10

Example 1 11

Example 1 12

Example 2 13

Example 2 14

Example 2 15

Example 2 16

Example 2 17

Example 3 18

Example 3 19

Example 3 20

Example 3 21

Example 4 22

Example 4 23

Example 4 24

Example 5 25

Example 5 26

Example 5 27

Example 5 28

Example 5 29

Example 5 30

Example 6 31

Example 6 32

Example 7 33

Example 7 34

Displaying data well Be accurate and clear. Let the data speak. Show as much information as possible, taking care not to obscure the message. Science not sales. Avoid unnecessary frills (esp. gratuitous 3d). In tables, every digit should be meaningful. Don t drop ending 0 s. 35

More on data visualization Karl W Broman Biostatistics & Medical Informatics University of Wisconsin Madison kbroman.org github.com/kbroman @kwbroman

Ease comparisons (things to be compared should be adjacent) 18 18 16 16 Phenotype 14 Phenotype 14 12 12 10 10 Female Male Female Male Female Male AA AB BB AA AB BB AA AB BB Female Male 2

Ease comparisons (add a bit of color) 18 18 16 16 Phenotype 14 Phenotype 14 12 12 10 10 Female Male Female Male Female Male AA AB BB AA AB BB AA AB BB Female Male 3

Which comparison is easiest? 150 150 400 300 100 100 200 50 50 100 A B 0 0 0 A B A B 400 400 300 B 300 B 200 200 100 100 B A A A 0 0 4

Don t distort the quantities (value radius) Wheat (17 Gbp) Human (3.2 Gbp) Arabidopsis (0.145 Gbp) 5

Don t distort the quantities (value area) Wheat (17 Gbp) Human (3.2 Gbp) Arabidopsis (0.145 Gbp) 6

Don t use areas at all (value length) 15 Genome size (Gbp) 10 5 0 Arabidopsis Human Wheat 7

Encoding data Quantities Position Length Angle Area Luminance (light/dark) Chroma (amount of color) Categories Shape Hue (which color) Texture Width 8

Ease comparisons (align things vertically) Women Men 55 60 65 70 75 Height (in) 55 60 65 70 75 Height (in) Men 55 60 65 70 75 Height (in) 9

Ease comparisons (use common axes) Women Women 55 60 65 70 75 Height (in) 55 60 65 70 75 Height (in) Men Men 60 65 70 75 Height (in) 55 60 65 70 75 Height (in) 10

Use labels not legends Petal width (cm) 2.5 2.0 1.5 1.0 setosa versicolor virginica Petal width (cm) 2.5 2.0 1.5 1.0 versicolor virginica 0.5 0.0 0.5 0.0 setosa 1 2 3 4 5 6 7 Petal length (cm) 1 2 3 4 5 6 7 Petal length (cm) 11

Don t sort alphabetically Argentina Australia Austria Belgium Brazil Canada China France Germany India Indonesia Italy Japan Korea, Rep. Mexico Netherlands Norway Poland Russian Federation Spain Sweden Switzerland Turkey United Kingdom United States United States Netherlands France Canada Germany Switzerland Austria Belgium Italy Spain Sweden United Kingdom Japan Norway Australia Brazil Argentina Korea, Rep. Poland Turkey Russian Federation Mexico China India Indonesia 0 5 10 15 0 5 10 15 Health care spending (% GDP) Health care spending (% GDP) 12

Must you include 0? 120 100 100 96.5% 98.1% 99.2% 99 Detection rate (%) 80 60 40 Detection rate (%) 98 97 20 96 0 A B C 95 A B C Method Method 13

Summary Put the things to be compared next to each other Use color to set things apart, but consider color blind folks Use position rather than angle or area to represent quantities Align things vertically to ease comparisons Use common axis limits to ease comparisons Use labels rather than legends Sort on meaningful variables (not alphabetically) Must 0 be included in the axis limits? Consider taking logs and/or differences 14

Further reading ER Tufte (1983) The visual display of quantitative information. Graphics Press. ER Tufte (1990) Envisioning information. Graphics Press. ER Tufte (1997) Visual explanations. Graphics Press. WS Cleveland (1993) Visualizing data. Hobart Press. WS Cleveland (1994) The elements of graphing data. CRC Press. A Gelman, C Pasarica, R Dodhia (2002) Let s practice what we preach: Turning tables into graphs. The American Statistician 56:121-130 NB Robbins (2004) Creating more effective graphs. Wiley Nature Methods columns: http://bang.clearscience.info/?p=546 15