Data Visualization: Discover, Explore and be Skeptical
Seminar 1: Exploratory Data Analysis
Di Cook, Statistics, Iowa State University (soon to be Business Analytics, Monash University)
Les Diablerets, Feb 1-4, 2015

DATA
Results from the 2012 OECD PISA tests: "the world's global metric for quality, equity and efficiency in school education."
- 485,490 students: math, science and reading test scores
- 65 countries, between 1-1 schools in each
- Student questionnaires about their environment (635 vars)
- Parents surveyed on work, life, income (143 vars)
- Principals provide information about their schools (291 vars)
http://www.oecd.org/pisa/pisaproducts/datavisualizationcontest.htm

Gender & Math
[Figure: ordered dot plot of the math score gap by country, sorted from Colombia to Jordan; points are colored by whether the gap is significant (female, male, none).]
1. Compute weighted means by country and by gender.
2. Show mean difference by country.
3. t-test of difference (unadjusted).
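The three Gender & Math steps can be sketched in base R roughly as follows. This is a minimal illustration on simulated data, not the code behind the slides: the data frame `students` and its columns `country`, `gender`, `math` and `weight` are hypothetical stand-ins for the PISA student table.

```r
# Simulated stand-in for the PISA student table (all names are hypothetical).
set.seed(1)
students <- data.frame(
  country = rep(c("A", "B"), each = 200),
  gender  = rep(c("female", "male"), times = 200),
  math    = rnorm(400, mean = 500, sd = 90),
  weight  = runif(400, min = 0.5, max = 2)
)

# Step 1: weighted mean math score for each country/gender cell.
cells  <- split(students, list(students$country, students$gender))
wmeans <- sapply(cells, function(d) weighted.mean(d$math, d$weight))

# Step 2: mean difference (male minus female) by country.
countries <- sort(unique(students$country))
gap <- wmeans[paste0(countries, ".male")] - wmeans[paste0(countries, ".female")]
names(gap) <- countries

# Step 3: unadjusted two-sample t-test per country (weights are ignored here).
pvals <- sapply(countries, function(cn) {
  t.test(math ~ gender, data = students[students$country == cn, ])$p.value
})

# Sort by gap, as the ordered dot plot does.
result <- data.frame(country = countries, gap = gap, p_value = pvals)
result[order(result$gap), ]
```

Sorting the rows by `gap` before plotting is what makes the dot plot readable; the t-test here is "unadjusted" in the sense that it ignores both the survey weights and multiple comparisons across countries.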

Gender & Math
[Figure: math score gap by country. Labeled countries: Colombia, Chile, Costa Rica, Luxembourg, Liechtenstein, Austria, Switzerland, USA.]

Gender & Math
[Figure: math score gap by country. Labeled countries: China, Sweden.]

Gender & Math
Surprise!
[Figure: math score gap by country. Labeled countries: United Arab Emirates, Malaysia, Thailand, Qatar, Jordan.]

Gender & Reading
Surprised again
[Figure: reading score gap by country.]

In reading scores, the world is PINK! In EVERY COUNTRY tested, GIRLS score significantly BETTER THAN BOYS. Why don't we talk about the gender gap in reading?

"The greatest value of a picture is when it forces us to notice what we never expected to see."

"Show the data."

Examples
- OECD PISA 2012
- CO2 Emissions and Climate Change
- Sports: tennis

Data analysis
- Problem statement
- Data preparation
- Exploratory data analysis
- Quantitative analysis
- Presentation

Exploratory data analysis
Relax the focus on the problem statement and explore broadly different aspects of the data.
- Let the data inform = show the data
- Facilitated by ~interactive~ graphics
- Time to play in the sand!
Good pictures of the data provide unexpected insights, or can confirm expectations.

EDA vs IDA
EDA and IDA (Chatfield; Crowder & Hand), although not entirely distinct, differ in emphasis. Fundamental to EDA is the desire to let the data inform us, to approach the data without pre-conceived hypotheses, so that we may discover unexpected features. Of course, some of the unexpected features may be errors in the data. IDA emphasizes finding these errors by checking the quality of data prior to formal modeling. IDA is much more closely tied to inference than EDA: problems with the data that violate the assumptions required for valid inference need to be discovered and fixed early.

Data snooping
Because EDA is very graphical, it sometimes gives rise to a suspicion that patterns in the data are being detected and reported that are not really there. This is called data snooping. Validation of findings, by all means possible, is an important component of data analysis.

Data snooping
"In our experience, false discovery is the lesser danger when compared to nondiscovery. Nondiscovery is the failure to identify meaningful structure, and it may result in false or incomplete modeling. In a healthy scientific enterprise, the fear of nondiscovery should be at least as great as the fear of false discovery." (Buja's remark in the discussion of Koschat & Swayne (1996))

PISA 2012 Process
- Files distributed as an external representation of an R object (.rda file); dictionary files = variable coding.
- Read in all files to R, browse size of files, dictionary of items, make some tables/pictures.
- Make a list of questions to explore.
- Read map data from the rworldmap package, coordinate names for all countries in OECD data, and merge polygons with subset of variables.

Example
> head(dict_student2012, 20)
   variable  description
1  CNT       Country code 3-character
2  SUBNATIO  Adjudicated sub-region code 7-digit code (3-digit country code + region ID + stratum ID)
3  STRATUM   Stratum ID 7-character (cnt + region ID + original stratum ID)
4  OECD      OECD country
5  NC        National Centre 6-digit Code
6  SCHOOLID  School ID 7-digit (region ID + stratum ID + 3-digit school ID)
7  STIDSTD   Student ID
8  ST01Q01   International Grade
9  ST02Q01   National Study Programme
10 ST03Q01   Birth - Month
11 ST03Q02   Birth - Year
12 ST04Q01   Gender
13 ST05Q01   Attend <ISCED 0>
14 ST06Q01   Age at <ISCED 1>
15 ST07Q01   Repeat - <ISCED 1>
16 ST07Q02   Repeat - <ISCED 2>
17 ST07Q03   Repeat - <ISCED 3>
18 ST08Q01   Truancy - Late for School
19 ST09Q01   Truancy - Skip whole school day
20 ST115Q01  Truancy - Skip classes within school day

Example
> tail(dict_student2012, 5)
    variable    description
631 W_FSTR80    FINAL STUDENT REPLICATE BRR-FAY WEIGHT80
632 WVARSTRR    RANDOMIZED FINAL VARIANCE STRATUM (01-80)
633 VAR_UNIT    RANDOMLY ASSIGNED VARIANCE UNIT
634 SENWGT_STU  Senate weight - sum of weight within the country is 1000
635 VER_STU     Date of the database creation
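The first process step, reading an `.rda` file and browsing what it contains, might look like the sketch below. Because the real PISA files are not bundled here, the example first writes a tiny dictionary to a temporary file; `dict_toy` and its two rows are hypothetical stand-ins, not the distributed objects.

```r
# Create a small stand-in for a distributed .rda file (hypothetical content).
dict_toy <- data.frame(
  variable    = c("CNT", "ST04Q01"),
  description = c("Country code 3-character", "Gender")
)
path <- tempfile(fileext = ".rda")
save(dict_toy, file = path)

# load() restores objects under their saved names; loading into a separate
# environment avoids overwriting anything in the workspace.
env    <- new.env()
loaded <- load(path, envir = env)   # character vector of restored object names
obj    <- get(loaded[1], envir = env)

# Browse size and content, as in the process described above.
dim(obj)
head(obj)
```

Loading into a fresh environment is a design choice for safety: an `.rda` file decides its own object names, so loading straight into the global workspace can silently replace existing objects.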

Example
> sort(table(student2012$cnt))
Liechtenstein         Connecticut (USA)  Massachusetts (USA)  Perm(Russian Federation)
293                   1697               1723                 1761
Florida (USA)         Iceland            New Zealand          Latvia
1896                  358                4291                 436
Tunisia               Netherlands        Costa Rica           Poland
447                   446                462                  467
France                Lithuania          Hong Kong-China      Slovak Republic
4613                  4618               467                  4678
Russian Federation    Luxembourg         Bulgaria             Uruguay
5231                  5258               5282                 5315
Czech Republic        Macao-China        Singapore            Indonesia
5327                  5335               5546                 5622
Portugal              Kazakhstan         Argentina            Slovenia
5722                  588                598                  5911
Peru                  Chinese Taipei     Japan                Thailand
635                   646                6351                 666
Chile                 Jordan             Denmark              Belgium
6856                  738                7481                 8597
Finland               Colombia           Qatar                Switzerland
8829                  973                1966                 11229
United Arab Emirates  United Kingdom     Australia            Brazil
11                    12659              14481                1924
Canada                Spain              Italy                Mexico
21544                 25313              3173                 3386

Compare plausible values
[Figure: comparison of plausible values.]

Compare plausible values for Switzerland
[Figure: comparison of plausible values for Switzerland.]

Do sample weights matter?
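Both checks, comparing the plausible values with each other and asking whether the sample weights change the answer, can be sketched as below. The data are simulated, and the column names `PV1MATH` to `PV5MATH` and `W_FSTUWT` are assumed here to mirror the PISA naming pattern; the slides do not show which columns were actually used.

```r
# Simulated scores: five noisy "plausible values" around one latent ability.
set.seed(2)
n       <- 300
ability <- rnorm(n, mean = 500, sd = 90)
pv_cols <- paste0("PV", 1:5, "MATH")          # assumed column names
pisa    <- as.data.frame(sapply(pv_cols, function(nm) ability + rnorm(n, sd = 20)))
pisa$W_FSTUWT <- runif(n, min = 1, max = 50)  # assumed student weight column

# Compare plausible values: if they tell the same story, pairwise
# correlations are high and the means are close.
pv_cor <- cor(pisa[pv_cols])

# Do sample weights matter? Put unweighted and weighted means side by side.
unweighted <- sapply(pisa[pv_cols], mean)
weighted   <- sapply(pisa[pv_cols], weighted.mean, w = pisa$W_FSTUWT)
comparison <- rbind(unweighted, weighted)

round(pv_cor, 2)
round(comparison, 1)
```

In this simulation the weights are unrelated to the scores, so the two rows of `comparison` should be close; with real survey data, a visible difference between them is the signal that the weights need to be used.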

Questions
- Is the gender gap evident in the math scores?
- Do students work hard outside school hours? And does this pay off in better scores?
- Does truancy affect performance?
- Do more household possessions mean better performance?
- Do parents matter?

Expectations
- Gender gap: boys are better at math.
- USA doesn't perform well compared to other countries.
- I would expect parents, time studying and socioeconomic status to be important influences on performance.

Look at the individuals, not just summary statistics

Individuals
Test scores range from 0-1000. Gaps for math were at most 30 points. For reading, at most 80 points. Differences are tiny.

Individuals
Even countries with a math gender gap sometimes have a girl who scored the top mark.
[Figure: individual math scores by gender; labeled countries: Czech Republic, USA.]

Individuals
And although there is a gender gap in reading in all countries, individually some top reading scores are obtained by boys.
[Figure: individual reading scores by gender; labeled countries: France, USA, Jordan, Colombia, Uruguay.]

Hours spent studying out of school time and average math score by country

Reported truancy and average math score by country

Parents in the home and average math score by country

Possessions and average math score by country

Number of TVs and average math score by country
[Figure: small multiples, one panel per country; x-axis: TVs; y-axis: average math score.]

Reported internet use by Swiss teens
[Figure: ordered dot plot of average usage by activity, from most to least used.]
- Browse the Internet for fun
- Social networks
- Chat on line
- Download music
- Use email
- Obtain practical information from the Internet
- Read news
- Internet for school
- Homework
- Email students
- Collaborative games
- Share school material
- One player games
- Upload content
- Download from School
- Announcements
- Email teachers
Average usage scale: Never or hardly ever = 0, Once or twice a month = 1, Once or twice a week = 2, Almost every day = 3, Every day = 4.

Your Turn
Take two minutes to brainstorm with your neighbors. Based on the data, what other things would you like to investigate? What calculations, tables, plots would you make to tackle these questions?

What we learned
- Gender gap is not universal in math.
- Gender gap is universal in reading.
- Individual differences trump everything else.
- Parents matter! Mothers more than fathers!
- Socioeconomic status matters.
- Turn the TV off, if you live in a developed country!
- Something's funny about Albania's data.

Graphics choices
- Sorting!
- Ordered dot plots
- Line plots
- Histograms
- Showing stats vs all values
- Uncertainty
- Barcharts of category counts

Climate change
CO2 alarm bells?
- "Planet's CO2 level reaches ppm for first time in human existence."
- "It's not a question of if air capture technology will be adopted; it's a question of when," said Klaus Lackner.
- "In the last 80 years, as CO2 emissions have most rapidly escalated, the annual rate of climate-related deaths worldwide fell by an incredible 98%. This means the incidence of death from climate is 50 times lower than it was 80 years ago," writes Epstein.

Climate change
Data available at http://scrippsco2.ucsd.edu
> head(co2.all)
  date        time   day       decdate   n  flg  co2     lat   lon     stn
1 1997-01-21  13:40  35451.57  1997.056  4  0    365.82  23.3  -110.2  bcs
2 1997-02-08  15:03  35469.63  1997.106  2  0    365.57  23.3  -110.2  bcs
3 1997-02-22  15:07  35483.63  1997.144  2  0    366.28  23.3  -110.2  bcs
4 1997-03-07  14:23  35496.60  1997.180  3  0    367.85  23.3  -110.2  bcs
5 1997-03-22  14:16  35511.59  1997.221  2  0    365.28  23.3  -110.2  bcs
6 1997-04-05  13:37  35525.57  1997.259  3  0    368.96  23.3  -110.2  bcs

Climate change
I expected:
- CO2 concentrations to be flattening over time
- Raw data to look like raw data: messy

CO2 concentrations at several measuring stations
Order is from north (top) to south (bottom).
[Figure: CO2 (ppm) by year, 1960-2010, one panel per station (stn), ordered N to S.]

Range of CO2 seasonal cycle
[Figure: range (ppm) by year, 1960-2000, by station (stn).]

Minimum and maximum for each year and station show that they are effectively measuring the same product.
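The "range of the seasonal cycle" summary amounts to taking the minimum and maximum per year and station. A minimal sketch on simulated data follows; the data frame `flask`, the station labels, and the amplitudes are invented for illustration and are not the Scripps measurements.

```r
# Simulated monthly CO2 readings for two hypothetical stations:
# a northern one with a strong seasonal cycle and a southern one with a weak one.
set.seed(3)
flask <- expand.grid(month = 1:12, year = 1990:1994, stn = c("north", "south"))
amp   <- ifelse(flask$stn == "north", 8, 1)
flask$co2 <- 350 + 1.5 * (flask$year - 1990) +
  amp * sin(2 * pi * flask$month / 12) + rnorm(nrow(flask), sd = 0.2)

# Range (max minus min) of CO2 within each year and station.
rng <- aggregate(co2 ~ year + stn, data = flask,
                 FUN = function(x) diff(range(x)))
names(rng)[3] <- "range_ppm"
rng
```

Plotting `range_ppm` against `year`, one line per station, gives the kind of picture described above: larger seasonal swings at northern stations, small ones in the south.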

Residuals from a linear model fit indicate the rate of increase is exponential.
[Figure: co2 and residuals (resid) by date, 1960-2010, colored by year.]

Strange year is 1978, and measurements are lacking for summer.

For the most northerly station, examine the monthly trend. Drop in CO2 occurs every year in June.
[Figure: residuals by month.]

Change the aspect ratio and anomalies at different sites become visible.
[Figure: residuals by date, 1960-2010.]

Climate change
Surprises:
- CO2 concentrations increasing consistently.
- Northern stations show seasonality.
- Data looks very regular.
- Every recording station is essentially producing the same values, ignoring seasonality, with a few slight anomalies.
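The residual check behind "the rate of increase is exponential" can be illustrated on a synthetic series. The numbers below are made up (a smooth, noise-free exponential rise); the point is only the shape of the residuals when a straight line is fitted to a curve that bends upward.

```r
# Synthetic, noise-free series that grows faster than linearly
# (baseline and growth rate are invented for illustration).
year    <- 1960:2010
co2_ppm <- 280 + 35 * exp(0.02 * (year - 1960))

# Fit a straight line and inspect the residuals.
fit <- lm(co2_ppm ~ year)
res <- residuals(fit)

# A U-shaped pattern (positive at both ends, negative in the middle)
# says the straight line is missing upward curvature.
plot(year, res, type = "b", xlab = "Year", ylab = "Residual (ppm)")
abline(h = 0, lty = 2)
```

A U shape in this plot shows that growth is faster than linear; it does not by itself distinguish an exponential from, say, a quadratic trend, so the slide's reading is best treated as a working description.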

Climate change
What's happening near Les Diablerets?
- Global Historical Climatology Network data
- Station: Col du Grand St Bernard
- Minimum/maximum temperature and precipitation
ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/daily/readme.txt

Maximum daily temperature
[Figure: temperature (°C) by month, 1-12.]
- Seasonal trend visible from plotting temperature by month.
- Temperature does NOT reach freezing in SUMMER occasionally.

Maximum daily temperature
[Figure: temperature (°C) over time.]
Plot of temperature by year reveals a big gap in measurements! May need to focus on one period or the other.

Maximum daily temperature
[Figure: temperature (°C) over time.]
Increasing?

Maximum daily temperature
[Figure: temperature (°C) over time, wider aspect ratio.]
Change aspect ratio for studying season effects. Similar pattern with minimum temperature?

Minimum daily temperature
[Figure: minimum daily temperature (°C) over time.]
Double dip winters.

Precipitation (sqrt scale)
[Figure: precipitation by month, square-root scale.]
Small seasonal trend visible from plotting precip by month: increase Feb-Jun.
Why are there so many missing values in the last decade?

Precipitation (sqrt scale)
Small gap in measurement collections. Change to only look at latter measurements. And compute monthly totals (adjusted for missing values).

Precipitation (sqrt scale)
Are extremes becoming more common?

Your Turn
Take two minutes to brainstorm with your neighbors. Based on what we have seen with min/max temperature and precip, what might you do to find additional information that supports or refutes? What data, calculations, plots would you make?

What happened in 2012?
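One way to compute monthly precipitation totals "adjusted for missing values" is sketched below. The slides do not say which adjustment was used; the version here, scaling each month's observed total by days in the month over days observed, is an assumption, and the daily data are simulated.

```r
# Simulated daily precipitation for one year with some missing days.
set.seed(5)
dates <- seq(as.Date("2010-01-01"), as.Date("2010-12-31"), by = "day")
daily <- data.frame(date = dates, prcp = rexp(length(dates), rate = 1 / 3))
daily$prcp[sample(nrow(daily), 40)] <- NA
daily$month <- as.integer(format(daily$date, "%m"))

# Monthly totals; the adjusted total rescales the observed sum to a full
# month (assumed adjustment, not taken from the slides).
monthly <- do.call(rbind, lapply(split(daily, daily$month), function(d) {
  n_days    <- nrow(d)
  n_obs     <- sum(!is.na(d$prcp))
  raw_total <- sum(d$prcp, na.rm = TRUE)
  data.frame(
    month     = d$month[1],
    n_days    = n_days,
    n_obs     = n_obs,
    raw_total = raw_total,
    adj_total = if (n_obs == 0) NA else raw_total * n_days / n_obs
  )
}))

# Square-root scale, as used on the precipitation plots.
monthly$sqrt_adj <- sqrt(monthly$adj_total)
monthly
```

The rescaling assumes missing days are like observed days within the same month; if gaps cluster in wet or dry spells, the adjusted totals will be biased, which is worth checking before reading trends into them.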

Summary
- Pulling data to check things for yourself is so much easier today. One of the most useful skills you can attain!
- Graphics for exploratory data analysis are ephemeral. Once created, once things are learned, they evaporate, and need to be replaced with something overly produced and beautiful for general consumption.

Acknowledgements
- Code is available at: https://github.com/dicook/lesdiablerets-code
- Software used: R (http://www.r-project.org) and primarily the package ggplot2

Do you like tennis?

Coming next
- Data doesn't come with only two variables. How do you see in high dimensions?
- How can you quantify if what we see is real?