Modeling Like It Matters. Ty Ferre Hydrology and Atmospheric Sciences University of Arizona Tucson

Size: px
Start display at page:

Download "Modeling Like It Matters. Ty Ferre Hydrology and Atmospheric Sciences University of Arizona Tucson"

Transcription

1 Modeling Like It Matters Ty Ferre Hydrology and Atmospheric Sciences University of Arizona Tucson 1

2 Science Psychology The Game 2

3 The Scientific Method Observe Predict Theorize Hypothesize 3

4 Changing Your Mind Observe Predict Theorize Hypothesize Opportunity for lateral thinking 4

5 Predict Scientific Progress Observe Observe Theorize Observe Predict Theorize Hypothesize Observe Predict Theorize Hypothesize Predict Hypothesize Hypothesize Theorize 5

6 The Path of Science Observe Predict Observe Theorize Observe Predict Hypothesize Theorize Predict Observe Theorize Hypothesize Predict Hypothesize Theorize Hypothesize 6

7 Predict The Paths of Science Observe Observe Predict TheorizeObserve Observe Predict Hypothesize Predict Theorize Observe Theorize Hypothesize Observe Hypothesize Predict Hypothesize Theorize Predict Observe Theorize Hypothesize Predict Hypothesize Theorize Theorize Hypothesize 7

8 Predict Changing Your Mind Observe Observe Predict TheorizeObserve Observe Predict Hypothesize Predict Theorize Observe Theorize Hypothesize Observe Hypothesize Predict Hypothesize Theorize Predict Observe Theorize Hypothesize Predict Hypothesize Theorize Opportunity for lateral thinking Hypothesize Theorize 8

9 The Paths of Science Time 9 Model Space???

10 Splitting, Dying, Merging, Emerging Model Space Time 10

11 Visualizing Current Model Space 11 Model Space Model Space Time

12 Single Model Space (Discrete and Continuous Differences) Concepts Processes Structures Parameterizations Solution Schemes Parameter Values Boundary Conditions Space/Time Resolutions Model Space Model Space 12

13 Multi-Model Space Concepts Processes Structures Parameterizations Solution Schemes Parameter Values Boundary Conditions Space/Time Resolutions Model Space Model Space 13

14 Assessing Model Quality (Focus on A Few Models) Observe Predict Theorize Hypothesize Model Space 14

15 Assessing Model Quality Observe Predict Theorize Hypothesize Predicted Value One of all possible things that you could measure and that the models can predict. Observable Quantity 15

16 Assessing Model Quality Observe Predict Theorize Hypothesize Predicted Value Measured value Observable Quantity 16

17 Model Likelihood L = f(1/mismatch) ΣL = 1 Predicted Value Likelihood Sorted Models Observable Quantity 17

18 Model Likelihood L = f(1/mismatch) ΣL = 1 zero sum game Predicted Value Likelihood Sorted Models Observable Quantity 18

19 Model Likelihood L = f(1/mismatch) ΣL = 1 Predicted Value Likelihood Sorted Models Observable Quantity 19

20 Data Quality Predicted Value One of all possible things that you could measure and that the models can predict. Observable Quantity 20

21 Data Quality Predicted Value One of all possible things that you could measure and that the models can predict. Observable Quantity 21

22 Data Quality Better/Worse Data Likelihood Sorted Models Predicted Value We can assess their likely value BEFORE we collect the measurement. Observable Quantity 22

23 Discriminatory Data Better/Worse Data? Predicted Value Observable Quantity 23

24 Discriminatory Data Better/Worse Data? Depends on your objective Predicted Value Observable Quantity 24

25 Discriminatory Index DI = f(intergroup difference/ intragroup spreads) Better/Worse Data Even Worse Data Predicted Value Observable Quantity 25

26 Discriminatory Index DI = f(intergroup difference / intragroup variance) Better/Worse Data Predicted Value Observable Quantity 26

27 Model Importance Predicted Value Forecastable Quantity 27

28 Predictions of Interest Predicted Value Stakeholder 1 Stakeholder 2 Stakeholder 3 Forecastable Quantity 28

29 Outcomes of Concern Predicted Value Stakeholder 1 Stakeholder 2 Stakeholder 3 Forecastable Quantity 29

30 Models vs Outcomes of Concern Predicted Value Stakeholder 1 Forecastable Quantity 30

31 Models of Concern / Other Models Predicted Value Stakeholder 1 Forecastable Quantity 31

32 Discriminatory Data for a Stakeholder s Models of Concern Better/Worse Data Even Worse Data Predicted Value Predicted Value Observable Quantity Stakeholder 1 32 Forecastable Quantity 32

33 Models of Concern / Other Models Predicted Value Stakeholder 2 Forecastable Quantity 33

34 Decision Confidence Index Likelihood Models of Concern Other Models Sorted Models 34

35 Decision Confidence Index LOW CONFIDENCE Likelihood Models of Concern Other Models Sorted Models 35

36 Decision Confidence Index HIGH CONFIDENCE but unhappy Likelihood Models of Concern Other Models Sorted Models 36

37 Decision Confidence Index HIGH CONFIDENCE and happy Likelihood Models of Concern Other Models Sorted Models 37

38 Summary of Talk One Multiple plausible models can be built for any system; 38

39 Summary of Talk One Multiple plausible models can be built for any system; Model likelihoods can be defined based on fit to data; 39

40 Summary of Talk One Multiple plausible models can be built for any system; Model likelihoods can be defined based on fit to data; The expected value of data for any prediction(s) of interest can be assessed before the data are collected; 40

41 Summary of Talk One Multiple plausible models can be built for any system; Model likelihoods can be defined based on fit to data; The expected value of data for any prediction(s) of interest can be assessed before the data are collected; Stakeholders must define the prediction(s) of interest, which will identify the models of concern; 41

42 Summary of Talk One Multiple plausible models can be built for any system; Model likelihoods can be defined based on fit to data; The expected value of data for any prediction(s) of interest can be assessed before the data are collected; Stakeholders must define the prediction(s) of interest, which will identify the models of concern; Decision support relies on segregating eithermodels of concern or other models as high likelihood. 42

43 Talk Two: Time for Some Psychology Predicted Value Stakeholder 2 Forecastable Quantity 43

44 Stakeholders Must Drive the Process Each stakeholder must define their outcomes of concern. Predicted Value Stakeholder 2 Forecastable Quantity 44

45 Stakeholders Must Drive the Process (Stakeholders are Human) Predicted Value It is not a scientist s job to tell a stakeholder what to care about. Forecastable Quantity 45

46 Outcomes of Concern Are Biased (Stakeholders are Human) esp. loss aversion Predicted Value Forecastable Quantity 46

47 Modeling Is Biased (Modelers are Human) esp. confirmation bias Model Space Modeler 1 Modeler 2 Modeler 3 Time 47

48 Embrace the Bias It is not a scientist s job to inject false objectivity but, rather, to test hypotheses especially those that matter to stakeholders. Model Space Modeler 1 Time 48

49 Develop Plausible Models of Concern Predicted Value Stakeholders, with their experts (modelers), are best able to identify their outcomes of concern and find discriminatory data. Model Space Time Forecastable Quantity 49

50 An Ensemble of Diverse Biased Models Is Not Biased Stakeholder 1 Stakeholder 2 Stakeholder 3 Model Space Time Each stakeholder s modelers should be encouragedto form plausible models of concern to address their interests. 50

51 An Ensemble of Diverse Biased Models Is Not Biased Stakeholder 1 Stakeholder 2 Stakeholder 3 Models of Concern Other Models Model Space Likelihood Time Sorted Models Each stakeholder s modelers should be encouragedto form plausible models of concern to address their interests. Decisions should be made using a combined model ensemble. 51

52 View the Model Ensemble As a Team of Rivals Likelihood Likelihood Sorted Models Models of Concern Other Models Sorted Models Where advisors (models) agree, a clear decision can be made. 52

53 View the Model Ensemble As a Team of Rivals Likelihood Models of Concern Other Models Sorted Models Where advisors (models) agree, a clear decision can be made. Where models disagree, more information (data) is needed. 53

54 Summary of Talk Two Decision-making is inherently biased; 54

55 Summary of Talk Two Decision-making is inherently biased; Model construction is inherently biased; We should embrace bias to form a diverse ensemble of biased models to consider all stakeholders concerns; 55

56 Summary of Talk Two Decision-making is inherently biased; Model construction is inherently biased; We should embrace bias to form a diverse ensemble of biased models to consider all stakeholders concerns; Discriminatory data should be chosen collectively. 56

57 The Game 57 Higher Head Lower Head Higher Head Lower Head E Lower K Higher K S Lower K Higher K Higher Head Lower Head 40% B Lower K Higher K 60% 70%

58 The Game 58 Higher Head Lower Head Higher Head Lower Head E Lower K Higher K S Lower K Higher K Higher Head Lower Head 40% K eff = 0.04 B Lower K Higher K 60% K eff = % K eff = 0.46

59 59 Normalized K eff

60 60 Normalized K eff

61 Most time spent 61 Normalized Keff

62 90% Percolating 10% 62

63 90% Harmonicish 10% 63

64 90% Harmonicish 10% 64

65 Inferred K eff 0,5 0,45 0,4 0,35 0,3 0,25 0,2 0,15 0,1 0,05 0 R² = 0, ,1 0,2 0,3 0,4 0,5 True K eff

66

67 1/

68 1/ /

69 1/ / /

70 Summary of Talk Three Use models to train your intuition; 70

71 Summary of Talk Three Use models to train your intuition; Students rock. 71

72 Summary of the Three Talks A scientist s job is to propose different plausible models; Then to find discriminatory data; This can be guided by stakeholder s interests; This requires trained intuition. 72

Henry Darcy Distinguished Lecture Series in Groundwater Science. presented by the National Ground Water Research and Educational Foundation

Henry Darcy Distinguished Lecture Series in Groundwater Science. presented by the National Ground Water Research and Educational Foundation Henry Darcy Distinguished Lecture Series in Groundwater Science presented by the National Ground Water Research and Educational Foundation 1 To foster interest and excellence in groundwater science and

More information

Hypothesis testing I. - In particular, we are talking about statistical hypotheses. [get everyone s finger length!] n =

Hypothesis testing I. - In particular, we are talking about statistical hypotheses. [get everyone s finger length!] n = Hypothesis testing I I. What is hypothesis testing? [Note we re temporarily bouncing around in the book a lot! Things will settle down again in a week or so] - Exactly what it says. We develop a hypothesis,

More information

Estimating the accuracy of a hypothesis Setting. Assume a binary classification setting

Estimating the accuracy of a hypothesis Setting. Assume a binary classification setting Estimating the accuracy of a hypothesis Setting Assume a binary classification setting Assume input/output pairs (x, y) are sampled from an unknown probability distribution D = p(x, y) Train a binary classifier

More information

The Purpose of Hypothesis Testing

The Purpose of Hypothesis Testing Section 8 1A:! An Introduction to Hypothesis Testing The Purpose of Hypothesis Testing See s Candy states that a box of it s candy weighs 16 oz. They do not mean that every single box weights exactly 16

More information

Bias Variance Trade-off

Bias Variance Trade-off Bias Variance Trade-off The mean squared error of an estimator MSE(ˆθ) = E([ˆθ θ] 2 ) Can be re-expressed MSE(ˆθ) = Var(ˆθ) + (B(ˆθ) 2 ) MSE = VAR + BIAS 2 Proof MSE(ˆθ) = E((ˆθ θ) 2 ) = E(([ˆθ E(ˆθ)]

More information

Lecture 22/Chapter 19 Part 4. Statistical Inference Ch. 19 Diversity of Sample Proportions

Lecture 22/Chapter 19 Part 4. Statistical Inference Ch. 19 Diversity of Sample Proportions Lecture 22/Chapter 19 Part 4. Statistical Inference Ch. 19 Diversity of Sample Proportions Probability versus Inference Behavior of Sample Proportions: Example Behavior of Sample Proportions: Conditions

More information

Precision Correcting for Random Error

Precision Correcting for Random Error Precision Correcting for Random Error The following material should be read thoroughly before your 1 st Lab. The Statistical Handling of Data Our experimental inquiries into the workings of physical reality

More information

STATISTICS 4, S4 (4769) A2

STATISTICS 4, S4 (4769) A2 (4769) A2 Objectives To provide students with the opportunity to explore ideas in more advanced statistics to a greater depth. Assessment Examination (72 marks) 1 hour 30 minutes There are four options

More information

EC2001 Econometrics 1 Dr. Jose Olmo Room D309

EC2001 Econometrics 1 Dr. Jose Olmo Room D309 EC2001 Econometrics 1 Dr. Jose Olmo Room D309 J.Olmo@City.ac.uk 1 Revision of Statistical Inference 1.1 Sample, observations, population A sample is a number of observations drawn from a population. Population:

More information

Practice Problems Section Problems

Practice Problems Section Problems Practice Problems Section 4-4-3 4-4 4-5 4-6 4-7 4-8 4-10 Supplemental Problems 4-1 to 4-9 4-13, 14, 15, 17, 19, 0 4-3, 34, 36, 38 4-47, 49, 5, 54, 55 4-59, 60, 63 4-66, 68, 69, 70, 74 4-79, 81, 84 4-85,

More information

Weather Analysis and Forecasting

Weather Analysis and Forecasting Weather Analysis and Forecasting An Information Statement of the American Meteorological Society (Adopted by AMS Council on 25 March 2015) Bull. Amer. Meteor. Soc., 88 This Information Statement describes

More information

Statistics Primer. ORC Staff: Jayme Palka Peter Boedeker Marcus Fagan Trey Dejong

Statistics Primer. ORC Staff: Jayme Palka Peter Boedeker Marcus Fagan Trey Dejong Statistics Primer ORC Staff: Jayme Palka Peter Boedeker Marcus Fagan Trey Dejong 1 Quick Overview of Statistics 2 Descriptive vs. Inferential Statistics Descriptive Statistics: summarize and describe data

More information

Chapter 24. Comparing Means

Chapter 24. Comparing Means Chapter 4 Comparing Means!1 /34 Homework p579, 5, 7, 8, 10, 11, 17, 31, 3! /34 !3 /34 Objective Students test null and alternate hypothesis about two!4 /34 Plot the Data The intuitive display for comparing

More information

Matter, Force, Energy, Motion, and the Nature of Science (NOS)

Matter, Force, Energy, Motion, and the Nature of Science (NOS) Matter, Force, Energy, Motion, and the Nature of Science (NOS) Elementary SCIEnCE Dr. Suzanne Donnelly Longwood University donnellysm@longwood.edu Welcome! Introductions What will we be doing this week?

More information

Data Mining Chapter 4: Data Analysis and Uncertainty Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University

Data Mining Chapter 4: Data Analysis and Uncertainty Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University Data Mining Chapter 4: Data Analysis and Uncertainty Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University Why uncertainty? Why should data mining care about uncertainty? We

More information

Statistical inference

Statistical inference Statistical inference Contents 1. Main definitions 2. Estimation 3. Testing L. Trapani MSc Induction - Statistical inference 1 1 Introduction: definition and preliminary theory In this chapter, we shall

More information

Communicating Forecast Uncertainty for service providers

Communicating Forecast Uncertainty for service providers Communicating Forecast Uncertainty for service providers Jon Gill Bureau of Meteorology, Australia Chair, WMO Expert Team on Communication Aspects of PWS International Symposium on Public Weather Services:

More information

Extra Exam Empirical Methods VU University Amsterdam, Faculty of Exact Sciences , July 2, 2015

Extra Exam Empirical Methods VU University Amsterdam, Faculty of Exact Sciences , July 2, 2015 Extra Exam Empirical Methods VU University Amsterdam, Faculty of Exact Sciences 12.00 14.45, July 2, 2015 Also hand in this exam and your scrap paper. Always motivate your answers. Write your answers in

More information

Addition, Subtraction, Multiplication, and Division

Addition, Subtraction, Multiplication, and Division 5. OA Write and interpret numerical expression. Use parentheses, brackets, or braces in numerical expressions, and evaluate expressions with these symbols. Write simple expressions that record calculations

More information

Data Collection: What Is Sampling?

Data Collection: What Is Sampling? Project Planner Data Collection: What Is Sampling? Title: Data Collection: What Is Sampling? Originally Published: 2017 Publishing Company: SAGE Publications, Inc. City: London, United Kingdom ISBN: 9781526408563

More information

Fundamental Probability and Statistics

Fundamental Probability and Statistics Fundamental Probability and Statistics "There are known knowns. These are things we know that we know. There are known unknowns. That is to say, there are things that we know we don't know. But there are

More information

Bivariate Data: Graphical Display The scatterplot is the basic tool for graphically displaying bivariate quantitative data.

Bivariate Data: Graphical Display The scatterplot is the basic tool for graphically displaying bivariate quantitative data. Bivariate Data: Graphical Display The scatterplot is the basic tool for graphically displaying bivariate quantitative data. Example: Some investors think that the performance of the stock market in January

More information

Flood Risk Forecasts for England and Wales: Production and Communication

Flood Risk Forecasts for England and Wales: Production and Communication Staines Surrey Flood Risk Forecasts for England and Wales: Production and Communication Jon Millard UEF 2015 : Quantifying and Communicating Uncertainty FFC What is the FFC? Successful partnership between

More information

I. Induction, Probability and Confirmation: Introduction

I. Induction, Probability and Confirmation: Introduction I. Induction, Probability and Confirmation: Introduction 1. Basic Definitions and Distinctions Singular statements vs. universal statements Observational terms vs. theoretical terms Observational statement

More information

CONTINUOUS RANDOM VARIABLES

CONTINUOUS RANDOM VARIABLES the Further Mathematics network www.fmnetwork.org.uk V 07 REVISION SHEET STATISTICS (AQA) CONTINUOUS RANDOM VARIABLES The main ideas are: Properties of Continuous Random Variables Mean, Median and Mode

More information

ECO220Y Simple Regression: Testing the Slope

ECO220Y Simple Regression: Testing the Slope ECO220Y Simple Regression: Testing the Slope Readings: Chapter 18 (Sections 18.3-18.5) Winter 2012 Lecture 19 (Winter 2012) Simple Regression Lecture 19 1 / 32 Simple Regression Model y i = β 0 + β 1 x

More information

The scatterplot is the basic tool for graphically displaying bivariate quantitative data.

The scatterplot is the basic tool for graphically displaying bivariate quantitative data. Bivariate Data: Graphical Display The scatterplot is the basic tool for graphically displaying bivariate quantitative data. Example: Some investors think that the performance of the stock market in January

More information

Learning with Ensembles: How. over-tting can be useful. Anders Krogh Copenhagen, Denmark. Abstract

Learning with Ensembles: How. over-tting can be useful. Anders Krogh Copenhagen, Denmark. Abstract Published in: Advances in Neural Information Processing Systems 8, D S Touretzky, M C Mozer, and M E Hasselmo (eds.), MIT Press, Cambridge, MA, pages 190-196, 1996. Learning with Ensembles: How over-tting

More information

Achilles: Now I know how powerful computers are going to become!

Achilles: Now I know how powerful computers are going to become! A Sigmoid Dialogue By Anders Sandberg Achilles: Now I know how powerful computers are going to become! Tortoise: How? Achilles: I did curve fitting to Moore s law. I know you are going to object that technological

More information

Observing Dark Worlds (Final Report)

Observing Dark Worlds (Final Report) Observing Dark Worlds (Final Report) Bingrui Joel Li (0009) Abstract Dark matter is hypothesized to account for a large proportion of the universe s total mass. It does not emit or absorb light, making

More information

Inferences for Regression

Inferences for Regression Inferences for Regression An Example: Body Fat and Waist Size Looking at the relationship between % body fat and waist size (in inches). Here is a scatterplot of our data set: Remembering Regression In

More information

Randomized Decision Trees

Randomized Decision Trees Randomized Decision Trees compiled by Alvin Wan from Professor Jitendra Malik s lecture Discrete Variables First, let us consider some terminology. We have primarily been dealing with real-valued data,

More information

Machine Learning

Machine Learning Machine Learning 10-601 Tom M. Mitchell Machine Learning Department Carnegie Mellon University August 30, 2017 Today: Decision trees Overfitting The Big Picture Coming soon Probabilistic learning MLE,

More information

Section 3: Simple Linear Regression

Section 3: Simple Linear Regression Section 3: Simple Linear Regression Carlos M. Carvalho The University of Texas at Austin McCombs School of Business http://faculty.mccombs.utexas.edu/carlos.carvalho/teaching/ 1 Regression: General Introduction

More information

Machine Learning

Machine Learning Machine Learning 10-701 Tom M. Mitchell Machine Learning Department Carnegie Mellon University January 13, 2011 Today: The Big Picture Overfitting Review: probability Readings: Decision trees, overfiting

More information

Statistics for IT Managers

Statistics for IT Managers Statistics for IT Managers 95-796, Fall 2012 Module 2: Hypothesis Testing and Statistical Inference (5 lectures) Reading: Statistics for Business and Economics, Ch. 5-7 Confidence intervals Given the sample

More information

An Analysis of College Algebra Exam Scores December 14, James D Jones Math Section 01

An Analysis of College Algebra Exam Scores December 14, James D Jones Math Section 01 An Analysis of College Algebra Exam s December, 000 James D Jones Math - Section 0 An Analysis of College Algebra Exam s Introduction Students often complain about a test being too difficult. Are there

More information

Decision Trees. Tirgul 5

Decision Trees. Tirgul 5 Decision Trees Tirgul 5 Using Decision Trees It could be difficult to decide which pet is right for you. We ll find a nice algorithm to help us decide what to choose without having to think about it. 2

More information

Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institution of Technology, Kharagpur

Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institution of Technology, Kharagpur Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institution of Technology, Kharagpur Lecture No. # 36 Sampling Distribution and Parameter Estimation

More information

Data Mining Prof. Pabitra Mitra Department of Computer Science & Engineering Indian Institute of Technology, Kharagpur

Data Mining Prof. Pabitra Mitra Department of Computer Science & Engineering Indian Institute of Technology, Kharagpur Data Mining Prof. Pabitra Mitra Department of Computer Science & Engineering Indian Institute of Technology, Kharagpur Lecture 21 K - Nearest Neighbor V In this lecture we discuss; how do we evaluate the

More information

Regression Analysis and Forecasting Prof. Shalabh Department of Mathematics and Statistics Indian Institute of Technology-Kanpur

Regression Analysis and Forecasting Prof. Shalabh Department of Mathematics and Statistics Indian Institute of Technology-Kanpur Regression Analysis and Forecasting Prof. Shalabh Department of Mathematics and Statistics Indian Institute of Technology-Kanpur Lecture 10 Software Implementation in Simple Linear Regression Model using

More information

Lecture 7: Hypothesis Testing and ANOVA

Lecture 7: Hypothesis Testing and ANOVA Lecture 7: Hypothesis Testing and ANOVA Goals Overview of key elements of hypothesis testing Review of common one and two sample tests Introduction to ANOVA Hypothesis Testing The intent of hypothesis

More information

Contents. Decision Making under Uncertainty 1. Meanings of uncertainty. Classical interpretation

Contents. Decision Making under Uncertainty 1. Meanings of uncertainty. Classical interpretation Contents Decision Making under Uncertainty 1 elearning resources Prof. Ahti Salo Helsinki University of Technology http://www.dm.hut.fi Meanings of uncertainty Interpretations of probability Biases in

More information

Chapter 5. Bayesian Statistics

Chapter 5. Bayesian Statistics Chapter 5. Bayesian Statistics Principles of Bayesian Statistics Anything unknown is given a probability distribution, representing degrees of belief [subjective probability]. Degrees of belief [subjective

More information

Parameter Estimation, Sampling Distributions & Hypothesis Testing

Parameter Estimation, Sampling Distributions & Hypothesis Testing Parameter Estimation, Sampling Distributions & Hypothesis Testing Parameter Estimation & Hypothesis Testing In doing research, we are usually interested in some feature of a population distribution (which

More information

Foundations of Probability and Statistics

Foundations of Probability and Statistics Foundations of Probability and Statistics William C. Rinaman Le Moyne College Syracuse, New York Saunders College Publishing Harcourt Brace College Publishers Fort Worth Philadelphia San Diego New York

More information

Probability and Statistics

Probability and Statistics Probability and Statistics Kristel Van Steen, PhD 2 Montefiore Institute - Systems and Modeling GIGA - Bioinformatics ULg kristel.vansteen@ulg.ac.be CHAPTER 4: IT IS ALL ABOUT DATA 4a - 1 CHAPTER 4: IT

More information

Probabilities, Uncertainties & Units Used to Quantify Climate Change

Probabilities, Uncertainties & Units Used to Quantify Climate Change Name Class Probabilities, Uncertainties & Units Used to Quantify Climate Change Most of the global average warming over the past 50 years is very likely due to anthropogenic GHG increases How does the

More information

15-388/688 - Practical Data Science: Decision trees and interpretable models. J. Zico Kolter Carnegie Mellon University Spring 2018

15-388/688 - Practical Data Science: Decision trees and interpretable models. J. Zico Kolter Carnegie Mellon University Spring 2018 15-388/688 - Practical Data Science: Decision trees and interpretable models J. Zico Kolter Carnegie Mellon University Spring 2018 1 Outline Decision trees Training (classification) decision trees Interpreting

More information

What is Structural Equation Modelling?

What is Structural Equation Modelling? methods@manchester What is Structural Equation Modelling? Nick Shryane Institute for Social Change University of Manchester 1 Topics Where SEM fits in the families of statistical models Causality SEM is

More information

Uncertainty in future climate: What you need to know about what we don t know

Uncertainty in future climate: What you need to know about what we don t know Uncertainty in future climate: What you need to know about what we don t know Philip B. Duffy Climate Central, Inc. climatecentral.org More about my favorite subject (me) Climate research since 1990 Mostly

More information

CSE 417T: Introduction to Machine Learning. Final Review. Henry Chai 12/4/18

CSE 417T: Introduction to Machine Learning. Final Review. Henry Chai 12/4/18 CSE 417T: Introduction to Machine Learning Final Review Henry Chai 12/4/18 Overfitting Overfitting is fitting the training data more than is warranted Fitting noise rather than signal 2 Estimating! "#$

More information

HYPOTHESIS TESTING. Hypothesis Testing

HYPOTHESIS TESTING. Hypothesis Testing MBA 605 Business Analytics Don Conant, PhD. HYPOTHESIS TESTING Hypothesis testing involves making inferences about the nature of the population on the basis of observations of a sample drawn from the population.

More information

Math Review Sheet, Fall 2008

Math Review Sheet, Fall 2008 1 Descriptive Statistics Math 3070-5 Review Sheet, Fall 2008 First we need to know about the relationship among Population Samples Objects The distribution of the population can be given in one of the

More information

Changing Biosphere: Lessons from the Past

Changing Biosphere: Lessons from the Past Unit 2: Causes of Mass Extinction Worksheet The biosphere has been devastated at least five times in the last 500 million years. During those five major mass extinctions, extinction rates for marine animal

More information

Machine Learning Recitation 8 Oct 21, Oznur Tastan

Machine Learning Recitation 8 Oct 21, Oznur Tastan Machine Learning 10601 Recitation 8 Oct 21, 2009 Oznur Tastan Outline Tree representation Brief information theory Learning decision trees Bagging Random forests Decision trees Non linear classifier Easy

More information

Bayesian inference: what it means and why we care

Bayesian inference: what it means and why we care Bayesian inference: what it means and why we care Robin J. Ryder Centre de Recherche en Mathématiques de la Décision Université Paris-Dauphine 6 November 2017 Mathematical Coffees Robin Ryder (Dauphine)

More information

Chapter 13: Forecasting

Chapter 13: Forecasting Chapter 13: Forecasting Assistant Prof. Abed Schokry Operations and Productions Management First Semester 2013-2014 Chapter 13: Learning Outcomes You should be able to: List the elements of a good forecast

More information

Classification and Regression Trees

Classification and Regression Trees Classification and Regression Trees Ryan P Adams So far, we have primarily examined linear classifiers and regressors, and considered several different ways to train them When we ve found the linearity

More information

Ockham Efficiency Theorem for Randomized Scientific Methods

Ockham Efficiency Theorem for Randomized Scientific Methods Ockham Efficiency Theorem for Randomized Scientific Methods Conor Mayo-Wilson and Kevin T. Kelly Department of Philosophy Carnegie Mellon University Formal Epistemology Workshop (FEW) June 19th, 2009 1

More information

Machine Learning Ensemble Learning I Hamid R. Rabiee Jafar Muhammadi, Alireza Ghasemi Spring /

Machine Learning Ensemble Learning I Hamid R. Rabiee Jafar Muhammadi, Alireza Ghasemi Spring / Machine Learning Ensemble Learning I Hamid R. Rabiee Jafar Muhammadi, Alireza Ghasemi Spring 2015 http://ce.sharif.edu/courses/93-94/2/ce717-1 / Agenda Combining Classifiers Empirical view Theoretical

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Analysis and Statistical Methods Statistics 65 http://www.stat.tamu.edu/~suhasini/teaching.html Suhasini Subba Rao Comparing populations Suppose I want to compare the heights of males and females

More information

Introduction to Machine Learning Fall 2017 Note 5. 1 Overview. 2 Metric

Introduction to Machine Learning Fall 2017 Note 5. 1 Overview. 2 Metric CS 189 Introduction to Machine Learning Fall 2017 Note 5 1 Overview Recall from our previous note that for a fixed input x, our measurement Y is a noisy measurement of the true underlying response f x):

More information

Section 9.1 (Part 2) (pp ) Type I and Type II Errors

Section 9.1 (Part 2) (pp ) Type I and Type II Errors Section 9.1 (Part 2) (pp. 547-551) Type I and Type II Errors Because we are basing our conclusion in a significance test on sample data, there is always a chance that our conclusions will be in error.

More information

Rigorous Evaluation R.I.T. Analysis and Reporting. Structure is from A Practical Guide to Usability Testing by J. Dumas, J. Redish

Rigorous Evaluation R.I.T. Analysis and Reporting. Structure is from A Practical Guide to Usability Testing by J. Dumas, J. Redish Rigorous Evaluation Analysis and Reporting Structure is from A Practical Guide to Usability Testing by J. Dumas, J. Redish S. Ludi/R. Kuehl p. 1 Summarize and Analyze Test Data Qualitative data - comments,

More information

Decision theory. 1 We may also consider randomized decision rules, where δ maps observed data D to a probability distribution over

Decision theory. 1 We may also consider randomized decision rules, where δ maps observed data D to a probability distribution over Point estimation Suppose we are interested in the value of a parameter θ, for example the unknown bias of a coin. We have already seen how one may use the Bayesian method to reason about θ; namely, we

More information

II. Frequentist probabilities

II. Frequentist probabilities II. Frequentist probabilities II.4 Statistical interpretation or calibration 1 II.4.1 What is statistical interpretation doing? 2 In light of a (non-perfect) forecast performance corrections are applied

More information

2. An Introduction to Moving Average Models and ARMA Models

2. An Introduction to Moving Average Models and ARMA Models . An Introduction to Moving Average Models and ARMA Models.1 White Noise. The MA(1) model.3 The MA(q) model..4 Estimation and forecasting of MA models..5 ARMA(p,q) models. The Moving Average (MA) models

More information

MS&E 226: Small Data

MS&E 226: Small Data MS&E 226: Small Data Lecture 12: Frequentist properties of estimators (v4) Ramesh Johari ramesh.johari@stanford.edu 1 / 39 Frequentist inference 2 / 39 Thinking like a frequentist Suppose that for some

More information

Classifier Selection. Nicholas Ver Hoeve Craig Martek Ben Gardner

Classifier Selection. Nicholas Ver Hoeve Craig Martek Ben Gardner Classifier Selection Nicholas Ver Hoeve Craig Martek Ben Gardner Classifier Ensembles Assume we have an ensemble of classifiers with a well-chosen feature set. We want to optimize the competence of this

More information

Introduction to Artificial Intelligence. Prof. Inkyu Moon Dept. of Robotics Engineering, DGIST

Introduction to Artificial Intelligence. Prof. Inkyu Moon Dept. of Robotics Engineering, DGIST Introduction to Artificial Intelligence Prof. Inkyu Moon Dept. of Robotics Engineering, DGIST Chapter 3 Uncertainty management in rule-based expert systems To help interpret Bayesian reasoning in expert

More information

Shaping the Earth s Surface (Weathering, Erosion and Deposition)

Shaping the Earth s Surface (Weathering, Erosion and Deposition) Shaping the Earth s Surface (Weathering, Erosion and Deposition) 6th grade Pre Sly Park Experience Activity Content Standards: NGSS 4 ESS2 1 Earth s Systems Make Observations and/or measurements to provide

More information

Statistical Machine Learning from Data

Statistical Machine Learning from Data Samy Bengio Statistical Machine Learning from Data 1 Statistical Machine Learning from Data Ensembles Samy Bengio IDIAP Research Institute, Martigny, Switzerland, and Ecole Polytechnique Fédérale de Lausanne

More information

Chapter 4: An Introduction to Probability and Statistics

Chapter 4: An Introduction to Probability and Statistics Chapter 4: An Introduction to Probability and Statistics 4. Probability The simplest kinds of probabilities to understand are reflected in everyday ideas like these: (i) if you toss a coin, the probability

More information

2 The Way Science Works

2 The Way Science Works CHAPTER 1 Introduction to Science 2 The Way Science Works SECTION KEY IDEAS As you read this section, keep these questions in mind: How can you use critical thinking to solve problems? What are scientific

More information

Chapter 11. Output Analysis for a Single Model Prof. Dr. Mesut Güneş Ch. 11 Output Analysis for a Single Model

Chapter 11. Output Analysis for a Single Model Prof. Dr. Mesut Güneş Ch. 11 Output Analysis for a Single Model Chapter Output Analysis for a Single Model. Contents Types of Simulation Stochastic Nature of Output Data Measures of Performance Output Analysis for Terminating Simulations Output Analysis for Steady-state

More information

appstats27.notebook April 06, 2017

appstats27.notebook April 06, 2017 Chapter 27 Objective Students will conduct inference on regression and analyze data to write a conclusion. Inferences for Regression An Example: Body Fat and Waist Size pg 634 Our chapter example revolves

More information

The Scientific Method

The Scientific Method The Scientific Method Mauna Kea Observatory, Hawaii Look up in the sky. How many stars do you see? Do they all look the same? How long have they been there? What are stars made of? 2 ENGAGE Orsola studies

More information

Statistical Inference. Why Use Statistical Inference. Point Estimates. Point Estimates. Greg C Elvers

Statistical Inference. Why Use Statistical Inference. Point Estimates. Point Estimates. Greg C Elvers Statistical Inference Greg C Elvers 1 Why Use Statistical Inference Whenever we collect data, we want our results to be true for the entire population and not just the sample that we used But our sample

More information

An Introduction to Mplus and Path Analysis

An Introduction to Mplus and Path Analysis An Introduction to Mplus and Path Analysis PSYC 943: Fundamentals of Multivariate Modeling Lecture 10: October 30, 2013 PSYC 943: Lecture 10 Today s Lecture Path analysis starting with multivariate regression

More information

5.1/4.1 Scientific Investigation, Reasoning, and Logic Question/Answer Packet #1

5.1/4.1 Scientific Investigation, Reasoning, and Logic Question/Answer Packet #1 5.1/4.1 Scientific Investigation, Reasoning, and Logic Question/Answer Packet #1 The student will demonstrate an understanding of scientific reasoning, logic, and the nature of science by planning and

More information

Bayesian analysis in nuclear physics

Bayesian analysis in nuclear physics Bayesian analysis in nuclear physics Ken Hanson T-16, Nuclear Physics; Theoretical Division Los Alamos National Laboratory Tutorials presented at LANSCE Los Alamos Neutron Scattering Center July 25 August

More information

Statistics Boot Camp. Dr. Stephanie Lane Institute for Defense Analyses DATAWorks 2018

Statistics Boot Camp. Dr. Stephanie Lane Institute for Defense Analyses DATAWorks 2018 Statistics Boot Camp Dr. Stephanie Lane Institute for Defense Analyses DATAWorks 2018 March 21, 2018 Outline of boot camp Summarizing and simplifying data Point and interval estimation Foundations of statistical

More information

TRAITS to put you on the map

TRAITS to put you on the map TRAITS to put you on the map Know what s where See the big picture Connect the dots Get it right Use where to say WOW Look around Spread the word Make it yours Finding your way Location is associated with

More information

Enhancing Weather Information with Probability Forecasts. An Information Statement of the American Meteorological Society

Enhancing Weather Information with Probability Forecasts. An Information Statement of the American Meteorological Society Enhancing Weather Information with Probability Forecasts An Information Statement of the American Meteorological Society (Adopted by AMS Council on 12 May 2008) Bull. Amer. Meteor. Soc., 89 Summary This

More information

A first model of learning

A first model of learning A first model of learning Let s restrict our attention to binary classification our labels belong to (or ) We observe the data where each Suppose we are given an ensemble of possible hypotheses / classifiers

More information

Probability and Statistics

Probability and Statistics CHAPTER 5: PARAMETER ESTIMATION 5-0 Probability and Statistics Kristel Van Steen, PhD 2 Montefiore Institute - Systems and Modeling GIGA - Bioinformatics ULg kristel.vansteen@ulg.ac.be CHAPTER 5: PARAMETER

More information

STAT 285 Fall Assignment 1 Solutions

STAT 285 Fall Assignment 1 Solutions STAT 285 Fall 2014 Assignment 1 Solutions 1. An environmental agency sets a standard of 200 ppb for the concentration of cadmium in a lake. The concentration of cadmium in one lake is measured 17 times.

More information

Introduction to Design of Experiments

Introduction to Design of Experiments Introduction to Design of Experiments Jean-Marc Vincent and Arnaud Legrand Laboratory ID-IMAG MESCAL Project Universities of Grenoble {Jean-Marc.Vincent,Arnaud.Legrand}@imag.fr November 20, 2011 J.-M.

More information

Chapter 27 Summary Inferences for Regression

Chapter 27 Summary Inferences for Regression Chapter 7 Summary Inferences for Regression What have we learned? We have now applied inference to regression models. Like in all inference situations, there are conditions that we must check. We can test

More information

An Introduction to Path Analysis

An Introduction to Path Analysis An Introduction to Path Analysis PRE 905: Multivariate Analysis Lecture 10: April 15, 2014 PRE 905: Lecture 10 Path Analysis Today s Lecture Path analysis starting with multivariate regression then arriving

More information

Predicting Climate Change

Predicting Climate Change Predicting Climate Change Dave Frame Climate Dynamics Group, Department of Physics, Predicting climate change Simple model of climate system, used as the basis of a probabilistic forecast Generating a

More information

CSC 411 Lecture 7: Linear Classification

CSC 411 Lecture 7: Linear Classification CSC 411 Lecture 7: Linear Classification Roger Grosse, Amir-massoud Farahmand, and Juan Carrasquilla University of Toronto UofT CSC 411: 07-Linear Classification 1 / 23 Overview Classification: predicting

More information

Linear Models: Comparing Variables. Stony Brook University CSE545, Fall 2017

Linear Models: Comparing Variables. Stony Brook University CSE545, Fall 2017 Linear Models: Comparing Variables Stony Brook University CSE545, Fall 2017 Statistical Preliminaries Random Variables Random Variables X: A mapping from Ω to ℝ that describes the question we care about

More information

Political Science 236 Hypothesis Testing: Review and Bootstrapping

Political Science 236 Hypothesis Testing: Review and Bootstrapping Political Science 236 Hypothesis Testing: Review and Bootstrapping Rocío Titiunik Fall 2007 1 Hypothesis Testing Definition 1.1 Hypothesis. A hypothesis is a statement about a population parameter The

More information

Problems from Probability and Statistical Inference (9th ed.) by Hogg, Tanis and Zimmerman.

Problems from Probability and Statistical Inference (9th ed.) by Hogg, Tanis and Zimmerman. Math 224 Fall 2017 Homework 1 Drew Armstrong Problems from Probability and Statistical Inference (9th ed.) by Hogg, Tanis and Zimmerman. Section 1.1, Exercises 4,5,6,7,9,12. Solutions to Book Problems.

More information

GRADE 6 SCIENCE REVISED 2014

GRADE 6 SCIENCE REVISED 2014 QUARTER 1 Developing and Using Models Develop and use a model to describe phenomena. (MS-LS1-2) Develop a model to describe unobservable mechanisms. (MS-LS1-7) Planning and Carrying Out Investigations

More information

Machine Learning

Machine Learning Machine Learning 10-601 Tom M. Mitchell Machine Learning Department Carnegie Mellon University January 14, 2015 Today: The Big Picture Overfitting Review: probability Readings: Decision trees, overfiting

More information

SRI Briefing Note Series No.8 Communicating uncertainty in seasonal and interannual climate forecasts in Europe: organisational needs and preferences

SRI Briefing Note Series No.8 Communicating uncertainty in seasonal and interannual climate forecasts in Europe: organisational needs and preferences ISSN 2056-8843 Sustainability Research Institute SCHOOL OF EARTH AND ENVIRONMENT SRI Briefing Note Series No.8 Communicating uncertainty in seasonal and interannual climate forecasts in Europe: organisational

More information

Sociology 6Z03 Review II

Sociology 6Z03 Review II Sociology 6Z03 Review II John Fox McMaster University Fall 2016 John Fox (McMaster University) Sociology 6Z03 Review II Fall 2016 1 / 35 Outline: Review II Probability Part I Sampling Distributions Probability

More information