Announcements. Lecture 18: Simple Linear Regression. Poverty vs. HS graduate rate


Announcements

Lecture 18: Simple Linear Regression
Statistics 101
Mine Çetinkaya-Rundel
March 29, 2012

Midterm 2 - same regrade request policy:
On a separate sheet, write up your request, describing specifically where you think you should have earned more points.
Do NOT write on your exam if you want it to be considered for a regrade.
When you submit a regrade request, your entire exam will be regraded, not just the question(s) mentioned in your request.
Due at the beginning of class on Tuesday, April 3.

Over the weekend: 3 page paper on working in teams (posted on the course website) and online quiz.

Project 2 proposal: Due by midnight on April 8 (Sunday) in a Google Doc. All relevant instructions are posted on the course website.

Statistics 101 (Mine Çetinkaya-Rundel), L18: Simple Linear Regression, March 29, 2012

Recap

Review question
True or False: If the p-value is sufficiently large you can reject H_A.
(a) True
(b) False

Poverty vs. HS graduate rate

The scatterplot below shows the relationship between HS graduate rate in all 51 states in the US (including DC) and the % of residents who live below the poverty line (income below $22,350 for a family of 4).
[Scatterplot: % in poverty vs. % HS grad]

Response vs. explanatory

The response variable is on the y-axis, and the explanatory variable is on the x-axis.
[Scatterplot: % in poverty (y-axis) vs. % HS grad (x-axis)]

Eyeballing the line

Which of the following appears to be the line that best fits the linear relationship between % in poverty and % HS grad? Choose one.
(a) (b) (c) (d)
[Scatterplot with four candidate lines labeled (a) to (d)]

Residuals

Residuals are the leftovers from the model fit: Data = Fit + Residual
[Scatterplot: % in poverty vs. % HS grad with the fitted line]

Residuals (cont.)

A residual is the difference between the observed and predicted y:
$e_i = y_i - \hat{y}_i$
% living in poverty in DC is 5.44% more than predicted.
% living in poverty in RI is 4.16% less than predicted.
[Scatterplot with the residuals for DC and RI marked]
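As a minimal sketch of Data = Fit + Residual in R, the block below fits a least squares line to a small made-up data set and computes the residuals by hand. The vectors `hs_grad` and `poverty` are hypothetical toy values chosen for illustration, not the lecture's poverty data.

```r
# Hypothetical toy data, invented for illustration only.
hs_grad <- c(80, 82, 85, 87, 90)   # explanatory variable (x)
poverty <- c(16, 14, 12, 11, 8)    # response variable (y)

fit   <- lm(poverty ~ hs_grad)     # least squares fit
y_hat <- fitted(fit)               # predicted y for each observation
e     <- poverty - y_hat           # residual = observed y - predicted y

# Data = Fit + Residual holds for every observation.
all.equal(unname(y_hat + e), poverty)

# The hand-computed residuals agree with the ones stored on the model.
all.equal(unname(e), unname(resid(fit)))
```

A positive residual means the observed value sits above the line (more than predicted), and a negative one means it sits below (less than predicted).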

Describing the relationship

The relationship between % in poverty and % HS grad is:
linear
negative
somewhat strong - not a huge amount of scatter around the line
[Scatterplot: % in poverty vs. % HS grad]

Quantifying the relationship

Correlation describes the strength of the linear relationship between two variables.
It takes values between -1 (perfect negative relationship) and +1 (perfect positive relationship).
A value of 0 indicates no linear relationship.

Guessing the correlation

Which of the following is the best guess for the correlation between % in poverty and % HS grad?
(a). (b) -.7 (c) -.1 (d).2 (e)
[Scatterplot: % in poverty vs. % HS grad]

Calculating the correlation

Using computation: cor(poverty$poverty, poverty$graduates)
Using a formula:
$R = \frac{1}{n-1} \sum_{i=1}^{n} \frac{x_i - \bar{x}}{s_x} \cdot \frac{y_i - \bar{y}}{s_y}$
Note: You won't be asked to calculate the correlation coefficient by hand, because nobody does it by hand. But you might be given a scatterplot and asked to guess the correlation.
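To make the connection between the formula and `cor()` concrete, here is a short sketch that evaluates the standardized-product formula directly and compares it with R's built-in function. The vectors `x` and `y` are hypothetical toy data, not the `poverty` data frame used in class.

```r
# Hypothetical toy data, invented for illustration only.
x <- c(80, 82, 85, 87, 90)
y <- c(16, 14, 12, 11, 8)
n <- length(x)

# Standardize each variable: subtract the mean, divide by the sample sd.
z_x <- (x - mean(x)) / sd(x)
z_y <- (y - mean(y)) / sd(y)

# R = 1/(n - 1) * sum of the products of the standardized values.
r_formula <- sum(z_x * z_y) / (n - 1)

# Built-in computation (Pearson correlation by default).
r_builtin <- cor(x, y)

all.equal(r_formula, r_builtin)
```

Because both variables are standardized before multiplying, the result does not depend on the units in which `x` and `y` are measured.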

Guessing the correlation

Which of the following is the best guess for the correlation between % in poverty and % female householder, no husband present?
(a).1 (b) -. (c) -.4 (d).9 (e).
[Scatterplot: % in poverty vs. % female householder, no husband present]

Assessing the correlation

Which of the following has the strongest correlation, i.e. correlation coefficient closest to +1 or -1?
(a) (b) (c) (d)
[Four scatterplots labeled (a) to (d)]

Play the game!

istics.net/stat/correlations
Group name: sta1

Best line

A measure for the best line

We want a line that has small residuals.
One option: Minimize the sum of magnitudes (absolute values) of residuals
$|e_1| + |e_2| + \cdots + |e_n|$
Another option: Minimize the sum of squared residuals
$e_1^2 + e_2^2 + \cdots + e_n^2$
The line that minimizes the sum of squared residuals is the least squares line.
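The two criteria can be written as small R functions. The sketch below assumes hypothetical toy data and helper names (`sum_abs_resid`, `sum_sq_resid`) that are not part of the lecture. It checks that the line returned by `lm()` has a smaller sum of squared residuals than two nearby lines.

```r
# Hypothetical toy data, invented for illustration only.
x <- c(80, 82, 85, 87, 90)
y <- c(16, 14, 12, 11, 8)

# Two candidate criteria for a line y_hat = b0 + b1 * x.
sum_abs_resid <- function(b0, b1) sum(abs(y - (b0 + b1 * x)))
sum_sq_resid  <- function(b0, b1) sum((y - (b0 + b1 * x))^2)

# lm() returns the least squares line: (intercept, slope).
b <- unname(coef(lm(y ~ x)))

# Both criteria evaluated at the least squares line.
abs_ls <- sum_abs_resid(b[1], b[2])
ssr_ls <- sum_sq_resid(b[1], b[2])

# Sum of squared residuals for two nearby lines:
# one with a shifted intercept, one with a tilted slope.
ssr_shifted <- sum_sq_resid(b[1] + 0.5, b[2])
ssr_tilted  <- sum_sq_resid(b[1], b[2] + 0.01)

ssr_ls < ssr_shifted
ssr_ls < ssr_tilted
```

Any line other than the least squares line gives a larger sum of squared residuals. The same is not guaranteed for the sum of absolute residuals, which can be minimized by a different line.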

Why minimize squares?

1. Most commonly used
2. Easier to compute by hand and using software
3. In many applications, a residual twice as large as another is more than twice as bad

The least squares line

$\hat{y} = \beta_0 + \beta_1 x$
Here $\hat{y}$ is the predicted y, $\beta_0$ is the intercept, $\beta_1$ is the slope, and $x$ is the explanatory variable.
Notation:
Intercept: Parameter: $\beta_0$, Point estimate: $b_0$
Slope: Parameter: $\beta_1$, Point estimate: $b_1$

Given...

              % HS grad (x)       % in poverty (y)
mean          $\bar{x} = 86.01$   $\bar{y} = 11.35$
sd            $s_x = 3.73$        $s_y = 3.1$
correlation   $R = -0.75$

Slope

The slope of the regression can be calculated as
$b_1 = \frac{s_y}{s_x} R$
In context...
$b_1 = \frac{3.1}{3.73} \times (-0.75) = -0.62$
Interpretation
For each % point increase in HS graduate rate, we would expect the % living in poverty to decrease on average by 0.62% points.
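As a quick check of the slope formula, the sketch below computes the slope from the two standard deviations and the correlation and compares it with the slope reported by `lm()`. The data are hypothetical toy values, not the lecture's data set.

```r
# Hypothetical toy data, invented for illustration only.
x <- c(80, 82, 85, 87, 90)   # explanatory variable
y <- c(16, 14, 12, 11, 8)    # response variable

# Slope from summary statistics: b1 = (s_y / s_x) * R
b1_formula <- sd(y) / sd(x) * cor(x, y)

# Slope from the fitted least squares line.
b1_lm <- unname(coef(lm(y ~ x))["x"])

all.equal(b1_formula, b1_lm)
```

The slope always has the same sign as the correlation, since both standard deviations are positive.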

Intercept

The intercept is where the regression line intersects the y-axis. The calculation of the intercept uses the fact that a regression line always passes through $(\bar{x}, \bar{y})$.
$\bar{y} = b_0 + b_1 \bar{x}$, so $b_0 = \bar{y} - b_1 \bar{x}$
$b_0 = 11.35 - (-0.62) \times 86.01 = 64.68$
[Scatterplot with the regression line extended to the intercept]

Clicker

Which of the following is the correct interpretation of the intercept?
(a) For each % point increase in HS graduate rate, % living in poverty is expected to increase on average by 64.68%.
(b) For each % point decrease in HS graduate rate, % living in poverty is expected to increase on average by 64.68%.
(c) Having no HS graduates leads to 64.68% of residents living below the poverty line.
(d) States with no HS graduates are expected on average to have 64.68% of residents living below the poverty line.
(e) In states with no HS graduates % living in poverty is expected to increase on average by 64.68%.

Regression line

$\widehat{\text{\% in poverty}} = 64.68 - 0.62 \times \text{\% HS grad}$
[Scatterplot: % in poverty vs. % HS grad with the regression line]

Interpretation of slope and intercept

Interpreting regression line parameter estimates
Intercept: When x = 0, y is expected to equal the intercept.
Slope: For each unit increase in x, y is expected to increase/decrease on average by the slope.
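The intercept calculation can be sketched the same way. The block below uses hypothetical toy data rather than the lecture's values. It derives the intercept from the fact that the line passes through the point of means, then confirms that the prediction at x = 0 equals the intercept.

```r
# Hypothetical toy data, invented for illustration only.
x <- c(80, 82, 85, 87, 90)
y <- c(16, 14, 12, 11, 8)

# Slope from summary statistics, then intercept from the point of means.
b1 <- sd(y) / sd(x) * cor(x, y)
b0 <- mean(y) - b1 * mean(x)

fit <- lm(y ~ x)

# Hand-computed estimates match lm()'s (intercept, slope).
all.equal(c(b0, b1), unname(coef(fit)))

# The line passes through (mean(x), mean(y)).
all.equal(b0 + b1 * mean(x), mean(y))

# The prediction at x = 0 is the intercept.
pred_at_zero <- unname(predict(fit, newdata = data.frame(x = 0)))
all.equal(pred_at_zero, b0)
```

In this toy data set x only ranges from 80 to 90, so the prediction at x = 0 lies far outside the observed values.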

Extrapolation

Applying a model estimate to values outside of the realm of the original data is called extrapolation.
Sometimes the intercept might be an extrapolation.
[Scatterplot with the regression line extended back to the intercept]

Examples of extrapolation

[Three example figures]

Conditions

1. Linearity
2. Nearly normal residuals
3. Constant variability

Conditions: (1) Linearity

The relationship between the explanatory and the response variable should be linear.
Methods for fitting a model to non-linear relationships exist, but are beyond the scope of this class.
Check using a scatterplot of the data, or a residuals plot.

Anatomy of a residuals plot

RI:
% HS grad = 81, % in poverty = 10.3
predicted % in poverty = $64.68 - 0.62 \times 81 = 14.46$
e = % in poverty - predicted % in poverty = $10.3 - 14.46 = -4.16$
DC:
% HS grad = 86, % in poverty = 16.8
predicted % in poverty = $64.68 - 0.62 \times 86 = 11.36$
e = % in poverty - predicted % in poverty = $16.8 - 11.36 = 5.44$
[Scatterplot with the residuals plot beneath it, RI and DC marked]

Conditions: (2) Nearly normal residuals

The residuals should be nearly normal.
This condition may not be satisfied when there are unusual observations that don't follow the trend of the rest of the data.
Check using a histogram or normal probability plot of residuals.
[Figures: histogram of residuals; normal Q-Q plot of residuals]

Conditions: (3) Constant variability

The variability of points around the least squares line should be roughly constant.
This implies that the variability of residuals around the 0 line should be roughly constant as well.
Also called homoscedasticity.
Check using a residuals plot.

Checking conditions

What condition is this linear model obviously violating?
(a) Constant variability
(b) Linear relationship
(c) Non-normal residuals
(d) No extreme outliers
[Figure: scatterplot of y vs. x with its residuals plot (g$residuals vs. x)]

Checking conditions

What condition is this linear model obviously violating?
(a) Constant variability
(b) Linear relationship
(c) Non-normal residuals
(d) No extreme outliers
[Figure: scatterplot of y vs. x with its residuals plot (g$residuals vs. x)]

$R^2$

The strength of the fit of a linear model is most commonly evaluated using $R^2$.
$R^2$ is calculated as the square of the correlation coefficient.
It tells us what percent of variability in the response variable is explained by the model.
The remainder of the variability is explained by variables not included in the model.
For the model we've been working with, $R^2 = (-0.62)^2 = 0.38$.
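The following sketch, on hypothetical toy data, shows three equivalent ways to obtain $R^2$ for a simple linear regression in R: squaring the correlation, reading it from the model summary, and computing the share of variability in the response that the residuals do not account for.

```r
# Hypothetical toy data, invented for illustration only.
x <- c(80, 82, 85, 87, 90)
y <- c(16, 14, 12, 11, 8)

fit <- lm(y ~ x)

# 1. Square of the correlation coefficient.
r2_from_cor <- cor(x, y)^2

# 2. Value reported by the model summary.
r2_from_fit <- summary(fit)$r.squared

# 3. One minus (residual variability / total variability in y).
ss_resid <- sum(resid(fit)^2)
ss_total <- sum((y - mean(y))^2)
r2_from_ss <- 1 - ss_resid / ss_total

all.equal(r2_from_cor, r2_from_fit)
all.equal(r2_from_cor, r2_from_ss)
```

The equality between the squared correlation and the summary value holds for a model with a single explanatory variable, which is the setting of this lecture.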

Interpretation of $R^2$

Which of the below is the correct interpretation of $R = -0.62$, $R^2 = 0.38$?
(a) 38% of the variability in the % of HS graduates among the 51 states is explained by the model.
(b) 38% of the variability in the % of residents living in poverty among the 51 states is explained by the model.
(c) 38% of the time % HS graduates predict % living in poverty correctly.
(d) 62% of the variability in the % of residents living in poverty among the 51 states is explained by the model.

Types of outliers

How do(es) the outlier(s) influence the least squares line?
To answer this question think of where the regression line would be with and without the outlier(s).
[Scatterplot with outlier(s) and residuals plot]

Types of outliers

How do(es) the outlier(s) influence the least squares line?
[Scatterplot with outlier(s) and residuals plot]

Some terminology

Outliers are points that fall away from the cloud of points.
Outliers that fall horizontally away from the center of the cloud are called leverage points.
High leverage points that actually influence the slope of the regression line are called influential points.

Influential points

Data are available on the log of the surface temperature and the log of the light intensity of 47 stars in the star cluster CYG OB.
[Scatterplot: log(light intensity) vs. log(temp), with least squares lines fit w/ outliers and w/o outliers]

Types of outliers

Which of the below best describes the outlier?
(a) influential
(b) low leverage
(c) high leverage
(d) none of the above
[Scatterplot with one outlier and residuals plot]

Types of outliers

Does this outlier influence the slope of the regression line?
[Scatterplot with one outlier and residuals plot]

Recap

Which of the following is true?
(a) Influential points always change the intercept of the regression line.
(b) High leverage points always reduce $R^2$.
(c) All outliers are influential points.
(d) When the data set includes an influential point, the relationship between the explanatory variable and the response variable is always nonlinear.
(e) None of the above.
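One way to judge whether a point is influential is to fit the line with and without it and compare the slopes. The sketch below does this on hypothetical toy data: five made-up points on an upward trend, plus one made-up point placed far to the right of the cloud and well below the trend.

```r
# Hypothetical toy data: five points on a clear upward trend.
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# The same data plus one hypothetical high leverage point:
# far from the others horizontally, and far below the trend.
x_out <- c(x, 15)
y_out <- c(y, 3)

# Slope of the least squares line without and with the extra point.
slope_without <- unname(coef(lm(y ~ x))[2])
slope_with    <- unname(coef(lm(y_out ~ x_out))[2])

# Compare the two slopes side by side.
c(without = slope_without, with = slope_with)
```

Here the single added point pulls the line strongly toward itself, so the slope with the point is much flatter than the slope without it. In the lecture's terminology that makes it an influential point. A high leverage point that happened to fall along the existing trend would leave the slope almost unchanged.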


More information

Basic Business Statistics 6 th Edition

Basic Business Statistics 6 th Edition Basic Business Statistics 6 th Edition Chapter 12 Simple Linear Regression Learning Objectives In this chapter, you learn: How to use regression analysis to predict the value of a dependent variable based

More information

BIVARIATE DATA data for two variables

BIVARIATE DATA data for two variables (Chapter 3) BIVARIATE DATA data for two variables INVESTIGATING RELATIONSHIPS We have compared the distributions of the same variable for several groups, using double boxplots and back-to-back stemplots.

More information

Chapter 7 Linear Regression

Chapter 7 Linear Regression Chapter 7 Linear Regression 1 7.1 Least Squares: The Line of Best Fit 2 The Linear Model Fat and Protein at Burger King The correlation is 0.76. This indicates a strong linear fit, but what line? The line

More information

Scatterplots. STAT22000 Autumn 2013 Lecture 4. What to Look in a Scatter Plot? Form of an Association

Scatterplots. STAT22000 Autumn 2013 Lecture 4. What to Look in a Scatter Plot? Form of an Association Scatterplots STAT22000 Autumn 2013 Lecture 4 Yibi Huang October 7, 2013 21 Scatterplots 22 Correlation (x 1, y 1 ) (x 2, y 2 ) (x 3, y 3 ) (x n, y n ) A scatter plot shows the relationship between two

More information

The response variable depends on the explanatory variable.

The response variable depends on the explanatory variable. A response variable measures an outcome of study. > dependent variables An explanatory variable attempts to explain the observed outcomes. > independent variables The response variable depends on the explanatory

More information

Objectives Simple linear regression. Statistical model for linear regression. Estimating the regression parameters

Objectives Simple linear regression. Statistical model for linear regression. Estimating the regression parameters Objectives 10.1 Simple linear regression Statistical model for linear regression Estimating the regression parameters Confidence interval for regression parameters Significance test for the slope Confidence

More information

Determine is the equation of the LSRL. Determine is the equation of the LSRL of Customers in line and seconds to check out.. Chapter 3, Section 2

Determine is the equation of the LSRL. Determine is the equation of the LSRL of Customers in line and seconds to check out.. Chapter 3, Section 2 3.2c Computer Output, Regression to the Mean, & AP Formulas Be sure you can locate: the slope, the y intercept and determine the equation of the LSRL. Slope is always in context and context is x value.

More information

q3_3 MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

q3_3 MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. q3_3 MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Provide an appropriate response. 1) In 2007, the number of wins had a mean of 81.79 with a standard

More information

Exam Empirical Methods VU University Amsterdam, Faculty of Exact Sciences h, February 12, 2015

Exam Empirical Methods VU University Amsterdam, Faculty of Exact Sciences h, February 12, 2015 Exam Empirical Methods VU University Amsterdam, Faculty of Exact Sciences 18.30 21.15h, February 12, 2015 Question 1 is on this page. Always motivate your answers. Write your answers in English. Only the

More information

Final Exam Details. J. Parman (UC-Davis) Analysis of Economic Data, Winter 2011 March 8, / 24

Final Exam Details. J. Parman (UC-Davis) Analysis of Economic Data, Winter 2011 March 8, / 24 Final Exam Details The final is Thursday, March 17 from 10:30am to 12:30pm in the regular lecture room The final is cumulative (multiple choice will be a roughly 50/50 split between material since the

More information

2) For a normal distribution, the skewness and kurtosis measures are as follows: A) 1.96 and 4 B) 1 and 2 C) 0 and 3 D) 0 and 0

2) For a normal distribution, the skewness and kurtosis measures are as follows: A) 1.96 and 4 B) 1 and 2 C) 0 and 3 D) 0 and 0 Introduction to Econometrics Midterm April 26, 2011 Name Student ID MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. (5,000 credit for each correct

More information

Chapter Learning Objectives. Regression Analysis. Correlation. Simple Linear Regression. Chapter 12. Simple Linear Regression

Chapter Learning Objectives. Regression Analysis. Correlation. Simple Linear Regression. Chapter 12. Simple Linear Regression Chapter 12 12-1 North Seattle Community College BUS21 Business Statistics Chapter 12 Learning Objectives In this chapter, you learn:! How to use regression analysis to predict the value of a dependent

More information

Statistics 100 Exam 2 March 8, 2017

Statistics 100 Exam 2 March 8, 2017 STAT 100 EXAM 2 Spring 2017 (This page is worth 1 point. Graded on writing your name and net id clearly and circling section.) PRINT NAME (Last name) (First name) net ID CIRCLE SECTION please! L1 (MWF

More information

Chapter 8. Linear Regression /71

Chapter 8. Linear Regression /71 Chapter 8 Linear Regression 1 /71 Homework p192 1, 2, 3, 5, 7, 13, 15, 21, 27, 28, 29, 32, 35, 37 2 /71 3 /71 Objectives Determine Least Squares Regression Line (LSRL) describing the association of two

More information

Linear Regression and Correlation. February 11, 2009

Linear Regression and Correlation. February 11, 2009 Linear Regression and Correlation February 11, 2009 The Big Ideas To understand a set of data, start with a graph or graphs. The Big Ideas To understand a set of data, start with a graph or graphs. If

More information

Lecture (chapter 13): Association between variables measured at the interval-ratio level

Lecture (chapter 13): Association between variables measured at the interval-ratio level Lecture (chapter 13): Association between variables measured at the interval-ratio level Ernesto F. L. Amaral April 9 11, 2018 Advanced Methods of Social Research (SOCI 420) Source: Healey, Joseph F. 2015.

More information

Announcements. Final Review: Units 1-7

Announcements. Final Review: Units 1-7 Announcements Announcements Final : Units 1-7 Statistics 104 Mine Çetinkaya-Rundel June 24, 2013 Final on Wed: cheat sheet (one sheet, front and back) and calculator Must have webcam + audio on at all

More information

Ch Inference for Linear Regression

Ch Inference for Linear Regression Ch. 12-1 Inference for Linear Regression ACT = 6.71 + 5.17(GPA) For every increase of 1 in GPA, we predict the ACT score to increase by 5.17. population regression line β (true slope) μ y = α + βx mean

More information

Introduction. ECN 102: Analysis of Economic Data Winter, J. Parman (UC-Davis) Analysis of Economic Data, Winter 2011 January 4, / 51

Introduction. ECN 102: Analysis of Economic Data Winter, J. Parman (UC-Davis) Analysis of Economic Data, Winter 2011 January 4, / 51 Introduction ECN 102: Analysis of Economic Data Winter, 2011 J. Parman (UC-Davis) Analysis of Economic Data, Winter 2011 January 4, 2011 1 / 51 Contact Information Instructor: John Parman Email: jmparman@ucdavis.edu

More information

Chapter 5 Least Squares Regression

Chapter 5 Least Squares Regression Chapter 5 Least Squares Regression A Royal Bengal tiger wandered out of a reserve forest. We tranquilized him and want to take him back to the forest. We need an idea of his weight, but have no scale!

More information

Any of 27 linear and nonlinear models may be fit. The output parallels that of the Simple Regression procedure.

Any of 27 linear and nonlinear models may be fit. The output parallels that of the Simple Regression procedure. STATGRAPHICS Rev. 9/13/213 Calibration Models Summary... 1 Data Input... 3 Analysis Summary... 5 Analysis Options... 7 Plot of Fitted Model... 9 Predicted Values... 1 Confidence Intervals... 11 Observed

More information

Nonlinear Regression Curve Fitting and Regression (Statcrunch) Answers to selected problems

Nonlinear Regression Curve Fitting and Regression (Statcrunch) Answers to selected problems Nonlinear Regression Curve Fitting and Regression (Statcrunch) Answers to selected problems Act 1&3 1. a) Exponential growth fits well. b) Statcrunch: Ln ( Y ) = 8.5061554 + 0.5017053 ( x ) Exponential

More information

Relationships Regression

Relationships Regression Relationships Regression BPS chapter 5 2006 W.H. Freeman and Company Objectives (BPS chapter 5) Regression Regression lines The least-squares regression line Using technology Facts about least-squares

More information

Have you... Unit 1: Introduction to data Lecture 1: Data collection, observational studies, and experiments. Readiness assessment

Have you... Unit 1: Introduction to data Lecture 1: Data collection, observational studies, and experiments. Readiness assessment Have you... Unit 1: Introduction to data Lecture 1: Data collection, observational studies, and experiments Statistics 101 Mine Çetinkaya-Rundel January 15, 2013 been placed into a team? successfully logged

More information

Business Statistics. Lecture 10: Course Review

Business Statistics. Lecture 10: Course Review Business Statistics Lecture 10: Course Review 1 Descriptive Statistics for Continuous Data Numerical Summaries Location: mean, median Spread or variability: variance, standard deviation, range, percentiles,

More information

Stat 101 Exam 1 Important Formulas and Concepts 1

Stat 101 Exam 1 Important Formulas and Concepts 1 1 Chapter 1 1.1 Definitions Stat 101 Exam 1 Important Formulas and Concepts 1 1. Data Any collection of numbers, characters, images, or other items that provide information about something. 2. Categorical/Qualitative

More information

AP Statistics Unit 6 Note Packet Linear Regression. Scatterplots and Correlation

AP Statistics Unit 6 Note Packet Linear Regression. Scatterplots and Correlation Scatterplots and Correlation Name Hr A scatterplot shows the relationship between two quantitative variables measured on the same individuals. variable (y) measures an outcome of a study variable (x) may

More information

Announcements: You can turn in homework until 6pm, slot on wall across from 2202 Bren. Make sure you use the correct slot! (Stats 8, closest to wall)

Announcements: You can turn in homework until 6pm, slot on wall across from 2202 Bren. Make sure you use the correct slot! (Stats 8, closest to wall) Announcements: You can turn in homework until 6pm, slot on wall across from 2202 Bren. Make sure you use the correct slot! (Stats 8, closest to wall) We will cover Chs. 5 and 6 first, then 3 and 4. Mon,

More information

Statistical View of Least Squares

Statistical View of Least Squares May 23, 2006 Purpose of Regression Some Examples Least Squares Purpose of Regression Purpose of Regression Some Examples Least Squares Suppose we have two variables x and y Purpose of Regression Some Examples

More information

Announcements. Unit 1: Introduction to data Lecture 1: Data collection, observational studies, and experiments. Statistics 101

Announcements. Unit 1: Introduction to data Lecture 1: Data collection, observational studies, and experiments. Statistics 101 Announcements Unit 1: Introduction to data Lecture 1: Data collection, observational studies, and experiments Statistics 101 Mine Çetinkaya-Rundel Duke University I m still waiting on a couple more Gmail

More information

Final Exam - Solutions

Final Exam - Solutions Ecn 102 - Analysis of Economic Data University of California - Davis March 17, 2010 Instructor: John Parman Final Exam - Solutions You have until 12:30pm to complete this exam. Please remember to put your

More information

11 Correlation and Regression

11 Correlation and Regression Chapter 11 Correlation and Regression August 21, 2017 1 11 Correlation and Regression When comparing two variables, sometimes one variable (the explanatory variable) can be used to help predict the value

More information

HOLLOMAN S AP STATISTICS BVD CHAPTER 08, PAGE 1 OF 11. Figure 1 - Variation in the Response Variable

HOLLOMAN S AP STATISTICS BVD CHAPTER 08, PAGE 1 OF 11. Figure 1 - Variation in the Response Variable Chapter 08: Linear Regression There are lots of ways to model the relationships between variables. It is important that you not think that what we do is the way. There are many paths to the summit We are

More information

Business Statistics. Lecture 10: Correlation and Linear Regression

Business Statistics. Lecture 10: Correlation and Linear Regression Business Statistics Lecture 10: Correlation and Linear Regression Scatterplot A scatterplot shows the relationship between two quantitative variables measured on the same individuals. It displays the Form

More information

STATS DOESN T SUCK! ~ CHAPTER 16

STATS DOESN T SUCK! ~ CHAPTER 16 SIMPLE LINEAR REGRESSION: STATS DOESN T SUCK! ~ CHAPTER 6 The HR manager at ACME food services wants to examine the relationship between a workers income and their years of experience on the job. He randomly

More information

MATH 1150 Chapter 2 Notation and Terminology

MATH 1150 Chapter 2 Notation and Terminology MATH 1150 Chapter 2 Notation and Terminology Categorical Data The following is a dataset for 30 randomly selected adults in the U.S., showing the values of two categorical variables: whether or not the

More information

Mrs. Poyner/Mr. Page Chapter 3 page 1

Mrs. Poyner/Mr. Page Chapter 3 page 1 Name: Date: Period: Chapter 2: Take Home TEST Bivariate Data Part 1: Multiple Choice. (2.5 points each) Hand write the letter corresponding to the best answer in space provided on page 6. 1. In a statistics

More information

Chapter 27 Summary Inferences for Regression

Chapter 27 Summary Inferences for Regression Chapter 7 Summary Inferences for Regression What have we learned? We have now applied inference to regression models. Like in all inference situations, there are conditions that we must check. We can test

More information

FinalExamReview. Sta Fall Provided: Z, t and χ 2 tables

FinalExamReview. Sta Fall Provided: Z, t and χ 2 tables Final Exam FinalExamReview Sta 101 - Fall 2017 Duke University, Department of Statistical Science When: Wednesday, December 13 from 9:00am-12:00pm What to bring: Scientific calculator (graphing calculator

More information

The following formulas related to this topic are provided on the formula sheet:

The following formulas related to this topic are provided on the formula sheet: Student Notes Prep Session Topic: Exploring Content The AP Statistics topic outline contains a long list of items in the category titled Exploring Data. Section D topics will be reviewed in this session.

More information

SIMPLE LINEAR REGRESSION STAT 251

SIMPLE LINEAR REGRESSION STAT 251 1 SIMPLE LINEAR REGRESSION STAT 251 OUTLINE Relationships in Data The Beginning Scatterplots Correlation The Least Squares Line Cautions Association vs. Causation Extrapolation Outliers Inference: Simple

More information

Section 5.4 Residuals

Section 5.4 Residuals Section 5.4 Residuals A residual value is the difference between an actual observed y value and the corresponding predicted y value, y. Residuals are just errors. Residual error = observed value predicted

More information