Module 19: Simple Linear Regression

1 Module 19: Simple Linear Regression This module focuses on simple linear regression and thus begins the process of exploring one of the more widely used and powerful statistical tools. Reviewed 11 May 05

2 Goldman-Tono-Pen Example An ophthalmologist who is assessing intraocular pressures as part of a community program for the prevention of glaucoma is interested in using a portable device (Tono-Pen) for making these measurements. An important question is how well the measurements made with this device compare to those made with a more standard device (Goldman) used in clinical settings. To address this question, the ophthalmologist compared the two devices by using each on n = 40 eyes. For this comparison, each eye was measured once with each device.

3 Goldman-Tono-Pen Example Data [Table: columns ID, Goldman, T-Pen, repeated in two side-by-side blocks]

4 Comparing the Two Devices One approach to comparing the two devices is a paired t-test. It is appropriate here because the measurements made by the two devices on the same eye cannot be considered independent, and because the differences between the two measurements are what is of interest.

5 Goldman-Tono-Pen Worksheet [Table: columns ID, x = G (Goldman), y = T (Tono-Pen), d, d², repeated in two side-by-side blocks, with Sum and Mean rows]

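6 Goldman-Tono-Pen summary worksheet, with d = G − T [Table: rows N, Sum, Mean, SD, (Sum)²/n, Sum(x²), SS, s, SE for the columns X = G, Y = T, and d; among the legible entries, Sum(x²) = 16,497 for G]
t = mean(d)/SE(d) = 1.58
df = n − 1 = 39
t(39) = 2.02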

7 Paired t-test for the Goldman-Tono-Pen data
1. Hypothesis: $H_0: \Delta = \mu_G - \mu_T = 0$ vs. $H_1: \Delta \neq 0$
2. Assumptions: The differences are a random sample from a normal distribution
3. The α level: α = 0.05
4. Test statistic: $t = \bar{d} / (s_d / \sqrt{n})$
5. The rejection region: Reject $H_0$ if t is not between $\pm t_{0.975}(39) = \pm 2.02$
6. The result: n = 40, $\bar{d} = 0.8$, $t = \bar{d} / (s_d / \sqrt{n}) = 1.58$
7. The conclusion: Do not reject $H_0: \Delta = \mu_G - \mu_T = 0$, since t = 1.58 is between ±2.02
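The test statistic in step 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `paired_t` and the five pairs of readings are hypothetical, because the 40 recorded pairs are not reproduced in this transcription.

```python
import math
import statistics

def paired_t(x, y):
    """Return (mean difference, SD of differences, t statistic, df) for paired data."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    d = [xi - yi for xi, yi in zip(x, y)]    # d = G - T for each eye
    n = len(d)
    d_bar = statistics.mean(d)
    s_d = statistics.stdev(d)                # sample SD, divisor n - 1
    t = d_bar / (s_d / math.sqrt(n))         # t = d_bar / (s_d / sqrt(n))
    return d_bar, s_d, t, n - 1

# Hypothetical readings, for illustration only (not the module's data)
goldman = [18, 20, 15, 22, 17]
tono_pen = [17, 21, 14, 20, 16]
d_bar, s_d, t, df = paired_t(goldman, tono_pen)
print(f"mean d = {d_bar:.2f}, s_d = {s_d:.2f}, t = {t:.2f}, df = {df}")
```

For the module's own data the same statistic is compared with the tabled value ±2.02 on 39 degrees of freedom.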

8 Hence, from this standpoint, we do not have compelling evidence that the two devices are measuring intraocular pressures differently. Is this a sufficient assessment of the situation, or should we look further?

9 Looking Further One way to look further at this situation is to think about the relationship between the measurements made by the two machines in terms of simple linear regression. In this context, we ask whether higher values on one machine go with higher values on the other. Simple linear regression focuses on a possible straight-line relationship between the measurements made by the two machines.

10 Simple Linear Regression Concepts In general, simple linear regression finds the best straight line for describing the relationship between two variables. In its simplest form, which is what we consider here, it does not do a very good job of assessing how well the line describes the data, but it nevertheless provides useful information.

11 [Figure: the line y = a + bx drawn with the dependent variable on the y-axis and the independent variable on the x-axis; a is marked where the line meets the y-axis, and b is marked as the change in y over 1 unit of x]
a = Intercept, that is, the point where the line crosses the y-axis, which is the value of y at x = 0.
b = Slope of the regression line, that is, the number of units of increase (positive slope) or decrease (negative slope) in y for each unit increase in x.

12 The Regression Line [Figure: Y (dependent variable) against X (independent variable), with labels l1 to l5]

13 [Figure: Y (dependent variable) against X (independent variable), with labels d1 to d5]

14 [Figure: Y (dependent variable) against X (independent variable), with labels l1 to l5 and d1 to d5 together]

15 The context for simple linear regression is that we have a random sample of persons from a set of well-defined populations, each defined by a specific value of the x-variable. We also have measurements of another variable, the y-variable, so that we have two variables for each person. For simple linear regression, we focus on a straight line that depicts the relationship between these two variables. The best straight line is the one for which the sum of the squared vertical distances of the points from the line is smallest. This "least squares" line has slope and intercept

$$b = \frac{SS(xy)}{SS(x)} = \frac{\sum xy - \sum x \sum y / n}{\sum x^2 - (\sum x)^2 / n}, \qquad a = \bar{y} - b\bar{x}.$$
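The slope and intercept formulas translate directly into code. The following is a minimal sketch rather than the module's own worksheet; the function name `least_squares` and the four sample points (chosen to lie exactly on y = 1 + 2x) are assumptions made for illustration.

```python
def least_squares(x, y):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    sum_x, sum_y = sum(x), sum(y)
    # SS(xy) = sum(xy) - sum(x)*sum(y)/n
    ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum_x * sum_y / n
    # SS(x) = sum(x^2) - (sum(x))^2/n
    ss_x = sum(xi * xi for xi in x) - sum_x ** 2 / n
    if ss_x == 0:
        raise ValueError("all x values are identical, so the slope is undefined")
    b = ss_xy / ss_x
    a = sum_y / n - b * sum_x / n            # a = y_bar - b * x_bar
    return a, b

# Hypothetical points lying exactly on y = 1 + 2x
a, b = least_squares([0, 1, 2, 3], [1, 3, 5, 7])
print(f"a = {a}, b = {b}")
```

Working from the sums rather than from deviations mirrors the hand-calculation layout of the worksheets in this module.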

16 For this situation, the sample line y = a + bx is an estimate of the population line Y = α + βx, and a and b are estimates of α and β, respectively. For a specific value of x, such as x = 10, the value of y calculated from the regression equation is $\hat{y} = a + b(10)$, which is called the regression estimate of Y at the value x = 10.

17 Simple Regression Example The following data are diastolic blood pressure (DBP) measurements taken at different times after an intervention for n = 5 persons. For each person, the data available include the time of the measurement and the DBP level. Of interest is the relationship between these two variables

18 Blood pressure worksheet [Table: columns Patient, Time (x), DBP (y), x², y², xy, with Sum and Mean rows for n = 5 patients; legible column sums include Σxy = 3,310]

19 For the blood pressure data,

$$\bar{x} = 50/5 = 10, \qquad \bar{y} = 338/5 = 67.6,$$

the slope is

$$b = \frac{SS(xy)}{SS(x)} = \frac{\sum xy - \sum x \sum y / n}{\sum x^2 - (\sum x)^2 / n} = \frac{3{,}310 - (50)(338)/5}{\sum x^2 - (50)^2/5} = -0.28,$$

and the intercept is

$$a = \bar{y} - b\bar{x} = 67.6 - (-0.28)(10) = 70.4.$$

The best line is y = a + bx = 70.4 − 0.28x.
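The arithmetic for the blood pressure example can be checked from the summary sums alone. This sketch uses the sums quoted in the module (n = 5, Σx = 50, Σy = 338, Σxy = 3,310). The value SS(x) = 250 is an assumption: it is not legible in the transcription and is inferred here from the reported slope b = −0.28 together with SS(xy) = −70.

```python
n = 5
sum_x, sum_y, sum_xy = 50, 338, 3310   # sums quoted in the module
ss_x = 250.0   # assumption: inferred from b = -0.28 and SS(xy) = -70

x_bar = sum_x / n                      # 10.0
y_bar = sum_y / n                      # 67.6
ss_xy = sum_xy - sum_x * sum_y / n     # 3310 - 3380 = -70
b = ss_xy / ss_x                       # slope
a = y_bar - b * x_bar                  # intercept
y_hat_10 = a + b * 10                  # regression estimate at x = 10

print(f"SS(xy) = {ss_xy:.1f}, b = {b:.2f}, a = {a:.1f}, y_hat(10) = {y_hat_10:.1f}")
```

Because x = 10 is the mean of x, the regression estimate there equals the mean of y, 67.6.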

20 [Figure: diastolic blood pressure (y) plotted against time in minutes (x) for the five patients, with the fitted line y = 70.4 − 0.28x]

21 Example: AJPH, Dec. 2003; 93:

23 Never Smoking Regression Worksheet [Table: columns Year (x), Female (y₁), Male (y₂), x², xy₁, xy₂, y₁², y₂²; summary rows Total, Mean, Sum, numerator of b, denominator of b, b, a]

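24 For the never smoking data, with n = 9 years,

$$\bar{x} = \sum x / 9, \qquad \bar{y}_{female} = 603.92 / 9, \qquad \bar{y}_{male} = 580.51 / 9.$$

The slopes are

$$b = \frac{SS(xy)}{SS(x)} = \frac{\sum xy - \sum x \sum y / n}{\sum x^2 - (\sum x)^2 / n},$$

$$b_{female} = \frac{\sum xy_1 - (\sum x)(603.92)/9}{\sum x^2 - (\sum x)^2 / 9} = 0.285, \qquad b_{male} = \frac{\sum xy_2 - (\sum x)(580.51)/9}{\sum x^2 - (\sum x)^2 / 9} = 0.871.$$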

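25 The intercepts are

$$a = \bar{y} - b\bar{x}, \qquad a_{female} = \bar{y}_{female} - (0.285 \times \bar{x}), \qquad a_{male} = \bar{y}_{male} - (0.871 \times \bar{x}).$$

The best lines are

$$y_{female} = a_{female} + b_{female}\,x = a_{female} + 0.285x, \qquad y_{male} = a_{male} + b_{male}\,x = a_{male} + 0.871x.$$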

26 [Figure: percentage of never smokers plotted against year, showing the observed values for Female (Y1) and Male (Y2) together with the fitted female and male regression lines]

27 Regression ANOVA If the regression line is flat, in the sense that the regression estimate ŷ is the same for all values of x, then nothing is gained by considering the x variable, since it has no impact on ŷ. This situation occurs when the estimated slope b = 0. An important question is whether or not the population parameter β = 0, that is, whether the truth is that there is no linear relationship between y and x. To test this, we can proceed with a formal test.

28 Regression ANOVA: the formal test
1. The hypothesis: $H_0: \beta = 0$ vs. $H_1: \beta \neq 0$
2. The α level: α = 0.05
3. The assumptions: Random normal samples for the y-variable from populations defined by the x-variable
4. The test statistic: ANOVA

Source       df     SS        MS              F
Regression   1      SS(Reg)   SS(Reg)/1       MS(Reg)/MS(Res)
Residual     n-2    SS(Res)   SS(Res)/(n-2)
Total        n-1    SS(y)

5. The rejection region: Reject $H_0: \beta = 0$ if the value calculated for F is greater than $F_{0.95}(1, n-2)$
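The ANOVA table in step 4 can be assembled from the same sums of squares used for the slope. The sketch below is an illustration under assumptions: the function name `regression_anova`, the dictionary layout, and the five data points are hypothetical. The critical value $F_{0.95}(1, n-2)$ is not computed here and would be read from an F table.

```python
def regression_anova(x, y):
    """Return the regression ANOVA quantities for simple linear regression."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need paired data with at least 3 observations")
    sum_x, sum_y = sum(x), sum(y)
    ss_x = sum(v * v for v in x) - sum_x ** 2 / n
    ss_y = sum(v * v for v in y) - sum_y ** 2 / n          # SS(Total)
    ss_xy = sum(a * b for a, b in zip(x, y)) - sum_x * sum_y / n
    b = ss_xy / ss_x
    ss_reg = b * ss_xy                                     # SS(Regression)
    ss_res = ss_y - ss_reg                                 # SS(Residual)
    ms_reg = ss_reg / 1
    ms_res = ss_res / (n - 2)
    f_stat = ms_reg / ms_res if ms_res > 0 else float("inf")
    return {
        "df": (1, n - 2, n - 1),
        "SS": (ss_reg, ss_res, ss_y),
        "MS": (ms_reg, ms_res),
        "F": f_stat,
    }

# Hypothetical data, for illustration only
table = regression_anova([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(table)
```

The null hypothesis β = 0 would be rejected when `table["F"]` exceeds the tabled $F_{0.95}(1, n-2)$.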

29 $R^2 = SS(\text{Reg}) / SS(\text{Total})$. $R^2$ is the proportion of the total variation in the dependent variable y that is explained by its regression relationship with x.

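30 Blood Pressure Example

$$SS(\text{Total}) = SS(y) = \sum (y - \bar{y})^2 = \sum y^2 - (\sum y)^2 / n = 22{,}892 - (338)^2 / 5 = 43.2$$

$$SS(\text{Regression}) = b\,SS(xy) = b\{\sum xy - \sum x \sum y / n\} = -0.28\{3{,}310 - (50)(338)/5\} = 19.6$$

$$SS(\text{Residual}) = SS(\text{Total}) - SS(\text{Regression}) = 43.2 - 19.6 = 23.6$$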

31 ANOVA

Source       df   SS     MS     F
Regression   1    19.6   19.6   2.49
Residual     3    23.6   7.87
Total        4    43.2

$H_0: \beta = 0$ vs. $H_1: \beta \neq 0$. For α = 0.05, $F_{0.95}(1,3) = 10.1$. Since 2.49 < 10.1, do not reject $H_0: \beta = 0$.

$$R^2 = \frac{SS(\text{Regression})}{SS(\text{Total})} = \frac{19.6}{43.2} = 0.4537 \text{ or } 45.37\%$$

Note: The above hypothesis test does not assess how well the straight line fits the data.
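The entries of the blood pressure ANOVA follow from the two sums of squares already computed. This short sketch recomputes them; the variable names are illustrative, and the critical value 10.1 is the tabled $F_{0.95}(1,3)$ quoted in the module rather than something computed here.

```python
n = 5
ss_total = 43.2                  # SS(Total) from the module
ss_reg = 19.6                    # SS(Regression) from the module
ss_res = ss_total - ss_reg       # SS(Residual)

ms_reg = ss_reg / 1              # regression df = 1
ms_res = ss_res / (n - 2)        # residual df = n - 2 = 3
f_stat = ms_reg / ms_res
r_squared = ss_reg / ss_total

f_crit = 10.1                    # F_0.95(1, 3), as quoted in the module
reject_h0 = f_stat > f_crit

print(f"SS(Res) = {ss_res:.1f}, MS(Res) = {ms_res:.2f}, F = {f_stat:.2f}")
print(f"R^2 = {r_squared:.4f}, reject H0: {reject_h0}")
```

With only five observations the residual degrees of freedom are small, which is why the critical value is so large relative to the calculated F.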

32 Goldman-Tono-Pen Example We can apply these tools to the Goldman-Tono-Pen example. Note that while we test the null hypothesis H 0 : β = 0, it is of little interest as it is not a very meaningful hypothesis

33 Goldman-Tono-Pen Example [Table: columns ID, x = G (Goldman), y = T (T-Pen), d, d², G², T², G×T, with a Sum row; legible sums are ΣG² = 16,497 and ΣT² = 15,352]

34 The fitted line has the form $\hat{y} = a + bx$.

35 Goldman-Tono-Pen Example [Figure: Tono-Pen readings (y) plotted against Goldman readings (x), with the fitted regression line]

36 Regression ANOVA: Goldman-Tono-Pen Example
1. The hypothesis: $H_0: \beta = 0$ vs. $H_1: \beta \neq 0$
2. The assumptions: Random samples, x measured without error, y normally distributed for each level of x
3. The α-level: α = 0.05
4. The test statistic: ANOVA
5. The rejection region: Reject $H_0: \beta = 0$ if $F = MS(\text{Regression}) / MS(\text{Residual}) > F_{0.95}(1,38)$

37 6. The result: n = 40, $F_{0.95}(1,38) \approx 4.08$ [ANOVA table: rows Regression, Residual, Total; columns DF, SS, MS, F]
7. The conclusion: Reject $H_0: \beta = 0$, since the calculated F exceeds $F_{0.95}(1,38)$

38 Example: AJPH, Aug. 1999; 89:

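40 [Table: worksheet for 38 states (AL, AR, AZ, CA, CO, CT, FL, GA, IA, IN, KS, KY, LA, MA, MD, MI, MN, MO, MS, NC, ND, NH, NJ, NY, OH, OK, OR, PA, RI, SC, TN, TX, UT, VA, WA, WI, WV, WY) with columns State, Y, X, Y², X², XY and summary rows Total, Mean, SD]
r = 0.70, slope = 0.24, intercept = 3.92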

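41 [Figure: percentage reporting fair or poor health (y) plotted against percentage responding "Most people can't be trusted" (x), with the fitted line $\hat{y} = 3.92 + 0.24x$, r = 0.70, and the regression estimate marked at x = 45]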
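The regression estimate at x = 45 can be worked out from the reported intercept (3.92) and slope (0.24). The helper name `predict` below is an assumption made for illustration, and the result is computed from the rounded coefficients, so it may differ slightly from the value on the original slide.

```python
def predict(a, b, x):
    """Regression estimate y_hat = a + b*x."""
    return a + b * x

intercept, slope = 3.92, 0.24            # reported for the social capital example
y_hat = predict(intercept, slope, 45)    # 3.92 + 0.24 * 45
print(f"Estimated percentage reporting fair or poor health at x = 45: {y_hat:.2f}")
```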

42 Regression ANOVA: Social Capital and Self-Rated Health Example
1. The hypothesis: $H_0: \beta = 0$ vs. $H_1: \beta \neq 0$
2. The assumptions: Random samples, x measured without error, y normally distributed for each level of x
3. The α-level: α = 0.05
4. The test statistic: ANOVA
5. The rejection region: Reject $H_0: \beta = 0$ if $F = MS(\text{Regression}) / MS(\text{Residual}) > F_{0.95}(1,36)$

43 6. The result: n = 38 [ANOVA table: rows Regression, Residual, Total; columns DF, SS, MS, F]
7. The conclusion: Reject $H_0: \beta = 0$, since the calculated F exceeds $F_{0.95}(1,36)$

44 Example: AJPH, July 1999; 89:

46 [Figure: two panels, Men and Women, each plotting percentage with poor health against lifetime SES score]

47 Socioeconomic Environment and Adult Health Example [Table: for Men and for Women, columns X, X², Y, Y², XY, with summary rows n, mean, and SD]
X: Lifetime socioeconomic status (SES) score
Y: Percentage with poor health

48 Socioeconomic Environment and Adult Health Example [Table: for Men and for Women, the quantities SS(x), SS(y), SS(xy), b, a, r, SS(Reg), SS(Res), SS(Total) and the fitted lines $\hat{y}_M$ and $\hat{y}_W$; the legible entries are b = 1.38 for men and b = 1.56 for women]

49 Socioeconomic Environment and Adult Health Example (the steps for women are the same as for men)
1. The hypothesis: $H_0: \beta = 0$ vs. $H_1: \beta \neq 0$
2. The assumptions: Random samples, x measured without error, y normally distributed for each level of x
3. The α-level: α = 0.05
4. The test statistic: ANOVA
5. The rejection region: Reject $H_0: \beta = 0$ if $F = MS(\text{Regression}) / MS(\text{Residual}) > F_{0.95}(1, n-2)$

50 Regression ANOVA: Socioeconomic Environment and Adult Health Example
6. The result: [ANOVA tables for Men and for Women: rows Regression, Residual, Total; columns df, SS, MS, F]
7. The conclusion: Reject $H_0: \beta = 0$, since $F > F_{0.95}(1,11)$


Multiple Regression. More Hypothesis Testing. More Hypothesis Testing The big question: What we really want to know: What we actually know: We know: Multiple Regression Ψ320 Ainsworth More Hypothesis Testing What we really want to know: Is the relationship in the population we have selected between X & Y strong enough that we can use the relationship

More information

Lecture 3: Inference in SLR

Lecture 3: Inference in SLR Lecture 3: Inference in SLR STAT 51 Spring 011 Background Reading KNNL:.1.6 3-1 Topic Overview This topic will cover: Review of hypothesis testing Inference about 1 Inference about 0 Confidence Intervals

More information

Test of Convergence in Agricultural Factor Productivity: A Semiparametric Approach

Test of Convergence in Agricultural Factor Productivity: A Semiparametric Approach Test of Convergence in Agricultural Factor Productivity: A Semiparametric Approach Krishna P. Paudel, Louisiana State University and LSU Agricultural Center Mahesh Pandit, Louisiana State University and

More information

Analysis of Bivariate Data

Analysis of Bivariate Data Analysis of Bivariate Data Data Two Quantitative variables GPA and GAES Interest rates and indices Tax and fund allocation Population size and prison population Bivariate data (x,y) Case corr&reg 2 Independent

More information

Correlation Analysis

Correlation Analysis Simple Regression Correlation Analysis Correlation analysis is used to measure strength of the association (linear relationship) between two variables Correlation is only concerned with strength of the

More information

Six Sigma Black Belt Study Guides

Six Sigma Black Belt Study Guides Six Sigma Black Belt Study Guides 1 www.pmtutor.org Powered by POeT Solvers Limited. Analyze Correlation and Regression Analysis 2 www.pmtutor.org Powered by POeT Solvers Limited. Variables and relationships

More information

Chapter Goals. To understand the methods for displaying and describing relationship among variables. Formulate Theories.

Chapter Goals. To understand the methods for displaying and describing relationship among variables. Formulate Theories. Chapter Goals To understand the methods for displaying and describing relationship among variables. Formulate Theories Interpret Results/Make Decisions Collect Data Summarize Results Chapter 7: Is There

More information

Final Exam - Solutions

Final Exam - Solutions Ecn 102 - Analysis of Economic Data University of California - Davis March 19, 2010 Instructor: John Parman Final Exam - Solutions You have until 5:30pm to complete this exam. Please remember to put your

More information

Correlation and Linear Regression

Correlation and Linear Regression Correlation and Linear Regression Correlation: Relationships between Variables So far, nearly all of our discussion of inferential statistics has focused on testing for differences between group means

More information

Draft Proof - Do not copy, post, or distribute. Chapter Learning Objectives REGRESSION AND CORRELATION THE SCATTER DIAGRAM

Draft Proof - Do not copy, post, or distribute. Chapter Learning Objectives REGRESSION AND CORRELATION THE SCATTER DIAGRAM 1 REGRESSION AND CORRELATION As we learned in Chapter 9 ( Bivariate Tables ), the differential access to the Internet is real and persistent. Celeste Campos-Castillo s (015) research confirmed the impact

More information

Analysing data: regression and correlation S6 and S7

Analysing data: regression and correlation S6 and S7 Basic medical statistics for clinical and experimental research Analysing data: regression and correlation S6 and S7 K. Jozwiak k.jozwiak@nki.nl 2 / 49 Correlation So far we have looked at the association

More information

Objectives. 2.3 Least-squares regression. Regression lines. Prediction and Extrapolation. Correlation and r 2. Transforming relationships

Objectives. 2.3 Least-squares regression. Regression lines. Prediction and Extrapolation. Correlation and r 2. Transforming relationships Objectives 2.3 Least-squares regression Regression lines Prediction and Extrapolation Correlation and r 2 Transforming relationships Adapted from authors slides 2012 W.H. Freeman and Company Straight Line

More information

REVIEW 8/2/2017 陈芳华东师大英语系

REVIEW 8/2/2017 陈芳华东师大英语系 REVIEW Hypothesis testing starts with a null hypothesis and a null distribution. We compare what we have to the null distribution, if the result is too extreme to belong to the null distribution (p

More information

Ch 13 & 14 - Regression Analysis

Ch 13 & 14 - Regression Analysis Ch 3 & 4 - Regression Analysis Simple Regression Model I. Multiple Choice:. A simple regression is a regression model that contains a. only one independent variable b. only one dependent variable c. more

More information

MGEC11H3Y L01 Introduction to Regression Analysis Term Test Friday July 5, PM Instructor: Victor Yu

MGEC11H3Y L01 Introduction to Regression Analysis Term Test Friday July 5, PM Instructor: Victor Yu Last Name (Print): Solution First Name (Print): Student Number: MGECHY L Introduction to Regression Analysis Term Test Friday July, PM Instructor: Victor Yu Aids allowed: Time allowed: Calculator and one

More information

Correlation and simple linear regression S5

Correlation and simple linear regression S5 Basic medical statistics for clinical and eperimental research Correlation and simple linear regression S5 Katarzyna Jóźwiak k.jozwiak@nki.nl November 15, 2017 1/41 Introduction Eample: Brain size and

More information

Further Concepts in Statistics

Further Concepts in Statistics Appendix D Further Concepts in Statistics D1 Appendix D Further Concepts in Statistics Stem-and-Leaf Plots Histograms and Frequency Distributions Line Graphs Choosing an Appropriate Graph Scatter Plots

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression ST 430/514 Recall: A regression model describes how a dependent variable (or response) Y is affected, on average, by one or more independent variables (or factors, or covariates)

More information

Inference for Regression Simple Linear Regression

Inference for Regression Simple Linear Regression Inference for Regression Simple Linear Regression IPS Chapter 10.1 2009 W.H. Freeman and Company Objectives (IPS Chapter 10.1) Simple linear regression p Statistical model for linear regression p Estimating

More information

: The model hypothesizes a relationship between the variables. The simplest probabilistic model: or.

: The model hypothesizes a relationship between the variables. The simplest probabilistic model: or. Chapter Simple Linear Regression : comparing means across groups : presenting relationships among numeric variables. Probabilistic Model : The model hypothesizes an relationship between the variables.

More information

Ch. 16: Correlation and Regression

Ch. 16: Correlation and Regression Ch. 1: Correlation and Regression With the shift to correlational analyses, we change the very nature of the question we are asking of our data. Heretofore, we were asking if a difference was likely to

More information

Mathematics for Economics MA course

Mathematics for Economics MA course Mathematics for Economics MA course Simple Linear Regression Dr. Seetha Bandara Simple Regression Simple linear regression is a statistical method that allows us to summarize and study relationships between

More information

using the beginning of all regression models

using the beginning of all regression models Estimating using the beginning of all regression models 3 examples Note about shorthand Cavendish's 29 measurements of the earth's density Heights (inches) of 14 11 year-old males from Alberta study Half-life

More information

Stat 135 Fall 2013 FINAL EXAM December 18, 2013

Stat 135 Fall 2013 FINAL EXAM December 18, 2013 Stat 135 Fall 2013 FINAL EXAM December 18, 2013 Name: Person on right SID: Person on left There will be one, double sided, handwritten, 8.5in x 11in page of notes allowed during the exam. The exam is closed

More information

STAT 350 Final (new Material) Review Problems Key Spring 2016

STAT 350 Final (new Material) Review Problems Key Spring 2016 1. The editor of a statistics textbook would like to plan for the next edition. A key variable is the number of pages that will be in the final version. Text files are prepared by the authors using LaTeX,

More information

Data Visualization (CIS 468)

Data Visualization (CIS 468) Data Visualization (CIS 468) Tables & Maps Dr. David Koop Discriminability What is problematic here? File vtkdatasetreader PythonSource vtkimageclip vtkimagedatageometryfilter vtkimageresample vtkimagereslice

More information

Analysis of the USDA Annual Report (2015) of Animal Usage by Research Facility. July 4th, 2017

Analysis of the USDA Annual Report (2015) of Animal Usage by Research Facility. July 4th, 2017 Analysis of the USDA Annual Report (2015) of Animal Usage by Research Facility July 4th, 2017 Author's information: Jorge Sigler, Catherine Perry, Amanda Gray, James Videle For inquiries contact James

More information

9 Correlation and Regression

9 Correlation and Regression 9 Correlation and Regression SW, Chapter 12. Suppose we select n = 10 persons from the population of college seniors who plan to take the MCAT exam. Each takes the test, is coached, and then retakes the

More information

R 2 and F -Tests and ANOVA

R 2 and F -Tests and ANOVA R 2 and F -Tests and ANOVA December 6, 2018 1 Partition of Sums of Squares The distance from any point y i in a collection of data, to the mean of the data ȳ, is the deviation, written as y i ȳ. Definition.

More information

Introduction to Mathematical Statistics and Its Applications Richard J. Larsen Morris L. Marx Fifth Edition

Introduction to Mathematical Statistics and Its Applications Richard J. Larsen Morris L. Marx Fifth Edition Introduction to Mathematical Statistics and Its Applications Richard J. Larsen Morris L. Marx Fifth Edition Pearson Education Limited Edinburgh Gate Harlow Essex CM20 2JE England and Associated Companies

More information

STA 108 Applied Linear Models: Regression Analysis Spring Solution for Homework #6

STA 108 Applied Linear Models: Regression Analysis Spring Solution for Homework #6 STA 8 Applied Linear Models: Regression Analysis Spring 011 Solution for Homework #6 6. a) = 11 1 31 41 51 1 3 4 5 11 1 31 41 51 β = β1 β β 3 b) = 1 1 1 1 1 11 1 31 41 51 1 3 4 5 β = β 0 β1 β 6.15 a) Stem-and-leaf

More information

Estimating Dynamic Games of Electoral Competition to Evaluate Term Limits in U.S. Gubernatorial Elections: Online Appendix

Estimating Dynamic Games of Electoral Competition to Evaluate Term Limits in U.S. Gubernatorial Elections: Online Appendix Estimating Dynamic Games of Electoral Competition to Evaluate Term Limits in U.S. Gubernatorial Elections: Online ppendix Holger Sieg University of Pennsylvania and NBER Chamna Yoon Baruch College I. States

More information

CHAPTER EIGHT Linear Regression

CHAPTER EIGHT Linear Regression 7 CHAPTER EIGHT Linear Regression 8. Scatter Diagram Example 8. A chemical engineer is investigating the effect of process operating temperature ( x ) on product yield ( y ). The study results in the following

More information