SIMPLE LINEAR REGRESSION STAT 251


OUTLINE
Relationships in Data
- The Beginning
- Scatterplots
- Correlation
- The Least Squares Line
Cautions
- Association vs. Causation
- Extrapolation
- Outliers
Inference: Simple Linear Regression
- The Theoretical Model
- Testing and Estimating the Slope Coefficient
- Confidence Intervals and Prediction Intervals
- Verifying Assumptions

EXPLORING RELATIONSHIPS
Up to now we've mostly concentrated on exploring one variable at a time. In ANOVA and hypothesis testing, we determined whether or not there is a relationship, but we've yet to explore how two variables relate. Here, we address the question of exploring relationships between two quantitative variables. We'll use an ongoing example to illustrate the concepts of data exploration.

EXPLORING LINEAR RELATIONSHIPS IN DATA

Television, Physicians, and Life Expectancy
Columns: Country | Life expectancy | People/TV | People/physician
Countries: Argentina, Bangladesh, Brazil, Canada, China, Colombia, Egypt, Ethiopia, France, Germany, India, Indonesia, Iran, Italy, Japan, Kenya, Korea (North), Korea (South), Mexico, Morocco, Myanmar (Burma), Pakistan, Peru, Philippines, Poland, Romania, Russia, South Africa, Spain, Sudan, Taiwan, Tanzania (52.5*), Thailand, Turkey, Ukraine, United Kingdom, United States, Venezuela, Vietnam, Zaire (54*)
SOURCE: _The World Almanac and Book of Facts 1993_ (1993), New York: Pharos Books.

OUR GOAL
We hope to establish a relationship between two variables. This relationship will allow us to make predictions about the value of one variable based on the observed value of the other.
Explanatory variable (X): The explanatory (also called predictor) variable is used to try to explain or predict the other variable. E.g., the number of physicians in a country could be used to predict the life expectancy of its citizens.
Response variable (Y): The response variable is the variable we are trying to predict. E.g., life expectancy.

USING A GRAPHICAL DEVICE TO VISUALIZE THE RELATIONSHIP
Is there a relationship between the number of physicians in a country and its citizens' life expectancy? Would making histograms for both variables help us answer this question? Why? A scatterplot helps us to visualize the relationship between two quantitative variables and determine if there is any association between the two variables.

MAKING A SCATTERPLOT
We first identify our explanatory, or independent, variable (X), which will be conveniently plotted on the x-axis, and our response, or dependent, variable (Y), conveniently plotted on the y-axis. Secondly, we collect and plot each point $(x_i, y_i)$. These are the observations for case i. In our example, $(x_1, y_1)$ are the number of physicians and the life expectancy for Argentina. The origin does not have to be included in the plot.


PATTERNS OF SCATTERPLOTS
When looking for patterns on a scatterplot, there are generally four things we should think about: the direction, the form, the scatter of the points, and whether there are any outliers.

1. DIRECTION
The direction is positive if the x and y values tend to go in the same direction, i.e., when x is low y is generally low, and when x is high y is generally high. The direction is negative if the x and y values tend to go in opposite directions, i.e., when x is low y is generally high, and when x is high y is generally low.

2. FORM
- Linear form
- Non-linear or curved form
- No clear form

3. SCATTER
When there is a clear form, the scatter of the points about the line or curve indicates the strength of the association. Are the points tight to their form, or are they loose?

4. OUTLIERS
Scatterplots will often reveal outliers if they are present.

TRANSFORMATIONS
It's much simpler to deal with linear relationships than curved ones (for one, it allows us to use correlation). When a scatterplot has a nonlinear form, we can transform one or both variables to render it linear. In our example, the relationship is not linear. The most common transformation is taking the log of one of the variables. On the following page, the log of X, the number of citizens per physician, was taken.
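As a rough illustration of why a log transformation can help, the sketch below compares the strength of linear association before and after taking the log of X. The numbers are made up for illustration (they are not the almanac data), and the helper name `correlation` is an assumption of this sketch.

```python
import math

def correlation(xs, ys):
    """Sample correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical values chosen so that y is exactly linear in log10(x).
people_per_physician = [10.0, 100.0, 1000.0, 10000.0]
life_expectancy = [75.0, 70.0, 65.0, 60.0]

log_x = [math.log10(x) for x in people_per_physician]

r_raw = correlation(people_per_physician, life_expectancy)  # curved relationship
r_log = correlation(log_x, life_expectancy)                 # linear after the log
print(round(r_raw, 3), round(r_log, 3))
```

With these made-up values the association is much closer to a perfect line on the log scale than on the raw scale.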


PLOTTING GUIDELINES
- 1 quantitative variable: histogram, stem-and-leaf, boxplot
- 1 categorical variable: bar chart, pie chart
- 1 quantitative, 1 categorical: side-by-side boxplots
- 2 categorical: bar charts, stacked bar charts
- 2 quantitative: scatterplots

CORRELATION
As with module 1, we've first discussed how to use graphical tools to explore the relationship between two quantitative variables, and now we follow this with a numerical measure. Scatterplots are useful for displaying a relationship, but are inherently vague. Correlation measures the strength of the linear association between two quantitative variables, and is denoted by r (sample) or ρ (population).

CALCULATING CORRELATION: EQUATION
You won't be expected to calculate these by hand!
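Since the calculation is normally left to software, here is a minimal sketch of one common way to compute the sample correlation: the sum of products of z-scores divided by n - 1. This is a standard textbook form and may not be the exact expression shown on the slide; the data are hypothetical.

```python
import math

def correlation(xs, ys):
    """Sample correlation r: sum of z-score products divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    z_products = (((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
                  for x, y in zip(xs, ys))
    return sum(z_products) / (n - 1)

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
r = correlation(x, y)
print(round(r, 4))
```

Because the points lie almost exactly on a rising line, r comes out close to 1.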

PROPERTIES
- The direction of the association dictates the sign of the correlation, i.e., if the association is positive, then r is positive.
- r is always between -1 and 1.
- Correlations of 1 or -1 indicate perfect positive or negative linear association.
- r near zero indicates very weak or absent linear association.
- x and y are interchangeable.
- Linear transformations of the variables (such as a change of units) do not affect correlation.
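Two of these properties can be checked numerically. The sketch below uses hypothetical oven temperatures and cooking times, converts the temperatures from Fahrenheit to Celsius, and also swaps the roles of x and y. One caveat worth keeping in mind: a change of units with a positive multiplier leaves r unchanged, while multiplying a variable by a negative number flips the sign of r.

```python
import math

def correlation(xs, ys):
    """Sample correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical measurements, for illustration only.
temps_f = [300.0, 325.0, 350.0, 375.0, 400.0]
minutes = [16.0, 14.5, 13.8, 12.1, 11.9]

# Linear change of units: Celsius = (5/9) * Fahrenheit - 160/9.
temps_c = [(5 / 9) * f - 160 / 9 for f in temps_f]

r_fahrenheit = correlation(temps_f, minutes)
r_celsius = correlation(temps_c, minutes)   # same value as r_fahrenheit
r_swapped = correlation(minutes, temps_f)   # x and y are interchangeable
```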


GUIDELINES
- $|r| \geq 0.90$: very strong association
- $0.90 > |r| \geq 0.70$: fairly strong association
- $0.70 > |r| \geq 0.50$: somewhat weak association
- $0.50 > |r| \geq 0.30$: very weak association
- $|r| < 0.30$: no/little association
* Note that we are talking about linear association.


CAREFUL
- Correlation is used on quantitative variables.
- It is a measure of linear association.
- The scatterplot should indicate/support linear association.
- Outliers can severely distort the correlation.

For the scatterplots below, assign what you think is the appropriate correlation for each, choosing from the list of numbers below:
r: -1, -0.95, -0.77, -0.55, -0.35, 0, 0, 0.35, 0.50, 0.75, 0.95, 1

Q1 A fellow researcher is exploring the relationships between various variables. She informs you that the correlation between variables A and B is ___. You ask for the plot to confirm linearity and she says: "With a correlation so high it has to be linear." Do you agree with this statement?
A. Yes
B. No

Q2 Betty Crocker was exploring the relationship between cooking time and temperature for brownies. She works in degrees Fahrenheit and found a sample correlation of -0.77. We would like to use these findings, but we are working in degrees Celsius, where Celsius = (5/9)X - 160/9 and X is the temperature in degrees Fahrenheit. What correlation should we report?
A. ___
B. (5/9)(-0.77)
C. (5/9)(-0.77) - 160/9
D. ___

Q3 In the Betty Crocker example, she found a correlation of -0.77. Which of the following statements is correct?
A. If we increase the temperature, the cooking time will decrease.
B. If we increase the temperature, we expect the cooking time will decrease.
C. If we increase the temperature, the cooking time will increase.
D. If we increase the temperature, we expect the cooking time will increase.

LINEAR REGRESSION: FITTING A LINE TO THE DATA

FITTING A LINE
Once a linear relation is established, we seek a numerical quantification of the linear relationship between two quantitative variables. We categorized variables as being explanatory (predictor) variables or response variables, and in these chapters we discuss how to go about predicting. Linear regression is a method of fitting a straight line to a scatterplot and predicting the response (y) using the explanatory variable (x).

When trying to fit a line to a scatterplot, there are many lines which are acceptable candidates. In order to properly discuss how we choose the best-fitting line, we'll need to develop some vocabulary first.

THE MODEL
The model fits a straight line to the data which can be used to make predictions on the response. Mathematically, the model is
$\hat{y} = b_0 + b_1 x$
where $b_0$ is the intercept and $b_1$ is the slope. The slope tells us how big a change in y to expect for a unit increase in x. If it's positive, y tends to increase with x. Note: the intercept can sometimes be meaningless.

RESIDUALS
Once we have our regression line (from the model), for any value of x, we can predict the value of y by using the corresponding y-value on the line. These predicted values are called fitted values and are denoted by $\hat{y}$. For an observed value of y, we can obtain the difference between the observation and the predicted value, which we call the residual (e): $e = y - \hat{y}$.

It is apparent from above that the smaller the residuals are, the better our model is at making predictions.

MINIMIZE THE SQUARED DEVIATIONS?
The residuals are both positive and negative, so we can't minimize their sum directly: there are infinitely many lines whose residuals sum to zero. Recall that when calculating variance, we faced the same issue and used the sum of squared differences to quantify the spread in the data. The same trick is used again: it is the sum of squared residuals, or deviations from the line, which we minimize. Only one line leads to the minimization of the squared residuals. We call this line the least squares line.

THE LEAST SQUARES LINE
Only one line leads to the minimization of the squared residuals. We call this line the least squares line. The regression line and the least squares line are the same line.
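To make the idea concrete, the following sketch computes the least squares slope and intercept from the usual closed-form expressions and checks that nudging the line away from them increases the sum of squared residuals. The data and function names are hypothetical, chosen only for illustration.

```python
def least_squares_line(xs, ys):
    """Return (b0, b1) minimizing the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx                # slope
    b0 = mean_y - b1 * mean_x     # intercept
    return b0, b1

def sum_squared_residuals(xs, ys, b0, b1):
    """Sum of squared differences between observed and fitted values."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = least_squares_line(x, y)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

sse_best = sum_squared_residuals(x, y, b0, b1)
sse_other = sum_squared_residuals(x, y, b0 + 0.5, b1 - 0.1)  # some other line
```

The residuals of the least squares line sum to zero (up to rounding), but so do those of many other lines; it is the squared criterion that singles out one line.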


EXAMPLE
An experiment was designed for the Department of Materials Engineering to study hydrogen embrittlement properties based on electrolytic hydrogen pressure measurements. The solution used was 0.1 N NaOH, the material being a certain type of stainless steel. The cathodic charging current density was controlled and varied at four levels. Here are some summary statistics (Mean, Sample SD, Correlation) for the two variables: Charging Current Density (mA/cm^2) and Effective Hydrogen Pressure (atm).

Find the regression line.

MORE QUESTIONS
- What would you predict the pressure to be if the current was 2.1 mA/cm^2? What about 4.0 mA/cm^2?
- For every 1 mA/cm^2 increase in current, what is the expected increase in effective hydrogen pressure?
- On the 16th trial, the current was set to 1.5 and the pressure was measured at ___. What would the residual for this observation be?

SOME REMARKS
- The regression line goes through the mean-mean point $(\bar{x}, \bar{y})$.
- Interpreting the slope $b_1$: On average, an increase of 1 $SD_x$ in X is associated with a change of $r \times SD_y$ in Y. So in our example, for every 1.187 mA/cm^2 shift in current, we have an expected shift of (0.929)(___) in pressure.
- Interpreting the intercept $b_0$: the predicted value for x = 0.
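These remarks suggest a way to get the line from summary statistics alone: the slope is $r \times SD_y / SD_x$ and the intercept follows from the line passing through the mean-mean point. The sketch below uses made-up summary values (not the values from the hydrogen example) and hypothetical function names.

```python
def line_from_summary(mean_x, sd_x, mean_y, sd_y, r):
    """Least squares line from means, SDs, and the correlation."""
    b1 = r * sd_y / sd_x          # change of r * SD_y per SD_x
    b0 = mean_y - b1 * mean_x     # forces the line through (mean_x, mean_y)
    return b0, b1

def predict(b0, b1, x):
    """Fitted value on the line at x."""
    return b0 + b1 * x

# Hypothetical summary statistics, for illustration only.
b0, b1 = line_from_summary(mean_x=2.0, sd_x=1.0, mean_y=50.0, sd_y=10.0, r=0.9)

fitted_at_mean = predict(b0, b1, 2.0)   # equals mean_y
observed_y = 40.0                       # hypothetical observation at x = 1.5
residual = observed_y - predict(b0, b1, 1.5)  # observed minus fitted
```

With these made-up numbers the slope is 9 and the intercept is 32, so the hypothetical observation at x = 1.5 has a residual of 40 - 45.5 = -5.5.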

CAUTIONS: CAUSATION, EXTRAPOLATION AND OUTLIERS

CAREFUL: ASSOCIATION IS NOT CAUSATION
- To make predictions, we only need association, not causation.
- Observing strong association does not imply causation.
- Causation leads to association, but association does not necessarily lead to causation.
- Association may be purely due to luck.

HOW WERE THE DATA COLLECTED?
There may be an underlying variable, called a lurking variable, which is associated with both x and y. The way the data are obtained dictates whether we can infer causation:
- An experiment removes the influence of other variables, so we can conclude causation.
- An observational study is susceptible to the influence of other variables.

EXAMPLES
Television, Physicians, and Life Expectancy: the country table of life expectancy, people per TV, and people per physician. SOURCE: _The World Almanac and Book of Facts 1993_ (1993), New York: Pharos Books.

EXTRAPOLATION
Extrapolation is when we try to predict the response variable for a value of the explanatory variable which is outside the range of our observed explanatory values. Interpolation is when we try to predict within that range. Extrapolating makes the assumption that the relationship between the two variables continues beyond the limits of this range. Often this gives misleading predictions, as this assumption doesn't hold. Predicting the future through regression is always extrapolating.

EXAMPLES
Some data on weight and age of girls between the ages of 2 and 10 were collected. The relationship is linear and very strong. The model which arises from the data is: Weight = ___ + ___ (age). So one could interpolate the weight of an average 5-year-old to be ___. But were we to trust this model to go on beyond the range of ages here, what would we predict the weight of a 40-year-old to be? Does this make sense?

The danger of making predictions outside the range of the observed x values is that the linear relationship for the observed data may not hold anymore once we leave the range.

BEWARE OF INFLUENTIAL POINTS
We've already discussed how problematic outliers can be in the context of summary statistics (e.g., mean and variance). Outliers are also problematic in regression. We'll define three types of outliers and how they differ in their effect on the regression line.
- y-outliers
- x-outliers


INFLUENTIAL POINTS
We call an observation influential if omitting it from the analysis will largely change the model. If a high-leverage point or a y-outlier is a model outlier, then it is an influential point. When an outlier is present, one should fit two models: one with and one without the potentially influential point. The outlier shouldn't be omitted without justification.

THE EFFECT OF A NON-INFLUENTIAL OUTLIER
Outliers which aren't model outliers can still affect the regression. Including the outlier can, in some cases, raise the $R^2$. (Figure panels: $R^2 = 0.263$ and $R^2 =$ ___.)

Q4 The next step Betty Crocker took was to use her data to estimate a regression line. What is the response variable here?
A. Temperature
B. Cooking Time

Q5 The estimated regression line was: Y = 25 - (1/30)X. The best interpretation of this slope would be:
A. For every degree we raise the temperature, we reduce the cooking time by 1/30 of a minute.
B. The cooking time decreases as we increase the temperature.
C. For every degree we raise the temperature, we expect to reduce the cooking time by 1/30 of a minute.
D. On average, increasing the temperature by one degree will decrease the cooking time by 1/30 of a second.

Q6 The estimated regression line found by Betty Crocker is: Y = 25 - (1/30)X. She then cooked brownies in 10 minutes. What temperature do you predict she cooked at?
A. 25 - (1/30) x 10 = 24.67
B. (25 - 10) x 30 = 450
C. (25 + 10) x 30 = 1050
D. Can't tell from this information

THE STOCHASTIC MODEL AND ASSUMPTIONS

BRINGING THE STATISTICS TO REGRESSION
The least squares line is found on purely mathematical grounds. In order to make statistical inference, we expand the model slightly:
$Y = \beta_0 + \beta_1 x + \varepsilon$
So for a single response we have
$Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$, where $\varepsilon_i \sim N(0, \sigma^2)$
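One way to get a feel for this model is to simulate from it and see how the least squares estimates $b_0$ and $b_1$ land near, but not exactly on, the true $\beta_0$ and $\beta_1$. The sketch below is an illustration under assumed parameter values ($\beta_0 = 5$, $\beta_1 = 2$, $\sigma = 1$) and an arbitrary random seed; none of these come from the course material.

```python
import random

def simulate_responses(xs, beta0, beta1, sigma, rng):
    """Draw Y_i = beta0 + beta1 * x_i + eps_i with eps_i ~ N(0, sigma^2)."""
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]

def least_squares_line(xs, ys):
    """Return the least squares estimates (b0, b1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1

rng = random.Random(251)                 # fixed seed so the run is repeatable
xs = [i * 0.05 for i in range(200)]      # 200 x-values from 0 to 9.95
ys = simulate_responses(xs, beta0=5.0, beta1=2.0, sigma=1.0, rng=rng)

b0, b1 = least_squares_line(xs, ys)
print(round(b0, 2), round(b1, 2))        # estimates, close to 5 and 2
```

Rerunning with a different seed gives slightly different estimates, which is exactly the sampling variability that the inference slides quantify.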

FIGURE 17.3 DISTRIBUTION OF Y GIVEN X

THE ASSUMPTIONS
There are 4 assumptions made in simple linear regression:
- The errors are Normally distributed.
- The variance of the errors is constant.
- The observations are independent.
- The relation is linear.
Note: The assumptions and the model go hand in hand:
- The observations ($y_i$) are Normally distributed.
- The variance is constant (homoscedasticity).
- The relationship is linear.

DIAGNOSTICS
- Independence: determined through design, not by graphical investigation.
- Normality: verified using a histogram or a QQ plot of the residuals.
- Homoscedasticity (constant variance): plot the residuals against the fitted values. Look out for patterns, as they indicate that the assumptions are not met.
- Linearity: verified using a scatterplot or a residual plot.
- Outliers: we should also look at the scatterplots for outliers. These may be influential points and should be investigated.


INFERENCE PART 1: INFERENCE ON THE MODEL

WHEN GIVEN A MODEL
Suppose someone collected data and estimated a model. Without the data, we have no idea how good the model is. Here are questions we may want to ask about the model:
- How good is the model at predicting? How strong is the relationship?
- Should we use the explanatory variable to estimate the mean of the response? Is the slope significant?
- How can I construct a confidence interval for the mean of the response?
- How can I create a prediction interval for an individual meeting certain criteria?

ANSWERS
- How good is the model at predicting? Coefficient of Determination
- Is the slope significant? t-test for the slope
- How can I construct a confidence interval for the mean of the response? Confidence Interval for the Expected Value of y
- How can I create a prediction interval for an individual meeting certain criteria? Prediction Interval

SO YOU'VE DETERMINED THAT THE MODEL IS LINEAR
Having determined that the model is a line and not just a mean, we want to know: how good at making predictions is our model? Recall that correlation is a measure of the linear association between two variables. The stronger the association, the better the model will be at making predictions.

COEFFICIENT OF DETERMINATION
The coefficient of determination is simply a re-expression of the correlation which lends itself better to the question at hand:
$R^2 = r^2$, i.e., Coefficient of Determination = Correlation^2
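A common alternative way to write the coefficient of determination is as the share of variation in y accounted for by the line, $1 - SSE/SST$. The sketch below, on hypothetical data, checks numerically that this agrees with the squared correlation for a simple linear regression; the function name is an assumption of this sketch.

```python
import math

def r_and_r_squared(xs, ys):
    """Return (r, R^2), with R^2 computed as 1 - SSE/SST."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sst = sum((y - mean_y) ** 2 for y in ys)       # total variation in y
    r = sxy / math.sqrt(sxx * sst)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))  # leftover variation
    return r, 1.0 - sse / sst

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [3.2, 3.9, 6.1, 6.8, 9.4, 9.9]

r, r_squared = r_and_r_squared(x, y)
```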


QUESTIONS
- Does a higher coefficient of determination imply a better model?
- If we reject the null hypothesis of the ANOVA test, do we also reject the t-test for the slope?
- What if we fail to reject the ANOVA, do we also fail to reject the t-test?

TESTING THE SLOPE
As before, ANOVA is a generalization of a t-test. We can use a t-test to test the slope. Compared to ANOVA, we can test for a specific side, not only for whether there is a slope.
$H_0: \beta_1 = 0$
$H_A: \beta_1 \neq 0$ or $H_A: \beta_1 < 0$ or $H_A: \beta_1 > 0$
The conditions required for this test are those required for simple linear regression. What are they?

TESTING THE SLOPE 2
If these are met, then the sampling distribution of $b_1$:
- is Normal
- has mean $\beta_1$
- has standard error $SE(b_1)$

TESTING THE SLOPE 3
The test follows the same form that all our t-tests have followed:
$t = \frac{b_1 - 0}{SE(b_1)}$
with n - 2 degrees of freedom. We can also construct a confidence interval for the slope.
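The slope test can be sketched in a few lines. The version below assumes the standard textbook expressions $s_e = \sqrt{SSE/(n-2)}$ and $SE(b_1) = s_e / \sqrt{\sum (x_i - \bar{x})^2}$, which may be written differently on the slides. The data are hypothetical, and the critical value 2.306 is the tabulated 97.5th percentile of the t distribution with 8 degrees of freedom (n = 10).

```python
import math

def slope_inference(xs, ys, t_star):
    """Return (b1, SE(b1), t statistic, confidence interval) for the slope."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s_e = math.sqrt(sse / (n - 2))        # residual standard deviation
    se_b1 = s_e / math.sqrt(sxx)          # standard error of the slope
    t_stat = (b1 - 0.0) / se_b1           # tests H0: beta_1 = 0
    ci = (b1 - t_star * se_b1, b1 + t_star * se_b1)
    return b1, se_b1, t_stat, ci

# Hypothetical data with n = 10, so df = n - 2 = 8.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
y = [2.3, 4.1, 5.8, 8.4, 9.7, 12.2, 13.9, 16.1, 18.3, 19.8]

T_STAR_DF8 = 2.306  # two-sided 95% critical value from a t table, df = 8
b1, se_b1, t_stat, (lower, upper) = slope_inference(x, y, T_STAR_DF8)
significant = abs(t_stat) > T_STAR_DF8
```

For a two-sided test at the 5% level, rejecting $H_0$ is equivalent to the 95% confidence interval for the slope excluding zero, which is why the two checks in this sketch always agree.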

Q7 A suspicious Elf measured the relation between the value of toys given at Christmas and the degree of goodness of children (don't ask how). He obtained the following 95% confidence interval for the slope: [-1.2, 5.6]. Should we use degrees of goodness to predict the value of gifts?
A. Yes, the slope appears to be positive.
B. Yes, it's better than using nothing.
C. No, the slope is not significant.

INFERENCE USING THE MODEL

HOW THIS DIFFERS
The inference we saw in the last section pertained to the model itself:
- Is the mean of the response variable Y a constant, or is it a conditional mean, conditional on the value of the explanatory variable?
- If it is conditional, how much are we gaining in predictive power by using a conditional mean instead of a constant?
In this section, we look to infer on:
- The conditional population mean
- The result of an individual within a conditional population

Figure 17.3 Distribution of y Given x

CONFIDENCE INTERVALS
The book talks about Expected Value, which is just a fancy word for mean. In this case it's a conditional mean. Given a specific value of x, we can construct a confidence interval for that conditional mean.

PREDICTION INTERVALS
Given a specific value of x, we may be interested in predicting the behaviour of an individual rather than the mean. We have to change the interval slightly to account for the extra variability observed in individuals rather than means.
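The difference between the two intervals can be sketched numerically. The version below assumes the standard textbook forms: the standard error for the conditional mean at $x_0$ is $s_e\sqrt{1/n + (x_0 - \bar{x})^2 / \sum (x_i - \bar{x})^2}$, and the prediction interval adds 1 under the square root for the extra individual variability. The data, function name, and $x_0$ values are hypothetical; 2.306 is the tabulated t critical value for 95% confidence with 8 degrees of freedom.

```python
import math

def interval_estimates(xs, ys, x0, t_star):
    """Return (fitted value, CI for the mean, prediction interval) at x0."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s_e = math.sqrt(sse / (n - 2))
    fitted = b0 + b1 * x0
    distance_term = (x0 - mean_x) ** 2 / sxx
    se_mean = s_e * math.sqrt(1 / n + distance_term)        # conditional mean
    se_pred = s_e * math.sqrt(1 + 1 / n + distance_term)    # single individual
    ci = (fitted - t_star * se_mean, fitted + t_star * se_mean)
    pi = (fitted - t_star * se_pred, fitted + t_star * se_pred)
    return fitted, ci, pi

# Hypothetical data with n = 10, so df = 8 and t* = 2.306 for 95%.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
y = [2.3, 4.1, 5.8, 8.4, 9.7, 12.2, 13.9, 16.1, 18.3, 19.8]

fit_mid, ci_mid, pi_mid = interval_estimates(x, y, x0=5.5, t_star=2.306)
fit_far, ci_far, pi_far = interval_estimates(x, y, x0=12.0, t_star=2.306)
```

Both intervals are centred on the fitted value; the prediction interval is always wider, and both widen as $x_0$ moves away from the mean of the observed x values.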

Interval Estimates and Prediction Intervals

New example here and new problem after

EXAMPLE: OXYGEN DEMAND
One of the more challenging problems confronting the water pollution control field is presented by the tanning industry. Their wastes are chemically complex. We consider the experimental data obtained from 33 samples of chemically treated waste. The variables are:
- The percent reduction in total solids
- The percent reduction in chemical oxygen demand

Summary statistics for Solid Residue and Oxygen Demand: Mean, Sample SD, Correlation, SSE

EXERCISE
1. Estimate the regression line.
2. Construct a 95% confidence interval for the slope.
3. Construct a 95% confidence interval for the mean chemical oxygen demand of water with 32% solids reduction.
4. Construct a 95% prediction interval for water with 40% solids reduction.

Linear Regression. Linear Regression. Linear Regression. Did You Mean Association Or Correlation?

Linear Regression. Linear Regression. Linear Regression. Did You Mean Association Or Correlation? Did You Mean Association Or Correlation? AP Statistics Chapter 8 Be careful not to use the word correlation when you really mean association. Often times people will incorrectly use the word correlation

More information

Stochastic Analysis and Forecasts of the Patterns of Speed, Acceleration, and Levels of Material Stock Accumulation in Society

Stochastic Analysis and Forecasts of the Patterns of Speed, Acceleration, and Levels of Material Stock Accumulation in Society Stochastic Analysis and Forecasts of the Patterns of Speed, Acceleration, and Levels of Material Stock Accumulation in Society Supporting information Tomer Fishman a,*, Heinz Schandl a,b, and Hiroki Tanikawa

More information

Objectives. 2.3 Least-squares regression. Regression lines. Prediction and Extrapolation. Correlation and r 2. Transforming relationships

Objectives. 2.3 Least-squares regression. Regression lines. Prediction and Extrapolation. Correlation and r 2. Transforming relationships Objectives 2.3 Least-squares regression Regression lines Prediction and Extrapolation Correlation and r 2 Transforming relationships Adapted from authors slides 2012 W.H. Freeman and Company Straight Line

More information

appstats27.notebook April 06, 2017

appstats27.notebook April 06, 2017 Chapter 27 Objective Students will conduct inference on regression and analyze data to write a conclusion. Inferences for Regression An Example: Body Fat and Waist Size pg 634 Our chapter example revolves

More information

Chapter 8. Linear Regression. Copyright 2010 Pearson Education, Inc.

Chapter 8. Linear Regression. Copyright 2010 Pearson Education, Inc. Chapter 8 Linear Regression Copyright 2010 Pearson Education, Inc. Fat Versus Protein: An Example The following is a scatterplot of total fat versus protein for 30 items on the Burger King menu: Copyright

More information

appstats8.notebook October 11, 2016

appstats8.notebook October 11, 2016 Chapter 8 Linear Regression Objective: Students will construct and analyze a linear model for a given set of data. Fat Versus Protein: An Example pg 168 The following is a scatterplot of total fat versus

More information

Chapter 8. Linear Regression. The Linear Model. Fat Versus Protein: An Example. The Linear Model (cont.) Residuals

Chapter 8. Linear Regression. The Linear Model. Fat Versus Protein: An Example. The Linear Model (cont.) Residuals Chapter 8 Linear Regression Copyright 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Slide 8-1 Copyright 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Fat Versus

More information

Chapter 27 Summary Inferences for Regression

Chapter 27 Summary Inferences for Regression Chapter 7 Summary Inferences for Regression What have we learned? We have now applied inference to regression models. Like in all inference situations, there are conditions that we must check. We can test

More information

AP Statistics. Chapter 6 Scatterplots, Association, and Correlation

AP Statistics. Chapter 6 Scatterplots, Association, and Correlation AP Statistics Chapter 6 Scatterplots, Association, and Correlation Objectives: Scatterplots Association Outliers Response Variable Explanatory Variable Correlation Correlation Coefficient Lurking Variables

More information

Unit 6 - Introduction to linear regression

Unit 6 - Introduction to linear regression Unit 6 - Introduction to linear regression Suggested reading: OpenIntro Statistics, Chapter 7 Suggested exercises: Part 1 - Relationship between two numerical variables: 7.7, 7.9, 7.11, 7.13, 7.15, 7.25,

More information

, (1) e i = ˆσ 1 h ii. c 2016, Jeffrey S. Simonoff 1

, (1) e i = ˆσ 1 h ii. c 2016, Jeffrey S. Simonoff 1 Regression diagnostics As is true of all statistical methodologies, linear regression analysis can be a very effective way to model data, as along as the assumptions being made are true. For the regression

More information

Inferences for Regression

Inferences for Regression Inferences for Regression An Example: Body Fat and Waist Size Looking at the relationship between % body fat and waist size (in inches). Here is a scatterplot of our data set: Remembering Regression In

More information

STA Module 5 Regression and Correlation. Learning Objectives. Learning Objectives (Cont.) Upon completing this module, you should be able to:

STA Module 5 Regression and Correlation. Learning Objectives. Learning Objectives (Cont.) Upon completing this module, you should be able to: STA 2023 Module 5 Regression and Correlation Learning Objectives Upon completing this module, you should be able to: 1. Define and apply the concepts related to linear equations with one independent variable.

More information

Sociology 6Z03 Review I

Sociology 6Z03 Review I Sociology 6Z03 Review I John Fox McMaster University Fall 2016 John Fox (McMaster University) Sociology 6Z03 Review I Fall 2016 1 / 19 Outline: Review I Introduction Displaying Distributions Describing

More information

Analysing data: regression and correlation S6 and S7

Analysing data: regression and correlation S6 and S7 Basic medical statistics for clinical and experimental research Analysing data: regression and correlation S6 and S7 K. Jozwiak k.jozwiak@nki.nl 2 / 49 Correlation So far we have looked at the association

More information

2017 Source of Foreign Income Earned By Fund

2017 Source of Foreign Income Earned By Fund 2017 Source of Foreign Income Earned By Fund Putnam Emerging Markets Equity Fund EIN: 26-2670607 FYE: 08/31/2017 Statement Pursuant to 1.853-4: The fund is hereby electing to apply code section 853 for

More information

Chapter 7 Summary Scatterplots, Association, and Correlation

Chapter 7 Summary Scatterplots, Association, and Correlation Chapter 7 Summary Scatterplots, Association, and Correlation What have we learned? We examine scatterplots for direction, form, strength, and unusual features. Although not every relationship is linear,

More information

Learning Objectives. Math Chapter 3. Chapter 3. Association. Response and Explanatory Variables

Learning Objectives. Math Chapter 3. Chapter 3. Association. Response and Explanatory Variables ASSOCIATION: CONTINGENCY, CORRELATION, AND REGRESSION Chapter 3 Learning Objectives 3.1 The Association between Two Categorical Variables 1. Identify variable type: Response or Explanatory 2. Define Association

More information

Warm-up Using the given data Create a scatterplot Find the regression line

Warm-up Using the given data Create a scatterplot Find the regression line Time at the lunch table Caloric intake 21.4 472 30.8 498 37.7 335 32.8 423 39.5 437 22.8 508 34.1 431 33.9 479 43.8 454 42.4 450 43.1 410 29.2 504 31.3 437 28.6 489 32.9 436 30.6 480 35.1 439 33.0 444

More information

Basic Business Statistics 6 th Edition

Basic Business Statistics 6 th Edition Basic Business Statistics 6 th Edition Chapter 12 Simple Linear Regression Learning Objectives In this chapter, you learn: How to use regression analysis to predict the value of a dependent variable based

More information

MODELING. Simple Linear Regression. Want More Stats??? Crickets and Temperature. Crickets and Temperature 4/16/2015. Linear Model

MODELING. Simple Linear Regression. Want More Stats??? Crickets and Temperature. Crickets and Temperature 4/16/2015. Linear Model STAT 250 Dr. Kari Lock Morgan Simple Linear Regression SECTION 2.6 Least squares line Interpreting coefficients Cautions Want More Stats??? If you have enjoyed learning how to analyze data, and want to

More information

Chapter 7. Scatterplots, Association, and Correlation. Copyright 2010 Pearson Education, Inc.

Chapter 7. Scatterplots, Association, and Correlation. Copyright 2010 Pearson Education, Inc. Chapter 7 Scatterplots, Association, and Correlation Copyright 2010 Pearson Education, Inc. Looking at Scatterplots Scatterplots may be the most common and most effective display for data. In a scatterplot,

More information

Chapter 12 - Part I: Correlation Analysis

Chapter 12 - Part I: Correlation Analysis ST coursework due Friday, April - Chapter - Part I: Correlation Analysis Textbook Assignment Page - # Page - #, Page - # Lab Assignment # (available on ST webpage) GOALS When you have completed this lecture,

More information

Unit 6 - Simple linear regression

Unit 6 - Simple linear regression Sta 101: Data Analysis and Statistical Inference Dr. Çetinkaya-Rundel Unit 6 - Simple linear regression LO 1. Define the explanatory variable as the independent variable (predictor), and the response variable

More information

Canadian Imports of Honey

Canadian Imports of Honey of 0409000029 - Honey, natural, in containers of a weight > 5 kg, nes (Kilogram) Argentina 236,716 663,087 2,160,216 761,990 35.27% 202.09% /0 76,819 212,038 717,834 257,569 35.88% 205.69% /0 United States

More information

Chapter 9. Correlation and Regression

Chapter 9. Correlation and Regression Chapter 9 Correlation and Regression Lesson 9-1/9-2, Part 1 Correlation Registered Florida Pleasure Crafts and Watercraft Related Manatee Deaths 100 80 60 40 20 0 1991 1993 1995 1997 1999 Year Boats in

More information

Correlation & Simple Regression
