Estadística II Chapter 4: Simple linear regression


2 Chapter 4. Simple linear regression
Contents
- Objectives of the analysis.
- Model specification.
- Least squares estimators (LSE): construction and properties.
- Statistical inference: for the slope; for the variance.
- Prediction for a new observation (the actual value or the average value).

3 Chapter 4. Simple linear regression
Learning objectives
- Ability to construct a model to describe the influence of X on Y.
- Ability to find estimates.
- Ability to construct confidence intervals and carry out tests of hypothesis.
- Ability to estimate the average value of Y for a given x (point estimate and confidence intervals).
- Ability to estimate the individual value of Y for a given x (point estimate and confidence intervals).

4 Chapter 4. Simple linear regression
Bibliography
- Newbold, P. Statistics for Business and Economics (2013), Ch. 10.
- Ross, S. Introductory Statistics (2005), Ch. 12.

5 Introduction
A regression model is a model that allows us to describe the effect of a variable X on a variable Y.
- X: independent, explanatory or exogenous variable.
- Y: dependent, response or endogenous variable.
The objective is to obtain reasonable estimates of Y for a given X, based on a sample of n bivariate observations $(x_1, y_1), \ldots, (x_n, y_n)$.

6 Introduction
Examples
- Study how the father's height influences the son's height.
- Estimate the price of an apartment depending on its size.
- Predict an unemployment rate for a given age group.
- Approximate a final grade in Est II based on the weekly number of study hours.
- Predict the computing time as a function of the processor speed.

7 Introduction
Types of relationships
Deterministic: Given a value of X, the value of Y can be perfectly identified, $y = f(x)$.
Example: The relationship between the temperature in degrees Celsius (X) and in degrees Fahrenheit (Y) is $y = 1.8x + 32$.
[Figure: plot of degrees Fahrenheit vs. degrees Celsius]

8 Introduction
Types of relationships
Nondeterministic (random/stochastic): Given a value of X, the value of Y cannot be perfectly known, $y = f(x) + u$, where $u$ is an unknown random perturbation (a random variable).
Example: Production (X) and price (Y).
[Figure: plot of Costos (costs) vs. Volumen (volume)]
There is a linear pattern, but it is not perfect.

9 Introduction
Types of relationships
Linear: When the function $f(x)$ is linear, $f(x) = \beta_0 + \beta_1 x$.
- If $\beta_1 > 0$ there is a positive linear relationship.
- If $\beta_1 < 0$ there is a negative linear relationship.
[Figure: two scatterplots of Y vs. X, titled "Positive linear relationship" and "Negative linear relationship"]
The scatterplot is (American) football-shaped.

10 Introduction
Types of relationships
Nonlinear: When $f(x)$ is nonlinear. For example, $f(x) = \log(x)$, $f(x) = x^2 + 3$, ...
[Figure: scatterplot of Y vs. X, titled "Nonlinear relationship"]
The scatterplot is not (American) football-shaped.

11 Introduction
Types of relationships
Lack of relationship: When $f(x) = 0$.
[Figure: scatterplot of Y vs. X, titled "Absence of relationship"]

12 Measures of linear dependence
Covariance
The covariance is defined as
$$\mathrm{cov}(x, y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1} = \frac{\sum_{i=1}^{n} x_i y_i - n\,\bar{x}\,\bar{y}}{n - 1}$$
- If there is a positive linear relationship, $\mathrm{cov} > 0$.
- If there is a negative linear relationship, $\mathrm{cov} < 0$.
- If there is no relationship, or the relationship is nonlinear, $\mathrm{cov} \approx 0$.
Problem: The covariance depends on the units of X and Y.

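13 Measures of linear dependence
Correlation coefficient
The correlation coefficient (unitless) is defined as
$$r_{(x,y)} = \mathrm{cor}(x, y) = \frac{\mathrm{cov}(x, y)}{s_x s_y}$$
where
$$s_x^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} \quad \text{and} \quad s_y^2 = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n - 1}$$
Properties:
- $-1 \le \mathrm{cor}(x, y) \le 1$
- $\mathrm{cor}(x, y) = \mathrm{cor}(y, x)$
- $\mathrm{cor}(ax + b, cy + d) = \mathrm{sign}(a)\,\mathrm{sign}(c)\,\mathrm{cor}(x, y)$ for arbitrary numbers $a, b, c, d$ with $a \ne 0$ and $c \ne 0$.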
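The two measures above can be checked numerically. The following is a minimal sketch in plain Python; the five data points are hypothetical (they do not come from the slides) and were chosen only to keep the arithmetic simple. It computes the covariance in both forms and shows that a change of units rescales the covariance but leaves the correlation unchanged up to sign.

```python
import math

# Hypothetical sample (not from the slides), chosen for easy arithmetic.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]


def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    """Sample covariance with the n - 1 denominator."""
    a_bar, b_bar = mean(a), mean(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / (len(a) - 1)


def cor(a, b):
    """Sample correlation: covariance divided by both standard deviations."""
    return cov(a, b) / math.sqrt(cov(a, a) * cov(b, b))


# Shortcut form: (sum of x_i * y_i - n * x_bar * y_bar) / (n - 1).
n = len(x)
cov_shortcut = (sum(xi * yi for xi, yi in zip(x, y)) - n * mean(x) * mean(y)) / (n - 1)

# A change of units: a = 100 > 0 for x, c = -2 < 0 for y.
x_scaled = [100.0 * xi + 7.0 for xi in x]
y_flipped = [-2.0 * yi + 1.0 for yi in y]

print(cov(x, y), cov_shortcut)      # both forms of the covariance agree
print(cov(x_scaled, y))             # the covariance changes with the units
print(cor(x, y), cor(x_scaled, y))  # the correlation does not
print(cor(x_scaled, y_flipped))     # sign flips: sign(a) * sign(c) = -1
```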

14 Simple linear regression model
The simple linear regression model assumes that
$$Y_i = \beta_0 + \beta_1 x_i + u_i$$
where
- $Y_i$ is the value of the dependent variable Y when the random variable X takes a specific value $x_i$;
- $x_i$ is the specific value of the random variable X;
- $u_i$ is an error, a random variable that is assumed to be normal with mean 0 and unknown variance $\sigma^2$, $u_i \sim N(0, \sigma^2)$;
- $\beta_0$ and $\beta_1$ are the population coefficients: $\beta_0$ is the population intercept and $\beta_1$ is the population slope.
The (population) parameters that we need to estimate are $\beta_0$, $\beta_1$ and $\sigma^2$.

15 Simple linear regression model
Our objective is to find the estimators/estimates $\hat{\beta}_0$, $\hat{\beta}_1$ of $\beta_0$, $\beta_1$ in order to obtain the regression line
$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$$
which is the best fit to data with a linear pattern.
Example: Let's say that the regression line for the last example is $\widehat{\text{Price}} = \hat{\beta}_0 + \hat{\beta}_1 \cdot \text{Production}$.
[Figure: plot of fitted model, Costos (costs) vs. Volumen (volume)]
Based on the regression line, we can estimate the price when Production is 25 million: $\widehat{\text{Price}} = \hat{\beta}_0 + \hat{\beta}_1 (25) = 16.6$.

16 Simple linear regression model
The difference between the observed value of the response variable $y_i$ and its estimate $\hat{y}_i$ is called a residual:
$$e_i = y_i - \hat{y}_i$$
[Figure: an observed value (data point y) and the estimated regression line]
Example (cont.): Clearly, if for a given year the production is 25 million, the price will not be exactly 16.6 thousand euros. That small difference, the residual, will in that case be $e_i = y_i - 16.6 = 1.4$.

17 Simple linear regression model: model assumptions
- Linearity: The underlying relationship between X and Y is linear, $f(x) = \beta_0 + \beta_1 x$.
- Homogeneity: The errors have mean zero, $E[u_i] = 0$.
- Homoscedasticity: The variance of the errors is constant, $\mathrm{Var}(u_i) = \sigma^2$.
- Independence: The errors are independent, so $E[u_i u_j] = 0$ for $i \ne j$.
- Normality: The errors follow a normal distribution, $u_i \sim N(0, \sigma^2)$.

18 Simple linear regression model: model assumptions
Linearity
The scatterplot should have an (American) football shape, i.e., it should show scatter around a straight line.
[Figure: plot of fitted model, Costos (costs) vs. Volumen (volume)]
If not, the regression line is not an adequate model for the data.
[Figure: plot of fitted model for data without a linear pattern]

19 Simple linear regression model: model assumptions
Homoscedasticity
The vertical spread around the line should remain roughly constant.
[Figure: plot of Costos (costs) vs. Volumen (volume)]
If that's not the case, heteroscedasticity is present.

20 Simple linear regression model: model assumptions
Independence
The observations should be independent. One observation doesn't carry any information about another. In general, time series fail this assumption.
Normality
A priori, we assume that the observations are normal:
$$y_i = \beta_0 + \beta_1 x_i + u_i, \quad u_i \sim N(0, \sigma^2)$$
where $\beta_0$, $\beta_1$ and $\sigma^2$ are unknown parameters.

21 (Ordinary) Least Squares Estimators: LSE
In 1809 Gauss proposed the least squares method to obtain the estimators $\hat{\beta}_0$ and $\hat{\beta}_1$ that provide the best fit
$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$$
The method is based on a criterion in which we minimize the sum of squares of the residuals, SSR, that is, the sum of squared vertical distances between the observed $y_i$ and predicted $\hat{y}_i$ values:
$$\sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} \left( y_i - \left( \hat{\beta}_0 + \hat{\beta}_1 x_i \right) \right)^2$$
[Figure: observed value $y_i$, predicted value $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$, and residual $e_i$ at a point $x_i$]

22 Least Squares Estimators
The resulting estimators are:
$$\hat{\beta}_1 = \frac{\mathrm{cov}(x, y)}{s_x^2} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$
$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$
[Figure: fitted regression line $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$ with slope $\hat{\beta}_1$, observed values $y_i$ and residuals]
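The closed-form estimators above translate directly into a few lines of code. This is a minimal sketch in plain Python on a hypothetical five-point sample (not the data used in the slides). It also checks two consequences of the least squares criterion: the residuals sum to zero and are uncorrelated with the $x_i$.

```python
# Hypothetical sample (not from the slides).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of cross-products and squares about the means.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)

# Least squares estimators: slope first, then intercept.
beta1_hat = s_xy / s_xx
beta0_hat = y_bar - beta1_hat * x_bar

# Fitted values and residuals e_i = y_i - y_hat_i.
fitted = [beta0_hat + beta1_hat * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

print("slope:", beta1_hat)
print("intercept:", beta0_hat)
print("sum of residuals:", sum(residuals))
print("sum of x_i * e_i:", sum(xi * ei for xi, ei in zip(x, residuals)))
```

Both printed sums should be zero up to floating-point rounding, which is a quick sanity check on any hand computation of the line.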

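23 Fitting the regression line
Example 4.1. For the Spanish wheat production data from the 1980s, with production (X) and price per kilo in pesetas (Y), we have the following table:
[Table: production and price; values not preserved]
Fit a least squares regression line to the data.
$$\hat{\beta}_1 = \frac{\sum_{i=1}^{10} x_i y_i - n\,\bar{x}\,\bar{y}}{\sum_{i=1}^{10} x_i^2 - n\,\bar{x}^2} = \ldots$$
$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = \ldots$$
The regression line is $\hat{y} = \ldots + \ldots\,x$.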

24 Fitting the regression line in software

25 Estimating the error variance
To estimate the error variance, $\sigma^2$, we can simply take the uncorrected sample variance,
$$\hat{\sigma}^2 = \frac{\sum_{i=1}^{n} e_i^2}{n}$$
which is the so-called maximum likelihood estimator of $\sigma^2$. However, this estimator is biased. The unbiased estimator of $\sigma^2$ is called the residual variance,
$$s_R^2 = \frac{\sum_{i=1}^{n} e_i^2}{n - 2} = \frac{SSR}{n - 2}$$
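To make the difference between the two estimators concrete, here is a minimal sketch in plain Python. The data are hypothetical (not from the slides). The only difference between the two estimators is the denominator, $n$ versus $n - 2$, so the maximum likelihood estimate is always the smaller of the two.

```python
# Hypothetical sample (not from the slides).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares fit.
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
beta1_hat = s_xy / s_xx
beta0_hat = y_bar - beta1_hat * x_bar

# Sum of squared residuals, SSR.
ssr = sum((yi - (beta0_hat + beta1_hat * xi)) ** 2 for xi, yi in zip(x, y))

sigma2_mle = ssr / n   # maximum likelihood estimator (biased)
s2_r = ssr / (n - 2)   # residual variance (unbiased)

print("SSR:", ssr)
print("ML estimate of sigma^2:", sigma2_mle)
print("residual variance s_R^2:", s2_r)
```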

26 Estimating the error variance
Exercise
Find the residual variance for Example 4.1.
First, we find the residuals, $e_i$, using the regression line $\hat{y}_i = \ldots + \ldots\,x_i$.
[Table with columns $x_i$, $y_i$, $\hat{y}_i$ and $e_i = y_i - \hat{y}_i$; values not preserved]
The residual variance is then
$$s_R^2 = \frac{\sum_{i=1}^{n} e_i^2}{n - 2} = \ldots$$

27 Estimating the error variance in software

28 Statistical inference in simple linear regression model
Up to this point we have only discussed point estimation. Confidence intervals for the model parameters give us information about the estimation error, and tests of hypothesis help us decide whether a given parameter is statistically significant. In statistical inference, we begin with the distribution of the estimators.

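29 Statistical inference of the slope
The estimator $\hat{\beta}_1$ follows a normal distribution because it is a linear combination of normally distributed random variables,
$$\hat{\beta}_1 = \sum_{i=1}^{n} \frac{(x_i - \bar{x})}{(n - 1) s_x^2}\, Y_i = \sum_{i=1}^{n} w_i Y_i$$
where $Y_i = \beta_0 + \beta_1 x_i + u_i$, which satisfies $Y_i \sim N(\beta_0 + \beta_1 x_i, \sigma^2)$.
In addition, $\hat{\beta}_1$ is an unbiased estimator of $\beta_1$,
$$E\left[\hat{\beta}_1\right] = \sum_{i=1}^{n} \frac{(x_i - \bar{x})}{(n - 1) s_x^2}\, E[Y_i] = \beta_1$$
whose variance is
$$\mathrm{Var}\left[\hat{\beta}_1\right] = \sum_{i=1}^{n} \left( \frac{(x_i - \bar{x})}{(n - 1) s_x^2} \right)^2 \mathrm{Var}[Y_i] = \frac{\sigma^2}{(n - 1) s_x^2}$$
Thus,
$$\hat{\beta}_1 \sim N\left( \beta_1, \frac{\sigma^2}{(n - 1) s_x^2} \right)$$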

30 Confidence interval for the slope
We wish to obtain a $(1 - \alpha)$ confidence interval for $\beta_1$. Since $\sigma^2$ is unknown, we estimate it using $s_R^2$. The corresponding theoretical result when the error variance is unknown is
$$\frac{\hat{\beta}_1 - \beta_1}{\sqrt{\dfrac{s_R^2}{(n - 1) s_x^2}}} \sim t_{n-2}$$
based on which we obtain the $(1 - \alpha)$ confidence interval for $\beta_1$:
$$\hat{\beta}_1 \pm t_{n-2,\alpha/2} \sqrt{\frac{s_R^2}{(n - 1) s_x^2}}$$
The length of the interval decreases if:
- The sample size increases.
- The variance of the $x_i$ increases.
- The residual variance decreases.

31 Hypothesis testing for the slope
In a similar manner, we construct a hypothesis test for $\beta_1$. In particular, if the true value of $\beta_1$ is zero, the variable Y does not depend on X in a linear fashion. Thus, we are mainly interested in the two-sided test
$$H_0: \beta_1 = 0 \qquad H_1: \beta_1 \ne 0$$
The rejection region is
$$RR_\alpha = \left\{ t : |t| = \left| \frac{\hat{\beta}_1}{\sqrt{s_R^2 / ((n - 1) s_x^2)}} \right| > t_{n-2,\alpha/2} \right\}$$
Equivalently, if 0 is outside the $(1 - \alpha)$ confidence interval for $\beta_1$, we reject the null at the $\alpha$ significance level. The p-value is
$$\text{p-value} = 2 \Pr\left( T_{n-2} > \left| \frac{\hat{\beta}_1}{\sqrt{s_R^2 / ((n - 1) s_x^2)}} \right| \right)$$
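The confidence interval and the two-sided test for the slope can be sketched as follows. This is an illustration only: the data are hypothetical (not from the slides), and the use of SciPy's `scipy.stats.t` for quantiles and tail probabilities is an assumption of this sketch, not something the slides prescribe. Note that $(n - 1) s_x^2$ is just $\sum (x_i - \bar{x})^2$.

```python
import math
from scipy import stats

# Hypothetical sample (not from the slides).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)  # equals (n - 1) * s_x^2
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

beta1_hat = s_xy / s_xx
beta0_hat = y_bar - beta1_hat * x_bar
ssr = sum((yi - (beta0_hat + beta1_hat * xi)) ** 2 for xi, yi in zip(x, y))
s2_r = ssr / (n - 2)  # residual variance

alpha = 0.05
se_beta1 = math.sqrt(s2_r / s_xx)              # estimated std. error of the slope
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)  # upper alpha/2 quantile of t_{n-2}

# (1 - alpha) confidence interval for the slope.
ci = (beta1_hat - t_crit * se_beta1, beta1_hat + t_crit * se_beta1)

# Two-sided test of H0: beta_1 = 0.
t_obs = beta1_hat / se_beta1
p_value = 2 * stats.t.sf(abs(t_obs), df=n - 2)
reject = abs(t_obs) > t_crit

print("CI for slope:", ci)
print("t statistic:", t_obs, "p-value:", p_value, "reject H0:", reject)
```

With so few observations the interval is wide; the point of the sketch is that the three decision rules (rejection region, interval excluding 0, p-value below $\alpha$) always agree.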

32 Inference for the slope
Exercise
1. Find a 95% CI for the slope of the (population) regression model from Example 4.1.
2. Test the hypothesis that the price of wheat depends linearly on the production at a 0.05 significance level.
Solution
1. Since $t_{n-2,\alpha/2} = t_{8,0.025} = 2.306$, the interval is $\ldots \le \beta_1 \le \ldots$
2. Since the interval (with the same $\alpha$) doesn't contain 0, we reject the null $\beta_1 = 0$ at the 0.05 level. Also, the (observed) test statistic is
$$t = \frac{\hat{\beta}_1}{\sqrt{s_R^2 / ((n - 1) s_x^2)}} = \ldots$$
Thus, we have $\text{p-value} = 2 \Pr(T_8 > |t|) = 0.002$.

33 Inference for β 1 in software

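34 Statistical inference for the intercept
The estimator $\hat{\beta}_0$ follows a normal distribution because it is a linear combination of normal random variables,
$$\hat{\beta}_0 = \sum_{i=1}^{n} \left( \frac{1}{n} - \bar{x} w_i \right) Y_i$$
where $w_i = (x_i - \bar{x}) / ((n - 1) s_x^2)$ and $Y_i = \beta_0 + \beta_1 x_i + u_i$, which satisfies $Y_i \sim N(\beta_0 + \beta_1 x_i, \sigma^2)$.
Additionally, $\hat{\beta}_0$ is an unbiased estimator of $\beta_0$,
$$E\left[\hat{\beta}_0\right] = \sum_{i=1}^{n} \left( \frac{1}{n} - \bar{x} w_i \right) E[Y_i] = \beta_0$$
whose variance is
$$\mathrm{Var}\left[\hat{\beta}_0\right] = \sum_{i=1}^{n} \left( \frac{1}{n} - \bar{x} w_i \right)^2 \mathrm{Var}[Y_i] = \sigma^2 \left( \frac{1}{n} + \frac{\bar{x}^2}{(n - 1) s_x^2} \right)$$
Thus,
$$\hat{\beta}_0 \sim N\left( \beta_0, \sigma^2 \left( \frac{1}{n} + \frac{\bar{x}^2}{(n - 1) s_x^2} \right) \right)$$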

35 Confidence interval for the intercept
We wish to find a $(1 - \alpha)$ confidence interval for $\beta_0$. Since $\sigma^2$ is unknown, we estimate it with $s_R^2$ as before. We obtain
$$\frac{\hat{\beta}_0 - \beta_0}{\sqrt{s_R^2 \left( \dfrac{1}{n} + \dfrac{\bar{x}^2}{(n - 1) s_x^2} \right)}} \sim t_{n-2}$$
which yields the following confidence interval for $\beta_0$:
$$\hat{\beta}_0 \pm t_{n-2,\alpha/2} \sqrt{s_R^2 \left( \frac{1}{n} + \frac{\bar{x}^2}{(n - 1) s_x^2} \right)}$$
The length of the interval decreases if:
- The sample size increases.
- The variance of the $x_i$ increases.
- The residual variance decreases.
- The mean of the $x_i$ decreases in absolute value.

36 Hypothesis test for the intercept
Based on the distribution of the estimator, we can carry out the test of hypothesis. In particular, if the true value of $\beta_0$ is 0, the population regression line goes through the origin. For this case we would test
$$H_0: \beta_0 = 0 \qquad H_1: \beta_0 \ne 0$$
The rejection region is
$$RR_\alpha = \left\{ t : |t| = \left| \frac{\hat{\beta}_0}{\sqrt{s_R^2 \left( \dfrac{1}{n} + \dfrac{\bar{x}^2}{(n - 1) s_x^2} \right)}} \right| > t_{n-2,\alpha/2} \right\}$$
Equivalently, if 0 is outside the $(1 - \alpha)$ confidence interval for $\beta_0$, we reject the null. The p-value is
$$\text{p-value} = 2 \Pr\left( T_{n-2} > \left| \frac{\hat{\beta}_0}{\sqrt{s_R^2 \left( \dfrac{1}{n} + \dfrac{\bar{x}^2}{(n - 1) s_x^2} \right)}} \right| \right)$$
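The same steps for the intercept can be sketched as follows. The data are hypothetical (not from the slides), and using SciPy's `scipy.stats.t` is an assumption of this sketch. The only change with respect to the slope is the standard error, which now includes the term $\bar{x}^2 / ((n - 1) s_x^2)$.

```python
import math
from scipy import stats

# Hypothetical sample (not from the slides).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)  # equals (n - 1) * s_x^2
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

beta1_hat = s_xy / s_xx
beta0_hat = y_bar - beta1_hat * x_bar
ssr = sum((yi - (beta0_hat + beta1_hat * xi)) ** 2 for xi, yi in zip(x, y))
s2_r = ssr / (n - 2)  # residual variance

alpha = 0.05
# Estimated standard error of the intercept.
se_beta0 = math.sqrt(s2_r * (1 / n + x_bar ** 2 / s_xx))
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)

# (1 - alpha) confidence interval for the intercept.
ci0 = (beta0_hat - t_crit * se_beta0, beta0_hat + t_crit * se_beta0)

# Two-sided test of H0: beta_0 = 0 (line through the origin).
t_obs0 = beta0_hat / se_beta0
p_value0 = 2 * stats.t.sf(abs(t_obs0), df=n - 2)
reject0 = abs(t_obs0) > t_crit

print("CI for intercept:", ci0)
print("t statistic:", t_obs0, "p-value:", p_value0, "reject H0:", reject0)
```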

37 Inference for the intercept
Exercise
1. Find a 95% CI for the intercept of the population regression line of Example 4.1.
2. Test the hypothesis that the population regression line goes through the origin at a 0.05 significance level.
Solution
1. The quantile is $t_{n-2,\alpha/2} = t_{8,0.025} = 2.306$, so $\ldots \le \beta_0 \le \ldots$
2. Since the interval (with the same $\alpha$) doesn't contain 0, we reject the null hypothesis that $\beta_0 = 0$. Also, the (observed) test statistic is
$$t = \frac{\hat{\beta}_0}{\sqrt{s_R^2 \left( \dfrac{1}{n} + \dfrac{\bar{x}^2}{(n - 1) s_x^2} \right)}} = \ldots$$
Thus, we have $\text{p-value} = 2 \Pr(T_8 > |t|) = \ldots$

38 Inference for the intercept in software

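39 Inference for the error variance
We have
$$\frac{(n - 2)\, s_R^2}{\sigma^2} \sim \chi^2_{n-2}$$
which means that the $(1 - \alpha)$ confidence interval for $\sigma^2$ is
$$\frac{(n - 2)\, s_R^2}{\chi^2_{n-2,\alpha/2}} \le \sigma^2 \le \frac{(n - 2)\, s_R^2}{\chi^2_{n-2,1-\alpha/2}}$$
This interval can be used to solve the test
$$H_0: \sigma^2 = \sigma_0^2 \qquad H_1: \sigma^2 \ne \sigma_0^2$$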
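A minimal sketch of this interval follows, assuming the slide's convention that $\chi^2_{n-2,\alpha/2}$ denotes the upper $\alpha/2$ quantile (the value exceeded with probability $\alpha/2$). The data are hypothetical (not from the slides), the null value `sigma2_0` is a made-up illustration, and `scipy.stats.chi2` is an assumption of this sketch.

```python
from scipy import stats

# Hypothetical sample (not from the slides).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

beta1_hat = s_xy / s_xx
beta0_hat = y_bar - beta1_hat * x_bar
ssr = sum((yi - (beta0_hat + beta1_hat * xi)) ** 2 for xi, yi in zip(x, y))
s2_r = ssr / (n - 2)  # residual variance, so (n - 2) * s2_r equals SSR

alpha = 0.05
df = n - 2
chi2_upper = stats.chi2.ppf(1 - alpha / 2, df=df)  # exceeded with prob. alpha/2
chi2_lower = stats.chi2.ppf(alpha / 2, df=df)      # exceeded with prob. 1 - alpha/2

# (1 - alpha) confidence interval for sigma^2.
ci_sigma2 = (df * s2_r / chi2_upper, df * s2_r / chi2_lower)

# Two-sided test of H0: sigma^2 = sigma2_0 via the interval.
sigma2_0 = 1.0  # hypothetical null value, for illustration only
reject = not (ci_sigma2[0] <= sigma2_0 <= ci_sigma2[1])

print("CI for sigma^2:", ci_sigma2)
print("reject H0: sigma^2 =", sigma2_0, "->", reject)
```

The interval is not symmetric around $s_R^2$, because the chi-square distribution is skewed.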

40 Average and individual predictions
We consider two situations:
1. We wish to estimate/predict the average value of Y for a given $X = x_0$.
2. We wish to estimate/predict the actual value of Y for a given $X = x_0$.
For example, in Example 4.1:
1. What would be the average wheat price for all years in which the production was 30?
2. If in a given year the production was 30, what would be the corresponding price of wheat?
In both cases:
$$\hat{y}_0 = \hat{\beta}_0 + \hat{\beta}_1 x_0 = \bar{y} + \hat{\beta}_1 (x_0 - \bar{x})$$
But the estimation errors are different.

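41 Estimating/predicting the average value
Remember that:
$$\mathrm{Var}\left(\hat{Y}_0\right) = \mathrm{Var}\left(\bar{Y}\right) + (x_0 - \bar{x})^2\, \mathrm{Var}\left(\hat{\beta}_1\right) = \sigma^2 \left( \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{(n - 1) s_x^2} \right)$$
The confidence interval for the mean prediction $E[Y_0 \mid X = x_0]$ is:
$$\hat{Y}_0 \pm t_{n-2,\alpha/2} \sqrt{s_R^2 \left( \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{(n - 1) s_x^2} \right)}$$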

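42 Estimating/predicting the actual value
The variance for the prediction of the actual value is the mean squared error:
$$E\left[ \left( Y_0 - \hat{Y}_0 \right)^2 \right] = \mathrm{Var}(Y_0) + \mathrm{Var}\left(\hat{Y}_0\right) = \sigma^2 \left( 1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{(n - 1) s_x^2} \right)$$
And thus the confidence interval for the actual value $Y_0$ (often called a prediction interval) is:
$$\hat{Y}_0 \pm t_{n-2,\alpha/2} \sqrt{s_R^2 \left( 1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{(n - 1) s_x^2} \right)}$$
This interval is wider than the one for the average prediction.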
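The two intervals share the same centre and differ only by the extra "1 +" inside the square root. The sketch below computes both at a hypothetical point `x0 = 4.0`; the data are also hypothetical (not from the slides), and `scipy.stats.t` is an assumption of this sketch.

```python
import math
from scipy import stats

# Hypothetical sample (not from the slides).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)  # equals (n - 1) * s_x^2
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

beta1_hat = s_xy / s_xx
beta0_hat = y_bar - beta1_hat * x_bar
ssr = sum((yi - (beta0_hat + beta1_hat * xi)) ** 2 for xi, yi in zip(x, y))
s2_r = ssr / (n - 2)

alpha = 0.05
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)

x0 = 4.0  # hypothetical point at which we predict
y0_hat = beta0_hat + beta1_hat * x0  # same point estimate in both cases

# Standard errors: average value vs. actual (individual) value.
leverage = 1 / n + (x0 - x_bar) ** 2 / s_xx
se_mean = math.sqrt(s2_r * leverage)
se_pred = math.sqrt(s2_r * (1 + leverage))

ci_mean = (y0_hat - t_crit * se_mean, y0_hat + t_crit * se_mean)
ci_pred = (y0_hat - t_crit * se_pred, y0_hat + t_crit * se_pred)

print("point estimate:", y0_hat)
print("interval for the average value:", ci_mean)
print("interval for the actual value:", ci_pred)
```

Both intervals are narrowest at `x0 = x_bar` and widen as `x0` moves away from the sample mean.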

43 Estimating/predicting the average and actual values
In red: confidence intervals for the prediction of the average value.
In pink: confidence intervals for the prediction of the actual value.
[Figure: plot of fitted model, price in pesetas vs. production in kg]

44 Regression line: R-squared and variability decomposition
The coefficient of determination, R-squared, is used to assess the goodness of fit of the model. It is defined as
$$R^2 = r_{(x,y)}^2 \in [0, 1]$$
- $R^2$ tells us what percentage of the sample variability in the y variable is explained by the model, that is, by its linear dependence on x.
- Values close to 100% indicate that the regression model is a good fit to the data (less than 60%, not so good).
Variability decomposition and $R^2$:
The Total Sum of Squares, $SST = \sum_i (y_i - \bar{y})^2$, can be decomposed into the Residual Sum of Squares, $SSR = \sum_i (y_i - \hat{y}_i)^2$, plus the Model Sum of Squares, $SSM = \sum_i (\hat{y}_i - \bar{y})^2$:
$$SST = SSR + SSM$$
and we have
$$R^2 = 1 - \frac{SSR}{SST} = \frac{SSM}{SST}$$

45 Regression line: R-squared and variability decomposition
[Figure from Wikipedia]

46 ANOVA table
ANOVA (Analysis of Variance) table for the simple linear regression model

Source of variability   SS    DF      Mean square            F ratio
Model                   SSM   1       SSM/1                  SSM/s_R^2
Residuals/errors        SSR   n - 2   SSR/(n - 2) = s_R^2
Total                   SST   n - 1

Note that the value of the F statistic is the square of the t statistic in the significance test for the slope in simple regression.
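The decomposition and the relationship between the F and t statistics can be verified numerically. This is a minimal sketch in plain Python on hypothetical data (not from the slides). It builds the three sums of squares and checks that $SST = SSR + SSM$, that $R^2$ equals the squared correlation, and that $F = t^2$.

```python
import math

# Hypothetical sample (not from the slides).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

beta1_hat = s_xy / s_xx
beta0_hat = y_bar - beta1_hat * x_bar
fitted = [beta0_hat + beta1_hat * xi for xi in x]

# Sums of squares for the ANOVA table.
sst = sum((yi - y_bar) ** 2 for yi in y)                # total
ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # residual
ssm = sum((fi - y_bar) ** 2 for fi in fitted)           # model

s2_r = ssr / (n - 2)     # residual mean square
f_ratio = (ssm / 1) / s2_r

r_squared = 1 - ssr / sst
r = s_xy / math.sqrt(s_xx * sst)  # sample correlation (sst is the y sum of squares)
t_obs = beta1_hat / math.sqrt(s2_r / s_xx)

print("SST, SSR, SSM:", sst, ssr, ssm)
print("R^2:", r_squared, "r^2:", r ** 2)
print("F:", f_ratio, "t^2:", t_obs ** 2)
```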


More information

Econometrics I KS. Module 1: Bivariate Linear Regression. Alexander Ahammer. This version: March 12, 2018

Econometrics I KS. Module 1: Bivariate Linear Regression. Alexander Ahammer. This version: March 12, 2018 Econometrics I KS Module 1: Bivariate Linear Regression Alexander Ahammer Department of Economics Johannes Kepler University of Linz This version: March 12, 2018 Alexander Ahammer (JKU) Module 1: Bivariate

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression ST 370 Regression models are used to study the relationship of a response variable and one or more predictors. The response is also called the dependent variable, and the predictors

More information

LECTURE 6. Introduction to Econometrics. Hypothesis testing & Goodness of fit

LECTURE 6. Introduction to Econometrics. Hypothesis testing & Goodness of fit LECTURE 6 Introduction to Econometrics Hypothesis testing & Goodness of fit October 25, 2016 1 / 23 ON TODAY S LECTURE We will explain how multiple hypotheses are tested in a regression model We will define

More information

Regression Models - Introduction

Regression Models - Introduction Regression Models - Introduction In regression models there are two types of variables that are studied: A dependent variable, Y, also called response variable. It is modeled as random. An independent

More information

The regression model with one fixed regressor cont d

The regression model with one fixed regressor cont d The regression model with one fixed regressor cont d 3150/4150 Lecture 4 Ragnar Nymoen 27 January 2012 The model with transformed variables Regression with transformed variables I References HGL Ch 2.8

More information

Simple Linear Regression: The Model

Simple Linear Regression: The Model Simple Linear Regression: The Model task: quantifying the effect of change X in X on Y, with some constant β 1 : Y = β 1 X, linear relationship between X and Y, however, relationship subject to a random

More information

STAT2012 Statistical Tests 23 Regression analysis: method of least squares

STAT2012 Statistical Tests 23 Regression analysis: method of least squares 23 Regression analysis: method of least squares L23 Regression analysis The main purpose of regression is to explore the dependence of one variable (Y ) on another variable (X). 23.1 Introduction (P.532-555)

More information

Inference for the Regression Coefficient

Inference for the Regression Coefficient Inference for the Regression Coefficient Recall, b 0 and b 1 are the estimates of the slope β 1 and intercept β 0 of population regression line. We can shows that b 0 and b 1 are the unbiased estimates

More information

Inference for Regression

Inference for Regression Inference for Regression Section 9.4 Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c Department of Mathematics University of Houston Lecture 13b - 3339 Cathy Poliak, Ph.D. cathy@math.uh.edu

More information

The Simple Linear Regression Model

The Simple Linear Regression Model The Simple Linear Regression Model Lesson 3 Ryan Safner 1 1 Department of Economics Hood College ECON 480 - Econometrics Fall 2017 Ryan Safner (Hood College) ECON 480 - Lesson 3 Fall 2017 1 / 77 Bivariate

More information

Regression and correlation. Correlation & Regression, I. Regression & correlation. Regression vs. correlation. Involve bivariate, paired data, X & Y

Regression and correlation. Correlation & Regression, I. Regression & correlation. Regression vs. correlation. Involve bivariate, paired data, X & Y Regression and correlation Correlation & Regression, I 9.07 4/1/004 Involve bivariate, paired data, X & Y Height & weight measured for the same individual IQ & exam scores for each individual Height of

More information

Chapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression

Chapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression BSTT523: Kutner et al., Chapter 1 1 Chapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression Introduction: Functional relation between

More information

Finding Relationships Among Variables

Finding Relationships Among Variables Finding Relationships Among Variables BUS 230: Business and Economic Research and Communication 1 Goals Specific goals: Re-familiarize ourselves with basic statistics ideas: sampling distributions, hypothesis

More information

Two-Variable Regression Model: The Problem of Estimation

Two-Variable Regression Model: The Problem of Estimation Two-Variable Regression Model: The Problem of Estimation Introducing the Ordinary Least Squares Estimator Jamie Monogan University of Georgia Intermediate Political Methodology Jamie Monogan (UGA) Two-Variable

More information

Chapter 27 Summary Inferences for Regression

Chapter 27 Summary Inferences for Regression Chapter 7 Summary Inferences for Regression What have we learned? We have now applied inference to regression models. Like in all inference situations, there are conditions that we must check. We can test

More information

MAT2377. Rafa l Kulik. Version 2015/November/26. Rafa l Kulik

MAT2377. Rafa l Kulik. Version 2015/November/26. Rafa l Kulik MAT2377 Rafa l Kulik Version 2015/November/26 Rafa l Kulik Bivariate data and scatterplot Data: Hydrocarbon level (x) and Oxygen level (y): x: 0.99, 1.02, 1.15, 1.29, 1.46, 1.36, 0.87, 1.23, 1.55, 1.40,

More information

Wooldridge, Introductory Econometrics, 4th ed. Chapter 2: The simple regression model

Wooldridge, Introductory Econometrics, 4th ed. Chapter 2: The simple regression model Wooldridge, Introductory Econometrics, 4th ed. Chapter 2: The simple regression model Most of this course will be concerned with use of a regression model: a structure in which one or more explanatory

More information

Simple Linear Regression. Material from Devore s book (Ed 8), and Cengagebrain.com

Simple Linear Regression. Material from Devore s book (Ed 8), and Cengagebrain.com 12 Simple Linear Regression Material from Devore s book (Ed 8), and Cengagebrain.com The Simple Linear Regression Model The simplest deterministic mathematical relationship between two variables x and

More information

Section 4: Multiple Linear Regression

Section 4: Multiple Linear Regression Section 4: Multiple Linear Regression Carlos M. Carvalho The University of Texas at Austin McCombs School of Business http://faculty.mccombs.utexas.edu/carlos.carvalho/teaching/ 1 The Multiple Regression

More information

Correlation and Linear Regression

Correlation and Linear Regression Correlation and Linear Regression Correlation: Relationships between Variables So far, nearly all of our discussion of inferential statistics has focused on testing for differences between group means

More information

Simple Linear Regression. (Chs 12.1, 12.2, 12.4, 12.5)

Simple Linear Regression. (Chs 12.1, 12.2, 12.4, 12.5) 10 Simple Linear Regression (Chs 12.1, 12.2, 12.4, 12.5) Simple Linear Regression Rating 20 40 60 80 0 5 10 15 Sugar 2 Simple Linear Regression Rating 20 40 60 80 0 5 10 15 Sugar 3 Simple Linear Regression

More information

Statistics for Engineers Lecture 9 Linear Regression

Statistics for Engineers Lecture 9 Linear Regression Statistics for Engineers Lecture 9 Linear Regression Chong Ma Department of Statistics University of South Carolina chongm@email.sc.edu April 17, 2017 Chong Ma (Statistics, USC) STAT 509 Spring 2017 April

More information

Lecture 3: Inference in SLR

Lecture 3: Inference in SLR Lecture 3: Inference in SLR STAT 51 Spring 011 Background Reading KNNL:.1.6 3-1 Topic Overview This topic will cover: Review of hypothesis testing Inference about 1 Inference about 0 Confidence Intervals

More information

ECON 450 Development Economics

ECON 450 Development Economics ECON 450 Development Economics Statistics Background University of Illinois at Urbana-Champaign Summer 2017 Outline 1 Introduction 2 3 4 5 Introduction Regression analysis is one of the most important

More information

Chapter 16. Simple Linear Regression and dcorrelation

Chapter 16. Simple Linear Regression and dcorrelation Chapter 16 Simple Linear Regression and dcorrelation 16.1 Regression Analysis Our problem objective is to analyze the relationship between interval variables; regression analysis is the first tool we will

More information

Problem Set #6: OLS. Economics 835: Econometrics. Fall 2012

Problem Set #6: OLS. Economics 835: Econometrics. Fall 2012 Problem Set #6: OLS Economics 835: Econometrics Fall 202 A preliminary result Suppose we have a random sample of size n on the scalar random variables (x, y) with finite means, variances, and covariance.

More information

Unit 6 - Simple linear regression

Unit 6 - Simple linear regression Sta 101: Data Analysis and Statistical Inference Dr. Çetinkaya-Rundel Unit 6 - Simple linear regression LO 1. Define the explanatory variable as the independent variable (predictor), and the response variable

More information

SF2930: REGRESION ANALYSIS LECTURE 1 SIMPLE LINEAR REGRESSION.

SF2930: REGRESION ANALYSIS LECTURE 1 SIMPLE LINEAR REGRESSION. SF2930: REGRESION ANALYSIS LECTURE 1 SIMPLE LINEAR REGRESSION. Tatjana Pavlenko 17 January 2018 WHAT IS REGRESSION? INTRODUCTION Regression analysis is a statistical technique for investigating and modeling

More information

3. Linear Regression With a Single Regressor

3. Linear Regression With a Single Regressor 3. Linear Regression With a Single Regressor Econometrics: (I) Application of statistical methods in empirical research Testing economic theory with real-world data (data analysis) 56 Econometrics: (II)

More information

Scatter plot of data from the study. Linear Regression

Scatter plot of data from the study. Linear Regression 1 2 Linear Regression Scatter plot of data from the study. Consider a study to relate birthweight to the estriol level of pregnant women. The data is below. i Weight (g / 100) i Weight (g / 100) 1 7 25

More information

Business Statistics. Lecture 10: Course Review

Business Statistics. Lecture 10: Course Review Business Statistics Lecture 10: Course Review 1 Descriptive Statistics for Continuous Data Numerical Summaries Location: mean, median Spread or variability: variance, standard deviation, range, percentiles,

More information

Chapter 4: Regression Models

Chapter 4: Regression Models Sales volume of company 1 Textbook: pp. 129-164 Chapter 4: Regression Models Money spent on advertising 2 Learning Objectives After completing this chapter, students will be able to: Identify variables,

More information

Measuring the fit of the model - SSR

Measuring the fit of the model - SSR Measuring the fit of the model - SSR Once we ve determined our estimated regression line, we d like to know how well the model fits. How far/close are the observations to the fitted line? One way to do

More information

REGRESSION ANALYSIS AND INDICATOR VARIABLES

REGRESSION ANALYSIS AND INDICATOR VARIABLES REGRESSION ANALYSIS AND INDICATOR VARIABLES Thesis Submitted in partial fulfillment of the requirements for the award of degree of Masters of Science in Mathematics and Computing Submitted by Sweety Arora

More information

MFin Econometrics I Session 4: t-distribution, Simple Linear Regression, OLS assumptions and properties of OLS estimators

MFin Econometrics I Session 4: t-distribution, Simple Linear Regression, OLS assumptions and properties of OLS estimators MFin Econometrics I Session 4: t-distribution, Simple Linear Regression, OLS assumptions and properties of OLS estimators Thilo Klein University of Cambridge Judge Business School Session 4: Linear regression,

More information

Multiple Regression Analysis. Part III. Multiple Regression Analysis

Multiple Regression Analysis. Part III. Multiple Regression Analysis Part III Multiple Regression Analysis As of Sep 26, 2017 1 Multiple Regression Analysis Estimation Matrix form Goodness-of-Fit R-square Adjusted R-square Expected values of the OLS estimators Irrelevant

More information

Correlation and Regression

Correlation and Regression Correlation and Regression October 25, 2017 STAT 151 Class 9 Slide 1 Outline of Topics 1 Associations 2 Scatter plot 3 Correlation 4 Regression 5 Testing and estimation 6 Goodness-of-fit STAT 151 Class

More information

Introductory Econometrics

Introductory Econometrics Based on the textbook by Wooldridge: : A Modern Approach Robert M. Kunst robert.kunst@univie.ac.at University of Vienna and Institute for Advanced Studies Vienna November 23, 2013 Outline Introduction

More information

Linear Regression. In this problem sheet, we consider the problem of linear regression with p predictors and one intercept,

Linear Regression. In this problem sheet, we consider the problem of linear regression with p predictors and one intercept, Linear Regression In this problem sheet, we consider the problem of linear regression with p predictors and one intercept, y = Xβ + ɛ, where y t = (y 1,..., y n ) is the column vector of target values,

More information

Objectives Simple linear regression. Statistical model for linear regression. Estimating the regression parameters

Objectives Simple linear regression. Statistical model for linear regression. Estimating the regression parameters Objectives 10.1 Simple linear regression Statistical model for linear regression Estimating the regression parameters Confidence interval for regression parameters Significance test for the slope Confidence

More information

Chapter 14 Simple Linear Regression (A)

Chapter 14 Simple Linear Regression (A) Chapter 14 Simple Linear Regression (A) 1. Characteristics Managerial decisions often are based on the relationship between two or more variables. can be used to develop an equation showing how the variables

More information

Applied Regression Analysis. Section 2: Multiple Linear Regression

Applied Regression Analysis. Section 2: Multiple Linear Regression Applied Regression Analysis Section 2: Multiple Linear Regression 1 The Multiple Regression Model Many problems involve more than one independent variable or factor which affects the dependent or response

More information

401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis.

401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis. 401 Review Major topics of the course 1. Univariate analysis 2. Bivariate analysis 3. Simple linear regression 4. Linear algebra 5. Multiple regression analysis Major analysis methods 1. Graphical analysis

More information

Ch 13 & 14 - Regression Analysis

Ch 13 & 14 - Regression Analysis Ch 3 & 4 - Regression Analysis Simple Regression Model I. Multiple Choice:. A simple regression is a regression model that contains a. only one independent variable b. only one dependent variable c. more

More information

Multiple Regression Analysis

Multiple Regression Analysis Multiple Regression Analysis y = β 0 + β 1 x 1 + β 2 x 2 +... β k x k + u 2. Inference 0 Assumptions of the Classical Linear Model (CLM)! So far, we know: 1. The mean and variance of the OLS estimators

More information

4.1 Least Squares Prediction 4.2 Measuring Goodness-of-Fit. 4.3 Modeling Issues. 4.4 Log-Linear Models

4.1 Least Squares Prediction 4.2 Measuring Goodness-of-Fit. 4.3 Modeling Issues. 4.4 Log-Linear Models 4.1 Least Squares Prediction 4. Measuring Goodness-of-Fit 4.3 Modeling Issues 4.4 Log-Linear Models y = β + β x + e 0 1 0 0 ( ) E y where e 0 is a random error. We assume that and E( e 0 ) = 0 var ( e

More information

Inference for Regression Inference about the Regression Model and Using the Regression Line, with Details. Section 10.1, 2, 3

Inference for Regression Inference about the Regression Model and Using the Regression Line, with Details. Section 10.1, 2, 3 Inference for Regression Inference about the Regression Model and Using the Regression Line, with Details Section 10.1, 2, 3 Basic components of regression setup Target of inference: linear dependency

More information