Estadística II Chapter 5. Regression analysis (second part)
2 Chapter 5. Regression analysis (second part) Contents
- Diagnostics: residual analysis
- The ANOVA (ANalysis Of VAriance) decomposition
- Nonlinear relationships and linearizing transformations
- The simple linear regression model in matrix notation
- Introduction to multiple linear regression
3 Lesson 5. Regression analysis (second part) Bibliographic references: Newbold, P., Estadística para los negocios y la economía (1997), chapters 12, 13 and 14; Peña, D., Regresión y Análisis de Experimentos (2002), chapters 5 and 6
4 Regression diagnostics Diagnostics

The theoretical assumptions for the simple linear regression model with one response variable y and one explanatory variable x are:
- Linearity: y_i = β_0 + β_1 x_i + u_i, for i = 1, ..., n
- Homogeneity: E[u_i] = 0, for i = 1, ..., n
- Homoscedasticity: Var[u_i] = σ^2, for i = 1, ..., n
- Independence: u_i and u_j are independent for i ≠ j
- Normality: u_i ~ N(0, σ^2), for i = 1, ..., n

We study how to apply diagnostic procedures to test whether these assumptions are appropriate for the available data (x_i, y_i), based on the analysis of the residuals e_i = y_i − ŷ_i.
5 Diagnostics: scatterplots Scatterplots The simplest diagnostic procedure is based on the visual examination of the scatterplot for (x_i, y_i). Often this simple but powerful method reveals patterns suggesting whether the theoretical model might be appropriate or not. We illustrate its application on a classical example. Consider the following four datasets.
6 Diagnostics: scatterplots The Anscombe datasets (Table 3-10: four data sets; source: F. J. Anscombe, op. cit.)

  DATASET 1      DATASET 2      DATASET 3      DATASET 4
  X      Y       X      Y       X      Y       X      Y
 10.0   8.04    10.0   9.14    10.0   7.46     8.0   6.58
  8.0   6.95     8.0   8.14     8.0   6.77     8.0   5.76
 13.0   7.58    13.0   8.74    13.0  12.74     8.0   7.71
  9.0   8.81     9.0   8.77     9.0   7.11     8.0   8.84
 11.0   8.33    11.0   9.26    11.0   7.81     8.0   8.47
 14.0   9.96    14.0   8.10    14.0   8.84     8.0   7.04
  6.0   7.24     6.0   6.13     6.0   6.08     8.0   5.25
  4.0   4.26     4.0   3.10     4.0   5.39    19.0  12.50
 12.0  10.84    12.0   9.13    12.0   8.15     8.0   5.56
  7.0   4.82     7.0   7.26     7.0   6.42     8.0   7.91
  5.0   5.68     5.0   4.74     5.0   5.73     8.0   6.89
7 Diagnostics: scatterplots The Anscombe example

The estimated regression model for each of the four previous datasets is the same:

ŷ_i = 3.0 + 0.5 x_i, with n = 11, x̄ = 9.0, ȳ = 7.5, r_xy ≈ 0.82

The estimated standard error of the estimator β̂_1, sqrt(s_R^2 / ((n − 1) s_x^2)), takes the value 0.118 in all four cases. The corresponding T statistic takes the value T = 0.5/0.118 = 4.24. But the corresponding scatterplots show that the four datasets are quite different. Which conclusions could we reach from these diagrams?
8 Diagnostics: scatterplots Anscombe data scatterplots [Figure 3-29: scatterplots for the four data sets of Table 3-10. Source: F. J. Anscombe, op. cit.]
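The identical fits can be checked numerically. Below is a minimal pure-Python sketch that fits the least-squares line to the first two Anscombe datasets (values as published by Anscombe); both give β̂_0 ≈ 3.0 and β̂_1 ≈ 0.5 despite the very different scatterplots.

```python
# Least-squares fit for two of Anscombe's datasets: despite their very
# different shapes, both give (approximately) the same fitted line.
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]

def ls_fit(x, y):
    """Return the least-squares estimates (b0, b1) for y = b0 + b1*x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1

print(ls_fit(x, y1))  # ≈ (3.0, 0.5)
print(ls_fit(x, y2))  # ≈ (3.0, 0.5)
```

This is exactly why the scatterplot diagnostic matters: the fitted coefficients alone cannot distinguish the four datasets.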
9 Residual analysis Further analysis of the residuals If the observation of the scatterplot is not sufficient to reject the model, a further step is to use diagnostic methods based on the analysis of the residuals e_i = y_i − ŷ_i. This analysis starts by standardizing the residuals, that is, dividing them by the residual (quasi-)standard deviation s_R. The resulting quantities are known as standardized residuals: e_i / s_R. Under the assumptions of the linear regression model, the standardized residuals are approximately independent standard normal random variables. A plot of these standardized residuals should show no clear pattern.
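For instance, the standardized residuals can be computed as below (a sketch on a small made-up dataset; the data values are illustrative only).

```python
import math

# Standardized residuals e_i / s_R for a simple linear regression,
# where s_R is the residual (quasi-)standard deviation.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]   # illustrative data

n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
     sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar

e = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]      # residuals
s_R = math.sqrt(sum(ei ** 2 for ei in e) / (n - 2))    # quasi-std dev
std_res = [ei / s_R for ei in e]                       # standardized
print(std_res)
```

Plotting `std_res` against x (or against the fitted values) is the residual plot discussed next.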
10 Residual analysis Residual plots Several types of residual plots can be constructed. The most common ones are: Plot of standardized residuals vs. x Plot of standardized residuals vs. ŷ (the predicted responses) Deviations from the model hypotheses result in patterns on these plots, which should be visually recognizable
11 Residual plots examples: Consistency with the theoretical model
12 Residual plots examples: Nonlinearity
13 Residual plots examples: Heteroscedasticity
14 Residual analysis Outliers In a plot of the regression line we may observe outlier data, that is, data that show significant deviations from the regression line (or from the remaining data) The parameter estimators for the regression model, ˆβ 0 and ˆβ 1, are very sensitive to these outliers It is important to identify the outliers and make sure that they really are valid data
15 Residual analysis Normality of the errors One of the theoretical assumptions of the linear regression model is that the errors follow a normal distribution This assumption can be checked visually from the analysis of the residuals e i, using different approaches: By inspection of the frequency histogram for the residuals By inspection of the Normal Probability Plot of the residuals (significant deviations of the data from the straight line in the plot correspond to significant departures from the normality assumption)
16 The ANOVA decomposition Introduction

ANOVA: ANalysis Of VAriance. When fitting the simple linear regression model ŷ_i = β̂_0 + β̂_1 x_i to a data set (x_i, y_i), i = 1, ..., n, we may identify three sources of variability in the responses:
- variability associated with the model: SSM = Σ_{i=1}^n (ŷ_i − ȳ)^2, where the initials SS denote "sum of squares" and M refers to the model
- variability of the residuals: SSR = Σ_{i=1}^n (y_i − ŷ_i)^2 = Σ_{i=1}^n e_i^2
- total variability: SST = Σ_{i=1}^n (y_i − ȳ)^2

The ANOVA decomposition states that SST = SSM + SSR.
17 The ANOVA decomposition The coefficient of determination R^2

The ANOVA decomposition states that SST = SSM + SSR. Note that y_i − ȳ = (y_i − ŷ_i) + (ŷ_i − ȳ). SSM = Σ_{i=1}^n (ŷ_i − ȳ)^2 measures the variation in the responses due to the regression model (explained by the predicted values ŷ_i). Thus, the ratio SSR/SST is the proportion of the variation in the responses that is not explained by the regression model. The ratio R^2 = SSM/SST = 1 − SSR/SST is the proportion of the variation in the responses that is explained by the regression model; it is known as the coefficient of determination. The value of the coefficient of determination satisfies R^2 = r_xy^2 (the squared correlation coefficient). For example, if R^2 = 0.85 the variable x explains 85% of the variation in the response variable y.
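Both identities can be verified numerically. The sketch below (illustrative made-up data) checks that SST = SSM + SSR and that R^2 = r_xy^2.

```python
import math

# Numerical check of the ANOVA decomposition SST = SSM + SSR
# and of R^2 = r_xy^2 on a small illustrative dataset.
x = [1, 2, 3, 4, 5, 6]
y = [1.2, 2.1, 2.8, 4.3, 4.9, 6.2]

n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
syy = sum((yi - ybar) ** 2 for yi in y)

b1 = sxy / sxx
b0 = ybar - b1 * xbar
yhat = [b0 + b1 * xi for xi in x]

SSM = sum((yh - ybar) ** 2 for yh in yhat)   # explained by the model
SSR = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))  # residual
SST = syy                                    # total

R2 = SSM / SST
r_xy = sxy / math.sqrt(sxx * syy)
print(SST, SSM + SSR)    # the two values coincide
print(R2, r_xy ** 2)     # the two values coincide
```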
18 The ANOVA decomposition ANOVA table

Source of variability | SS  | DF    | Mean                  | F ratio
Model                 | SSM | 1     | SSM/1                 | SSM/s_R^2
Residuals/errors      | SSR | n − 2 | SSR/(n − 2) = s_R^2   |
Total                 | SST | n − 1 |                       |
19 The ANOVA decomposition ANOVA hypothesis testing

Hypothesis test: H_0: β_1 = 0 vs. H_1: β_1 ≠ 0. Consider the ratio F = (SSM/1) / (SSR/(n − 2)) = SSM/s_R^2. Under H_0, F follows an F_{1,n−2} distribution. Test at significance level α: reject H_0 if F > F_{1,n−2;α}. How does this result relate to the test based on the Student-t we saw in Lesson 4? They are equivalent: F = T^2.
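The equivalence can be seen numerically: the F statistic equals the square of the T statistic for H_0: β_1 = 0. A small sketch with made-up data:

```python
import math

# Check that the ANOVA F statistic equals T^2, where T is the
# Student-t statistic for H0: beta_1 = 0 in simple linear regression.
x = [1, 2, 3, 4, 5, 6, 7]
y = [2.0, 2.9, 4.2, 4.8, 6.1, 6.9, 8.2]

n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b0 = ybar - b1 * xbar
yhat = [b0 + b1 * xi for xi in x]

SSM = sum((yh - ybar) ** 2 for yh in yhat)
SSR = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))
s2_R = SSR / (n - 2)                       # residual (quasi-)variance

F = SSM / s2_R                             # ANOVA F statistic
T = b1 / math.sqrt(s2_R / sxx)             # t statistic for beta_1
print(F, T ** 2)                           # the two values coincide
```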
20 The ANOVA decomposition Excel output
21 Nonlinear relationships and linearizing transformations Introduction

Consider the case when the deterministic part f(x_i; a, b) of the response in the model y_i = f(x_i; a, b) + u_i, i = 1, ..., n, is a nonlinear function of x that depends on two parameters a and b (for example, f(x; a, b) = a b^x). In some cases we may apply transformations to the data to linearize them; we are then able to apply the linear regression procedure. From the original data (x_i, y_i) we obtain the transformed data (x'_i, y'_i). The parameters β_0 and β_1 corresponding to the linear relation between x'_i and y'_i are transformations of the parameters a and b.
22 Nonlinear relationships and linearizing transformations Linearizing transformations

Examples of linearizing transformations:
- If y = f(x; a, b) = a x^b then log y = log a + b log x. We take y' = log y, x' = log x, β_0 = log a, β_1 = b
- If y = f(x; a, b) = a b^x then log y = log a + (log b) x. We take y' = log y, x' = x, β_0 = log a, β_1 = log b
- If y = f(x; a, b) = 1/(a + bx) then 1/y = a + bx. We take y' = 1/y, x' = x, β_0 = a, β_1 = b
- If y = f(x; a, b) = log(a x^b) then y = log a + b log x. We take y' = y, x' = log x, β_0 = log a, β_1 = b
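As an illustration of the second transformation, the sketch below fits y = a·b^x by regressing log y on x. The data are noise-free and generated from a = 2, b = 1.5, so the parameters are recovered exactly; with real data the recovery would only be approximate.

```python
import math

# Fit y = a * b**x by the linearizing transformation
# log y = log a + (log b) x, i.e. ordinary least squares on (x, log y).
a_true, b_true = 2.0, 1.5
x = [0, 1, 2, 3, 4, 5]
y = [a_true * b_true ** xi for xi in x]    # noise-free toy data

ystar = [math.log(yi) for yi in y]         # transformed responses
n = len(x)
xbar, ybar = sum(x) / n, sum(ystar) / n
beta1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, ystar)) / \
        sum((xi - xbar) ** 2 for xi in x)
beta0 = ybar - beta1 * xbar

# Transform the linear-model estimates back: a = exp(beta0), b = exp(beta1)
a_hat, b_hat = math.exp(beta0), math.exp(beta1)
print(round(a_hat, 6), round(b_hat, 6))    # 2.0 1.5
```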
23 A matrix treatment of linear regression Introduction

Remember the simple linear regression model: y_i = β_0 + β_1 x_i + u_i, i = 1, ..., n. If we write one equation for each one of the observations, we have

y_1 = β_0 + β_1 x_1 + u_1
y_2 = β_0 + β_1 x_2 + u_2
...
y_n = β_0 + β_1 x_n + u_n
24 A matrix treatment of linear regression The model in matrix form

We can write the preceding equations in matrix form as

[y_1]   [β_0 + β_1 x_1]   [u_1]
[y_2] = [β_0 + β_1 x_2] + [u_2]
[...]   [     ...      ]   [...]
[y_n]   [β_0 + β_1 x_n]   [u_n]

And splitting the parameters β from the variables x_i,

[y_1]   [1  x_1]            [u_1]
[y_2] = [1  x_2]  [β_0]  +  [u_2]
[...]   [ ...  ]  [β_1]     [...]
[y_n]   [1  x_n]            [u_n]
25 A matrix treatment of linear regression The regression model in matrix form

We can write the preceding matrix relationship compactly as

y = Xβ + u

where y is the response vector, X is the explanatory variables matrix (or experimental design matrix), β is the vector of parameters, and u is the error vector.
26 The regression model in matrix form Covariance matrix for the errors

We denote by Cov(u) the n × n matrix of covariances of the errors. Its (i, j)-th element is given by cov(u_i, u_j) = 0 if i ≠ j, and cov(u_i, u_i) = Var(u_i) = σ^2. Thus, Cov(u) is the n × n identity matrix I multiplied by σ^2:

           [σ^2   0  ...   0 ]
Cov(u) =   [ 0   σ^2 ...   0 ] = σ^2 I
           [        ...      ]
           [ 0    0  ...  σ^2]
27 The regression model in matrix form Least-squares estimation The least-squares vector parameter estimate ˆβ is the unique solution of the 2 2 matrix equation (check the dimensions) (X T X) ˆβ = X T y, that is, ˆβ = (X T X) 1 X T y. The vector ŷ = (ŷ i ) of response estimates is given by ŷ = X ˆβ and the residual vector is defined as e = y ŷ
28 The multiple linear regression model Introduction Use of the simple linear regression model: predict the value of a response y from the value of an explanatory variable x In many applications we wish to predict the response y from the values of several explanatory variables x 1,..., x k For example: forecast the value of a house as a function of its size, location, layout and number of bathrooms forecast the size of a parliament as a function of the population, its rate of growth, the number of political parties with parliamentary representation, etc.
29 The multiple linear regression model The model

Use of the multiple linear regression model: predict a response y from several explanatory variables x_1, ..., x_k. If we have n observations, for i = 1, ..., n,

y_i = β_0 + β_1 x_i1 + β_2 x_i2 + ... + β_k x_ik + u_i

We assume that the error variables u_i are independent random variables following a N(0, σ^2) distribution.

30 The multiple linear regression model The least-squares fit

We have n observations, and for i = 1, ..., n,

y_i = β_0 + β_1 x_i1 + β_2 x_i2 + ... + β_k x_ik + u_i

We wish to fit to the data (x_i1, x_i2, ..., x_ik, y_i) a hyperplane of the form

ŷ_i = β̂_0 + β̂_1 x_i1 + β̂_2 x_i2 + ... + β̂_k x_ik

The residual for observation i is defined as e_i = y_i − ŷ_i. We obtain the parameter estimates β̂_j as the values that minimize the sum of the squares of the residuals.
31 The multiple linear regression model The model in matrix form

We can write the model as a matrix relationship,

[y_1]   [1  x_11  x_12 ... x_1k]  [β_0]   [u_1]
[y_2] = [1  x_21  x_22 ... x_2k]  [β_1] + [u_2]
[...]   [          ...         ]  [...]   [...]
[y_n]   [1  x_n1  x_n2 ... x_nk]  [β_k]   [u_n]

and in compact form as y = Xβ + u, where y is the response vector, X the explanatory variables matrix (or experimental design matrix), β the vector of parameters and u the error vector.
32 The multiple linear regression model Least-squares estimation

The least-squares parameter estimate β̂ is the unique solution of the (k + 1) × (k + 1) matrix equation (check the dimensions)

(X^T X) β̂ = X^T y,

and as in the k = 1 case (simple linear regression) we have β̂ = (X^T X)^{-1} X^T y. The vector ŷ = (ŷ_i) of response estimates is given by ŷ = X β̂, and the residual vector is defined as e = y − ŷ.
33 The multiple linear regression model Variance estimation

For the multiple linear regression model, an estimator for the error variance σ^2 is the residual (quasi-)variance,

s_R^2 = Σ_{i=1}^n e_i^2 / (n − k − 1),

and this estimator is unbiased. Note that in the simple linear regression case (k = 1) the denominator is n − 2.
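A sketch for k = 2 (made-up data): the normal equations are solved by Gaussian elimination, and σ^2 is then estimated by s_R^2 = SSR/(n − k − 1).

```python
# Least squares for a small multiple regression (k = 2): solve the
# normal equations (X^T X) beta = X^T y with Gaussian elimination,
# then estimate sigma^2 by the residual (quasi-)variance.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
y  = [5.1, 5.9, 10.2, 10.8, 15.9, 16.1]   # illustrative data
n, k = len(y), 2

X = [[1.0, a, b] for a, b in zip(x1, x2)]          # design matrix

# Build X^T X and X^T y
XtX = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(3)]
       for r in range(3)]
Xty = [sum(X[i][r] * y[i] for i in range(n)) for r in range(3)]

# Gaussian elimination with partial pivoting, then back substitution
A = [row[:] + [b] for row, b in zip(XtX, Xty)]     # augmented matrix
for col in range(3):
    piv = max(range(col, 3), key=lambda r: abs(A[r][col]))
    A[col], A[piv] = A[piv], A[col]
    for r in range(col + 1, 3):
        f = A[r][col] / A[col][col]
        for c in range(col, 4):
            A[r][c] -= f * A[col][c]
beta = [0.0] * 3
for r in reversed(range(3)):
    beta[r] = (A[r][3] - sum(A[r][c] * beta[c]
                             for c in range(r + 1, 3))) / A[r][r]

yhat = [sum(X[i][j] * beta[j] for j in range(3)) for i in range(n)]
SSR = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))
s2_R = SSR / (n - k - 1)          # unbiased estimator of sigma^2
print(beta, s2_R)
```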
34 The multiple linear regression model The sampling distribution of β̂

Under the model assumptions, the least-squares estimator β̂ of the parameter vector β follows a multivariate normal distribution. E(β̂) = β (it is an unbiased estimator). The covariance matrix of β̂ is Cov(β̂) = σ^2 (X^T X)^{-1}. We estimate Cov(β̂) by s_R^2 (X^T X)^{-1}. The estimate of Cov(β̂) provides estimates s^2(β̂_j) of the variances Var(β̂_j); s(β̂_j) is the standard error of the estimator β̂_j. If we standardize β̂_j we have

(β̂_j − β_j) / s(β̂_j) ~ t_{n−k−1}   (the Student-t distribution)
35 The multiple linear regression model Inference on the parameters β_j

Confidence interval for the partial coefficient β_j at confidence level 1 − α: β̂_j ± t_{n−k−1;α/2} s(β̂_j). Hypothesis test for H_0: β_j = 0 vs. H_1: β_j ≠ 0 at significance level α: reject H_0 if |T| > t_{n−k−1;α/2}, where T = β̂_j / s(β̂_j) is the test statistic.
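A sketch with made-up data, shown for the k = 1 case so the standard error has a closed form; the critical value t_{4;0.025} ≈ 2.776 is hard-coded for n − k − 1 = 4 degrees of freedom (with a statistics library one would look it up instead).

```python
import math

# 95% confidence interval and t test for the slope beta_1 in a
# simple regression (k = 1), using s(beta_1_hat) = sqrt(s_R^2 / Sxx).
x = [1, 2, 3, 4, 5, 6]
y = [1.1, 2.3, 2.9, 4.2, 4.8, 6.3]   # illustrative data
n, k = len(x), 1

xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b0 = ybar - b1 * xbar

SSR = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
s2_R = SSR / (n - k - 1)
se_b1 = math.sqrt(s2_R / sxx)        # standard error of beta_1_hat

t_crit = 2.776                        # t_{n-k-1; alpha/2}, alpha = 0.05, 4 df
ci = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)
T = b1 / se_b1
print(ci, abs(T) > t_crit)            # reject H0: beta_1 = 0 if True
```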
36 The ANOVA decomposition The multivariate case

ANOVA: ANalysis Of VAriance. When fitting the multiple linear regression model ŷ_i = β̂_0 + β̂_1 x_i1 + ... + β̂_k x_ik to a data set (x_i1, ..., x_ik, y_i), i = 1, ..., n, we may identify three sources of variability in the responses:
- variability associated with the model: SSM = Σ_{i=1}^n (ŷ_i − ȳ)^2, where the initials SS denote "sum of squares" and M refers to the model
- variability of the residuals: SSR = Σ_{i=1}^n (y_i − ŷ_i)^2 = Σ_{i=1}^n e_i^2
- total variability: SST = Σ_{i=1}^n (y_i − ȳ)^2

The ANOVA decomposition states that SST = SSM + SSR.
37 The ANOVA decomposition The coefficient of determination R^2

The ANOVA decomposition states that SST = SSM + SSR. Note that y_i − ȳ = (y_i − ŷ_i) + (ŷ_i − ȳ). SSM = Σ_{i=1}^n (ŷ_i − ȳ)^2 measures the variation in the responses due to the regression model (explained by the predicted values ŷ_i). Thus, the ratio SSR/SST is the proportion of the variation in the responses that is not explained by the regression model. The ratio R^2 = SSM/SST = 1 − SSR/SST is the proportion of the variation in the responses that is explained by the explanatory variables; it is known as the coefficient of multiple determination. For example, if R^2 = 0.85 the variables x_1, ..., x_k explain 85% of the variation in the response variable y.
38 The ANOVA decomposition ANOVA table

Source of variability | SS  | DF        | Mean                        | F ratio
Model                 | SSM | k         | SSM/k                       | (SSM/k)/s_R^2
Residuals/errors      | SSR | n − k − 1 | SSR/(n − k − 1) = s_R^2     |
Total                 | SST | n − 1     |                             |
39 The ANOVA decomposition ANOVA hypothesis testing

Hypothesis test: H_0: β_1 = β_2 = ... = β_k = 0 vs. H_1: β_j ≠ 0 for some j = 1, ..., k. H_0: the response does not depend on any x_j. H_1: the response depends on at least one x_j. Interpretation of β̂_j: keeping everything else the same, a one-unit increase in x_j produces, on average, a change of β̂_j in y. Consider the ratio F = (SSM/k) / (SSR/(n − k − 1)) = (SSM/k)/s_R^2. Under H_0, F follows an F_{k,n−k−1} distribution. Test at significance level α: reject H_0 if F > F_{k,n−k−1;α}.
More informationBivariate data analysis
Bivariate data analysis Categorical data - creating data set Upload the following data set to R Commander sex female male male male male female female male female female eye black black blue green green
More informationMAT2377. Rafa l Kulik. Version 2015/November/26. Rafa l Kulik
MAT2377 Rafa l Kulik Version 2015/November/26 Rafa l Kulik Bivariate data and scatterplot Data: Hydrocarbon level (x) and Oxygen level (y): x: 0.99, 1.02, 1.15, 1.29, 1.46, 1.36, 0.87, 1.23, 1.55, 1.40,
More informationRegression Analysis. Table Relationship between muscle contractile force (mj) and stimulus intensity (mv).
Regression Analysis Two variables may be related in such a way that the magnitude of one, the dependent variable, is assumed to be a function of the magnitude of the second, the independent variable; however,
More informationEcon 3790: Statistics Business and Economics. Instructor: Yogesh Uppal
Econ 3790: Statistics Business and Economics Instructor: Yogesh Uppal Email: yuppal@ysu.edu Chapter 14 Covariance and Simple Correlation Coefficient Simple Linear Regression Covariance Covariance between
More informationEconometrics. 4) Statistical inference
30C00200 Econometrics 4) Statistical inference Timo Kuosmanen Professor, Ph.D. http://nomepre.net/index.php/timokuosmanen Today s topics Confidence intervals of parameter estimates Student s t-distribution
More informationLecture 14 Simple Linear Regression
Lecture 4 Simple Linear Regression Ordinary Least Squares (OLS) Consider the following simple linear regression model where, for each unit i, Y i is the dependent variable (response). X i is the independent
More informationScatter plot of data from the study. Linear Regression
1 2 Linear Regression Scatter plot of data from the study. Consider a study to relate birthweight to the estriol level of pregnant women. The data is below. i Weight (g / 100) i Weight (g / 100) 1 7 25
More informationCorrelation. A statistics method to measure the relationship between two variables. Three characteristics
Correlation Correlation A statistics method to measure the relationship between two variables Three characteristics Direction of the relationship Form of the relationship Strength/Consistency Direction
More informationCorrelation 1. December 4, HMS, 2017, v1.1
Correlation 1 December 4, 2017 1 HMS, 2017, v1.1 Chapter References Diez: Chapter 7 Navidi, Chapter 7 I don t expect you to learn the proofs what will follow. Chapter References 2 Correlation The sample
More informationRegression analysis is a tool for building mathematical and statistical models that characterize relationships between variables Finds a linear
Regression analysis is a tool for building mathematical and statistical models that characterize relationships between variables Finds a linear relationship between: - one independent variable X and -
More informationLecture 9: Linear Regression
Lecture 9: Linear Regression Goals Develop basic concepts of linear regression from a probabilistic framework Estimating parameters and hypothesis testing with linear models Linear regression in R Regression
More informationLecture 2: Linear Models. Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011
Lecture 2: Linear Models Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011 1 Quick Review of the Major Points The general linear model can be written as y = X! + e y = vector
More informationSingle and multiple linear regression analysis
Single and multiple linear regression analysis Marike Cockeran 2017 Introduction Outline of the session Simple linear regression analysis SPSS example of simple linear regression analysis Additional topics
More informationCorrelation and Regression
Correlation and Regression October 25, 2017 STAT 151 Class 9 Slide 1 Outline of Topics 1 Associations 2 Scatter plot 3 Correlation 4 Regression 5 Testing and estimation 6 Goodness-of-fit STAT 151 Class
More informationFinding Relationships Among Variables
Finding Relationships Among Variables BUS 230: Business and Economic Research and Communication 1 Goals Specific goals: Re-familiarize ourselves with basic statistics ideas: sampling distributions, hypothesis
More informationAnalisi Statistica per le Imprese
, Analisi Statistica per le Imprese Dip. di Economia Politica e Statistica 4.3. 1 / 33 You should be able to:, Underst model building using multiple regression analysis Apply multiple regression analysis
More informationECO220Y Simple Regression: Testing the Slope
ECO220Y Simple Regression: Testing the Slope Readings: Chapter 18 (Sections 18.3-18.5) Winter 2012 Lecture 19 (Winter 2012) Simple Regression Lecture 19 1 / 32 Simple Regression Model y i = β 0 + β 1 x
More information1-Way ANOVA MATH 143. Spring Department of Mathematics and Statistics Calvin College
1-Way ANOVA MATH 143 Department of Mathematics and Statistics Calvin College Spring 2010 The basic ANOVA situation Two variables: 1 Categorical, 1 Quantitative Main Question: Do the (means of) the quantitative
More informationSTA121: Applied Regression Analysis
STA121: Applied Regression Analysis Linear Regression Analysis - Chapters 3 and 4 in Dielman Artin Department of Statistical Science September 15, 2009 Outline 1 Simple Linear Regression Analysis 2 Using
More informationy ˆ i = ˆ " T u i ( i th fitted value or i th fit)
1 2 INFERENCE FOR MULTIPLE LINEAR REGRESSION Recall Terminology: p predictors x 1, x 2,, x p Some might be indicator variables for categorical variables) k-1 non-constant terms u 1, u 2,, u k-1 Each u
More informationEssential of Simple regression
Essential of Simple regression We use simple regression when we are interested in the relationship between two variables (e.g., x is class size, and y is student s GPA). For simplicity we assume the relationship
More informationMultiple Linear Regression
Multiple Linear Regression University of California, San Diego Instructor: Ery Arias-Castro http://math.ucsd.edu/~eariasca/teaching.html 1 / 42 Passenger car mileage Consider the carmpg dataset taken from
More informationST Correlation and Regression
Chapter 5 ST 370 - Correlation and Regression Readings: Chapter 11.1-11.4, 11.7.2-11.8, Chapter 12.1-12.2 Recap: So far we ve learned: Why we want a random sample and how to achieve it (Sampling Scheme)
More informationStatistics 512: Solution to Homework#11. Problems 1-3 refer to the soybean sausage dataset of Problem 20.8 (ch21pr08.dat).
Statistics 512: Solution to Homework#11 Problems 1-3 refer to the soybean sausage dataset of Problem 20.8 (ch21pr08.dat). 1. Perform the two-way ANOVA without interaction for this model. Use the results
More informationEcon 3790: Business and Economics Statistics. Instructor: Yogesh Uppal
Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal yuppal@ysu.edu Sampling Distribution of b 1 Expected value of b 1 : Variance of b 1 : E(b 1 ) = 1 Var(b 1 ) = σ 2 /SS x Estimate of
More informationLinear Regression Model. Badr Missaoui
Linear Regression Model Badr Missaoui Introduction What is this course about? It is a course on applied statistics. It comprises 2 hours lectures each week and 1 hour lab sessions/tutorials. We will focus
More informationSTAT 511. Lecture : Simple linear regression Devore: Section Prof. Michael Levine. December 3, Levine STAT 511
STAT 511 Lecture : Simple linear regression Devore: Section 12.1-12.4 Prof. Michael Levine December 3, 2018 A simple linear regression investigates the relationship between the two variables that is not
More informationCorrelation in Linear Regression
Vrije Universiteit Amsterdam Research Paper Correlation in Linear Regression Author: Yura Perugachi-Diaz Student nr.: 2566305 Supervisor: Dr. Bartek Knapik May 29, 2017 Faculty of Sciences Research Paper
More informationLinear Regression. Simple linear regression model determines the relationship between one dependent variable (y) and one independent variable (x).
Linear Regression Simple linear regression model determines the relationship between one dependent variable (y) and one independent variable (x). A dependent variable is a random variable whose variation
More informationStatistics II Lesson 1. Inference on one population. Year 2009/10
Statistics II Lesson 1. Inference on one population Year 2009/10 Lesson 1. Inference on one population Contents Introduction to inference Point estimators The estimation of the mean and variance Estimating
More information