LINEAR REGRESSION ANALYSIS. MODULE VIII Lecture Indicator Variables

LINEAR REGRESSION ANALYSIS MODULE VIII Lecture - 7 Indicator Variables

Dr. Shalabh
Department of Mathematics and Statistics
Indian Institute of Technology Kanpur

Indicator variables versus quantitative explanatory variable

The quantitative explanatory variables can be converted into indicator variables. For example, if the ages of persons are grouped as follows:

Group 1: 1 day to 3 years
Group 2: 3 years to 8 years
Group 3: 8 years to 12 years
Group 4: 12 years to 17 years
Group 5: 17 years to 25 years

then the variable "age" can be represented by four different indicator variables.

Since it is difficult to collect data on individual ages, this grouping makes data collection easier. A disadvantage is that some loss of information occurs. For example, if the ages in years are 2, 3, 4, 5, 6, 7 and suppose the indicator variable is defined as

$$D_i = \begin{cases} 1 & \text{if the age of the } i\text{th person is} \ge 5 \text{ years} \\ 0 & \text{if the age of the } i\text{th person is} < 5 \text{ years,} \end{cases}$$

then these values become 0, 0, 0, 1, 1, 1. Now looking at the value 1, one cannot determine whether it corresponds to age 5, 6 or 7 years.
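The loss of information can be seen in a few lines of Python. This is only an illustrative sketch: the helper name `age_indicator` is hypothetical, and it assumes the indicator equals 1 for ages of 5 years or more, which is the reading consistent with the coded values 0, 0, 0, 1, 1, 1.

```python
# Ages (in years) from the example; the 5-year cutoff is taken from the text.
ages = [2, 3, 4, 5, 6, 7]


def age_indicator(age, cutoff=5):
    """Hypothetical helper: 1 if age >= cutoff, else 0 (assumed coding)."""
    return 1 if age >= cutoff else 0


indicators = [age_indicator(a) for a in ages]
print(indicators)  # [0, 0, 0, 1, 1, 1]

# Several distinct ages share the same indicator value, so the original
# ages cannot be recovered from the coded data.
ages_coded_as_one = [a for a, d in zip(ages, indicators) if d == 1]
print(ages_coded_as_one)  # [5, 6, 7]
```

Three different ages collapse onto the single value 1, which is exactly the information that the grouping discards.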

Moreover, if a quantitative explanatory variable is grouped into m categories, then (m - 1) parameters are required, whereas if the original variable is used as such, then only one parameter is required. Treating a quantitative variable as a qualitative variable therefore increases the complexity of the model. The degrees of freedom for error are also reduced. This can affect the inferences if the data set is small. In large data sets, such an effect may be small.

The use of indicator variables does not require any assumption about the functional form of the relationship between study and explanatory variables.
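The effect on the error degrees of freedom can be checked with simple arithmetic. The sketch below assumes a model with an intercept; the sample size, the number of categories and the helper name `error_df` are hypothetical choices made only for illustration.

```python
def error_df(n_obs, n_slope_params):
    """Error degrees of freedom for a model with an intercept (assumed)."""
    return n_obs - (n_slope_params + 1)


n_obs = 12  # hypothetical small sample
m = 5       # hypothetical number of categories after grouping

df_quantitative = error_df(n_obs, 1)     # one slope for the original variable
df_indicator = error_df(n_obs, m - 1)    # m - 1 indicator coefficients

print(df_quantitative, df_indicator)  # 10 7
```

With only 12 observations, grouping costs three error degrees of freedom out of ten; with thousands of observations the same loss would be negligible.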

Regression analysis and analysis of variance

The analysis of variance is usually used in analyzing the data from designed experiments. There is a connection between the statistical tools used in analysis of variance and regression analysis. We consider the case of analysis of variance in one-way classification and establish its relation with regression analysis.

One-way classification

Let there be $k$ samples, each of size $n$, from $k$ normally distributed populations $N(\mu_i, \sigma^2)$, $i = 1, 2, \ldots, k$. The populations differ only in their means, but they have the same variance $\sigma^2$. This can be expressed as

$$\begin{aligned}
y_{ij} &= \mu_i + \varepsilon_{ij}, \quad i = 1, 2, \ldots, k; \; j = 1, 2, \ldots, n \\
&= \mu + (\mu_i - \mu) + \varepsilon_{ij} \\
&= \mu + \tau_i + \varepsilon_{ij}
\end{aligned}$$

where $y_{ij}$ is the $j$th observation for the $i$th fixed treatment effect $\tau_i = \mu_i - \mu$ or factor level, $\mu$ is the general mean effect, and the $\varepsilon_{ij}$ are identically and independently distributed random errors following $N(0, \sigma^2)$. Note that

$$\tau_i = \mu_i - \mu, \qquad \sum_{i=1}^{k} \tau_i = 0.$$

The null hypothesis is

$$H_0 : \tau_1 = \tau_2 = \ldots = \tau_k = 0$$

and the alternative hypothesis is

$$H_1 : \tau_i \ne 0 \text{ for at least one } i.$$

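Employing the method of least squares, we obtain the estimators of $\mu$ and $\tau_i$ as follows:

$$S = \sum_{i=1}^{k} \sum_{j=1}^{n} \varepsilon_{ij}^2 = \sum_{i=1}^{k} \sum_{j=1}^{n} (y_{ij} - \mu - \tau_i)^2$$

$$\frac{\partial S}{\partial \mu} = 0 \;\Rightarrow\; \hat{\mu} = \frac{1}{nk} \sum_{i=1}^{k} \sum_{j=1}^{n} y_{ij} = \bar{y}$$

$$\frac{\partial S}{\partial \tau_i} = 0 \;\Rightarrow\; \hat{\tau}_i = \frac{1}{n} \sum_{j=1}^{n} y_{ij} - \hat{\mu} = \bar{y}_i - \bar{y}$$

where $\bar{y}_i = \frac{1}{n} \sum_{j=1}^{n} y_{ij}$.

Based on this, the corresponding test statistic is

$$F_0 = \frac{\dfrac{n}{k-1} \sum_{i=1}^{k} (\bar{y}_i - \bar{y})^2}{\dfrac{1}{k(n-1)} \sum_{i=1}^{k} \sum_{j=1}^{n} (y_{ij} - \bar{y}_i)^2}$$

which follows the F-distribution with $k - 1$ and $k(n - 1)$ degrees of freedom when the null hypothesis is true. The decision rule is to reject $H_0$ whenever $F_0 \ge F_\alpha(k - 1, k(n - 1))$, and it is then concluded that the $k$ treatment means are not identical.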
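The estimators and the test statistic can be computed directly from their definitions. The following is a minimal sketch on made-up balanced data (3 treatments, 4 observations each); the data values and variable names are assumptions for illustration, and the comparison with the tabulated critical value of the F-distribution is left out.

```python
# Hypothetical balanced data: k = 3 treatments, n = 4 observations each.
groups = [
    [4.0, 5.0, 6.0, 5.0],
    [7.0, 8.0, 9.0, 8.0],
    [1.0, 2.0, 3.0, 2.0],
]
k = len(groups)
n = len(groups[0])

# Least squares estimators: mu-hat is the grand mean and
# tau-hat_i is the i-th treatment mean minus the grand mean.
grand_mean = sum(sum(g) for g in groups) / (n * k)
group_means = [sum(g) / n for g in groups]
tau_hat = [m - grand_mean for m in group_means]

# Between-treatment and within-treatment sums of squares.
ss_between = n * sum(t ** 2 for t in tau_hat)
ss_within = sum(
    (y - m) ** 2 for g, m in zip(groups, group_means) for y in g
)

# F statistic with (k - 1) and k(n - 1) degrees of freedom.
f0 = (ss_between / (k - 1)) / (ss_within / (k * (n - 1)))

print(grand_mean, tau_hat)  # 5.0 [0.0, 3.0, -3.0]
print(round(f0, 4))         # 54.0
```

The estimated treatment effects sum to zero, as the constraint on the $\tau_i$ requires. The null hypothesis would be rejected if the computed value is at least the critical value of the F-distribution with 2 and 9 degrees of freedom at the chosen level.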

Connection with regression

To illustrate the connection between fixed effect one-way analysis of variance and regression, suppose there are 3 treatments so that the model becomes

$$y_{ij} = \mu + \tau_i + \varepsilon_{ij}, \quad i = 1, 2, 3; \; j = 1, 2, \ldots, n.$$

There are 3 treatments, which are the three levels of a qualitative factor. For example, the temperature can have three possible levels: low, medium and high. They can be represented by two indicator variables as

$$D_1 = \begin{cases} 1 & \text{if the observation is from treatment 1} \\ 0 & \text{otherwise,} \end{cases} \qquad
D_2 = \begin{cases} 1 & \text{if the observation is from treatment 2} \\ 0 & \text{otherwise.} \end{cases}$$

The regression model can be rewritten as

$$y_{ij} = \beta_0 + \beta_1 D_{1j} + \beta_2 D_{2j} + \varepsilon_{ij}, \quad i = 1, 2, 3; \; j = 1, 2, \ldots, n$$

where
$D_{1j}$: value of $D_1$ for the $j$th observation with the 1st treatment,
$D_{2j}$: value of $D_2$ for the $j$th observation with the 2nd treatment.

Note that
the parameters in the regression model are $\beta_0, \beta_1, \beta_2$;
the parameters in the analysis of variance model are $\mu, \tau_1, \tau_2, \tau_3$.

We establish a relationship between the two sets of parameters.
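The coding of three treatments by two indicator variables can be sketched as a small function. The name `code_treatment` and the use of the integer labels 1, 2, 3 for the treatments are assumptions made for this illustration.

```python
def code_treatment(treatment):
    """Return (D1, D2) for a treatment label in {1, 2, 3}.

    Hypothetical helper: treatment 3 gets (0, 0), so it acts as the
    reference level.
    """
    if treatment not in (1, 2, 3):
        raise ValueError("treatment must be 1, 2 or 3")
    d1 = 1 if treatment == 1 else 0
    d2 = 1 if treatment == 2 else 0
    return d1, d2


for t in (1, 2, 3):
    print(t, code_treatment(t))
# 1 (1, 0)
# 2 (0, 1)
# 3 (0, 0)
```

No third indicator is needed: an observation with both indicators equal to 0 can only come from treatment 3.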

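Suppose treatment 1 is used on the $j$th observation, so $D_{1j} = 1$, $D_{2j} = 0$ and

$$\begin{aligned}
y_{1j} &= \beta_0 + \beta_1 \cdot 1 + \beta_2 \cdot 0 + \varepsilon_{1j} \\
&= \beta_0 + \beta_1 + \varepsilon_{1j}.
\end{aligned}$$

In the case of the analysis of variance model, this is represented as

$$\begin{aligned}
y_{1j} &= \mu + \tau_1 + \varepsilon_{1j} \\
&= \mu_1 + \varepsilon_{1j} \quad \text{where } \mu_1 = \mu + \tau_1
\end{aligned}$$

$$\Rightarrow \beta_0 + \beta_1 = \mu_1.$$

If treatment 2 is applied on the $j$th observation, then

- in the regression model set up, $D_{1j} = 0$, $D_{2j} = 1$ and

$$\begin{aligned}
y_{2j} &= \beta_0 + \beta_1 \cdot 0 + \beta_2 \cdot 1 + \varepsilon_{2j} \\
&= \beta_0 + \beta_2 + \varepsilon_{2j}.
\end{aligned}$$

- in the analysis of variance model set up,

$$\begin{aligned}
y_{2j} &= \mu + \tau_2 + \varepsilon_{2j} \\
&= \mu_2 + \varepsilon_{2j} \quad \text{where } \mu_2 = \mu + \tau_2
\end{aligned}$$

$$\Rightarrow \beta_0 + \beta_2 = \mu_2.$$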

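When treatment 3 is used on the $j$th observation, then

- in the regression model set up, $D_{1j} = D_{2j} = 0$ and

$$\begin{aligned}
y_{3j} &= \beta_0 + \beta_1 \cdot 0 + \beta_2 \cdot 0 + \varepsilon_{3j} \\
&= \beta_0 + \varepsilon_{3j}.
\end{aligned}$$

- in the analysis of variance model set up,

$$\begin{aligned}
y_{3j} &= \mu + \tau_3 + \varepsilon_{3j} \\
&= \mu_3 + \varepsilon_{3j} \quad \text{where } \mu_3 = \mu + \tau_3
\end{aligned}$$

$$\Rightarrow \beta_0 = \mu_3.$$

So finally, there are the following three relationships:

$$\beta_0 + \beta_1 = \mu_1, \qquad \beta_0 + \beta_2 = \mu_2, \qquad \beta_0 = \mu_3$$

$$\Rightarrow \beta_0 = \mu_3, \qquad \beta_1 = \mu_1 - \mu_3, \qquad \beta_2 = \mu_2 - \mu_3.$$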
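These relationships can be checked numerically: fitting the indicator-variable regression by ordinary least squares should return the sample mean of treatment 3 as the intercept and the differences of sample means as the two slopes. The sketch below uses made-up data and a small hand-written solver for the normal equations; the data, the function name `solve_linear_system` and the variable names are all assumptions for illustration.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    size = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(size):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Eliminate this column from every other row.
        for r in range(size):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * p for x, p in zip(aug[r], aug[col])]
    return [aug[i][size] / aug[i][i] for i in range(size)]


# Hypothetical data: treatment label -> observations.
data = {
    1: [4.0, 5.0, 6.0, 5.0],
    2: [7.0, 8.0, 9.0, 8.0],
    3: [1.0, 2.0, 3.0, 2.0],
}

# Design rows (1, D1, D2); treatment 3 is coded (0, 0).
rows, ys = [], []
for treatment, values in data.items():
    for y in values:
        rows.append([1.0, float(treatment == 1), float(treatment == 2)])
        ys.append(y)

# Normal equations (X'X) beta = X'y.
p = 3
xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
beta = solve_linear_system(xtx, xty)

means = {t: sum(v) / len(v) for t, v in data.items()}
print([round(b, 6) for b in beta])                          # [2.0, 3.0, 6.0]
print(means[3], means[1] - means[3], means[2] - means[3])   # 2.0 3.0 6.0
```

The fitted intercept equals the sample mean of treatment 3, and each slope equals the difference between the corresponding treatment mean and the mean of treatment 3, in line with the three relationships derived above.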

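In general, if there are $k$ treatments, then $(k - 1)$ indicator variables are needed. The regression model is given by

$$y_{ij} = \beta_0 + \beta_1 D_{1j} + \beta_2 D_{2j} + \ldots + \beta_{k-1} D_{k-1,j} + \varepsilon_{ij}, \quad i = 1, 2, \ldots, k; \; j = 1, 2, \ldots, n$$

where

$$D_{ij} = \begin{cases} 1 & \text{if the } j\text{th observation gets the } i\text{th treatment} \\ 0 & \text{otherwise.} \end{cases}$$

In this case, the relationship is

$$\beta_0 = \mu_k, \qquad \beta_i = \mu_i - \mu_k, \quad i = 1, 2, \ldots, k - 1.$$

So $\beta_0$ always estimates the mean of the $k$th treatment, and $\beta_i$ estimates the difference between the means of the $i$th treatment and the $k$th treatment.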
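The general relationship can be written as a pair of small functions that convert between treatment means and regression coefficients. This is an illustrative sketch only: the function names and the numerical means are hypothetical, and the last treatment is taken as the one coded with all indicators equal to 0.

```python
def indicator_row(treatment, k):
    """k - 1 indicators for a treatment in 1..k; treatment k is all zeros."""
    return [1 if treatment == i else 0 for i in range(1, k)]


def betas_from_means(means):
    """Map (mu_1, ..., mu_k) to (beta_0, beta_1, ..., beta_{k-1})."""
    mu_k = means[-1]
    return [mu_k] + [m - mu_k for m in means[:-1]]


def mean_from_betas(betas, treatment):
    """Recover mu_i as beta_0 plus the coefficient picked out by the indicators."""
    k = len(betas)
    row = indicator_row(treatment, k)
    return betas[0] + sum(b * d for b, d in zip(betas[1:], row))


means = [5.0, 8.0, 2.0, 10.0]  # hypothetical treatment means, k = 4
betas = betas_from_means(means)
print(betas)  # [10.0, -5.0, -2.0, -8.0]

recovered = [mean_from_betas(betas, t) for t in range(1, len(means) + 1)]
print(recovered)  # [5.0, 8.0, 2.0, 10.0]
```

The round trip shows that the two parameterizations carry the same information: $k$ treatment means on one side, and an intercept with $k - 1$ differences on the other.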