Multivariate Linear Models

Taleb Ahmad
CASE - Center for Applied Statistics and Economics
Humboldt-Universität zu Berlin

Motivation

Multivariate regression models can accommodate many explanatory variables that simultaneously affect the dependent variables. The hypothesis being tested is that the set of predictor variables has a joint linear effect on the set of response variables.

The basic assumptions of the multivariate regression model are:
- multivariate normality of the residuals
- homogeneous variances of the residuals conditional on the predictors
- a common covariance structure across observations
- independent observations

Problems that may arise:
- Heteroscedasticity: the residuals have unequal variances across the values of the predictor variables.
- Multicollinearity: the explanatory variables are highly correlated with one another.
- Autocorrelation: the error terms are correlated with one another, which violates the assumption of zero correlation.
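Two of these problems can be screened with very little code. The following is a minimal sketch, assuming toy data that are not taken from the Boston housing set: a pairwise Pearson correlation as a rough multicollinearity check, and the Durbin-Watson statistic as a check for first-order autocorrelation of residuals. The function names `pearson` and `durbin_watson` are illustrative and do not come from the slides.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def durbin_watson(residuals):
    """Durbin-Watson statistic; values near 2 suggest no first-order autocorrelation."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Hypothetical toy data (assumption, for illustration only).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.1, 3.9, 6.2, 7.8, 10.1]        # roughly 2 * x1, so strongly collinear
resid = [0.5, -0.4, 0.6, -0.5, 0.4]    # alternating signs: negative autocorrelation

print("correlation of x1 and x2:", round(pearson(x1, x2), 3))
print("Durbin-Watson of resid:", round(durbin_watson(resid), 3))
```

A correlation close to 1 or -1 between two predictors hints at multicollinearity, and a Durbin-Watson value well away from 2 hints at autocorrelated errors. Heteroscedasticity is not covered by this sketch; it is usually inspected with a plot of residuals against fitted values.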

Outline
1. Motivation
2. The Boston housing data
3. Multivariate linear models for the Boston housing data
4. References

The Boston housing data

Aim: explain the variation of the price $X_{14}$ by the variation of the other 13 variables in the Boston housing data. Variables: see the variable description.

Exploratory analysis of relationships:
- visual inspection
- skewness, kurtosis and outliers
- distribution: Q-Q plots and the Kolmogorov-Smirnov test
- transformation: Z and log transformations to improve normality
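The skewness and kurtosis checks can be sketched as follows. This is a minimal example that assumes the simple moment-based estimators and the non-excess kurtosis convention, under which a normal distribution has kurtosis 3. The slides do not state which estimator was used, and the data below are hypothetical.

```python
def central_moment(values, k):
    """k-th central moment with divisor n."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** k for v in values) / n

def skewness(values):
    """Moment-based skewness; 0 for a symmetric sample."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def kurtosis(values):
    """Non-excess kurtosis; a normal distribution has the value 3."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2

# Hypothetical samples (assumption, for illustration only).
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
right_skewed = [1.0, 1.0, 2.0, 2.0, 3.0, 20.0]

print("symmetric:", skewness(symmetric), kurtosis(symmetric))
print("right-skewed:", skewness(right_skewed), kurtosis(right_skewed))
```

A clearly positive skewness, as in the second sample, is the typical situation in which a log transformation is tried.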

Figure: Boxplots of the original and of the transformed variables for the Boston housing data

Figure: Q-Q plots of the original variables (left to right)

Figure: Scatterplot matrices of $X_1$ to $X_6$ and of $X_7$ to $X_{14}$, each with $X_{14}$

Figure: Scatterplot matrix for all variables

Descriptive statistics

| Variable | Mean | Median | Std. dev. | Skewness | Kurtosis |
|---|---|---|---|---|---|
| X1 | 3.61 | 0.25 | 8.60 | 5.19 | 39.59 |
| X2 | 11.36 | 0 | 23.32 | 2.21 | 6.95 |
| X3 | 11.13 | 9.69 | 6.86 | 0.29 | 1.75 |
| X4 | 0.06 | 0 | 0.25 | 3.38 | 12.48 |
| X5 | 0.55 | 0.53 | 0.11 | 0.72 | 2.91 |
| X6 | 6.28 | 6.20 | 0.70 | 0.40 | 4.84 |
| X7 | 68.57 | 77.5 | 28.14 | -0.59 | 2.02 |
| X8 | 3.79 | 3.20 | 2.10 | 1.00 | 3.45 |
| X9 | 9.54 | 5 | 8.70 | 0.99 | 2.12 |
| X10 | 408.24 | 330 | 168.54 | 0.66 | 1.84 |
| X11 | 18.45 | 19.05 | 2.16 | -0.79 | 2.69 |
| X12 | 356.67 | 391.44 | 91.29 | -2.87 | 10.10 |
| X13 | 12.65 | 11.36 | 7.14 | 0.90 | 3.46 |
| X14 | 22.53 | 21.2 | 9.19 | 1.10 | 4.45 |

Table 1: Summary statistics: Boston data

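Variable transformation

Transformations, where $\tilde{X}_j$ denotes the transformed version of $X_j$:

$$
\begin{aligned}
\tilde{X}_1 &= \log(X_1) & \tilde{X}_8 &= \log(X_8) \\
\tilde{X}_2 &= \log(X_2/10) & \tilde{X}_9 &= \log(X_9) \\
\tilde{X}_3 &= X_3 \ \text{(normal)} & \tilde{X}_{10} &= \log(X_{10}) \\
\tilde{X}_4 &= X_4 \ \text{(binary)} & \tilde{X}_{11} &= \exp(0.4\,X_{11})/10^2 \\
\tilde{X}_5 &= \log(X_5) & \tilde{X}_{12} &= X_{12}/10^2 \\
\tilde{X}_6 &= \log(X_6) & \tilde{X}_{13} &= \sqrt{X_{13}} \\
\tilde{X}_7 &= (X_7)^2/10^2 & \tilde{X}_{14} &= \log(X_{14})
\end{aligned}
$$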
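The log and Z transformations named earlier can be sketched in a few lines. This is a minimal example on hypothetical values, assuming that the Z transformation uses the sample standard deviation (divisor n - 1); the slides do not state which divisor was used.

```python
import math
import statistics

def z_transform(values):
    """Standardize to mean 0 and sample standard deviation 1."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # divisor n - 1 (assumption)
    return [(v - mean) / sd for v in values]

# Hypothetical right-skewed values (assumption, not Boston data).
raw = [1.0, 10.0, 100.0]
logged = [math.log(v) for v in raw]   # the log pulls in the long right tail
z = z_transform(logged)

print([round(v, 3) for v in z])
print("sum of squares:", round(sum(v ** 2 for v in z), 3))
```

With the n - 1 divisor, the standardized values always have a sum of squares of n - 1, which is a quick way to recognize a standardized variable in an ANOVA table.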

Figure: Q-Q plots of the transformed variables (left to right)

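MLR models

Multivariate linear regression models:

$$X = \alpha_0 + \alpha_1 X_1 + \alpha_2 X_2 + \dots + \alpha_k X_k + \varepsilon$$

Estimate:

$$\tilde{X}_{14} = \alpha_0 + \sum_{j=1}^{13} \alpha_j \tilde{X}_j + \varepsilon$$

where $\tilde{X}_j$ denotes the transformed version of the variable $X_j$, $j = 1, \dots, 14$.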

Regression estimation: forward selection method

| Step | Multiple R | $R^2$ | F | Sig. F | Variable entered |
|---|---|---|---|---|---|
| 1 | 0.7851 | 0.6164 | 809.856 | 0.000 | X13 |
| 2 | 0.8116 | 0.6587 | 485.399 | 0.000 | X6 |
| 3 | 0.8271 | 0.6841 | 362.315 | 0.000 | X11 |
| 4 | 0.8443 | 0.7128 | 310.933 | 0.000 | X8 |
| 5 | 0.8535 | 0.7285 | 268.341 | 0.000 | X5 |
| 6 | 0.8573 | 0.7350 | 230.723 | 0.000 | X12 |
| 7 | 0.8607 | 0.7408 | 203.377 | 0.000 | X3 |
| 8 | 0.8649 | 0.7480 | 184.408 | 0.000 | X4 |
| 9 | 0.8663 | 0.7504 | 165.707 | 0.000 | X1 |
| 10 | 0.8689 | 0.7550 | 152.552 | 0.000 | X10 |
| 11 | 0.8705 | 0.7578 | 140.542 | 0.000 | X9 |
| 12 | 0.8721 | 0.7605 | 130.451 | 0.000 | X2 |

Table 2: Forward selection
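The idea behind forward selection can be sketched as a greedy loop: at each step, add the predictor that raises $R^2$ the most. The sketch below is not the procedure used for the table. It assumes a simple stopping rule based on a minimum gain in $R^2$ (the hypothetical parameter `min_gain`), whereas statistical packages typically use an F-to-enter or p-value criterion. The data and all function names are illustrative, and the small linear solver does not handle singular designs.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def r_squared(columns, y):
    """R^2 of an OLS fit of y on an intercept plus the given columns."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(p)]
           for a in range(p)]
    xty = [sum(row[a] * y[i] for i, row in enumerate(design)) for a in range(p)]
    beta = solve(xtx, xty)  # normal equations
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in design]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def forward_selection(predictors, y, min_gain=0.01):
    """Greedily add the predictor with the largest R^2; stop on small gains."""
    selected, history, best_r2 = [], [], 0.0
    remaining = list(predictors)
    while remaining:
        scores = {name: r_squared([predictors[s] for s in selected + [name]], y)
                  for name in remaining}
        best = max(scores, key=scores.get)
        if scores[best] - best_r2 < min_gain:
            break
        selected.append(best)
        remaining.remove(best)
        best_r2 = scores[best]
        history.append((best, best_r2))
    return history

# Hypothetical data (assumption): y depends on A and B only, C is irrelevant.
predictors = {
    "A": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "B": [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
    "C": [0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0],
}
y = [2 * a + b for a, b in zip(predictors["A"], predictors["B"])]

history = forward_selection(predictors, y)
for step, (name, r2) in enumerate(history, start=1):
    print(step, name, round(r2, 4))
```

Note that B explains nothing on its own here but becomes useful once A is in the model, which is why each step refits the model with the already selected predictors.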

Regression estimation: forward selection method, ANOVA

| Source | SS | df | MSS | F-test | P-value |
|---|---|---|---|---|---|
| Regression | 384.050 | 12 | 32.004 | 130.451 | 0.0000 |
| Residuals | 120.950 | 493 | 0.245 | | |
| Total variation | 505 | 505 | 1.000 | | |

Multiple R = 0.87, $R^2$ = 0.76, adjusted $R^2$ = 0.75, standard error = 0.49

Table 3: Forward selection, ANOVA
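The quantities in Table 3 follow from the two sums of squares, the number of observations (506) and the number of selected predictors (12). The sketch below recomputes them with the standard ANOVA formulas; the variable names are illustrative.

```python
import math

n, k = 506, 12                        # observations, selected predictors
ss_reg, ss_res = 384.050, 120.950     # sums of squares from Table 3

df_reg = k
df_res = n - k - 1                    # residual degrees of freedom
ms_reg = ss_reg / df_reg              # mean square of the regression
ms_res = ss_res / df_res              # mean square of the residuals
f_stat = ms_reg / ms_res              # F-test statistic

r2 = ss_reg / (ss_reg + ss_res)                 # coefficient of determination
adj_r2 = 1 - (1 - r2) * (n - 1) / df_res        # adjusted for model size
std_error = math.sqrt(ms_res)                   # residual standard error

print("df residual:", df_res)
print("F:", round(f_stat, 3))
print("R^2:", round(r2, 4), "adjusted R^2:", round(adj_r2, 4))
print("standard error:", round(std_error, 4))
```

The results agree with the reported F-test value and, up to rounding, with the reported $R^2$, adjusted $R^2$ and standard error.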

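Regression estimation: parameters

| Parameter | Beta | SE | Std. Beta | t-test | P-value | Variable |
|---|---|---|---|---|---|---|
| $\alpha_0$ | 0.00 | 0.02 | 0.00 | 0.00 | 1.00 | Constant |
| $\alpha_1$ | 0.104 | 0.064 | 0.10 | 1.73 | 0.08 | X1 |
| $\alpha_2$ | 0.07 | 0.03 | 0.07 | 2.33 | 0.01 | X2 |
| $\alpha_3$ | -0.10 | 0.04 | -0.10 | -2.51 | 0.01 | X3 |
| $\alpha_4$ | 0.07 | 0.02 | 0.07 | 3.37 | 0.00 | X4 |
| $\alpha_5$ | -0.21 | 0.05 | -0.21 | -4.14 | 0.00 | X5 |
| $\alpha_6$ | 0.21 | 0.02 | 0.21 | 7.06 | 0.00 | X6 |
| $\alpha_7$ | -0.44 | 0.04 | -0.44 | -9.60 | 0.00 | X8 |
| $\alpha_8$ | 0.12 | 0.04 | 0.12 | 2.58 | 0.01 | X9 |
| $\alpha_9$ | -0.20 | 0.04 | -0.20 | -4.42 | 0.00 | X10 |
| $\alpha_{10}$ | -0.13 | 0.02 | -0.13 | -5.14 | 0.00 | X11 |
| $\alpha_{11}$ | 0.09 | 0.02 | 0.09 | 3.54 | 0.00 | X12 |
| $\alpha_{12}$ | -0.57 | 0.03 | -0.57 | -15.73 | 0.00 | X13 |

Table 4: Forward selection, parameter estimates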

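Regression estimation: interpretation

$R^2 = 0.76$ indicates that about 76% of the variation of $\tilde{X}_{14}$ is explained by the model

$$\tilde{X}_{14} = \alpha_0 + \sum_{j=1}^{13} \alpha_j \tilde{X}_j + \varepsilon$$

where $\tilde{X}_j$ denotes the transformed version of the variable $X_j$.

The p-values in Table 4 indicate that the variables X1, X2, X3 and X9 (coefficients $\alpha_1$, $\alpha_2$, $\alpha_3$ and $\alpha_8$) have the largest p-values (0.08, 0.01, 0.01 and 0.01) and thus the weakest evidence of an influence on changes in $\tilde{X}_{14}$, the log price of the houses. Only X1 is not significant at the 5% level.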

The Boston housing data comprise 506 observations, one for each census district of the Boston metropolitan area.

References

A. Handl. Multivariate Analysemethoden. Springer, 2002.
J. Schira. Statistische Methoden der BWL und VWL. Pearson, 2002.
H. Joe. Multivariate Models and Dependence Concepts. Chapman & Hall, London, 1997.
W. Härdle and L. Simar. Applied Multivariate Statistical Analysis. Springer, 2003.