Multiple Regression Analysis


OUTLINE
- Analysis of Data and Model
- Hypothesis Testing
- Dummy Variables
- Research in Finance

ANALYSIS: Types of Data
- Time-series data: trend, seasonal variation, cyclical variation, irregular variation
- Cross-sectional data: a one-dimensional data set, observing many subjects (size, company, counties, etc.) at the same time
- Panel data: a multi-dimensional data set (time-series + cross-sectional data)

Trend Component
- Persistent, overall upward or downward pattern
- Due to population, technology, etc.
- Several years' duration
[Figure: response against time (month, quarter, year)]

Trend Component
- Overall upward or downward movement
- Data taken over a period of years
[Figure: sales against time]

Cyclical Component
- Repeating up-and-down movements
- Due to interactions of factors influencing the economy
- Usually 2-10 years' duration
[Figure: response against time (month, quarter, year), with one cycle marked]

Cyclical Component
- Upward or downward swings
- May vary in length
- Usually lasts 2-10 years
[Figure: sales against time]

Seasonal Component
- Regular pattern of up-and-down fluctuations
- Due to weather, customs, etc.
- Occurs within one year
[Figure: response against time (month, quarter), with summer marked]

Seasonal Component
- Upward or downward swings
- Regular patterns
- Observed within one year
[Figure: sales against time (monthly or quarterly)]

Irregular Component
- Erratic, unsystematic, residual fluctuations
- Due to random variation or unforeseen events (union strike, war)
- Short duration and nonrepeating

[Figure: Time Series Data. SET Index against time, Apr-75 to Jul-17; vertical axis from 0 to 2000]

Cross-Sectional Data

Pooled (Panel) Data

ANALYSIS: Type of Estimator
- Least squares estimator
- Maximum likelihood estimator
$Y_i = \beta_1 + \beta_2 X_{1i} + \beta_3 X_{2i} + u_i$
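As a brief sketch of what the least squares estimator does with the model on this slide: it picks the coefficient values that make the sum of squared residuals as small as possible. The hat notation and the trial values $b_1, b_2, b_3$ are notation assumed here, not taken from the slide.

```latex
\[
(\hat\beta_1, \hat\beta_2, \hat\beta_3)
  = \arg\min_{b_1, b_2, b_3}
    \sum_{i=1}^{n} \bigl( Y_i - b_1 - b_2 X_{1i} - b_3 X_{2i} \bigr)^2
\]
```

If the errors $u_i$ are additionally assumed to be independent and normally distributed with constant variance, the maximum likelihood estimator gives the same coefficient estimates.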

ANALYSIS: Type of Model
- Linear model
- Nonlinear model
$\mathrm{DTAC}_t = \alpha + \beta_1 X_{1t} + \beta_2 X_{2t} + \varepsilon_t$
$Y_t$ = AIS return

ANALYSIS: Fitted Regression on Model
$Y = a + bX$
- Y ~ regressand, response variable, dependent variable, observed variable
- X ~ regressor, independent variable, explanatory variable, predictor variable
Models by type of data:
- Time series: multiple regression, ARMA/ARIMA
- Time series with condition: ARCH/GARCH
- Panel model: pooled or panel model, fixed-effect model, random-effect model

ANALYSIS: Fitted Regression on Model
$Y = a + bX$
When Y is discrete:
- Logit model
- Probit model
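A minimal sketch of how the two discrete-choice models turn the linear index a + bX into a probability that Y = 1: the logit model uses the logistic function and the probit model uses the standard normal CDF. The function names and the coefficient values are hypothetical, chosen only for illustration.

```python
import math


def logit_prob(a, b, x):
    """P(Y = 1 | x) under a logit model: logistic function of a + b*x."""
    return 1.0 / (1.0 + math.exp(-(a + b * x)))


def probit_prob(a, b, x):
    """P(Y = 1 | x) under a probit model: standard normal CDF of a + b*x."""
    z = a + b * x
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical coefficients, not estimates from any data in these slides.
a, b = -1.0, 0.5
for x in (0.0, 2.0, 4.0):
    print(x, round(logit_prob(a, b, x), 3), round(probit_prob(a, b, x), 3))
```

Both functions return 0.5 when the index a + bX is zero and stay strictly between 0 and 1, which is why they suit a 0/1 dependent variable better than a straight line.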

ANALYSIS: Fitted Regression on Model
$Y = a + bX$
When Y and X are dynamic:
- Vector autoregression (VAR)
- Error correction model (ECM)

ANALYSIS: Expansion from Simple Regression to Multiple Regression
Fitted regression model: $Y = a + bX$

Simple linear regression
x is the independent variable; y is the dependent variable.
The regression model is $y = \beta_0 + \beta_1 x + \varepsilon$.
The model has two variables: the independent or explanatory variable, x, and the dependent variable, y, whose variation is to be explained. The relationship between x and y is linear, that is, a straight line. There are two parameters to estimate: the slope of the line, $\beta_1$, and the y-intercept, $\beta_0$ (where the line crosses the vertical axis). $\varepsilon$ is the unexplained, random, or error component. Much more on this later.

Regression line
The regression model is $y = \beta_0 + \beta_1 x + \varepsilon$.
Data on x and y are obtained from a sample. From the sample values of x and y, estimates $b_0$ of $\beta_0$ and $b_1$ of $\beta_1$ are obtained using least squares or another method. The resulting estimate of the model is $\hat{y} = b_0 + b_1 x$. The symbol $\hat{y}$ is termed "y hat" and refers to the predicted values of the dependent variable y that are associated with values of x, given the linear model.
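For reference, when least squares is the method used, the estimates have a closed form. The sample means $\bar{x}$ and $\bar{y}$ and the sample size $n$ are notation introduced here, not taken from the slide.

```latex
\[
b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
           {\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
b_0 = \bar{y} - b_1 \bar{x}
\]
```

The second formula says that the fitted line always passes through the point of means $(\bar{x}, \bar{y})$.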

Income   hrs/week     Income   hrs/week
  8000   38             8000   35
  6400   50            18000   37.5
  2500   15             5400   37
  3000   30            15000   35
  6000   50             3500   30
  5000   38            24000   45
  8000   50             1000   4
  4000   20             8000   37.5
 11000   45             2100   25
 25000   50             8000   46
  4000   20             4000   30
  8800   35             1000   200
  5000   30             2000   200
  7000   43             4800   30

[Figure: Summer Income as a Function of Hours Worked. Scatter plot of income (0 to 30000) against hours per week (0 to 60)]

$\hat{y} = -2461 + 297x$, $R^2 = 0.311$
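The fitted line can be checked against the income table with a short least squares sketch. One assumption is needed: the two rows with 200 hours per week are left out, because the plot's hours axis stops at 60. Under that assumption the result should come out close to an intercept of -2461, a slope of 297 and an R-squared of 0.311. The function name `fit_simple_ols` is illustrative.

```python
# (income, hours per week) pairs transcribed from the income table.
data = [
    (8000, 38), (8000, 35), (6400, 50), (18000, 37.5), (2500, 15),
    (5400, 37), (3000, 30), (15000, 35), (6000, 50), (3500, 30),
    (5000, 38), (24000, 45), (8000, 50), (1000, 4), (4000, 20),
    (8000, 37.5), (11000, 45), (2100, 25), (25000, 50), (8000, 46),
    (4000, 20), (4000, 30), (8800, 35), (1000, 200), (5000, 30),
    (2000, 200), (7000, 43), (4800, 30),
]

# Assumption: drop the two 200 hrs/week rows, which lie outside the
# 0-60 range of the plotted hours axis.
sample = [(inc, hrs) for inc, hrs in data if hrs <= 60]


def fit_simple_ols(pairs):
    """Return (b0, b1, r2) for income = b0 + b1 * hours by least squares."""
    n = len(pairs)
    x_bar = sum(h for _, h in pairs) / n
    y_bar = sum(i for i, _ in pairs) / n
    sxx = sum((h - x_bar) ** 2 for _, h in pairs)
    sxy = sum((h - x_bar) * (i - y_bar) for i, h in pairs)
    syy = sum((i - y_bar) ** 2 for i, _ in pairs)
    b1 = sxy / sxx                  # slope
    b0 = y_bar - b1 * x_bar         # intercept
    r2 = sxy ** 2 / (sxx * syy)     # squared correlation = R^2 here
    return b0, b1, r2


b0, b1, r2 = fit_simple_ols(sample)
print(f"y_hat = {b0:.0f} + {b1:.0f}x, R^2 = {r2:.3f}")
```

The negative intercept has no practical meaning on its own (nobody earns a negative income at zero hours); it only positions the line within the observed range of hours.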

Outliers
- Rare, extreme values may distort the outcome.
- Could be an error.
- Could be a very important observation.
- Outlier: more than 3 standard deviations from the mean.
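The 3-standard-deviation rule can be sketched in a few lines. The example applies it to the hours-per-week column of the income table, including the two 200-hour entries; the function name `flag_outliers` and the use of the sample (n - 1) standard deviation are assumptions made for illustration.

```python
import statistics

# Hours-per-week values from the income table, including the two
# 200-hour entries.
hours = [38, 35, 50, 37.5, 15, 37, 30, 35, 50, 30, 38, 45, 50, 4,
         20, 37.5, 45, 25, 50, 46, 20, 30, 35, 200, 30, 200, 43, 30]


def flag_outliers(values, threshold=3.0):
    """Return the values lying more than `threshold` sample standard
    deviations away from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    return [v for v in values if abs(v - mean) > threshold * sd]


print(flag_outliers(hours))
```

Whether a flagged value is an error or an important observation still has to be decided by looking at the data, as the slide points out.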

[Figure: GPA vs. Time Online. Scatter plot of time online (0 to 12) against GPA (50 to 100)]

[Figure: GPA vs. Time Online. Scatter plot of time online (0 to 9) against GPA (50 to 100)]

[Figure: U-Shaped Relationship. Scatter plot of Y (0 to 12) against X (0 to 12), annotated "OMITTED VARIABLE"; correlation = +0.12]
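A small sketch of why a U-shaped relationship produces a correlation near zero even though Y depends strongly on X. The data are hypothetical (not the points in the figure): Y is an exact function of the squared distance of X from 6, so the linear correlation vanishes while the correlation with the omitted squared term is perfect.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical U-shaped data: y depends on x only through (x - 6)^2.
xs = list(range(0, 13))
ys = [(x - 6) ** 2 / 3 for x in xs]

r_linear = pearson_r(xs, ys)                           # x alone
r_squared_term = pearson_r([(x - 6) ** 2 for x in xs], ys)  # omitted term

print(round(r_linear, 6), round(r_squared_term, 6))
```

A low correlation therefore rules out a straight-line relationship only; it does not rule out a relationship.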

TESTING MULTIPLE HYPOTHESES: F-test
- The F-test is used to test more than one coefficient simultaneously.
- Condition to reject H0: significant if p-value < 0.05.
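As a hedged sketch, the overall F statistic (H0: all slope coefficients are zero) can be computed from R-squared alone. The example reuses R-squared = 0.311 from the summer-income slide; n = 26 observations is an assumption (the table rows whose hours fall inside the plotted range), and the 5% critical value is taken from standard F tables instead of computing a p-value.

```python
def f_statistic(r2, n, k):
    """Overall F statistic for H0: all k slope coefficients are zero.

    r2: R-squared of the regression
    n:  number of observations
    k:  number of slope coefficients (the intercept is not counted)
    """
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


# R^2 = 0.311 comes from the summer-income slide; n = 26 is an assumption.
f_value = f_statistic(0.311, n=26, k=1)

# 5% critical value of F with (1, 24) degrees of freedom, from F tables.
F_CRIT_1_24 = 4.26

print(round(f_value, 2), f_value > F_CRIT_1_24)
```

Rejecting H0 when F exceeds the 5% critical value is the same decision as rejecting when the p-value is below 0.05.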

TESTING MULTIPLE HYPOTHESES: t-test
- The t-test is used to test ONLY one coefficient.
- Condition to reject H0: significant if p-value < 0.05.
Oh my gosh! It fails to reject H0. What does that mean? What should I do? Cut it or leave it?
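A minimal sketch of the t-test on a single slope coefficient, using a small hypothetical data set (five made-up points, not from the slides). It happens to be a case where the test fails to reject H0: the estimated slope is positive, but with so few observations it is not statistically distinguishable from zero at the 5% level.

```python
import math


def slope_t_statistic(xs, ys):
    """Return (b1, se_b1, t) for H0: slope = 0 in y = b0 + b1*x."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    se_b1 = math.sqrt(sse / (n - 2) / sxx)  # standard error of the slope
    return b1, se_b1, b1 / se_b1


# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b1, se_b1, t_value = slope_t_statistic(xs, ys)

# Two-sided 5% critical value of t with n - 2 = 3 degrees of freedom.
T_CRIT_3 = 3.182
reject = abs(t_value) > T_CRIT_3

print(round(b1, 3), round(t_value, 3), reject)
```

Failing to reject H0 does not prove the coefficient is zero; whether to drop the variable also depends on the conceptual framework behind the model.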

Example I: Stock Asset Price Regression
TMB, 1990M01 to 2011M12
RP1, BBL, NPL, FRN, JAS, DJ, NIKKEI

Example II: Hedonic Pricing Model
Dependent variable: Y ~ rental values
[Table: variable definitions]

TESTING MULTIPLE HYPOTHESES: Goodness of Fit Testing $R^2$
- $R^2$ answers how well the regression model actually fits the data.
- In other words, $R^2$ answers how well the model containing the explanatory variables explains variations in the dependent variable.
- $0 \le R^2 \le 1$
[Figure: two fitted regressions, one with $R^2 = 1$ and one with $0 < R^2 < 1$]
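For reference, the usual definition of $R^2$ in a regression with an intercept is sketched below. The symbols $\hat{y}_i$ (fitted value), $\bar{y}$ (sample mean) and the sums are notation assumed here, not taken from the slide.

```latex
\[
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}
               {\sum_{i=1}^{n} (y_i - \bar{y})^2}
\]
```

The numerator is the variation left unexplained by the model and the denominator is the total variation in y, so $R^2 = 1$ means every observation lies exactly on the fitted line.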

TESTING MULTIPLE HYPOTHESES: Problems with Using $R^2$
- $R^2$ cannot be compared across two models that have the same X but a different Y.
- $R^2$ never falls when more regressors are added to the regression: $R_2^2 \ge R_1^2$ (regression 2 being regression 1 plus extra regressors).
- $R^2$ can take values of 0.9 or higher for time series regressions, and hence it is not good at discriminating between models.

TESTING MULTIPLE HYPOTHESES: Adjusted $R^2$
- If an extra regressor is added to the model, k increases and, unless $R^2$ increases by a more than offsetting amount, $\bar{R}^2$ will actually fall.
- If the model contains a lot of variables, significant and insignificant alike, $\bar{R}^2$ can be negative.
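A short sketch of the adjustment, assuming the common textbook convention in which k counts all estimated parameters including the intercept and n is the number of observations. The function name and the numbers in the examples are illustrative, apart from R-squared = 0.311, which is the value on the summer-income slide.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared.

    r2: ordinary R-squared
    n:  number of observations
    k:  number of estimated parameters, intercept included (assumed
        convention; some texts count slope coefficients only)
    """
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k)


# One regressor plus intercept, n = 26 assumed.
base = adjusted_r2(0.311, n=26, k=2)

# Adding a regressor that lifts R^2 only slightly lowers adjusted R^2.
with_extra = adjusted_r2(0.315, n=26, k=3)

# Poor fit with many parameters relative to n: adjusted R^2 goes negative.
negative_case = adjusted_r2(0.05, n=10, k=4)

print(round(base, 4), round(with_extra, 4), round(negative_case, 4))
```

The penalty factor (n - 1)/(n - k) grows with k, which is what makes adjusted R-squared usable for comparing models with different numbers of regressors.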

DUMMY VARIABLE: How to Create a Dummy
- Dummy variables take only the values 0 and 1.
- If a model contains M categories, then only M-1 dummy variables should be created. Otherwise there is a multicollinearity problem.
- The category for which no dummy variable is assigned is known as the base or benchmark category.
- 2 types of dummy variables: intercept dummy vs. slope-change dummy.
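The M-1 rule can be sketched as follows: one 0/1 column is built for every category except the chosen base. The function name `make_dummies` and the quarterly example data are hypothetical.

```python
def make_dummies(values, base):
    """Build 0/1 dummy columns for every category except `base`.

    Returns a dict mapping each non-base category to its dummy column.
    """
    categories = sorted(set(values))
    if base not in categories:
        raise ValueError("base category not present in the data")
    return {
        cat: [1 if v == cat else 0 for v in values]
        for cat in categories
        if cat != base
    }


# Hypothetical quarterly labels: M = 4 categories, so 3 dummies.
quarters = ["Q1", "Q2", "Q3", "Q4", "Q1", "Q2", "Q3", "Q4"]
dummies = make_dummies(quarters, base="Q1")

for name, column in dummies.items():
    print(name, column)
```

Observations in the base category (Q1 here) have zeros in every dummy column, so their level is captured by the intercept. Adding a fourth dummy for Q1 would make the dummies sum to the constant term, which is the multicollinearity problem the slide warns about.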

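DUMMY VARIABLE: 2 Types of Dummy Variables

I. Different intercept
$R_t - R_f = \alpha + \beta_1 (R_M - R_f) + \beta_2 SMB + \beta_3 HML + \beta_4 JAN$
JAN is a dummy: = 1 if January, = 0 otherwise.
[Figure: Y against X. The regression line for January has intercept $\alpha + \beta_4$; the regression line for other months has intercept $\alpha$.]

II. Different slope
$RENT_t = \alpha + \beta_1 LNAGE + \beta_2 NOROOM + \beta_3 DIST + \beta_4 (D \times DIST)$
D is a dummy: = 1 if safe area, = 0 otherwise.
Slope = $\beta_3 + \beta_4 D$
[Figure: RENT against distance. The regression lines for the safe area and for the criminal area start from the same intercept $\alpha$ and differ in slope by $\beta_4$.]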
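To make the difference between the two dummy types concrete, the sketch below returns the intercept and slope implied for each group, holding the other regressors fixed. The function name and all coefficient values are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def implied_line(alpha, beta_x, beta_dummy, d, kind):
    """Return (intercept, slope) of the fitted line in x for dummy value d.

    kind="intercept": y = alpha + beta_x * x + beta_dummy * D
    kind="slope":     y = alpha + beta_x * x + beta_dummy * (D * x)
    """
    if kind == "intercept":
        return alpha + beta_dummy * d, beta_x
    if kind == "slope":
        return alpha, beta_x + beta_dummy * d
    raise ValueError("kind must be 'intercept' or 'slope'")


# Hypothetical coefficients, for illustration only.
alpha, beta_x, beta_dummy = 1.0, 0.5, 0.25

# Intercept dummy: the line shifts up or down, the slope is unchanged.
print(implied_line(alpha, beta_x, beta_dummy, d=1, kind="intercept"))
print(implied_line(alpha, beta_x, beta_dummy, d=0, kind="intercept"))

# Slope dummy: the intercept is unchanged, the line rotates.
print(implied_line(alpha, beta_x, beta_dummy, d=1, kind="slope"))
print(implied_line(alpha, beta_x, beta_dummy, d=0, kind="slope"))
```

In the rental example the slope dummy enters as the product of D and DIST, so the effect of distance on rent is allowed to differ between the two kinds of area.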

STEP BY STEP: Quantitative Analysis (Multiple Regression)
1. Conceptual framework
2. Choose the type of regression (linear vs. nonlinear)
3. Group variables
4. Analyze data (take logarithms or not)
5. Look at the signs of the estimated parameters
6. Test hypotheses
7. Take a look at adjusted $R^2$

RESEARCH PAPER: THREE FACTOR MODEL
Three Factor Model (Fama and French (1992))
Eugene Fama, Kenneth R. French


WORKSHOP #1

WORK ORDERS: Multiple Regression
(1) Use the Three Factor Model to run a multiple regression for your group assignment.
(2) Interpret the F-test and t-test.
(3) Explain adjusted $R^2$.
(4) Create dummy variables:
    o Monthly data: (1) window dressing in June and (2) end-year effect.
    o Annual data: (1) Asian crisis during 1997-1999, (2) subprime crisis during 2008-2010, (3) European debt crisis during 2008-2012.
(5) Redo work orders (1)-(4) with the new model.
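Work order (4) can be sketched with two small helpers. Coding the end-year effect as December is an assumption, as are the function names and the sample dates; the crisis window 1997-1999 is the one given above.

```python
from datetime import date


def monthly_dummies(d):
    """Return (june_dummy, year_end_dummy) for a monthly observation dated d."""
    # Assumption: the end-year effect is coded as December.
    june = 1 if d.month == 6 else 0
    year_end = 1 if d.month == 12 else 0
    return june, year_end


def crisis_dummy(year, start, end):
    """Return 1 if `year` falls inside the window [start, end], else 0."""
    return 1 if start <= year <= end else 0


# Monthly data: one hypothetical year of month-start dates.
months = [date(2010, m, 1) for m in range(1, 13)]
june, year_end = zip(*(monthly_dummies(d) for d in months))

# Annual data: Asian crisis dummy for 1997-1999.
years = list(range(1995, 2002))
asian_crisis = [crisis_dummy(y, 1997, 1999) for y in years]

print(june)
print(year_end)
print(asian_crisis)
```

The other crisis windows follow the same pattern with different start and end years. Because the subprime (2008-2010) and European debt (2008-2012) windows overlap, their dummies will be correlated in annual data.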