Refresher course Regression Analysis


1 Refresher course Regression Analysis
Ursina Kuhn, Swiss Household Panel (SHP), FORS, 3.6.9, University of Lausanne

2 Aim and content of the course
Refresher course on linear regression:
- What is a regression?
- How do we obtain regression coefficients?
- How to interpret regression coefficients?
- Inference from sample to population of interest (significance tests)
- Assumptions of linear regression
- Consequences when assumptions are violated
- Regression with panel data

3 What is a regression?
A regression is a statistical method for studying the relationship between a single dependent variable and one or more independent variables.
Simplest form: a linear relationship between a dependent and one independent variable for a given set of observations (bivariate linear regression).

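4 Linear regression: fitting a line
[Figure: scatter plot of Y against X with a fitted line, marking an observed value y_i, its predicted value ŷ_i, the residual e_i at x_i, and the slope as the change in Y per unit of X.]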

5 [Figure: scatter plot of yearly income from employment against number of years spent in paid work.]

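6 [Figure: scatter plot of yearly income from employment against number of years spent in paid work, with the fitted line; a is the intercept and b the change in income per unit of x.]
Regression line: $\hat{y}_i = a + b x_i$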

7 [Figure: scatter plot of yearly income from employment against number of years spent in paid work, with the fitted line.]
Regression line: $\hat{y}_i = a + b x_i$
Estimated regression equation: $y_i = a + b x_i + e_i$

8 [Figure: scatter plot of yearly income from employment against number of years spent in paid work.]
How to find the line? Minimise the squared errors (ordinary least squares, OLS).
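
As a minimal sketch of how the least-squares line can be obtained, the following Python snippet computes the intercept a and the slope b in closed form. The function name `ols_fit` and the data values are made up for illustration; they are not taken from the SHP.

```python
def ols_fit(x, y):
    """Return intercept a and slope b that minimise the sum of squared residuals."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Co-variation of x and y, and variation of x, around their means.
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    b = sxy / sxx
    a = y_mean - b * x_mean
    return a, b

# Hypothetical data: years in paid work and yearly income (made-up numbers).
years = [1, 5, 10, 15, 20, 25]
income = [42000, 47000, 55000, 58000, 66000, 70000]

a, b = ols_fit(years, income)
residuals = [yi - (a + b * xi) for xi, yi in zip(years, income)]
print(f"intercept a = {a:.1f}, slope b = {b:.1f}")
print(f"sum of residuals = {sum(residuals):.6f}")
```

With an intercept in the model, the residuals of the fitted line sum to zero up to rounding, which is a quick check that the line was computed correctly.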

9 Interpreting the (linear) regression equation
Estimated regression equation: $y_i = a + b x_i + e_i$
In the social sciences, a regression is generally used to represent a causal process.
- y is the dependent variable
- x is the independent variable (also called predictor or regressor)
- a is the intercept (the predicted value of Y if X equals zero)
- b is the regression coefficient; it measures the effect of the independent variable on Y (the slope of the line)
- e is the part of y not explained by the causal model (residual); it can consist of omitted variables, measurement errors, stochastic shocks and disturbances

10 Multivariate regression
- Effect of x holding all other x's constant
- Portion of y explained by x that is not explained by the other x's
Bivariate model: $y_i = a + b x_i + e_i$
Multivariate model: $y_i = a + b_1 x_{1i} + b_2 x_{2i} + b_3 x_{3i} + e_i$
Example: gender wage gap (sample: full-time employed, yearly salary between a lower and an upper bound in CHF)
Bivariate: $\text{salary}_i = a + b \cdot \text{sex}_i + e_i$ (sex coded as a dummy, with the coefficient reported for female)
Multivariate:
                                  b
constant                          45'369
female                            -9'9
education (ref.: compulsory)
  secondary education             9'97
  tertiary education              3'786
supervision                       7'8
financial sector                  5'59
number of years in paid work      79
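
The idea that a multivariate coefficient captures the portion of y explained by x that is not explained by the other x's can be sketched in Python as follows. This is an illustration under stated assumptions: the helper names `simple_ols` and `partial_slope` and the data are hypothetical, and the two-step "partialling out" procedure is one standard way to obtain the coefficient for a model with two regressors, not necessarily how the figures on the slide were computed.

```python
def simple_ols(x, y):
    """Intercept and slope of a bivariate least-squares regression of y on x."""
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b = sxy / sxx
    return y_mean - b * x_mean, b

def partial_slope(y, x_focus, x_other):
    """Coefficient of x_focus in a regression of y on x_focus and x_other."""
    # Step 1: remove from x_focus everything that x_other can explain.
    a, b = simple_ols(x_other, x_focus)
    x_unique = [xf - (a + b * xo) for xf, xo in zip(x_focus, x_other)]
    # Step 2: regress y on the remaining, unique part of x_focus.
    return simple_ols(x_unique, y)[1]

# Hypothetical data constructed so that y = 1 + 2*x1 + 3*x2 holds exactly.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 6]
y = [1 + 2 * u + 3 * v for u, v in zip(x1, x2)]

bivariate_b = simple_ols(x1, y)[1]         # also picks up the effect of x2
multivariate_b = partial_slope(y, x1, x2)  # effect of x1 holding x2 constant
print(bivariate_b, multivariate_b)
```

Because x1 and x2 are correlated in this made-up data, the bivariate slope differs from the multivariate one, which mirrors the change in the gender coefficient once further variables are controlled for.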

11 Assumptions for OLS estimation: coefficients
Assumptions for OLS estimation (necessary to calculate slope coefficients):
1) No perfect multicollinearity (none of the regressors can be written as a linear function of the other regressors)
2) $E(e) = 0$
3) None of the x's is correlated with e: $\operatorname{Cov}(x, e) = 0$ (all x's are exogenous, none of the x's is endogenous)
If assumptions 1-3 hold: OLS is consistent and the regression coefficients (b's) are unbiased.

12 Endogeneity
Reasons for misspecification of the model:
- Omitted variables
- Measurement error (in explanatory variables)
- Simultaneity
- Nonlinearity in parameters
Detection of endogeneity:
- Difficult to detect and correct! Caution with causal interpretation
- Theory, literature (variable selection and interpretation)!
Correction for endogeneity:
- Test for nonlinear relationships and interactions; include nonlinear terms and interactions (but: still linearity in parameters!)
- Omitted variables: instrumental variables, panel data
- Simultaneity: structural equation modelling, panel data for time ordering
- Theory, literature (variable selection and interpretation)!

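13 Inference from linear regression
Inference from OLS estimations is possible if the sample is random.
But: OLS coefficients are estimates.
Estimated regression equation: $y_i = a + b x_i + e_i$
True regression equation: $y_i = \alpha + \beta x_i + \varepsilon_i$
The true coefficients ($\alpha$, $\beta$) are unknown, and the true error term is unknown.
Distribution of the coefficients (a, b): $E(b) = \beta$, $\operatorname{Var}(b) = \sigma_\beta^2$
[Figure: distribution of b centred on $E(b) = \beta$.]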

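14 Inference from linear regression
$\operatorname{Var}(b) = \sigma_\beta^2 = \dfrac{\sigma_\varepsilon^2}{\sum_i (x_i - \bar{x})^2}$, where $\sigma_\varepsilon^2$ is the error variance, estimated from the sum of squared residuals $\sum_i \varepsilon_i^2$ divided by the degrees of freedom.
The variation of b ($\sigma_\beta$) decreases if
- n increases
- the x are more spread out
- the squared residuals decrease
Distribution of b:
- Student t-distribution; depends on n and the number of x's
- Normal distribution if n is large
[Figure: distribution of b centred on $E(b) = \beta$.]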

15 Inference from linear regression: testing whether b ≠ 0
If $\beta = 0$ (in the population), there is no relationship between x and y, so we have to test how likely it is that $\beta = 0$.
- $H_0$: $\beta = 0$
- The distribution of b under $\beta = 0$ gives critical values for the coefficients
- Compare the estimated coefficient with the critical value: if $|b|$ > critical value, then b is significant
- Critical value for the standardised normal distribution and a 95% confidence level: 1.96
- Standardisation: $b_{\text{stand}} = t\text{ value} = \dfrac{b}{\sigma_\beta}$
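
The test described above can be sketched in Python for the bivariate case. The function name `slope_inference` and the data are hypothetical, and the error variance is estimated with n - 2 degrees of freedom, which is the usual choice for one regressor plus an intercept (an assumption, since the slide's own denominator is not legible).

```python
import math

def slope_inference(x, y):
    """Return slope b, its standard error and the t value for H0: beta = 0."""
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    b = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / sxx
    a = y_mean - b * x_mean
    # Estimated error variance: squared residuals over degrees of freedom.
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    sigma2_eps = rss / (n - 2)
    se_b = math.sqrt(sigma2_eps / sxx)
    return b, se_b, b / se_b

# Hypothetical data, only to show the mechanics.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b, se_b, t_value = slope_inference(x, y)

# 1.96 is the large-sample (normal) critical value; with very few
# observations the t-distribution gives a larger critical value.
print(f"b = {b:.3f}, se = {se_b:.3f}, t = {t_value:.3f}")
print("exceeds 1.96:", abs(t_value) > 1.96)
```

The standard error shrinks when n grows, when x is more spread out (larger sum of squared deviations) or when the residuals are smaller, which matches the three conditions listed for the variation of b.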

16 Inference from linear regression: example
[Figure: scatter plot of yearly income from employment against number of years spent in paid work, with the fitted line.]
Regression line: $\hat{y}_i = a + b x_i$

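17 Inference from linear regression: example
[Table: regression output for two samples (transcribed as n = 53 and n = 787), each with the columns Coef., Std. err., t, P>|t| and [95% Conf. Interval] for the rows "years work" and "_cons", and with R² reported below (.59 for the second sample).]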

18 Inference from linear regression: assumptions
Assumptions for inference (assumptions on the error terms):
- Independence of error terms, no autocorrelation: $\operatorname{Cov}(\varepsilon_i, \varepsilon_k) = 0$ for all i, k with $i \neq k$
- Constant error variance (homoscedasticity): $\operatorname{Var}(\varepsilon_i) = \sigma_\varepsilon^2$ for all i
- Preferably: e is normally distributed
[Figure: matrix of error terms.]

19 Autocorrelation
- Reason: nested observations (e.g. households, schools, time, communities)
- Consequence: standard errors are underestimated
- Remedy: OLS with adjusted standard errors
[Figure: error-term matrices with autocorrelation and with no autocorrelation.]

20 Heteroskedasticity
- The error variance is not constant
- Consequence: standard errors are overestimated or underestimated
- Remedies: OLS with adjusted standard errors (White standard errors), or weighted least squares (WLS)
[Figure: error-term matrices under homoskedasticity and under heteroskedasticity.]
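
To make the adjustment of standard errors concrete, here is a minimal Python sketch for the bivariate case. It compares the classical standard error of the slope with a White-type estimate in its simplest form (often labelled HC0, without small-sample correction). The function name `slope_standard_errors` and the data are hypothetical.

```python
import math

def slope_standard_errors(x, y):
    """Return (b, classical_se, robust_se) for a bivariate OLS regression."""
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    dx = [xi - x_mean for xi in x]
    sxx = sum(d ** 2 for d in dx)
    b = sum(d * (yi - y_mean) for d, yi in zip(dx, y)) / sxx
    a = y_mean - b * x_mean
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    # Classical formula: assumes one common error variance for all observations.
    classical_var = (sum(e ** 2 for e in resid) / (n - 2)) / sxx
    # White (HC0) formula: each observation contributes its own squared residual.
    robust_var = sum((d * e) ** 2 for d, e in zip(dx, resid)) / sxx ** 2
    return b, math.sqrt(classical_var), math.sqrt(robust_var)

# Hypothetical data, only to show the mechanics.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b, classical_se, robust_se = slope_standard_errors(x, y)
print(f"b = {b:.3f}, classical se = {classical_se:.4f}, robust se = {robust_se:.4f}")
```

The coefficient b is identical in both cases; only the standard error, and therefore the t value, changes.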

21 Again: assumptions of linear regression
General
- Continuous dependent variable
- Random sample
Coefficient estimation (if violated: coefficients biased, inconsistent)
- No perfect multicollinearity
- $E(e) = 0$
- No endogeneity, $\operatorname{Cov}(x, e) = 0$: omitted variables, measurement error, simultaneity, nonlinearity in parameters
Inference (if violated: standard errors of coefficients biased)
- No autocorrelation, $\operatorname{Cov}(e_i, e_k) = 0$
- Constant variance (no heteroskedasticity)

22 Panel data: the Swiss Household Panel (SHP)
- Sample of ca. 574 households (7799 individuals) in 1999 (SHP I)
- Sample of ca. 538 households (443 individuals) in 2004 (SHP II)
- Yearly observations of the same individuals
- Multiple observation points per individual
- But: attrition, gaps between waves

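23 Structure of panel data
- Wide data format: one row per person, one column per wave (columns: idpers, i4empy, i5empy, i6empy, i7empy)
- Long data format (person-period file): one row per person and year (columns: idpers, year, iempy)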
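
The conversion from the wide to the long format can be sketched in plain Python. The column names follow the transcription above; the mapping of each wide column to a calendar year, the helper name `wide_to_long` and the income values are assumptions made for illustration.

```python
def wide_to_long(rows, id_key, column_years, value_key):
    """Turn one row per person into one row per person and year."""
    long_rows = []
    for row in rows:
        for column, year in column_years.items():
            value = row.get(column)
            if value is None:
                # No observation in this wave (gap or attrition): no row.
                continue
            long_rows.append({id_key: row[id_key], "year": year, value_key: value})
    return long_rows

# Hypothetical wide file: yearly income from employment in four waves.
wide = [
    {"idpers": 1, "i4empy": 52000, "i5empy": 53500, "i6empy": None, "i7empy": 56000},
    {"idpers": 2, "i4empy": 41000, "i5empy": 41500, "i6empy": 43000, "i7empy": None},
]
# Assumed mapping from wide column names to survey years.
column_years = {"i4empy": 2004, "i5empy": 2005, "i6empy": 2006, "i7empy": 2007}

long_file = wide_to_long(wide, "idpers", column_years, "iempy")
for record in long_file:
    print(record)
```

Skipping missing waves keeps the long file unbalanced, which reflects the attrition and gaps mentioned for the SHP.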

24 OLS with panel data
- OLS for cross-sectional analysis (one wave): no particular problem!
- OLS for pooled data (different years in one file). Problem: the assumption of independence of observations is violated (autocorrelation)
- Correct for clustering in the error terms (leaves the coefficients unaffected)
- But: OLS is not the best estimator for pooled data (not efficient)

yearly income from employment   OLS: b (t)         OLS, cluster in se: b (t)
female                          -5'55 (-.38)       -5'55 (-6.6)
secondary education             '383 (6.58)        '383 (9.89)
tertiary education              3'67 (43.5)        3'67 (.69)
supervision                     3'79 (35.48)       3'79 (.46)
age in 1999                     '98 (.43)          '98 (.44)
age in 1999 squared             -7 (-3.65)         -7 (-7.79)
time (1999-2007)                987 (4.6)          987 (.66)
financial sector                3'93 (.3)          3'93 (.5)
married                         8'68 (7.5)         8'68 (9.45)
women*married                   -9'6 (-.6)         -9'6 (-6.7)
constant                        5'55 (.84)         5'55 (.9)
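
How clustering changes the standard errors while leaving the coefficient untouched can be sketched for the bivariate case. The snippet below sums the products of regressor deviations and residuals within each person before squaring them; it omits the finite-sample corrections that statistical packages typically apply. The function name `cluster_robust_se` and the data are hypothetical.

```python
import math
from collections import defaultdict

def cluster_robust_se(x, y, cluster_ids):
    """Return slope b and its cluster-robust standard error (no correction)."""
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    dx = [xi - x_mean for xi in x]
    sxx = sum(d ** 2 for d in dx)
    b = sum(d * (yi - y_mean) for d, yi in zip(dx, y)) / sxx
    a = y_mean - b * x_mean
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    # Add up the contributions of all observations of the same person,
    # so that correlated errors within a person are taken into account.
    cluster_sums = defaultdict(float)
    for d, e, g in zip(dx, resid, cluster_ids):
        cluster_sums[g] += d * e
    var_b = sum(s ** 2 for s in cluster_sums.values()) / sxx ** 2
    return b, math.sqrt(var_b)

# Hypothetical pooled data: five person-years from three persons.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
persons = ["p1", "p1", "p2", "p2", "p3"]

b, se_clustered = cluster_robust_se(x, y, persons)
print(f"b = {b:.3f}, clustered se = {se_clustered:.4f}")
```

If every observation forms its own cluster, the estimate reduces to the heteroskedasticity-robust (White) standard error.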

25 How to analyse panel data?
Two different types of variation in panel data:
- Variation within individuals
- Variation between individuals
OLS does not take account of the difference between the two types of variation.
Take advantage of panel data characteristics!
- Control for unobservable variables (stable personal characteristics) to reduce bias from omitted variables
- Fixed effects models (only within variation)
- Random intercept models (multilevel analysis / random effects / frailty for event history)

26 Comparison of estimators, b (t)

yearly income from employment   OLS              OLS, cluster in se   Random effects    Fixed effects
female                          -5'55 (-.38)     -5'55 (-6.6)         -8'4 (-.8)        (dropped)
secondary education             '383 (6.58)      '383 (9.89)          '8 (.33)          3'35 (.34)
tertiary education              3'67 (43.5)      3'67 (.69)           8'3 (7.33)        6'53 (.33)
supervision                     3'79 (35.48)     3'79 (.46)           3'74 (.6)         53 (.69)
age in 1999                     '98 (.43)        '98 (.44)            ' (6.33)          (dropped)
age in 1999 squared             -7 (-3.65)       -7 (-7.79)           - (-.5)           (dropped)
time (1999-2007)                987 (4.6)        987 (.66)            '76 (5.36)        '3 (6.7)
financial sector                3'93 (.3)        3'93 (.5)            7'549 (9.87)      '45 (.48)
married                         8'68 (7.5)       8'68 (9.45)          6'636 (.)         4'48 (5.77)
women*married                   -9'6 (-.6)       -9'6 (-6.7)          -7'848 (-7.5)     -3'767 (-.44)
constant                        5'55 (.84)       5'55 (.9)            7'94 (.7)         6'59 (6.77)
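
Why the fixed effects column uses only within variation, and why time-constant variables are dropped, can be sketched in Python. The example subtracts each person's mean from x and y before estimating the slope. The function names and the data are hypothetical; the data are constructed so that each person has a different stable level of y that is correlated with x.

```python
from collections import defaultdict

def pooled_slope(x, y):
    """Pooled OLS slope, ignoring which person an observation belongs to."""
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    return sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / sxx

def within_slope(ids, x, y):
    """Fixed-effects (within) slope: OLS on person-demeaned x and y."""
    groups = defaultdict(list)
    for pid, xi, yi in zip(ids, x, y):
        groups[pid].append((xi, yi))
    num = den = 0.0
    for obs in groups.values():
        x_bar = sum(o[0] for o in obs) / len(obs)
        y_bar = sum(o[1] for o in obs) / len(obs)
        for xi, yi in obs:
            # Anything constant within a person has deviation zero here,
            # which is why such variables are dropped from the model.
            num += (xi - x_bar) * (yi - y_bar)
            den += (xi - x_bar) ** 2
    return num / den

# Hypothetical panel: within each person y rises by 2 per unit of x,
# but person 2 has a much lower stable level of y and higher x values.
ids = [1, 1, 1, 2, 2, 2]
x = [1, 2, 3, 6, 7, 8]
y = [12, 14, 16, 2, 4, 6]

print("pooled OLS slope:", round(pooled_slope(x, y), 3))
print("within slope:", round(within_slope(ids, x, y), 3))
```

In this constructed case the pooled slope even has the wrong sign, because it mixes between-person and within-person variation, whereas the within slope recovers the change of y per unit of x inside each person.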

27 Working with lagged variables
Include lags of independent variables:
$y_{it} = a + b_1 x_{1,it} + b_2 x_{2,it} + b_3 x_{2,i,t-1} + e_{it}$
Do not include lags of the dependent variable on the right-hand side of the equation! This causes endogeneity:
$y_t = a + b_1 x_t + b_2 y_{t-1} + e_t$
$y_{t-1} = a + b_1 x_{t-1} + e_{t-1}$
$y_t = a + b_1 x_t + b_2 (a + b_1 x_{t-1} + e_{t-1}) + e_t$
If $e_t$ and $e_{t-1}$ are correlated (which is likely because they are residuals of the same person), then $\operatorname{Cov}(x, e) \neq 0$ and all b's are likely to be biased.
But: there are more sophisticated methods which allow including lags of the dependent variable.
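
Constructing a lagged independent variable in a long (person-period) file can be sketched as follows. The helper name `add_lag`, the variable name `x` and the records are hypothetical. The lag is looked up by person and year, so a gap between waves yields a missing value instead of silently using an older observation.

```python
def add_lag(rows, id_key, time_key, var, lag=1):
    """Return copies of rows with the value of var from `lag` years earlier."""
    lookup = {(r[id_key], r[time_key]): r[var] for r in rows}
    lagged_rows = []
    for r in rows:
        # None if the person was not observed `lag` years before.
        previous = lookup.get((r[id_key], r[time_key] - lag))
        lagged_rows.append({**r, f"{var}_lag{lag}": previous})
    return lagged_rows

# Hypothetical long file; person 1 has a gap in 2006.
long_file = [
    {"idpers": 1, "year": 2004, "x": 10},
    {"idpers": 1, "year": 2005, "x": 12},
    {"idpers": 1, "year": 2007, "x": 15},
    {"idpers": 2, "year": 2004, "x": 7},
    {"idpers": 2, "year": 2005, "x": 8},
]

with_lags = add_lag(long_file, "idpers", "year", "x")
for record in with_lags:
    print(record)
```

Observations with a missing lag would have to be excluded from a regression that includes the lagged variable.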

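28 Non-linear regression
Linear regression: $y = a + b_1 x_1 + b_2 x_2 + e$
E.g. logistic regression: $P(y = 1) = \dfrac{1}{1 + e^{-(a + b_1 x_1 + b_2 x_2)}}$
[Figure: P(y = 1) plotted against x.]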
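
The logistic transformation can be sketched in Python as follows. The function name `logistic_probability` and the coefficient values are hypothetical; the snippet only shows how a linear index is mapped into a probability between 0 and 1, not how the coefficients are estimated.

```python
import math

def logistic_probability(a, coefficients, values):
    """P(y = 1) for a logistic model with intercept a and slopes b_k."""
    linear_index = a + sum(b * x for b, x in zip(coefficients, values))
    return 1.0 / (1.0 + math.exp(-linear_index))

# Hypothetical coefficients for two regressors.
a = -1.0
b = [0.8, -0.5]

for x1 in (0, 1, 2, 3):
    p = logistic_probability(a, b, [x1, 1.0])
    print(f"x1 = {x1}: P(y = 1) = {p:.3f}")
```

Unlike a linear prediction, the result never leaves the interval from 0 to 1, and the effect of a one-unit change in x on the probability depends on where on the curve the observation lies.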

29 Non-linear models
If the dependent variable is not continuous, linear regression is not appropriate:
- Dummy variable (e.g. yes/no): logistic regression, probit regression
- Multinomial (e.g. vote for SPS, vote for FDP, vote for SVP, vote for others): multinomial logistic regression, multinomial probit regression
- Ordinal (low education, intermediate education, higher education): ordinal regression
- Count variable (number of visits to the doctor): Poisson regression

30 Additional possibilities with panel data
- Longitudinal model for growth: random intercept and random slope
- Modelling events and duration: event history analysis, survival analysis (transitions as dependent variable)
- Differentiate cohort, time period and age (two out of three)
- Markov chain models

31 Thank you! For questions:

32 References
Introduction
- Wooldridge, Jeffrey. Econometric Analysis of Cross Section and Panel Data, MIT Press.
Further reading
- Cameron, A. C., and Trivedi, P. K. 2005. Microeconometrics: Methods and Applications, Cambridge University Press, section V.
- Verbeek, M. 2004. A Guide to Modern Econometrics (2nd ed.), Wiley.
- Baltagi, Badi H. 2005. Econometric Analysis of Panel Data (3rd ed.), Wiley.


More information

Discrete Mathematics for CS Spring 2008 David Wagner Note 22

Discrete Mathematics for CS Spring 2008 David Wagner Note 22 CS 70 Discrete Mathematics for CS Sprig 2008 David Wager Note 22 I.I.D. Radom Variables Estimatig the bias of a coi Questio: We wat to estimate the proportio p of Democrats i the US populatio, by takig

More information

SIMPLE LINEAR REGRESSION AND CORRELATION ANALYSIS

SIMPLE LINEAR REGRESSION AND CORRELATION ANALYSIS SIMPLE LINEAR REGRESSION AND CORRELATION ANALSIS INTRODUCTION There are lot of statistical ivestigatio to kow whether there is a relatioship amog variables Two aalyses: (1) regressio aalysis; () correlatio

More information

Read through these prior to coming to the test and follow them when you take your test.

Read through these prior to coming to the test and follow them when you take your test. Math 143 Sprig 2012 Test 2 Iformatio 1 Test 2 will be give i class o Thursday April 5. Material Covered The test is cummulative, but will emphasize the recet material (Chapters 6 8, 10 11, ad Sectios 12.1

More information

Circle the single best answer for each multiple choice question. Your choice should be made clearly.

Circle the single best answer for each multiple choice question. Your choice should be made clearly. TEST #1 STA 4853 March 6, 2017 Name: Please read the followig directios. DO NOT TURN THE PAGE UNTIL INSTRUCTED TO DO SO Directios This exam is closed book ad closed otes. There are 32 multiple choice questios.

More information

Spurious Fixed E ects Regression

Spurious Fixed E ects Regression Spurious Fixed E ects Regressio I Choi First Draft: April, 00; This versio: Jue, 0 Abstract This paper shows that spurious regressio results ca occur for a xed e ects model with weak time series variatio

More information

CLRM estimation Pietro Coretto Econometrics

CLRM estimation Pietro Coretto Econometrics Slide Set 4 CLRM estimatio Pietro Coretto pcoretto@uisa.it Ecoometrics Master i Ecoomics ad Fiace (MEF) Uiversità degli Studi di Napoli Federico II Versio: Thursday 24 th Jauary, 2019 (h08:41) P. Coretto

More information

t distribution [34] : used to test a mean against an hypothesized value (H 0 : µ = µ 0 ) or the difference

t distribution [34] : used to test a mean against an hypothesized value (H 0 : µ = µ 0 ) or the difference EXST30 Backgroud material Page From the textbook The Statistical Sleuth Mea [0]: I your text the word mea deotes a populatio mea (µ) while the work average deotes a sample average ( ). Variace [0]: The

More information

Solutions to Odd Numbered End of Chapter Exercises: Chapter 4

Solutions to Odd Numbered End of Chapter Exercises: Chapter 4 Itroductio to Ecoometrics (3 rd Updated Editio) by James H. Stock ad Mark W. Watso Solutios to Odd Numbered Ed of Chapter Exercises: Chapter 4 (This versio July 2, 24) Stock/Watso - Itroductio to Ecoometrics

More information

ECONOMETRIC THEORY. MODULE XIII Lecture - 34 Asymptotic Theory and Stochastic Regressors

ECONOMETRIC THEORY. MODULE XIII Lecture - 34 Asymptotic Theory and Stochastic Regressors ECONOMETRIC THEORY MODULE XIII Lecture - 34 Asymptotic Theory ad Stochastic Regressors Dr. Shalabh Departmet of Mathematics ad Statistics Idia Istitute of Techology Kapur Asymptotic theory The asymptotic

More information

[ ] ( ) ( ) [ ] ( ) 1 [ ] [ ] Sums of Random Variables Y = a 1 X 1 + a 2 X 2 + +a n X n The expected value of Y is:

[ ] ( ) ( ) [ ] ( ) 1 [ ] [ ] Sums of Random Variables Y = a 1 X 1 + a 2 X 2 + +a n X n The expected value of Y is: PROBABILITY FUNCTIONS A radom variable X has a probabilit associated with each of its possible values. The probabilit is termed a discrete probabilit if X ca assume ol discrete values, or X = x, x, x 3,,

More information

Open book and notes. 120 minutes. Cover page and six pages of exam. No calculators.

Open book and notes. 120 minutes. Cover page and six pages of exam. No calculators. IE 330 Seat # Ope book ad otes 120 miutes Cover page ad six pages of exam No calculators Score Fial Exam (example) Schmeiser Ope book ad otes No calculator 120 miutes 1 True or false (for each, 2 poits

More information

Algebra of Least Squares

Algebra of Least Squares October 19, 2018 Algebra of Least Squares Geometry of Least Squares Recall that out data is like a table [Y X] where Y collects observatios o the depedet variable Y ad X collects observatios o the k-dimesioal

More information

REGRESSION AND ANALYSIS OF VARIANCE. Motivation. Module structure

REGRESSION AND ANALYSIS OF VARIANCE. Motivation. Module structure REGRESSION AND ANALYSIS OF VARIANCE 1 Motivatio Objective: Ivestigate associatios betwee two or more variables What tools do you already have? t-test Compariso of meas i two populatios What will we cover

More information

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n.

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n. Jauary 1, 2019 Resamplig Methods Motivatio We have so may estimators with the property θ θ d N 0, σ 2 We ca also write θ a N θ, σ 2 /, where a meas approximately distributed as Oce we have a cosistet estimator

More information

Statistical inference: example 1. Inferential Statistics

Statistical inference: example 1. Inferential Statistics Statistical iferece: example 1 Iferetial Statistics POPULATION SAMPLE A clothig store chai regularly buys from a supplier large quatities of a certai piece of clothig. Each item ca be classified either

More information

Simple Regression Model

Simple Regression Model Simple Regressio Model 1. The Model y i 0 1 x i u i where y i depedet variable x i idepedet variable u i disturbace/error term i 1,..., Eg: y wage (measured i 1976 dollars per hr) x educatio (measured

More information

REGRESSION METHODS. Logistic regression

REGRESSION METHODS. Logistic regression REGRESSION METHODS Logistic regressio 233 RECAP: Biary Outcome? NO Cotiuous Outcome? YES Liear Regressio/ANOVA NO Other Methods YES Odds ratio as measure of associatio? Relative risk as measure of associatio?

More information

REGRESSION MODELS ANOVA

REGRESSION MODELS ANOVA REGRESSION MODELS ANOVA 141 Cotiuous Outcome? NO RECAP: Logistic regressio ad other methods YES Liear Regressio Examie mai effects cosiderig predictors of iterest, ad cofouders Test effect modificatio

More information

Formulas and Tables for Gerstman

Formulas and Tables for Gerstman Formulas ad Tables for Gerstma Measuremet ad Study Desig Biostatistics is more tha a compilatio of computatioal techiques! Measuremet scales: quatitative, ordial, categorical Iformatio quality is primary

More information

ARIMA Models. Dan Saunders. y t = φy t 1 + ɛ t

ARIMA Models. Dan Saunders. y t = φy t 1 + ɛ t ARIMA Models Da Sauders I will discuss models with a depedet variable y t, a potetially edogeous error term ɛ t, ad a exogeous error term η t, each with a subscript t deotig time. With just these three

More information

Correlation. Two variables: Which test? Relationship Between Two Numerical Variables. Two variables: Which test? Contingency table Grouped bar graph

Correlation. Two variables: Which test? Relationship Between Two Numerical Variables. Two variables: Which test? Contingency table Grouped bar graph Correlatio Y Two variables: Which test? X Explaatory variable Respose variable Categorical Numerical Categorical Cotigecy table Cotigecy Logistic Grouped bar graph aalysis regressio Mosaic plot Numerical

More information

A Question. Output Analysis. Example. What Are We Doing Wrong? Result from throwing a die. Let X be the random variable

A Question. Output Analysis. Example. What Are We Doing Wrong? Result from throwing a die. Let X be the random variable A Questio Output Aalysis Let X be the radom variable Result from throwig a die 5.. Questio: What is E (X? Would you throw just oce ad take the result as your aswer? Itroductio to Simulatio WS/ - L 7 /

More information

Sample Size Estimation in the Proportional Hazards Model for K-sample or Regression Settings Scott S. Emerson, M.D., Ph.D.

Sample Size Estimation in the Proportional Hazards Model for K-sample or Regression Settings Scott S. Emerson, M.D., Ph.D. ample ie Estimatio i the Proportioal Haards Model for K-sample or Regressio ettigs cott. Emerso, M.D., Ph.D. ample ie Formula for a Normally Distributed tatistic uppose a statistic is kow to be ormally

More information

Economics 241B Relation to Method of Moments and Maximum Likelihood OLSE as a Maximum Likelihood Estimator

Economics 241B Relation to Method of Moments and Maximum Likelihood OLSE as a Maximum Likelihood Estimator Ecoomics 24B Relatio to Method of Momets ad Maximum Likelihood OLSE as a Maximum Likelihood Estimator Uder Assumptio 5 we have speci ed the distributio of the error, so we ca estimate the model parameters

More information

Chapter 11 Output Analysis for a Single Model. Banks, Carson, Nelson & Nicol Discrete-Event System Simulation

Chapter 11 Output Analysis for a Single Model. Banks, Carson, Nelson & Nicol Discrete-Event System Simulation Chapter Output Aalysis for a Sigle Model Baks, Carso, Nelso & Nicol Discrete-Evet System Simulatio Error Estimatio If {,, } are ot statistically idepedet, the S / is a biased estimator of the true variace.

More information

TAMS24: Notations and Formulas

TAMS24: Notations and Formulas TAMS4: Notatios ad Formulas Basic otatios ad defiitios X: radom variable stokastiska variabel Mea Vätevärde: µ = X = by Xiagfeg Yag kpx k, if X is discrete, xf Xxdx, if X is cotiuous Variace Varias: =

More information

BIOS 4110: Introduction to Biostatistics. Breheny. Lab #9

BIOS 4110: Introduction to Biostatistics. Breheny. Lab #9 BIOS 4110: Itroductio to Biostatistics Brehey Lab #9 The Cetral Limit Theorem is very importat i the realm of statistics, ad today's lab will explore the applicatio of it i both categorical ad cotiuous

More information

TMA4245 Statistics. Corrected 30 May and 4 June Norwegian University of Science and Technology Department of Mathematical Sciences.

TMA4245 Statistics. Corrected 30 May and 4 June Norwegian University of Science and Technology Department of Mathematical Sciences. Norwegia Uiversity of Sciece ad Techology Departmet of Mathematical Scieces Corrected 3 May ad 4 Jue Solutios TMA445 Statistics Saturday 6 May 9: 3: Problem Sow desity a The probability is.9.5 6x x dx

More information

The Simple Regression Model

The Simple Regression Model The Simple Regressio Model Pig Yu School of Ecoomics ad Fiace The Uiversity of Hog Kog Pig Yu (HKU) SLR 1 / 75 Defiitio of the Simple Regressio Model Defiitio of the Simple Regressio Model Pig Yu (HKU)

More information

Outline. Linear regression. Regularization functions. Polynomial curve fitting. Stochastic gradient descent for regression. MLE for regression

Outline. Linear regression. Regularization functions. Polynomial curve fitting. Stochastic gradient descent for regression. MLE for regression REGRESSION 1 Outlie Liear regressio Regularizatio fuctios Polyomial curve fittig Stochastic gradiet descet for regressio MLE for regressio Step-wise forward regressio Regressio methods Statistical techiques

More information

Simple Linear Regression

Simple Linear Regression Chapter 2 Simple Liear Regressio 2.1 Itroductio The term regressio ad the methods for ivestigatig the relatioships betwee two variables may date back to about 100 years ago. It was first itroduced by Fracis

More information

Basis for simulation techniques

Basis for simulation techniques Basis for simulatio techiques M. Veeraraghava, March 7, 004 Estimatio is based o a collectio of experimetal outcomes, x, x,, x, where each experimetal outcome is a value of a radom variable. x i. Defiitios

More information

Final Review. Fall 2013 Prof. Yao Xie, H. Milton Stewart School of Industrial Systems & Engineering Georgia Tech

Final Review. Fall 2013 Prof. Yao Xie, H. Milton Stewart School of Industrial Systems & Engineering Georgia Tech Fial Review Fall 2013 Prof. Yao Xie, yao.xie@isye.gatech.edu H. Milto Stewart School of Idustrial Systems & Egieerig Georgia Tech 1 Radom samplig model radom samples populatio radom samples: x 1,..., x

More information