Econ107 Applied Econometrics Topic 10: Dummy Dependent Variable (Studenmund, Chapter 13)


I. The Linear Probability Model

Suppose we have a cross section of 18-24 year-olds. We specify a simple 2-variable regression model. The probability of enrolling in tertiary study can be written:

$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$$

where
$Y_i = 1$ if enrolled in university; $0$ otherwise.
$X_i$ = performance in secondary school, family background, income or wealth of parents, gender, ethnicity, etc.

We write this as a 2-variable regression for simplicity. It could be a multiple regression, where some or all of these independent variables are included.

Assume $\text{Prob}_i = P(Y_i = 1 \mid X_i)$. This is known as a Linear Probability Model (LPM), because the conditional expectation of Y given X is the conditional probability of this event occurring:

$$E(Y_i \mid X_i) = \beta_0 + \beta_1 X_i = \text{Prob}_i$$

where $E(\varepsilon_i \mid X_i) = 0$.

Although we can treat this model like any other regression and use OLS to estimate the parameters, one restriction is that only probabilities within the [0, 1] interval make sense:

$$0 \le \beta_0 + \beta_1 X_i = \text{Prob}_i \le 1$$

Two undesirable characteristics of the LPM (i.e., the use of OLS where the dependent variable is discrete):

1. Nonnormal/heteroskedastic disturbances

In general,

$$\varepsilon_i = Y_i - \beta_0 - \beta_1 X_i = Y_i - \text{Prob}_i$$

For the purposes of statistical inference, we assume that these disturbances are normally distributed with a constant variance. Both assumptions are violated under the LPM. We know that Y can take on only one of two values: 0 and 1. Therefore:

If $Y_i = 1$, then $\varepsilon_i = 1 - \text{Prob}_i$
If $Y_i = 0$, then $\varepsilon_i = -\text{Prob}_i$

Since this probability lies between zero and one, the error term is positive when $Y_i = 1$ and negative when $Y_i = 0$. It can be shown that the disturbance terms follow a binomial distribution with variance:

$$\text{Var}(\varepsilon_i) = \sigma_i^2 = \text{Prob}_i (1 - \text{Prob}_i)$$

The result is that the disturbances are not normally distributed, and are heteroskedastic. If they were homoskedastic, $\text{Var}(\varepsilon_i)$ would be a constant. Under the LPM, this variance is a function of X (i.e., it depends on $\text{Prob}_i$). With heteroskedasticity, coefficient estimates are still unbiased but inefficient (i.e., no longer minimum variance or BLUE).

This can be overcome by running Weighted Least Squares (WLS): transform the data and use OLS.

2-Step Procedure:

1. Run OLS. Retain the fitted values and compute the following 'weight':

$$w_i = \widehat{\text{Prob}}_i (1 - \widehat{\text{Prob}}_i)$$

2. Transform the data in the following way, and run OLS:

$$\frac{Y_i}{\sqrt{w_i}} = \beta_0 \frac{1}{\sqrt{w_i}} + \beta_1 \frac{X_i}{\sqrt{w_i}} + \frac{\varepsilon_i}{\sqrt{w_i}}$$

This eliminates heteroskedasticity, since we now have unit variance for the composite disturbance term $\varepsilon_i / \sqrt{w_i}$. The resulting coefficient estimates are now BLUE. However, this procedure eliminates only one of the two problems.
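The two-step procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are simulated (the true line 0.2 + 0.08 X, the sample size and the seed are all hypothetical), the helper name `weighted_line` is invented for this sketch, and fitted probabilities are clipped to [0.01, 0.99] so that every weight is positive, a step the notes do not discuss.

```python
import random

def weighted_line(x, y, w):
    """Fit y = b0 + b1*x by minimising sum(w_i * residual_i**2)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1

# Hypothetical data: true Prob_i = 0.2 + 0.08 * X_i stays inside (0, 1) for X in [0, 8].
random.seed(0)
n = 2000
x = [random.uniform(0.0, 8.0) for _ in range(n)]
y = [1.0 if random.random() < 0.2 + 0.08 * xi else 0.0 for xi in x]

# Step 1: OLS (equal weights), then the fitted probabilities.
b0_ols, b1_ols = weighted_line(x, y, [1.0] * n)
p_hat = [b0_ols + b1_ols * xi for xi in x]

# Assumption: clip fitted values so every weight p(1 - p) is strictly positive.
p_hat = [min(max(p, 0.01), 0.99) for p in p_hat]
w = [p * (1.0 - p) for p in p_hat]

# Step 2: dividing Y, the constant and X by sqrt(w_i) and running OLS is the
# same as minimising sum(residual_i**2 / w_i), i.e. using weights 1 / w_i.
b0_wls, b1_wls = weighted_line(x, y, [1.0 / wi for wi in w])

print(f"OLS: b0={b0_ols:.3f}, b1={b1_ols:.3f}")
print(f"WLS: b0={b0_wls:.3f}, b1={b1_wls:.3f}")
```

Both sets of estimates should land near the simulated coefficients; the point of the second step is efficiency, not a different answer on average.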

2. Unrestricted Range of $\text{Prob}_i$

We said earlier that this probability must be restricted to the [0, 1] interval. The problem is that nothing in the LPM 'restricts the range' of $\text{Prob}_i$. Consider the following numerical example. Suppose we estimate:

$$\widehat{\text{Prob}}_i = \hat{Y}_i = 0.097 + 0.141 X_i$$

where X is defined as father's 'years of education minus 12'. For example, if the father completes a secondary education, we say that he has 12 years of schooling (e.g., School Certificate). In this case, X = 0. No qualification would make X negative. Any post-sec qualification would make X positive (e.g., if he has a Ph.D, X = 7). Show this in the following diagram.

[Diagram: scatter of enrolment (Y = 0 or 1) against X, with the fitted line 0.097 + 0.141 X.]

The data points lie on the 2 horizontal lines, where Y = 0 and Y = 1. Either the individual is enrolled in tertiary study or he or she isn't. The dependent variable is dichotomous, although the independent variable is more or less continuous. This is the scatter diagram for a dummy dependent variable model.

OLS tries to fit a regression line through these data points that minimises the sum of the squared residuals. Suppose we get the upward-sloping regression function in the diagram. Enrolment in tertiary education is positively related to the father's education. The intercept term (0.097) is the intersection of the regression function with the vertical axis. The slope is the estimated coefficient (0.141). Each year of education by the father raises the probability that the offspring will be enrolled in tertiary study by 14.1 percentage points.

We can predict the probability that a given individual will be enrolled by plugging his or her father's education into this conditional expectation. For example, two years of post-sec education would give us:

$$\widehat{\text{Prob}}_i = 0.097 + 0.141(2) = 0.379$$

The problem is that a regression function with any nonzero slope will eventually pass outside the horizontal lines defined by the data points. For example, someone with a father who dropped out of school at age 15 (i.e., X = -2) will have a negative probability of tertiary study:

$$\widehat{\text{Prob}}_i = 0.097 + 0.141(-2) = -0.185$$

This isn't possible. Someone with a father who has a PhD (i.e., X = 7) will have a probability of tertiary study in excess of one:

$$\widehat{\text{Prob}}_i = 0.097 + 0.141(7) = 1.084$$

This isn't possible either. Thus, we have a fundamental problem with the LPM and forecasts. As a consequence, we need to explore alternatives to the LPM. We want a technique that estimates a 'regression curve' bounded by zero and one (i.e., it asymptotically approaches these two horizontal lines).

We might also note that the $R^2$ statistic isn't very useful in the LPM as a measure of the goodness of fit of this regression function. It is difficult to fit a regression line through two horizontal lines of data points. The intuition is that we're trying to determine the relationship between the probability of this event and some independent variable. But we never observe the true probability. All we see is the eventual outcome of zero or one.

II. The Logit Model

Under the LPM the probability of an event occurring is written:

$$\text{Prob}_i = \beta_0 + \beta_1 X_i$$

Under the logit model this probability is written:

$$\text{Prob}_i = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_i)}}$$

Note that there is now a difference between the fitted value of the linear index $\beta_0 + \beta_1 X_i$ and the estimated probability. This probability is now a nonlinear function of X. This is the cumulative distribution function (CDF) for the logistic distribution.

We need to verify that the probability range is now restricted to lie within the [0, 1] interval.

If $\beta_0 + \beta_1 X_i \to +\infty$, then $\text{Prob}_i \to \frac{1}{1 + e^{-\infty}} = 1$. When e is raised to a large negative number (in absolute value), this probability approaches one.

If $\beta_0 + \beta_1 X_i \to -\infty$, then $\text{Prob}_i \to \frac{1}{1 + e^{+\infty}} = 0$. When e is raised to a large positive number, this probability approaches zero.

Thus, this logistic regression function asymptotically approaches one and zero. In between these two extremes, we can show this logit model relative to the LPM in the following diagram.

[Diagram: S-shaped logit curve and straight LPM line, both plotted as Prob against X.]
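The contrast between the two functional forms can be checked numerically. The sketch below reads the estimated LPM from the enrolment example as 0.097 + 0.141 X (an assumption about the figures in the notes) and evaluates the logistic CDF at a few arbitrary index values; the function names are made up for illustration.

```python
import math

def lpm_prob(x, b0=0.097, b1=0.141):
    """Linear probability model: nothing keeps the result inside [0, 1]."""
    return b0 + b1 * x

def logistic_cdf(z):
    """Logistic CDF evaluated at the linear index z = b0 + b1*x."""
    return 1.0 / (1.0 + math.exp(-z))

# LPM predictions for a school leaver at 15, two post-sec years, and a PhD.
lpm_preds = {x: lpm_prob(x) for x in (-2, 2, 7)}

# Logistic CDF at increasingly extreme index values.
logit_vals = {z: logistic_cdf(z) for z in (-30, -5, 0, 5, 30)}

for x, p in lpm_preds.items():
    print(f"LPM   X={x:>3}: {p:.3f}")
for z, p in logit_vals.items():
    print(f"logit z={z:>3}: {p:.6f}")
```

The LPM values at X = -2 and X = 7 fall outside the unit interval, while the logistic values only approach 0 and 1 without reaching them.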

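Note that the marginal or incremental effect of X on Y declines at the extremes. This is the slope of the curve at a given point. Contrast this with the constant slope of the LPM. The largest slope of the logit model occurs at the inflection point, where we go from increasing at an increasing rate to increasing at a decreasing rate. This doesn't have to correspond to X = 0.

Log-Odds Ratio

How do we estimate this logit regression model? One possibility is to convert this nonlinear function into a linear regression function and apply OLS. Begin by writing the probability of not enrolling in tertiary study as:

$$1 - \text{Prob}_i = 1 - \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_i)}} = \frac{e^{-(\beta_0 + \beta_1 X_i)}}{1 + e^{-(\beta_0 + \beta_1 X_i)}}$$

We can now write the 'odds ratio' as:

$$\frac{\text{Prob}_i}{1 - \text{Prob}_i} = \frac{1 / \left(1 + e^{-(\beta_0 + \beta_1 X_i)}\right)}{e^{-(\beta_0 + \beta_1 X_i)} / \left(1 + e^{-(\beta_0 + \beta_1 X_i)}\right)} = \frac{1}{e^{-(\beta_0 + \beta_1 X_i)}}$$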

The trick is to realize that:

$$\frac{1}{e^{-(\beta_0 + \beta_1 X_i)}} = e^{\beta_0 + \beta_1 X_i}$$

This odds ratio is the probability that an event will occur over the probability that it will not occur. For example, if Prob = 0.75, the odds ratio is 3, or 3:1. If Prob = 0.8, the odds ratio is 4, or 4:1.

By taking the natural log of the odds ratio we get:

$$\ln\left(\frac{\text{Prob}_i}{1 - \text{Prob}_i}\right) = \beta_0 + \beta_1 X_i$$

so that the 'log-odds ratio' is a linear function of X, but the probability is still a nonlinear function of X. For example, $\beta_1$ tells us how the log of the odds ratio will change with a one unit change in X.

Estimation

Imagine that we try to use the log-odds ratio to estimate the earlier regression model on tertiary enrolments. Plug in observed values for Y or Prob and run OLS. What's wrong with this approach? It doesn't work with our cross section of individuals because we don't observe probabilities, just actual outcomes.

If $\text{Prob}_i = 1$, then $\ln(1/0)$ is undefined.
If $\text{Prob}_i = 0$, then $\ln(0/1)$ is undefined.

One way to estimate the model is to use the Maximum Likelihood (ML) method, which is beyond the scope of this course. Alternatively, we can use a method called Grouped Logit. Suppose we have 'group' rather than 'individual' data (e.g., a cross section of secondary schools). For each school we could then estimate the probability of tertiary enrolment from the observed frequency among its graduates.
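A short sketch makes the odds arithmetic and the individual-data problem concrete. The helper names and the three schools with their (enrolled, graduates) counts are hypothetical, chosen only to show that group frequencies strictly between 0 and 1 have a well-defined log-odds while individual 0/1 outcomes do not.

```python
import math

def odds(p):
    """Odds ratio: probability of the event over probability of no event."""
    return p / (1.0 - p)

def log_odds(p):
    """Natural log of the odds ratio; undefined at p = 0 and p = 1."""
    if p <= 0.0 or p >= 1.0:
        raise ValueError("log-odds is undefined for p = 0 or p = 1")
    return math.log(odds(p))

# The two examples from the text.
odds_075 = odds(0.75)   # odds of 3:1
odds_080 = odds(0.8)    # odds of 4:1

# Individual outcomes are 0 or 1, so the transform fails for every observation.
individual_ok = []
for outcome in (0.0, 1.0):
    try:
        log_odds(outcome)
        individual_ok.append(True)
    except ValueError:
        individual_ok.append(False)

# Hypothetical grouped data: school -> (number enrolled, number of graduates).
schools = {"A": (30, 120), "B": (60, 100), "C": (45, 50)}
grouped = {name: log_odds(m / n) for name, (m, n) in schools.items()}

print(odds_075, odds_080, individual_ok)
print({name: round(v, 3) for name, v in grouped.items()})
```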

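For school i, the estimated probability is:

$$\widehat{\text{Prob}}_i = \frac{m_i}{n_i}$$

where
$m_i$ = number who attend universities or polytechnics by some age.
$n_i$ = number who completed secondary school in that class.

Assuming that this estimated probability is not 0 or 1, we can run the following with OLS:

$$\ln\left(\frac{\widehat{\text{Prob}}_i}{1 - \widehat{\text{Prob}}_i}\right) = \hat{\beta}_0 + \hat{\beta}_1 X_i + \varepsilon_i$$

where the 'hats' on the coefficients indicate that these are estimated with 'grouped' data, and that we lose information in aggregating. The disturbances are heteroskedastic:

$$\text{Var}(\varepsilon_i) = \frac{1}{n_i \widehat{\text{Prob}}_i (1 - \widehat{\text{Prob}}_i)}$$

We can transform the data by multiplying through by the square root of the weighting variable:

$$\sqrt{n_i \widehat{\text{Prob}}_i (1 - \widehat{\text{Prob}}_i)}$$

This WLS procedure will yield more efficient estimators.

III. The Probit Model

The probit model is nothing more than an alternative regression function that also asymptotically approaches the zero and one horizontal lines. The difference is that it is based on a 'normal' rather than a 'logistic' distribution function. Recall that under the logit model the probability that an event will occur is written:

$$\text{Prob}_i = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_i)}}$$

Under probit, we let $\text{Prob}_i$ be the CDF of a normal distribution: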

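$$\text{Prob}_i = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\beta_0 + \beta_1 X_i} \exp\left(-\frac{t^2}{2}\right) dt$$

where t is a standardised normal variable, with zero mean and unit variance. For this reason, it should be called Normit.

In general, there is no reason to prefer logit over probit or vice versa. Probit does have a slightly different regression function (although it asymptotically approaches zero and one like logit). It approaches the extreme values faster than logit.

Numerical example. The regression model:

$$LF_i = \beta_0 + \beta_1 M_i + \beta_2 S_i + \varepsilon_i$$

where
$LF_i = 1$ if the woman is in the labour force; $0$ otherwise.
$M_i = 1$ if the woman is married; $0$ otherwise.
$S_i$ = number of years of schooling.

1. LPM (OLS). No correction for heteroskedasticity.

$$\widehat{LF}_i = -0.28 - \underset{(0.15)}{0.38}\, M_i + \underset{(0.03)}{0.09}\, S_i$$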

We can interpret the effect of marital status on labour force participation:

$$E(LF_i \mid M_i = 0, S_i = 12) = -0.28 + 0.09(12) = 0.80$$
$$E(LF_i \mid M_i = 1, S_i = 12) = -0.28 - 0.38 + 0.09(12) = 0.42$$

2. LPM (Weighted Least Squares).

$$\widehat{LF}_i = -0.21 - \underset{(0.15)}{0.39}\, M_i + \underset{(0.02)}{0.08}\, S_i$$

where the relevant weight is the product of the probabilities of being in and out of the labour force estimated from the previous regression. It's easy to verify the problem of 'unrestricted range' of estimated probabilities under the LPM:

$$E(LF_i \mid M_i = 0, S_i = 16) = -0.21 + 0.08(16) = 1.07$$

The estimated probability of an unmarried woman with 16 years of education being in the labour force is 107%.

3. Logit (Maximum likelihood). We use the same individual data to estimate the equation with logit. The results are represented in terms of the log of the odds ratio (even though maximum likelihood estimation on the individual data was used).

$$\ln\left(\frac{LF_i}{1 - LF_i}\right) = -5.89 - \underset{(1.18)}{2.59}\, M_i + \underset{(0.31)}{0.69}\, S_i$$

4. Probit (Maximum likelihood). Again, the individual data are used with maximum likelihood probit.

$$F^{-1}(LF_i) = -3.44 - \underset{(0.62)}{1.44}\, M_i + \underset{(0.17)}{0.40}\, S_i$$

where $F^{-1}$ is the inverse of the normal CDF.
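To see what the logit and probit equations imply in probability terms, the fitted index can be pushed through the corresponding CDF. The sketch below assumes the estimated equations read as LPM (WLS): -0.21 - 0.39 M + 0.08 S; logit: -5.89 - 2.59 M + 0.69 S; probit: -3.44 - 1.44 M + 0.40 S. The function names are illustrative, and the normal CDF is computed from the standard library's `math.erf`.

```python
import math

def logistic_cdf(z):
    """Logistic CDF: maps the fitted log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def normal_cdf(z):
    """Standard normal CDF: maps the fitted probit index to a probability."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def lpm_wls(m, s):
    return -0.21 - 0.39 * m + 0.08 * s

def logit_prob(m, s):
    return logistic_cdf(-5.89 - 2.59 * m + 0.69 * s)

def probit_prob(m, s):
    return normal_cdf(-3.44 - 1.44 * m + 0.40 * s)

# (married, years of schooling) cases, including the one where the LPM exceeds 1.
cases = [(0, 16), (0, 12), (1, 12)]
results = {c: (lpm_wls(*c), logit_prob(*c), probit_prob(*c)) for c in cases}

for (m, s), (p_lpm, p_logit, p_probit) in results.items():
    print(f"M={m}, S={s}: LPM={p_lpm:.3f}, logit={p_logit:.3f}, probit={p_probit:.3f}")
```

For the unmarried woman with 16 years of schooling, the LPM figure exceeds one while the logit and probit probabilities stay just below it; for the other cases the three models give broadly similar answers.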

We can't compare the magnitudes of the coefficient estimates from the logit and probit, but the t tests are performed in the traditional manner. The t ratios are around 2.2 and 2.3 on M, respectively, and better than 2 on S in both regressions. The magnitudes of the estimated coefficients have no economic meaning, because they're related to labour force participation in a nonlinear way.

IV. Questions for Discussion: Q3.2

V. Computing Exercise: Johnson, Ch3