Econ 388 R. Butler 2014 revisions
Lecture 4 Dummy Dependent Variables

I. Linear Probability Model: the Regression model with a dummy variable as the dependent variable

How each assumption and its implication compares across the regular multiple regression model and the linear probability model:

Setup (both models): y = β0 + β1x1 + β2x2 + μ

E(y|X) = β0 + β1x1 + β2x2 in both models, since
E(y|X) = 1·prob(y=1|X) + 0·prob(y=0|X) = pr(y=1|X), so that
pr(y=1|X) = E(y|X) = β0 + β1x1 + β2x2.

E(μ|X) = 0: assumed true in regular regression. Still true here:
E(μ|X) = prob(y=1)·[1 − (β0 + β1x1 + β2x2)] + prob(y=0)·[−(β0 + β1x1 + β2x2)] = 0.

V(μ|X) is constant: in regular regression, V(μ|X) = σ² is assumed to be a reasonable approximation. Cannot be true here:
V(μ|X) = prob(y=1)·[1 − (β0 + β1x1 + β2x2)]² + prob(y=0)·[−(β0 + β1x1 + β2x2)]²
= [1 − (β0 + β1x1 + β2x2)]·[β0 + β1x1 + β2x2] = [1 − prob(y=1)]·[prob(y=1)].
Solution: Weighted Least Squares.

Linearity: assumed true in regular regression. Cannot be true here: the unboundedness problem, Wooldridge section 7.5. Solution: a nonlinear probability equation.

Problems with the linear probability model:
1. The linearity assumption (mapping all those values of the Xs into the (0,1) interval): picture, and why we use logistic regression, and why the big girls and boys used probits and logits.
2. Heteroskedasticity: the Weighted Least Squares procedure for the linear probability model handles this problem (another option will be to use the robust procedure in STATA).

HETEROSKEDASTICITY ADJUSTMENT FOR LINEAR PROBABILITY MODELS ONLY:
1. Run OLS and get the predicted value of y; call it "predicted y", or P̂i.
2. Check whether P̂i > 1.0; if so, set P̂i = .999 (to keep the probability within the bounds 0 < P̂i < 1). Check whether P̂i < 0; if so, set P̂i = .001.
3. Compute 1/(P̂i(1 − P̂i)) for each observation and place the output in a column to be used as a "weight". In Stata and SAS you literally compute the predicted value of the dependent variable, and use a column of these values as weights.
4. Run the weighted least squares regression.

The Stata code for doing weighted least squares of the linear probability model is:

#delimit ;
infile gpa tuce_scr psi a_grade using "e:\classrm_data\aldri_lpm.txt", clear;
summarize;
regress a_grade gpa tuce_scr psi;
predict YHAT;
replace YHAT=.999 if YHAT>=1;
replace YHAT=.001 if YHAT<=0;
gen WT = 1/(YHAT*(1-YHAT));
list a_grade YHAT gpa tuce_scr psi;
regress a_grade gpa tuce_scr psi [w=WT];

SAS code for the same problem:

data one;
infile "e:\classrm_data\aldri_lpm.txt" delimiter='09'x dsd truncover;
* the option delimiter='09'x dsd truncover is for tab-delimited files;
input gpa tuce_scr psi a_grade;
run;
proc means; run;
proc reg;
model a_grade=gpa tuce_scr psi;
output out=two p=yhat;
run;
data two; set two;
if YHAT>=1 then YHAT=.999;
if YHAT<=0 then YHAT=.001;
WT = 1/(YHAT*(1-YHAT));
run;
proc print; var a_grade YHAT gpa tuce_scr psi; run;
proc reg;
model a_grade=gpa tuce_scr psi;
weight WT;
run;

[[fastfood.do: A restaurant regional sales manager wants to find out what determines the likelihood that each fast-food chain reached its quota of $6,500 in fast-food sales. The restaurants are located in four different cities, and traffic flow on the street where each restaurant is located varies by location.]]
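The four-step adjustment above can be sketched in Python/numpy. This is a hedged illustration, not the Stata or SAS routine itself: the data are simulated stand-ins (the names gpa and a_grade just mirror the Stata code, not the actual data file), but steps 1 through 4 follow the procedure in the text exactly.

```python
import numpy as np

# Hypothetical data standing in for the classroom file: a 0/1 grade
# outcome whose true success probability rises with gpa.
rng = np.random.default_rng(0)
n = 200
gpa = rng.uniform(2.0, 4.0, n)
p_true = np.clip(-0.8 + 0.4 * gpa, 0.05, 0.95)          # true P(y=1|x)
a_grade = (rng.uniform(size=n) < p_true).astype(float)  # dummy dependent variable

X = np.column_stack([np.ones(n), gpa])  # intercept plus regressor

# Step 1: run OLS and get the predicted value P-hat.
b_ols, *_ = np.linalg.lstsq(X, a_grade, rcond=None)
yhat = X @ b_ols

# Step 2: keep P-hat inside (0, 1), as the replace/if statements do.
yhat = np.clip(yhat, 0.001, 0.999)

# Step 3: the weight column, 1 / (P-hat * (1 - P-hat)).
wt = 1.0 / (yhat * (1.0 - yhat))

# Step 4: weighted least squares, i.e. OLS after scaling each row by sqrt(wt).
sw = np.sqrt(wt)
b_wls, *_ = np.linalg.lstsq(X * sw[:, None], a_grade * sw, rcond=None)

print("OLS coefficients:", b_ols.round(3))
print("WLS coefficients:", b_wls.round(3))
```

The scaling by sqrt(wt) in step 4 is the standard way to run WLS through an OLS solver: dividing each observation by its error standard deviation restores a constant error variance.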
OR, you can just use the robust option to correct for heteroskedasticity (it may not be as efficient as weighted least squares if you model the heteroskedasticity correctly, but it is robust to alternative forms of heteroskedasticity).

***Stata robust standard error option***
regress a_grade gpa tuce_scr psi, robust;

***SAS robust standard error option***
proc genmod;
class id;
model a_grade=gpa tuce_scr psi;
repeated subject=id;
run;

III. The General Set-up for Binary Choice Models

The outcome is zero or one, conditional on x (the observed characteristics). Hence binary choice models are Bernoulli processes (a one/zero outcome, with the probability fixed given x); the only difference from the usual Bernoulli processes you have studied (like flipping a coin) is that we are conditioning on x. Let P(y=1|x) = probability of a one outcome given x; then we have the following:

P(y=0|x) = 1 − P(y=1|x);
E(y|x) = 1·P(y=1|x) + 0·(1 − P(y=1|x)) = P(y=1|x); and
Var(y|x) = P(y=1|x)·(1 − P(y=1|x)).

There are different functional form choices for the P(y=1|x) function; in particular, the following three are the most popular:

P(y=1|x) = G(xβ),

where x is the 1 × k vector of explaining variables, the first element of which is one (the intercept), and G(.) is some appropriate function:

Linear probability model (LPM): G(xβ) = xβ
Logit: G(xβ) = exp(xβ)/(1 + exp(xβ)) = 1/(1 + exp(−xβ))
Probit: G(xβ) = ∫ from −∞ to xβ of φ(v) dv, where φ(·) is the standard normal density function.

For all of these functions, the marginal effect is given by

∂P(y=1|x)/∂xj = ∂G(xβ)/∂xj = g(z)·βj, where z = xβ and g(z) = dG(z)/dz,

with g(z) = 1 for the linear probability model (LPM), g(z) = G(z)·(1 − G(z)) = prob(y=1)·(1 − prob(y=1)) for the logit model, and g(z) = φ(z) for the probit model (Leibniz rule for differentiation).

To get the marginal effects for probit in Stata, use the dprobit procedure:
dprobit a_grade gpa tuce_scr psi;
To get the marginal effects for logit in Stata, add the mfx compute command after the logit procedure as follows:
logit a_grade gpa tuce_scr psi;
mfx compute;

More particular information follows:

IV. Logistic Regression Model
Whereas the probability of a success (getting an A in the first example, or meeting your sales quota in the second example above) for the linear probability model is

Prob(y=1) = β0 + β1x1 + β2x2,

in the logistic regression model it is

Prob(y=1) = exp(β̂0 + β̂1x1 + β̂2x2) / [1 + exp(β̂0 + β̂1x1 + β̂2x2)],

which complicates things in two ways:

a. The estimation is non-linear, and based on searching for the best estimates rather than getting the estimates directly from a simple set of calculations (as we do in OLS). The estimation technique is known as maximum likelihood estimation, and it has good properties for moderately large and large samples (not only the tests, but also the estimators are nice in large samples).

b. The interpretation of the coefficients is somewhat different than for OLS estimates. In particular, to find the impact of increasing x1 by one unit on prob(y=1), we need to multiply β̂1, the estimated coefficient, by prob(y=1)·[1 − prob(y=1)], as follows:

marginal effect = change in prob(y=1) / change in x1 = β̂1 · prob(y=1) · [1 − prob(y=1)]

Do the logistic regressions for the samples above, and compare the resulting coefficients.

STATA: aldri_logit.do

#delimit ;
infile gpa tuce_scr psi a_grade using "e:\classrm_data\aldri_lpm.txt", clear;
summarize;
logit a_grade gpa tuce_scr psi;

SAS:

proc logistic descending;
model y=x1 x2;
run;

Probit analysis is another way to model dichotomous choices (i.e., the probability of a success). It is also nonlinear and based on slightly different distributional assumptions (namely, the cumulative normal distribution assumption). We will discuss these models further in the next lecture. To get the marginal effects in Stata for probits and logits, use the margins command as indicated:

probit a_grade gpa tuce_scr psi
margins, dydx(gpa tuce_scr psi)

((Note that the dprobit option in Stata gives you the marginal effects at the means, not quite as accurate for most BYU research purposes as the ones given by the margins command above, which computes the marginal effects for every observation and then averages them.))
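Both points (a) and (b) can be sketched in a few lines of Python/numpy. This is a hedged illustration on simulated data, not Stata's internal routine: part (a) is the textbook Newton-Raphson search that maximum likelihood estimation performs for the logit, and part (b) turns the estimated coefficient into marginal effects, both averaged over observations (what margins, dydx() reports) and evaluated at the mean (what dprobit-style output reports).

```python
import numpy as np

# (a) Simulate a binary outcome with hypothetical true coefficients (-0.5, 1.0),
# then estimate the logit by maximizing the log-likelihood with Newton-Raphson.
rng = np.random.default_rng(1)
n = 500
x1 = rng.normal(size=n)
p_true = 1.0 / (1.0 + np.exp(-(-0.5 + 1.0 * x1)))
y = (rng.uniform(size=n) < p_true).astype(float)

X = np.column_stack([np.ones(n), x1])
beta = np.zeros(2)
for _ in range(25):                                   # Newton-Raphson iterations
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    score = X.T @ (y - p)                             # gradient of the log-likelihood
    info = X.T @ (X * (p * (1.0 - p))[:, None])       # information matrix
    beta = beta + np.linalg.solve(info, score)

# (b) Marginal effect of x1 is beta1 * prob(y=1) * (1 - prob(y=1)).
p_hat = 1.0 / (1.0 + np.exp(-(X @ beta)))
ame = float(np.mean(beta[1] * p_hat * (1.0 - p_hat)))  # averaged over observations

p_at_mean = 1.0 / (1.0 + np.exp(-(beta[0] + beta[1] * x1.mean())))
mem = beta[1] * p_at_mean * (1.0 - p_at_mean)          # evaluated at the mean of x1

print("beta-hat:", beta.round(3))
print("average marginal effect:", round(ame, 4))
print("marginal effect at mean x:", round(float(mem), 4))
```

Because p(1 − p) never exceeds 1/4, the marginal effect of a logit coefficient is always less than a quarter of the coefficient itself, which is the point behind quiz question 2 below.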
To get the marginal effects for logit in Stata, add the margins, dydx(.) command again after the logit procedure as follows:

logit a_grade gpa tuce_scr psi
margins, dydx(gpa tuce_scr psi)

To get the marginal effects for logits in SAS, use the following:

proc qlim data=one;
model a_grade=gpa tuce_scr psi / discrete(d=logistic); /* d=probit for probits */
output out=outqlim marginal;
run;
proc means data=outqlim;
var meff_p2_gpa meff_p2_tuce_scr meff_p2_psi;
run;

[[[[[[[TIME TO PLAY: DO YOU WANT A WHOLE HERSHEY BAR?

1. An estimated age-coefficient value of .05 in a linear probability model of the probability of being married (with a zero-one dependent variable) indicates:
a. that 95 percent of the sample is not married
b. that for each additional year of age, the probability of marriage increases by 5 percent *
c. that for each additional year of age, the probability of marriage increases by less than 5 percent
d. none of the above

2. An estimated age-coefficient value of .05 in a binomial logit (or binary logit, logistic regression, or just logit) indicates:
a. that 95 percent of the sample is not married
b. that for each additional year of age, the probability of marriage increases by 5 percent
c. that for each additional year of age, the probability of marriage increases by less than 5 percent *
d. none of the above

3. The linearity or boundedness problem with the linear probability model is that:
a. the errors exhibit heteroskedasticity
b. the error is not normally distributed
c. the R2 is not an accurate measure of goodness of fit
d. a regression line with any slope will tend to rise above 1, and fall below 0, for some values of the independent variables *
]]]]]]]]]]]]]]

V. Coefficients vary in these models: the A_grade example

aldri_lpm_probit.do (along with prior results) yields:

            linear probability model   probit model     logit model
Constant    -1.498                     -7.452           -13.02
Gpa         .4639 (4.206)              1.626 (3.409)    2.826 (3.4252)
Tuce        .0105 (.670)               .0517 (.765)     .095 (.82)
Psi         .379 (.482)                1.426 (.53)      2.379 (.493)
log-likelihood   -12.978   -12.819   -12.890

VI. Testing multiple hypotheses: the likelihood ratio has a Chi-square distribution

Another example from the A_grade data: Are pre-course standings predictive? Test whether coeff(gpa) = 0 and coeff(tuce) = 0 simultaneously.

Tests: with and without (gpa and tuce)

                                                      probit          logit
log-likelihood with gpa/tuce                          -12.819         -12.890
log-likelihood without gpa/tuce                       -17.671         -17.671
log-likelihood ratio statistic (17.1 in Wooldridge)   2*4.852=9.704   2*4.781=9.562

In this example, with two variable coefficients set equal to zero, the log-likelihood ratio statistic is distributed as a Chi-square variate with 2 degrees of freedom under the null hypothesis that these variables are unimportant (and can therefore be left out of the equation). Is the null hypothesis supported?
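The arithmetic of the likelihood-ratio test above can be checked in a few lines. The sketch below uses the differences in log-likelihoods from the table (4.852 for probit, 4.781 for logit) and exploits the fact that a chi-square variate with exactly 2 degrees of freedom has the closed-form tail probability P(X > x) = exp(−x/2), so no table lookup is needed for this special case.

```python
import math

# Likelihood-ratio statistic: 2 * (logL with gpa/tuce - logL without gpa/tuce),
# using the log-likelihood differences reported in the table above.
lr_probit = 2 * 4.852   # = 9.704
lr_logit = 2 * 4.781    # = 9.562

# Chi-square(2) tail probability has the closed form exp(-x/2).
p_probit = math.exp(-lr_probit / 2)
p_logit = math.exp(-lr_logit / 2)

print("probit: LR =", round(lr_probit, 3), " p-value =", round(p_probit, 4))
print("logit:  LR =", round(lr_logit, 3), " p-value =", round(p_logit, 4))
# Both p-values come out below .01, so the null hypothesis (gpa and tuce
# do not matter) is rejected: pre-course standings are predictive.
```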