The Geometry of Logit and Probit

The Geometry of Logt and Probt Ths short note s meant as a supplement to Chapters and 3 of Spatal Models of Parlamentary Votng and the notaton and reference to fgures n the text below s to those two chapters. To recap, the normal vector s denoted as N j and ts reflecton as -N j. The normal vector s perpendcular to the cuttng plane. The cuttng plane n two dmensons s defned by the equaton N j (W Z j ) + N j (W Z j ) = 0 () where N j and N j are the components of the normal vector and (Z j, Z j ) s the mdpont of the roll call outcomes (see Fgure.). Any pont, (W, W ), that satsfes the equaton above les on the cuttng plane. For example, f the normal vector s (3, -) and the roll call mdpont s (, 0), then ths produces the equaton: 3(w ) (w 0) = 3w 3 w = 3w w 3 = 0 or 3w w = 3 so that (0, -3/), (/3, -), (, 3/), etc., all le on the plane. In three dmensons the cuttng plane s defned by the equaton N j (W Z j ) + N j (W Z j ) + N j3 (W 3 Z j3 ) = 0 () Where, as above, any pont, (W, W, W 3 ), that satsfes the equaton above les on the cuttng plane. For example, f the normal vector s (, -, ) and the roll call mdpont s (,, ), ths produces the equaton: (w ) (w ) + (w 3 ) = w w + w 3 = 0 or w w + w 3 = so that (0, 0, ), (, 0, -), (0,, 3), etc., all le on the plane.

For s dmensons the cuttng plane s defned by the vector equaton N j (W Z j ) = 0 (book,.8) (3) where N j s the s by normal vector such that N j N j =, W and Z j are s by vectors and 0 s an s by vector of zeroes. The normal vector s constraned to be of unt length for roll call votng work but t s a general vector n other applcatons. In general, f W A and W B are both ponts n the plane, N j W A = N j W B = c j, where c j s a scalar constant. Geometrcally, every pont n the plane projects onto the same pont on the lne defned by the normal vector, N j and ts reflecton -N j (see Fgure.). Ths projecton pont for the general case (N j not necessarly normalzed to one; that s, N j N j = ): M j = c j s N k= N j jk (book,.9) (4) To see ths consder the example of the 3-dmensonal plane above: w w = w w + w3 = w 3 [ ] (5) We need to fnd the pont on the normal vector [ - ] that satsfes equaton (5); that s, the pont where the plane passes through the normal vector tself. Let k be a scalar constant. The soluton s: k k = 4k + 4k + k =, so that k= 9 k [ ]

And the pont s 9 9 9 whch s gven n equaton (4). Note that, by constructon, N j M j = c j. In addton, n the roll call context, because the mdpont of the Yea and Nay polcy ponts, Z j, s on the cuttng plane, t also projects to the pont M j. The cuttng plane passes through the lne formed by the normal vector and ts reflecton (see Fgure.) at the pont M j. In the case of a smple probt analyss, the cuttng plane conssts of all possble legslator deal ponts such that the probablty of the correspondng legslator votng Yea or votng Nay s exactly.5; namely: P(legslator votes Yea) = P(legslator votes Nay) = 0+ + +... + ss Φ s = - 0+ + +... + ss Φ s = Φ ( 0) =.5 Where Φ(.) s the dstrbuton functon for the normal and,,, s are legslator s coordnates on the s dmensons. Because the s and s cannot be separately dentfed, the usual assumpton s to set s =. The equaton above reduces to: 0 + + + + s s = 0 or + + + s s = = - 0 (6) where s the s-length vector of legslator coordnates: = 3 s 3

and s an s-length vector of the coeffcents,, 3,, s ; that s: = 3 s Note that the expresson = - 0 s exactly the same as N j W = c j, whch was used above. Namely, set N j = and every pont n the plane projects onto the pont: M j = - 0 s k= k. (book,.0) (7) In other words, n a regular probt context the coeffcents on the ndependent varables form a normal vector to a plane that passes through the pont - 0 s k= k. Note that ths pont s fxed wth regard to s. In partcular, 0 + + +... + ss 0 = + s s s So that - = - s s 0 0 s s k k k= s k= (8) When there s complete separaton that s, no error, ths plays an mportant role below. are: The smple logt case s dentcal to probt when s =. The logt probabltes 4

( 0 + + +... + s s ) e = + e + e ( + + +... + ) ( + + +... + ) 0 s s 0 s s =.5 Cancelng out the denomnator and takng the natural log of both sdes yelds the same equaton as probt: 0 + + + + s s = 0 In both probt and logt the coeffcents on the ndependent varables form a normal vector to a plane that passes through the pont - 0 s k= k. The cosne of the angle between the normal vectors from probt and logt should be very close to one. That s: ' P L cosθ = P L (9) where θ s the angle between the two normal vectors, and. s the correspondng norm of the normal vector. Computng cosθ s a useful check on the two estmaton technques. Several examples of ths are shown n the Appendx. The Geometry of Complete Separaton (Perfect Votng) To smplfy the notaton below let the normal probablty densty functon be 0 + σ 0 + and the dstrbuton functon be Φ σ whch I wll smplfy to just and Φ, respectvely. I leave s n the expressons because t s the source of the dentfcaton problem. Namely, as a practcal matter, the cuttng plane s dentfed (up to a slght wggle, dependng upon the number of ponts) but, because there s no error, 5

6 s 0 and the observed coeffcent vector explodes to entres of + or - because of the mplct dvson of the coeffcents by s. In a standard presentaton the th row of the matrx of ndependent varables would be: 3 s, and the coeffcent vector would be 0 3 s. But for clarty of presentaton I want to separate the ntercept term, 0, from the other s coeffcents. In the general context my use of Yea and Nay below corresponds to the bnary dependent varable beng and 0, respectvely. The frst dervatves for the Probt Log-lkelhood functon are: ln L = Yea Nay Yea Nay Yea Nay s s Yea Nay Φ Φ Φ Φ (0) Let a and b be ndces from to s. Treatng 0 separately, the second dervatves are

+ + ( ) 0 0 +( Φ) ln L σ σ = 0 Yea Φ Nay ( ) + + ( ) 0 0 +( Φ) ln L σ σ = a 0 a Yea Φ Nay ( ) a (A) (B) And the remanng are: b + +b ( ) 0 0 +( Φ) ln L σ σ = a b ab ba bb Yea Φ Nay ( ) (C) Before turnng the pure case of complete separaton, consder the ntermedate but vexng case where one of the ndependent varables s an ndcator varable that separates wth respect to the bnary dependent varable. That s, whenever the ndcator s the correspondng value of the dependent varable s. Let the ndcator varable be. The frst partal dervatve s: ln L = + = = 0 Φ Φ Φ Φ () Yea,x= Yea,x= 0 Nay,x= 0 Yea,x= Because multplcaton by zero cancels the nd and 3 rd terms and the case Nay, x =, by defnton, does not occur. Hence, for equaton () to hold t must be the case that because =, 0 + σ + K σ = 0, where + Φ Φ + K σ σ Yea,x= Yea,x= 0 K + + +... + = s 0 3 3 s s. 7

The K can be treated as constants so that for equaton () to hold t must be the case that +. Note that ths means that the normal vector tself s not dentfed. Wth the K as constants so that s s a constant, then there wll be a dfferent normal vector wth every change n the value of so that the cuttng plane s not dentfed. The case of complete separaton has a dfferent geometry. By defnton, there exsts a plane that perfectly dvdes the Yeas ( s) from the Nays ( 0 s). Let N j be the s by normal vector to ths plane such that N j N j = (I do not need the j subscrpt here but I retan t for notatonal consstency). Hence, as dscussed above, f A and B are both ponts n the plane, then N j A = N j B = c j, where c j s a scalar constant. For complete separaton ether we have: f Yea, then N > c ; and f Nay, then N < c (3) j j j j or f Yea, then N < c ; and f Nay, then N > c j j j j Wthout loss of generalty I wll assume that equaton (3) s the correct polarty. Usng equaton (6) above, ths s equvalent to: 0 0 f Yea, then ; and f Nay, then > < (4) s s s s Or more smply: + +... + + +... + f Yea then > and f Nay, then < s s 0 s s 0 s s, 0; 0 However, f ths s true then the lkelhood functon forces the followng: 0 + 0 + f Yea, then Φ ; and f Nay, then Φ 0 σ σ 8

because σ 0. Agan, however, note that the cuttng plane s dentfed up to a slght wggle because the s are fxed constants and, by defnton, the plane perfectly dvdes the Yeas ( s) from the Nays ( 0 s). Ths plane passes through the s-dmensonal space of the ndependent varables and separates those cases correspondng to Yeas ( s) from those correspondng to Nays ( 0 s). In actual estmaton, the lkelhood functon for the lnear probt (and logt) problem s globally convex the nverse of the Hessan (equaton system above) s negatve defnte. Hence, the gradent vector wll quckly clmb up the surface of the lkelhood functon to the global maxmum. In the case of perfect separaton what ths means n practce s that the gradent vector explodes as t approaches the global maxmum of zero; that s, the vector becomes nfntely long. Therefore, a straghtforward soluton to ths problem s to apply the constrant = at each teraton and then adjustng the standard devaton term, σ, to compensate. When the standard devaton term begns to vansh, stop. It s then a smple matter to take the normal vector and fnd c j wth the Jance algorthm. Note that ths produces the coeffcent vector correspondng to a perfect specfcaton for ths partcular set of ndependent varables,, and the bnary dependent varable, y. However, snce there s no error, there s no probablty, and no standard errors. 9

Appendx Here are two examples from the NES 000 electon study. The varables are whether or not the respondent voted (0=not voted, > 0 voted), and the ndependent varables are party (0 6), ncome ( categores), race (0 = whte; = black), sex (0 = male, = female), south (0=north, =south), educaton (=hgh school, =some college, 3=college). Frst Example: Smple Probt and Smple Logt. probt voted party ncome race sex south educaton age Iteraton 0: log lkelhood = -680.766 Iteraton : log lkelhood = -607.997 Iteraton : log lkelhood = -607.895 Iteraton 3: log lkelhood = -607.890 Iteraton 4: log lkelhood = -607.890 Probt regresson Number of obs = 06 LR ch(7) = 47.08 Prob > ch = 0.0000 Log lkelhood = -607.890 Pseudo R = 0.080 voted Coef. Std. Err. z P> z [95% Conf. Interval] -------------+---------------------------------------------------------------- party.0905.03884 0.90 0.369 -.0755.0655 ncome.0358093.03463.70 0.007.009847.06776 race.03704.733654 0. 0.83 -.306875.376894 sex -3.3e-06.084055-0.00.000 -.64743.647366 south -.7693.08979 -.97 0.048 -.350758 -.003089 educaton.43938.0566663 7.48 0.000.3874.53500 age.094557.00784 6.99 0.000.040043.04907 _cons -.503988.0846-7.44 0.000 -.900458 -.0757. logt voted party ncome race sex south educaton age Iteraton 0: log lkelhood = -680.766 Iteraton : log lkelhood = -607.9598 Iteraton : log lkelhood = -605.9353 Iteraton 3: log lkelhood = -605.953 Iteraton 4: log lkelhood = -605.953 Logstc regresson Number of obs = 06 LR ch(7) = 49.60 Prob > ch = 0.0000 Log lkelhood = -605.953 Pseudo R = 0.099 voted Coef. Std. Err. z P> z [95% Conf. Interval] -------------+---------------------------------------------------------------- party.036497.035805.0 0.33 -.03407.06364 ncome.063748.09564.7 0.007.0738.073686 race.0769.8544 0.7 0.789 -.486799.6350639 sex.0598.406487 0. 0.90 -.597536.9579 south -.300596.48330 -.03 0.043 -.5938 -.0098743 educaton.70976.0964645 7.47 0.000.53909.90048 age.0335865.004884 6.97 0.000.0446.0430303 _cons -.6774.35473-7.46 0.000-3.30649 -.93899 0

Probt Normalzed (Normal Vector) Logt Normalzed (Normal Vector) Party 0.04498 0.04586 Income 0.077377 0.078054 Race 0.0807 0.096566 Sex -0.000007 0.0068 South -0.38079-0.380975 Educaton 0.9605 0.93764 Age 0.04040 0.04568 The correlaton (Cosne) between the two normal vectors = 0.99976.

Second Example: Ordered Probt and Ordered Logt. oprobt party voted ncome race sex south educaton age Iteraton 0: log lkelhood = -055.46 Iteraton : log lkelhood = -954.4476 Iteraton : log lkelhood = -954.38 Iteraton 3: log lkelhood = -954.38 Ordered probt regresson Number of obs = 06 LR ch(7) = 0.73 Prob > ch = 0.0000 Log lkelhood = -954.38 Pseudo R = 0.049 party Coef. Std. Err. z P> z [95% Conf. Interval] -------------+---------------------------------------------------------------- voted.4435.0395748 0.7 0.000.3465583.506887 ncome.076777.0095894.84 0.065 -.007.036476 race -.8669836.43798-6.05 0.000 -.47807 -.586603 sex -.0837943.06489 -.9 0.96 -.08568.04368 south.46657.06937 3.56 0.000.08085.385055 educaton -.009693.043899-0.04 0.964 -.0880097.08407 age -.0066873.004-3. 0.00 -.00886 -.004886 -------------+---------------------------------------------------------------- /cut -.7585.5773 -.04965 -.45474 /cut -.60099.4865 -.573096.065898 /cut3.007637.47737 -.088787.490344 /cut4.588.485304.307086.8937 /cut5.968794.50958.665737.580 /cut6.47069.557338.65386.77585. ologt party voted ncome race sex south educaton age Iteraton 0: log lkelhood = -055.46 Iteraton : log lkelhood = -958.0084 Iteraton : log lkelhood = -957.35 Iteraton 3: log lkelhood = -957.306 Ordered logstc regresson Number of obs = 06 LR ch(7) = 96.3 Prob > ch = 0.0000 Log lkelhood = -957.306 Pseudo R = 0.0477 party Coef. Std. Err. z P> z [95% Conf. Interval] -------------+---------------------------------------------------------------- voted.69534.06894 0.8 0.000.564633.89647 ncome.095464.060355.84 0.065 -.00886.0609755 race -.45739.404073-6.06 0.000 -.9858 -.986005 sex -.444.0974 -.3 0.89 -.35958.07083 south.3940648.7609 3.36 0.00.644337.636959 educaton -.0745.074965-0.7 0.865 -.59667.34767 age -.05873.0036647-3.6 0.00 -.087699 -.0044047 -------------+---------------------------------------------------------------- /cut -.355.54506 -.8436 -.866589 /cut -.407.47775 -.9066953.06455 /cut3.7795.460457 -.04886.76099 /cut4.8034034.477555.3785.88995 /cut5.534033.59847.0389.09874 /cut6.47.679.898674.96765

Probt Normalzed (Normal Vector) Logt Normalzed (Normal Vector) voted 0.4474 0.46670 Income 0.07680 0.07706 Race -0.867086-0.873346 Sex -0.083804-0.08646 South 0.46686 0.3645 Educaton -0.00970-0.007638 Age -0.006688-0.006944 The correlaton (Cosne) between the two normal vectors = 0.99995. 3