Chapter 14 Logistic Regression Models

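In the linear regression model $y = X\beta + \varepsilon$, there are two types of variables: the explanatory variables $X_1, X_2, \ldots, X_k$ and the study variable $y$. These variables can be measured on a continuous scale or as indicator variables. When the explanatory variables are qualitative, their values are expressed as indicator variables and dummy variable models are used. When the study variable is qualitative, its values can be expressed using an indicator variable taking only two possible values, 0 and 1. In such a case, logistic regression is used. For example, $y$ can denote outcomes like success or failure, yes or no, like or dislike, which can be coded by the two values 0 and 1.

Consider the model

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_k x_{ik} + \varepsilon_i = x_i'\beta + \varepsilon_i, \quad i = 1, 2, \ldots, n,$$

where $x_i' = [1, x_{i1}, x_{i2}, \ldots, x_{ik}]$ and $\beta = [\beta_0, \beta_1, \beta_2, \ldots, \beta_k]'$.

The study variable takes two values, $y_i = 0$ or $1$. Assume that $y_i$ follows a Bernoulli distribution with parameter $\pi_i$, so its probability distribution is

$$y_i = \begin{cases} 1 & \text{with } P(y_i = 1) = \pi_i, \\ 0 & \text{with } P(y_i = 0) = 1 - \pi_i. \end{cases}$$

Assuming $E(\varepsilon_i) = 0$,

$$E(y_i) = 1 \cdot \pi_i + 0 \cdot (1 - \pi_i) = \pi_i.$$

From the model $y_i = x_i'\beta + \varepsilon_i$, we have $E(y_i) = x_i'\beta$, so

$$E(y_i) = x_i'\beta = \pi_i \quad \Rightarrow \quad E(y_i) = P(y_i = 1).$$

Thus the response function $E(y_i)$ is simply the probability that $y_i = 1$.

Regression Analysis | Chapter 14 | Logistic Regression Models | Shalabh, IIT Kanpur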

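Note that $\varepsilon_i = y_i - x_i'\beta$, so
- when $y_i = 1$, then $\varepsilon_i = 1 - x_i'\beta$;
- when $y_i = 0$, then $\varepsilon_i = -x_i'\beta$.

Recall that earlier $\varepsilon_i$ was assumed to follow a normal distribution when $y$ was not an indicator variable. When $y$ is an indicator variable, $\varepsilon_i$ takes only two values, so it cannot be assumed to follow a normal distribution.

In the usual regression model, the errors are homoskedastic, i.e., $\mathrm{Var}(\varepsilon_i) = \sigma^2$, and so $\mathrm{Var}(y_i) = \sigma^2$. When $y$ is an indicator variable,

$$\mathrm{Var}(y_i) = E[y_i - E(y_i)]^2 = (1 - \pi_i)^2 \pi_i + (0 - \pi_i)^2 (1 - \pi_i) = \pi_i (1 - \pi_i) = E(y_i)[1 - E(y_i)].$$

Thus $\mathrm{Var}(y_i)$ is a function of the mean of $y_i$ and changes from one observation to another, so the errors are not homoskedastic.

Moreover, since $E(y_i) = \pi_i$ and $\pi_i$ is a probability, $0 \le \pi_i \le 1$, and thus there is a constraint on $E(y_i)$ that $0 \le E(y_i) \le 1$. This puts a big constraint on the choice of a linear response function. One cannot fit a model in which the predicted values lie outside the interval from 0 to 1.

When $y$ is a dichotomous variable, empirical evidence suggests that a function $E(y)$ defined on the whole real line and mapped to $[0, 1]$ has a sigmoid shape. It is a nonlinear S shape.

[Figure: two panels plotting $E(y)$ against $x$, each showing an S-shaped curve.]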
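The two difficulties described above (a variance tied to the mean, and a straight line that can leave the interval from 0 to 1) can be checked numerically. The following is only a minimal sketch: the function names `bernoulli_variance` and `linear_response`, the straight-line coefficients, and the x values are hypothetical and chosen purely for illustration.

```python
# Illustrative sketch only; names and numbers are hypothetical.

def bernoulli_variance(pi):
    """Variance of a 0/1 variable computed from the definition E[y - E(y)]^2."""
    mean = 1 * pi + 0 * (1 - pi)  # E(y) = pi
    return (1 - mean) ** 2 * pi + (0 - mean) ** 2 * (1 - pi)


def linear_response(b0, b1, x):
    """Straight-line response function E(y) = b0 + b1 * x."""
    return b0 + b1 * x


# The variance changes with pi, so it cannot equal one constant sigma^2.
probabilities = [0.1, 0.3, 0.5, 0.7, 0.9]
variances = [bernoulli_variance(p) for p in probabilities]

# Hypothetical straight-line coefficients evaluated over a wide range of x.
b0, b1 = 0.2, 0.15
xs = [-3, 0, 2, 4, 8]
fitted = [linear_response(b0, b1, x) for x in xs]
out_of_range = [f for f in fitted if f < 0 or f > 1]

for p, v in zip(probabilities, variances):
    print(f"pi = {p:.1f}  Var(y) = {v:.4f}  pi(1 - pi) = {p * (1 - p):.4f}")
print("straight-line values:", fitted)
print("values outside [0, 1]:", out_of_range)
```

With these made-up coefficients the straight line gives values below 0 and above 1 at the extreme x values, which is the motivation for an S-shaped response function.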

A natural choice for $E(y)$ would be the cumulative distribution function of a random variable. In particular, the logistic distribution, whose cumulative distribution function is the simplified logistic function, yields a good link and is given by

$$E(y) = \frac{\exp(x'\beta)}{1 + \exp(x'\beta)} = \frac{1}{1 + \exp(-x'\beta)}.$$

Linear predictor and link functions:

The systematic component in $E(y_i)$ is the linear predictor and is denoted as

$$\eta_i = \sum_{j=0}^{k} \beta_j x_{ij} = x_i'\beta, \quad i = 1, 2, \ldots, n, \quad j = 0, 1, \ldots, k,$$

where $x_{i0} = 1$.

The link function in a generalized linear model relates the linear predictor $\eta_i$ to the mean response $\mu_i$. Thus

$$g(\mu_i) = \eta_i \quad \text{or} \quad \mu_i = g^{-1}(\eta_i).$$

In the usual linear models based on a normally distributed study variable, the link $g(\mu_i) = \mu_i$ is used and is called the identity link. A link function maps the range of $\mu_i$ onto the whole real line, provides a good empirical approximation, and carries meaningful interpretations in real applications.

In the case of logistic regression, the link function is defined as

$$\eta = \ln \frac{\pi}{1 - \pi}.$$

This transformation is called the logit transformation of the probability $\pi$, and $\dfrac{\pi}{1 - \pi}$ is called the odds. The link $\eta$ is also called the log-odds. This link function is obtained as follows:

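$$\pi = \frac{1}{1 + \exp(-\eta)}$$
or
$$\pi [1 + \exp(-\eta)] = 1$$
or
$$\pi \exp(-\eta) = 1 - \pi$$
or
$$e^{\eta} = \frac{\pi}{1 - \pi}, \quad \text{so} \quad \eta = \ln \frac{\pi}{1 - \pi}.$$

Note: Similar to the logit function, there are other functions that have the same shape as the logistic function. These functions can also be expressed as transformations of $\pi$. Two popular choices are the probit transformation and the complementary log-log transformation. The probit transformation is based on transforming $\pi$ using the cumulative distribution function of the normal distribution, and the probit regression model is based on it. The complementary log-log transformation of $\pi$ is $\ln[-\ln(1 - \pi)]$.

Maximum likelihood estimation of parameters:

Consider the general form of the logistic regression model

$$y_i = E(y_i) + \varepsilon_i,$$

where the $y_i$'s are independent Bernoulli random variables with parameter $\pi_i$ and

$$E(y_i) = \pi_i = \frac{\exp(x_i'\beta)}{1 + \exp(x_i'\beta)}.$$

The probability mass function of $y_i$ is

$$f_i(y_i) = \pi_i^{y_i} (1 - \pi_i)^{1 - y_i}, \quad i = 1, 2, \ldots, n, \quad y_i = 0 \text{ or } 1.$$

The likelihood function is

$$L(y_1, y_2, \ldots, y_n, \beta_0, \beta_1, \ldots, \beta_k) = \prod_{i=1}^{n} f_i(y_i) = \prod_{i=1}^{n} \pi_i^{y_i} (1 - \pi_i)^{1 - y_i}.$$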
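Before the likelihood is developed further, the transformations of $\pi$ discussed above can be made concrete with a few numbers. The sketch below is only illustrative: the function names are hypothetical, and the probit is computed with `statistics.NormalDist().inv_cdf` from the Python standard library (available from Python 3.8), which is one convenient way to invert the normal cumulative distribution function.

```python
import math
from statistics import NormalDist


def logistic(eta):
    """Logistic response function: pi = 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + math.exp(-eta))


def logit(pi):
    """Logit link (log-odds): eta = ln(pi / (1 - pi))."""
    return math.log(pi / (1.0 - pi))


def probit(pi):
    """Probit link: inverse of the standard normal CDF at pi."""
    return NormalDist().inv_cdf(pi)


def cloglog(pi):
    """Complementary log-log link: ln(-ln(1 - pi))."""
    return math.log(-math.log(1.0 - pi))


# Each link maps a probability in (0, 1) to a point on the real line.
for pi in (0.1, 0.25, 0.5, 0.75, 0.9):
    print(f"pi = {pi:.2f}  logit = {logit(pi):+.4f}  "
          f"probit = {probit(pi):+.4f}  cloglog = {cloglog(pi):+.4f}")

# The logit undoes the logistic function, as in the derivation of the link.
eta = 1.7
print("logit(logistic(eta)) =", logit(logistic(eta)))
```

The logit and probit are both zero at $\pi = 0.5$, while the complementary log-log transformation is not symmetric about that point.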

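The log-likelihood is

$$\ln L = \sum_{i=1}^{n} \left[ y_i \ln \pi_i + (1 - y_i) \ln(1 - \pi_i) \right] = \sum_{i=1}^{n} y_i \ln \frac{\pi_i}{1 - \pi_i} + \sum_{i=1}^{n} \ln(1 - \pi_i).$$

Since

$$\pi_i = \frac{\exp(x_i'\beta)}{1 + \exp(x_i'\beta)}, \quad 1 - \pi_i = \frac{1}{1 + \exp(x_i'\beta)}, \quad \frac{\pi_i}{1 - \pi_i} = \exp(x_i'\beta), \quad \ln \frac{\pi_i}{1 - \pi_i} = x_i'\beta,$$

so

$$\ln L = \sum_{i=1}^{n} y_i x_i'\beta - \sum_{i=1}^{n} \ln\left[1 + \exp(x_i'\beta)\right].$$

Suppose repeated observations are available at each level of the $x$-variables. Let $y_i$ be the number of 1's observed for the $i$-th observation and $n_i$ be the number of trials at each observation. Then

$$\ln L = \sum_{i=1}^{n} y_i \ln \pi_i + \sum_{i=1}^{n} n_i \ln(1 - \pi_i) - \sum_{i=1}^{n} y_i \ln(1 - \pi_i).$$

The maximum likelihood estimate $\hat{\beta}$ of $\beta$ is obtained by numerical maximization.

If $V(\varepsilon) = \Omega$, then asymptotically

$$E(\hat{\beta}) = \beta, \quad V(\hat{\beta}) = (X'\Omega X)^{-1},$$

where, for binary responses, $\Omega$ is the diagonal matrix with elements $\pi_i (1 - \pi_i)$.

After obtaining $\hat{\beta}$, the linear predictor is estimated by

$$\hat{\eta}_i = x_i'\hat{\beta}.$$

The fitted value is

$$\hat{y}_i = \hat{\pi}_i = \frac{\exp(\hat{\eta}_i)}{1 + \exp(\hat{\eta}_i)} = \frac{1}{1 + \exp(-\hat{\eta}_i)} = \frac{1}{1 + \exp(-x_i'\hat{\beta})}.$$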
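The text says only that $\hat{\beta}$ is found by numerical maximization and does not name a method. As one possible illustration, the sketch below uses Newton-Raphson (an assumption, not necessarily the method intended by the notes) for a model with an intercept and a single explanatory variable. The data, the function names, and the tolerance are all hypothetical.

```python
import math


def logistic(eta):
    return 1.0 / (1.0 + math.exp(-eta))


def log_likelihood(b0, b1, xs, ys):
    """ln L = sum y_i * eta_i - sum ln(1 + exp(eta_i)), with eta_i = b0 + b1 * x_i."""
    total = 0.0
    for x, y in zip(xs, ys):
        eta = b0 + b1 * x
        total += y * eta - math.log(1.0 + math.exp(eta))
    return total


def score(b0, b1, xs, ys):
    """Gradient of ln L: sum over i of (y_i - pi_i) * [1, x_i]."""
    g0 = g1 = 0.0
    for x, y in zip(xs, ys):
        resid = y - logistic(b0 + b1 * x)
        g0 += resid
        g1 += resid * x
    return g0, g1


def information(b0, b1, xs):
    """Distinct entries (a, b, d) of the 2x2 matrix X'WX, W = diag(pi_i * (1 - pi_i))."""
    a = b = d = 0.0
    for x in xs:
        p = logistic(b0 + b1 * x)
        w = p * (1.0 - p)
        a += w
        b += w * x
        d += w * x * x
    return a, b, d


def fit_newton(xs, ys, tol=1e-10, max_iter=50):
    """Newton-Raphson iterations started from beta = (0, 0)."""
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0, g1 = score(b0, b1, xs, ys)
        a, b, d = information(b0, b1, xs)
        det = a * d - b * b
        # Solve (X'WX) * step = score using the explicit 2x2 inverse.
        step0 = (d * g0 - b * g1) / det
        step1 = (-b * g0 + a * g1) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < tol:
            break
    return b0, b1


# Hypothetical binary data in which the 0's and 1's overlap along x.
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [0, 0, 1, 0, 1, 0, 1, 1]

b0_hat, b1_hat = fit_newton(xs, ys)
a, b, d = information(b0_hat, b1_hat, xs)
det = a * d - b * b
# Diagonal of (X'WX)^{-1} gives the estimated asymptotic variances.
se_b0, se_b1 = math.sqrt(d / det), math.sqrt(a / det)
fitted = [logistic(b0_hat + b1_hat * x) for x in xs]

print("estimates:", b0_hat, b1_hat)
print("standard errors:", se_b0, se_b1)
print("fitted probabilities:", [round(p, 3) for p in fitted])
```

The explicit 2x2 inverse keeps the sketch free of external libraries. With more explanatory variables the same update would need a general linear solver, and a safeguard such as step halving may be needed when the 0's and 1's are nearly separated.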

Interpretation of parameters:

To understand the interpretation of the $\beta$'s in the logistic regression model, first consider a simple case with only one explanatory variable,

$$\eta(x) = \beta_0 + \beta_1 x.$$

After fitting the model, $\hat{\beta}_0$ and $\hat{\beta}_1$ are obtained as the estimators of $\beta_0$ and $\beta_1$ respectively. Then the fitted linear predictor at $x = x_i$ is

$$\hat{\eta}(x_i) = \hat{\beta}_0 + \hat{\beta}_1 x_i,$$

which is the log-odds at $x = x_i$. The fitted value at $x = x_i + 1$ is

$$\hat{\eta}(x_i + 1) = \hat{\beta}_0 + \hat{\beta}_1 (x_i + 1),$$

which is the log-odds at $x = x_i + 1$. Thus

$$\hat{\beta}_1 = \hat{\eta}(x_i + 1) - \hat{\eta}(x_i) = \ln[\mathrm{odds}(x_i + 1)] - \ln[\mathrm{odds}(x_i)] = \ln \frac{\mathrm{odds}(x_i + 1)}{\mathrm{odds}(x_i)},$$

so

$$\frac{\mathrm{odds}(x_i + 1)}{\mathrm{odds}(x_i)} = \exp(\hat{\beta}_1).$$

This is termed the odds ratio. It is the estimated factor by which the odds of success are multiplied when the value of the explanatory variable changes by one unit.

When there is more than one explanatory variable in the model, the interpretation of the $\beta_j$'s is similar to the case of a single explanatory variable. The odds ratio associated with the explanatory variable $x_j$ is $\exp(\hat{\beta}_j)$, keeping the other explanatory variables constant. This is similar to the interpretation of $\beta_j$ in the multiple linear regression model.

If there is an $m$-unit change in the explanatory variable, then the estimated odds ratio for that change is $\exp(m\hat{\beta}_j)$.
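The odds-ratio interpretation above can be verified with a short calculation. This is a minimal sketch: the coefficient values, the point `x_i`, and the change `m` are hypothetical and are not taken from any fitted data.

```python
import math


def odds(b0, b1, x):
    """Fitted odds pi / (1 - pi) = exp(eta) at x, where eta = b0 + b1 * x."""
    return math.exp(b0 + b1 * x)


def probability(b0, b1, x):
    """Fitted probability pi = 1 / (1 + exp(-eta)) at x."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


# Hypothetical fitted coefficients, starting point, and size of change.
b0_hat, b1_hat = -2.0, 0.8
x_i = 1.5
m = 3

odds_ratio_one_unit = odds(b0_hat, b1_hat, x_i + 1) / odds(b0_hat, b1_hat, x_i)
odds_ratio_m_units = odds(b0_hat, b1_hat, x_i + m) / odds(b0_hat, b1_hat, x_i)

print("odds ratio for a one-unit change:", odds_ratio_one_unit)
print("exp(b1_hat):                     ", math.exp(b1_hat))
print("odds ratio for an m-unit change: ", odds_ratio_m_units)
print("exp(m * b1_hat):                 ", math.exp(m * b1_hat))

# The odds ratio is the same at every x, but the change in probability is not.
for x in (0.0, 2.5, 5.0):
    change = probability(b0_hat, b1_hat, x + 1) - probability(b0_hat, b1_hat, x)
    print(f"x = {x:.1f}: change in probability for one unit = {change:.4f}")
```

The last loop shows why the odds ratio is read as a multiplicative change in the odds rather than a fixed change in the probability: the same coefficient produces different probability changes at different values of x.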

Test of hypothesis:

The test of hypothesis for the parameters in the logistic regression model is based on asymptotic theory. It is a large sample test based on the likelihood ratio test, using a statistic termed the deviance.

A model with as many parameters as observations, which therefore fits the sample data perfectly, is termed the saturated model.

The statistic that compares the log-likelihoods of the fitted and saturated models is called the model deviance. It is defined as

$$\lambda(\beta) = 2 \ln L(\text{saturated model}) - 2 \ln L(\hat{\beta}),$$

where $\ln L(\cdot)$ is the log-likelihood and $\hat{\beta}$ is the maximum likelihood estimate of $\beta$.

In the case of the logistic regression model, $y_i = 0$ or $1$ and the $\pi_i$'s are completely unrestricted in the saturated model. So the likelihood is maximum at $\pi_i = y_i$, and the maximum value of $L(\text{saturated model})$ is

$$\max L(\text{saturated model}) = 1 \quad \Rightarrow \quad \ln \max L(\text{saturated model}) = 0.$$

Let $\hat{\beta}$ be the maximum likelihood estimator of $\beta$. The log-likelihood is maximum at $\beta = \hat{\beta}$, and

$$\ln L(\hat{\beta}) = \sum_{i=1}^{n} y_i x_i'\hat{\beta} - \sum_{i=1}^{n} \ln\left[1 + \exp(x_i'\hat{\beta})\right].$$

Since $\ln L(\text{saturated model}) = 0$, the deviance reduces to $\lambda(\beta) = -2 \ln L(\hat{\beta})$.

Assuming that the logistic regression function is correct, the large sample distribution of the likelihood ratio test statistic $\lambda(\beta)$ is approximately $\chi^2(n - p)$ when $n$ is large, where $p$ is the number of parameters in the fitted model.

A large value of $\lambda(\beta)$ implies that the model is incorrect. A small value of $\lambda(\beta)$ implies that the model is well fitted and is as good as the saturated model.

Note that generally the fitted model will have a smaller number of parameters than the saturated model, which is based on all the parameters. Thus at the $\alpha\%$ level of significance