Faculty Research Interest Seminar, Department of Biostatistics, GSPH, University of Pittsburgh. Gong Tang, Feb. 18, 2005


Introduction
Joined the department in 2.
Teach two courses: Elements of Stochastic Processes (Biostat 24) and Analysis of Incomplete Data (Biostat 265).
Some research interests: Statistical Analysis of Missing Data. Analysis of Correlated Outcomes. Semi-Parametric Statistics.
Statistician of the Biostatistical Center of NSABP (www.nsabp.pitt.edu). Work on cancer clinical trials.
Recently involved in studies to investigate the association between genomic data and clinical outcomes (with Drs. John Bryant and Joe Costantino).

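An Interesting Phenomenon
Pseudolikelihood method. Likelihood function of $\theta$: $L(\theta; \alpha) = L(\theta \mid \alpha)$, where $\theta$ is the parameter of interest and $\alpha$ is a nuisance parameter. $\hat\alpha$ is a consistent estimator of $\alpha$; $\alpha_0$ is the true value.
Consider $\hat\theta = \arg\max_\theta L(\theta; \alpha_0)$ and $\tilde\theta = \arg\max_\theta L(\theta; \hat\alpha)$ (the pseudolikelihood estimate).
In some circumstances, $\mathrm{var}(\tilde\theta) < \mathrm{var}(\hat\theta)$. This is demonstrated through a missing-data problem.
Question: how can the efficiency be improved when $\alpha$ is actually known from another source?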

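Bivariate data with outcome-dependent nonresponse
Model assumptions:
(a) $X \sim f(x; \alpha)$, $[Y \mid X] \sim g(y \mid x, \theta)$.
(b) $\Pr[R = 1 \mid X, Y] = w(y; \psi)$.
A conditional likelihood method (valid because $X$ is independent of $R$ given $Y$):
$$L(\theta; F) = \prod_{i=1}^{m} p(x_i \mid y_i; \theta, F) \propto \prod_{i=1}^{m} \frac{g(y_i \mid x_i, \theta)}{\int g(y_i \mid x, \theta)\, dF(x)}, \qquad m = \text{number of complete cases},$$
where $F(x) = F(x; \alpha_0)$ represents the true CDF of $X$. Then
$$\hat\theta = \arg\max_\theta L(\theta; F), \qquad \tilde\theta = \arg\max_\theta L(\theta; F(x; \hat\alpha)), \qquad \text{where } \hat\alpha = \arg\max_\alpha \prod_{i=1}^{n} f(x_i; \alpha).$$
[Figure: missing-data pattern for (X, Y), with rows R = 1 and R = 0.]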
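As a rough illustration of the conditional likelihood on this slide, the sketch below evaluates its logarithm in one special case. It assumes, purely for illustration, that $X \sim N(\mu, \tau^2)$ and $[Y \mid X = x] \sim N(\beta_0 + \beta_1 x, \sigma^2)$, so that the denominator integral has the closed form of a $N(\beta_0 + \beta_1\mu,\ \sigma^2 + \beta_1^2\tau^2)$ density. The function name `cond_loglik` and the parametrization by `log_sigma` are hypothetical choices, not part of the talk.

```python
import numpy as np
from scipy.stats import norm


def cond_loglik(theta, x_cc, y_cc, mu, tau):
    """Conditional log-likelihood of theta = (b0, b1, log_sigma), complete cases only.

    Illustrative assumptions (not from the slides):
      X ~ N(mu, tau^2),  Y | X = x ~ N(b0 + b1 * x, sigma^2).
    Under these, the integral of g(y | x, theta) dF(x) equals the
    N(b0 + b1 * mu, sigma^2 + b1^2 * tau^2) density evaluated at y.
    """
    b0, b1, log_sigma = theta
    sigma = np.exp(log_sigma)
    # Numerator: log g(y_i | x_i, theta) for each complete case.
    log_num = norm.logpdf(y_cc, loc=b0 + b1 * x_cc, scale=sigma)
    # Denominator: log of the marginal density of Y at y_i.
    marg_sd = np.sqrt(sigma**2 + (b1 * tau) ** 2)
    log_den = norm.logpdf(y_cc, loc=b0 + b1 * mu, scale=marg_sd)
    return np.sum(log_num - log_den)
```

Plugging in the true $(\mu, \tau)$ corresponds to $\hat\theta$, while plugging in estimates of $(\mu, \tau)$ computed from all $n$ observed $x$ values corresponds to $\tilde\theta$.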

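Full likelihood for 2-pattern data
Model assumptions:
(a) $X \sim f(x \mid \alpha)$, $[Y \mid X] \sim g(y \mid x, \theta)$.
(b) $\Pr[R = 1 \mid X, Y] = w(y \mid \psi)$.
Full likelihood:
$$L_{Full}(\alpha, \psi, \theta) = \prod_{i=1}^{m} f(x_i \mid \alpha)\, g(y_i \mid x_i, \theta)\, w(y_i \mid \psi) \times \prod_{i=m+1}^{n} \int f(x_i \mid \alpha)\, g(y \mid x_i, \theta)\{1 - w(y \mid \psi)\}\, dy$$
$$= \Big\{\prod_{i=1}^{n} f(x_i \mid \alpha)\Big\} \times \Big[\prod_{i=1}^{m} \{g(y_i \mid x_i, \theta)\, w(y_i \mid \psi)\} \prod_{i=m+1}^{n} \int g(y \mid x_i, \theta)\{1 - w(y \mid \psi)\}\, dy\Big] = L(\alpha)\, L(\psi, \theta).$$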

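Asymptotic Variance of $\hat\theta$
Let $l(\theta) = \log L(\theta; \alpha_0)$. From $\hat\theta = \arg\max_\theta L(\theta; \alpha_0)$,
$$0 = l_\theta(\hat\theta; \alpha_0) \approx l_\theta(\theta_0; \alpha_0) + l_{\theta\theta}(\theta_0; \alpha_0)(\hat\theta - \theta_0).$$
Therefore,
$$\hat\theta - \theta_0 \approx -l_{\theta\theta}(\theta_0; \alpha_0)^{-1}\, l_\theta(\theta_0; \alpha_0) \approx -E(l_{\theta\theta})^{-1}\, l_\theta(\theta_0; \alpha_0),$$
$$\mathrm{Var}(\sqrt{n}(\hat\theta - \theta_0)) \approx E(l_{\theta\theta})^{-1}\, \mathrm{Var}(l_\theta(\theta_0; \alpha_0))\, E(l_{\theta\theta})^{-1} = E(l_{\theta\theta})^{-1}\, E(l_\theta l_\theta^T)\, E(l_{\theta\theta})^{-1}$$
(a sandwich-type estimator), with all expectations evaluated at $(\theta_0, \alpha_0)$.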

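Asymptotic Variance of $\tilde\theta$ (I)
Let $S(\alpha) = \log f(x; \alpha)$. Then $\hat\alpha = \arg\max_\alpha \prod_{i} f(x_i; \alpha)$ is the MLE of $\alpha$, with
$$\hat\alpha - \alpha_0 \approx -E(S_{\alpha\alpha})^{-1} S_\alpha, \qquad \sqrt{n}(\hat\alpha - \alpha_0) \to N\big(0,\, -E(S_{\alpha\alpha})^{-1}\big).$$
From $\tilde\theta = \arg\max_\theta L(\theta; \hat\alpha) = \arg\max_\theta l(\theta; \hat\alpha)$,
$$0 = l_\theta(\tilde\theta; \hat\alpha) \approx l_\theta(\theta_0; \hat\alpha) + l_{\theta\theta}(\theta_0; \hat\alpha)(\tilde\theta - \theta_0),$$
then
$$\tilde\theta - \theta_0 \approx -l_{\theta\theta}(\theta_0; \hat\alpha)^{-1}\, l_\theta(\theta_0; \hat\alpha) \approx -E(l_{\theta\theta})^{-1}\{l_\theta(\theta_0; \alpha_0) + l_{\theta\alpha}(\hat\alpha - \alpha_0)\} \approx -E(l_{\theta\theta})^{-1}\{l_\theta - E(l_{\theta\alpha})\, E(S_{\alpha\alpha})^{-1} S_\alpha\},$$
where $l_\theta$ and $S_\alpha$ in the last expression are evaluated at $(\theta_0, \alpha_0)$.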

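Asymptotic Variance of $\tilde\theta$ (II)
From $\tilde\theta - \theta_0 \approx -E(l_{\theta\theta})^{-1}\{l_\theta - E(l_{\theta\alpha})\, E(S_{\alpha\alpha})^{-1} S_\alpha\}$ and $E(S_\alpha S_\alpha^T) = -E(S_{\alpha\alpha})$,
$$\mathrm{Var}(\sqrt{n}(\tilde\theta - \theta_0)) \approx E(l_{\theta\theta})^{-1}\{E(l_\theta l_\theta^T) - E(l_\theta S_\alpha^T)\, E(S_{\alpha\alpha})^{-1} E(l_{\alpha\theta}) - E(l_{\theta\alpha})\, E(S_{\alpha\alpha})^{-1} E(S_\alpha l_\theta^T) - E(l_{\theta\alpha})\, E(S_{\alpha\alpha})^{-1} E(l_{\alpha\theta})\}\, E(l_{\theta\theta})^{-1}.$$
On the other hand,
$$0 = E(l_\theta) = \int l_\theta(\alpha_0, \theta_0)\, f(x; \alpha_0)\, g(y \mid x; \theta_0)\, p(r \mid x, y; \psi_0)\, dx\, dy\, dr = \int l_\theta(\alpha_0, \theta_0)\, L(\alpha_0)\, L(\theta_0, \psi_0)\, d\mu.$$
So,
$$0 = \frac{\partial}{\partial\alpha} E(l_\theta) = E(l_{\theta\alpha}) + \int l_\theta S_\alpha^T\, L(\alpha)\, L(\theta, \psi)\, d\mu = E(l_{\theta\alpha}) + E(l_\theta S_\alpha^T).$$
Therefore, substituting $E(l_\theta S_\alpha^T) = -E(l_{\theta\alpha})$ into the two cross terms,
$$\mathrm{Var}(\sqrt{n}(\tilde\theta - \theta_0)) \approx E(l_{\theta\theta})^{-1}\{E(l_\theta l_\theta^T) + E(l_{\theta\alpha})\, E(S_{\alpha\alpha})^{-1} E(l_{\alpha\theta})\}\, E(l_{\theta\theta})^{-1}.$$
Because $E(S_{\alpha\alpha})$ is negative definite, the added term is negative semidefinite, so this variance is no larger than that of $\hat\theta$.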

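Another Pseudolikelihood Estimator
When the functional form of $F(x)$ is unknown, let
$$\tilde\theta_{PL2} = \arg\max_\theta L(\theta; F_n) = \arg\max_\theta \prod_{i=1}^{m} \frac{g(y_i \mid x_i, \theta)}{\int g(y_i \mid x, \theta)\, dF_n(x)} = \arg\max_\theta \prod_{i=1}^{m} \frac{g(y_i \mid x_i, \theta)}{n^{-1}\sum_{j=1}^{n} g(y_i \mid x_j, \theta)}, \qquad (PL2)$$
where $F_n(x) = n^{-1}\sum_{j=1}^{n} I(x_j \le x)$ is the empirical distribution of $X$.
Simulation studies suggest that $\tilde\theta_{PL2}$ is even more efficient, even though it makes no assumption on $F(x)$.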
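The PL2 criterion can be sketched numerically as follows. This is a minimal illustration, assuming a normal linear regression $[Y \mid X = x] \sim N(\beta_0 + \beta_1 x, \sigma^2)$ for $g$; the function names `pl2_negloglik` and `fit_pl2`, the Nelder-Mead optimizer, and the complete-case least-squares starting value are all hypothetical choices rather than the method used in the talk.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import logsumexp
from scipy.stats import norm


def pl2_negloglik(theta, x_all, x_cc, y_cc):
    """Negative PL2 log-pseudolikelihood for theta = (b0, b1, log_sigma).

    x_all: all n observed x values; (x_cc, y_cc): the m complete cases.
    Assumes (for illustration) g(y | x, theta) = N(b0 + b1 * x, sigma^2).
    """
    b0, b1, log_sigma = theta
    sigma = np.exp(log_sigma)
    log_num = norm.logpdf(y_cc, loc=b0 + b1 * x_cc, scale=sigma)
    # log g(y_i | x_j, theta) for every complete case i and every observed x_j.
    log_g = norm.logpdf(y_cc[:, None], loc=b0 + b1 * x_all[None, :], scale=sigma)
    # log of n^{-1} sum_j g(y_i | x_j, theta), computed stably.
    log_den = logsumexp(log_g, axis=1) - np.log(x_all.size)
    return -np.sum(log_num - log_den)


def fit_pl2(x_all, x_cc, y_cc):
    """Maximize the PL2 criterion; returns (b0, b1, sigma^2)."""
    # Start from complete-case least squares (a convenient starting value only).
    b1, b0 = np.polyfit(x_cc, y_cc, 1)
    resid = y_cc - (b0 + b1 * x_cc)
    start = np.array([b0, b1, np.log(resid.std())])
    res = minimize(pl2_negloglik, start, args=(x_all, x_cc, y_cc),
                   method="Nelder-Mead")
    b0_hat, b1_hat, log_sigma_hat = res.x
    return b0_hat, b1_hat, np.exp(2.0 * log_sigma_hat)
```

The denominator uses every observed $x_j$, including those from incomplete cases, which is where the information in the cases with missing $y$ enters the criterion.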

Auxiliary Information
If $F(x) = F(x; \alpha_0)$ is known, for example from some survey studies, can we get a more efficient estimator?
Answer: yes, with empirical likelihood.

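Empirical Likelihood
"Empirical likelihood is a nonparametric method of inference based on a data-driven likelihood ratio function" (Art Owen).
Suppose $\{x_i\}$ is a random sample of $X \sim F(x)$. The nonparametric estimator for $F(x)$ is $F_n(x) = n^{-1}\sum_{i} I\{x_i \le x\}$; it is the nonparametric maximum likelihood estimator, in that $\prod_i p_i$ is maximized subject to $\sum_i p_i = 1$, where $p_i = \mathrm{pr}\{X = x_i\}$.
With auxiliary information $E(w(X)) = 0$ available, the empirical likelihood estimator maximizes $\prod_i p_i$ subject to the constraints $\sum_i p_i = 1$ and $\sum_i p_i\, w(x_i) = 0$.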
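The constrained maximization has a well-known Lagrangian form: $p_i = 1 / \{n(1 + \lambda w(x_i))\}$, where $\lambda$ solves $\sum_i w(x_i) / (1 + \lambda w(x_i)) = 0$. The sketch below computes these weights for the scalar mean constraint $w(x) = x - \mu_0$. The function name `el_weights` and the use of `brentq` for the root search are illustrative choices.

```python
import numpy as np
from scipy.optimize import brentq


def el_weights(x, mu0):
    """Empirical-likelihood weights for the constraint sum_i p_i (x_i - mu0) = 0.

    Maximizes prod_i p_i subject to sum_i p_i = 1 and the mean constraint,
    using p_i = 1 / (n * (1 + lam * z_i)) with z_i = x_i - mu0.
    """
    z = np.asarray(x, dtype=float) - mu0
    if not (z.min() < 0.0 < z.max()):
        raise ValueError("mu0 must lie strictly inside the range of x")
    n = z.size

    def score(lam):
        # Decreasing in lam; its root gives the Lagrange multiplier.
        return np.sum(z / (1.0 + lam * z))

    # All weights stay positive only for lam in (-1/max(z), -1/min(z)).
    lo, hi = -1.0 / z.max(), -1.0 / z.min()
    eps = 1e-10 * (hi - lo)
    lam = brentq(score, lo + eps, hi - eps)
    return 1.0 / (n * (1.0 + lam * z))
```

When the sample mean already equals $\mu_0$, the root is $\lambda = 0$ and the weights reduce to $1/n$, which recovers the empirical distribution $F_n$.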

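Incorporate Auxiliary Information
For example, suppose $E(X) = \mu_0$ is known. The PL2 estimator solves the estimating equations
$$0 = l_\theta(\theta; F_n) = \sum_i l_{\theta,i}(\theta; F_n).$$
New estimator:
(1) Maximize $\prod_i p_i$ subject to the constraints $\sum_i p_i = 1$ and $\sum_i p_i (x_i - \mu_0) = 0$. Get $\{\hat p_i\}$.
(2) Let $\tilde\theta_{EL}$ solve $0 = \sum_i \hat p_i\, l_{\theta,i}(\theta; F_n)$. Then
$$\mathrm{Var}(\tilde\theta_{EL}) = E(l_{\theta\theta})^{-1}\{\mathrm{Var}(l_\theta(\theta; F)) - E(l_\theta x^T)\, \mathrm{Var}(x)^{-1} E(x\, l_\theta^T)\}\, E(l_{\theta\theta})^{-1}.$$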

A simulation study
Complete data: (1) $[x] \sim N( , )$, (2) $[y \mid x] \sim N( + x, )$.
Missing-data mechanism: $\mathrm{pr}[R = 1 \mid x, y] = \Phi(y - )$.
Compare the performance of the PL and EL estimates for the regression model (2), where $\theta = (\beta_0, \beta_1, \sigma^2) = ( , , )$.
Note: sample size = 3, average number of complete cases (c.c.) = 5.
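A data-generating step of this kind can be sketched as below. The numeric settings of the study are not legible on the slide, so every default value here (standard normal $x$, $\beta_0 = \beta_1 = \sigma^2 = 1$, and a shift of 1 inside $\Phi$) is a placeholder assumption, and the function name `simulate` is hypothetical.

```python
import numpy as np
from scipy.stats import norm


def simulate(n, beta0=1.0, beta1=1.0, sigma2=1.0, shift=1.0, seed=None):
    """Draw (x, y, r) with probit nonresponse that depends on the outcome y.

    All default parameter values are placeholders, not the values from the talk.
    Returns x (fully observed), y with NaN where missing, and r (True = observed).
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, 1.0, size=n)                       # assumed N(0, 1)
    y = beta0 + beta1 * x + rng.normal(0.0, np.sqrt(sigma2), size=n)
    r = rng.random(n) < norm.cdf(y - shift)                # pr[R = 1 | x, y] = Phi(y - shift)
    y_obs = np.where(r, y, np.nan)                         # mask the missing outcomes
    return x, y_obs, r
```

For example, `x, y, r = simulate(300, seed=0)` yields one replicate; `r.sum()` is the number of complete cases in that replicate.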

Simulation results
Table: empirical biases and standard deviations (in parentheses) of two estimators over replicates

Method    β_0          β_1          σ^2
PL        .2 (.33)     .9 (.5)      .5 (.97)
EL        .2 (.29)     .7 (.)       .9 (.86)

Reference
Tang, Little and Raghunathan. Analysis of multivariate missing data with nonignorable nonresponse. Biometrika, vol. 90, pp. 747-764, 2003.
Tang, Little and Raghunathan. Analysis of multivariate monotone missing data by a pseudolikelihood method. Proceedings of the 2nd Seattle Symposium in Biostatistics: Analysis of Correlated Data. Lecture Notes in Statistics, Vol. 179. Editors, D. Lin and P.J. Heagerty. 2005.
An unpublished manuscript.
