Estimation of accelerated failure time models with random effects


Retrospective Theses and Dissertations, Iowa State University Capstones, Theses and Dissertations, 2006

Estimation of accelerated failure time models with random effects
Yaqin Wang, Iowa State University

Part of the Biostatistics Commons

Recommended Citation: Wang, Yaqin, "Estimation of accelerated failure time models with random effects" (2006). Retrospective Theses and Dissertations.

This Dissertation is brought to you for free and open access by the Iowa State University Capstones, Theses and Dissertations at Iowa State University Digital Repository. It has been accepted for inclusion in Retrospective Theses and Dissertations by an authorized administrator of Iowa State University Digital Repository. For more information, please contact digirep@iastate.edu.

Estimation of accelerated failure time models with random effects

by

Yaqin Wang

A dissertation submitted to the graduate faculty in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY

Major: Statistics

Program of Study Committee:
Kenneth J. Koehler, Major Professor
Song Xi Chen
Richard Evans
Heike Hofmann
Terry Therneau

Iowa State University
Ames, Iowa
2006

Copyright Yaqin Wang, 2006. All rights reserved.


TABLE OF CONTENTS

ABSTRACT

GENERAL INTRODUCTION
    Introduction
    Cox Proportional Hazards Model with Random Effects
    Accelerated Failure Time Models
    AFT Models
    Inference for AFT Models
    AFT Models with Shared Frailty
    AFT Models with Random Effects
    Dissertation Organization
    References for General Introduction

ESTIMATION OF ACCELERATED FAILURE TIME MODELS WITH RANDOM EFFECTS
    Abstract
    Introduction
    Accelerated Failure Time Models with Random Effects
    AFT Models with Shared Frailty
    AFT Models with Random Effects
    Estimation
    Approximate Likelihood
    Asymptotic Properties of Laplace-Based Estimation
    Consistency of the Laplace-Based Estimator
    Asymptotic Normality
    Estimation
    Simulation Studies
    Description of Simulation I
    Results of Simulation I
    Description of Simulation II
    Results of Simulation II
    Approximate Grouped Jackknife Estimator
    Discussion
    References
    Appendix: The Accuracy of the Laplace Approximation
    Appendix: Programs for AFT Models with Random Effects
    Algorithm Description
    Algorithm Testing

AFT MODELS WITH RANDOM EFFECTS FOR CORRELATED SURVIVAL DATA AND AN APPLICATION TO BREAST CANCER FAMILY DATA
    Abstract
    Introduction
    Minnesota Breast Cancer Family Studies
    Minnesota Breast Cancer Family Resource
    Kinship
    Mixed Effects Cox Models
    Modeling the Breast Cancer Data Using Mixed Effects Cox Models
    AFT Models with Random Effects
    Modeling the Breast Cancer Data Using AFT Models with Random Effects
    Discussion
    References

GENERAL CONCLUSIONS

ABSTRACT

Correlated survival data with possible censoring are frequently encountered in survival analysis. This includes multi-center studies where subjects are clustered by clinical or other environmental factors that influence expected survival time, studies where times to several different events are monitored on each subject, and studies using groups of genetically related subjects. To analyze such data, we propose accelerated failure time (AFT) models based on lognormal frailties. AFT models provide a linear relationship between the log of the failure time and covariates that affect the expected time to failure by contracting or expanding the time scale. These models account for within-cluster association by incorporating random effects with dependence structures that may be functions of unknown covariance parameters. They can be applied to right, left or interval-censored survival data.

To estimate model parameters, we consider an approximate maximum likelihood estimation procedure derived from the Laplace approximation. This avoids the computationally intensive methods needed to evaluate the exact log-likelihood, such as MCMC methods or numerical integration, which are not feasible for large data sets. Asymptotic properties of the proposed estimators are established and small sample performance is evaluated through several simulation studies. The fixed effects parameters are estimated well, with little absolute bias. Asymptotic formulas tend to underestimate the standard errors for small cluster sizes. Reliable estimates depend on both the number of clusters and the cluster size. The methodology is used to analyze data taken from the Minnesota Breast Cancer Family Resource to examine age-at-onset of breast cancer for women in 426 families.

GENERAL INTRODUCTION

1 Introduction

There are two important classes of regression models for survival data: Cox proportional hazards (PH) models (Cox, 1972) and accelerated failure time (AFT) models (Collett, 2003). Cox proportional hazards models relate the hazard function to covariates, while AFT models specify a direct relationship between the failure time and covariates. Cox models have been extensively applied in medical research. AFT models are especially useful in industrial applications in which failure is accelerated by thermal, high-voltage or other factors.

The theme of this dissertation is the application of accelerated failure time models to correlated survival data. Traditional applications and development of the proportional hazards and AFT models have relied on the assumption of independent responses from the monitored units that are subject to failure. Correlated survival data with possible censoring, however, are frequently encountered in survival analysis, and models for correlated survival data are receiving increasing attention.

Correlated data may arise from multiple observations on the same individuals, for instance, recurrent infections in clinical trials. The lack of independence also appears when observations are clustered. For example, in a multi-center study of kidney transplant survival (Lambert et al., 2004), survival times of patients from the same transplant center were associated, since the transplants might have been carried out by the same surgical team. Correlated survival times may also arise when genetically or socially related subjects, such as family members or classmates, are followed until some specific event occurs. Traditional methods of estimation that treat observations as independent are inappropriate for such data.

Various methods have been developed for the analysis of correlated observations. One basic approach introduces random effects into models to induce correlations. In survival analysis such random effects models are commonly referred to as frailty models.
Another approach is to use estimation methods developed for independent observations, such as partial likelihood estimation, and then adjust the covariance matrix of the resulting estimators to reflect the

correlations. Robust or sandwich covariance estimators, or appropriate resampling methods, can be used to obtain consistent estimates of covariance matrices and standard errors. While this approach provides appropriate large sample inferences, the estimators tend to be inefficient because information provided by the correlations among the survival times is not fully incorporated into the estimating equations. This is a special case of generalized estimating equations. It has the advantage of not requiring a specific model for the joint distribution of the correlated responses, which may be difficult to assess for small or moderate samples. Estimating equations that incorporate information about the correlation structure of the observations can be developed without completely specifying a model for the joint distribution of the observations, and such equations can improve the efficiency of estimators.

By completely specifying joint distributions for correlated observations, maximum likelihood, maximum partial likelihood, or Bayesian estimation methods can be used. Although efficiency may be gained, one practical problem with this approach is that the derivation of the marginal likelihood, or marginal partial likelihood, for the observed data may be intractable. Numerical integration is usually not feasible, and marginal likelihoods, or marginal partial likelihoods, are either evaluated with simulation techniques or approximated. The former may be quite expensive computationally, and the latter is an approximation that may reduce the efficiency of estimation.

The concept of frailty was initially used to explain variability due to heterogeneity of members of a population in the context of mortality studies (Vaupel et al., 1979). Frailties are basically random effects in survival models. Hougaard (1986) examined a shared frailty model with Weibull hazards. Whitmore and Lee (1991) discussed an inverse Gaussian shared frailty model with constant individual hazards. A shared frailty describes some common effects on the members of a cluster. The shared frailty model has gained broad acceptance over the last few years for clustered survival data.
When there are dependences among observed survival times, traditional partial likelihood estimation for the Cox proportional hazards model, which assumes independent responses, may not provide reliable inferences. Although parameter estimates are generally consistent, ignoring the dependence of correlated survival data adversely affects the precision of the parameter estimates (Wei, Lin, and Weissfeld, 1989). More importantly, the estimated

variances of parameter estimates are biased. Therefore, the Cox proportional hazards model with random effects was proposed to account for such dependences. Many approaches have been developed to estimate parameters in the Cox proportional hazards model with random effects. Next, we will briefly review several estimation procedures for this model.

2 Cox Proportional Hazards Model with Random Effects

Let $T_{ij}^{*}$ denote the event time or survival time for the $j$th ($j = 1, \ldots, n_i$) subject from the $i$th cluster ($i = 1, \ldots, N$), and let $C_{ij}^{*}$ represent the censoring time. Then the observed time is $T_{ij} = \min(T_{ij}^{*}, C_{ij}^{*})$, and the indicator function $\delta_{ij} = I(\{T_{ij}^{*} \le C_{ij}^{*}\})$ is 1 if the response time is uncensored and 0 if the response time is censored. Given the random effects, survival times are assumed to be conditionally independent. The hazard function for the $j$th subject from the $i$th cluster of a shared frailty model is given by

$$\lambda_{ij}(t) = \lambda_0(t)\,\omega_i \exp(x_{ij}^{\top}\beta), \tag{1}$$

where $\lambda_0$ is the baseline hazard function, $\beta$ is a vector of fixed effects corresponding to the covariate vector $x_{ij}$, and the $\omega_i$ are independent, identically distributed random variables with some common density function.

Shared frailty models have some limitations. For example, they cannot accommodate the situation where the frailty is not the same for all the individuals in a cluster. In order to account for more complicated frailty structures, the shared frailty model needs to be extended. The hazard function for a more general mixed-effects proportional hazards model can be defined as

$$\lambda_{ij}(t) = \lambda_0(t) \exp(x_{ij}^{\top}\beta + z_{ij}^{\top}b_i), \tag{2}$$

where $b_i$ is a vector of random cluster effects associated with individual vectors of covariates $z_{ij}$. The random effects $b_i$ are assumed to be distributed according to some distribution with mean 0 and covariance matrix $D = D(\theta)$, where $\theta$ is a vector of unknown parameters.

Several approaches have been proposed to estimate the parameters of model (2). McGilchrist and Aisbett (1991) and McGilchrist (1993) used a penalized partial likelihood

approach to estimate the fixed effects and an approximate residual maximum likelihood (REML) approach to estimate the variance-covariance parameters, based on a normal approximation to the distribution of the residuals. They only considered the special case where the random effects are normally distributed with mean zero and diagonal variance-covariance matrix D.

In an animal-breeding context, Ducrocq and Casella (1996) introduced a Bayesian approach to estimate the parameters of a special form of model (2) with Weibull baseline hazards and one set of random sire effects with either log-gamma or Gaussian distributions. For those models, the sire effects can be integrated out of the posterior distribution algebraically. The marginal posterior distribution for the dispersion parameter cannot be obtained algebraically, and a Laplace approximation was considered. Simulation results showed that the estimation procedure performed well when there were few sires and many daughters per sire, but did not always perform well when there were many sires with only a few daughters per sire.

Ripatti and Palmgren (2000) proposed an approximate marginal likelihood approach for a multivariate lognormal frailty model based on a penalized partial likelihood. Their approach allows for more complex frailty dependence structures. The random effects are assumed to be log-normally distributed with positive definite variance-covariance matrix D(θ). The Laplace approximation was applied to obtain an approximate marginal likelihood, as the integral cannot be evaluated analytically. This leads to estimating equations based on a penalized partial likelihood. The estimating procedure is simple, but it tends to underestimate the variance of the estimated fixed effects parameters.

EM-algorithm based estimation approaches have been applied by several authors. Ripatti, Larsen and Palmgren (2002) developed an estimation procedure based on a Monte Carlo EM algorithm with the aim of obtaining maximum marginal likelihood estimates rather than estimates based on an approximation to the marginal likelihood (Ripatti and Palmgren, 2000).
The frailties are treated as missing data and imputed in the E-step. The expectation in the E-step cannot be solved analytically, and it is approximated by sampling from the conditional distribution of the frailties given the observed data. The M-step maximizes the complete data log-likelihood using the imputed frailties as if they were observed. This procedure alternates

between the E-step and the M-step. It is computationally intensive. The more complicated the frailty structure, the more computationally involved the evaluation of the E-step becomes. Cortinas and Burzykowski (2004) proposed a modified EM algorithm, using a Laplace approximation at the E-step to numerically simplify the estimation procedure.

Also, Cortinas (2004) used simulations to compare the performance of the estimation procedures proposed by McGilchrist and Aisbett (1991), Ducrocq and Casella (1996), Ripatti and Palmgren (2000), and Cortinas and Burzykowski (2004). This study assumed that model (3) was correctly specified with a given baseline hazard $\lambda_0$. Parameters of the model were chosen to mimic real bladder cancer clinical trial data (Royston, Parmar, and Sylvester, 2004) with 33 patients distributed over 37 centers. The data were generated according to the proportional hazards model

$$\lambda_{ij}(t \mid \beta, b_i) = \lambda_0(t) \exp\{b_{0i} + x_{ij}(\beta + b_{1i})\}, \tag{3}$$

with $b_i = (b_{0i}, b_{1i})^{\top} \sim N\!\left(0, \operatorname{diag}(\theta_0, \theta_1)\right)$. There were 37 random effects for center-specific baseline hazards and 37 random coefficients for the center-specific covariate.

All four methods produced comparable regression parameter point estimates. The McGilchrist and Aisbett approach has problems with the estimation of the standard errors of the variance components. Its variance component estimates have large bias in the heavy censoring setting, especially when the variances of the random effects are large. Ducrocq and Casella's approach provides good estimates of standard errors for regression parameters, whereas the standard errors tend to be slightly underestimated by the EM algorithm of Cortinas and by the Ripatti and Palmgren approach. The method proposed by Ducrocq and Casella yields conservative estimates of the standard errors of the variance components. The EM algorithm of Cortinas and the Ripatti and Palmgren method tend to underestimate the standard errors of the variance components. This study also found that Ducrocq and Casella's approach does not suffer from the convergence problems that occurred with the other two methods.
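As a rough illustration of how clustered data could be generated from a model of the form (3), the sketch below draws center-level random effects and then event times by inverse-transform sampling. It assumes a constant baseline hazard (`lam0`), a binary covariate, and a single administrative censoring time; none of these choices, nor the parameter values or the name `simulate_center`, is taken from the simulation study described above.

```python
import math
import random

def simulate_center(n, beta, theta0, theta1, lam0, cens_time, rng):
    """Simulate one center under model (3), assuming a constant baseline hazard lam0."""
    b0 = rng.gauss(0.0, math.sqrt(theta0))  # random effect on the center's baseline hazard
    b1 = rng.gauss(0.0, math.sqrt(theta1))  # random center-specific covariate coefficient
    rows = []
    for _ in range(n):
        x = rng.choice([0, 1])                         # hypothetical binary covariate
        rate = lam0 * math.exp(b0 + x * (beta + b1))   # conditional hazard, constant in t
        t_star = rng.expovariate(rate)                 # exponential event time
        t_obs = min(t_star, cens_time)                 # observed time after censoring
        delta = 1 if t_star <= cens_time else 0        # 1 = event observed, 0 = censored
        rows.append((t_obs, delta, x))
    return rows

rng = random.Random(2006)
# 37 centers of 20 subjects each; all numeric values are illustrative only.
data = [
    simulate_center(20, beta=-0.5, theta0=0.1, theta1=0.05,
                    lam0=0.2, cens_time=10.0, rng=rng)
    for _ in range(37)
]
```

Because all subjects in a center share the same draw of the two random effects, their event times are dependent marginally even though they are independent given the random effects.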

3 Accelerated Failure Time Models

Although the Cox proportional hazards model has been extensively used in medical research, the assumption of proportional hazard functions is rather strong and may often be violated. The omission of important covariates can lead to deviations from proportional hazards and bias in the estimation of regression parameters in Cox models (Solomon, 1984). Accelerated failure time models are an important alternative to the Cox proportional hazards model, even though they have rarely been considered in the medical literature.

Chapman et al. (1992) applied four parametric survival models (exponential, Weibull, log-logistic, and lognormal) to the effects of prognostic factors on breast cancer survival and concluded that the lognormal model provided the best fit to the data. Royston (2001) demonstrated the practical value of the lognormal AFT model in the analysis of survival times of breast and ovarian cancer patients. More recently, an AFT model has been applied to the analysis of the time to AIDS onset in the Women's Interagency HIV Study (Komarek et al., 2004). Lambert et al. (2004) applied AFT models with shared frailty to determine prognostic factors for the survival time of a kidney graft in patients from 3 transplant centers in the UK.

An advantage of AFT models, and other parametric approaches, is that the shape of the hazard function can be characterized. AFT models specify a direct linear relationship between the log of the failure time and covariates, which may be appropriate when a covariate acts to speed up or slow down the expected time to failure by contracting or expanding the time scale. The regression parameters can be interpreted more intuitively, in terms of the expected change in median survival time. For example, a natural way of expressing a treatment effect in an AFT model is as a percentage improvement in median survival time. Also, the log-linear formulation of AFT models yields the independence of regression parameter estimates and random frailty effects (Keiding et al., 1997).
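The median interpretation can be made explicit with a short worked equation. This is a sketch under the log-linear form $\log T = x^{\top}\beta + \sigma\varepsilon$ with $\sigma > 0$; the symbol $m_{\varepsilon}$ for the median of $\varepsilon$ and the numerical value used afterwards are introduced here for illustration only.

```latex
\operatorname{med}(T \mid x) = \exp\!\left(x^{\top}\beta + \sigma\, m_{\varepsilon}\right),
\qquad
\frac{\operatorname{med}(T \mid x_k + 1)}{\operatorname{med}(T \mid x_k)} = \exp(\beta_k).
```

The first identity holds because the exponential function is increasing, so the median of $T$ is the exponential of the median of $\log T$. A one-unit increase in the $k$th covariate, with the others held fixed, therefore multiplies the median survival time by $\exp(\beta_k)$ regardless of the error distribution. With a hypothetical coefficient $\beta_k = \log 1.2 \approx 0.182$, the median survival time would increase by 20%.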
Misspecification of a parametric family for the frailty distribution may not be a serious issue. Empirical results of Lambert et al. (2004) demonstrated the robustness of regression parameter estimates with respect to misspecification of the frailty distribution for Weibull, gamma, lognormal, and log-logistic models. Compared to Cox proportional hazards models, AFT models for

correlated survival data have received much less attention. In this dissertation, we will incorporate random effects into the AFT model to allow for correlations and propose an estimation procedure for AFT models with random effects.

3.1 AFT Models

Accelerated failure time models are useful in many fields of application. Given the values of the covariates $x$, the density function of the failure time has the form

$$f(t) = (\sigma t)^{-1} f_0\!\left(\frac{\log t - \log \psi(x)}{\sigma}\right), \tag{4}$$

where $\sigma$ is the scale parameter and $\psi(x)$ is some function of the covariates. One of the most common choices for $\psi(x)$ is

$$\psi(x) = \exp(x^{\top}\beta). \tag{5}$$

The corresponding AFT model can be expressed in regression form as

$$\log T = x^{\top}\beta + \sigma\varepsilon, \tag{6}$$

where $\varepsilon$ is a random variable with density function $f_0(\cdot)$ and corresponding baseline survivor function $S_0(\cdot)$. Accelerated failure time models allow a wide range of parametric forms for the density function. The standard normal distribution is a common choice for the random variable $\varepsilon$. The extreme value and logistic distributions are also frequently used. These three distributions have the property that the logarithmic transformation of the lifetime, $\log T$, has a location-scale distribution on $(-\infty, \infty)$. AFT models assume a survivor function of the form

$$\Pr(T \ge t) = S(t) = S_0^{*}\!\left[\left(\frac{t}{\psi(x)}\right)^{1/\sigma}\right], \tag{7}$$

where $S_0^{*}$ is a baseline survivor function. The Weibull, lognormal, and log-logistic distributions for the lifetime correspond to extreme value, normal, and logistic distributions for the log of the lifetime, and the survivor function is given by

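$$S(t) = S_0\!\left(\frac{\log t - \log \psi(x)}{\sigma}\right). \tag{8}$$

If $\psi(x) = \exp(x^{\top}\beta)$, the survivor function can be rewritten as

$$S(t) = S_0\!\left(\frac{\log t - x^{\top}\beta}{\sigma}\right). \tag{9}$$

The $S_0(\cdot)$ functions for some common distributions are:

$$\text{Normal: } S_0(\varepsilon) = 1 - \Phi(\varepsilon); \qquad \text{Extreme value: } S_0(\varepsilon) = \exp(-e^{\varepsilon}); \qquad \text{Logistic: } S_0(\varepsilon) = (1 + e^{\varepsilon})^{-1}. \tag{10}$$

3.2 Inference for AFT Models

For random lifetimes $T_i$ of subjects $i = 1, \ldots, n$, with possible right-censoring, the likelihood function under model (9) is given by Lawless (2003) as

$$L(\beta, \sigma) = \prod_{i=1}^{n} \left[\frac{1}{\sigma} f_0\!\left(\frac{\log t_i - x_i^{\top}\beta}{\sigma}\right)\right]^{\delta_i} \left[S_0\!\left(\frac{\log t_i - x_i^{\top}\beta}{\sigma}\right)\right]^{1-\delta_i}. \tag{11}$$

Using $\varepsilon_i = (\log t_i - x_i^{\top}\beta)/\sigma$, the log-likelihood function assumes the form

$$\ell(\beta, \sigma) = -r \log \sigma + \sum_{i=1}^{n} \left[\delta_i \log f_0(\varepsilon_i) + (1 - \delta_i) \log S_0(\varepsilon_i)\right], \tag{12}$$

where $r = \sum_{i=1}^{n} \delta_i$ is the number of uncensored event times. Let $x_i = (x_{i1}, \ldots, x_{ij}, \ldots, x_{ip})$ denote the set of covariates under which the $i$th subject responds. The first partial derivatives of $\ell(\beta, \sigma)$ are

$$\frac{\partial \ell}{\partial \beta_j} = -\frac{1}{\sigma} \sum_{i=1}^{n} \left[\delta_i \frac{\partial \log f_0(\varepsilon_i)}{\partial \varepsilon_i} + (1 - \delta_i) \frac{\partial \log S_0(\varepsilon_i)}{\partial \varepsilon_i}\right] x_{ij}, \tag{13}$$

$$\frac{\partial \ell}{\partial \sigma} = -\frac{r}{\sigma} - \frac{1}{\sigma} \sum_{i=1}^{n} \left[\delta_i \varepsilon_i \frac{\partial \log f_0(\varepsilon_i)}{\partial \varepsilon_i} + (1 - \delta_i) \varepsilon_i \frac{\partial \log S_0(\varepsilon_i)}{\partial \varepsilon_i}\right]. \tag{14}$$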
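The log-likelihood (12) and the score (13) can be sketched in a few lines for one particular error distribution. The example below assumes standard normal errors (a lognormal AFT model) and right censoring only; the function names and the toy data are hypothetical and are not part of the programs described in this dissertation.

```python
import math

def log_f0(eps):
    """Log density of the standard normal error distribution."""
    return -0.5 * eps * eps - 0.5 * math.log(2.0 * math.pi)

def log_S0(eps):
    """Log survivor function 1 - Phi(eps) of the standard normal distribution."""
    return math.log(0.5 * math.erfc(eps / math.sqrt(2.0)))

def residual(t, x, beta, sigma):
    """Standardized residual eps_i = (log t_i - x_i' beta) / sigma."""
    return (math.log(t) - sum(b * xj for b, xj in zip(beta, x))) / sigma

def aft_loglik(beta, sigma, times, X, delta):
    """Log-likelihood (12): -r log(sigma) + sum of log f0 or log S0 terms."""
    total = -sum(delta) * math.log(sigma)
    for t, x, d in zip(times, X, delta):
        eps = residual(t, x, beta, sigma)
        total += d * log_f0(eps) + (1 - d) * log_S0(eps)
    return total

def aft_score_beta(beta, sigma, times, X, delta):
    """Score vector (13) with respect to beta, for normal errors."""
    score = [0.0] * len(beta)
    for t, x, d in zip(times, X, delta):
        eps = residual(t, x, beta, sigma)
        dlog_f = -eps                                   # d log f0 / d eps
        dlog_S = -math.exp(log_f0(eps) - log_S0(eps))   # d log S0 / d eps = -f0 / S0
        w = d * dlog_f + (1 - d) * dlog_S
        for j, xj in enumerate(x):
            score[j] += -w * xj / sigma
    return score

# Toy data: intercept plus one binary covariate; delta = 0 marks a censored time.
times = [2.0, 5.0, 1.5, 8.0]
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
delta = [1, 0, 1, 1]
loglik = aft_loglik([1.0, 0.5], 0.8, times, X, delta)
score = aft_score_beta([1.0, 0.5], 0.8, times, X, delta)
```

Other error distributions listed in (10) could be handled by swapping `log_f0`, `log_S0`, and the two derivative lines.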

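The maximum likelihood estimators $\hat\beta$ and $\hat\sigma$ are found by solving the equations $\partial \ell / \partial \beta = 0$ and $\partial \ell / \partial \sigma = 0$. The observed information matrix is

$$I(\beta, \sigma) = \begin{pmatrix} -\dfrac{\partial^2 \ell}{\partial \beta\, \partial \beta^{\top}} & -\dfrac{\partial^2 \ell}{\partial \beta\, \partial \sigma} \\[2ex] -\dfrac{\partial^2 \ell}{\partial \sigma\, \partial \beta^{\top}} & -\dfrac{\partial^2 \ell}{\partial \sigma^2} \end{pmatrix}. \tag{15}$$

Assuming the needed smoothness conditions on $S_0$, we can use the approximate normality of the maximum likelihood estimators, or a chi-squared approximation to likelihood ratio tests, to test hypotheses about regression coefficients. This application is illustrated by Lawless (2003). For testing $H_0: \beta_1 = \beta_1^{0}$, a Wald test statistic is constructed as

$$\Lambda_1 = (\hat\beta_1 - \beta_1^{0})^{\top} V_{11}^{-1} (\hat\beta_1 - \beta_1^{0}). \tag{16}$$

Here $\beta = (\beta_1^{\top}, \beta_2^{\top})^{\top}$ and $V = I(\hat\beta, \hat\sigma)^{-1}$ is partitioned as

$$V = \begin{pmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{pmatrix},$$

where $V_{11}$ is the block corresponding to $\beta_1$. An alternative method for testing $\beta_1 = \beta_1^{0}$ is to use the likelihood ratio statistic

$$\Lambda_2 = 2\ell(\hat\beta_1, \hat\beta_2, \hat\sigma) - 2\ell(\beta_1^{0}, \tilde\beta_2, \tilde\sigma), \tag{17}$$

where $\tilde\beta_2$ and $\tilde\sigma$ maximize the log-likelihood under the null hypothesis. When the null hypothesis is true, both tests have asymptotic central chi-squared distributions with degrees of freedom equal to the rank of $V_{11}$. Unless otherwise stated, we will assume that the model is parameterized so that $V_{11}$ has full rank.

3.3 AFT Models with Shared Frailty

For clustered failure time data with $N$ clusters, let $T_{ij}^{*}$ represent the survival time for the $j$th ($j = 1, \ldots, n_i$) individual from the $i$th ($i = 1, \ldots, N$) cluster and let $C_{ij}^{*}$ represent the censoring time. Then the observed time is $T_{ij} = \min(T_{ij}^{*}, C_{ij}^{*})$. Censoring is indicated by the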

indicator function, $\delta_{ij} = I(\{T_{ij}^{*} \le C_{ij}^{*}\})$, which is 1 if the individual is uncensored and 0 if the individual is censored. In a classical AFT model, the survivor function at time $t$ is assumed to be of the form

$$S_{ij}(t) = S_0^{*}\!\left[\left(\frac{t}{\psi(x_{ij})}\right)^{1/\sigma}\right], \tag{18}$$

where $\sigma$ is an unknown scale parameter, $S_0^{*}$ is the baseline survivor function, and $\psi(x_{ij})$ is some function of the covariates $x_{ij}$. Here, it is assumed that

$$\psi(x_{ij}) = \exp(x_{ij}^{\top}\beta). \tag{19}$$

The AFT regression model can equivalently be expressed as a log-linear model for the random variable $T_{ij}$, the lifetime of the $j$th individual in the $i$th cluster. Similar to equation (6), the AFT model can be written as

$$\log T_{ij} = x_{ij}^{\top}\beta + \sigma\varepsilon_{ij}, \tag{20}$$

where the $\varepsilon_{ij}$ are random variables. For clustered data, subjects are correlated within a cluster. Shared frailty models account for the lack of independence by introducing a random component into equation (20), which becomes

$$\log T_{ij} = \omega_i + x_{ij}^{\top}\beta + \sigma\varepsilon_{ij}. \tag{21}$$

Here, $\alpha_i = \exp(\omega_i)$ is a random frailty that varies across clusters according to some distribution. Usually, the frailty distribution is assumed to be gamma, inverse Gaussian, lognormal, or positive stable. AFT models with shared frailty are applied in situations where the unexplained survival time heterogeneity is common to all individuals within a cluster. This model can be fitted using standard software packages such as R, S-PLUS or SAS.

3.4 AFT Models with Random Effects

Shared frailty AFT models have some limitations. First, these models require the frailty to be the same for all the subjects within a cluster. Another restriction is that a shared frailty can only induce positive association within the cluster, which might not always reflect

reality. Limited resources shared by individuals in a cluster could result in some competition, and hence negative correlations among some response times. Therefore, AFT models with shared frailty need to be extended to incorporate more complicated covariance structures. AFT models that include random effects in the regression expression, as in a classical linear mixed model, have been considered. The basic model is

$$\log T_{ij} = x_{ij}^{\top}\beta + z_{ij}^{\top}b_i + \sigma\varepsilon_{ij}, \tag{22}$$

where $\beta$ is the vector of unknown regression coefficients corresponding to the covariate vector for fixed effects $x_{ij}$, and $b_i = (b_{i1}, \ldots, b_{iq})^{\top}$ is the random effects vector associated with a second set of covariate values denoted by $z_{ij}$. It is assumed that the $b_i$ are distributed with mean 0 and covariance matrix $D = D(\theta)$, where $\theta$ is a vector of unknown parameters. The density function for $b_i$ is denoted by $f(b_i)$.

Pan and Louis (2000) proposed an estimation procedure that iterates between (a) estimating the marginal distribution of $(\log T_{ij} - x_{ij}^{\top}\beta)$ using Kaplan-Meier estimation and imputation of censored event times, and (b) estimating the regression coefficients using a Monte Carlo EM algorithm. However, only a univariate random effect with $z_{ij} = 1$ is considered in their approach.

To account for more complicated frailty structures, Komarek and Lesaffre (2004) developed a fully Bayesian approach to estimate the parameters of model (22). The advantage of this approach is that a general random effect vector is included in the model. This approach can also be applied not only to right or left censored survival data but also to interval censored survival data. In the Bayesian context, the distribution of the error terms $\varepsilon_{ij}$ is modeled as a mixture of an unknown number of normal distributions. A Markov chain Monte Carlo (MCMC) algorithm is used to estimate the number of normal components as well as the parameters of the normal distributions. The density $f(\varepsilon_{ij})$ of the error term $\varepsilon_{ij}$ in model (22) is specified as

$$f(\varepsilon_{ij}) = \sum_{k=1}^{K} \omega_k\, \varphi(\varepsilon_{ij} \mid \mu_k, \sigma_k^2), \tag{23}$$

where $\varphi(\cdot \mid \mu_k, \sigma_k^2)$ is the density of $N(\mu_k, \sigma_k^2)$. The number of mixture components $K$, the mixture weights $\omega = (\omega_1, \ldots, \omega_K)^{\top}$, the means $\mu = (\mu_1, \ldots, \mu_K)^{\top}$ and the variances $\sigma^2 = (\sigma_1^2, \ldots, \sigma_K^2)^{\top}$ are unknown. Let $r_{ij}$ be the label of the group from which the random error $\varepsilon_{ij}$ is drawn. That is, $\varepsilon_{ij}$ is drawn from $N(\mu_{r_{ij}}, \sigma_{r_{ij}}^2)$. The prior for the mixture weights $\omega$ is assumed to be a symmetric $K$-dimensional Dirichlet distribution, and the mean and variance of each component distribution are drawn independently from priors with normal and inverse-gamma distributions. The estimates of $K$, $\omega$, $\mu$ and $\sigma^2$ are updated by the reversible jump MCMC algorithm of Green (1995). The conditional distribution of the log-event times $y_{ij}$ is

$$y_{ij} \mid r_{ij}, \mu, \sigma^2, \beta, b_i, x_{ij}, z_{ij} \sim N\!\left(\mu_{r_{ij}} + x_{ij}^{\top}\beta + z_{ij}^{\top}b_i,\; \sigma_{r_{ij}}^2\right). \tag{24}$$

The prior distribution for each regression coefficient is assumed to be independently and normally distributed. The distribution for the random effect vector $b_i$ is assumed to be multivariate normal,

$$b_i \mid \gamma, D \sim N_q(\gamma, D), \tag{25}$$

and independently distributed for $i = 1, \ldots, N$, where $\gamma = (\gamma_1, \ldots, \gamma_q)^{\top}$. Each $\gamma_j$ has an independent normal prior $N(\nu_{\gamma,j}, \psi_{\gamma,j})$. The covariance matrix $D$ of the random effects is assumed to have an inverse-Wishart prior. The regression part of the model is updated using the Gibbs sampler. However, this method is computationally intensive and cannot be practically applied when the dimension of $D$ is large.

In the next chapter, we will propose a method of estimation for model (22) based on a penalized likelihood developed by applying the Laplace approximation to the marginal likelihood function. Through this method it is possible to include random effects with a general variance structure in the analysis of survival data. This method makes analyses of correlated survival data feasible and computationally efficient, even for large data sets.
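To give a feel for what a Laplace approximation to a marginal likelihood involves, the sketch below treats the simplest case: one cluster, a scalar random intercept $b \sim N(0, \theta)$, standard normal errors, and right censoring. It is an illustration under these stated assumptions, not the estimation procedure developed in the next chapter, which handles vector-valued random effects with covariance $D(\theta)$. All function names and data values are hypothetical.

```python
import math

def log_phi(e):
    """Log density of the standard normal distribution."""
    return -0.5 * e * e - 0.5 * math.log(2.0 * math.pi)

def log_S0(e):
    """Log survivor function of the standard normal distribution."""
    return math.log(0.5 * math.erfc(e / math.sqrt(2.0)))

def g(b, y, mu, delta, sigma, theta):
    """Log integrand: conditional log-likelihood of the cluster plus log N(0, theta) density of b."""
    total = -0.5 * b * b / theta - 0.5 * math.log(2.0 * math.pi * theta)
    for yj, mj, dj in zip(y, mu, delta):
        e = (yj - mj - b) / sigma
        total += dj * (log_phi(e) - math.log(sigma)) + (1 - dj) * log_S0(e)
    return total

def g_derivs(b, y, mu, delta, sigma, theta):
    """First and second derivatives of g with respect to b."""
    d1, d2 = -b / theta, -1.0 / theta
    for yj, mj, dj in zip(y, mu, delta):
        e = (yj - mj - b) / sigma
        if dj:
            d1 += e / sigma
            d2 -= 1.0 / sigma ** 2
        else:
            h = math.exp(log_phi(e) - log_S0(e))  # normal hazard f0 / S0
            d1 += h / sigma
            d2 -= h * (h - e) / sigma ** 2
    return d1, d2

def laplace_marginal(y, mu, delta, sigma, theta):
    """Laplace approximation exp(g(b_hat)) * sqrt(2 pi / -g''(b_hat))."""
    b = 0.0
    for _ in range(100):                      # Newton iterations for the mode b_hat
        d1, d2 = g_derivs(b, y, mu, delta, sigma, theta)
        step = d1 / d2
        b -= step
        if abs(step) < 1e-12:
            break
    _, d2 = g_derivs(b, y, mu, delta, sigma, theta)
    return math.exp(g(b, y, mu, delta, sigma, theta)) * math.sqrt(2.0 * math.pi / -d2), b, d2

def quadrature_marginal(y, mu, delta, sigma, theta, center, curv, n=4000):
    """Trapezoid-rule reference value on center +/- 12 approximate posterior SDs."""
    s = 1.0 / math.sqrt(-curv)
    lo, hi = center - 12.0 * s, center + 12.0 * s
    width = (hi - lo) / n
    total = 0.5 * (math.exp(g(lo, y, mu, delta, sigma, theta)) +
                   math.exp(g(hi, y, mu, delta, sigma, theta)))
    for k in range(1, n):
        total += math.exp(g(lo + k * width, y, mu, delta, sigma, theta))
    return total * width

# One cluster of four subjects: y = log times, mu = x'beta, delta = 0 means censored.
y = [1.2, 0.7, 1.9, 1.5]
mu = [1.0, 1.0, 1.5, 1.5]
delta = [1, 0, 1, 0]
approx, b_hat, curv = laplace_marginal(y, mu, delta, sigma=0.8, theta=0.5)
reference = quadrature_marginal(y, mu, delta, 0.8, 0.5, b_hat, curv)
```

When no observation is censored, the log integrand is exactly quadratic in $b$ and the Laplace approximation reproduces the integral; with censoring it is only approximate, and comparing `approx` with `reference` gives a direct check of the error in this small example.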

4 Dissertation Organization

This dissertation is organized into four major parts in the paper format. The first part is the general introduction, including literature reviews of past work on Cox proportional hazards models for correlated survival data, the motivation for this research, and an introduction to AFT models. The next two parts are two papers in the form to be submitted to journals. The final part summarizes the results of the previous chapters and discusses additional issues. The first paper proposes an estimation approach for the AFT model with random effects. Simulation studies are used to evaluate the performance of the estimation approach for AFT models with shared frailty and AFT models with nested frailties. In the second paper, we apply the method to a dataset from the Minnesota Breast Cancer Family Resource using the AFT model with random effects.

5 References for General Introduction

Chapman, J. W., Trudeau, M. E., Pritchard, K. I., Sawka, C. A., Mobbs, B. G., Hanna, W. M., Kahn, H., McCready, D. R., Lickley, L. A., A comparison of all-subset Cox and accelerated failure time models with Cox step-wise regression for node-positive breast cancer, Breast Cancer Research and Treatment, 22(3): 263-272, 1992.
Collett, D., Modelling Survival Data in Medical Research, 2nd ed., Chapman & Hall/CRC, CRC Press LLC, 2003.
Cortinas Abrahantes, J., Estimation procedures for mixed-effects models with applications to normally distributed and survival data, Ph.D. Thesis, 2004.
Cortinas Abrahantes, J. and Burzykowski, T., A version of the EM algorithm for proportional hazards model with random effects, Technical Report 455, IAP statistics network, 2004.
Cox, D. R., Regression models and life-tables (with discussion), Journal of the Royal Statistical Society, Series B, 34: 187-220, 1972.
Ducrocq, V. and Casella, G., A Bayesian analysis of mixed survival models, Genet. Sel. Evol., 28: 505-529, 1996.

Green, P. J., Reversible jump Markov chain computation and Bayesian model determination, Biometrika, 82: 711-732, 1995.
Hougaard, P., A class of multivariate failure time distributions, Biometrika, 73: 671-678, 1986.
Keiding, N., Andersen, P. K. and Klein, J. P., The role of frailty models and accelerated failure time models in describing heterogeneity due to omitted covariates, Statistics in Medicine, 16: 215-224, 1997.
Komarek, A., Lesaffre, E., and Hilton, J. F., Bayesian accelerated failure time model for correlated censored data with a normal mixture as an error distribution, Technical Report 45, IAP statistics network, 2004.
Lambert, P., Collett, D., Kimber, A., and Johnson, R., Parametric accelerated failure time models with random effects and an application to kidney transplant survival, Statistics in Medicine, 23, 2004.
Lawless, J. F., Statistical Models and Methods for Lifetime Data, New York: John Wiley & Sons, Inc., 2003.
McGilchrist, C. A. and Aisbett, C. W., Regression with frailty in survival analysis, Biometrics, 47, 1991.
McGilchrist, C. A., REML estimation for survival models with frailty, Biometrics, 49: 221-225, 1993.
Pan, W. and Louis, T. A., A linear mixed-effects model for multivariate censored data, Biometrics, 56: 160-166, 2000.
Ripatti, S. and Palmgren, J., Estimation of multivariate frailty models using penalized partial likelihood, Biometrics, 56: 1016-1022, 2000.
Ripatti, S., Larsen, K., and Palmgren, J., Maximum likelihood inference for multivariate frailty models using an automated Monte Carlo EM algorithm, Lifetime Data Analysis, 8: 349-360, 2002.
Royston, P., The lognormal distribution as a model for survival time in cancer, with an emphasis on prognostic factors, Statistica Neerlandica, 55: 89-104, 2001.
Royston, P., Parmar, M. K. B. and Sylvester, R., Construction and validation of a prognostic model across several studies, with an application in superficial bladder cancer, Statistics in Medicine, 23: 907-926, 2004.
Solomon, P. J., Effect of misspecification of regression models in the analysis of survival data, Biometrika, 71: 291-298, 1984.

Vaupel, J. W., Manton, K. G., and Stallard, E., The impact of heterogeneity in individual frailty on the dynamics of mortality, Demography, 16, 1979.
Wei, L. J., Lin, D. Y., and Weissfeld, L., Regression analysis of multivariate incomplete failure time data by modeling marginal distributions, Journal of the American Statistical Association, 84: 1065-1073, 1989.
Whitmore, G. A. and Lee, M.-L. T., A multivariate survival distribution generated by an inverse Gaussian mixture of exponentials, Technometrics, 33: 39-50, 1991.

ESTIMATION OF ACCELERATED FAILURE TIME MODELS WITH RANDOM EFFECTS

Yaqin Wang, Kenneth J. Koehler, Terry M. Therneau

A paper to be submitted to Biometrics

Abstract

There is increasing interest in incorporating multivariate frailties into the analysis of survival data to account for correlated outcomes. We propose accelerated failure time (AFT) models based on frailties with a multivariate lognormal joint distribution. These models allow for random effects with a complicated dependence structure that may be a function of unknown covariance parameters. The proposed models can be applied to right, left or interval-censored survival data. An estimation procedure based on the Laplace approximation to the marginal likelihood is developed for AFT models with random effects. The performance of this approximation is evaluated through several simulation studies.

Key Words: AFT models; multivariate frailties; correlated survival data; random effects; Laplace approximation.

1 Introduction

Correlated survival data with possible censoring are frequently encountered in survival analysis. The observations may be clustered in multi-center studies; for example, a group of patients may share unobserved environmental, procedural, or genetic factors that induce within-cluster association among response times. Correlated data may also arise from taking multiple observations on individual subjects. Alternatively, event times may be monitored for socially related subjects, such as classmates, or genetically related subjects, such as family members in human studies or littermates in animal studies.

In survival analysis, one of the most common assumptions is that event times are independent from one observation to another given survival to a specific time and observed covariate values. When there are dependences among observed event times, models based on this assumption are not plausible. Common regression models for survival analysis are Cox proportional hazards (PH) models (Cox, 1972) and accelerated failure time models (Collett, 2003). For either Cox models or AFT models, ignoring dependences in the analysis of the data may result in misleading inferences. Although parameter estimates may be generally consistent, estimates of their variability may be biased.

Many methods that deal with correlations among survival times have appeared in the literature. Because of its widespread use, the Cox proportional hazards model has received most of the attention, and there is a rather extensive literature on extensions of it that incorporate random effects, known as frailties, to account for correlations among response times.

We will consider clustered failure-time data with N clusters. Given the random effects, or frailties, the conditional hazard function for the jth observation from the ith cluster is generally assumed to have the form

$$\lambda_{ij}(t \mid \beta, b_i) = \lambda_0(t)\exp(x_{ij}^{\top}\beta + z_{ij}^{\top}b_i) \tag{1}$$

where $\lambda_0(\cdot)$ is the baseline hazard, $t$ is the event time, $\beta$ is the unknown regression coefficient vector, $x_{ij}$ is the covariate vector of fixed effects for the jth observation from the ith cluster, and $b_i$ is a vector of random effects associated with a vector of covariates $z_{ij}$. The random effects are assumed to follow some distribution with mean 0 and covariance matrix $D = D(\theta)$, where $\theta$ is a vector of unknown parameters unrelated to $\beta$. For a shared frailty model, $b_i$ is a scalar that expresses a cluster specific deviation, where $z_{ij}$ is an indicator variable defining cluster membership. More complex patterns of association can be modeled by allowing $z_{ij}$ to define additional sub-clusters.
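To make the role of the covariate vector for the random effects concrete, the following minimal sketch builds such vectors for a single cluster under a shared frailty and under a nested (cluster plus sub-cluster) frailty. The function names and the particular encoding of sub-clusters are hypothetical choices made for illustration; they are not notation or code from the paper.

```python
def shared_frailty_design(n_obs):
    """z vectors for a shared frailty: every observation loads on one scalar effect."""
    return [[1] for _ in range(n_obs)]


def nested_frailty_design(subcluster_labels):
    """z vectors for a nested frailty within one cluster (hypothetical encoding).

    The random effect vector is (cluster effect, one effect per sub-cluster),
    so each z has a leading 1 followed by an indicator of the sub-cluster.
    """
    levels = sorted(set(subcluster_labels))
    return [[1] + [1 if lab == lev else 0 for lev in levels]
            for lab in subcluster_labels]


def random_effect_term(z, b):
    """Inner product of z and b, the random part of the linear predictor."""
    return sum(zk * bk for zk, bk in zip(z, b))


# Three observations in one cluster; the first two share sub-cluster "a".
z_rows = nested_frailty_design(["a", "a", "b"])
b_i = [0.4, -0.1, 0.3]  # hypothetical (cluster, sub-cluster a, sub-cluster b) effects
terms = [random_effect_term(z, b_i) for z in z_rows]
```

Observations in the same sub-cluster share both components of the random effect, while observations in different sub-clusters of the same cluster share only the cluster-level component.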
Several approaches have been proposed to estimate the parameters of the proportional hazards model with random effects. McGilchrist and Aisbett (1991) and McGilchrist (1993) used a penalized partial likelihood approach to estimate the fixed effects parameters and an approximate residual maximum likelihood (REML) approach to estimate the covariance parameters for the random effects. This approach has difficulty estimating the standard errors of the variance components, and the variance component estimates have large bias under heavy censoring, especially when the variances of the random effects are large. Ducrocq and Casella (1996) introduced a Bayesian approach that yields conservative estimates of the standard errors of the variance components. Ripatti and Palmgren (2000) proposed estimation based on penalized partial likelihood for the Cox proportional hazards model. Their approach allows for a more complex frailty dependence structure and the estimation procedure is simple, but it tends to underestimate the standard errors of the variance components. EM-algorithm based estimation approaches have been applied by several authors. Ripatti, Larsen and Palmgren (2002) developed an estimation procedure based on a Monte Carlo EM algorithm, but this approach is numerically intensive. Cortinas and Burzykowski (2004) proposed a modified EM algorithm, using a Laplace approximation in the E-step to simplify the estimation procedure. However, this approach also tends to underestimate the standard errors of the variance components.

Although the Cox model has been extensively applied in medical research, the assumption of proportional hazards is rather strong and may often be violated. A useful alternative to proportional hazards models is accelerated failure time models. Accelerated failure time models use expansion and contraction of time scales to relate the lifetime distribution to the covariates. The distribution of the event times can be defined through the survivor function or hazard function. In typical AFT models, the logarithms of the event times are assumed to be independently and identically drawn from some distribution such as the normal distribution (lognormal regression), the extreme value distribution (Weibull regression), or the logistic distribution (log-logistic regression). Chapman et al. (99) applied four parametric survival models (exponential, Weibull, log-logistic, and lognormal) to prognostic factors in breast cancer and concluded that the lognormal model provided the best fit to the data.
These models provide a wide variety of shapes of hazard functions that can be further extended by using mixtures of distributions.

In this paper, we consider AFT models with random effects to allow for possible correlations among the survival times. The variability in survival times is generally modeled as arising from two different sources. The first is the usual variability associated with the baseline hazard function. The second is induced by variation in random effects and fixed covariates. Conditionally on the random effects, the survival times are often assumed to be statistically independent across observations in these random effects models. We propose an estimation procedure based on an approximate penalized log-likelihood, which is similar to that used by Breslow and Clayton (1993) for generalized linear mixed models with Gaussian random effects. Estimates of variance components can be used to assess the strength of association among event times within clusters. Under the proposed random effects models, the regression parameters $\beta$ express the effect of covariates both conditionally (given the random effects) and marginally (after integrating the random effects out). Keiding et al. (1997) reported that estimates of the regression parameters are robust against misspecification of the frailty distribution for Weibull AFT models. This finding is supported by the empirical results of Lambert et al. (2004) for AFT models with shared frailty.

The organization of the article is as follows. A description of the parametric accelerated failure time models with shared frailty is given in Section 2.1. Section 2.2 continues with an extension to AFT models with more general random effects. In Section 3.1, an estimation procedure for AFT models with random effects is introduced, and the asymptotic properties of the estimators are reviewed in Section 3.2. Section 4 is devoted to simulation studies that provide empirical validation of the estimation procedures. Section 5 summarizes the results and discusses some additional issues.

2 Accelerated Failure Time Models with Random Effects

In this paper, the data are assumed to consist of right censored event time observations from N clusters with $n_i$ observations from the ith cluster. Let $T_{ij}^*$ represent the event time corresponding to the jth ($j = 1, \ldots, n_i$) individual from the ith cluster ($i = 1, \ldots, N$), and let $C_{ij}^*$ represent a corresponding censoring time that is independent of the event time.
Thus, the observed data consist of the observed follow-up time $T_{ij} = \min(T_{ij}^*, C_{ij}^*)$ and a censoring indicator $\delta_{ij} = I(\{T_{ij}^* \le C_{ij}^*\})$, which is 1 if the individual is uncensored and 0 otherwise. In this setting, it is natural to assume that observations within a cluster will be correlated. In the literature, many authors have proposed using a shared frailty model to account for within cluster dependences.

2.1 AFT Models with Shared Frailty

Shared frailty models are appropriate when observations within a cluster share a common unobservable frailty. In these models, each observation belongs to only one cluster, and frailties of different clusters are independent. Many different frailty distributions have been considered in generalizations of the Cox proportional hazards model that implement random effects: the gamma distribution (Clayton, 1991; Klein, 1992), the positive stable distribution (Hougaard, 1986a), the inverse Gaussian distribution (Hougaard, 1986b) and the lognormal distribution (McGilchrist and Aisbett, 1991). AFT models with shared frailty have also received some attention recently. Klein et al. (1999) considered a lognormal regression model with a shared lognormal frailty and Pan (2001) explored AFT models with gamma frailty. Conditional on the frailty, within cluster survival times are assumed to be independent. The AFT models with shared frailty can be expressed as a log-linear model for the logarithm of the event time as follows

$$\log T_{ij} = x_{ij}^{\top}\beta + b_i + \sigma\varepsilon_{ij} \tag{2}$$

where $\beta$ is a vector of fixed effects corresponding to the covariate vector $x_{ij}$, $\sigma$ is a scale parameter, the $\varepsilon_{ij}$'s are independent and identically distributed random errors, and the $b_i$'s are the cluster-specific random effects, which are assumed to be independent, identically distributed random variables with density function $p(b_i)$. In these models, frailty can be considered as an unobserved covariate that is additive on the log failure time scale and describes reduced or increased event times for different clusters. All observations within a cluster share a common unobserved random effect. AFT models with shared frailty specify a direct linear relationship between the log of the failure time and the covariates. The regression parameters can be intuitively interpreted with respect to the expected log of the failure time.
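A minimal simulation sketch of this data structure may help fix ideas. It assumes standard normal errors (a lognormal AFT model), a normal cluster effect with variance `theta`, standard normal covariates, and exponential censoring times; these distributional choices, the function name, and the parameter values are hypothetical and chosen only for illustration.

```python
import math
import random


def simulate_shared_frailty_aft(n_clusters, cluster_size, beta, sigma, theta,
                                censor_mean, seed=0):
    """Simulate right-censored clustered data from a shared-frailty AFT model.

    log T* = x'beta + b_i + sigma * eps, with eps ~ N(0, 1) and b_i ~ N(0, theta).
    """
    rng = random.Random(seed)
    rows = []
    for i in range(n_clusters):
        b_i = rng.gauss(0.0, math.sqrt(theta))       # cluster-specific random effect
        for _ in range(cluster_size):
            x = [rng.gauss(0.0, 1.0) for _ in beta]  # hypothetical covariates
            eps = rng.gauss(0.0, 1.0)                # standard normal error
            log_t = sum(xk * bk for xk, bk in zip(x, beta)) + b_i + sigma * eps
            t_star = math.exp(log_t)                 # latent event time
            c_star = rng.expovariate(1.0 / censor_mean)  # independent censoring time
            rows.append({
                "cluster": i,
                "x": x,
                "time": min(t_star, c_star),             # observed follow-up time
                "delta": 1 if t_star <= c_star else 0,   # 1 = uncensored, 0 = censored
            })
    return rows


data = simulate_shared_frailty_aft(n_clusters=5, cluster_size=4,
                                   beta=[0.5, -1.0], sigma=0.8, theta=0.3,
                                   censor_mean=10.0, seed=1)
```

Each record carries the observed follow-up time and the censoring indicator, and all records in a cluster share the same draw of the random effect.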
However, the formulation based on the survivor function and hazard function is more convenient for the description in the next section. The survivor function for an AFT model at time $t$ has the form

$$\Pr(T_{ij} \ge t) = S_0^*\left[\left(\frac{t}{\psi_{ij}}\right)^{1/\sigma}\right] = S_0\left(\frac{\log t - \log\psi_{ij}}{\sigma}\right) \tag{3}$$

where $\sigma$ is the scale parameter, $S_0^*$ is a survivor function defined on $(0, \infty)$, $S_0$ is the baseline survivor function satisfying the relationship $S_0^*(\omega) = S_0(\log\omega)$, and $\psi_{ij}$ is some function of the covariates. One of the most common choices for AFT models with shared frailty is

$$\psi_{ij} = \exp(x_{ij}^{\top}\beta + b_i) \tag{4}$$

Some failure time distributions, such as the lognormal, Weibull, and log-logistic distributions, have the property that the log of the failure time has a location-scale distribution. Conditional on the random effects, the survivor function in (3) can be rewritten in the following form:

$$S_{ij}(t \mid b_i) = S_0\left(\frac{\log t - x_{ij}^{\top}\beta - b_i}{\sigma} \,\Big|\, b_i\right) \tag{5}$$

AFT models with shared frailty have some limitations. First, a shared frailty model forces the frailty to be the same for all the observations within a cluster. Clearly, there is a need for extensions of shared frailty models to incorporate more complicated frailty structures, e.g., one may wish to use a hierarchical nested frailty model. Another restriction is that shared frailty can only induce positive association within the cluster, which might not always reflect reality. To deal with more complex association structures, AFT models with random effects are proposed.

2.2 AFT Models with Random Effects

Given a q-dimensional vector of random effects $b_i$, the within cluster event times are assumed independent. For the AFT models with random effects, the regression model in equation (2) can be extended as follows,

$$\log T_{ij} = x_{ij}^{\top}\beta + z_{ij}^{\top}b_i + \sigma\varepsilon_{ij} \tag{6}$$

The conditional survivor function of observation j from cluster i has the form

$$S_{ij}(t \mid b_i) = S_0\left(\frac{\log t - x_{ij}^{\top}\beta - z_{ij}^{\top}b_i}{\sigma} \,\Big|\, b_i\right) \tag{7}$$

where $S_0(\cdot)$ is the survivor function of $\varepsilon_{ij}$ and $\beta$ is a vector of fixed effects associated with a vector of covariates $x_{ij}$ measured on the jth observation in the ith cluster. We assume that the random effect $b_i$ follows a multivariate normal distribution with mean zero and covariance matrix $D_i(\theta)$, where $\theta$ is an unknown vector of parameters. The density function for $b_i$ is denoted by $p(b_i; D_i(\theta))$. With $\varepsilon_{ij} = (\log T_{ij} - x_{ij}^{\top}\beta - z_{ij}^{\top}b_i)/\sigma$, the conditional survivor and hazard functions are

$$S_{ij}(t \mid b_i) = S_0(\varepsilon_{ij} \mid b_i) \tag{8}$$

$$h_{ij}(t \mid b_i) = \frac{1}{\sigma t}\,h_0(\varepsilon_{ij} \mid b_i) \tag{9}$$

respectively, where $h_0(\cdot)$ is the hazard function of $\varepsilon_{ij}$. Let N denote the number of clusters and $n_i$ denote the sample size within the ith cluster. If, conditional on the random effects, the censoring is assumed to be independent of survival, the conditional likelihood for the observed data is

$$L_c = \prod_{i=1}^{N}\prod_{j=1}^{n_i}\left[\frac{1}{\sigma t_{ij}}\,h_0(\varepsilon_{ij} \mid b_i)\right]^{\delta_{ij}} S_0(\varepsilon_{ij} \mid b_i) \tag{10}$$

Integrating out the unobserved frailties $b_i$, the marginal likelihood function for all clusters can be expressed as

$$L_m = \prod_{i=1}^{N}\int\prod_{j=1}^{n_i}\left[\frac{1}{\sigma t_{ij}}\,h_0(\varepsilon_{ij} \mid b_i)\right]^{\delta_{ij}} S_0(\varepsilon_{ij} \mid b_i)\,p(b_i; D_i)\,db_i \tag{11}$$

Our aim is to maximize this marginal likelihood with respect to the unknown parameters $\sigma$, $\beta$ and $\theta$ and to make inferences. The integral in (11) is multidimensional and will be difficult to evaluate analytically. Computationally intensive methods, such as MCMC methods or numerical integration, can be used to evaluate the exact log-likelihood numerically. However, these methods may not be feasible for large data sets with correlated observations. In this paper, we propose an approximate maximum likelihood estimation procedure derived from a Laplace approximation to the marginal likelihood.
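For a scalar random effect, the marginal likelihood integral can be checked by brute force. The sketch below assumes standard normal errors (the lognormal AFT model) and a shared frailty with a normal distribution of mean zero and variance `theta`, entering each observation with coefficient 1. These assumptions, the function names, and the trapezoidal rule are illustrative choices, not the estimation method proposed in the paper.

```python
import math


def log_S0(eps):
    """Log survivor function of a standard normal error."""
    return math.log(0.5 * math.erfc(eps / math.sqrt(2.0)))


def log_h0(eps):
    """Log hazard function of a standard normal error: log(phi) - log(S0)."""
    log_phi = -0.5 * eps * eps - 0.5 * math.log(2.0 * math.pi)
    return log_phi - log_S0(eps)


def cluster_cond_loglik(b, times, deltas, lin_preds, sigma):
    """Log of one cluster's conditional likelihood for a scalar random effect b."""
    total = 0.0
    for t, d, eta in zip(times, deltas, lin_preds):
        eps = (math.log(t) - eta - b) / sigma
        total += d * (log_h0(eps) - math.log(sigma * t)) + log_S0(eps)
    return total


def cluster_marginal_loglik(times, deltas, lin_preds, sigma, theta,
                            n_grid=2001, width=8.0):
    """Log of one cluster's marginal likelihood, integrating b by the trapezoidal rule."""
    sd = math.sqrt(theta)
    lo, hi = -width * sd, width * sd
    step = (hi - lo) / (n_grid - 1)
    total = 0.0
    for k in range(n_grid):
        b = lo + k * step
        # log density of b under N(0, theta)
        log_p = -0.5 * b * b / theta - 0.5 * math.log(2.0 * math.pi * theta)
        weight = 0.5 if k in (0, n_grid - 1) else 1.0
        total += weight * math.exp(
            cluster_cond_loglik(b, times, deltas, lin_preds, sigma) + log_p)
    return math.log(total * step)


# One uncensored and one censored observation sharing a cluster (hypothetical values).
value = cluster_marginal_loglik([2.0, 3.5], [1, 0], [0.3, 0.6], sigma=0.7, theta=0.5)
```

With a single observation the integral has a closed form under these assumptions, because the log event time is then normal with variance equal to the sum of the squared scale and the frailty variance; this gives a simple check of the numerical integration. Grid integration of this kind scales poorly with the dimension q of the random effects, which is the difficulty the Laplace approximation is meant to avoid.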

3 Estimation

When the integration in equation (11) is analytically intractable, one option is to maximize an approximate likelihood obtained from the Laplace approximation to the integral. The Laplace approximation has been widely used to obtain approximate posterior distributions (Tierney and Kadane, 1986) and approximate likelihoods (Solomon and Cox, 1992; Shun and McCullagh, 1995). First partial derivatives of the approximated log-likelihood yield a set of estimating equations that produce consistent parameter estimates with large sample normal distributions under relatively broad conditions.

3.1 Approximate Likelihood

To simplify the discussion, we restrict the q-dimensional vector $b_i$ to follow a multivariate normal distribution as set forth by Ripatti and Palmgren (2000). Thus, we can use arbitrary covariance matrices and handle negative dependences within clusters. Following the application of the Laplace approximation for the generalized linear mixed model (Breslow and Clayton, 1993), an approximate integrated log likelihood can be derived. We assume the conditional independence of the observations within a cluster given $b_i$. Then, up to a constant factor, the conditional likelihood for the ith cluster is

$$L_{ci}(b_i) = \prod_{j=1}^{n_i}\left(\frac{1}{\sigma}\,h_0(\varepsilon_{ij} \mid b_i)\right)^{\delta_{ij}} S_0(\varepsilon_{ij} \mid b_i) \tag{12}$$

and the corresponding marginal likelihood is

$$L_i = \int L_{ci}(b_i)\,p(b_i; D_i(\theta))\,db_i = (2\pi)^{-q/2}\,|D_i(\theta)|^{-1/2}\int L_{ci}(b_i)\,e^{-\frac{1}{2}b_i^{\top}D_i(\theta)^{-1}b_i}\,db_i = (2\pi)^{-q/2}\,|D_i(\theta)|^{-1/2}\int e^{K_i(b_i)}\,db_i \tag{13}$$

where $K_i(b_i)$ is the penalized log likelihood given by

$$K_i(b_i) = \log[L_{ci}(b_i)] - \frac{1}{2}b_i^{\top}D_i(\theta)^{-1}b_i = -\sum_{j=1}^{n_i}\delta_{ij}\log\sigma + \sum_{j=1}^{n_i}\left[\delta_{ij}\log h_0(\varepsilon_{ij}) + \log S_0(\varepsilon_{ij})\right] - \frac{1}{2}b_i^{\top}D_i(\theta)^{-1}b_i \tag{14}$$
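As a worked special case, suppose the errors are standard normal, which corresponds to the lognormal AFT model mentioned earlier. This choice is an assumption made here for illustration, with $\phi$ and $\Phi$ denoting the standard normal density and distribution function. The baseline functions and the penalized log likelihood of the ith cluster then take the explicit form

```latex
h_0(\varepsilon) = \frac{\phi(\varepsilon)}{1 - \Phi(\varepsilon)}, \qquad
S_0(\varepsilon) = 1 - \Phi(\varepsilon),

K_i(b_i) = -\sum_{j=1}^{n_i} \delta_{ij} \log\sigma
  + \sum_{j=1}^{n_i} \left[ \delta_{ij} \log\phi(\varepsilon_{ij})
  + (1 - \delta_{ij}) \log\{1 - \Phi(\varepsilon_{ij})\} \right]
  - \frac{1}{2}\, b_i^{\top} D_i(\theta)^{-1} b_i
```

The second line follows because $\delta_{ij}\log h_0 + \log S_0 = \delta_{ij}\log\phi + (1 - \delta_{ij})\log S_0$: an uncensored observation contributes a log density and a censored observation contributes a log survivor probability.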

Although the penalized log likelihood is a function of all unknown parameters, we simplify the notation as $K_i(b_i)$ in the following derivation. Writing the contribution to the marginal likelihood from the ith cluster in the form of (13) with $K_i(b_i) = n_i[K_i(b_i)/n_i]$, we can apply the Laplace method for integral approximation. The Laplace method is a family of asymptotic methods used to approximate integrals of the form $\int e^{ml(b)}\,db$ (see Appendix). The approximation is given by

$$\int e^{ml(b)}\,db \approx (2\pi)^{q/2}\,\left|-ml''(\tilde{b})\right|^{-1/2} e^{ml(\tilde{b})} \tag{15}$$

where $b$ is a q-dimensional vector and $\tilde{b}$ denotes the solution to the equations obtained from setting the first partial derivatives of $ml(b)$ with respect to $b$ equal to zero. Therefore, the contribution of the ith cluster to the overall log marginal likelihood can be approximated as

$$l_i^*(\tilde{b}_i) = K_i(\tilde{b}_i) - \frac{1}{2}\log|D_i(\theta)| - \frac{1}{2}\log\left|-K_i''(\tilde{b}_i)\right| \tag{16}$$

The order of accuracy associated with the Laplace approximation is $O(n_i^{-1})$. Let $\tilde{b}$ denote the vector obtained from stacking the $\tilde{b}_i$ vectors for all clusters. The covariance matrix, $D(\theta)$, can capture the structure for within cluster dependence and between cluster heterogeneity. Here $\theta$ is a vector of unknown parameters, which do not depend on $\beta$. Across all clusters, the approximate log marginal likelihood is given by

$$l^*(\tilde{b}) = \sum_{i=1}^{N}\left(K_i(\tilde{b}_i) - \frac{1}{2}\log|D_i(\theta)| - \frac{1}{2}\log\left|-K_i''(\tilde{b}_i)\right|\right) \tag{17}$$

Alternatively, the approximate log marginal likelihood can be rewritten as

$$l^*(\tilde{b}) = K(\tilde{b}) - \frac{1}{2}\log|D(\theta)| - \frac{1}{2}\log\left|-K''(\tilde{b})\right| \tag{18}$$

where $\tilde{b}$ is a function of all unknown parameters $(\beta, \sigma, \theta)$, $K(\tilde{b})$ is the penalized log likelihood given by

$$K(\tilde{b}) = -r\log\sigma + \sum_{i=1}^{N}\sum_{j=1}^{n_i}\left[\delta_{ij}\log h_0(\varepsilon_{ij}) + \log S_0(\varepsilon_{ij})\right] - \frac{1}{2}\tilde{b}^{\top}D(\theta)^{-1}\tilde{b},$$

with $r = \sum_{i=1}^{N}\sum_{j=1}^{n_i}\delta_{ij}$ the number of uncensored observations, and $K''(\tilde{b})$ is the matrix of second partial derivatives of $K(b)$ with respect to $b$ evaluated at $\tilde{b}$, given by
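The per-cluster Laplace approximation can be sketched for a scalar special case. The code below assumes standard normal errors, a single shared random effect per cluster entering with coefficient 1, and a random effect variance `theta` in place of the covariance matrix. The function names and the Newton search with finite-difference derivatives are illustrative choices, not the authors' implementation.

```python
import math


def log_S0(eps):
    """Log survivor function of a standard normal error."""
    return math.log(0.5 * math.erfc(eps / math.sqrt(2.0)))


def log_h0(eps):
    """Log hazard function of a standard normal error."""
    return -0.5 * eps * eps - 0.5 * math.log(2.0 * math.pi) - log_S0(eps)


def penalized_loglik(b, times, deltas, lin_preds, sigma, theta):
    """Penalized log likelihood of one cluster for a scalar random effect b."""
    total = -0.5 * b * b / theta  # penalty term, -(1/2) b D^{-1} b with D = theta
    for t, d, eta in zip(times, deltas, lin_preds):
        eps = (math.log(t) - eta - b) / sigma
        total += d * (log_h0(eps) - math.log(sigma)) + log_S0(eps)
    return total


def laplace_cluster_loglik(times, deltas, lin_preds, sigma, theta,
                           h=1e-4, tol=1e-10, max_iter=100):
    """Return (K(b~) - 0.5*log(theta) - 0.5*log(-K''(b~)), b~) for one cluster."""
    args = (times, deltas, lin_preds, sigma, theta)
    b = 0.0
    for _ in range(max_iter):  # Newton iterations for the mode b~
        k0 = penalized_loglik(b, *args)
        kp = penalized_loglik(b + h, *args)
        km = penalized_loglik(b - h, *args)
        grad = (kp - km) / (2.0 * h)           # central-difference first derivative
        hess = (kp - 2.0 * k0 + km) / (h * h)  # central-difference second derivative
        step = grad / hess
        b -= step
        if abs(step) < tol:
            break
    k0 = penalized_loglik(b, *args)
    hess = (penalized_loglik(b + h, *args) - 2.0 * k0
            + penalized_loglik(b - h, *args)) / (h * h)
    return k0 - 0.5 * math.log(theta) - 0.5 * math.log(-hess), b


# One uncensored and one censored observation in a cluster (hypothetical values).
approx, mode = laplace_cluster_loglik([2.0, 3.5], [1, 0], [0.3, 0.6],
                                      sigma=0.7, theta=0.5)
```

Under these assumptions the penalized log likelihood of a cluster with a single uncensored observation is exactly quadratic in the random effect, so the approximation reproduces the exact log marginal likelihood in that case. With censored observations it is only approximate, and the error shrinks as the cluster size grows.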

$$K''(\tilde{b}) = \frac{1}{\sigma^2}\sum_{i=1}^{N}\sum_{j=1}^{n_i} z_{ij}z_{ij}^{\top}\left[\delta_{ij}\frac{\partial^2\log h_0(\varepsilon_{ij})}{\partial\varepsilon_{ij}^2} + \frac{\partial^2\log S_0(\varepsilon_{ij})}{\partial\varepsilon_{ij}^2}\right] - D(\theta)^{-1} \tag{19}$$

3.2 Asymptotic Properties of Laplace-Based Estimation

Maximizing the approximate log-likelihood obtained using the Laplace method results in approximate maximum likelihood estimation. The corresponding estimates differ from those obtained using the true maximum likelihood and are not necessarily consistent. However, the estimates are shown to be consistent under some conditions, and the rate of convergence depends on both the number of clusters and the cluster sizes. Also, under some regularity conditions, we can establish the asymptotic normal distribution of the estimated parameters.

3.2.1 Consistency of the Laplace-Based Estimator

The Laplace approximation is applied to the random effects of the integrated likelihood for each cluster. This approach allows the random effects to have a q-dimensional distribution within each cluster and to be correlated. Let $\gamma = (\sigma, \beta)$. Up to a constant, the ith cluster's contribution to the overall log-likelihood (see Appendix) is equivalent to

$$l_i(\gamma) = -\frac{1}{2}\log|D_i(\theta)| - \frac{1}{2}\log\left|-K_i''(\gamma)\right| + K_i(\gamma) + O_p(n_i^{-1}) \tag{20}$$

where

$$K_i(\gamma) = -\sum_{j=1}^{n_i}\delta_{ij}\log\sigma + \sum_{j=1}^{n_i}\left[\delta_{ij}\log h_0(\varepsilon_{ij}) + \log S_0(\varepsilon_{ij})\right] - \frac{1}{2}\tilde{b}_i^{\top}D_i(\theta)^{-1}\tilde{b}_i$$

is the penalized log-likelihood. Let $l(\gamma)$ denote $\sum_{i=1}^{N} l_i(\gamma)$. Here, we assume homogeneous cluster sizes for convenience. Up to a constant, the true log-likelihood with respect to $\gamma$ can be written as

$$l(\gamma) = l^*(\gamma) + O_p(Nn^{-1}) \tag{21}$$

where $l^*(\gamma) = \sum_{i=1}^{N}\left(-\frac{1}{2}\log|D_i(\theta)| - \frac{1}{2}\log\left|-K_i''(\gamma)\right| + K_i(\gamma)\right)$, N is the number of clusters and n is the common cluster size. For fixed q, the omitted terms in the approximation of the log-likelihood are of order $O_p(Nn^{-1})$. A more accurate approximation could be obtained by using higher order terms in the expansion of the logarithm of the integrand (see Appendix).

Let $n = O_p(N^{\alpha})$ for $\alpha > 1$, so that the accuracy of the Laplace approximation to the marginal log-likelihood is approximately $O_p(N^{1-\alpha}) = o_p(1)$ by (21). That is, the error of the Laplace approximation to the marginal log-likelihood is $o_p(1)$ if the cluster size, n, grows faster than the number of clusters N. Then $l^*(\gamma)$ converges to $l(\gamma)$. The consistency of the Laplace-based maximum likelihood estimator can be established by arguments similar to those used by Vonesh (1996) for nonlinear mixed-effects models. The following conditions are assumed:

(i) $b_i$ and $\varepsilon_i$ are independent of one another.

(ii) Let $l(\gamma)$, the true but unspecified log marginal likelihood function, satisfy the following regularity conditions:

C1: The distributions of log-event times have common support for all $\gamma \in \mathrm{B}$, where $\mathrm{B}$ is the parameter space for $\gamma$.

C2: There exists an open subset $\omega$ of $\mathrm{B}$ containing the true parameter point $\gamma_T$ such that $l(\gamma)$ is three times differentiable as a function of $\gamma$ for all $\gamma \in \omega$.

C3: $E[l'(\gamma_T)] = 0$ and $-\frac{1}{Nn}[l''(\gamma)] \to I(\gamma)$, where the Fisher information matrix, $I(\gamma)$, is finite and positive definite for all $\gamma \in \omega$.

(iii) The fifth order derivatives of $l_i(\gamma)$ exist and are continuous in an open neighborhood of $\gamma_T$ for all clusters.

(iv) Let $\|A\|$ be the Euclidean norm for a matrix $A$ and assume that $E\left\|\left(l_i'(\gamma_T) - E[l_i'(\gamma_T)]\right)/n\right\|^{2+\delta} < \Delta < \infty$ for some $\Delta > 0$ and $\delta > 0$ and for all $i = 1, \ldots, N$. Let $B_{i,n}(\gamma_T) = \mathrm{var}(l_i'(\gamma_T))/n$ and assume that $B(\gamma_T) = \lim_{n \to \infty}\sum_{i=1}^{N} B_{i,n}(\gamma_T)$ is positive definite with minimum eigenvalue $\lambda_{\min} > 0$.

Parametric fractional imputation for missing data analysis. Jae Kwang Kim Survey Working Group Seminar March 29, 2010

Parametric fractional imputation for missing data analysis. Jae Kwang Kim Survey Working Group Seminar March 29, 2010 Parametrc fractonal mputaton for mssng data analyss Jae Kwang Km Survey Workng Group Semnar March 29, 2010 1 Outlne Introducton Proposed method Fractonal mputaton Approxmaton Varance estmaton Multple mputaton

More information

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4)

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4) I. Classcal Assumptons Econ7 Appled Econometrcs Topc 3: Classcal Model (Studenmund, Chapter 4) We have defned OLS and studed some algebrac propertes of OLS. In ths topc we wll study statstcal propertes

More information

Computation of Higher Order Moments from Two Multinomial Overdispersion Likelihood Models

Computation of Higher Order Moments from Two Multinomial Overdispersion Likelihood Models Computaton of Hgher Order Moments from Two Multnomal Overdsperson Lkelhood Models BY J. T. NEWCOMER, N. K. NEERCHAL Department of Mathematcs and Statstcs, Unversty of Maryland, Baltmore County, Baltmore,

More information

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction ECONOMICS 5* -- NOTE (Summary) ECON 5* -- NOTE The Multple Classcal Lnear Regresson Model (CLRM): Specfcaton and Assumptons. Introducton CLRM stands for the Classcal Lnear Regresson Model. The CLRM s also

More information

Chapter 11: Simple Linear Regression and Correlation

Chapter 11: Simple Linear Regression and Correlation Chapter 11: Smple Lnear Regresson and Correlaton 11-1 Emprcal Models 11-2 Smple Lnear Regresson 11-3 Propertes of the Least Squares Estmators 11-4 Hypothess Test n Smple Lnear Regresson 11-4.1 Use of t-tests

More information

Non-Mixture Cure Model for Interval Censored Data: Simulation Study ABSTRACT

Non-Mixture Cure Model for Interval Censored Data: Simulation Study ABSTRACT Malaysan Journal of Mathematcal Scences 8(S): 37-44 (2014) Specal Issue: Internatonal Conference on Mathematcal Scences and Statstcs 2013 (ICMSS2013) MALAYSIAN JOURNAL OF MATHEMATICAL SCIENCES Journal

More information

Chapter 20 Duration Analysis

Chapter 20 Duration Analysis Chapter 20 Duraton Analyss Duraton: tme elapsed untl a certan event occurs (weeks unemployed, months spent on welfare). Survval analyss: duraton of nterest s survval tme of a subject, begn n an ntal state

More information

Global Sensitivity. Tuesday 20 th February, 2018

Global Sensitivity. Tuesday 20 th February, 2018 Global Senstvty Tuesday 2 th February, 28 ) Local Senstvty Most senstvty analyses [] are based on local estmates of senstvty, typcally by expandng the response n a Taylor seres about some specfc values

More information

Maximum Likelihood Estimation of Binary Dependent Variables Models: Probit and Logit. 1. General Formulation of Binary Dependent Variables Models

Maximum Likelihood Estimation of Binary Dependent Variables Models: Probit and Logit. 1. General Formulation of Binary Dependent Variables Models ECO 452 -- OE 4: Probt and Logt Models ECO 452 -- OE 4 Maxmum Lkelhood Estmaton of Bnary Dependent Varables Models: Probt and Logt hs note demonstrates how to formulate bnary dependent varables models

More information

Composite Hypotheses testing

Composite Hypotheses testing Composte ypotheses testng In many hypothess testng problems there are many possble dstrbutons that can occur under each of the hypotheses. The output of the source s a set of parameters (ponts n a parameter

More information

See Book Chapter 11 2 nd Edition (Chapter 10 1 st Edition)

See Book Chapter 11 2 nd Edition (Chapter 10 1 st Edition) Count Data Models See Book Chapter 11 2 nd Edton (Chapter 10 1 st Edton) Count data consst of non-negatve nteger values Examples: number of drver route changes per week, the number of trp departure changes

More information

Durban Watson for Testing the Lack-of-Fit of Polynomial Regression Models without Replications

Durban Watson for Testing the Lack-of-Fit of Polynomial Regression Models without Replications Durban Watson for Testng the Lack-of-Ft of Polynomal Regresson Models wthout Replcatons Ruba A. Alyaf, Maha A. Omar, Abdullah A. Al-Shha ralyaf@ksu.edu.sa, maomar@ksu.edu.sa, aalshha@ksu.edu.sa Department

More information

Population Design in Nonlinear Mixed Effects Multiple Response Models: extension of PFIM and evaluation by simulation with NONMEM and MONOLIX

Population Design in Nonlinear Mixed Effects Multiple Response Models: extension of PFIM and evaluation by simulation with NONMEM and MONOLIX Populaton Desgn n Nonlnear Mxed Effects Multple Response Models: extenson of PFIM and evaluaton by smulaton wth NONMEM and MONOLIX May 4th 007 Carolne Bazzol, Sylve Retout, France Mentré Inserm U738 Unversty

More information

RELIABILITY ASSESSMENT

RELIABILITY ASSESSMENT CHAPTER Rsk Analyss n Engneerng and Economcs RELIABILITY ASSESSMENT A. J. Clark School of Engneerng Department of Cvl and Envronmental Engneerng 4a CHAPMAN HALL/CRC Rsk Analyss for Engneerng Department

More information

Stat260: Bayesian Modeling and Inference Lecture Date: February 22, Reference Priors

Stat260: Bayesian Modeling and Inference Lecture Date: February 22, Reference Priors Stat60: Bayesan Modelng and Inference Lecture Date: February, 00 Reference Prors Lecturer: Mchael I. Jordan Scrbe: Steven Troxler and Wayne Lee In ths lecture, we assume that θ R; n hgher-dmensons, reference

More information

A Robust Method for Calculating the Correlation Coefficient

A Robust Method for Calculating the Correlation Coefficient A Robust Method for Calculatng the Correlaton Coeffcent E.B. Nven and C. V. Deutsch Relatonshps between prmary and secondary data are frequently quantfed usng the correlaton coeffcent; however, the tradtonal

More information

Efficient nonresponse weighting adjustment using estimated response probability

Efficient nonresponse weighting adjustment using estimated response probability Effcent nonresponse weghtng adjustment usng estmated response probablty Jae Kwang Km Department of Appled Statstcs, Yonse Unversty, Seoul, 120-749, KOREA Key Words: Regresson estmator, Propensty score,

More information

Here is the rationale: If X and y have a strong positive relationship to one another, then ( x x) will tend to be positive when ( y y)

Here is the rationale: If X and y have a strong positive relationship to one another, then ( x x) will tend to be positive when ( y y) Secton 1.5 Correlaton In the prevous sectons, we looked at regresson and the value r was a measurement of how much of the varaton n y can be attrbuted to the lnear relatonshp between y and x. In ths secton,

More information

Linear Approximation with Regularization and Moving Least Squares

Linear Approximation with Regularization and Moving Least Squares Lnear Approxmaton wth Regularzaton and Movng Least Squares Igor Grešovn May 007 Revson 4.6 (Revson : March 004). 5 4 3 0.5 3 3.5 4 Contents: Lnear Fttng...4. Weghted Least Squares n Functon Approxmaton...

More information

BOOTSTRAP METHOD FOR TESTING OF EQUALITY OF SEVERAL MEANS. M. Krishna Reddy, B. Naveen Kumar and Y. Ramu

BOOTSTRAP METHOD FOR TESTING OF EQUALITY OF SEVERAL MEANS. M. Krishna Reddy, B. Naveen Kumar and Y. Ramu BOOTSTRAP METHOD FOR TESTING OF EQUALITY OF SEVERAL MEANS M. Krshna Reddy, B. Naveen Kumar and Y. Ramu Department of Statstcs, Osmana Unversty, Hyderabad -500 007, Inda. nanbyrozu@gmal.com, ramu0@gmal.com

More information

Statistics for Economics & Business

Statistics for Economics & Business Statstcs for Economcs & Busness Smple Lnear Regresson Learnng Objectves In ths chapter, you learn: How to use regresson analyss to predct the value of a dependent varable based on an ndependent varable

More information

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity LINEAR REGRESSION ANALYSIS MODULE IX Lecture - 30 Multcollnearty Dr. Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur 2 Remedes for multcollnearty Varous technques have

More information

Numerical Heat and Mass Transfer

Numerical Heat and Mass Transfer Master degree n Mechancal Engneerng Numercal Heat and Mass Transfer 06-Fnte-Dfference Method (One-dmensonal, steady state heat conducton) Fausto Arpno f.arpno@uncas.t Introducton Why we use models and

More information

STK4080/9080 Survival and event history analysis

STK4080/9080 Survival and event history analysis SK48/98 Survval and event hstory analyss Lecture 7: Regresson modellng Relatve rsk regresson Regresson models Assume that we have a sample of n ndvduals, and let N (t) count the observed occurrences of

More information

How its computed. y outcome data λ parameters hyperparameters. where P denotes the Laplace approximation. k i k k. Andrew B Lawson 2013

How its computed. y outcome data λ parameters hyperparameters. where P denotes the Laplace approximation. k i k k. Andrew B Lawson 2013 Andrew Lawson MUSC INLA INLA s a relatvely new tool that can be used to approxmate posteror dstrbutons n Bayesan models INLA stands for ntegrated Nested Laplace Approxmaton The approxmaton has been known

More information

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur Analyss of Varance and Desgn of Experment-I MODULE VII LECTURE - 3 ANALYSIS OF COVARIANCE Dr Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur Any scentfc experment s performed

More information

9. Binary Dependent Variables

9. Binary Dependent Variables 9. Bnar Dependent Varables 9. Homogeneous models Log, prob models Inference Tax preparers 9.2 Random effects models 9.3 Fxed effects models 9.4 Margnal models and GEE Appendx 9A - Lkelhood calculatons

More information

x i1 =1 for all i (the constant ).

x i1 =1 for all i (the constant ). Chapter 5 The Multple Regresson Model Consder an economc model where the dependent varable s a functon of K explanatory varables. The economc model has the form: y = f ( x,x,..., ) xk Approxmate ths by

More information

Limited Dependent Variables

Limited Dependent Variables Lmted Dependent Varables. What f the left-hand sde varable s not a contnuous thng spread from mnus nfnty to plus nfnty? That s, gven a model = f (, β, ε, where a. s bounded below at zero, such as wages

More information

On an Extension of Stochastic Approximation EM Algorithm for Incomplete Data Problems. Vahid Tadayon 1

On an Extension of Stochastic Approximation EM Algorithm for Incomplete Data Problems. Vahid Tadayon 1 On an Extenson of Stochastc Approxmaton EM Algorthm for Incomplete Data Problems Vahd Tadayon Abstract: The Stochastc Approxmaton EM (SAEM algorthm, a varant stochastc approxmaton of EM, s a versatle tool

More information

Comparison of the Population Variance Estimators. of 2-Parameter Exponential Distribution Based on. Multiple Criteria Decision Making Method

Comparison of the Population Variance Estimators. of 2-Parameter Exponential Distribution Based on. Multiple Criteria Decision Making Method Appled Mathematcal Scences, Vol. 7, 0, no. 47, 07-0 HIARI Ltd, www.m-hkar.com Comparson of the Populaton Varance Estmators of -Parameter Exponental Dstrbuton Based on Multple Crtera Decson Makng Method

More information

Computing MLE Bias Empirically

Computing MLE Bias Empirically Computng MLE Bas Emprcally Kar Wa Lm Australan atonal Unversty January 3, 27 Abstract Ths note studes the bas arses from the MLE estmate of the rate parameter and the mean parameter of an exponental dstrbuton.

More information

Introduction to Generalized Linear Models

Introduction to Generalized Linear Models INTRODUCTION TO STATISTICAL MODELLING TRINITY 00 Introducton to Generalzed Lnear Models I. Motvaton In ths lecture we extend the deas of lnear regresson to the more general dea of a generalzed lnear model

More information

STAT 511 FINAL EXAM NAME Spring 2001

STAT 511 FINAL EXAM NAME Spring 2001 STAT 5 FINAL EXAM NAME Sprng Instructons: Ths s a closed book exam. No notes or books are allowed. ou may use a calculator but you are not allowed to store notes or formulas n the calculator. Please wrte

More information

Conjugacy and the Exponential Family

Conjugacy and the Exponential Family CS281B/Stat241B: Advanced Topcs n Learnng & Decson Makng Conjugacy and the Exponental Famly Lecturer: Mchael I. Jordan Scrbes: Bran Mlch 1 Conjugacy In the prevous lecture, we saw conjugate prors for the

More information

Basically, if you have a dummy dependent variable you will be estimating a probability.

Basically, if you have a dummy dependent variable you will be estimating a probability. ECON 497: Lecture Notes 13 Page 1 of 1 Metropoltan State Unversty ECON 497: Research and Forecastng Lecture Notes 13 Dummy Dependent Varable Technques Studenmund Chapter 13 Bascally, f you have a dummy

More information

Comparison of Regression Lines

Comparison of Regression Lines STATGRAPHICS Rev. 9/13/2013 Comparson of Regresson Lnes Summary... 1 Data Input... 3 Analyss Summary... 4 Plot of Ftted Model... 6 Condtonal Sums of Squares... 6 Analyss Optons... 7 Forecasts... 8 Confdence

BIO Lab 2: TWO-LEVEL NORMAL MODELS with school children popularity data

Purpose: Introduce basic two-level models for normally distributed responses using STATA. In particular, we discuss random intercept models without

A Comparative Study for Estimation Parameters in Panel Data Model

Ahmed H. Youssef and Mohamed R. Abonazel. This paper examines the panel data models when the regression coefficients are fixed, random, and mixed, and

ANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U)

Econ 413 Exam 13 H, ANSWERS. The set is divided into 9 sub-problems (A, B, C, ...), all of which are recommended to count equally, to make it a little easier to pass. Answers are given. Unfortunately, there is a printing error in the hint of

Markov Chain Monte Carlo (MCMC), Gibbs Sampling, Metropolis Algorithms, and Simulated Annealing Bioinformatics Course Supplement

2001 Bioinformatics Course Supplement, SNU Biointelligence Lab, http://bsnuackr/. Outline: Markov Chain Monte Carlo (MCMC); Metropolis-Hastings

ECONOMICS 351*-A Mid-Term Exam -- Fall Term 2000 Page 1 of 13 pages. QUEEN'S UNIVERSITY AT KINGSTON Department of Economics

ECONOMICS 351* - Section A: Introductory Econometrics. Fall Term 2000. MID-TERM EXAM ANSWERS. MG Abbott

Statistical inference for generalized Pareto distribution based on progressive Type-II censored data with random removals

International Journal of Scientific World, 2 (1) (2014) 1-9. © Science Publishing Corporation, www.sciencepubco.com/index.php/ijsw, doi: 10.14419/ijsw.v2i1.1780. Research Paper. Statistical inference for generalized Pareto distribution

Chapter 5 Multilevel Models

5.1 Cross-sectional multilevel models; 5.1.1 Two-level models; 5.1.2 Multiple level models; 5.1.3 Multiple level modeling in other fields; 5.2 Longitudinal multilevel models; 5.2.1 Two-level

3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X

Statistics 1: Probability Theory II. 3 EXPECTATION OF SEVERAL RANDOM VARIABLES. As in Probability Theory I, the interest in most situations lies not on the actual distribution of a random vector, but rather on a number

Primer on High-Order Moment Estimators

Toni M. Whited, July 2007. The Errors-in-Variables Model. We will start with the classical EIV for one mismeasured regressor. The general case is in Erickson and Whited Econometric

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA

ANOVA is a way to estimate and test the means of multiple populations. We will start with one-way ANOVA. If the populations included in the study are selected

Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore

Session Outline: Introduction to classification problems and discrete choice models. Introduction to Logistic Regression. Logistic function and Logit function. Maximum Likelihood Estimator (MLE) for estimation of LR parameters.

Psychology 282 Lecture #24 Outline Regression Diagnostics: Outliers

In an earlier lecture we studied the statistical assumptions underlying the regression model, including the following points: Formal statement of assumptions.

Lecture Notes on Linear Regression

Feng Li, fl@sdueducn, Shandong University, China. Linear Regression Problem. In a regression problem, we aim at predicting a continuous target value given an input feature vector. We assume

STAT 405 BIOSTATISTICS (Fall 2016) Handout 15 Introduction to Logistic Regression

This handout covers material found in Section 3.7 of your text. You may also want to review regression techniques in Chapter. In this handout,

Joint Statistical Meetings - Biopharmaceutical Section

Iterative Chi-Square Test for Equivalence of Multiple Treatment Groups. Tie-Hua Ng*, U.S. Food and Drug Administration, 1401 Rockville Pike, #200S, HFM-217, Rockville, MD 20852-1448. Key Words: Equivalence Testing; Active

1. Inference on Regression Parameters a. Finding Mean, s.d and covariance amongst estimates. 2. Confidence Intervals and Working Hotelling Bands

Content: 1. Inference on Regression Parameters: a. Finding Mean, s.d and covariance amongst estimates. 2. Confidence Intervals and Working Hotelling Bands. 3. Cochran's Theorem. 4. General Linear Testing. 5. Measures of

Parametric fractional imputation for missing data analysis

Section on Survey Research Methods, JSM 2008. Jae Kwang Kim, Wayne Fuller. Abstract: Under a parametric model for missing data, the EM algorithm is a popular tool

Chapter 2 - The Simple Linear Regression Model S =0. e i is a random error. S β2 β. This is a minimization problem. Solution is a calculus exercise.

The linear regression equation is y_i = β1 + β2 x_i + e_i for each observation i, where y_i and x_i are observable variables and e_i is a random error. How can an estimation rule be constructed for the

LECTURE 9 CANONICAL CORRELATION ANALYSIS

Introduction. The concept of canonical correlation arises when we want to quantify the associations between two sets of variables. For example, suppose that the first set of

4DVAR, according to the name, is a four-dimensional variational method.

4D-Variational Data Assimilation (4D-Var). 4D-Var is actually a direct generalization of 3D-Var to handle observations that are distributed in time. The

Negative Binomial Regression

STATGRAPHICS Rev. 9/16/2013. Contents: Summary, Data Input, Statistical Model, Analysis Summary, Analysis Options, Plot of Fitted Model, Observed Versus Predicted, Predictions

Department of Quantitative Methods & Information Systems. Time Series and Their Components QMIS 320. Chapter 6

Fall 00. Dr. Mohammad Zainal. These slides were modified from their original source for educational purpose only.

ANOMALIES OF THE MAGNITUDE OF THE BIAS OF THE MAXIMUM LIKELIHOOD ESTIMATOR OF THE REGRESSION SLOPE

Diarmuid O'Driscoll ¹, Donald E. Ramirez ². ¹ Head of Department of Mathematics and Computer Studies

Chapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems

Numerical Analysis by Dr. Anita Pal, Assistant Professor, Department of Mathematics, National Institute of Technology Durgapur, Durgapur-713209, email: anta.bue@gmal.com

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification

Preface: This document is a solution manual for selected exercises from Introduction to Pattern Recognition

U-Pb Geochronology Practical: Background

Basic Concepts. Accuracy: measure of the difference between an experimental measurement and the true value. Precision: measure of the reproducibility of the experimental result

2016 Wiley. Study Session 2: Ethical and Professional Standards Application

LESSON : CORRECTION ANALYSIS. Reading 9: Correlation and Regression. LOS 9a: Calculate and interpret a sample covariance and a sample correlation

Supplementary Notes for Chapter 9 Mixture Thermodynamics

Key points: Nine major topics of Chapter 9 are reviewed below: 1. Notation and operational equations for mixtures 2. PVTN EOSs for mixtures 3. General effects

Hidden Markov Models & The Multivariate Gaussian (10/26/04)

CS281A/Stat241A: Statistical Learning Theory. Lecturer: Michael I. Jordan. Scribes: Jonathan W. Hu. 1 Hidden Markov Models. As a brief review, hidden Markov models

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur

Lesson 6: Theory of Quantization. Instructional Objectives: At the end of this lesson, the students should be able to:

Maximum Likelihood Estimation of Binary Dependent Variables Models: Probit and Logit. 1. General Formulation of Binary Dependent Variables Models

ECON 452 -- NOTE 4: Probit and Logit Models. This note demonstrates how to formulate binary dependent variables models for

Chapter 9: Statistical Inference and the Relationship between Two Variables

Key Words: The Regression Model; The Sample Regression Equation; The Pearson Correlation Coefficient. Learning Outcomes: After studying this chapter,

Testing for seasonal unit roots in heterogeneous panels

Jesus Otero*, Facultad de Economía, Universidad del Rosario, Colombia; Jeremy Smith, Department of Economics, University of Warwick; Monica Giulietti, Aston Business School

Chapter 13: Multiple Regression

13.1 Developing the multiple-regression model. The general model can be described as: It simplifies for two independent variables: The sample fit parameters b0, b1, and b2 are used to

CS 2750 Machine Learning. Lecture 5. Density estimation. CS 2750 Machine Learning. Announcements

Milos Hauskrecht, milos@cs.pitt.edu, 539 Sennott Square. Announcements: Homework due on Wednesday before the class. Reports: hand in before

Using the estimated penetrances to determine the range of the underlying genetic model in casecontrol

Georgetown University. From the SelectedWorks of Mark J Meyer, 8. Using the estimated penetrances to determine the range of the underlying genetic model in case-control design. Mark J Meyer, Neal Jeffries, Gang Zheng. Available

Chapter 7 Generalized and Weighted Least Squares Estimation. In this method, the deviation between the observed and expected values of

The usual linear regression model assumes that all the random error components are identically and independently distributed with constant variance. When

8 : Learning in Fully Observed Markov Networks. 1 Why We Need to Learn Undirected Graphical Models. 2 Structural Learning for Completely Observed MRF

10-708: Probabilistic Graphical Models, Spring 2014. Lecturer: Eric P. Xing. Scribes: Meng Song, Li Zhou. 1 Why We Need to Learn Undirected Graphical Models. In the

Lab 4: Two-level Random Intercept Model

BIO 656 Lab4 009. Data: Peak expiratory flow rate (pefr) measured twice, using two different instruments, for 17 subjects. (from Chapter 1 of Multilevel and Longitudinal

Discussion of Extensions of the Gauss-Markov Theorem to the Case of Stochastic Regression Coefficients Ed Stanek

Introduction. Pfeffermann (1984) discusses extensions to the Gauss-Markov Theorem in settings where regression

Econ Statistical Properties of the OLS estimator. Sanjaya DeSilva

Sanjaya DeSilva, September, 008. 1 Overview. Recall that the true regression model is Y = β0 + β1 X + u (1). Applying the OLS method to a sample of data, we estimate

Kernel Methods and SVMs Extension

The purpose of this document is to review material covered in Machine Learning 1: Supervised Learning regarding support vector machines (SVMs). This document also provides a general

Linear Regression Analysis: Terminology and Notation

ECON 35* -- Section : Basic Concepts of Regression Analysis. Consider the generic version of the simple (two-variable) linear regression model. It is represented

MATH 829: Introduction to Data Mining and Analysis The EM algorithm (part 2)

Dominique Guillot, Departments of Mathematical Sciences, University of Delaware, April 20, 2016. Recall: We are given independent observations

Regression Analysis of Clustered Failure Time Data under the Additive Hazards Model

Chinese Journal of Applied Probability and Statistics, Oct., 2017, Vol. 33, No. 5, pp. 517-528, doi: 1.3969/j.ssn.11-4268.217.5.8. Regression Analysis of Clustered Failure Time Data under

x = , so that calculated

Stat 4, section: Single Factor ANOVA, notes by Tim Pilachowski. In chapter 8 we conducted hypothesis tests in which we compared a single sample's mean or proportion to some hypothesized value. Chapter 9 expanded this to

j) = 1 (note sigma notation) ii. Continuous random variable (e.g. Normal distribution) 1. density function: f ( x) 0 and f ( x) dx = 1

Random variables; Measure of central tendencies and variability (means and variances); Joint density functions and independence; Measures of association (covariance and correlation); Interesting result; Conditional distributions

Small Area Interval Estimation

Partha Lahiri, Joint Program in Survey Methodology, University of Maryland, College Park (Based on joint work with Masayo Yoshimori, Former JPSM Visiting PhD Student and Research Fellow

Andreas C. Drichoutis, Agricultural University of Athens. Abstract

Heteroskedasticity, the single crossing property and ordered response models. Andreas C. Drichoutis, Agricultural University of Athens; Panagiotis Lazaridis, Agricultural University of Athens; Rodolfo M. Nayga, Jr., Texas A&M University

Limited Dependent Variables and Panel Data. Tibor Hanappi

30.06.2010. Limited Dependent Variables. Discrete: Variables that can take only a countable number of values. Censored/Truncated: Data points in some specific range

An R implementation of bootstrap procedures for mixed models

The R User Conference 2009, July 8-10, Agrocampus-Ouest, Rennes, France. José A. Sánchez-Espigares, Universitat Politècnica de Catalunya; Jordi Ocaña, Universitat

NUMERICAL DIFFERENTIATION

1 Introduction. Differentiation is a method to compute the rate at which a dependent output y changes with respect to the change in the independent input x. This rate of change is called the

Lecture 4 Hypothesis Testing

We may wish to test prior hypotheses about the coefficients we estimate. We can use the estimates to test whether the data rejects our hypothesis. An example might be that we wish to

LOW BIAS INTEGRATED PATH ESTIMATORS. James M. Calvin

Proceedings of the 2007 Winter Simulation Conference, S G Henderson, B Biller, M-H Hsieh, J Shortle, J D Tew, and R R Barton, eds. James M Calvin, Department of Computer Science

Factor models with many assets: strong factors, weak factors, and the two-pass procedure

Stanislav Anatolyev (CERGE-EI and NES) and Anna Mikusheva (MIT), December 2017

The Geometry of Logit and Probit

This short note is meant as a supplement to Chapters 2 and 3 of Spatial Models of Parliamentary Voting, and the notation and reference to figures in the text below is to those two chapters.

Advances in Longitudinal Methods in the Social and Behavioral Sciences. Finite Mixtures of Nonlinear Mixed-Effects Models.

Jeff Harring, Department of Measurement, Statistics and Evaluation, The Center for Integrated Latent

An (almost) unbiased estimator for the S-Gini index

Thomas Demuynck, February 25, 2009. Abstract: This note provides an unbiased estimator for the absolute S-Gini and an almost unbiased estimator for the relative S-Gini for

Chapter 8 Indicator Variables

In general, the explanatory variables in any regression analysis are assumed to be quantitative in nature. For example, the variables like temperature, distance, age etc. are quantitative in

First Year Examination Department of Statistics, University of Florida

May 7, 010, 8:00 am - 12:00 noon. Instructions: 1. You have four hours to answer questions in this examination. 2. You must show your work to receive

Influence Diagnostics on Competing Risks Using Cox's Model with Censored Data. Jalan Gombak, 53100, Kuala Lumpur, Malaysia.

Proceedings of the 8th WSEAS International Conference on APPLIED MATHEMATICS, Tenerife, Spain, December 16-18, 5 (pp14-138). F. A. M. Elfaki

A joint frailty-copula model between disease progression and death for meta-analysis

CSA-KSS-JSS Special Invited Sessions. Takeshi Emura, Graduate Institute of Statistics, National Central University, TAIWAN

e i is a random error

Chapter - The Simple Linear Regression Model. The linear regression equation is y_i = β1 + β2 x_i + e_i for each observation i, where y_i and x_i are observable variables and e_i is a random error. How can an estimation rule be constructed for the unknown
