arxiv: v1 [stat.me] 29 Jul 2017
Published in Statistica Neerlandica, 2016, vol. 70, no. 4.

A Skew-Normal Copula-Driven GLMM

Kalyan Das (1), Mohamad Elmasri (2) and Arusharka Sen (3)
(1) University of Calcutta, (2) McGill University and (3) Concordia University

Abstract: This paper presents a method for fitting copula-driven generalized linear mixed models. For added flexibility, the skew-normal copula is adopted for fitting. The correlation matrix of the skew-normal copula is used to capture the dependence structure within units, while the fixed and random effects coefficients are estimated through the mean of the copula. For estimation, a Monte Carlo expectation-maximization algorithm is developed. Simulations are shown alongside a real data example from the Framingham Heart Study.

Keywords: EM Algorithm, Gaussian Copula, Generalized Linear Mixed Models, Monte Carlo, Skew-Normal.

1 Introduction

The key component driving the development of linear mixed models is their ability to handle data with correlated observations, a data structure where predictors and response variables are measured at more than one level. Such structure is common with repeated observations, as in medical studies, where patient characteristics are measured at several time points, not necessarily the same set for each patient. Fisher (1918) proposed the addition of a random effects term to the linear model, which introduced heteroscedasticity. As a result, the linear mixed model takes the form

$$ Y_i = X_i \beta + D_i b_i + \epsilon_i, \quad i = 1, \dots, m, \tag{1.1} $$

where $Y_i$ is an $(n_i \times 1)$ vector of observed responses for sample unit $i$, $X_i$ is an $(n_i \times p)$ fixed effects design matrix with coefficient $\beta$ of dimension $(p \times 1)$, $D_i$ is an $(n_i \times q)$ random effects design matrix with coefficient $b_i$ of dimension $(q \times 1)$, and $\epsilon_i$ is an $(n_i \times 1)$ vector of random errors. Inference from the linear mixed model becomes slightly more tedious with the introduction of the random coefficient $b_i$. This requires an identifiability assumption of independence between $b_i$ and $\epsilon_i$.
A popular modeling assumption is then

$$ b_i \overset{iid}{\sim} N_q(0, \Omega_b), \qquad \epsilon_i \overset{ind}{\sim} N_{n_i}(0, \psi_i), \tag{1.2} $$

where $\Omega_b = \Omega_b(\alpha)$ and $\psi_i = \psi_i(\gamma)$ are the associated dispersion matrices that capture possible variability among and within individuals, parametrized by $\alpha$ and $\gamma$. In much of the literature, the extra restrictiveness associated with specifying the distribution functions of $b_i$ and $\epsilon_i$ is deemed unnecessary. Thereupon, Arellano-Valle et al. (2005) proposed the use of the skew-normal in lieu of
the normal distribution for both $b_i$ and $\epsilon_i$, in an attempt to capture any slight departures from normality. Moreover, they explicitly characterized the likelihood function of the resulting model, and fitted it by the constrained expectation maximization algorithm (CEM). Many researchers have discussed other techniques and models for inference, for instance the use of a mixture of normals as in Verbeke and Lesaffre (1996), semi-parametric models as in Zhang and Davidian (2001), non-parametric or smoothed non-parametric techniques in maximum likelihood estimation as in Newton and Zhang (1999), and a predictive recursion algorithm as in Tao et al. (1999).

This paper follows the Arellano-Valle et al. (2005) approach by modelling the dependence structure in hierarchical multivariate distributions via a copula-driven generalized linear mixed model. Given response variables $Y_{ij}$, $i = 1, \dots, m$, $j = 1, \dots, n_i$, we assume that $Y_i = (Y_{i1}, \dots, Y_{in_i})$ follows an $n_i$-variate distribution with a predefined mean and covariance matrix. We model such a distribution by using an $n_i$-variate skew-normal copula $SN_{n_i}(\cdot)$, where the random effects are integrated in the mean structure of the copula. We chose the covariance matrix $\Sigma_i = \Sigma_i(\xi_i, t_i)$ to be of an autoregressive structure in order to include the time-variant parameters. Formally,

$$ Y_i \mid b_i \sim F_{n_i}\big(\eta(X_i \beta + D_i b_i), \Sigma_i(\xi_i, t_i)\big), \tag{1.3} $$

where $X_i$, $\beta$, $b_i$, $D_i$ are as defined in (1.1) and (1.2), $\xi_i$ is the dispersion autoregressive time-variant parameter with respect to $t_i = (t_{i1}, \dots, t_{in_i})$, and $\eta(\cdot)$ is a link function. $F_k(\eta, \Sigma)$ is a $k$-variate distribution function with mean $\eta$ and covariance $\Sigma$. Moreover, we assume the marginal densities of $Y_{ij} \mid b_i$ are a function of $\{x_{ij}, t_{ij}, D_i, b_i, \beta\}$ via the same link function $\eta$.

The rest of this paper is organized as follows. Section 2 introduces a specific characterization of the skew-normal distribution and the copula used in this paper. Section 3 introduces the model, and constructs the likelihood using a skew-normal copula within a GLM framework. Section 4 discusses the use of a numerical Monte Carlo EM algorithm to estimate parameters.
Section 5 illustrates simulation results under different models. In Section 6, a real data analysis is performed to illustrate the application of our study. Section 7 ends with a general discussion.

2 Skew-normal distribution and copula

For a better understanding, we begin this section with the definition of the multivariate skew-normal distribution considered throughout this paper.

Definition 2.1. An $n$-dimensional random vector $X \in R^n$ follows a skew-normal distribution with location vector $\mu \in R^n$, dispersion matrix $\Sigma$ (an $n \times n$ positive definite matrix) and skewness vector $\lambda \in R^n$, if its density function
is given by

$$ sn_n(x \mid \mu, \Sigma, \lambda) = 2\,\phi_n(x \mid \mu, \Sigma)\,\Phi_1\big(\lambda^\top \Sigma^{-1/2}(x - \mu)\big), \quad x \in R^n. \tag{2.1} $$

In the univariate case

$$ sn_1(x \mid \mu, \sigma^2, \lambda) = 2\,\phi_1(x \mid \mu, \sigma^2)\,\Phi_1\Big(\lambda \frac{x - \mu}{\sigma}\Big), \tag{2.2} $$

with $-\infty < x, \mu < \infty$, $\mu, \sigma \in R$, $0 < \sigma < \infty$. Here $\phi_n(\cdot \mid \mu, \Sigma)$ and $\Phi_n(\cdot \mid \mu, \Sigma)$ denote respectively the $n$-variate density and distribution function of a normal random variable with mean vector $\mu$ and covariance matrix $\Sigma$ ($\sigma^2$ in the univariate case). This notation is used throughout this paper. A special case is $\lambda = 0$, which reduces the skew-normal to the normal distribution.

The skew-normal characterization in (2.1) is attributed to Arellano-Valle and Genton (2005), and the one in (2.2) is attributed to Azzalini (1985) and expanded further by Azzalini and Dalle-Valle (1996). Many authors have proposed different forms. However, for convenience, a variation of the characterization in (2.1) is the only one used in this paper. Azzalini and Dalle-Valle (1996) proposed a simplified parametrization of $\lambda$ in (2.1), in terms of an arbitrary $n \times n$ positive definite matrix $\Delta$, as

$$ \lambda = \frac{\Delta^{-1/2}\delta}{\sqrt{1 - \delta^\top \Delta^{-1} \delta}}, \tag{2.3} $$

where $\delta^\top \Delta^{-1} \delta < 1$ for some $\delta \in R^n$. This characterization is used later to define the likelihood function.

2.1 Skew-normal copula

A principal part of constructing the copula is defining the marginal distribution of $Y_{ij} \mid b_i$. In (1.3), denote the marginal distribution and density function of $Y_{ij} \mid b_i$ by $F(y_{ij} \mid \theta_{ij})$ and $f(y_{ij} \mid \theta_{ij})$, where $\theta_i = (\theta_{i1}, \dots, \theta_{in_i})$ are the parameters of interest. With the same notation as in (1.3), conditionally on $b_i$ define

$$ Z_i = (Z_{i1}, \dots, Z_{in_i}) \sim \text{Skew-N}_{n_i}(D_i b_i, \Sigma_i, \lambda_i), $$

where the $j$th marginal is

$$ Z_{ij} \sim \text{Skew-N}_1\big((D_i b_i)_j, 1, \lambda^*_{ij}\big), \tag{2.4} $$

where $\lambda^*_{ij}$ is the univariate skewness parameter, which is not equivalent to the components of the skewness vector $\lambda_i = (\lambda_{i1}, \dots, \lambda_{in_i})$; rather, it is derived using a linear transformation of the multivariate response variable, see Chapter
5 of Azzalini (2013) for a detailed review. Note that $(D_i b_i)_j$ is the $j$th element of the vector $D_i b_i$ and $\Sigma_i = \Sigma_i(\xi_i, t_i)$ is a correlation matrix, which has all its diagonal elements equal to 1. Since the random number $F(Y_{ij} \mid \theta_{ij}) \sim \text{uniform}(0,1)$, we link the two marginal distributions of $Z_{ij}$ and $Y_{ij}$ in a way that for each observation $y_{ij}$ we have

$$ z_{ij} = SN_1^{-1}\big[F(y_{ij} \mid \theta_{ij}) \mid (D_i b_i)_j, 1, \lambda^*_{ij}\big], \tag{2.5} $$

and

$$ z_i = (z_{i1}, \dots, z_{in_i}) = \Big( SN_1^{-1}\big[F(y_{i1} \mid \theta_{i1}) \mid \vartheta_{i1}\big], \dots, SN_1^{-1}\big[F(y_{in_i} \mid \theta_{in_i}) \mid \vartheta_{in_i}\big] \Big), $$

where $SN_k$ is a $k$-variate skew-normal distribution function. For presentation simplicity, $\vartheta_{ij} = \{(D_i b_i)_j, 1, \lambda^*_{ij}\}$ in the above equation. By the transformation in (2.5), we attempt to estimate the joint distribution of $Y_i \mid b_i$ using a copula as

$$ F_{n_i}(y_i \mid \theta_i) = SN_{n_i}(z_i \mid D_i b_i, \Sigma_i, \lambda_i). \tag{2.6} $$

The corresponding density is then

$$ f_{n_i}(y_i \mid \theta_i) = sn_{n_i}(z_i \mid D_i b_i, \Sigma_i, \lambda_i) \prod_{j=1}^{n_i} \frac{f(y_{ij} \mid \theta_{ij})}{sn_1\big(z_{ij} \mid (D_i b_i)_j, 1, \lambda^*_{ij}\big)}. \tag{2.7} $$

See Landsman (2009) for a good reference on skew elliptical copulas and Lambert and Vandenhende (2002) for copula-based longitudinal models.

3 Log-likelihood function

Despite the defined copula in (2.6) and (2.7), writing down the complete log-likelihood function is still difficult. The skew-normal density in (2.1) is defined partially by the normal distribution function, noted as $\Phi$. Therefore, we first show that the skew-normal density in (2.1) can be simplified by conditioning on a latent random variable with a half-normal distribution. By Proposition 1 and Corollary 1 of Arellano-Valle et al. (2005), based on a characterization due to Henze (1986), we can rewrite the skew-normal distribution of $Z_i$ as follows:

$$ Z_i \overset{d}{=} D_i b_i + \Sigma_i^{1/2} \delta_i v_i + \Sigma_i^{1/2} (I - \delta_i \delta_i^\top)^{1/2} X_i, $$

where $\overset{d}{=}$ means "distributed as", $v_i \sim HN_1(0, 1)$ (HN = half-normal), $X_i \sim N_{n_i}(0, I)$ and $b_i \sim N_q(0, \Omega_b)$ are independent, and

$$ \delta_i = \frac{\lambda_i}{\sqrt{1 + \lambda_i^\top \lambda_i}}. $$
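The stochastic representation above can be checked numerically in its simplest form. The following is a minimal sketch, assuming the univariate case with unit scale and a fixed location; the function names `delta_from_lambda` and `sample_skew_normal` and the values of `lam` and `mu` are illustrative choices, not taken from the paper. It draws $Z = \mu + \delta |V| + \sqrt{1 - \delta^2}\, X$ with independent standard normal $V$ and $X$, and compares the sample mean with the known skew-normal mean $\mu + \delta\sqrt{2/\pi}$.

```python
import math
import random
from statistics import fmean

def delta_from_lambda(lam):
    """Map the skewness lambda to delta = lambda / sqrt(1 + lambda^2)."""
    return lam / math.sqrt(1.0 + lam * lam)

def sample_skew_normal(mu, lam, rng):
    """One draw of Z = mu + delta*|V| + sqrt(1 - delta^2)*X, with V, X iid N(0, 1)."""
    d = delta_from_lambda(lam)
    v = abs(rng.gauss(0.0, 1.0))      # half-normal latent variable
    x = rng.gauss(0.0, 1.0)           # independent normal component
    return mu + d * v + math.sqrt(1.0 - d * d) * x

rng = random.Random(2017)             # fixed seed so the sketch is reproducible
lam, mu = 1.0, 0.0                    # illustrative values (assumption)
draws = [sample_skew_normal(mu, lam, rng) for _ in range(200_000)]

# Mean of a unit-scale skew-normal: mu + delta * sqrt(2 / pi)
theory_mean = mu + delta_from_lambda(lam) * math.sqrt(2.0 / math.pi)
sample_mean = fmean(draws)
```

Conditioning on the half-normal draw `v` leaves a normal variable with mean `mu + d * v` and variance `1 - d * d`, which is the property the hierarchical form of the likelihood relies on.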
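In other words,

$$ Z_i \mid v_i, b_i \sim N_{n_i}\big(D_i b_i + \Sigma_i^{1/2} \delta_i v_i,\; \Sigma_i^{1/2}(I - \delta_i \delta_i^\top)\Sigma_i^{1/2}\big), \tag{3.1} $$

with $v_i \sim HN_1(0, 1)$ and $b_i \sim N_q(0, \Omega_b)$. Similarly, in the univariate case,

$$ Z_{ij} \mid v_i, b_i \sim N_1\big((D_i b_i)_j + \delta_{ij} v_i,\; 1 - \delta_{ij}^2\big), \tag{3.2} $$

where $v_i \sim HN_1(0, 1)$, $b_i \sim N_q(0, \Omega_b)$ and $\delta_{ij} = \lambda^*_{ij} / \sqrt{1 + \lambda^{*2}_{ij}}$.

The above reparametrization facilitates defining the posterior distribution of $b_i \mid z_i, v_i$, as given by the following proposition.

Proposition 3.1. Given the settings in (3.1), the conditional density function of $b_i \mid z_i, v_i$ is specified by

$$ b_i \mid z_i, v_i \sim N_q\big(\tau_i^2 D_i^\top \Psi_i^{-1}(z_i - \Sigma_i^{1/2} \delta_i v_i),\; \tau_i^2\big), \tag{3.3} $$

where

$$ \tau_i^2 = (\Omega_b^{-1} + D_i^\top \Psi_i^{-1} D_i)^{-1}, \qquad \Psi_i = \Sigma_i^{1/2}(I - \delta_i \delta_i^\top)\Sigma_i^{1/2}. $$

Moreover,

$$ b_i \mid z_i \sim \text{Skew-N}_q\big(\tau_i^2 D_i^\top \Psi_i^{-1} z_i,\; \tau_i^2 + d_i d_i^\top,\; \lambda_{b_i}\big), \tag{3.4} $$

where

$$ d_i = \tau_i^2 D_i^\top \Psi_i^{-1} \Sigma_i^{1/2} \delta_i, \qquad \lambda_{b_i} = \frac{(D_i^\top \Psi_i^{-1} \Sigma_i^{1/2} \delta_i)^\top (\tau_i^2 + d_i d_i^\top)^{1/2}}{\sqrt{1 + d_i^\top (\tau_i^2)^{-1} d_i}}. $$

Note that $\lambda_{b_i}$ in (3.4) is completely specified; therefore, it does not increase the dimension of the estimable vector of parameters. The proof of Proposition 3.1 is essentially based on Bayes' theorem, where

$$ f_{z_i \mid v_i} = \int f_{z_i \mid b_i, v_i}\, f_{b_i}\, db_i = \phi_{n_i}\big(z_i \mid \Sigma_i^{1/2} \delta_i v_i,\; \Psi_i + D_i \Omega_b D_i^\top\big). \tag{3.5} $$

Under general regularity conditions and by (2.7), the complete conditional log-likelihood is

$$ l(\theta \mid y, x, b) = \sum_{i=1}^{m} l_i(\theta_i \mid y_i, x_i, b_i), \tag{3.6} $$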
where, by the hierarchical representation in (3.2) and (3.1),

$$ \begin{aligned} l_i(\theta_i \mid y_i, x_i, b_i) \propto{} & -\tfrac{1}{2} \log |\Psi_i| - \tfrac{1}{2} (z_i - D_i b_i - \Sigma_i^{1/2} \delta_i v_i)^\top \Psi_i^{-1} (z_i - D_i b_i - \Sigma_i^{1/2} \delta_i v_i) \\ & + \tfrac{1}{2} \sum_{j=1}^{n_i} \log(1 - \delta_{ij}^2) + \tfrac{1}{2} \sum_{j=1}^{n_i} \frac{(z_{ij} - (D_i b_i)_j - \delta_{ij} v_i)^2}{1 - \delta_{ij}^2} \\ & + \sum_{j=1}^{n_i} \log f(y_{ij} \mid \theta_{ij}), \end{aligned} \tag{3.7} $$

given that $y = (y_1, \dots, y_m)$, $x = (x_1, \dots, x_m)$, $b = (b_1, \dots, b_m)$, and $|\Psi_i|$ denotes the determinant of $\Psi_i$. The first line is the log of the multivariate conditional normal in (3.1); the second line enters with a positive sign because the univariate conditional normals in (3.2) appear in the denominator of (2.7).

3.1 Autoregressive correlation matrix

To characterize the covariance matrix in a plausible manner, one needs to take into account different sources of random variation within observations. Under the multiple-observations-per-unit setting, these sources generally fall into three categories: measurement error, random effect, and serial correlation. The first source is controlled during the fitting process. The random effect source of variation is accounted for within the model as a random intercept $b_i$. Therefore, we only consider integrating the serial correlation source of variation, and as noted earlier the covariance matrix $\Sigma_i$ presented in (1.3) is modeled as a function of time and a dispersion variable $\xi_i$. Assuming a homogeneous variance within units, $\sigma_i^2$, the correlations amongst each unit's observations $Y_i$ are determined by the autocorrelation function $\rho_i(\cdot)$ as

$$ \mathrm{Cov}(Y_{ij}, Y_{ik}) = \sigma_i^2\, \rho_i(|t_{ij} - t_{ik}|). \tag{3.8} $$

The simplest way to express the serial correlation above is to assume an explicit dependence of the current observation $Y_{ij}$ on previous observations $Y_{i(j-1)}, \dots, Y_{i1}$, which could be modeled using an $n$-th order autoregressive model. For example, consider a first order autoregressive model

$$ y_{ij} = \alpha\, y_{i(j-1)} + \epsilon_{ij}, \qquad \epsilon_{ij} \overset{iid}{\sim} N(0, \zeta). \tag{3.9} $$

Note that it would be difficult to give an explicit interpretation of the $\alpha$ parameter if the measurements are not equally spaced in time or when times of measurements are not common to all units. One way of solving this issue is to implement an exponential autocorrelation function $\rho(\cdot)$, where

$$ \mathrm{Cov}(Y_{ij}, Y_{ik}) = \sigma_i^2\, e^{-\xi_i |t_{ij} - t_{ik}|}. \tag{3.10} $$
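The exponential autocorrelation in (3.10) can be tabulated directly once the variance is factored out. The following is a minimal sketch, assuming a single unit; the function name `exp_corr_matrix` and the values of `xi` and `times` are illustrative assumptions. With equally spaced times the entries reduce to $\alpha^{|j-k|}$ with $\alpha = e^{-\xi}$, which recovers the first order autoregressive pattern, while unequal spacing is handled without any change.

```python
import math

def exp_corr_matrix(xi, times):
    """Correlation matrix with entries exp(-xi * |t_j - t_k|)."""
    return [[math.exp(-xi * abs(tj - tk)) for tk in times] for tj in times]

# Illustrative values (assumption): five unit-spaced measurement times
times = [1, 2, 3, 4, 5]
xi = 0.2
sigma = exp_corr_matrix(xi, times)

# Unequally spaced times need no special treatment
sigma_uneven = exp_corr_matrix(xi, [0.0, 0.5, 2.0])
```

Letting `xi` tend to zero pushes every entry towards 1, which is why a strictly positive lower bound on the dispersion parameter is useful in practice.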
The correlation between two response variables is then

$$ \mathrm{Corr}(Y_{ij}, Y_{ik}) = e^{-\xi_i |t_{ij} - t_{ik}|}. \tag{3.11} $$

This correlation structure is used to construct the correlation coefficient matrix $\Sigma_i = \Sigma_i(\xi_i, t_i)$ in the copula structure and likelihood.

4 Monte Carlo based EM algorithm

The expectation-maximization (EM) algorithm (Dempster et al. (1977)) is an iterative approach for obtaining maximum likelihood estimates. It consists of two steps from which the name is derived: an expectation step (E-step) and a maximization step (M-step). Typically the likelihood of interest involves a set of observed data $x$ and unobserved latent data $u$, where the conditional distribution of $u$ given $x$ is known. At iteration $r$, the E-step computes the expectation of the log-likelihood function with respect to the conditional distribution of $u \mid x, \theta^{(r)}$. The M-step computes a new set of (provisional) parameter estimates $\theta^{(r+1)}$ that maximize the expectation from the E-step. The two steps alternate to find a set of parameters that maximize the likelihood function. Let $l(\theta \mid u, x)$ be the log-likelihood; then, for $r = 1, 2, \dots$ the alternating steps are as follows:

E-step: compute $Q(\theta \mid \theta^{(r)}) = E_{u \mid x, \theta^{(r)}}[l(\theta \mid u, x)]$;
M-step: find $\theta^{(r+1)} = \arg\max_\theta Q(\theta \mid \theta^{(r)})$.

Under certain regularity conditions discussed in Wu (1983), the log-likelihood function converges to a local or global maximum. The earliest detailed explanation and naming of the EM algorithm was published by Dempster et al. (1977), where they generalized earlier attempts by Sundberg (1974), and sketched a convergence analysis for a wider class of problems. Meng and Rubin (1993) studied computational difficulties encountered in the M-step and proposed smaller maximization steps over the parameter space. They argued that instead of maximizing over the whole set of parameters, one can maximize sequentially over a subset of parameters while the other subset is held fixed. Such a modification is called a constrained maximization step (CM).
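A second important advancement to the EM algorithm was proposed by Wei and Tanner (1990), and is called the Monte Carlo (MC) EM algorithm. By applying the law of large numbers to the E-step above, one can approximate $Q(\theta \mid \theta^{(r)})$ as

$$ Q(\theta \mid \theta^{(r)}) \approx R^{-1} \sum_{t=1}^{R} l(\theta \mid u^{(t)}, x), \tag{4.1} $$

where $u^{(1)}, \dots, u^{(R)}$ are draws from the conditional distribution of $u \mid x, \theta^{(r)}$ and $R$ is a relatively large sample size.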
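The Monte Carlo approximation in (4.1) can be illustrated on a toy problem. The following is a minimal sketch under strong simplifying assumptions that are not part of the paper: the complete-data log-likelihood is the quadratic $l(\theta \mid u) = -\tfrac{1}{2}(u - \theta)^2$, and the conditional law of the latent variable is taken to be $N(1.5, 1)$. The names `loglik` and `mc_q` are hypothetical. For this quadratic form the maximizer of the Monte Carlo average is simply the mean of the draws, so the M-step is available in closed form.

```python
import random
from statistics import fmean

def loglik(theta, u):
    """Toy complete-data log-likelihood l(theta | u), up to a constant."""
    return -0.5 * (u - theta) ** 2

def mc_q(theta, draws):
    """Monte Carlo estimate of Q(theta | theta_r): average of l over the draws."""
    return fmean(loglik(theta, u) for u in draws)

rng = random.Random(0)                 # fixed seed for reproducibility
R = 5000                               # Monte Carlo sample size
# Assumed conditional law of the latent variable: u | x, theta_r ~ N(1.5, 1)
draws = [rng.gauss(1.5, 1.0) for _ in range(R)]

# M-step for this quadratic l: the average is maximised at the mean of the draws
theta_next = fmean(draws)
```

In a model without a closed-form maximizer, `mc_q` would instead be passed to a numerical optimizer, and `R` would typically be increased over iterations to reduce Monte Carlo noise near convergence.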
In relation to the results discussed in earlier sections, the unobserved latent random variable is $b_i$, whose conditional distribution $b_i \mid z_i, \theta$ is found to be skew-normal, as shown in Proposition 3.1. Therefore, letting $\theta^{(r)}$ be the vector of parameter estimates in the $r$-th iteration, the two MC-EM steps are as follows.

MC E-step: for the $i$-th unit at the $(r+1)$-th EM iteration,

$$ Q_i(\theta_i \mid \theta_i^{(r)}) = E_{b_i \mid z_i, \theta^{(r)}}\big[l_i(\theta_i \mid x_i, y_i, b_i)\big] \approx R_i^{-1} \sum_{j=1}^{R_i} l_i(\theta_i \mid x_i, y_i, b_i^{(j)}), \tag{4.2} $$

and

$$ Q(\theta \mid \theta^{(r)}) = \sum_{i=1}^{m} Q_i(\theta_i \mid \theta_i^{(r)}), $$

where $b_i^{(j)}$ is the $j$-th draw generated from the distribution of $b_i \mid z_i, \theta^{(r)}$, and $R_i$ is the number of replications for the $i$-th unit.

M-step: solve the score equation

$$ \frac{\partial}{\partial \theta} Q(\theta \mid \theta^{(r)}) = 0. $$

It is important to mention the work of Wu (1983), which outlined a list of conditions ensuring the convergence of the EM algorithm. These include the boundedness of the log-likelihood, compactness of the parameter space, and continuity of the expectation in the E-step with respect to the estimated parameter. The log-likelihood of the proposed model in (3.7) involves a term of the form $\log(|\Psi_i|)$, which could reach infinity and compromise the convergence of the EM algorithm. To follow the conditions of Wu (1983), the MC-EM algorithm used in this paper is started heuristically from several different starting points. Similar heuristic methods were successfully used by Arellano-Valle et al. (2005). The following sections illustrate some numerical and real data results of the proposed model and algorithm.

4.1 An M-step for an exponential response

This subsection derives the likelihood and its partial derivatives when the response variable $Y_{ij} \mid x_{ij}, b_i$ follows an exponential distribution with mean function $\eta_{ij} = \exp(x_{ij} \beta + b_i)$, and density

$$ f(y_{ij} \mid \eta_{ij}) = \eta_{ij}^{-1} \exp(-y_{ij}\, \eta_{ij}^{-1}). $$
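For this exponential density with a log link, the part of the log-likelihood that depends on $\beta$ is $-\sum_j \{y_j e^{-(x_j \beta + b_j)} + x_j \beta + b_j\}$, and its maximizer can be found by Newton's method. The following is a minimal sketch, assuming a scalar $\beta$ and treating the random effects as known (for example, a single Monte Carlo draw); the function names `score_and_info` and `newton_beta` and the data values are illustrative assumptions, not the authors' implementation.

```python
import math

def score_and_info(beta, xs, ys, bs):
    """Score and observed information in beta for exponential margins, log link."""
    score = 0.0
    info = 0.0
    for x, y, b in zip(xs, ys, bs):
        w = y * math.exp(-(x * beta + b))   # y / eta, with eta = exp(x*beta + b)
        score += x * (w - 1.0)              # derivative of the log-likelihood
        info += x * x * w                   # minus the second derivative
    return score, info

def newton_beta(xs, ys, bs, beta=0.0, tol=1e-10, max_iter=100):
    """Newton iterations beta <- beta + score / info until the step is tiny."""
    for _ in range(max_iter):
        score, info = score_and_info(beta, xs, ys, bs)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta

# Illustrative data (assumption): intercept-only design, so x = 1 throughout
xs = [1.0, 1.0, 1.0, 1.0]
ys = [2.0, 3.0, 4.0, 1.5]
bs = [0.5, -0.2, 0.1, 0.0]
beta_hat = newton_beta(xs, ys, bs)

# With x = 1 the score equation has the closed form exp(beta) = mean(y * exp(-b))
closed_form = math.log(sum(y * math.exp(-b) for y, b in zip(ys, bs)) / len(ys))
```

Since $E[y_j / \eta_j] = 1$, the expected value of `info` is $\sum_j x_j^2$, which is why the expected information for $\beta$ does not depend on the response.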
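From (3.7) the unit log-likelihood is

$$ \begin{aligned} l_i(\theta_i \mid y_i, x_i, b_i) \propto{} & -\tfrac{1}{2} \log |\Psi_i| - \tfrac{1}{2} (z_i - b_i - \Sigma_i^{1/2} \delta_i v_i)^\top \Psi_i^{-1} (z_i - b_i - \Sigma_i^{1/2} \delta_i v_i) \\ & + \tfrac{1}{2} \sum_{j=1}^{n_i} \log(1 - \delta_{ij}^2) + \tfrac{1}{2} \sum_{j=1}^{n_i} \frac{(z_{ij} - b_i - \delta_{ij} v_i)^2}{1 - \delta_{ij}^2} \\ & - \sum_{j=1}^{n_i} \big\{ y_{ij}\, e^{-x_{ij}\beta - b_i} + x_{ij}\beta + b_i \big\}, \end{aligned} $$

where the parameters are as defined in (3.7). Therefore, the marginal partial derivatives become

$$ \frac{\partial}{\partial \beta} l_i(\theta_i \mid y_i, x_i, b_i) = \sum_{j=1}^{n_i} x_{ij} \big\{ y_{ij}\, e^{-x_{ij}\beta - b_i} - 1 \big\}, $$

$$ \frac{\partial^2}{\partial \beta^2} l_i(\theta_i \mid y_i, x_i, b_i) = -\sum_{j=1}^{n_i} x_{ij}^2\, y_{ij}\, e^{-x_{ij}\beta - b_i}, $$

$$ I(\beta) = \sum_{i=1}^{m} E\Big( -\frac{\partial^2}{\partial \beta^2} l_i(\theta_i \mid y_i, x_i, b_i) \Big) = \sum_{i=1}^{m} \sum_{j=1}^{n_i} x_{ij}^2, $$

$$ \hat{\beta} = -(X^\top X)^{-1} X^\top \big( \log(D^{-1}(Y)\, \mathbf{I}) + b\, \mathbf{I} \big), $$

where $\mathbf{I} = (1, 1, \dots, 1)^\top$ and $D^{-1}$ is an inverse diagonal matrix. Similar results could be obtained using Gamma marginals with the canonical link function.

5 Simulation design and analysis

To assess the efficiency of the proposed likelihood and model, a univariate and a bivariate model setting are used to infer parameters. Under both settings the number of units is fixed to 200 and the number of observations $n_i$ is fixed to 5 for each unit. To generate the response variable $Y_i \mid b_i$, since the true parameters are known, we first generate the per-unit multivariate skew-normal variable $Z_i \mid b_i$ as in (2.1) with a specified skewness vector $\lambda_i$. Then, we use the inverse of the link between the marginal distributions of $Z_{ij}$ and $Y_{ij}$ defined in (2.5) to generate the per-unit multivariate response $Y_i \mid b_i$. The following two subsections discuss the settings specific to each model.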
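The second generation step, mapping a skew-normal draw to a response through the inverse of the link in (2.5), can be sketched for a single margin. This is a simplified illustration under stated assumptions, not the authors' R implementation: the draws within a unit are generated independently here (the within-unit correlation matrix is ignored), the skew-normal distribution function is obtained by Simpson integration of the density rather than from a library, the margin is exponential with mean `eta`, and the names `sn_pdf`, `sn_cdf`, `response_from_z`, `draw_z` and all parameter values are hypothetical.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()

def sn_pdf(z, lam):
    """Standard skew-normal density 2 * phi(z) * Phi(lam * z)."""
    return 2.0 * STD_NORMAL.pdf(z) * STD_NORMAL.cdf(lam * z)

def sn_cdf(z, lam, lower=-10.0, n=1000):
    """Standard skew-normal cdf via composite Simpson integration (n even)."""
    if z <= lower:
        return 0.0
    h = (z - lower) / n
    total = sn_pdf(lower, lam) + sn_pdf(z, lam)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * sn_pdf(lower + k * h, lam)
    return total * h / 3.0

def response_from_z(z, loc, lam, eta):
    """y = F^{-1}(SN_1(z | loc, 1, lam)) for an exponential F with mean eta."""
    u = sn_cdf(z - loc, lam)
    u = min(max(u, 1e-12), 1.0 - 1e-12)    # keep the quantile finite
    return -eta * math.log(1.0 - u)        # exponential quantile function

def draw_z(loc, lam, rng):
    """Univariate skew-normal draw via the half-normal representation."""
    delta = lam / math.sqrt(1.0 + lam * lam)
    v = abs(rng.gauss(0.0, 1.0))
    return loc + delta * v + math.sqrt(1.0 - delta * delta) * rng.gauss(0.0, 1.0)

rng = random.Random(1)
alpha, b, lam = 3.0, 0.4, 1.0              # illustrative values (assumption)
eta = math.exp(alpha + b)                  # log link for the exponential mean
z_unit = [draw_z(b, lam, rng) for _ in range(5)]
y_unit = [response_from_z(z, b, lam, eta) for z in z_unit]
```

For `lam = 1` the standard skew-normal distribution function equals the square of the standard normal one, which gives a convenient check on the numerical integration.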
5.1 Univariate model

Here we use a model with a fixed intercept $\alpha$ and a univariate random effect,

$$ Y_i \mid b_i \sim F_{n_i}\big(\eta(\alpha + b_i), \Sigma_i(\xi_i)\big), \tag{5.1} $$

where $F_{n_i}$ is a multivariate distribution from the exponential family with link function $\eta$, as in Section 4.1. The fixed and random effects coefficients are set as $\alpha + b_i \sim N_1(3, 2)$ such that $E[\alpha + b_i] = 3$. The time difference per observation within each unit is set to a unit difference, that is, the elements of $\Sigma_i(\xi_i)$ are

$$ \begin{cases} e^{-\xi_i |t_{ij} - t_{ik}|} = e^{-\xi_i |j - k|} & \text{if } j \neq k, \\ 1 & \text{if } j = k, \end{cases} \tag{5.2} $$

where $\xi_i = \xi = 0.2$. Finally, since we first simulate the skew-normal variable $Z_i \mid b_i$ to get the response $Y_i \mid b_i$, we set the skewness vector $\lambda_i = (1, \dots, 1)$.

5.2 Bivariate model

This model investigates the convergence under extra variables, binary and categorical, which in some cases could represent a measurement deviation caused by certain events. We use a model structure similar to the one in Section 6 of Arellano-Valle et al. (2005),

$$ Y_i \mid b_i \sim F_{n_i}\big(\eta(\alpha + t_{ij} \beta_1 + \zeta_{ij} \beta_2 + b_i), \Sigma_i(\xi_i)\big), \tag{5.3} $$

where $\beta_1 = 2$, $\beta_2 = 1$ and $t_j = j - 3$ for $j = 1, \dots, 5$. A categorical variable $\zeta_{ij} = 1$ for $i \le 100$ and $\zeta_{ij} = 0$ otherwise. Similar to the univariate settings, we let $\alpha + b_i \sim N_1(1, 4)$ such that $E[\alpha + b_i] = 1$ and $Var[\alpha + b_i] = 4$. The time difference per observation within each unit is set to a unit difference as in (5.2), where $\xi_i = \xi = 0.2$, and the skewness vector is $\lambda_i = (1, \dots, 1)$.

For each of the 100 simulations, we set the initial estimates to $\beta^{(0)} = 1$, $\lambda^{(0)} = 0.5$, $Var[\alpha + b] = 1$ and $\xi^{(0)} = 0.1$. Using the Monte Carlo EM algorithm, in each iteration we sample from $b_i^{(k)} \mid Z_i$, starting with 50 samples per unit and gradually increasing until convergence.

5.3 Exponential and gamma distributed response

This section illustrates simulation results of the proposed copula-driven GLMM using the derived likelihood and the proposed MC-EM algorithm, and compares it numerically to the ordinary normal copula, where the skewness vector $\lambda$ is set to 0. The final missing piece of the likelihood in (3.7) is the specification of the marginal distribution of the response variable.
Here we assume a response variable first from the exponential and then the gamma distribution with a log-link
function. For each simulation, 100 Monte Carlo data sets are generated under the univariate and bivariate settings discussed in the previous subsections. Tables 1 and 2 show the parameter estimates of the skew-normal copula on the left, and the normal copula on the right, using exponential marginals, under the univariate and bivariate settings respectively. MC Mean and MC SD represent the Monte Carlo mean and standard deviation. MSE is the mean squared error between the Monte Carlo estimates and the true value of the parameter. EC represents the empirical coverage probability computed using the Fisher information matrix, assuming a 95% confidence interval. Note that in the bivariate model we calculate the EC for $\beta_1$ and $\beta_2$ using a 95% elliptical confidence interval. $\bar{\lambda}$ is the average skewness. Figures 1 and 2 depict the convergence approximation graphically, under both models respectively, for the skew-normal copula.

Figure 1 (panels: (a) a single replication; (b) 50 MC replications; vertical axis: density): Univariate settings with exponential marginals: the true and estimated density of the response variable Y on the log scale, in bold and dotted lines respectively.
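The summary columns reported in the simulation tables can be computed from the replicate estimates as follows. This is a minimal sketch under assumptions not spelled out in the text: MSE is taken as the mean squared difference between each replicate estimate and the true value, and EC as the share of replicates whose Wald interval (estimate plus or minus 1.96 standard errors) contains the true value. The function name `mc_summary` and the input numbers are purely illustrative and are not results from the paper.

```python
from statistics import fmean, stdev

def mc_summary(estimates, std_errors, true_value, z=1.959964):
    """Monte Carlo mean, SD, MSE and empirical coverage of Wald intervals."""
    mc_mean = fmean(estimates)
    mc_sd = stdev(estimates)
    mse = fmean((e - true_value) ** 2 for e in estimates)
    covered = [abs(e - true_value) <= z * se
               for e, se in zip(estimates, std_errors)]
    ec = sum(covered) / len(covered)
    return {"mc_mean": mc_mean, "mc_sd": mc_sd, "mse": mse, "ec": ec}

# Made-up replicate estimates and standard errors, for illustration only
summary = mc_summary([2.9, 3.1, 3.0, 3.4], [0.1, 0.1, 0.1, 0.1], 3.0)
```

The elliptical interval used for the pair of slope coefficients would replace the scalar check with a quadratic form in the joint information matrix; that variant is not shown here.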
Table 1: Parameter estimation under the univariate settings with exponential marginals.
Columns: Parameters | True value | Skew-normal copula: MC Mean, MC SD, MSE, EC | Normal copula: MC Mean, MC SD, MSE, EC
Rows: $\alpha$; $E[\alpha + b]$; $Var[\alpha + b]$; $\xi$; $\bar{\lambda}$

Figure 2 (panels: (a) a single replication; (b) 100 MC replications; vertical axis: density): Bivariate settings with exponential marginals: the true and estimated density of the response variable Y on the log scale, in bold and dotted lines respectively.

Table 2: Parameter estimation under the bivariate settings with exponential marginals.
Columns: Parameters | True value | Skew-normal copula: MC Mean, MC SD, MSE, EC | Normal copula: MC Mean, MC SD, MSE, EC
Rows: $\alpha$; $\beta_1$; $\beta_2$; $E[\alpha + b]$; $Var[\alpha + b]$; $\xi$; $\bar{\lambda}$
Similarly, Figure 3 and Table 3 show the simulation results of the bivariate model, while assuming gamma marginals. Table 3 also shows the estimated parameters when using the normal copula instead. The shape parameter of the gamma marginal is fixed to $k = 3$ and a log-link function is used.

Figure 3 (panels: (a) a single replication; (b) 100 MC replications; vertical axis: density): Bivariate settings with gamma marginals: the true and estimated density of the response variable Y on the log scale, in bold and dotted lines respectively.

Table 3: Parameter estimation under the bivariate settings with gamma marginals.
Columns: Parameters | True value | Skew-normal copula: MC Mean, MC SD, MSE, EC | Normal copula: MC Mean, MC SD, MSE, EC
Rows: $\alpha$; $\beta_1$; $\beta_2$; $E[\alpha + b]$; $Var[\alpha + b]$; $\xi$; $\bar{\lambda}$

The results presented above suggest good inference results for the proposed model, since we are able to estimate the fixed parameters, the first and second moments of the random effects, and to some degree the autoregressive coefficient $\xi$. Nevertheless, we intentionally fixed the number of observations per
unit to 5, since it allows the use of a uniform $\lambda$ vector and an autoregressive parameter $\xi$ for all units. In this sense, we can estimate the uniform parameters by drawing information from all observations. The reduction of the number of parameters is critical, since otherwise one has more parameters than observations. In our examples, using uniform autoregressive and skewness parameters, we only needed to estimate those shared parameters, while in general we have $m + 5m$ parameters.

When the normal copula is used, the estimation results for the fixed effects parameters are largely similar to those of the proposed model. This result is evident from (2.7), since the choice of the copula is independent of the likelihood of the marginals. On the other hand, the estimation results for the random effect show systematic bias when compared to the results of the skew-normal model. This estimation bias arises from the fact that the skew-normal mean includes the skewness coefficient in its structure, thus it relates directly to the conditional distribution of $b_i \mid z_i, v_i$, as seen in Proposition 3.1. In the case of the correlation parameter $\xi$, the results are comparable, with smaller differences in the bivariate setting and somewhat larger differences in the univariate setting, arguably due to the heavier influence of the random effects on the likelihood in the latter.

It is worth mentioning that the product form of the density in (2.7) allowed the likelihood in (3.7) to be decomposed into three main parts. This in turn reduced the estimation of the fixed effects coefficient $\beta$ to the maximum-likelihood estimate obtained when assuming independent marginal densities. One is then able to compute the information matrix analytically, or to use methods as in Louis (1982) to obtain the observed information matrix. In this section we presented examples where the information matrix is readily available.
Nevertheless, we find it much more complex to calculate the observed information matrix for the dispersion $\xi$ and skewness $\lambda$ variables, since it requires differentiating the autoregressive correlation $\Sigma$ in (3.7) for the former and $\Psi$ for the latter. As a result, the coverage probabilities for both in the tables above are left blank.

The simulation was implemented in R using mainly the packages sn and mnormt, which are both maintained by Adelchi Azzalini. The sn package was used to sample from the skew-normal distribution and fit the skew-normal parameters, mainly $\tilde{\Sigma}$ and $\hat{\lambda}$. Consequently, we estimate the dispersion parameter $\xi$ by minimizing the $L_2$ norm between the empirical estimate $\tilde{\Sigma}$ and the correlation matrix $\Sigma(\xi)$ constructed using (5.2), as

$$ \hat{\xi} = \arg\min_{\xi > 0} \big\{ \| \tilde{\Sigma} - \Sigma(\xi) \|_2 \big\}. $$

With respect to $\hat{\xi}$ we then realign $\tilde{\Sigma}$ to $\Sigma(\hat{\xi})$. Likewise, one could also use the general-purpose optimization function optim with the L-BFGS-B method, with a lower bound of $\tau > 0$, less than an upper bound of $\max\{\delta^\top \delta\}$, to avoid singularities in computing the inverse of the matrix $\Psi$ in (3.1) and (3.3). Note
that, depending on the time measurement of observations $t_i$, the lower bound $\tau$ cannot be very small, otherwise one will arrive at an all-ones matrix $\Sigma$.

6 An application

As an illustration, we apply our methodology to the famous Framingham Heart Study, which consists of longitudinal data for a wide set of cohorts. This data has been analyzed earlier in Zhang and Davidian (2001) and Arellano-Valle et al. (2005). The primary objective is to model the change of cholesterol levels over time within patients. The data provides cholesterol levels of 200 randomly selected patients, measured at the beginning of the study and every two years for a total of 10 years. However, we only use the first 3 observations per patient since it is the minimum number of visits seen in the data. The gender and age of those patients are also available. Since the normal linear mixed model analyzed by Zhang and Davidian (2001) is a particular case of GLMM, we apply our methodology to a simpler mixed model under a more general distributional (copula based) setup. In view of the model proposed in Section 5, we consider the following model

$$ Y_i \sim F_{n_i}\big(\alpha + \beta_1\, \text{sex}_i + \beta_2\, \text{age}_i + \beta_3\, t_i + b_i,\; \Sigma_i(\xi, t_i)\big), \tag{6.1} $$

where the $j$th component $y_{ij}$ of $Y_i$ is the cholesterol level at the $j$th time point for unit $i$ (the observations are normalized by 100), $t_{ij} = (\text{time} - 5)/10$ (time measured in years), $b_i$ is the unit specific random effect as in (3.2), and the correlation coefficients are defined as

$$ \mathrm{Corr}(Y_{ij}, Y_{ik}) = e^{-\xi |t_{ij} - t_{i}|}, \tag{6.2} $$

where $t_i$ is the time of the first visit. As in (2.5), the modeling is performed with gamma marginals and a log-link function. Figure 4a represents a histogram of cholesterol levels of the 200 randomly selected patients, where dotted lines are the fitted model under the proposed settings. Figure 4b shows the same histogram with 100 MC replications of $b$.
Figure 4 (panels: (a) a single replication; (b) 100 MC replications; horizontal axis: cholesterol levels): Fitting of Framingham Heart Study cholesterol data with model (6.1) using gamma marginals with a log-link function. The shape parameter is set to $k = 3$. The solid lines are the fitted model, while the histogram shows the frequency distribution of cholesterol levels.

Figure 5a represents the densities of the centralized observed skew-normal variable resulting from each of the 100 MC-EM runs, where the high positive skewness is evident. Figure 5b shows the density of the average centralized skew-normal variable in solid, versus the density of a zero location skew-normal generated using the fitted parameters.
Figure 5 (panels: (a) observed skew-normal densities; (b) average observed skew-normal versus fitted; vertical axis: density): The left panel shows the densities of the centralized observed skew-normal from each MC-EM run. In the right panel, the bold line is the average density of the results on the left, while the dotted line is the density of a zero location skew-normal given the estimated parameters.

Table 4 presents the parameter estimates and standard errors, which are calculated as $SE(\theta_{MLE}) = I(\theta_{MLE})^{-1/2}$, where $I$ is the Fisher information coefficient of the maximum likelihood estimate of parameter $\theta$. From the table, the estimated value of the correlation coefficient ($\xi$) is close to zero. This does not automatically imply that the proposed autoregressive correlation structure is inadequate, because the normalization of the time variable $t$ affects the magnitude of $\xi$. To see this better, the off-diagonal elements of the estimated correlation matrix $\Sigma(\hat{\xi})$ suggest a strong autoregressive structure in the data despite the low value of $\hat{\xi}$.

Moreover, the $\beta_2$ and $\beta_3$ estimates are close to zero, suggesting that patients' age and time of observation are not predictors of cholesterol levels. Both $\beta_1$ and $Var[\alpha + b]$ seem relatively significant, emphasizing the importance of the patients' gender and the random effects coefficient. The average skewness variable $\bar{\lambda}$ suggests a highly skewed copula, as also indicated in Figure 5a. Nevertheless, given the number of observations, the model has many variables to estimate, which dampens the estimation accuracy. In this case, we are estimating 9 coefficients for around 200 observations.
Table 4: Fitting of Framingham Heart Study cholesterol data with model (6.1) using gamma marginals with a log-link function, the shape parameter $k = 3$.
Columns: Parameters | Estimate | SE
Rows: $\alpha$; $\beta_1$; $\beta_2$; $\beta_3$; $E[\alpha + b]$; $Var[\alpha + b]$; $\xi$; $\bar{\lambda}$; log-likelihood; AIC; BIC

Arellano-Valle et al. (2005) fitted the Framingham Heart Study cholesterol data under a mixture of Gaussian and skew-normal distributions for the random effects and residuals. In their model they used a bivariate random effect, while the model presented in (6.1) uses a univariate random effect. Moreover, Arellano-Valle et al. (2005) used a linear mixed model formulation which differs from the copula formulation used here. Because of these differences, the average mean squared error of Arellano-Valle et al. (2005) surpasses the fit of the proposed model. Nonetheless, we believe the copula formulation allows more flexibility in modelling the response variable given a robust estimation procedure. In addition, this is the first step towards estimating mixed models via a skew-normal copula, and future research is required to determine better fits and, most importantly, to integrate a random effects design matrix and improve the estimation of the skewness and autoregressive variables.

7 Discussion and future work

The current investigation is based on the development of a copula-driven GLMM, where the focus was on modeling the marginals in lieu of the joint distribution. Oftentimes marginal distributions from the exponential family do not necessarily lead to a multivariate distribution of the same form. Nonetheless, we feel that copula based general multivariate distributions may be of more interest to applied statisticians. Our proposal is intended to illustrate such typical situations. In regard to the methodology, the MCEM seems to be appealing, though computationally expensive. We feel that the estimation accuracy of the proposed model is pegged to the theoretical limitations of the EM algorithm, especially in large dimensions. Oftentimes, the MCEM algorithm converged to local maxima, and we feel that a post-EM optimization procedure, such as gradient
descent, might improve the fit. One can also reduce the computational burden to some extent by adopting MCMC in the Bayesian paradigm. In our subsequent investigations, we are planning to work with a Bayesian paradigm in a broader setup. More importantly, we are planning to integrate a design matrix for the random effects to extend it beyond the univariate case. To improve the accuracy, we are attempting different optimization techniques. For computational convenience, an autoregressive structure was used to model the correlation matrix, which is not always applicable in real data; thus, we are planning to investigate more flexible correlation models.

Acknowledgments: We would like to acknowledge the Associate Editor and all reviewers for their valuable comments.

References

R. Arellano-Valle and M.G. Genton. Fundamental skew distributions. Journal of Multivariate Analysis, 96:93-116, 2005.
R. Arellano-Valle, H. Bolfarine, and V. Lachos. Skew-normal linear mixed models. Journal of Data Science, 3, 2005.
A. Azzalini. A class of distributions which includes the normal ones. Scandinavian Journal of Statistics, 12, 1985.
A. Azzalini and A. Dalle-Valle. The multivariate skew-normal distribution. Biometrika, 83, 1996.
Adelchi Azzalini. The skew-normal and related families, volume 3. Cambridge University Press, 2013.
A.P. Dempster, N.M. Laird, and D.B. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, 39(1):1-38, 1977.
R.A. Fisher. The correlation between relatives on the supposition of Mendelian inheritance. Transactions of the Royal Society of Edinburgh, 52, 1918.
N. Henze. A probabilistic representation of the skew-normal distribution. Scandinavian Journal of Statistics, 13(4), 1986.
Meelis Käärik, Anne Selart, and Ene Käärik. On parametrization of multivariate skew-normal distribution. Communications in Statistics-Theory and Methods, 44(9), 2015.
P. Lambert and F. Vandenhende. A copula-based model for multivariate non-normal longitudinal data: analysis of a dose titration safety study on a new antidepressant. Statistics in Medicine, 21(21).
Z. Landsman. Elliptical families and copulas: tilting and premium; capital allocation. Scandinavian Actuarial Journal, 2009(2):85-103.
Thomas A. Louis. Finding the observed information matrix when using the EM algorithm. Journal of the Royal Statistical Society. Series B (Methodological).
X.L. Meng and D.B. Rubin. Maximum likelihood estimation via the ECM algorithm: A general framework. Biometrika, 80:267-278.
C. Meza, F. Osorio, and R. De la Cruz. Estimation in nonlinear mixed-effects models using heavy-tailed distributions. Statistics and Computing, 22(1).
M.A. Newton and Y. Zhang. A recursive algorithm for non-parametric analysis with missing data. Biometrika.
K.J.F. Petersson, E. Hanze, R.M. Savic, and M.O. Karlsson. Semiparametric distributions with estimated shape parameters. Pharmaceutical Research, 26(9).
R. Sundberg. Maximum likelihood theory for incomplete data from an exponential family. Scandinavian Journal of Statistics, 1(2).
H. Tao, M. Palta, B.S. Yandell, and M.A. Newton. An estimation method for the semi-parametric mixed effects model. Biometrics, 55.
G. Verbeke and E. Lesaffre. A linear mixed-effects model with heterogeneity in the random-effects population. Journal of the American Statistical Association, 91.
G.C.G. Wei and M.A. Tanner. A Monte Carlo implementation of the EM algorithm and the poor man's data augmentation algorithms. Journal of the American Statistical Association, 85.
C.F.J. Wu. On the convergence properties of the EM algorithm. The Annals of Statistics, 11(1):95-103.
J. Wu, X. Wang, and S.G. Walker. Bayesian nonparametric inference for a multivariate copula function. Methodology and Computing in Applied Probability, 16(3).
D. Zhang and M. Davidian. Linear mixed models with flexible distributions of random effects for longitudinal data. Biometrics, 57, 2001.
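Supplementary illustration: the discussion notes that an autoregressive structure was used for the copula correlation matrix, and Table 4 reports AIC and BIC. The minimal Python sketch below shows what these quantities look like in their standard textbook form. It is not the authors' implementation: the function names, the parameter name `rho`, and the example numbers are hypothetical, and it is an assumption that the paper uses the first-order AR(1) form with entries rho^|j-k| and the usual definitions AIC = 2p - 2 log L and BIC = p log(n) - 2 log L.

```python
import math


def ar1_correlation(n, rho):
    """Return the n x n AR(1) correlation matrix with entries rho ** |j - k|.

    `rho` is a hypothetical name for the autoregressive parameter.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    return [[rho ** abs(j - k) for k in range(n)] for j in range(n)]


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2 * n_params - 2 * log-likelihood."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: n_params * log(n_obs) - 2 * log-likelihood."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    for row in ar1_correlation(3, 0.5):
        print(row)
    print(aic(-100.0, 4), bic(-100.0, 4, 50))
```

A single parameter controls the whole matrix, which is what makes the AR(1) structure computationally convenient; the same property is why it may be too rigid for real data, as the discussion points out.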