FINITE MIXTURE MODELLING USING THE SKEW NORMAL DISTRIBUTION
Statistica Sinica 17 (2007), 909-927

Tsung I. Lin^1, Jack C. Lee^2 and Shu Y. Yen^2

^1 National Chung Hsing University and ^2 National Chiao Tung University

Abstract: Normal mixture models provide the most popular framework for modelling heterogeneity in a population with continuous outcomes arising in a variety of subclasses. In the last two decades, the skew normal distribution has been shown to be beneficial in dealing with asymmetric data in various theoretic and applied problems. In this article, we address the problem of analyzing a mixture of skew normal distributions from the likelihood-based and Bayesian perspectives, respectively. Computational techniques using EM-type algorithms are employed for iteratively computing maximum likelihood estimates. Also, a fully Bayesian approach using the Markov chain Monte Carlo method is developed to carry out posterior analyses. Numerical results are illustrated through two examples.

Key words and phrases: ECM algorithm, ECME algorithm, Fisher information, Markov chain Monte Carlo, maximum likelihood estimation, skew normal mixtures.

1. Introduction

Finite mixture models have been broadly developed and widely applied to classification, clustering, density estimation and pattern recognition problems, as shown by Titterington, Smith and Makov (1985), McLachlan and Basford (1988), McLachlan and Peel (2000), and the references therein. With the growing advances of computational methods, especially the development of Markov chain Monte Carlo (MCMC) techniques, many works are also devoted to Bayesian mixture modelling issues, including Diebolt and Robert (1994), Escobar and West (1995), Richardson and Green (1997) and Stephens (2000), among others.

In many applied problems, the shapes of fitted normal mixture components may be distorted, and inferences can be misleading when the data involve highly asymmetric observations. In particular, the normal mixture (NORMIX) model tends to overfit when additional components are included to capture the skewness.
Sometimes, increasing the number of pseudo-components may lead to difficulties and inefficiencies in computations. Instead, we consider using the skew normal distributions proposed by Azzalini (1985) as component densities to overcome the potential weakness of normal mixtures. The skew normal distribution is a new class of density functions dependent on an additional shape parameter, and includes the normal density as a special case. It provides a more flexible approach to the fitting of asymmetric observations and uses fewer components in the fitting of mixture models. A comprehensive coverage of the fundamental theory and new developments for skew-elliptical distributions is given by Genton (2004).

It is not easy to deal with computational aspects of parameter estimation for the fitting of skew normal mixture (SNMIX) models. For simplicity, we treat the number of components as known and describe how to employ EM-type algorithms for finding the maximum likelihood (ML) estimates. In addition, Bayesian sampling methods for SNMIX are considered as an alternative modelling strategy. Priors and hyperparameters are chosen as weakly informative to avoid nonidentifiability problems in the mixture context.

The rest of the paper unfolds as follows. Section 2 briefly outlines some preliminaries of the skew normal distribution. Azzalini and Capitanio (1999) pointed out that the ML estimates might be improved by a few EM iterations, but detailed expressions of the EM algorithm are not available in the literature. We thus show how to compute the ML estimates for the skew normal distribution using two EM-type algorithms. In Section 3 we show a hierarchical representation for the SNMIX model by incorporating two latent variables. Based on the model, we also derive the corresponding EM-type algorithms for ML estimation. Meanwhile, the information-based standard errors are also presented. In Section 4, we develop the MCMC sampling algorithm used in simulating posterior distributions to carry out Bayesian inferences. In Section 5, two examples are given, and in Section 6 we provide some concluding remarks.

2. The Skew Normal Distribution

2.1. Preliminaries

As developed by Azzalini (1985, 1986), a random variable $Y$ follows a univariate skew normal distribution with location parameter $\xi$, scale parameter $\sigma^2$ and skewness parameter $\lambda \in \mathbb{R}$ if it has the density

$$\psi(y \mid \xi, \sigma^2, \lambda) = \frac{2}{\sigma}\, \phi\!\left(\frac{y-\xi}{\sigma}\right) \Phi\!\left(\lambda\, \frac{y-\xi}{\sigma}\right), \tag{1}$$

where $\phi(\cdot)$ and $\Phi(\cdot)$ denote the standard normal density function and cumulative distribution function, respectively; then, for brevity, we say that $Y \sim SN(\xi, \sigma^2, \lambda)$. Note that if $\lambda = 0$, the density of $Y$ reduces to the $N(\xi, \sigma^2)$ density.

Lemma 1. If $Y \sim SN(\xi, \sigma^2, \lambda)$ and $X \sim N(\xi, \sigma^2/(1+\lambda^2))$, we have

(a) $E(X^{n+1}) = \xi E(X^n) + [\sigma^2/(1+\lambda^2)]\,[dE(X^n)/d\xi]$.
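(b) $E(Y^{n+1}) = \xi E(Y^n) + \sigma^2\,[dE(Y^n)/d\xi] + \sqrt{2/\pi}\,\delta(\lambda)\,\sigma\, E(X^n)$.

(c) $E\{Y - E(Y)\}^{n+1} = \sigma^2\,[dE\{Y - E(Y)\}^n/d\xi] + n\sigma^2 E\{Y - E(Y)\}^{n-1} - \{E(Y) - \xi\}\, E\{Y - E(Y)\}^n + \sqrt{2/\pi}\,\delta(\lambda)\,\sigma\, E\{X - E(Y)\}^n$.

Lemma 1 provides a simple way of obtaining higher moments without using the moment generating function. With some basic algebraic manipulations, we can easily obtain

$$E(Y) = \xi + \sqrt{\frac{2}{\pi}}\,\delta(\lambda)\,\sigma, \qquad \mathrm{var}(Y) = \left\{1 - \frac{2}{\pi}\,\delta^2(\lambda)\right\}\sigma^2,$$

$$\gamma_Y = \frac{\sqrt{2}\,(4-\pi)\,\lambda^3}{\{\pi + (\pi-2)\lambda^2\}^{3/2}}, \qquad \kappa_Y = 3 + \frac{8(\pi-3)\,\lambda^4}{\{\pi + (\pi-2)\lambda^2\}^{2}}, \tag{2}$$

where $\delta(\lambda) = \lambda/\sqrt{1+\lambda^2}$, and $\gamma_Y$ and $\kappa_Y$ are the measures of skewness and kurtosis, respectively. It is easily shown that $\gamma_Y$ is in $(-0.9953, 0.9953)$ and $\kappa_Y$ is in $(3, 3.8692)$.

Henze (1986) showed that the odd moments of the standard skew normal variable $Z = (Y - \xi)/\sigma$ have the expression

$$E(Z^{2k+1}) = \sqrt{\frac{2}{\pi}}\,\lambda\,(1+\lambda^2)^{-(k+0.5)}\, 2^{-k}\,(2k+1)! \sum_{j=0}^{k} \frac{j!\,(2\lambda)^{2j}}{(2j+1)!\,(k-j)!},$$

while the even moments coincide with those of the standard normal, as $Z^2 \sim \chi^2_1$ (Roberts and Geisser (1966)).

From (2), Arnold, Beaver, Groeneveld and Meeker (1993) showed the following method of moments estimators:

$$\tilde\xi = m_1 - a_1\left(\frac{m_3}{b_1}\right)^{1/3}, \qquad \tilde\sigma^2 = m_2 + a_1^2\left(\frac{m_3}{b_1}\right)^{2/3}, \qquad \tilde\delta(\lambda) = \left\{a_1^2 + m_2\left(\frac{b_1}{m_3}\right)^{2/3}\right\}^{-1/2}, \tag{3}$$

where $a_1 = \sqrt{2/\pi}$, $b_1 = (4/\pi - 1)\,a_1$, $m_1 = n^{-1}\sum_{j=1}^{n} Y_j$, $m_2 = (n-1)^{-1}\sum_{j=1}^{n}(Y_j - \bar Y)^2$, and $m_3 = (n-1)^{-1}\sum_{j=1}^{n}(Y_j - \bar Y)^3$.

2.2. Parameter estimation using EM-type algorithms

In this subsection, we show how to exploit two extensions of the EM algorithm (Dempster, Laird and Rubin (1977)), the ECM algorithm (Meng and Rubin (1993)) and the ECME algorithm (Liu and Rubin (1994)), for ML estimation of the skew normal distribution. A key feature of these two EM-type algorithms is that they preserve the stability of the EM algorithm with their monotone convergence.

In order to represent the skew normal model in an incomplete data framework, we extend the result of Azzalini (1986, p.201) and Henze (1986, Thm. 1) to show that if $Y_j \sim SN(\xi, \sigma^2, \lambda)$, then

$$Y_j = \xi + \delta(\lambda)\,\tau_j + \sqrt{1 - \delta^2(\lambda)}\; U_j, \tag{4}$$

with $\tau_j \sim TN(0, \sigma^2)\,I\{\tau_j > 0\}$ and $U_j \sim N(0, \sigma^2)$, where $\tau_j$ and $U_j$ are independent, $TN(\cdot,\cdot)$ denotes the truncated normal distribution, and $I\{\cdot\}$ represents an indicator function.

Letting $Y = (Y_1, \ldots, Y_n)$ and $\tau = (\tau_1, \ldots, \tau_n)$, the complete-data log-likelihood of $\theta = (\xi, \sigma^2, \lambda)$ given $(Y, \tau)$, after omitting additive constants, is

$$\ell_c(\theta) = -n\log\sigma^2 - \frac{n}{2}\log\{1 - \delta^2(\lambda)\} - \frac{\sum_{j=1}^{n}\tau_j^2 - 2\delta(\lambda)\sum_{j=1}^{n}\tau_j(y_j - \xi) + \sum_{j=1}^{n}(y_j - \xi)^2}{2\sigma^2\{1 - \delta^2(\lambda)\}}. \tag{5}$$

Obviously, the posterior distribution of $\tau_j$ is

$$\tau_j \mid Y_j = y_j \sim TN(\mu_{\tau_j}, \sigma_\tau^2)\,I\{\tau_j > 0\}, \tag{6}$$

where $\mu_{\tau_j} = \delta(\lambda)(y_j - \xi)$ and $\sigma_\tau = \sigma\sqrt{1 - \delta^2(\lambda)}$.

Lemma 2. Let $X \sim TN(\mu, \sigma^2)\,I\{a_1 < x < a_2\}$ be a truncated normal distribution with the density

$$f(x \mid \mu, \sigma^2) = \{\Phi(\alpha_2) - \Phi(\alpha_1)\}^{-1}\,\frac{1}{\sqrt{2\pi}\,\sigma}\exp\left\{-\frac{1}{2\sigma^2}(x - \mu)^2\right\}, \qquad a_1 < x < a_2,$$

where $\alpha_i = (a_i - \mu)/\sigma$, $i = 1, 2$. Then

(a) $E(X) = \mu - \sigma\,\dfrac{\phi(\alpha_2) - \phi(\alpha_1)}{\Phi(\alpha_2) - \Phi(\alpha_1)}$.

(b) $E(X^2) = \mu^2 + \sigma^2 - \sigma^2\,\dfrac{\alpha_2\phi(\alpha_2) - \alpha_1\phi(\alpha_1)}{\Phi(\alpha_2) - \Phi(\alpha_1)} - 2\mu\sigma\,\dfrac{\phi(\alpha_2) - \phi(\alpha_1)}{\Phi(\alpha_2) - \Phi(\alpha_1)}$.

By Lemma 2, we have

$$E(\tau_j \mid y_j) = \mu_{\tau_j} + \frac{\phi(\mu_{\tau_j}/\sigma_\tau)}{\Phi(\mu_{\tau_j}/\sigma_\tau)}\,\sigma_\tau \qquad \text{and} \qquad E(\tau_j^2 \mid y_j) = \mu_{\tau_j}^2 + \sigma_\tau^2 + \frac{\phi(\mu_{\tau_j}/\sigma_\tau)}{\Phi(\mu_{\tau_j}/\sigma_\tau)}\,\mu_{\tau_j}\sigma_\tau.$$

The ECM algorithm is as follows.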
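E-step: Calculating the conditional expectation of (5) at the $k$th iteration yields

$$\hat s_{1j}^{(k)} = E_{\hat\theta^{(k)}}(\tau_j \mid y_j) = \hat\mu_{\tau_j}^{(k)} + \frac{\phi\{\hat\lambda^{(k)}(y_j - \hat\xi^{(k)})/\hat\sigma^{(k)}\}}{\Phi\{\hat\lambda^{(k)}(y_j - \hat\xi^{(k)})/\hat\sigma^{(k)}\}}\,\hat\sigma_\tau^{(k)},$$

$$\hat s_{2j}^{(k)} = E_{\hat\theta^{(k)}}(\tau_j^2 \mid y_j) = \hat\mu_{\tau_j}^{(k)2} + \hat\sigma_\tau^{(k)2} + \frac{\phi\{\hat\lambda^{(k)}(y_j - \hat\xi^{(k)})/\hat\sigma^{(k)}\}}{\Phi\{\hat\lambda^{(k)}(y_j - \hat\xi^{(k)})/\hat\sigma^{(k)}\}}\,\hat\mu_{\tau_j}^{(k)}\hat\sigma_\tau^{(k)},$$

where $\hat\mu_{\tau_j}^{(k)}$ and $\hat\sigma_\tau^{(k)}$ are $\mu_{\tau_j}$ and $\sigma_\tau$ in (6) with $\xi$, $\sigma$ and $\lambda$ replaced by $\hat\xi^{(k)}$, $\hat\sigma^{(k)}$ and $\hat\lambda^{(k)}$, respectively.

CM-steps:

CM-step 1: Update $\hat\xi^{(k)}$ by

$$\hat\xi^{(k+1)} = \frac{1}{n}\left\{\sum_{j=1}^{n} y_j - \delta(\hat\lambda^{(k)})\sum_{j=1}^{n}\hat s_{1j}^{(k)}\right\}.$$

CM-step 2: Update $\hat\sigma^{2(k)}$ by

$$\hat\sigma^{2(k+1)} = \frac{\sum_{j=1}^{n}\hat s_{2j}^{(k)} - 2\delta(\hat\lambda^{(k)})\sum_{j=1}^{n}(y_j - \hat\xi^{(k+1)})\,\hat s_{1j}^{(k)} + \sum_{j=1}^{n}(y_j - \hat\xi^{(k+1)})^2}{2n\{1 - \delta^2(\hat\lambda^{(k)})\}}.$$

CM-step 3: Fix $\xi = \hat\xi^{(k+1)}$ and $\sigma^2 = \hat\sigma^{2(k+1)}$, and obtain $\hat\lambda^{(k+1)}$ as the solution of

$$n\hat\sigma^{2(k+1)}\,\delta(\lambda)\{1 - \delta^2(\lambda)\} + \{1 + \delta^2(\lambda)\}\sum_{j=1}^{n}(y_j - \hat\xi^{(k+1)})\,\hat s_{1j}^{(k)} - \delta(\lambda)\sum_{j=1}^{n}\hat s_{2j}^{(k)} - \delta(\lambda)\sum_{j=1}^{n}(y_j - \hat\xi^{(k+1)})^2 = 0.$$

For the ECME algorithm, the E-step and the first two CM-steps are the same as in ECM, while CM-step 3 of ECM is replaced by the following CML-step.

CML-step: Update $\hat\lambda^{(k)}$ by optimizing the constrained log-likelihood function, i.e.,

$$\hat\lambda^{(k+1)} = \operatorname*{argmax}_{\lambda}\ \sum_{j=1}^{n}\log\Phi\left\{\lambda\,\frac{y_j - \hat\xi^{(k+1)}}{\hat\sigma^{(k+1)}}\right\}.$$

The maximization in the CML-step requires a one-dimensional search, which can be easily solved by the function optim embedded in the statistical package R. As noted by Liu and Rubin (1994), the ECME has a faster convergence rate than the ECM algorithm.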
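The ECME iteration described above can be sketched in a few lines of code. The following is a minimal illustration in Python using only the standard library, not the authors' implementation: the function names (`ecme_step`, `golden_max`, `simulate_sn`), the golden-section search standing in for R's optim, the search bracket $[-20, 20]$ for $\lambda$, the crude starting values and the simulated data are all assumptions made for this example.

```python
import math
import random

SQRT2PI = math.sqrt(2.0 * math.pi)

def log_Phi(x):
    """Log of the standard normal CDF; tail expansion avoids log(0)."""
    if x < -35.0:
        return -0.5 * x * x - math.log(-x) - math.log(SQRT2PI)
    return math.log(0.5 * math.erfc(-x / math.sqrt(2.0)))

def hazard(x):
    """phi(x) / Phi(x), the ratio appearing in both conditional moments."""
    if x < -35.0:
        return -x - 1.0 / x  # leading terms of the left-tail expansion
    return math.exp(-0.5 * x * x) / SQRT2PI / (0.5 * math.erfc(-x / math.sqrt(2.0)))

def delta(lam):
    return lam / math.sqrt(1.0 + lam * lam)

def sn_loglik(y, xi, sigma2, lam):
    """Observed-data log-likelihood under density (1)."""
    s = math.sqrt(sigma2)
    return sum(math.log(2.0 / (s * SQRT2PI)) - 0.5 * ((v - xi) / s) ** 2
               + log_Phi(lam * (v - xi) / s) for v in y)

def golden_max(f, lo, hi, tol=1e-6):
    """Golden-section search for the maximizer of a unimodal f on [lo, hi]."""
    r = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - r * (b - a), a + r * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc > fd:
            b, d, fd = d, c, fc
            c = b - r * (b - a)
            fc = f(c)
        else:
            a, c, fc = c, d, fd
            d = a + r * (b - a)
            fd = f(d)
    return 0.5 * (a + b)

def ecme_step(y, xi, sigma2, lam):
    """One ECME iteration: E-step, CM-steps 1 and 2, then the CML-step."""
    n, sigma, d = len(y), math.sqrt(sigma2), delta(lam)
    sig_tau = sigma * math.sqrt(1.0 - d * d)
    s1, s2 = [], []
    for v in y:  # E-step: conditional moments of tau_j given y_j
        mu_tau = d * (v - xi)
        h = hazard(lam * (v - xi) / sigma)
        s1.append(mu_tau + sig_tau * h)
        s2.append(mu_tau ** 2 + sig_tau ** 2 + mu_tau * sig_tau * h)
    xi_new = (sum(y) - d * sum(s1)) / n                      # CM-step 1
    q = sum(b - 2.0 * d * (v - xi_new) * a + (v - xi_new) ** 2
            for v, a, b in zip(y, s1, s2))
    sigma2_new = q / (2.0 * n * (1.0 - d * d))               # CM-step 2
    z = [(v - xi_new) / math.sqrt(sigma2_new) for v in y]
    obj = lambda l: sum(log_Phi(l * zj) for zj in z)         # CML objective
    cand = golden_max(obj, -20.0, 20.0)
    lam_new = cand if obj(cand) >= obj(lam) else lam         # never go downhill
    return xi_new, sigma2_new, lam_new

def simulate_sn(n, xi, sigma, lam, rng):
    """Draw from SN(xi, sigma^2, lam) via representation (4)."""
    d = delta(lam)
    return [xi + d * abs(rng.gauss(0.0, sigma))
            + math.sqrt(1.0 - d * d) * rng.gauss(0.0, sigma) for _ in range(n)]

# Hypothetical data and starting values, chosen only for illustration.
rng = random.Random(0)
y = simulate_sn(400, 1.0, 2.0, 3.0, rng)
ybar = sum(y) / len(y)
theta = (ybar, sum((v - ybar) ** 2 for v in y) / (len(y) - 1), 1.0)
trace = [sn_loglik(y, *theta)]
for _ in range(40):
    theta = ecme_step(y, *theta)
    trace.append(sn_loglik(y, *theta))
print("xi, sigma^2, lambda:", theta)
```

The objective in the CML-step is a sum of concave functions of $\lambda$, so a bracketing search is adequate in this sketch; the recorded `trace` should be non-decreasing, which is a convenient check of the monotone convergence property mentioned above. In practice the method of moments estimators in (3) would be a more natural source of starting values.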
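Lemma 3. If $Z \sim SN(0, 1, \lambda)$, then

(a) $E\left\{\dfrac{\phi(\lambda Z)}{\Phi(\lambda Z)}\right\} = \sqrt{\dfrac{2}{\pi}}\,\dfrac{1}{\sqrt{1+\lambda^2}}$.

(b) $E\left\{Z^{2k+1}\,\dfrac{\phi(\lambda Z)}{\Phi(\lambda Z)}\right\} = 0$, $k = 0, 1, 2, \ldots$

(c) $E\left\{Z^{2}\,\dfrac{\phi(\lambda Z)}{\Phi(\lambda Z)}\right\} = \sqrt{\dfrac{2}{\pi}}\,\dfrac{1}{(1+\lambda^2)^{3/2}}$.

The method of moments estimators in (3) can provide good initial values. Applying Lemma 3, the Fisher information $I(\xi, \sigma, \lambda)$ can be easily obtained. The results are shown in Azzalini (1985, p.175). The standard errors of ML estimates can be computed by taking the square root of the corresponding diagonal elements of $I^{-1}(\hat\xi, \hat\sigma, \hat\lambda)$.

3. The Skew Normal Mixtures

3.1. The model

We consider a finite mixture model in which a set of independent data $Y_1, \ldots, Y_n$ are from a $g$-component mixture of skew normal densities

$$f(y_j \mid \Theta) = \sum_{i=1}^{g}\omega_i\,\psi(y_j \mid \xi_i, \sigma_i^2, \lambda_i), \tag{7}$$

where $\omega = (\omega_1, \ldots, \omega_g)$ are the mixing probabilities, constrained to be nonnegative and sum to unity, and $\Theta = (\theta_1, \ldots, \theta_g)$ with $\theta_i = (\omega_i, \xi_i, \sigma_i^2, \lambda_i)$ being the specific parameters for component $i$. We introduce a set of latent component-indicators $Z_j = (Z_{1j}, \ldots, Z_{gj})$, $j = 1, \ldots, n$, whose values are a set of binary variables with

$$Z_{kj} = \begin{cases} 1 & \text{if } Y_j \text{ belongs to group } k, \\ 0 & \text{otherwise,} \end{cases}$$

and $\sum_{i=1}^{g} Z_{ij} = 1$. Given the mixing probabilities $\omega$, the component-indicators $Z_1, \ldots, Z_n$ are independent, with multinomial densities

$$f(z_j) = \omega_1^{z_{1j}}\,\omega_2^{z_{2j}}\cdots(1 - \omega_1 - \cdots - \omega_{g-1})^{z_{gj}}. \tag{8}$$

We write $Z_j \sim M(1; \omega_1, \ldots, \omega_g)$ to denote $Z_j$ with density (8). From (4), a hierarchical model for skew normal mixtures can be written as

$$\begin{aligned} Y_j \mid \tau_j, Z_{ij} = 1 &\sim N\big(\xi_i + \delta(\lambda_i)\,\tau_j,\ \{1 - \delta^2(\lambda_i)\}\,\sigma_i^2\big), \\ \tau_j \mid Z_{ij} = 1 &\sim TN(0, \sigma_i^2)\,I\{\tau_j > 0\}, \\ Z_j &\sim M(1; \omega_1, \ldots, \omega_g) \qquad (j = 1, \ldots, n). \end{aligned} \tag{9}$$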
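One way to read the hierarchical model (9) is as a recipe for simulating data from a skew normal mixture: draw the component label, then the half-normal latent variable, then the observation. The sketch below illustrates this in Python with the standard library only; the function names `sample_snmix` and `snmix_mean` and the two-component parameter values are hypothetical, chosen purely for illustration.

```python
import math
import random

def delta(lam):
    return lam / math.sqrt(1.0 + lam * lam)

def sample_snmix(n, weights, xis, sigmas, lams, rng):
    """Draw n observations and their labels following the three layers of (9)."""
    ys, labels = [], []
    for _ in range(n):
        # Z_j ~ M(1; w_1, ..., w_g): pick the component label.
        i = rng.choices(range(len(weights)), weights=weights)[0]
        # tau_j | Z_ij = 1 ~ TN(0, sigma_i^2) on tau_j > 0, i.e. a half-normal.
        tau = abs(rng.gauss(0.0, sigmas[i]))
        # Y_j | tau_j, Z_ij = 1 ~ N(xi_i + delta_i tau_j, (1 - delta_i^2) sigma_i^2).
        d = delta(lams[i])
        ys.append(rng.gauss(xis[i] + d * tau, math.sqrt(1.0 - d * d) * sigmas[i]))
        labels.append(i)
    return ys, labels

def snmix_mean(weights, xis, sigmas, lams):
    """Mixture mean implied by E(Y) in (2), component by component."""
    return sum(w * (x + math.sqrt(2.0 / math.pi) * delta(l) * s)
               for w, x, s, l in zip(weights, xis, sigmas, lams))

# Hypothetical two-component example.
weights, xis, sigmas, lams = [0.3, 0.7], [0.0, 5.0], [1.0, 2.0], [4.0, -3.0]
rng = random.Random(42)
ys, labels = sample_snmix(20000, weights, xis, sigmas, lams, rng)
sample_mean = sum(ys) / len(ys)
share_first = labels.count(0) / len(labels)
print(sample_mean, snmix_mean(weights, xis, sigmas, lams), share_first)
```

Comparing the sample mean with `snmix_mean` gives a quick sanity check that the three layers were coded consistently with the moment formula in (2).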
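3.2. Maximum likelihood estimation

As in (6), we have

$$\tau_j \mid Y_j = y_j, Z_{ij} = 1 \sim TN(\mu_{\tau_{ij}}, \sigma_{\tau_i}^2)\,I\{\tau_j > 0\},$$

where

$$\mu_{\tau_{ij}} = \delta(\lambda_i)(y_j - \xi_i), \qquad \sigma_{\tau_i} = \sigma_i\sqrt{1 - \delta^2(\lambda_i)}. \tag{10}$$

From (9), the complete-data log-likelihood function is

$$\ell_c(\Theta) = \sum_{j=1}^{n}\sum_{i=1}^{g} Z_{ij}\left\{\log\omega_i - \log\sigma_i^2 - \frac{1}{2}\log\{1 - \delta^2(\lambda_i)\} - \frac{\tau_j^2 - 2\delta(\lambda_i)\,\tau_j(y_j - \xi_i) + (y_j - \xi_i)^2}{2\sigma_i^2\{1 - \delta^2(\lambda_i)\}}\right\}. \tag{11}$$

Letting $\hat z_{ij} = E_{\hat\Theta^{(k)}}(Z_{ij} \mid Y)$, $\hat s_{1ij} = E_{\hat\Theta^{(k)}}(Z_{ij}\tau_j \mid Y)$ and $\hat s_{2ij} = E_{\hat\Theta^{(k)}}(Z_{ij}\tau_j^2 \mid Y)$ be the necessary conditional expectations of (11), we obtain

$$\hat z_{ij}^{(k)} = \frac{\hat\omega_i^{(k)}\,\psi(y_j \mid \hat\xi_i^{(k)}, \hat\sigma_i^{2(k)}, \hat\lambda_i^{(k)})}{\sum_{m=1}^{g}\hat\omega_m^{(k)}\,\psi(y_j \mid \hat\xi_m^{(k)}, \hat\sigma_m^{2(k)}, \hat\lambda_m^{(k)})}, \tag{12}$$

$$\hat s_{1ij}^{(k)} = \hat z_{ij}^{(k)}\left[\hat\mu_{\tau_{ij}}^{(k)} + \hat\sigma_{\tau_i}^{(k)}\,\frac{\phi\{\hat\lambda_i^{(k)}(y_j - \hat\xi_i^{(k)})/\hat\sigma_i^{(k)}\}}{\Phi\{\hat\lambda_i^{(k)}(y_j - \hat\xi_i^{(k)})/\hat\sigma_i^{(k)}\}}\right], \tag{13}$$

$$\hat s_{2ij}^{(k)} = \hat z_{ij}^{(k)}\left[\hat\mu_{\tau_{ij}}^{(k)2} + \hat\sigma_{\tau_i}^{(k)2} + \frac{\phi\{\hat\lambda_i^{(k)}(y_j - \hat\xi_i^{(k)})/\hat\sigma_i^{(k)}\}}{\Phi\{\hat\lambda_i^{(k)}(y_j - \hat\xi_i^{(k)})/\hat\sigma_i^{(k)}\}}\,\hat\mu_{\tau_{ij}}^{(k)}\hat\sigma_{\tau_i}^{(k)}\right], \tag{14}$$

where $\hat\mu_{\tau_{ij}}^{(k)}$ and $\hat\sigma_{\tau_i}^{(k)}$ are $\mu_{\tau_{ij}}$ and $\sigma_{\tau_i}$ in (10) with $\xi_i$, $\sigma_i$ and $\lambda_i$ replaced by $\hat\xi_i^{(k)}$, $\hat\sigma_i^{(k)}$ and $\hat\lambda_i^{(k)}$, respectively. The ECM algorithm is as follows.

E-step: Given $\Theta = \hat\Theta^{(k)}$, compute $\hat z_{ij}^{(k)}$, $\hat s_{1ij}^{(k)}$ and $\hat s_{2ij}^{(k)}$ for $i = 1, \ldots, g$ and $j = 1, \ldots, n$, using (12), (13) and (14).

CM-step 1: Calculate $\hat\omega_i^{(k+1)} = n^{-1}\sum_{j=1}^{n}\hat z_{ij}^{(k)}$.

CM-step 2: Calculate

$$\hat\xi_i^{(k+1)} = \frac{\sum_{j=1}^{n}\hat z_{ij}^{(k)}\,y_j - \delta(\hat\lambda_i^{(k)})\sum_{j=1}^{n}\hat s_{1ij}^{(k)}}{\sum_{j=1}^{n}\hat z_{ij}^{(k)}}.$$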
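CM-step 3: Calculate

$$\hat\sigma_i^{2(k+1)} = \frac{\sum_{j=1}^{n}\hat s_{2ij}^{(k)} - 2\delta(\hat\lambda_i^{(k)})\sum_{j=1}^{n}\hat s_{1ij}^{(k)}(y_j - \hat\xi_i^{(k+1)}) + \sum_{j=1}^{n}\hat z_{ij}^{(k)}(y_j - \hat\xi_i^{(k+1)})^2}{2\{1 - \delta^2(\hat\lambda_i^{(k)})\}\sum_{j=1}^{n}\hat z_{ij}^{(k)}}.$$

CM-step 4: Fix $\xi_i = \hat\xi_i^{(k+1)}$ and $\sigma_i^2 = \hat\sigma_i^{2(k+1)}$, and obtain $\hat\lambda_i^{(k+1)}$ $(i = 1, \ldots, g)$ as the solution of

$$\hat\sigma_i^{2(k+1)}\,\delta(\lambda_i)\{1 - \delta^2(\lambda_i)\}\sum_{j=1}^{n}\hat z_{ij}^{(k)} + \{1 + \delta^2(\lambda_i)\}\sum_{j=1}^{n}(y_j - \hat\xi_i^{(k+1)})\,\hat s_{1ij}^{(k)} - \delta(\lambda_i)\sum_{j=1}^{n}\hat s_{2ij}^{(k)} - \delta(\lambda_i)\sum_{j=1}^{n}\hat z_{ij}^{(k)}(y_j - \hat\xi_i^{(k+1)})^2 = 0.$$

ECME is identical to ECM except for CM-step 4 of ECM, which is replaced by the following CML-step.

CML-step: Let $\lambda = (\lambda_1, \ldots, \lambda_g)$, and update $\hat\lambda^{(k)}$ to

$$\hat\lambda^{(k+1)} = \operatorname*{argmax}_{\lambda_1, \ldots, \lambda_g}\ \sum_{j=1}^{n}\log\left\{\sum_{i=1}^{g}\hat\omega_i^{(k+1)}\,\psi(y_j \mid \hat\xi_i^{(k+1)}, \hat\sigma_i^{2(k+1)}, \lambda_i)\right\}.$$

We remark here that if the skewness parameters $\lambda_1, \ldots, \lambda_g$ are assumed to be identical, we use ECME since it is more efficient than ECM. Otherwise, the CML-step becomes a non-trivial high dimensional optimization problem, while using CM-step 4 can avoid the complication.

3.3. Standard errors

We let $I_o(\Theta \mid y) = -\partial^2\ell(\Theta \mid Y)/\partial\Theta\,\partial\Theta^{T}$ be the observed information matrix for the mixture model (7). Under some regularity conditions, the covariance matrix of the ML estimates $\hat\Theta$ can be approximated by the inverse of $I_o(\hat\Theta \mid y)$. We follow Basford, Greenway, McLachlan and Peel (1997) to evaluate

$$I_o(\hat\Theta \mid y) = \sum_{j=1}^{n}\hat s_j\,\hat s_j^{T}, \tag{15}$$

where

$$\hat s_j = \left.\frac{\partial\log\left\{\sum_{i=1}^{g}\omega_i\,\psi(y_j \mid \xi_i, \sigma_i^2, \lambda_i)\right\}}{\partial\Theta}\right|_{\Theta = \hat\Theta}.$$

Corresponding to the vector of all $4g - 1$ unknown parameters in $\Theta$, we partition $\hat s_j$ $(j = 1, \ldots, n)$ as

$$\hat s_j = (\hat s_{j,\omega_1}, \ldots, \hat s_{j,\omega_{g-1}}, \hat s_{j,\xi_1}, \ldots, \hat s_{j,\xi_g}, \hat s_{j,\sigma_1}, \ldots, \hat s_{j,\sigma_g}, \hat s_{j,\lambda_1}, \ldots, \hat s_{j,\lambda_g})^{T}.$$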
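The elements of $\hat s_j$ are given by the following expressions, in which $\hat f_j = \sum_{i=1}^{g}\hat\omega_i\,\psi(y_j \mid \hat\xi_i, \hat\sigma_i^2, \hat\lambda_i)$ denotes the fitted mixture density at $y_j$ and $\hat\psi_{rj} = \psi(y_j \mid \hat\xi_r, \hat\sigma_r^2, \hat\lambda_r)$:

$$\hat s_{j,\omega_r} = \frac{\hat\psi_{rj} - \hat\psi_{gj}}{\hat f_j} \qquad (r = 1, \ldots, g-1),$$

$$\hat s_{j,\xi_r} = \frac{2\hat\omega_r\,\phi\!\left(\frac{y_j - \hat\xi_r}{\hat\sigma_r}\right)}{\hat\sigma_r^2\,\hat f_j}\left\{\frac{y_j - \hat\xi_r}{\hat\sigma_r}\,\Phi\!\left(\hat\lambda_r\frac{y_j - \hat\xi_r}{\hat\sigma_r}\right) - \hat\lambda_r\,\phi\!\left(\hat\lambda_r\frac{y_j - \hat\xi_r}{\hat\sigma_r}\right)\right\} \qquad (r = 1, \ldots, g),$$

$$\hat s_{j,\sigma_r} = \frac{\hat\omega_r\,\hat\psi_{rj}}{\hat f_j}\left\{-\frac{1}{\hat\sigma_r} + \frac{(y_j - \hat\xi_r)^2}{\hat\sigma_r^3}\right\} - \frac{2\hat\omega_r\hat\lambda_r(y_j - \hat\xi_r)\,\phi\!\left(\frac{y_j - \hat\xi_r}{\hat\sigma_r}\right)\phi\!\left(\hat\lambda_r\frac{y_j - \hat\xi_r}{\hat\sigma_r}\right)}{\hat\sigma_r^3\,\hat f_j} \qquad (r = 1, \ldots, g),$$

$$\hat s_{j,\lambda_r} = \frac{\hat\omega_r\,\hat\psi_{rj}}{\hat f_j}\left(\frac{y_j - \hat\xi_r}{\hat\sigma_r}\right)\frac{\phi\!\left(\hat\lambda_r\frac{y_j - \hat\xi_r}{\hat\sigma_r}\right)}{\Phi\!\left(\hat\lambda_r\frac{y_j - \hat\xi_r}{\hat\sigma_r}\right)} \qquad (r = 1, \ldots, g).$$

The information-based approximation (15) is asymptotically applicable. However, it may not be reliable unless the sample size is large. It is common in practice to use the bootstrap approach (Efron and Tibshirani (1986)) to obtain an alternative estimate of the covariance matrix of $\hat\Theta$. The bootstrap method may provide more accurate standard error estimates than (15), but it requires enormous computing power.

3.4. Notes on implementation

In the mixture context, the log-likelihood function may have multiple modes. A convenient way to circumvent such limitations is to try several EM iterations with a variety of starting values that are representative of the parameter space. If there exist several modes, one can find the global mode by comparing their relative masses and log-likelihood values. In particular, running the algorithm with different starting values can be used to assess the stability of the resulting estimates.

Although the EM-type algorithm tends to be robust with respect to the choice of the starting values, it may not converge when initial values are far from the optimum. The following outlines a simple procedure to achieve a set of reasonable initial values.

(a) Randomly generate a set of $B$ bootstrap resampling samples $y_1^*, \ldots, y_B^*$ from the original data $y$.

(b) For each bootstrap sample, partition it into $g$ components using the K-means clustering algorithm and compute the initial values $\hat\omega_i = \sum_{j=1}^{n} Z_{ij}/n$.

(c) For each partitioned component, compute the initial values $\hat\xi_i$, $\hat\sigma_i^2$ and $\hat\delta(\lambda_i)$ using the method of moments as in (3).

4. Bayesian Modelling For Skew Normal Mixtures

4.1. The prior distributions and posterior MCMC sampling

We consider a Bayesian approach to (7) in which $\Theta$ is regarded as random with a prior distribution that reflects our degree of belief in different values of these quantities. Since fully non-informative prior distributions are not permissible in the mixture context, the prior distributions chosen are weakly informative subject to vague prior knowledge, and this avoids nonintegrable posterior distributions. The prior distributions for model (7) take

$$\begin{aligned} \xi_i &\sim N(\eta, \kappa^{-1}) && (i = 1, \ldots, g), \\ \sigma_i^{-2} \mid \beta &\sim \Gamma(\alpha, \beta) && (i = 1, \ldots, g), \\ \beta &\sim \Gamma(\nu_1, \nu_2), \\ \delta(\lambda_i) &\sim U(-1, 1) && (i = 1, \ldots, g), \\ \omega &\sim D(h, \ldots, h), \end{aligned}$$

where $\beta$ is an unknown hyperparameter, $(\eta, \kappa, \alpha, \nu_1, \nu_2, h)$ are known data-dependent constants, $\Gamma(\alpha, \beta)$ denotes the gamma distribution with mean $\alpha/\beta$ and variance $\alpha/\beta^2$, $U(-1, 1)$ denotes the continuous uniform distribution on the interval $[-1, 1]$, and $D(h, \ldots, h)$ stands for the Dirichlet distribution with the density function

$$\frac{\Gamma(gh)}{\Gamma(h)^g}\,\omega_1^{h-1}\cdots\omega_{g-1}^{h-1}\left(1 - \sum_{i=1}^{g-1}\omega_i\right)^{h-1}.$$

For the values of $(\eta, \kappa, \alpha, \nu_1, \nu_2, h)$, we follow Richardson and Green (1997) in letting $\eta$ equal the midpoint of the observed interval and $\kappa^{-1} = R^2$, where $R$ is the range of the interval, and in setting $\alpha = 2$, $\nu_1 = 0.2$, $\nu_2 = 100\alpha/(\alpha R^2)$ and $h = 2$.

Given $\Theta = \Theta^{(k)}$, the MCMC sampling scheme at the $(k+1)$st iteration consists of the following steps.

Step 1. Sample $Z_j^{(k+1)}$ $(j = 1, \ldots, n)$ from $M(1; \omega_1^*, \ldots, \omega_g^*)$, where

$$\omega_i^* = \frac{\omega_i^{(k)}\,\psi(y_j \mid \xi_i^{(k)}, \sigma_i^{2(k)}, \lambda_i^{(k)})}{\sum_{m=1}^{g}\omega_m^{(k)}\,\psi(y_j \mid \xi_m^{(k)}, \sigma_m^{2(k)}, \lambda_m^{(k)})} \qquad (i = 1, \ldots, g).$$

Step 2. Given $Z_{ij} = 1$, sample $\tau_j^{(k+1)}$ $(j = 1, \ldots, n)$ from

$$TN\big(\delta(\lambda_i^{(k)})(y_j - \xi_i^{(k)}),\ \sigma_i^{2(k)}\{1 - \delta^2(\lambda_i^{(k)})\}\big)\,I\{\tau_j > 0\}.$$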
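Step 3. Sample $\beta^{(k+1)}$ from $\Gamma\big(\nu_1 + g\alpha,\ \nu_2 + \sum_{i=1}^{g}\sigma_i^{-2(k)}\big)$.

Step 4. Sample $\omega^{(k+1)}$ from $D(h + n_1^{(k+1)}, \ldots, h + n_g^{(k+1)})$, where $n_i^{(k+1)} = \sum_{j=1}^{n} Z_{ij}^{(k+1)}$.

Step 5. Given $Z_{ij} = 1$, sample $\xi_i^{(k+1)}$ from $N\big(\mu_{\xi_i}^{(k+1)}, \sigma_{\xi_i}^{2(k+1)}\big)$, where

$$\sigma_{\xi_i}^{2(k+1)} = \left\{\frac{n_i^{(k+1)}}{\sigma_i^{2(k)}\{1 - \delta^2(\lambda_i^{(k)})\}} + \kappa\right\}^{-1},$$

$$\mu_{\xi_i}^{(k+1)} = \frac{\sum_{j=1}^{n} Z_{ij}^{(k+1)}\,y_j - \delta(\lambda_i^{(k)})\sum_{j=1}^{n} Z_{ij}^{(k+1)}\,\tau_j^{(k+1)} + \kappa\eta\,\sigma_i^{2(k)}\{1 - \delta^2(\lambda_i^{(k)})\}}{n_i^{(k+1)} + \kappa\,\sigma_i^{2(k)}\{1 - \delta^2(\lambda_i^{(k)})\}}.$$

Step 6. Given $Z_{ij} = 1$, sample $\sigma_i^{-2(k+1)}$ from $\Gamma\big(\alpha + n_i^{(k+1)},\ \beta^{(k+1)} + b_i\big)$, where

$$b_i = \frac{1}{2\{1 - \delta^2(\lambda_i^{(k)})\}}\left\{\sum_{j=1}^{n} Z_{ij}^{(k+1)}\,\tau_j^{2(k+1)} - 2\delta(\lambda_i^{(k)})\sum_{j=1}^{n} Z_{ij}^{(k+1)}\,\tau_j^{(k+1)}(y_j - \xi_i^{(k+1)}) + \sum_{j=1}^{n} Z_{ij}^{(k+1)}(y_j - \xi_i^{(k+1)})^2\right\}.$$

Step 7. Sample $\delta^{(k+1)} = (\delta(\lambda_1^{(k+1)}), \ldots, \delta(\lambda_g^{(k+1)}))$ via the Metropolis-Hastings (M-H) algorithm (Hastings (1970)) from

$$f(\delta) \propto \prod_{i=1}^{g}\left[\{1 - \delta^2(\lambda_i)\}^{-n_i^{(k+1)}/2}\exp\left\{-\sum_{j=1}^{n} Z_{ij}^{(k+1)}\,\frac{\tau_j^{2(k+1)} - 2\delta(\lambda_i)\,\tau_j^{(k+1)}(y_j - \xi_i^{(k+1)}) + (y_j - \xi_i^{(k+1)})^2}{2\sigma_i^{2(k+1)}\{1 - \delta^2(\lambda_i)\}}\right\}\right].$$

To elaborate on Step 7 of the above algorithm, we transform $\delta(\lambda_i)$ to $\delta^*(\lambda_i) = \log\{(1 + \delta(\lambda_i))/(1 - \delta(\lambda_i))\}$ and then apply the M-H algorithm to $g(\delta^*) = f(\delta(\delta^*))\prod_{i=1}^{g} J_{\delta^*(\lambda_i)}$, where $\delta^* = (\delta^*(\lambda_1), \ldots, \delta^*(\lambda_g))$, and $J_{\delta^*(\lambda_i)} = 2e^{\delta^*(\lambda_i)}/(1 + e^{\delta^*(\lambda_i)})^2$ is the Jacobian of the transformation from $\delta(\lambda_i)$ to $\delta^*(\lambda_i)$. A $g$-dimensional multivariate normal distribution with mean $\delta^{*(k)}$ and covariance matrix $c^2\Sigma_{\delta^*}^{(k)}$ is chosen as the proposal distribution, where the scale $c \approx 2.4/\sqrt{g}$, as suggested in Gelman, Roberts and Gilks (1996). The value of $\Sigma_{\delta^*}^{(k)}$ can be estimated by the inverted sample information matrix given $y$ and $\Theta = \Theta^{(k)}$. Having obtained $\delta^*$ from the M-H algorithm, we transform it back to $\delta$ by $\delta(\lambda_i) = (e^{\delta^*(\lambda_i)} - 1)/(e^{\delta^*(\lambda_i)} + 1)$ $(i = 1, \ldots, g)$, and then transform $\delta(\lambda_i)$ back to $\lambda_i$ by $\lambda_i = \delta(\lambda_i)/\sqrt{1 - \delta^2(\lambda_i)}$.

To avoid the label-switching problem and slow stabilization of the Markov chain, our initial values $\Theta^{(0)}$ are chosen to be dispersed around the ML estimates with the restriction $\xi_1 < \cdots < \xi_g$.

4.2. Convergence assessment using multiple chains

Before conducting inference using MCMC samples, the output should be analyzed to determine the required run length of the MCMC sequences. Gelman and Rubin (1992) proposed a convergence diagnostic $\hat R$, the potential scale reduction factor (PSRF), obtained by running multiple chains with overdispersed starting values. However, the approach is essentially univariate. More recently, Brooks and Gelman (1998) provided a generalization of Gelman and Rubin's method that considers several parameters simultaneously.

Suppose there are $I$ independent parallel chains and the length of each chain is $2n$. Let $\theta$ denote a $p \times 1$ vector of parameters and $\theta_i = (\theta_i^{(1)}, \ldots, \theta_i^{(n)})$ denote the simulation sample of the $i$th chain $(i = 1, \ldots, I)$, after discarding the first $n$ iterations. Brooks and Gelman (1998) stated that the posterior variance-covariance matrix of $\theta$ can be estimated by

$$\hat V = \frac{n-1}{n}\,W + \left(1 + \frac{1}{I}\right)\frac{B}{n},$$

where $W$ and $B/n$ denote the within- and between-sequence sample covariance matrix estimates of $\theta_1, \ldots, \theta_I$, respectively. They then proposed the multivariate potential scale reduction factor (MPSRF),

$$\hat R^p = \frac{n-1}{n} + \frac{I+1}{I}\,\lambda_1,$$

where $\lambda_1$ is the largest eigenvalue of $W^{-1}B/n$. Note that the multivariate measure $\hat R^p$ bounds above the univariate $\hat R$ values over all $p$ variables. If the $I$ parallel chains are mixing well within the model, $\hat R^p$ will decline to 1 for reasonably large $n$. Meanwhile, if the $I$ parallel chains are essentially overlapping, then the determinants of $\hat V$ and $W$ should stabilize over the iterations and be sufficiently close.

5. Examples

5.1. The enzyme data

We first carry out our methodology for the enzyme data set with $n = 245$ observations.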
The data were first analyzed by Bechtel, Bonaïti-Pellié, Poisson, Magnette and Bechtel (1993), who identified a mixture of skew distributions by the maximum likelihood techniques of Maclean, Morton, Elston and Yee (1976). Richardson and Green (1997) provided the reversible jump MCMC approach for the univariate normal mixture models with an unknown number of components and identified the most probable values of $g$ to be between 3 and 5.

Table 1. Estimated parameter values and the corresponding standard errors (SE) for model (16) with the enzyme data. [Columns: $\omega$, $\xi_1$, $\xi_2$, $\sigma_1$, $\sigma_2$, $\lambda_1$, $\lambda_2$; rows: Estimate, SE. The numerical entries are missing from this transcription.]

We fit the following two-component SNMIX model to the data:

$$f(y) = \omega\,\psi(y \mid \xi_1, \sigma_1^2, \lambda_1) + (1 - \omega)\,\psi(y \mid \xi_2, \sigma_2^2, \lambda_2). \tag{16}$$

The ECM algorithm was run with multiple starting values and was checked for convergence. All EM iterations under different starting values converge to the same stationary point and log-likelihood value. The resulting ML estimates and the corresponding standard errors are listed in Table 1. In this table, we found that the standard error for $\lambda_2$ is relatively large. This is due to the fact that the log-likelihood function can be fairly flat near the ML estimates of the shape parameters of the skew normal components. We have shown this by plotting the profile log-likelihood function of $(\lambda_1, \lambda_2)$ in Figure 1.

[Figure 1. Plot of the profile log-likelihood for $\lambda_1$ and $\lambda_2$ for the enzyme data.]

For comparison purposes, we also fit a NORMIX model ($\lambda_1 = \lambda_2 = 0$) with $g = 2$ to $5$ components. The log-likelihood maximum and two information-based criteria, AIC (Akaike (1973)) and BIC (Schwarz (1978)), are displayed in the third to fifth columns of Table 2. Apparently, the fitted two-component SNMIX model is superior to the fitted NORMIX model, since it has the largest log-likelihood and the smallest AIC and BIC. The last two columns of this table present the required number of EM iterations and the associated rate of convergence, $r$, which is assessed in practice as

$$r = \lim_{t \to \infty}\frac{\|\theta^{(t+1)} - \theta^{(t)}\|}{\|\theta^{(t)} - \theta^{(t-1)}\|}.$$

A relative tolerance of $10^{-8}$ for the estimates of all parameters in the model was used as the convergence criterion. We note that the reported rate of convergence depends on the fraction of missing information, and a greater value of $r$ implies slower convergence; see Meng (1994). In this example, we also note that the estimating procedure for fitting the SNMIX model does not converge properly for $g \geq 3$.

Table 2. Comparison of log-likelihood maximum, AIC and BIC for fitted SNMIX and NORMIX models using the enzyme data. The number of parameters and the rate of convergence are denoted by $m$ and $r$, respectively. [Columns: Model, $g$, $m$, log-likelihood, AIC, BIC, Iterations, $r$; one SNMIX row followed by the NORMIX rows. The numerical entries are missing from this transcription.]

AIC $= -2(\text{log-likelihood} - m)$; BIC $= -2\{\text{log-likelihood} - 0.5\,m\log(n)\}$.

5.2. The faithful data

As another example, we consider the Old Faithful Geyser data taken from Silverman (1986). It consists of 272 eruption lengths (in minutes) of the Old Faithful Geyser in Yellowstone National Park, Wyoming, USA. The data appear to be bimodal with asymmetrical components. We fit a two-component SNMIX model (16) by analogy with the previous example. The ML estimates and the corresponding standard errors are reported in the second and third columns of Table 3, respectively.

We carry out an MCMC simulation by running 10,000 iterations of ten independent parallel chains, with different starting values for each chain over-dispersed around $\pm 3$ standard deviations of the ML estimates. The convergence of the MCMC samplers is monitored by examining $\hat R^p$ values as discussed in Section 4.2. The monitored values of $\hat R^p$ and the determinants of $\hat V$ and $W$ are plotted in Figures 2(a) and 2(b), respectively. By examining both figures, convergence occurs around 4,000 iterations. Having obtained the remaining converged MCMC simulation samples, we computed the posterior mean, standard deviation, median and 95% posterior interval (2.5% and 97.5% posterior quantiles), which are listed in the 4th-8th columns of Table 3.

[Figure 2. (a) Plot of MPSRF, $\hat R^p$, against iteration number; (b) Plot of the determinants (generalized variance) of $\hat V$ (solid) and $W$ (dashed) against iteration number.]

Table 3. ML estimation results and MCMC summary statistics for the parameters of model (16) with the faithful data. [Columns: Parameter; ML: Estimate, SE; MCMC: Mean, SE, Median, 2.5%, 97.5%. Rows: $\omega$, $\xi_1$, $\xi_2$, $\sigma_1$, $\sigma_2$, $\lambda_1$, $\lambda_2$. The numerical entries are missing from this transcription.]

Figure 3 displays the histograms of the posterior samples of the model parameters. It is evident that the shape of the posterior distribution of $\lambda_1$ is skewed to the right, while the shape of the posterior distribution of $\lambda_2$ is skewed to the left. It is interesting to note that the posterior distributions of the parameters $(\lambda_1, \lambda_2)$, which regulate the skewness, are skewed as well.
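The convergence monitoring used in this example relies on the MPSRF of Section 4.2. As a rough sketch of how that quantity can be computed from stored chains, the Python code below (standard library only) forms $W$ and $B/n$ and approximates the largest eigenvalue of $W^{-1}B/n$ by power iteration. The function names, the power-iteration shortcut and the toy chains are assumptions made for illustration; they are not the computations behind Figure 2.

```python
import random

def mean_vec(rows):
    return [sum(r[k] for r in rows) / len(rows) for k in range(len(rows[0]))]

def scatter(rows, center):
    """Sum of outer products of (row - center)."""
    p = len(center)
    s = [[0.0] * p for _ in range(p)]
    for r in rows:
        d = [r[k] - center[k] for k in range(p)]
        for a in range(p):
            for b in range(p):
                s[a][b] += d[a] * d[b]
    return s

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    p = len(b)
    m = [row[:] + [b[k]] for k, row in enumerate(a)]
    for col in range(p):
        piv = max(range(col, p), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, p):
            f = m[r][col] / m[col][col]
            for c in range(col, p + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * p
    for r in range(p - 1, -1, -1):
        x[r] = (m[r][p] - sum(m[r][c] * x[c] for c in range(r + 1, p))) / m[r][r]
    return x

def matvec(a, v):
    return [sum(a[r][c] * v[c] for c in range(len(v))) for r in range(len(a))]

def mpsrf(chains, iters=200):
    """MPSRF = (n-1)/n + (I+1)/I * lambda_1 for I chains of n retained draws."""
    I, n, p = len(chains), len(chains[0]), len(chains[0][0])
    means = [mean_vec(ch) for ch in chains]
    W = [[0.0] * p for _ in range(p)]
    for ch, mu in zip(chains, means):          # pooled within-chain covariance
        s = scatter(ch, mu)
        for a in range(p):
            for b in range(p):
                W[a][b] += s[a][b] / (I * (n - 1))
    Bn = [[v / (I - 1) for v in row] for row in scatter(means, mean_vec(means))]
    v = [1.0] * p
    for _ in range(iters):                     # power iteration on W^{-1} B/n
        w = solve(W, matvec(Bn, v))
        norm = max(abs(x) for x in w)
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    num = sum(x * y for x, y in zip(v, matvec(Bn, v)))
    den = sum(x * y for x, y in zip(v, matvec(W, v)))
    return (n - 1) / n + (I + 1) / I * (num / den)

# Toy chains: four well-mixed chains versus four chains stuck at different levels.
rng = random.Random(1)
mixed = [[[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(500)] for _ in range(4)]
stuck = [[[rng.gauss(3 * i, 1), rng.gauss(0, 1)] for _ in range(500)] for i in range(4)]
print(mpsrf(mixed), mpsrf(stuck))
```

For well-mixed chains the value should sit close to 1, while chains that have not reached a common distribution give a value well above 1; a dedicated linear algebra library would normally replace the hand-written solver and eigenvalue approximation.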
[Figure 3. Histograms of the posterior sample of the SNMIX parameters for the faithful data.]

Finally, it is interesting to compare the density estimation of the NORMIX and SNMIX fitting results. The ML-fitted NORMIX and SNMIX densities, together with the Bayesian predictive SNMIX density, are superimposed in Figure 4(a). Subsequently, the fitted cumulative distribution functions (CDFs) and the empirical CDF are shown in Figure 4(b). Based on the graphical visualization, the resulting ML-fitted SNMIX density, as well as the Bayesian predictive SNMIX density, are more suitable than the ML-fitted NORMIX density for this data set. Furthermore, the fitted SNMIX CDFs more closely track the empirical CDF than does the fitted NORMIX CDF.

[Figure 4. (a) Histogram of the faithful data overlaid with densities based on two fitted two-component SNMIX (ML and Bayesian), and a ML-fitted two-component NORMIX; (b) Empirical CDF of the faithful data overlaid with CDFs based on two fitted two-component SNMIX (ML and Bayesian) and a ML-fitted two-component NORMIX.]

6. Concluding Remarks

In our examples, it is quite appealing that the skew normal mixtures can provide a more appropriate density estimation than normal mixtures based on information-based criteria and graphical visualization. There are a number of possible extensions of the current work. Mixture modelling using the multivariate skew normal distribution (e.g., Azzalini and Dalla Valle (1996), Sahu, Dey and Branco (2003) and Gupta, González-Farías and Domínguez-Molina (2004)) is the most natural extension and will be reported in a follow-up paper. In addition, it would be a worthwhile task to model the number of components, $g$, and component parameters, $\Theta$, jointly. For modelling both skewness and long tails in a mixture context, component densities using the skew t distribution (e.g., Jones and Faddy (2003), Azzalini and Capitanio (2003) and Lin, Lee and Hsieh (2007)) are a feasible choice and await further investigation.
Acknowledgement

We gratefully acknowledge the Chair Co-Editor, an associate editor, and one referee for their valuable comments, which substantially improved the quality of the paper. This research was supported by the National Science Council of Taiwan.

References

Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In 2nd Int. Symp. on Information Theory (Edited by B. N. Petrov and F. Csaki). Akademiai Kiado, Budapest.

Arnold, B. C., Beaver, R. J., Groeneveld, R. A. and Meeker, W. Q. (1993). The nontruncated marginal of a truncated bivariate normal distribution. Psychometrika 58.

Azzalini, A. (1985). A class of distributions which includes the normal ones. Scand. J. Statist. 12.

Azzalini, A. (1986). Further results on a class of distributions which includes the normal ones. Statistica 46.

Azzalini, A. and Capitanio, A. (1999). Statistical applications of the multivariate skew-normal distribution. J. Roy. Statist. Soc. Ser. B 61.

Azzalini, A. and Capitanio, A. (2003). Distributions generated by perturbation of symmetry with emphasis on a multivariate skew t-distribution. J. Roy. Statist. Soc. Ser. B 65.

Azzalini, A. and Dalla Valle, A. (1996). The multivariate skew-normal distribution. Biometrika 83.

Basford, K. E., Greenway, D. R., McLachlan, G. J. and Peel, D. (1997). Standard errors of fitted means under normal mixture. Comput. Statist. 12.

Bechtel, Y. C., Bonaïti-Pellié, C., Poisson, N., Magnette, J. and Bechtel, P. R. (1993). A population and family study of N-acetyltransferase using caffeine urinary metabolites. Clin. Pharm. Therp. 54.

Brooks, S. P. and Gelman, A. (1998). General methods for monitoring convergence of iterative simulations. J. Comput. Graph. Statist. 7.

Dempster, A. P., Laird, N. M. and Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm (with discussion). J. Roy. Statist. Soc. Ser. B 39.

Diebolt, J. and Robert, C. P. (1994). Estimation of finite mixture distributions through Bayesian sampling. J. Roy. Statist. Soc. Ser. B 56.

Efron, B. and Tibshirani, R. (1986). Bootstrap method for standard errors, confidence intervals, and other measures of statistical accuracy. Statist. Sci. 1.

Escobar, M. D. and West, M. (1995). Bayesian density estimation and inference using mixtures. J. Amer. Statist. Assoc. 90.

Gelman, A., Roberts, G. and Gilks, W. (1996). Efficient Metropolis jumping rules. In Bayesian Statistics 5 (Edited by J. M. Bernardo, J. O. Berger, A. P. Dawid and A. F. M. Smith). Oxford University Press, New York.

Gelman, A. and Rubin, D. B. (1992). Inference from iterative simulation using multiple sequences. Statist. Sci. 7.

Genton, M. G. (2004). Skew-Elliptical Distributions and Their Applications. Chapman & Hall, New York.
Gupta, A. K., González-Farías, G. and Domínguez-Molina, J. A. (2004). A multivariate skew normal distribution. J. Multivariate Anal. 89.

Hastings, W. K. (1970). Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57.

Henze, N. (1986). A probabilistic representation of the skew-normal distribution. Scand. J. Statist. 13.

Jones, M. C. and Faddy, M. J. (2003). A skew extension of the t-distribution, with applications. J. Roy. Statist. Soc. Ser. B 65.

Lin, T. I., Lee, J. C. and Hsieh, W. J. (2007). Robust mixture modeling using the skew t distribution. Statist. Comput. 17.

Liu, C. H. and Rubin, D. B. (1994). The ECME algorithm: a simple extension of EM and ECM with faster monotone convergence. Biometrika 81.

Maclean, C. J., Morton, N. E., Elston, R. C. and Yee, S. (1976). Skewness in commingled distributions. Biometrics 32.

McLachlan, G. J. and Basford, K. E. (1988). Mixture Models: Inference and Application to Clustering. Marcel Dekker, New York.

McLachlan, G. J. and Peel, D. (2000). Finite Mixture Models. Wiley, New York.

Meng, X. L. and Rubin, D. B. (1993). Maximum likelihood estimation via the ECM algorithm: A general framework. Biometrika 80.

Meng, X. L. (1994). On the global and componentwise rates of convergence of the EM algorithm. Lin. Alg. Applic. 199.

Richardson, S. and Green, P. J. (1997). On Bayesian analysis of mixtures with an unknown number of components. J. R. Statist. Soc. B 59.

Roberts, C. and Geisser, S. (1966). A necessary and sufficient condition for the square of a random variable to be gamma. Biometrika 53.

Sahu, S. K., Dey, D. K. and Branco, M. D. (2003). A new class of multivariate skew distributions with application to Bayesian regression models. Canad. J. Statist. 31.

Schwarz, G. (1978). Estimating the dimension of a model. Ann. Statist. 6.

Silverman, B. W. (1986). Density Estimation for Statistics and Data Analysis. Chapman & Hall, London.

Stephens, M. (2000). Bayesian analysis of mixture models with an unknown number of components: an alternative to reversible jump methods. Ann. Statist. 28.

Titterington, D. M., Smith, A. F. M. and Makov, U. E. (1985). Statistical Analysis of Finite Mixture Distributions. Wiley, New York.

Department of Applied Mathematics, National Chung Hsing University, Taichung 402, Taiwan.
E-mail: tilin@amath.nchu.edu.tw

Institute of Statistics and Graduate Institute of Finance, National Chiao Tung University, Hsinchu 300, Taiwan.
E-mail: jclee@stat.nctu.edu.tw

Institute of Statistics, National Chiao Tung University, Hsinchu 300, Taiwan.
E-mail: kelly.st92g@nctu.edu.tw

(Received November 2004; accepted November 2005)
More information2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification
E395 - Pattern Recognton Solutons to Introducton to Pattern Recognton, Chapter : Bayesan pattern classfcaton Preface Ths document s a soluton manual for selected exercses from Introducton to Pattern Recognton
More informationOn Outlier Robust Small Area Mean Estimate Based on Prediction of Empirical Distribution Function
On Outler Robust Small Area Mean Estmate Based on Predcton of Emprcal Dstrbuton Functon Payam Mokhtaran Natonal Insttute of Appled Statstcs Research Australa Unversty of Wollongong Small Area Estmaton
More informationConjugacy and the Exponential Family
CS281B/Stat241B: Advanced Topcs n Learnng & Decson Makng Conjugacy and the Exponental Famly Lecturer: Mchael I. Jordan Scrbes: Bran Mlch 1 Conjugacy In the prevous lecture, we saw conjugate prors for the
More informationMaximum Likelihood Estimation of Binary Dependent Variables Models: Probit and Logit. 1. General Formulation of Binary Dependent Variables Models
ECO 452 -- OE 4: Probt and Logt Models ECO 452 -- OE 4 Maxmum Lkelhood Estmaton of Bnary Dependent Varables Models: Probt and Logt hs note demonstrates how to formulate bnary dependent varables models
More informationModule 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur
Module 3 LOSSY IMAGE COMPRESSION SYSTEMS Verson ECE IIT, Kharagpur Lesson 6 Theory of Quantzaton Verson ECE IIT, Kharagpur Instructonal Objectves At the end of ths lesson, the students should be able to:
More informationBOOTSTRAP METHOD FOR TESTING OF EQUALITY OF SEVERAL MEANS. M. Krishna Reddy, B. Naveen Kumar and Y. Ramu
BOOTSTRAP METHOD FOR TESTING OF EQUALITY OF SEVERAL MEANS M. Krshna Reddy, B. Naveen Kumar and Y. Ramu Department of Statstcs, Osmana Unversty, Hyderabad -500 007, Inda. nanbyrozu@gmal.com, ramu0@gmal.com
More informationDurban Watson for Testing the Lack-of-Fit of Polynomial Regression Models without Replications
Durban Watson for Testng the Lack-of-Ft of Polynomal Regresson Models wthout Replcatons Ruba A. Alyaf, Maha A. Omar, Abdullah A. Al-Shha ralyaf@ksu.edu.sa, maomar@ksu.edu.sa, aalshha@ksu.edu.sa Department
More informationSimulated Power of the Discrete Cramér-von Mises Goodness-of-Fit Tests
Smulated of the Cramér-von Mses Goodness-of-Ft Tests Steele, M., Chaselng, J. and 3 Hurst, C. School of Mathematcal and Physcal Scences, James Cook Unversty, Australan School of Envronmental Studes, Grffth
More informationLINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity
LINEAR REGRESSION ANALYSIS MODULE IX Lecture - 30 Multcollnearty Dr. Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur 2 Remedes for multcollnearty Varous technques have
More informationSmall Area Interval Estimation
.. Small Area Interval Estmaton Partha Lahr Jont Program n Survey Methodology Unversty of Maryland, College Park (Based on jont work wth Masayo Yoshmor, Former JPSM Vstng PhD Student and Research Fellow
More informationLecture 3: Probability Distributions
Lecture 3: Probablty Dstrbutons Random Varables Let us begn by defnng a sample space as a set of outcomes from an experment. We denote ths by S. A random varable s a functon whch maps outcomes nto the
More informationChapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems
Numercal Analyss by Dr. Anta Pal Assstant Professor Department of Mathematcs Natonal Insttute of Technology Durgapur Durgapur-713209 emal: anta.bue@gmal.com 1 . Chapter 5 Soluton of System of Lnear Equatons
More informationGoodness of fit and Wilks theorem
DRAFT 0.0 Glen Cowan 3 June, 2013 Goodness of ft and Wlks theorem Suppose we model data y wth a lkelhood L(µ) that depends on a set of N parameters µ = (µ 1,...,µ N ). Defne the statstc t µ ln L(µ) L(ˆµ),
More information4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA
4 Analyss of Varance (ANOVA) 5 ANOVA 51 Introducton ANOVA ANOVA s a way to estmate and test the means of multple populatons We wll start wth one-way ANOVA If the populatons ncluded n the study are selected
More informationNUMERICAL DIFFERENTIATION
NUMERICAL DIFFERENTIATION 1 Introducton Dfferentaton s a method to compute the rate at whch a dependent output y changes wth respect to the change n the ndependent nput x. Ths rate of change s called the
More informationGlobal Sensitivity. Tuesday 20 th February, 2018
Global Senstvty Tuesday 2 th February, 28 ) Local Senstvty Most senstvty analyses [] are based on local estmates of senstvty, typcally by expandng the response n a Taylor seres about some specfc values
More informationOn mutual information estimation for mixed-pair random variables
On mutual nformaton estmaton for mxed-par random varables November 3, 218 Aleksandr Beknazaryan, Xn Dang and Haln Sang 1 Department of Mathematcs, The Unversty of Msssspp, Unversty, MS 38677, USA. E-mal:
More informationSee Book Chapter 11 2 nd Edition (Chapter 10 1 st Edition)
Count Data Models See Book Chapter 11 2 nd Edton (Chapter 10 1 st Edton) Count data consst of non-negatve nteger values Examples: number of drver route changes per week, the number of trp departure changes
More information3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X
Statstcs 1: Probablty Theory II 37 3 EPECTATION OF SEVERAL RANDOM VARIABLES As n Probablty Theory I, the nterest n most stuatons les not on the actual dstrbuton of a random vector, but rather on a number
More informationBayesian analysis of Box Cox transformed linear mixed models with ARMA(p, q) dependence
Journal of Statstcal Plannng and Inference 133 (2005) 435 451 www.elsever.com/locate/jsp Bayesan analyss of Box Cox transformed lnear mxed models wth ARMA(p, q) dependence Jack C. Lee a,, Tsung I. Ln b,
More informationSTATS 306B: Unsupervised Learning Spring Lecture 10 April 30
STATS 306B: Unsupervsed Learnng Sprng 2014 Lecture 10 Aprl 30 Lecturer: Lester Mackey Scrbe: Joey Arthur, Rakesh Achanta 10.1 Factor Analyss 10.1.1 Recap Recall the factor analyss (FA) model for lnear
More informationPsychology 282 Lecture #24 Outline Regression Diagnostics: Outliers
Psychology 282 Lecture #24 Outlne Regresson Dagnostcs: Outlers In an earler lecture we studed the statstcal assumptons underlyng the regresson model, ncludng the followng ponts: Formal statement of assumptons.
More informationComputing MLE Bias Empirically
Computng MLE Bas Emprcally Kar Wa Lm Australan atonal Unversty January 3, 27 Abstract Ths note studes the bas arses from the MLE estmate of the rate parameter and the mean parameter of an exponental dstrbuton.
More informationLimited Dependent Variables
Lmted Dependent Varables. What f the left-hand sde varable s not a contnuous thng spread from mnus nfnty to plus nfnty? That s, gven a model = f (, β, ε, where a. s bounded below at zero, such as wages
More informationLOW BIAS INTEGRATED PATH ESTIMATORS. James M. Calvin
Proceedngs of the 007 Wnter Smulaton Conference S G Henderson, B Bller, M-H Hseh, J Shortle, J D Tew, and R R Barton, eds LOW BIAS INTEGRATED PATH ESTIMATORS James M Calvn Department of Computer Scence
More informationStat260: Bayesian Modeling and Inference Lecture Date: February 22, Reference Priors
Stat60: Bayesan Modelng and Inference Lecture Date: February, 00 Reference Prors Lecturer: Mchael I. Jordan Scrbe: Steven Troxler and Wayne Lee In ths lecture, we assume that θ R; n hgher-dmensons, reference
More informationInternational Journal of Industrial Engineering Computations
Internatonal Journal of Industral Engneerng Computatons 4 (03) 47 46 Contents lsts avalable at GrowngScence Internatonal Journal of Industral Engneerng Computatons homepage: www.growngscence.com/ec Modelng
More informationGaussian Mixture Models
Lab Gaussan Mxture Models Lab Objectve: Understand the formulaton of Gaussan Mxture Models (GMMs) and how to estmate GMM parameters. You ve already seen GMMs as the observaton dstrbuton n certan contnuous
More information4.3 Poisson Regression
of teratvely reweghted least squares regressons (the IRLS algorthm). We do wthout gvng further detals, but nstead focus on the practcal applcaton. > glm(survval~log(weght)+age, famly="bnomal", data=baby)
More informationComparison of the Population Variance Estimators. of 2-Parameter Exponential Distribution Based on. Multiple Criteria Decision Making Method
Appled Mathematcal Scences, Vol. 7, 0, no. 47, 07-0 HIARI Ltd, www.m-hkar.com Comparson of the Populaton Varance Estmators of -Parameter Exponental Dstrbuton Based on Multple Crtera Decson Makng Method
More informationA Matrix Variate Skew-t Distribution
A Matrx Varate Skew-t Dstrbuton Mchael P.B. Gallaugher and Paul D. McNcholas arv:73.364v3 [stat.me] 3 Apr 7 Dept. of Mathematcs & Statstcs, McMaster Unversty, Hamlton, Ontaro, Canada. Abstract Although
More informationA Robust Method for Calculating the Correlation Coefficient
A Robust Method for Calculatng the Correlaton Coeffcent E.B. Nven and C. V. Deutsch Relatonshps between prmary and secondary data are frequently quantfed usng the correlaton coeffcent; however, the tradtonal
More informationLecture Notes on Linear Regression
Lecture Notes on Lnear Regresson Feng L fl@sdueducn Shandong Unversty, Chna Lnear Regresson Problem In regresson problem, we am at predct a contnuous target value gven an nput feature vector We assume
More informationEcon107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4)
I. Classcal Assumptons Econ7 Appled Econometrcs Topc 3: Classcal Model (Studenmund, Chapter 4) We have defned OLS and studed some algebrac propertes of OLS. In ths topc we wll study statstcal propertes
More informationEM and Structure Learning
EM and Structure Learnng Le Song Machne Learnng II: Advanced Topcs CSE 8803ML, Sprng 2012 Partally observed graphcal models Mxture Models N(μ 1, Σ 1 ) Z X N N(μ 2, Σ 2 ) 2 Gaussan mxture model Consder
More informationPrimer on High-Order Moment Estimators
Prmer on Hgh-Order Moment Estmators Ton M. Whted July 2007 The Errors-n-Varables Model We wll start wth the classcal EIV for one msmeasured regressor. The general case s n Erckson and Whted Econometrc
More informationThe Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction
ECONOMICS 5* -- NOTE (Summary) ECON 5* -- NOTE The Multple Classcal Lnear Regresson Model (CLRM): Specfcaton and Assumptons. Introducton CLRM stands for the Classcal Lnear Regresson Model. The CLRM s also
More informationBAYESIAN CURVE FITTING USING PIECEWISE POLYNOMIALS. Dariusz Biskup
BAYESIAN CURVE FITTING USING PIECEWISE POLYNOMIALS Darusz Bskup 1. Introducton The paper presents a nonparaetrc procedure for estaton of an unknown functon f n the regresson odel y = f x + ε = N. (1) (
More informationInterval Estimation of Stress-Strength Reliability for a General Exponential Form Distribution with Different Unknown Parameters
Internatonal Journal of Statstcs and Probablty; Vol. 6, No. 6; November 17 ISSN 197-73 E-ISSN 197-74 Publshed by Canadan Center of Scence and Educaton Interval Estmaton of Stress-Strength Relablty for
More informationANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U)
Econ 413 Exam 13 H ANSWERS Settet er nndelt 9 deloppgaver, A,B,C, som alle anbefales å telle lkt for å gøre det ltt lettere å stå. Svar er gtt . Unfortunately, there s a prntng error n the hnt of
More informationStatistical inference for generalized Pareto distribution based on progressive Type-II censored data with random removals
Internatonal Journal of Scentfc World, 2 1) 2014) 1-9 c Scence Publshng Corporaton www.scencepubco.com/ndex.php/ijsw do: 10.14419/jsw.v21.1780 Research Paper Statstcal nference for generalzed Pareto dstrbuton
More informationUsing T.O.M to Estimate Parameter of distributions that have not Single Exponential Family
IOSR Journal of Mathematcs IOSR-JM) ISSN: 2278-5728. Volume 3, Issue 3 Sep-Oct. 202), PP 44-48 www.osrjournals.org Usng T.O.M to Estmate Parameter of dstrbutons that have not Sngle Exponental Famly Jubran
More informationHow its computed. y outcome data λ parameters hyperparameters. where P denotes the Laplace approximation. k i k k. Andrew B Lawson 2013
Andrew Lawson MUSC INLA INLA s a relatvely new tool that can be used to approxmate posteror dstrbutons n Bayesan models INLA stands for ntegrated Nested Laplace Approxmaton The approxmaton has been known
More informationMaximum Likelihood Estimation of Binary Dependent Variables Models: Probit and Logit. 1. General Formulation of Binary Dependent Variables Models
ECO 452 -- OE 4: Probt and Logt Models ECO 452 -- OE 4 Mamum Lkelhood Estmaton of Bnary Dependent Varables Models: Probt and Logt hs note demonstrates how to formulate bnary dependent varables models for
More informationComparison of Regression Lines
STATGRAPHICS Rev. 9/13/2013 Comparson of Regresson Lnes Summary... 1 Data Input... 3 Analyss Summary... 4 Plot of Ftted Model... 6 Condtonal Sums of Squares... 6 Analyss Optons... 7 Forecasts... 8 Confdence
More informationA Hybrid Variational Iteration Method for Blasius Equation
Avalable at http://pvamu.edu/aam Appl. Appl. Math. ISSN: 1932-9466 Vol. 10, Issue 1 (June 2015), pp. 223-229 Applcatons and Appled Mathematcs: An Internatonal Journal (AAM) A Hybrd Varatonal Iteraton Method
More informationA New Method for Estimating Overdispersion. David Fletcher and Peter Green Department of Mathematics and Statistics
A New Method for Estmatng Overdsperson Davd Fletcher and Peter Green Department of Mathematcs and Statstcs Byron Morgan Insttute of Mathematcs, Statstcs and Actuaral Scence Unversty of Kent, England Overvew
More informationNumerical Heat and Mass Transfer
Master degree n Mechancal Engneerng Numercal Heat and Mass Transfer 06-Fnte-Dfference Method (One-dmensonal, steady state heat conducton) Fausto Arpno f.arpno@uncas.t Introducton Why we use models and
More informationNon-Mixture Cure Model for Interval Censored Data: Simulation Study ABSTRACT
Malaysan Journal of Mathematcal Scences 8(S): 37-44 (2014) Specal Issue: Internatonal Conference on Mathematcal Scences and Statstcs 2013 (ICMSS2013) MALAYSIAN JOURNAL OF MATHEMATICAL SCIENCES Journal
More informationCS 2750 Machine Learning. Lecture 5. Density estimation. CS 2750 Machine Learning. Announcements
CS 750 Machne Learnng Lecture 5 Densty estmaton Mlos Hauskrecht mlos@cs.ptt.edu 539 Sennott Square CS 750 Machne Learnng Announcements Homework Due on Wednesday before the class Reports: hand n before
More informationRelevance Vector Machines Explained
October 19, 2010 Relevance Vector Machnes Explaned Trstan Fletcher www.cs.ucl.ac.uk/staff/t.fletcher/ Introducton Ths document has been wrtten n an attempt to make Tppng s [1] Relevance Vector Machnes
More informationExpectation Maximization Mixture Models HMMs
-755 Machne Learnng for Sgnal Processng Mture Models HMMs Class 9. 2 Sep 200 Learnng Dstrbutons for Data Problem: Gven a collecton of eamples from some data, estmate ts dstrbuton Basc deas of Mamum Lelhood
More informationThe Expectation-Maximization Algorithm
The Expectaton-Maxmaton Algorthm Charles Elan elan@cs.ucsd.edu November 16, 2007 Ths chapter explans the EM algorthm at multple levels of generalty. Secton 1 gves the standard hgh-level verson of the algorthm.
More informationNotes prepared by Prof Mrs) M.J. Gholba Class M.Sc Part(I) Information Technology
Inverse transformatons Generaton of random observatons from gven dstrbutons Assume that random numbers,,, are readly avalable, where each tself s a random varable whch s unformly dstrbuted over the range(,).
More informationAn (almost) unbiased estimator for the S-Gini index
An (almost unbased estmator for the S-Gn ndex Thomas Demuynck February 25, 2009 Abstract Ths note provdes an unbased estmator for the absolute S-Gn and an almost unbased estmator for the relatve S-Gn for
More informationDegradation Data Analysis Using Wiener Process and MCMC Approach
Engneerng Letters 5:3 EL_5_3_0 Degradaton Data Analyss Usng Wener Process and MCMC Approach Chunpng L Hubng Hao Abstract Tradtonal relablty assessment methods are based on lfetme data. However the lfetme
More informationChapter 8 Indicator Variables
Chapter 8 Indcator Varables In general, e explanatory varables n any regresson analyss are assumed to be quanttatve n nature. For example, e varables lke temperature, dstance, age etc. are quanttatve n
More informationEstimation: Part 2. Chapter GREG estimation
Chapter 9 Estmaton: Part 2 9. GREG estmaton In Chapter 8, we have seen that the regresson estmator s an effcent estmator when there s a lnear relatonshp between y and x. In ths chapter, we generalzed the
More informationAn adaptive SMC scheme for ABC. Bayesian Computation (ABC)
An adaptve SMC scheme for Approxmate Bayesan Computaton (ABC) (ont work wth Prof. Mke West) Department of Statstcal Scence - Duke Unversty Aprl/2011 Approxmate Bayesan Computaton (ABC) Problems n whch
More informationSpace of ML Problems. CSE 473: Artificial Intelligence. Parameter Estimation and Bayesian Networks. Learning Topics
/7/7 CSE 73: Artfcal Intellgence Bayesan - Learnng Deter Fox Sldes adapted from Dan Weld, Jack Breese, Dan Klen, Daphne Koller, Stuart Russell, Andrew Moore & Luke Zettlemoyer What s Beng Learned? Space
More informationISSN X Robust bayesian inference of generalized Pareto distribution
Afrka Statstka Vol. 112), 2016, pages 1061 1074. DOI: http://dx.do.org/10.16929/as/2016.1061.92 Afrka Statstka ISSN 2316-090X Robust bayesan nference of generalzed Pareto dstrbuton Fatha Mokran 1, Hocne
More informationMultivariate skew t mixture models: applications to fluorescence-activated cell sorting data
Multvarate skew t mxture models: applcatons to fluorescence-actvated cell sortng data Author Wang, Ku, Ng, Shu-Kay, J. McLachlan, Geoffrey Publshed 2009 Conference Ttle Proceedngs: 2009 Dgtal Image Computng:
More informationThe EM Algorithm (Dempster, Laird, Rubin 1977) The missing data or incomplete data setting: ODL(φ;Y ) = [Y;φ] = [Y X,φ][X φ] = X
The EM Algorthm (Dempster, Lard, Rubn 1977 The mssng data or ncomplete data settng: An Observed Data Lkelhood (ODL that s a mxture or ntegral of Complete Data Lkelhoods (CDL. (1a ODL(;Y = [Y;] = [Y,][
More informationAppendix B. The Finite Difference Scheme
140 APPENDIXES Appendx B. The Fnte Dfference Scheme In ths appendx we present numercal technques whch are used to approxmate solutons of system 3.1 3.3. A comprehensve treatment of theoretcal and mplementaton
More informationOutline for today. Markov chain Monte Carlo. Example: spatial statistics (Christensen and Waagepetersen 2001)
Markov chan Monte Carlo Rasmus Waagepetersen Department of Mathematcs Aalborg Unversty Denmark November, / Outlne for today MCMC / Condtonal smulaton for hgh-dmensonal U: Markov chan Monte Carlo Consder
More informationChapter 13: Multiple Regression
Chapter 13: Multple Regresson 13.1 Developng the multple-regresson Model The general model can be descrbed as: It smplfes for two ndependent varables: The sample ft parameter b 0, b 1, and b are used to
More informationChapter 11: Simple Linear Regression and Correlation
Chapter 11: Smple Lnear Regresson and Correlaton 11-1 Emprcal Models 11-2 Smple Lnear Regresson 11-3 Propertes of the Least Squares Estmators 11-4 Hypothess Test n Smple Lnear Regresson 11-4.1 Use of t-tests
More informationJoint Statistical Meetings - Biopharmaceutical Section
Iteratve Ch-Square Test for Equvalence of Multple Treatment Groups Te-Hua Ng*, U.S. Food and Drug Admnstraton 1401 Rockvlle Pke, #200S, HFM-217, Rockvlle, MD 20852-1448 Key Words: Equvalence Testng; Actve
More informationEfficient Algorithms for Robust Estimation in Linear Mixed-Effects Models Using the Multivariate t-distribution
Effcent Algorthms for Robust Estmaton n Lnear Mxed-Effects Models Usng the Multvarate t-dstrbuton José C. Pnhero, Chuanha Lu Bell Laboratores Lucent Technologes Murray Hll, NJ 07974 and Yngnan Wu Department
More informationClassification as a Regression Problem
Target varable y C C, C,, ; Classfcaton as a Regresson Problem { }, 3 L C K To treat classfcaton as a regresson problem we should transform the target y nto numercal values; The choce of numercal class
More informationAdvances in Longitudinal Methods in the Social and Behavioral Sciences. Finite Mixtures of Nonlinear Mixed-Effects Models.
Advances n Longtudnal Methods n the Socal and Behavoral Scences Fnte Mxtures of Nonlnear Mxed-Effects Models Jeff Harrng Department of Measurement, Statstcs and Evaluaton The Center for Integrated Latent
More informationParameters Estimation of the Modified Weibull Distribution Based on Type I Censored Samples
Appled Mathematcal Scences, Vol. 5, 011, no. 59, 899-917 Parameters Estmaton of the Modfed Webull Dstrbuton Based on Type I Censored Samples Soufane Gasm École Supereure des Scences et Technques de Tuns
More informationMIMA Group. Chapter 2 Bayesian Decision Theory. School of Computer Science and Technology, Shandong University. Xin-Shun SDU
Group M D L M Chapter Bayesan Decson heory Xn-Shun Xu @ SDU School of Computer Scence and echnology, Shandong Unversty Bayesan Decson heory Bayesan decson theory s a statstcal approach to data mnng/pattern
More informationMotion Perception Under Uncertainty. Hongjing Lu Department of Psychology University of Hong Kong
Moton Percepton Under Uncertanty Hongjng Lu Department of Psychology Unversty of Hong Kong Outlne Uncertanty n moton stmulus Correspondence problem Qualtatve fttng usng deal observer models Based on sgnal
More informationOverview. Hidden Markov Models and Gaussian Mixture Models. Acoustic Modelling. Fundamental Equation of Statistical Speech Recognition
Overvew Hdden Marov Models and Gaussan Mxture Models Steve Renals and Peter Bell Automatc Speech Recognton ASR Lectures &5 8/3 January 3 HMMs and GMMs Key models and algorthms for HMM acoustc models Gaussans
More informationAppendix B: Resampling Algorithms
407 Appendx B: Resamplng Algorthms A common problem of all partcle flters s the degeneracy of weghts, whch conssts of the unbounded ncrease of the varance of the mportance weghts ω [ ] of the partcles
More informationPHYS 450 Spring semester Lecture 02: Dealing with Experimental Uncertainties. Ron Reifenberger Birck Nanotechnology Center Purdue University
PHYS 45 Sprng semester 7 Lecture : Dealng wth Expermental Uncertantes Ron Refenberger Brck anotechnology Center Purdue Unversty Lecture Introductory Comments Expermental errors (really expermental uncertantes)
More informationx = , so that calculated
Stat 4, secton Sngle Factor ANOVA notes by Tm Plachowsk n chapter 8 we conducted hypothess tests n whch we compared a sngle sample s mean or proporton to some hypotheszed value Chapter 9 expanded ths to
More informationGeneralized Linear Methods
Generalzed Lnear Methods 1 Introducton In the Ensemble Methods the general dea s that usng a combnaton of several weak learner one could make a better learner. More formally, assume that we have a set
More informationLecture 12: Discrete Laplacian
Lecture 12: Dscrete Laplacan Scrbe: Tanye Lu Our goal s to come up wth a dscrete verson of Laplacan operator for trangulated surfaces, so that we can use t n practce to solve related problems We are mostly
More informationLinear Regression Analysis: Terminology and Notation
ECON 35* -- Secton : Basc Concepts of Regresson Analyss (Page ) Lnear Regresson Analyss: Termnology and Notaton Consder the generc verson of the smple (two-varable) lnear regresson model. It s represented
More informationEfficient nonresponse weighting adjustment using estimated response probability
Effcent nonresponse weghtng adjustment usng estmated response probablty Jae Kwang Km Department of Appled Statstcs, Yonse Unversty, Seoul, 120-749, KOREA Key Words: Regresson estmator, Propensty score,
More informationThe Geometry of Logit and Probit
The Geometry of Logt and Probt Ths short note s meant as a supplement to Chapters and 3 of Spatal Models of Parlamentary Votng and the notaton and reference to fgures n the text below s to those two chapters.
More informationLecture 6 More on Complete Randomized Block Design (RBD)
Lecture 6 More on Complete Randomzed Block Desgn (RBD) Multple test Multple test The multple comparsons or multple testng problem occurs when one consders a set of statstcal nferences smultaneously. For
More informationConvergence of random processes
DS-GA 12 Lecture notes 6 Fall 216 Convergence of random processes 1 Introducton In these notes we study convergence of dscrete random processes. Ths allows to characterze phenomena such as the law of large
More informationDifference Equations
Dfference Equatons c Jan Vrbk 1 Bascs Suppose a sequence of numbers, say a 0,a 1,a,a 3,... s defned by a certan general relatonshp between, say, three consecutve values of the sequence, e.g. a + +3a +1
More informationCHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE
CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE Analytcal soluton s usually not possble when exctaton vares arbtrarly wth tme or f the system s nonlnear. Such problems can be solved by numercal tmesteppng
More information