New mixture models and algorithms in the mixtools package
Didier Chauveau, MAPMO (UMR), Université d'Orléans. 1ères Rencontres R, BoRdeaux, juillet 2012.
Outline
1. Mixture models and EM algorithm-ology
2. Univariate (semiparametric) mixtures; Nonparametric multivariate EM algorithms
3. …
Univariate mixture example 1: Old Faithful wait times
[Histogram: time between Old Faithful eruptions, in minutes.]
- Obvious bimodality
- Normal-looking components?
- Classification of individuals?
Univariate mixture example 2: lifetime data, mixture of exponentials
[Density plot over lifetime: component 1, component 2, and mixture densities.]
- No obvious multimodality, but 2 failure sources suspected?
- True component and mixture densities for these simulated data: differences in the tails only!
Finite mixture model and missing-data setup
A complete $i$-th observation $Y_i = (X_i, Z_i)$ consists of:
- the missing 0/1 latent variable $Z_i = (Z_{i1}, \ldots, Z_{im})$, with $Z_{ij} = 1$ if $X_i$ comes from component $j$ and $0$ otherwise, and $P(Z_{ij} = 1) = \lambda_j$;
- the observed, incomplete data of interest $X_i$, with $j$-th component density $(X_i \mid Z_{ij} = 1) \sim f_j$.
Most models in mixtools share a common finite mixture pdf
$$g_\theta(x) = \sum_{j=1}^m \lambda_j f_j(x)$$
with model parameters $\theta = (\lambda, \mathbf{f}) = (\lambda_1, \ldots, \lambda_m, f_1, \ldots, f_m)$.
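As a quick illustration of this missing-data structure, here is a minimal R sketch (all parameter values are illustrative) that simulates a sample from a two-component Gaussian mixture by first drawing the latent labels $Z_i$ and then the observations $X_i$:

set.seed(1)
n      <- 500
lambda <- c(0.3, 0.7)   # mixing proportions
mu     <- c(55, 80)     # component means
sigma  <- c(5, 5)       # component standard deviations
z <- sample(1:2, n, replace = TRUE, prob = lambda)  # latent labels Z_i
x <- rnorm(n, mean = mu[z], sd = sigma[z])          # observed X_i given Z_i
hist(x, breaks = 30)    # bimodal, much like the Old Faithful example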
Some mixture estimation problems in mixtools
Goal: estimate the parameter $\theta$ given an iid sample $x$ from $g_\theta$.
Univariate cases ($x \in \mathbb{R}$):
- some parametric families: $g_\theta(x) = \sum_{j=1}^m \lambda_j f(x \mid \phi_j)$
- semiparametric location-shift mixture: $g_\theta(x) = \sum_{j=1}^m \lambda_j f(x - \mu_j)$
- mixtures of regressions, ...
Multivariate cases ($x \in \mathbb{R}^r$):
- parametric (Gaussian, ...): $g_\theta(x) = \sum_{j=1}^m \lambda_j f(x \mid \phi_j)$
- fully nonparametric, e.g. $g_\theta(x) = \sum_{j=1}^m \lambda_j \prod_{k=1}^r f_{jk}(x_k)$, assuming conditional independence of $x_1, \ldots, x_r$ given $z$.
Multivariate example: Water-level data
Example from psychometrics, Thomas, Lohaus & Brainerd (1993).
- Subjects are shown r = 8 vessel orientations, presented in this order: 11:00, 4:00, 2:00, 7:00, 10:00, 5:00, 1:00, 8:00.
- They draw the water surface for each.
- Measure = signed angle formed by the surface with the horizontal.
- Assume that opposite clock-face orientations lead to the same behavior, i.e. conditionally iid responses: (1 & 7), (2 & 8), (4 & 10), (5 & 11).
Water-level data: histograms by blocks of iid coordinates
Opposite clock-face orientations = conditionally iid responses.
[Four histograms: 1:00 and 7:00 orientations; 2:00 and 8:00 orientations; 4:00 and 10:00 orientations; 5:00 and 11:00 orientations.]
EM Algorithm (1/3), Dempster, Laird & Rubin (1977)
General context of missing data: a fraction $x$ of $y$ is observed.
MLEs on the observed data ($x$) and the complete data ($y$):
$$\hat\theta_x = \arg\max_{\theta \in \Theta} L_x(\theta), \qquad \hat\theta_y = \arg\max_{\theta \in \Theta} L^c_y(\theta),$$
where $L_x(\theta) = \sum_{i=1}^n \log g_\theta(x_i)$ and $L^c_y(\theta) = \ldots$
Intuition: often the MLE on the complete data, $\hat\theta_y$, is available, while $\hat\theta_x$ is not. Try to maximize, instead of $L^c_y(\theta)$, its expectation given $x$ for some $\theta'$:
$$Q(\theta \mid \theta') := E\left[L^c_Y(\theta) \mid x; \theta'\right]$$
EM Algorithm (2/3): general principle
$\theta^0$ = some arbitrary initialization, $\theta^t$ = current value; an EM iteration maps $\theta^t \to \theta^{t+1}$:
- Expectation step: compute $\theta \mapsto Q(\theta \mid \theta^t)$
- Maximization step: set $\theta^{t+1} = \arg\max_{\theta \in \Theta} Q(\theta \mid \theta^t)$
Why does it work under mild conditions? Ascent property for the observed log-likelihood:
$$L_x(\theta^{t+1}) \ge L_x(\theta^t)$$
Alternatives: GEM, ECM, MM, Stochastic EM, ...
EM Algorithm (3/3): parametric mixture, $f_j = f(\cdot \mid \phi_j)$
E-step: amounts to finding the posterior probabilities
$$Z^t_{ij} := E_{\theta^t}[Z_{ij} \mid x_i] = P_{\theta^t}[Z_{ij} = 1 \mid x_i] = \frac{\lambda^t_j f(x_i \mid \phi^t_j)}{\sum_{j'} \lambda^t_{j'} f(x_i \mid \phi^t_{j'})}$$
M-step: maximization looks like a weighted MLE;
$$\lambda^{t+1}_j = \frac{1}{n} \sum_{i=1}^n Z^t_{ij}$$
is generic for EM algorithms for mixtures, and, e.g., for the Gaussian case $f(x \mid \phi_j)$ = the pdf of $N(\mu_j, v_j)$,
$$\mu^{t+1}_j = \frac{\sum_{i=1}^n Z^t_{ij} x_i}{\sum_{i=1}^n Z^t_{ij}}, \qquad v^{t+1}_j = \frac{\sum_{i=1}^n Z^t_{ij} (x_i - \mu^{t+1}_j)^2}{\sum_{i=1}^n Z^t_{ij}}$$
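To make the two steps concrete, here is a short R sketch of one EM iteration for this univariate Gaussian case; it only illustrates the formulas above, not the mixtools internals (em_step is a hypothetical helper name):

em_step <- function(x, lambda, mu, v) {
  m <- length(lambda)
  # E-step: posterior probabilities Z[i, j] = P(Z_ij = 1 | x_i)
  dens <- sapply(1:m, function(j) lambda[j] * dnorm(x, mu[j], sqrt(v[j])))
  Z <- dens / rowSums(dens)
  # M-step: weighted-MLE updates of lambda, mu and v
  lambda.new <- colMeans(Z)
  mu.new <- colSums(Z * x) / colSums(Z)
  v.new  <- sapply(1:m, function(j) sum(Z[, j] * (x - mu.new[j])^2) / sum(Z[, j]))
  list(lambda = lambda.new, mu = mu.new, v = v.new, Z = Z)
}

Iterating em_step until the observed log-likelihood stabilizes reproduces the classical Gaussian EM.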
About Stochastic EM versions
In some (mixture) setups, it may be useful to simulate the latent data from the posterior probabilities:
$$\hat Z^t_i \sim \mathrm{Mult}\left(1; Z^t_{i1}, \ldots, Z^t_{im}\right), \quad i = 1, \ldots, n$$
Then:
- the complete data $(x, \hat Z^t)$ allow direct computation of the MLE;
- the sequence $(\theta^t)_{t \ge 1}$ becomes a Markov chain.
Historically, parametric Stochastic EM was introduced by Celeux & Diebolt (1985, 1986, ...); general convergence properties: Nielsen (2000).
In the nonparametric framework: Stochastic EM for reliability mixture models, Bordes & C. (2010, 2012); more on this later.
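A one-line R sketch of this stochastic step (illustrative; Z is the n x m matrix of posterior probabilities from the E-step, and draw_labels is a hypothetical helper name):

draw_labels <- function(Z) {
  # one multinomial draw per observation, returning a hard label in 1..m
  apply(Z, 1, function(p) which(rmultinom(1, size = 1, prob = p) == 1))
}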
Old Faithful data with parametric Gaussian EM
[Histogram of the time between Old Faithful eruptions (minutes), with the fitted two-component Gaussian mixture density; legend shows $\hat\lambda_1$.]
In R with mixtools, type:
R> data(faithful)
R> attach(faithful)
R> normalmixEM(waiting, mu=c(55,80), sigma=5)
number of iterations= 24
Gaussian EM result: $\hat\mu = (54.6, 80.1)$
Univariate identifiable semiparametric mixtures
One originality of mixtools: its ability to handle various nonparametric components $f_j$ in $g_\theta(x) = \sum_{j=1}^m \lambda_j f_j(x)$, for several models; Benaglia, C., Hunter & Young (J. Stat. Soft., 2009).
Location-shift semiparametric mixture model:
$$g_\theta(x) = \lambda_1 f(x - \mu_1) + (1 - \lambda_1) f(x - \mu_2)$$
This model is identifiable when $f(\cdot)$ is even: Bordes, Mottelet & Vandekerkhove (2006); Hunter, Wang & Hettmansperger (2007).
Bordes, C. & Vandekerkhove (2007) introduced an EM-like algorithm that includes a Kernel Density Estimation (KDE) step.
mixtools semiparametric EM algorithm
E-step: same as usual,
$$Z^t_{ij} := E_{\theta^t}[Z_{ij} \mid x_i] = \frac{\lambda^t_j f^t(x_i - \mu^t_j)}{\lambda^t_1 f^t(x_i - \mu^t_1) + \lambda^t_2 f^t(x_i - \mu^t_2)}$$
M-step: maximize the complete-data log-likelihood for $\lambda$ and $\mu$:
$$\lambda^{t+1}_j = \frac{1}{n} \sum_{i=1}^n Z^t_{ij}, \qquad \mu^{t+1}_j = (n \lambda^{t+1}_j)^{-1} \sum_{i=1}^n Z^t_{ij} x_i$$
Weighted KDE step: update $f^t$ (for some bandwidth $h$) by
$$f^{t+1}(u) = \frac{1}{nh} \sum_{i=1}^n \sum_{j=1}^2 Z^t_{ij} \, K\!\left(\frac{u - (x_i - \mu^{t+1}_j)}{h}\right),$$
then symmetrize.
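The following R sketch spells out one iteration of this semiparametric EM for m = 2, with the current density estimate passed as a function f and a Gaussian kernel; it is only an illustration of the three steps under these assumptions (sp_em_step is a hypothetical name), not the mixtools implementation:

sp_em_step <- function(x, lambda, mu, f, h) {
  # E-step with the current density estimate f^t
  d1 <- lambda[1] * f(x - mu[1])
  d2 <- lambda[2] * f(x - mu[2])
  Z  <- cbind(d1 / (d1 + d2), d2 / (d1 + d2))
  # M-step
  lambda.new <- colMeans(Z)
  mu.new     <- colSums(Z * x) / colSums(Z)
  # Weighted KDE step on the centred data, then symmetrization
  resid <- c(x - mu.new[1], x - mu.new[2])
  w     <- c(Z[, 1], Z[, 2]) / length(x)
  f.new <- function(u) sapply(u, function(u0) {
    k.plus  <- sum(w * dnorm((u0 - resid) / h)) / h   # estimate at u
    k.minus <- sum(w * dnorm((-u0 - resid) / h)) / h  # estimate at -u
    (k.plus + k.minus) / 2                            # symmetrized estimate
  })
  list(lambda = lambda.new, mu = mu.new, f = f.new)
}

Starting, e.g., from f = dnorm and iterating a few dozen times roughly mimics the behavior of the packaged algorithm.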
Location-shift semiparametric mixture model
$$g_\theta(x) = \lambda_1 f(x - \mu_1) + (1 - \lambda_1) f(x - \mu_2)$$
[Histogram of the Old Faithful waiting times with the Gaussian-EM and semiparametric-EM fitted densities overlaid; legends show $\hat\lambda_1$ for each fit.]
- Gaussian EM: $\hat\mu = (54.6, 80.1)$
- Semiparametric EM:
R> spEMsymloc(waiting, mu=c(55,80), h=4) # bandwidth
giving $\hat\mu = (54.7, 79.8)$
Identifiability: the blessing of dimensionality (!)
Recall the model in the multivariate case, $r > 1$:
$$g_\theta(x) = \sum_{j=1}^m \lambda_j \prod_{k=1}^r f_{jk}(x_k)$$
N.B.: assume conditional independence of $x_1, \ldots, x_r$.
- Hall & Zhou (2003) show that when $m = 2$ and $r \ge 3$, the model is identifiable under mild restrictions on the $f_{jk}(\cdot)$.
- Hall et al. (2005): "... from at least one point of view, the curse of dimensionality works in reverse."
- Allman et al. (2009) give mild sufficient conditions for identifiability whenever $r \ge 3$.
The notation gets even worse...
Motivation: the Water-level data, with 4 blocks of 2 similar responses.
Let the $r$ coordinates be grouped into $B \le r$ blocks of conditionally iid coordinates; $b_k \in \{1, \ldots, B\}$ is the block index of the $k$-th coordinate. The model becomes
$$g(x) = \sum_{j=1}^m \lambda_j \prod_{k=1}^r f_{j b_k}(x_k)$$
Special cases:
- $b_k = k$ for $k = 1, \ldots, r$: general model of conditional independence, as in Hall et al. (2005)
- $b_k = 1$ for all $k$: conditionally iid assumption (Elmore et al., 2004)
Nonparametric EM, Benaglia, C. & Hunter (2009, 2011)
E-step: same as for a genuine parametric EM,
$$Z^t_{ij} = \frac{\lambda^t_j \prod_{k=1}^r f^t_{j b_k}(x_{ik})}{\sum_{j'} \lambda^t_{j'} \prod_{k=1}^r f^t_{j' b_k}(x_{ik})}$$
M-step: maximize the complete-data log-likelihood for $\lambda$:
$$\lambda^{t+1}_j = \frac{1}{n} \sum_{i=1}^n Z^t_{ij}$$
WKDE step: update the estimate of $f_{jl}$ (component $j$, block $l$) by
$$f^{t+1}_{jl}(u) = \frac{1}{n h^{t+1}_{jl} C_l \lambda^{t+1}_j} \sum_{i=1}^n \sum_{k=1}^r Z^t_{ij} \, \mathbb{I}_{\{b_k = l\}} \, K\!\left(\frac{u - x_{ik}}{h^{t+1}_{jl}}\right)$$
where $C_l$ = number of coordinates in block $l$, and $h^{t+1}_{jl}$ is an iterative, per component and block, adaptive bandwidth.
The Water-level data
Dataset previously analysed under the conditionally iid assumption: Hettmansperger & Thomas (2000), Elmore et al. (2004). This inappropriate conditionally iid assumption masks interesting features that our model reveals.
In mixtools: the npEM algorithm.
R> library(mixtools)
R> data(Waterdata)
R> a <- npEM(Waterdata, 3, blockid=c(1,2,3,4,2,1,4,3), h=4) # user-fixed bandwidth here
R> par(mfrow=c(2,2))
R> plot(a)
R> summary(a) # some statistics per component...
The Water-level data, m = 3 components, 4 blocks
[Four density plots, one per block; each legend gives the mixing proportions and the (mean, std dev) of the three fitted components:]
- Block 1 (1:00 and 7:00 orientations): (32.1, 19.4), (3.9, 23.3), (1.4, 6.0)
- Block 2 (2:00 and 8:00 orientations): (31.4, 55.4), (11.7, 27.0), (2.7, 4.6)
- Block 3 (4:00 and 10:00 orientations): (43.6, 39.7), (11.4, 27.5), (1.0, 5.3)
- Block 4 (5:00 and 11:00 orientations): (27.5, 19.3), (2.0, 22.1), (0.1, 6.1)
From npEM to NEMS for nonparametric mixtures
Motivation: our npEM algorithms are not true EMs.
Idea: combine a regularization with the npEM approach to define an algorithm with a provable ascent property; Levine, Hunter & C. (Biometrika, 2011).
- "Nonparametric" in the smooth-EM literature refers to a continuous mixing distribution: a true EM, but with ill-posedness difficulties, Vardi et al. (1985)
- Smoothed EM (EMS): Silverman et al. (1990)
- Nonlinear EMS (NEMS), a regularization approach: Eggermont & LaRiccia (1995); Eggermont (1992, 1999)
Smoothing the mixture
Applying Eggermont's (1992, 1999) idea to the component pdfs. Nonlinear smoothing:
$$(\mathcal{N} f)(x) = \exp \int_\Omega K_h(x - u) \log f(u) \, du,$$
where $K_h(u) = h^{-r} \prod_{k=1}^r K(h^{-1} u_k)$ is a product kernel.
For $\mathbf{f} = (f_1, \ldots, f_m)$, define
$$(\mathcal{M}_\lambda \mathcal{N} \mathbf{f})(x) := \sum_{j=1}^m \lambda_j (\mathcal{N} f_j)(x)$$
Goal: minimize the infinite-sample objective function (a Kullback-Leibler divergence)
$$\ell(\theta) = \ell(\mathbf{f}, \lambda) := \int_\Omega g(x) \log \frac{g(x)}{[\mathcal{M}_\lambda \mathcal{N} \mathbf{f}](x)} \, dx.$$
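In one dimension, the nonlinear smoother can be approximated on a grid; a rough R sketch, assuming a Gaussian kernel and a simple Riemann sum (nonlinear_smooth is a hypothetical name):

nonlinear_smooth <- function(f, grid, h) {
  du <- diff(grid)[1]  # spacing of a regular grid
  sapply(grid, function(x0) {
    K <- dnorm((x0 - grid) / h) / h      # K_h(x0 - u) on the grid
    exp(sum(K * log(f(grid)) * du))      # exp of the smoothed log-density
  })
}
# e.g. g <- seq(-5, 5, length.out = 401)
#      plot(g, nonlinear_smooth(dnorm, g, h = 0.5), type = "l")

By Jensen's inequality, $(\mathcal{N}f)(x) \le (K_h * f)(x)$ pointwise, so $\mathcal{N}f$ need not integrate to one.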
Majorization-Minimization (MM) trick
Principle: let $\theta^t$ be the current value of $\theta$ in an algorithm. A function $B(\theta \mid \theta^t)$ is said to majorize $\ell(\theta)$ at $\theta^t$ provided
- $B(\theta \mid \theta^t) \ge \ell(\theta)$ for all $\theta$;
- $B(\theta^t \mid \theta^t) = \ell(\theta^t)$.
MM minimization algorithm: set $\theta^{t+1} = \arg\min_\theta B(\theta \mid \theta^t)$.
The MM algorithm satisfies the descent property $\ell(\theta^{t+1}) \le \ell(\theta^t)$, and we can define such a majorizing function $B(\theta \mid \theta^t)$ here!
MM algorithm with a descent property
Smoothed log-likelihood version for a finite sample size: given the sample $x_1, \ldots, x_n$ iid $\sim g$,
$$\ell_n(\mathbf{f}, \lambda) := -\sum_{i=1}^n \log [\mathcal{M}_\lambda \mathcal{N} \mathbf{f}](x_i)$$
The following MM algorithm satisfies the descent property
$$\ell_n(\mathbf{f}^{t+1}, \lambda^{t+1}) \le \ell_n(\mathbf{f}^t, \lambda^t)$$
Nonparametric Maximum Smoothed Likelihood (npMSL) algorithm
E-step (requires additional univariate numerical integrations):
$$w^t_{ij} = \frac{\lambda^t_j (\mathcal{N} f^t_j)(x_i)}{[\mathcal{M}_{\lambda^t} \mathcal{N} \mathbf{f}^t](x_i)} = \frac{\lambda^t_j \prod_{k=1}^r (\mathcal{N} f^t_{jk})(x_{ik})}{\sum_{j'=1}^m \lambda^t_{j'} \prod_{k=1}^r (\mathcal{N} f^t_{j'k})(x_{ik})}$$
M-step: for $j = 1, \ldots, m$,
$$\lambda^{t+1}_j = \frac{1}{n} \sum_{i=1}^n w^t_{ij}$$
WKDE step: for each component $j$ and coordinate $k$ (or block),
$$f^{t+1}_{jk}(u) = \frac{1}{n h \lambda^{t+1}_j} \sum_{i=1}^n w^t_{ij} \, K\!\left(\frac{u - x_{ik}}{h}\right)$$
The Water-level data again: npEM vs. npMSL (dotted)
[Four density plots comparing the npEM and npMSL (dotted) solutions: Block 1 (1:00 and 7:00 orientations), Block 2 (2:00 and 8:00), Block 3 (4:00 and 10:00), Block 4 (5:00 and 11:00).]
npMSL in mixtools:
R> b <- npMSL(Waterdata, 3, blockid=c(1,2,3,4,2,1,4,3), h=4) # bandwidth
R> plot(b)
Further extensions: semiparametric models
Component or block densities may differ only in location and/or scale parameters, e.g.
$$f_{jl}(x) = \frac{1}{\sigma_{jl}} f_j\!\left(\frac{x - \mu_{jl}}{\sigma_{jl}}\right) \quad\text{or}\quad f_{jl}(x) = \frac{1}{\sigma_{jl}} f_l\!\left(\frac{x - \mu_{jl}}{\sigma_{jl}}\right) \quad\text{or}\quad f_{jl}(x) = \frac{1}{\sigma_{jl}} f\!\left(\frac{x - \mu_{jl}}{\sigma_{jl}}\right),$$
where $f_j$, $f_l$, $f$ remain fully unspecified.
For all these situations, special cases of the npEM/npMSL algorithms can be designed (some are already in mixtools).
Bordes & C. (2010, 2012)
Typical data from reliability and life testing, plus a mixture model:
- lifetime data on $\mathbb{R}^+$: $(X_1, \ldots, X_n)$ iid $\sim g_\theta = \sum_j \lambda_j f_j$
- censoring times $(C_1, \ldots, C_n)$ iid $\sim q$ (unknown)
- random (right) censoring: $T_i = \min(X_i, C_i)$, $D_i = \mathbb{I}_{\{X_i \le C_i\}}$
- observed (incomplete) data: $(\mathbf{t}, \mathbf{d}) = ((t_i, d_i),\ i = 1, \ldots, n)$
A semiparametric example: the semiparametric accelerated-lifetime mixture model
$$g_\theta(x) = \lambda f(x) + (1 - \lambda)\, \xi f(\xi x)$$
Scaling model: $(X \mid Z_1 = 1)$ has the distribution of $U \sim f$, and $(X \mid Z_2 = 1) \sim U/\xi$; identifiable under some restrictions.
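A small R sketch of this censoring mechanism, simulating from the scale mixture above with lognormal components (all parameter values are illustrative):

set.seed(2)
n      <- 300
lambda <- 0.4
xi     <- 0.1                              # scaling parameter
z <- rbinom(n, 1, lambda)                  # latent component labels (1 = comp. 1)
u <- rlnorm(n, meanlog = 0, sdlog = 0.5)   # draws from f
x <- ifelse(z == 1, u, u / xi)             # component-2 lifetimes scaled by 1/xi
cens <- rexp(n, rate = 0.01)               # censoring times from an unknown q
t <- pmin(x, cens)                         # observed times T_i
d <- as.numeric(x <= cens)                 # event indicators D_i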
Semiparametric Stochastic EM for censored data (1)
Denote by $F$ the cdf of $f$, $S = 1 - F$ the survival function, and $\alpha = f/S$ the hazard rate.
Building blocks:
- For a single censored sample $(\mathbf{t}, \mathbf{d})$, $S$ and $\alpha$ can be estimated nonparametrically using the Kaplan-Meier and Nelson-Aalen estimators.
- $E(X \mid Z_j = 1) = \int_0^{+\infty} S_j(s)\, ds$, where $S_j(s) = P(X > s \mid Z_j = 1)$ is the $j$-th component survival function.
- In this scaling model, $\xi = \dfrac{E(X \mid Z_1 = 1)}{E(X \mid Z_2 = 1)}$.
- If $X$ comes from component 2, then $\xi X \sim f$, so that $\{X_i : Z_{i1} = 1\} \cup \{\xi X_i : Z_{i2} = 1\}$ is iid $\sim f$.
- Estimates $\hat\xi$ and $\hat f = \hat\alpha \hat S$ are available in the mixture setup if $\mathbf{Z}$ is observed.
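These building blocks are directly available in R through the survival package; for instance, a Kaplan-Meier fit and a rectangle-rule approximation of $E(X)$ as the integral of the survival curve (a sketch only, using the t and d simulated above, and valid up to the largest observed time):

library(survival)
km <- survfit(Surv(t, d) ~ 1)          # Kaplan-Meier estimate of S
# E(X) ~ integral of S(s) ds, approximated between consecutive jump times:
mean.hat <- sum(diff(c(0, km$time)) * c(1, head(km$surv, -1)))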
Semiparametric St-EM for censored data (2)
A stochastic version is required here!
E-step: if $d_i = 0$ (censored lifetime),
$$Z^t_{i1} = \frac{\lambda^t S^t(t_i)}{\lambda^t S^t(t_i) + (1 - \lambda^t) S^t(\xi^t t_i)};$$
else ($d_i = 1$, observed lifetime),
$$Z^t_{i1} = \frac{\lambda^t \alpha^t(t_i) S^t(t_i)}{\lambda^t \alpha^t(t_i) S^t(t_i) + (1 - \lambda^t)\, \xi^t \alpha^t(\xi^t t_i) S^t(\xi^t t_i)}$$
S-step: simulate $\hat Z^t_{i1} \sim \mathcal{B}(Z^t_{i1})$, $i = 1, \ldots, n$, and set
$$\chi^t_j = \{i \in \{1, \ldots, n\} : \hat Z^t_{ij} = 1\}, \quad j = 1, 2$$
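The S-step is a single Bernoulli draw per individual; in R (illustrative, with Z1 denoting the vector of posterior probabilities $Z^t_{i1}$ computed in the E-step):

zhat1 <- rbinom(length(Z1), size = 1, prob = Z1)  # simulated latent labels
chi1  <- which(zhat1 == 1)                        # indices assigned to component 1
chi2  <- which(zhat1 == 0)                        # indices assigned to component 2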
Semiparametric St-EM for censored data (3)
M-step for $\lambda, \xi$:
$$\lambda^{t+1} = \frac{\mathrm{Card}(\chi^t_1)}{n}, \qquad \xi^{t+1} = \frac{\int_0^{\tau^t_1} \hat S^t_1(s)\, ds}{\int_0^{\tau^t_2} \hat S^t_2(s)\, ds}, \qquad \tau^t_j = \max_{i \in \chi^t_j} t_i,$$
where $\hat S^t_j$ is the Kaplan-Meier estimate based on $\{(t_i, d_i); i \in \chi^t_j\}$.
Nonparametric step for $\alpha, S$: let $\mathbf{t}^t$ be the order statistics of $\{t_i; i \in \chi^t_1\} \cup \{\xi^t t_i; i \in \chi^t_2\}$ and $\mathbf{d}^t$ the associated censoring indicators; then
$$\alpha^{t+1}(s) = \frac{1}{h} \sum_{i=1}^n K\!\left(\frac{s - t^t_{(i)}}{h}\right) \frac{d^t_{(i)}}{n - i + 1}, \qquad S^{t+1}(s) = \prod_{i : t^t_{(i)} \le s} \left(1 - \frac{d^t_{(i)}}{n - i + 1}\right)$$
for a kernel $K$ and bandwidth $h$.
Simulated example: scale mixture of lognormals
$n = 300$, 15% censored.
[Plots across the St-EM iterations: scaling parameter and component weight; survival function estimate vs. the true survival; density estimate vs. the true density.]
In mixtools: semiparametric Reliability Mixture Model Stochastic EM.
R> library(mixtools)
R> library(survival) # for KM estimation
# simulating data...
R> s <- sprmmsem(t, d, scaling=0.1)
R> plot(s)
Other new models/algorithms available (1)
- Parametric (exponential, Weibull, lognormal, ...) reliability mixture models for censored data.
- Gaussian mixtures with linear constraints on the parameters, e.g. for $p \le m$ and known $M$, $C$:
$$\mu = \begin{pmatrix} \mu_1 \\ \vdots \\ \mu_m \end{pmatrix} = M\beta + C = M \begin{pmatrix} \beta_1 \\ \vdots \\ \beta_p \end{pmatrix} + \begin{pmatrix} C_1 \\ \vdots \\ C_m \end{pmatrix}$$
  and similar constraints on the variances; requires ECM and/or MM algorithms, C. & Hunter (2011).
- Advances in mixtures of regressions: predictor-dependent mixing proportions, Young & Hunter (CSDA, 2010).
Other new models (2): mixtures in FDR estimation
In multiple testing (e.g., micro-arrays) we consider the p-values $\mathbf{p} = (p_1, \ldots, p_n)$ of $n$ multiple tests.
Question: the False Discovery Rate (FDR) = expected proportion of falsely rejected $H_0$'s.
- Mixture modelling: latent variable $Z_i = 1$ if $H_0$ is true and $Z_i = 0$ if $H_0$ is rejected (the interesting case):
$$g_\theta(p) = \lambda_0 f_0(p) + (1 - \lambda_0) f_1(p)$$
- Theoretically $(p_i \mid H_0 \text{ true}) \sim \mathcal{U}_{[0,1]}$, so $f_0$ is known!
- Nonparametric assumption for $f_1$, as e.g. in Robin et al. or the fdrtool package, Strimmer (2008).
- In mixtools: a semiparametric EM with one component known, for FDR estimation, C. & Saby (2011, 2012).
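Since $f_0 = 1$ on $[0,1]$, the posterior probability that $H_0$ is true given $p_i$ is $\lambda_0 / g(p_i)$, and the FDR of a rejection region $\{p \le s\}$ can be estimated by averaging these posteriors over the rejected tests. A rough R sketch under these assumptions (fdr.hat is a hypothetical name; g is an estimate of the mixture density of the p-values; this is not the mixtools implementation):

fdr.hat <- function(p, lambda0, g, s) {
  post0 <- lambda0 / g(p)    # P(H_0 true | p_i), using f_0 = 1 on [0,1]
  mean(post0[p <= s])        # estimated FDR at threshold s
}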
The end
Thank you for your attention!
For Further Reading
- Benaglia, T., Chauveau, D., and Hunter, D. R. An EM-like algorithm for semi- and non-parametric estimation in multivariate mixtures. J. Comput. Graph. Statist., 18(2), 2009.
- Benaglia, T., Chauveau, D., Hunter, D. R., and Young, D. S. mixtools: An R package for analyzing mixture models. Journal of Statistical Software, 32: 1-29, 2009.
- Bordes, L., Chauveau, D., and Vandekerkhove, P. A stochastic EM algorithm for a semiparametric mixture model. Comput. Statist. & Data Anal., 51(11), 2007.
- Bordes, L. and Chauveau, D. EM and Stochastic EM algorithms for reliability mixture models under random censoring. Preprint HAL, 2012.
- Levine, M., Hunter, D. R., and Chauveau, D. Maximum smoothed likelihood for multivariate mixtures. Biometrika, 98(2), 2011.
- Allman, E. S., Matias, C., and Rhodes, J. A. Identifiability of parameters in latent structure models with many observed variables. Ann. Statist., 37(6A), 2009.
- Nielsen, S. F. The stochastic EM algorithm: estimation and asymptotic results. Bernoulli, 6(3), 2000.
- Young, D. and Hunter, D. Mixtures of regressions with predictor-dependent mixing proportions. Computational Statistics and Data Analysis, 54, 2010.