Bayesian Nonparametric Regression through Mixture Models
Sara Wade, Bocconi University
Advisor: Sonia Petrone
October 7, 2013
Outline
1 Introduction
2 Enriched Dirichlet Process
3 EDP Mixtures for Regression
4 Restricted DP Mixtures for Regression
5 Normalized Covariate Dependent Weights
6 Discussion
Introduction

Aim: flexible regression, where both the mean and the error distribution may evolve flexibly with x.

Many approaches exist for flexible regression (e.g. splines, wavelets, Gaussian processes), but they typically assume i.i.d. Gaussian errors:
Y_i = m(x_i) + ε_i,  ε_i | σ² ~ iid N(0, σ²).

For i.i.d. data, mixtures are a useful tool for flexible density estimation.

Idea: extend mixture models to conditional density estimation.
Mixture Models

Given f_P, assume the Y_i are i.i.d. with density
f_P(y) = ∫ K(y | θ) dP(θ),
for some parametric density K(y | θ).

In the Bayesian setting, define a prior on P, where typically P is discrete a.s.:
P = Σ_{j=1}^J w_j δ_{θ_j},  f_P(y) = Σ_{j=1}^J w_j K(y | θ_j).

How many components?
- Fixed J: overfitted mixtures (Rousseau and Mengersen (2011)) or post-processing techniques.
- Random J: reversible jump MCMC.
- J = ∞: Dirichlet process (DP) mixtures and developments in BNP.
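As a concrete illustration of the J = ∞ case, here is a minimal Python sketch (not from the thesis) that draws from a truncated stick-breaking representation of a DP mixture; the N(θ_j, 1) kernel, the N(0, 3²) base measure, and the truncation level J = 50 are illustrative choices.

```python
import random

def stick_breaking(alpha, J, rng):
    """Truncated stick-breaking: v_j ~ Beta(1, alpha), w_j = v_j * prod_{j'<j}(1 - v_{j'})."""
    w, remaining = [], 1.0
    for _ in range(J):
        v = rng.betavariate(1.0, alpha)
        w.append(v * remaining)
        remaining *= 1.0 - v
    return w

def sample_dp_normal_mixture(n, alpha=1.0, J=50, seed=0):
    """Draw n points from a truncated DP mixture with N(theta_j, 1) kernels
    and atoms theta_j ~ P0 = N(0, 3^2) (illustrative choices)."""
    rng = random.Random(seed)
    w = stick_breaking(alpha, J, rng)
    theta = [rng.gauss(0.0, 3.0) for _ in range(J)]
    total = sum(w)  # renormalize since truncation leaves a small leftover stick
    y = []
    for _ in range(n):
        u, acc, j = rng.random() * total, 0.0, 0
        for j, wj in enumerate(w):
            acc += wj
            if u <= acc:
                break
        y.append(rng.gauss(theta[j], 1.0))
    return y, w

y, w = sample_dp_normal_mixture(500)
```

Increasing α spreads the mass over more sticks, i.e. more occupied components in a sample of fixed size.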
Extending Mixture Models

Two main lines of research:

Joint approach: augment the observations to include X and assume that, given f_P, the (Y_i, X_i) are i.i.d. with density
f_P(y, x) = ∫ K(y, x | θ) dP(θ) = Σ_{j=1}^∞ w_j K(y, x | θ_j).
Conditional density estimates are obtained as a by-product through
f_P(y | x) = Σ_{j=1}^∞ w_j K(y, x | θ_j) / Σ_{j'=1}^∞ w_{j'} K(x | θ_{j'}).

Conditional approach: allow the mixing measure to depend on x and assign a prior so that {P_x}_{x ∈ X} are dependent. Assume that, given f_{P_x} and x, the Y_i are independent with density
f_{P_x}(y | x) = ∫ K(y | θ) dP_x(θ) = Σ_{j=1}^∞ w_j(x) K(y | θ_j(x)).
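The by-product formula in the joint approach can be checked numerically. Below is a small sketch with a two-component Gaussian joint mixture (all numbers illustrative, not from the thesis): the conditional density is the joint mixture divided by the x-marginal, and it integrates to one in y.

```python
import math

def norm_pdf(z, mu, sd):
    return math.exp(-0.5 * ((z - mu) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

# Toy 2-component joint mixture: K(y, x | theta_j) = N(y | m_j, 1) N(x | c_j, 1),
# with weights w_j (all values illustrative).
w = [0.4, 0.6]
m = [-2.0, 2.0]   # y-kernel means
c = [0.0, 3.0]    # x-kernel means

def f_joint(y, x):
    return sum(wj * norm_pdf(y, mj, 1.0) * norm_pdf(x, cj, 1.0)
               for wj, mj, cj in zip(w, m, c))

def f_cond(y, x):
    # f(y | x) = sum_j w_j K(y, x | theta_j) / sum_j' w_j' K(x | theta_j')
    marg_x = sum(wj * norm_pdf(x, cj, 1.0) for wj, cj in zip(w, c))
    return f_joint(y, x) / marg_x
```

A numerical integral of f_cond over a wide y-grid, at any fixed x, should come out close to 1.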
Mixture Models for Regression

Many proposals in the literature fall into these two general categories:

1. Joint approach: Müller, Erkanli, and West (1996), Shahbaba and Neal (2009), Hannah, Blei, and Powell (2011), Park and Dunson (2010), Müller and Quintana (2010), Bhattacharya, Page, and Dunson (2012), Quintana (2011).

2. Conditional approach:
a. Covariate-dependent atoms: MacEachern (1999; 2000; 2001), De Iorio et al. (2004), Gelfand, Kottas, and MacEachern (2005), Rodriguez and ter Horst (2008), De la Cruz, Quintana, and Müller (2007), De Iorio et al. (2009), Jara et al. (2010), ...
Example: f_{P_x}(y | x) = Σ_{j=1}^∞ w_j N(y | μ_j(x), σ_j²).
b. Covariate-dependent weights: Griffin and Steel (2006; 2010), Dunson and Park (2008), Ren et al. (2011), Chung and Dunson (2009), Rodriguez and Dunson (2011), ... (also models based on covariate-dependent urn schemes or partitions).
Example: f_{P_x}(y | x) = Σ_{j=1}^∞ w_j(x) N(y | xβ_j, σ_j²).
Mixture Models for Regression

How to choose among the large number of proposals for the application at hand?
- Computational reasons.
- Frequentist properties: consistency (Hannah, Blei, and Powell (2011), Norets and Pelenis (2012), Pati, Dunson, and Tokdar (2012)), but this may hide what happens in finite samples. Convergence rates?

Our aim is to answer this question from a Bayesian perspective. We do so through a detailed study of predictive performance based on finite samples, identifying potential sources of improvement and developing novel models to improve predictions.
Motivating Application

Motivating application: a study of Alzheimer's Disease (AD) based on neuroimaging data. A definite diagnosis is not known until autopsy, and current diagnosis is based on clinical information.

Aim: show that neuroimages can be used to improve diagnostic accuracy and may be better outcome measures for clinical trials, particularly in early stages of the disease.

Prior knowledge, the need for dimension reduction, and non-homogeneity ⇒ a BNP approach.
Enriched Dirichlet Process
EDP (with S. Mongelluzzo and S. Petrone)

In many applications, the DP is used as a prior for a multivariate random probability measure. The DP has many desirable properties (easy elicitation of the parameters α, P_0; conjugacy; large support).

However, for random probability measures on R^k, a DP prior can be quite restrictive, in the sense that the variability is determined by a single parameter, α.

Enriched conjugate priors (Consonni and Veronese (2001)) have been proposed to address an analogous lack of flexibility of standard conjugate priors in a parametric setting.

Solution: construct an enriched DP that is more flexible, yet still conjugate, by extending the idea of enriched conjugate priors to the nonparametric setting.
EDP: Definition

Sample space: Θ × Ψ, where Θ and Ψ are complete separable metric spaces.

Enriched Dirichlet Process: let P_0 be a probability measure on Θ × Ψ, α_θ > 0, and α_ψ(θ) > 0 for all θ ∈ Θ. Assume:
1. P_θ ~ DP(α_θ P_{0,θ}).
2. For every θ ∈ Θ, P_{ψ|θ}(· | θ) ~ DP(α_ψ(θ) P_{0,ψ|θ}(· | θ)).
3. The P_{ψ|θ}(· | θ), θ ∈ Θ, are independent among themselves.
4. P_θ is independent of {P_{ψ|θ}(· | θ)}_{θ ∈ Θ}.

The law of the random joint measure is obtained from the joint law of the marginal and conditionals through the mapping
(P_θ, P_{ψ|θ}) ↦ ∫ P_{ψ|θ}(· | θ) dP_θ(θ).

Many more precision parameters, yet conjugacy and many other desirable properties are maintained.
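The marginal-times-conditionals construction can be sketched with truncated stick-breaking: one DP for the θ-marginal, and an independent DP over ψ within each θ atom. This is a hedged illustration, not the thesis code; α_ψ is held constant here (the definition allows α_ψ(θ) to vary with θ), and the N(0, 1) base measures and truncation levels are arbitrary.

```python
import random

def stick_breaking(alpha, J, rng):
    # v_j ~ Beta(1, alpha); w_j = v_j * prod_{j'<j}(1 - v_{j'})
    w, remaining = [], 1.0
    for _ in range(J):
        v = rng.betavariate(1.0, alpha)
        w.append(v * remaining)
        remaining *= 1.0 - v
    return w

def sample_edp(alpha_theta=1.0, alpha_psi=1.0, J=20, L=20, seed=1):
    """Truncated sketch of one EDP draw: the joint atom (theta_j, psi_{j,l})
    receives weight w_theta[j] * w_psi[l], mirroring the mapping
    (P_theta, P_{psi|theta}) -> integral of P_{psi|theta} dP_theta."""
    rng = random.Random(seed)
    w_theta = stick_breaking(alpha_theta, J, rng)
    atoms = []
    for j in range(J):
        theta_j = rng.gauss(0.0, 1.0)              # atom from P0_theta (illustrative)
        w_psi = stick_breaking(alpha_psi, L, rng)  # conditional DP, independent across j
        for l in range(L):
            psi_jl = rng.gauss(0.0, 1.0)           # atom from P0_{psi|theta} (illustrative)
            atoms.append((w_theta[j] * w_psi[l], theta_j, psi_jl))
    return atoms

atoms = sample_edp()
```

Note how all atoms sharing the same θ_j form a "local" cluster in ψ, which is exactly the local clustering exploited later in the regression model.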
EDP Mixtures for Regression
Summary

DP mixture models are popular tools in BNP, with known properties and an analytically computable urn scheme. However, as p increases, x tends to dominate the posterior of the random partition, negatively affecting prediction.

Solution: replace the DP with the EDP, to allow local clustering, yet maintain an analytically computable urn scheme.
Joint DP Mixture Model

Model (Müller et al. (1996)):
f_P(y, x) = ∫ N(y, x | μ, Σ) dP(μ, Σ),  P ~ DP(α P_0),
where P_0 is the conjugate normal-inverse Wishart.

Problems: requires sampling the full (p+1) × (p+1) covariance matrix, and the inverse Wishart is poorly parametrized.

Model (Shahbaba and Neal (2009)):
f_P(y, x) = ∫ N(y | xβ, σ²_{y|x}) ∏_{h=1}^p N(x_h | μ_h, σ²_{x,h}) dP(θ, ψ),  P ~ DP(α P_{0,θ} × P_{0,ψ}),
where θ = (β, σ²_{y|x}), ψ = (μ, σ²_x), and P_{0,θ}, P_{0,ψ} are the conjugate normal-inverse gamma priors.

Improvements: 1) eases computations, 2) more flexible prior for the variances, 3) flexible local correlations between Y and X, and 4) easy incorporation of other types of Y and X.
Examining the Partition

P is discrete a.s. ⇒ a random partition of the subjects.

Notation: ρ_n = (s_1, ..., s_n) (partition); (θ*_j, ψ*_j) (unique parameters); x*_j, y*_j (cluster-specific observations).

Partition posterior:
p(ρ_n | x_{1:n}, y_{1:n}) ∝ α^k ∏_{j=1}^k Γ(n_j) [∏_{h=1}^p g_{x,h}(x*_{j,h})] g_y(y*_j | x*_j),
where, for example, g_y(y*_j | x*_j) = ∫ ∏_{i: s_i = j} K(y_i | x_i, θ) dP_{0,θ}(θ).

As p increases, x dominates the partition structure:
- x exhibits increasing departures from the kernel;
- large k and small n_1, ..., n_k have high posterior probability.
Examining the Prediction

How does this affect prediction?
E[Y_{n+1} | y_{1:n}, x_{1:n+1}] = ∫ [ w_{k+1}(x_{n+1}) x_{n+1} β_0 + Σ_{j=1}^k w_j(x_{n+1}) x_{n+1} β*_j ] dP(ρ_n, θ, ψ | y_{1:n}, x_{1:n+1}),
where
w_j(x_{n+1}) = c^{-1} n_j ∏_{h=1}^p N(x_{n+1,h} | μ*_{j,h}, σ*²_{x,j,h}) for j = 1, ..., k,
w_{k+1}(x_{n+1}) = c^{-1} α g_x(x_{n+1}).

As p increases:
- given the partition, predictive estimates are an average over the large number of components;
- within a component, predictive estimates are unreliable and variable due to small sample sizes;
- the measure of similarity in the covariate space is rigid.
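The weights above are easy to compute once the partition is fixed: each existing cluster is weighted by its size times the product-kernel density of the new covariate value, and the "new cluster" term by α times the prior marginal g_x. A small sketch (all numbers illustrative, not the thesis code):

```python
import math

def norm_pdf(z, mu, sd):
    return math.exp(-0.5 * ((z - mu) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def predictive_weights(x_new, clusters, alpha, g_x):
    """Normalized predictive weights for the joint-DP model, conditional on
    the partition: w_j(x) prop. to n_j * prod_h N(x_h | mu_jh, sd_jh);
    w_{k+1}(x) prop. to alpha * g_x(x).  `clusters` is a list of
    (n_j, means, sds); g_x is the prior marginal of x (illustrative)."""
    raw = []
    for n_j, mus, sds in clusters:
        lik = 1.0
        for xh, mu, sd in zip(x_new, mus, sds):
            lik *= norm_pdf(xh, mu, sd)
        raw.append(n_j * lik)
    raw.append(alpha * g_x(x_new))          # weight on a brand-new cluster
    c = sum(raw)
    return [r / c for r in raw]

clusters = [(30, [0.0, 0.0], [1.0, 1.0]), (10, [3.0, 3.0], [1.0, 1.0])]
g_x = lambda x: math.prod(norm_pdf(xh, 0.0, 2.0) for xh in x)
w = predictive_weights([0.2, -0.1], clusters, alpha=1.0, g_x=g_x)
```

With a product of p kernels per cluster, a single mismatched coordinate can crush a cluster's weight, which is the rigidity the slide refers to.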
An EDP Mixture Model for Regression (with S. Petrone)

Model:
f_P(y, x) = ∫ N(y | xβ, σ²_{y|x}) ∏_{h=1}^p N(x_h | μ_h, σ²_{x,h}) dP(θ, ψ),  P ~ EDP(α_θ, α_ψ(·), P_{0,θ} × P_{0,ψ}).

This defines a nested partition structure ρ_n = (ρ_{n,y}, (ρ_{n_{j+},x})), where k is the number of y-clusters, of sizes n_{j+}, and k_j is the number of x-clusters within the j-th y-cluster, of sizes n_{j,l}.

Additional notation: (θ*_j, ψ*_{j,l}) (unique parameters); (y*_j, x*_{j,l}) (cluster-specific observations).
Examining the Partition

Under the assumption that α_ψ(θ) = α_ψ for all θ ∈ Θ,
p(ρ_n | x_{1:n}, y_{1:n}) ∝ α_θ^k ∏_{j=1}^k [ α_ψ^{k_j} (Γ(α_ψ) Γ(n_{j+}) / Γ(α_ψ + n_{j+})) g_y(y*_j | x*_j) ∏_{l=1}^{k_j} Γ(n_{j,l}) ∏_{h=1}^p g_x(x*_{j,l,h}) ].

The likelihood of y determines whether a coarser y-partition is present ⇒ smaller k and larger n_{1+}, ..., n_{k+} with high posterior probability.
Examining the Prediction

Prediction:
E[Y_{n+1} | y_{1:n}, x_{1:n+1}] = ∫ [ w_{k+1}(x_{n+1}) x_{n+1} β_0 + Σ_{j=1}^k w_j(x_{n+1}) x_{n+1} β*_j ] dP(ρ_n, θ, ψ | y_{1:n}, x_{1:n+1}),
where
w_j(x_{n+1}) = c^{-1} n_{j+} ( (α_x / (α_x + n_{j+})) g_x(x_{n+1}) + Σ_{l=1}^{k_j} (n_{j,l} / (α_x + n_{j+})) ∏_{h=1}^p N(x_{n+1,h} | μ*_{l|j,h}, σ*²_{x,l|j,h}) ),
w_{k+1}(x_{n+1}) = c^{-1} α_y g_x(x_{n+1}).

Now:
- given the partition, predictive estimates are an average over a smaller number of components;
- within a component, predictive estimates are based on larger sample sizes;
- the measure of similarity in the covariate space is flexible.

Summary: more reliable and less variable estimates, yet computations remain relatively simple due to the analytically computable urn scheme.
Application to AD Data

Aim: diagnosis of AD.
Data: Y, a binary indicator of cognitively normal (CN = 1) or AD (= 0); X, summaries of structural Magnetic Resonance Images (sMRI) (p = 15).

Model:
Y_i | x_i, β_i ~ind Bern(Φ(x_i β_i / σ_{y,i})),
X_i | μ_i, σ²_{x,i} ~ind ∏_{h=1}^p N(μ_{i,h}, σ²_{x,i,h}),
(β_i, μ_i, σ²_{x,i}) | P ~iid P,  P ~ Q,
where Q is a DP or EDP with the same base measure and gamma hyperpriors on the precision parameters.
Results for n = 185

Posterior mode of k: 16 (DP) vs. 3 (EDP).

[Figure: cluster-specific fits for the covariates ICV, BV, LHV, RHV under (a) the DP and (b) the EDP.]

Prediction for m = 192 subjects: number of subjects correctly classified: 159 (DP) vs. 168 (EDP), with tighter credible intervals.
Restricted DP Mixtures for Regression
Introduction

Aim: analyse the role of the partition in prediction for BNP mixtures and highlight an understated problem: the huge dimension of the partition space.

Assume both Y and X are univariate and continuous.

The number of ways to partition the n subjects into k groups, and in total:
S_{n,k} = (1/k!) Σ_{j=0}^k (−1)^j (k choose j) (k−j)^n,  B_n = Σ_{k=1}^n S_{n,k}.

Many of these partitions are similar ⇒ a large number of partitions will fit the data.
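These counts can be computed directly from the displayed formula. A short stdlib-only check, confirming for instance the value B_10 = 115,975 quoted later in the slides:

```python
import math

def stirling2(n, k):
    """Stirling number of the second kind via the inclusion-exclusion formula
    S(n,k) = (1/k!) * sum_{j=0}^{k} (-1)^j C(k,j) (k-j)^n."""
    total = sum((-1) ** j * math.comb(k, j) * (k - j) ** n for j in range(k + 1))
    return total // math.factorial(k)  # the sum is always divisible by k!

def bell(n):
    """Bell number: total number of partitions of n subjects."""
    return sum(stirling2(n, k) for k in range(1, n + 1))

print(bell(10))        # 115975 partitions of just 10 subjects
print(2 ** (10 - 1))   # 512, the count under the ordering restriction below
```

The growth is super-exponential in n, which is exactly why the posterior over partitions spreads over astronomically many configurations.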
Introducing Covariate Information

Covariates typically provide information on the partition structure. Incorporating this covariate information (as in models with w_j(x)) will improve posterior inference on the partition. But the posterior is still diluted ⇒ this information needs to be strictly enforced.

If the aim is estimation of the regression function, desirable partitions satisfy an ordering constraint in the (x_i) ⇒ this reduces the total number of partitions to 2^{n−1}.

Note: if n = 10, B_10 = 115,975 while 2^9 = 512, i.e. 0.44% of the total partitions; and if n = 100, the ratio 2^{99}/B_{100} is vanishingly small.
Restricted DPM (with S.G. Walker and S. Petrone)

Let x_(1), ..., x_(n) denote the ordered values of x, and s_(1), ..., s_(n) the correspondingly ordered values of s_1, ..., s_n.

Prior for the random partition:
p(ρ_n | x) = (Γ(α) Γ(n+1) / Γ(α+n)) (α^k / k!) ∏_{j=1}^k (1/n_j) I(s_(1) ≤ ... ≤ s_(n)).

This introduces covariate dependency, greatly reduces the total number of partitions, improves prediction, and speeds up computations.
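The prior above is simple enough to evaluate directly. A sketch (taking the slide's normalizing constant at face value; labels and α below are illustrative) that returns the log prior mass of a labelled partition in x-order, or −∞ when the ordering constraint is violated:

```python
import math

def rdpm_log_prior(s_ordered, alpha):
    """Log of p(rho_n | x) = [Gamma(alpha)Gamma(n+1)/Gamma(alpha+n)]
    * (alpha^k / k!) * prod_j (1/n_j), supported only on partitions whose
    labels are non-decreasing in the order of x."""
    n = len(s_ordered)
    if any(s_ordered[i] > s_ordered[i + 1] for i in range(n - 1)):
        return float("-inf")  # violates the ordering constraint
    labels = sorted(set(s_ordered))
    k = len(labels)
    sizes = [s_ordered.count(lab) for lab in labels]
    return (math.lgamma(alpha) + math.lgamma(n + 1) - math.lgamma(alpha + n)
            + k * math.log(alpha) - math.lgamma(k + 1)
            - sum(math.log(nj) for nj in sizes))
```

For small n one can enumerate all 2^{n−1} order-respecting partitions and check that the masses sum to one.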
Simulated Example

Data are simulated from:
y_i | x_i = { x_i/8 + 5 if x_i ≤ 6;  2x_i − 12 if x_i > 6 },
with x_i = (0, 0.25, 0.5, ..., 8.75, 9).
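For reproducibility, the design can be sketched as below; the slide states the piecewise mean and the grid of design points but not the error distribution, so the noise (e.g. Gaussian) is left as an assumption for the reader to add.

```python
def true_mean(x):
    """Piecewise regression function from the simulated example."""
    return x / 8.0 + 5.0 if x <= 6 else 2.0 * x - 12.0

xs = [0.25 * i for i in range(37)]   # x = 0, 0.25, ..., 9
ys = [true_mean(x) for x in xs]      # noise-free version of the data
```

Note the jump at x = 6 (from 5.75 down to 0), which is what makes the ordering-constrained partition prior a natural fit here.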
Simulated Example: 3 Partitions with Highest Posterior Mass

[Figure: the three partitions with highest posterior mass, with their posterior probabilities, under (a) the DPM, (b) the joint DPM, and (c) the restricted DPM.]

Number of partitions with positive posterior probability: 1,263, 918, and 43, respectively.
Simulated Example: Prediction

[Figure: predictions under (a) the DPM, (b) the joint DPM, and (c) the restricted DPM, for x = (3.3, 5.9, 6.2, 6.3, 7.9, 8.1, 10).]

The empirical L² prediction error,
( (1/m) Σ_{j=1}^m (ŷ_{m,j,est} − ŷ_{m,j,true})² )^{1/2},
for the three models is 2.19, 1.66, and 1.09, respectively.
Normalized Covariate Dependent Weights
Introduction

BNP mixture models with covariate-dependent weights:
f_{P_x}(y | x) = ∫ K(y | x, θ) dP_x(θ),  P_x = Σ_{j=1}^∞ w_j(x) δ_{θ_j}.

Construction of covariate-dependent weights in the literature:
w_j(x) = v_j(x) ∏_{j' < j} (1 − v_{j'}(x)),
where the v_j(·) are independent stochastic processes.

Drawbacks:
- non-interpretable weight functions, typically exhibiting peculiarities;
- the choice of functional shapes and hyperparameters;
- extensions to continuous and discrete covariates are not straightforward.

Aim: construct interpretable weights that can handle continuous and discrete covariates, with feasible posterior inference.
Normalized Weights (with S.G. Walker and I. Antoniano)

Solution: since w_j(x) represents the weight assigned to regression model j at covariate value x, an interpretable construction is obtained through Bayes' theorem by combining:
- P(A | j) = ∫_A K(x | ψ_j) dx, the probability that an observation from regression model j has a covariate value in the region A ⊆ X;
- P(j) = w_j, the prior probability that an observation comes from model j.

This gives
w_j(x) = w_j K(x | ψ_j) / Σ_{j'=1}^∞ w_{j'} K(x | ψ_{j'}).

Inclusion of continuous and discrete covariates is straightforward via an appropriate definition of K(x | ψ).
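The Bayes-theorem construction is one line of code once the model is truncated. A hedged sketch with a Gaussian covariate kernel and three components (all numbers illustrative):

```python
import math

def norm_pdf(z, mu, sd):
    return math.exp(-0.5 * ((z - mu) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def normalized_weights(x, w, psi_means, sd=1.0):
    """w_j(x) = w_j K(x | psi_j) / sum_j' w_j' K(x | psi_j'), with a Gaussian
    covariate kernel and finitely many components (truncated, illustrative)."""
    raw = [wj * norm_pdf(x, mu, sd) for wj, mu in zip(w, psi_means)]
    c = sum(raw)
    return [r / c for r in raw]

wx = normalized_weights(0.1, w=[0.5, 0.3, 0.2], psi_means=[0.0, 2.0, 5.0])
```

The weights are interpretable by construction: w_j(x) is exactly the posterior probability of model j given that the covariate equals x.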
Normalized Weights: Computations

The likelihood contains an intractable normalizing constant:
f_P(y | x) = Σ_{j=1}^∞ w_j(x) K(y | x, θ_j),  where  w_j(x) = w_j K(x | ψ_j) / Σ_{j'=1}^∞ w_{j'} K(x | ψ_{j'}).

However, using the geometric series, Σ_{k=0}^∞ r^k = 1/(1−r) for |r| < 1, the likelihood can be written as
f_P(y | x) = Σ_{j=1}^∞ w_j K(x | ψ_j) K(y | x, θ_j) Σ_{k=0}^∞ (1 − Σ_{j'=1}^∞ w_{j'} K(x | ψ_{j'}))^k.
Normalized Weights: Computations

The likelihood can be re-written as
f_P(y | x) = Σ_{j=1}^∞ w_j K(x | ψ_j) K(y | x, θ_j) Σ_{k=0}^∞ ( Σ_{j'=1}^∞ w_{j'} (1 − K(x | ψ_{j'})) )^k
= Σ_{j=1}^∞ w_j K(x | ψ_j) K(y | x, θ_j) Σ_{k=0}^∞ ∏_{l=1}^k ( Σ_{j_l=1}^∞ w_{j_l} (1 − K(x | ψ_{j_l})) ).

Introducing the latent variables k, d, and D = (D_1, ..., D_k), the latent model is
f_P(y, k, d, D | x) = w_d K(x | ψ_d) K(y | x, θ_d) ∏_{l=1}^k w_{D_l} (1 − K(x | ψ_{D_l})).
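The geometric-series identity behind this latent model can be verified numerically for a truncated mixture. A sketch with illustrative numbers; note the covariate-kernel values must lie in (0, 1) for the series to converge, which unit-variance Gaussian kernels satisfy since their density is bounded by 1/√(2π).

```python
import math

def norm_pdf(z, mu, sd):
    return math.exp(-0.5 * ((z - mu) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

# Toy truncated model (illustrative numbers).
w     = [0.5, 0.3, 0.2]   # prior component weights
psi   = [0.0, 2.0, 4.0]   # covariate-kernel means
theta = [-1.0, 0.0, 1.0]  # response-kernel means
x, y  = 0.5, 0.2

Kx = [norm_pdf(x, m, 1.0) for m in psi]
Ky = [norm_pdf(y, m, 1.0) for m in theta]

# Direct normalized-weights likelihood: f(y|x) = sum_j w_j(x) K(y|x, theta_j).
denom  = sum(wj * kx for wj, kx in zip(w, Kx))
direct = sum(wj * kx * ky for wj, kx, ky in zip(w, Kx, Ky)) / denom

# Geometric-series form: sum_j w_j Kx_j Ky_j * sum_k (sum_j' w_j'(1 - Kx_j'))^k.
r = sum(wj * (1.0 - kx) for wj, kx in zip(w, Kx))   # equals 1 - denom, in (0, 1)
series = (sum(wj * kx * ky for wj, kx, ky in zip(w, Kx, Ky))
          * sum(r ** k for k in range(500)))

assert abs(direct - series) < 1e-10
```

In the sampler, of course, the series is never truncated; k, d, and D are sampled, and each Gibbs step only touches the finite product in the latent model.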
Discussion
Discussion

BNP mixture models for regression are highly flexible and numerous. This thesis aimed to shed light on the advantages and drawbacks of the various models.

1. Joint models:
+ easy computations,
+ flexible and easy to extend to other types of Y and X,
− posterior inference is based on the joint,
− problematic for large p.
Can overcome with the EDP prior (maintaining both advantages).
Discussion

2. Conditional models with covariate-dependent atoms:
+ moderate computations (increasing with the flexibility of θ_j(x)),
+ posterior inference is based on the conditional,
− an appropriate definition of θ_j(x) for mixed x can be challenging,
− the choice of θ_j(x) has strong implications: if inflexible, extremely poor prediction; if over-flexible, prediction may also suffer.

3. Conditional models with covariate-dependent weights:
+ posterior inference is based on the conditional,
+ flexible, and covariate proximity is incorporated in the partition,
− difficult computations,
− lack of interpretable weights that can accommodate mixed x.
Can overcome with normalized weights.

− Huge dimension of the partition space: the posterior of ρ_n is very spread out. Can overcome by enforcing a notion of covariate proximity.
Future Work

- Examining models for large-p problems.
- Formalizing model comparison of the various models: consistency, convergence rates, finite sample bounds.
- AD study: diagnosis based on sMRI and other imaging techniques; investigating the dynamics of AD biomarkers (jointly and incorporating longitudinal data).
References

1. Consonni, G. and Veronese, P. (2001). Conditionally reducible natural exponential families and enriched conjugate priors. Scandinavian Journal of Statistics, 28.
2. Ramamoorthi, R.V. and Sangalli, L. (2005). On a characterization of the Dirichlet distribution. Proceedings of the International Conference on Bayesian Statistics and its Applications, Varanasi, India.
3. Hannah, L., Blei, D., and Powell, W. (2011). Dirichlet process mixtures of generalized linear models. Journal of Machine Learning Research, 12.
4. Müller, P., Erkanli, A., and West, M. (1996). Bayesian curve fitting using multivariate normal mixtures. Biometrika, 83.
5. Shahbaba, B. and Neal, R.M. (2009). Nonlinear models using Dirichlet process mixtures. Journal of Machine Learning Research, 10.
6. MacEachern, S.N. (2000). Dependent nonparametric processes. Technical report, Department of Statistics, Ohio State University.
7. Griffin, J.E. and Steel, M.F.J. (2006). Order-based dependent Dirichlet processes. Journal of the American Statistical Association, 101.
8. Dunson, D.B. and Park, J.H. (2008). Kernel stick-breaking processes. Biometrika, 95.
9. Rodriguez, A. and Dunson, D.B. (2011). Nonparametric Bayesian models through probit stick-breaking processes. Bayesian Analysis.
10. Fuentes-Garcia, R., Mena, R.H., and Walker, S.G. (2010). A probability for classification based on the mixture of Dirichlet process model. Journal of Classification, 27.
More informationBayesian Nonparametrics
Bayesian Nonparametrics Lorenzo Rosasco 9.520 Class 18 April 11, 2011 About this class Goal To give an overview of some of the basic concepts in Bayesian Nonparametrics. In particular, to discuss Dirichelet
More informationA Nonparametric Bayesian Model for Multivariate Ordinal Data
A Nonparametric Bayesian Model for Multivariate Ordinal Data Athanasios Kottas, University of California at Santa Cruz Peter Müller, The University of Texas M. D. Anderson Cancer Center Fernando A. Quintana,
More informationA Nonparametric Model for Stationary Time Series
A Nonparametric Model for Stationary Time Series Isadora Antoniano-Villalobos Bocconi University, Milan, Italy. isadora.antoniano@unibocconi.it Stephen G. Walker University of Texas at Austin, USA. s.g.walker@math.utexas.edu
More informationFlexible Regression Modeling using Bayesian Nonparametric Mixtures
Flexible Regression Modeling using Bayesian Nonparametric Mixtures Athanasios Kottas Department of Applied Mathematics and Statistics University of California, Santa Cruz Department of Statistics Brigham
More informationUSEFUL PROPERTIES OF THE MULTIVARIATE NORMAL*
USEFUL PROPERTIES OF THE MULTIVARIATE NORMAL* 3 Conditionals and marginals For Bayesian analysis it is very useful to understand how to write joint, marginal, and conditional distributions for the multivariate
More informationBayesian Machine Learning
Bayesian Machine Learning Andrew Gordon Wilson ORIE 6741 Lecture 2: Bayesian Basics https://people.orie.cornell.edu/andrew/orie6741 Cornell University August 25, 2016 1 / 17 Canonical Machine Learning
More informationModeling and Predicting Healthcare Claims
Bayesian Nonparametric Regression Models for Modeling and Predicting Healthcare Claims Robert Richardson Department of Statistics, Brigham Young University Brian Hartman Department of Statistics, Brigham
More informationStat 542: Item Response Theory Modeling Using The Extended Rank Likelihood
Stat 542: Item Response Theory Modeling Using The Extended Rank Likelihood Jonathan Gruhl March 18, 2010 1 Introduction Researchers commonly apply item response theory (IRT) models to binary and ordinal
More informationDensity Estimation. Seungjin Choi
Density Estimation Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr http://mlg.postech.ac.kr/
More informationBayesian Nonparametric Inference Methods for Mean Residual Life Functions
Bayesian Nonparametric Inference Methods for Mean Residual Life Functions Valerie Poynor Department of Applied Mathematics and Statistics, University of California, Santa Cruz April 28, 212 1/3 Outline
More informationLecture: Gaussian Process Regression. STAT 6474 Instructor: Hongxiao Zhu
Lecture: Gaussian Process Regression STAT 6474 Instructor: Hongxiao Zhu Motivation Reference: Marc Deisenroth s tutorial on Robot Learning. 2 Fast Learning for Autonomous Robots with Gaussian Processes
More informationClustering using Mixture Models
Clustering using Mixture Models The full posterior of the Gaussian Mixture Model is p(x, Z, µ,, ) =p(x Z, µ, )p(z )p( )p(µ, ) data likelihood (Gaussian) correspondence prob. (Multinomial) mixture prior
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 7 Approximate
More informationNonparametric Bayesian Methods (Gaussian Processes)
[70240413 Statistical Machine Learning, Spring, 2015] Nonparametric Bayesian Methods (Gaussian Processes) Jun Zhu dcszj@mail.tsinghua.edu.cn http://bigml.cs.tsinghua.edu.cn/~jun State Key Lab of Intelligent
More informationSTAT 518 Intro Student Presentation
STAT 518 Intro Student Presentation Wen Wei Loh April 11, 2013 Title of paper Radford M. Neal [1999] Bayesian Statistics, 6: 475-501, 1999 What the paper is about Regression and Classification Flexible
More informationPOSTERIOR CONSISTENCY IN CONDITIONAL DENSITY ESTIMATION BY COVARIATE DEPENDENT MIXTURES. By Andriy Norets and Justinas Pelenis
Resubmitted to Econometric Theory POSTERIOR CONSISTENCY IN CONDITIONAL DENSITY ESTIMATION BY COVARIATE DEPENDENT MIXTURES By Andriy Norets and Justinas Pelenis Princeton University and Institute for Advanced
More informationNonparametric Bayes Density Estimation and Regression with High Dimensional Data
Nonparametric Bayes Density Estimation and Regression with High Dimensional Data Abhishek Bhattacharya, Garritt Page Department of Statistics, Duke University Joint work with Prof. D.Dunson September 2010
More informationLecture 2: Priors and Conjugacy
Lecture 2: Priors and Conjugacy Melih Kandemir melih.kandemir@iwr.uni-heidelberg.de May 6, 2014 Some nice courses Fred A. Hamprecht (Heidelberg U.) https://www.youtube.com/watch?v=j66rrnzzkow Michael I.
More informationA Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness
A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness A. Linero and M. Daniels UF, UT-Austin SRC 2014, Galveston, TX 1 Background 2 Working model
More informationSTA 216, GLM, Lecture 16. October 29, 2007
STA 216, GLM, Lecture 16 October 29, 2007 Efficient Posterior Computation in Factor Models Underlying Normal Models Generalized Latent Trait Models Formulation Genetic Epidemiology Illustration Structural
More informationCMPS 242: Project Report
CMPS 242: Project Report RadhaKrishna Vuppala Univ. of California, Santa Cruz vrk@soe.ucsc.edu Abstract The classification procedures impose certain models on the data and when the assumption match the
More informationBayesian estimation of the discrepancy with misspecified parametric models
Bayesian estimation of the discrepancy with misspecified parametric models Pierpaolo De Blasi University of Torino & Collegio Carlo Alberto Bayesian Nonparametrics workshop ICERM, 17-21 September 2012
More informationLearning Bayesian network : Given structure and completely observed data
Learning Bayesian network : Given structure and completely observed data Probabilistic Graphical Models Sharif University of Technology Spring 2017 Soleymani Learning problem Target: true distribution
More informationNonparametric Bayes Inference on Manifolds with Applications
Nonparametric Bayes Inference on Manifolds with Applications Abhishek Bhattacharya Indian Statistical Institute Based on the book Nonparametric Statistics On Manifolds With Applications To Shape Spaces
More informationCurve Fitting Re-visited, Bishop1.2.5
Curve Fitting Re-visited, Bishop1.2.5 Maximum Likelihood Bishop 1.2.5 Model Likelihood differentiation p(t x, w, β) = Maximum Likelihood N N ( t n y(x n, w), β 1). (1.61) n=1 As we did in the case of the
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 7 Approximate
More informationEfficient Bayesian Multivariate Surface Regression
Efficient Bayesian Multivariate Surface Regression Feng Li (joint with Mattias Villani) Department of Statistics, Stockholm University October, 211 Outline of the talk 1 Flexible regression models 2 The
More informationLatent Variable Models for Binary Data. Suppose that for a given vector of explanatory variables x, the latent
Latent Variable Models for Binary Data Suppose that for a given vector of explanatory variables x, the latent variable, U, has a continuous cumulative distribution function F (u; x) and that the binary
More informationPart 6: Multivariate Normal and Linear Models
Part 6: Multivariate Normal and Linear Models 1 Multiple measurements Up until now all of our statistical models have been univariate models models for a single measurement on each member of a sample of
More informationOn the posterior structure of NRMI
On the posterior structure of NRMI Igor Prünster University of Turin, Collegio Carlo Alberto and ICER Joint work with L.F. James and A. Lijoi Isaac Newton Institute, BNR Programme, 8th August 2007 Outline
More informationNONPARAMETRIC BAYESIAN INFERENCE ON PLANAR SHAPES
NONPARAMETRIC BAYESIAN INFERENCE ON PLANAR SHAPES Author: Abhishek Bhattacharya Coauthor: David Dunson Department of Statistical Science, Duke University 7 th Workshop on Bayesian Nonparametrics Collegio
More informationA Bayesian Nonparametric Approach to Inference for Quantile Regression
A Bayesian Nonparametric Approach to Inference for Quantile Regression Matthew TADDY The University of Chicago Booth School of Business, Chicago, IL 60637 (matt.taddy@chicagogsb.edu) Athanasios KOTTAS
More informationA Bayesian Nonparametric Hierarchical Framework for Uncertainty Quantification in Simulation
Submitted to Operations Research manuscript Please, provide the manuscript number! Authors are encouraged to submit new papers to INFORMS journals by means of a style file template, which includes the
More informationMotivation Scale Mixutres of Normals Finite Gaussian Mixtures Skew-Normal Models. Mixture Models. Econ 690. Purdue University
Econ 690 Purdue University In virtually all of the previous lectures, our models have made use of normality assumptions. From a computational point of view, the reason for this assumption is clear: combined
More informationGaussian Processes (10/16/13)
STA561: Probabilistic machine learning Gaussian Processes (10/16/13) Lecturer: Barbara Engelhardt Scribes: Changwei Hu, Di Jin, Mengdi Wang 1 Introduction In supervised learning, we observe some inputs
More informationParticle Learning for General Mixtures
Particle Learning for General Mixtures Hedibert Freitas Lopes 1 Booth School of Business University of Chicago Dipartimento di Scienze delle Decisioni Università Bocconi, Milano 1 Joint work with Nicholas
More informationComputer Vision Group Prof. Daniel Cremers. 2. Regression (cont.)
Prof. Daniel Cremers 2. Regression (cont.) Regression with MLE (Rep.) Assume that y is affected by Gaussian noise : t = f(x, w)+ where Thus, we have p(t x, w, )=N (t; f(x, w), 2 ) 2 Maximum A-Posteriori
More informationBayesian Inference. Chapter 9. Linear models and regression
Bayesian Inference Chapter 9. Linear models and regression M. Concepcion Ausin Universidad Carlos III de Madrid Master in Business Administration and Quantitative Methods Master in Mathematical Engineering
More informationA Bayesian Nonparametric Model for Predicting Disease Status Using Longitudinal Profiles
A Bayesian Nonparametric Model for Predicting Disease Status Using Longitudinal Profiles Jeremy Gaskins Department of Bioinformatics & Biostatistics University of Louisville Joint work with Claudio Fuentes
More informationDirichlet Processes: Tutorial and Practical Course
Dirichlet Processes: Tutorial and Practical Course (updated) Yee Whye Teh Gatsby Computational Neuroscience Unit University College London August 2007 / MLSS Yee Whye Teh (Gatsby) DP August 2007 / MLSS
More informationLecture 16-17: Bayesian Nonparametrics I. STAT 6474 Instructor: Hongxiao Zhu
Lecture 16-17: Bayesian Nonparametrics I STAT 6474 Instructor: Hongxiao Zhu Plan for today Why Bayesian Nonparametrics? Dirichlet Distribution and Dirichlet Processes. 2 Parameter and Patterns Reference:
More informationGibbs Sampling in Linear Models #2
Gibbs Sampling in Linear Models #2 Econ 690 Purdue University Outline 1 Linear Regression Model with a Changepoint Example with Temperature Data 2 The Seemingly Unrelated Regressions Model 3 Gibbs sampling
More informationBayesian linear regression
Bayesian linear regression Linear regression is the basis of most statistical modeling. The model is Y i = X T i β + ε i, where Y i is the continuous response X i = (X i1,..., X ip ) T is the corresponding
More informationBayesian Multivariate Logistic Regression
Bayesian Multivariate Logistic Regression Sean M. O Brien and David B. Dunson Biostatistics Branch National Institute of Environmental Health Sciences Research Triangle Park, NC 1 Goals Brief review of
More informationVariational Bayesian Dirichlet-Multinomial Allocation for Exponential Family Mixtures
17th Europ. Conf. on Machine Learning, Berlin, Germany, 2006. Variational Bayesian Dirichlet-Multinomial Allocation for Exponential Family Mixtures Shipeng Yu 1,2, Kai Yu 2, Volker Tresp 2, and Hans-Peter
More informationNon-parametric Clustering with Dirichlet Processes
Non-parametric Clustering with Dirichlet Processes Timothy Burns SUNY at Buffalo Mar. 31 2009 T. Burns (SUNY at Buffalo) Non-parametric Clustering with Dirichlet Processes Mar. 31 2009 1 / 24 Introduction
More informationColouring and breaking sticks, pairwise coincidence losses, and clustering expression profiles
Colouring and breaking sticks, pairwise coincidence losses, and clustering expression profiles Peter Green and John Lau University of Bristol P.J.Green@bristol.ac.uk Isaac Newton Institute, 11 December
More informationOr How to select variables Using Bayesian LASSO
Or How to select variables Using Bayesian LASSO x 1 x 2 x 3 x 4 Or How to select variables Using Bayesian LASSO x 1 x 2 x 3 x 4 Or How to select variables Using Bayesian LASSO On Bayesian Variable Selection
More informationA Nonparametric Model-based Approach to Inference for Quantile Regression
A Nonparametric Model-based Approach to Inference for Quantile Regression Matthew Taddy and Athanasios Kottas Department of Applied Mathematics and Statistics, University of California, Santa Cruz, CA
More informationMethods for the Comparability of Scores
Assessment and Development of new Statistical Methods for the Comparability of Scores 1 Pontificia Universidad Católica de Chile 2 Laboratorio Interdisciplinario de Estadística Social Advisor: Jorge González
More informationBayesian spatial quantile regression
Brian J. Reich and Montserrat Fuentes North Carolina State University and David B. Dunson Duke University E-mail:reich@stat.ncsu.edu Tropospheric ozone Tropospheric ozone has been linked with several adverse
More informationCompound Random Measures
Compound Random Measures Jim Griffin (joint work with Fabrizio Leisen) University of Kent Introduction: Two clinical studies 3 CALGB8881 3 CALGB916 2 2 β 1 1 β 1 1 1 5 5 β 1 5 5 β Infinite mixture models
More informationBayesian Nonparametric Modelling with the Dirichlet Process Regression Smoother
Bayesian Nonparametric Modelling with the Dirichlet Process Regression Smoother J.E. Griffin and M. F. J. Steel University of Kent and University of Warwick Abstract In this paper we discuss implementing
More informationLecture 3a: Dirichlet processes
Lecture 3a: Dirichlet processes Cédric Archambeau Centre for Computational Statistics and Machine Learning Department of Computer Science University College London c.archambeau@cs.ucl.ac.uk Advanced Topics
More informationBayesian Hypothesis Testing in GLMs: One-Sided and Ordered Alternatives. 1(w i = h + 1)β h + ɛ i,
Bayesian Hypothesis Testing in GLMs: One-Sided and Ordered Alternatives Often interest may focus on comparing a null hypothesis of no difference between groups to an ordered restricted alternative. For
More informationBayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features. Yangxin Huang
Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features Yangxin Huang Department of Epidemiology and Biostatistics, COPH, USF, Tampa, FL yhuang@health.usf.edu January
More informationSparse Linear Models (10/7/13)
STA56: Probabilistic machine learning Sparse Linear Models (0/7/) Lecturer: Barbara Engelhardt Scribes: Jiaji Huang, Xin Jiang, Albert Oh Sparsity Sparsity has been a hot topic in statistics and machine
More informationShould all Machine Learning be Bayesian? Should all Bayesian models be non-parametric?
Should all Machine Learning be Bayesian? Should all Bayesian models be non-parametric? Zoubin Ghahramani Department of Engineering University of Cambridge, UK zoubin@eng.cam.ac.uk http://learning.eng.cam.ac.uk/zoubin/
More informationScaling up Bayesian Inference
Scaling up Bayesian Inference David Dunson Departments of Statistical Science, Mathematics & ECE, Duke University May 1, 2017 Outline Motivation & background EP-MCMC amcmc Discussion Motivation & background
More informationBayesian Regularization
Bayesian Regularization Aad van der Vaart Vrije Universiteit Amsterdam International Congress of Mathematicians Hyderabad, August 2010 Contents Introduction Abstract result Gaussian process priors Co-authors
More informationMachine Learning Overview
Machine Learning Overview Sargur N. Srihari University at Buffalo, State University of New York USA 1 Outline 1. What is Machine Learning (ML)? 2. Types of Information Processing Problems Solved 1. Regression
More information