Supplementary materials for Scalable Bayesian model averaging through local information propagation


August 25, 2014

S1. Proofs

Proof of Theorem 1. The result follows immediately from the distributions of the decision variables and the fact that the pfs procedure stops within the first $t-1$ steps if and only if $|\gamma^{(t-1)}| < t-1$.

Proof of Theorem 2. Our proof strategy is to actually find a pfs representation for any model space distribution $\pi$. To this end, we can proceed by induction on the total number of potential predictors. First, we note that the conclusion holds for $p = 1$, or $\Omega = \{0,1\}$. In this case there are but two models in the space: the null model and the model including $X_1$, written as $(0)$ and $(1)$ respectively. Let $\pi(\cdot)$ be any probability distribution on $\Omega$. It is easy to check that $\pi(\cdot)$ is the marginal distribution of the final model under the pfs procedure with $\rho(0) = \pi(0)$, $\rho(1) = 1$, and $\lambda_1(0) = 1$.

Now suppose the inductive claim holds for any model space involving up to $p-1$ variables. We next show it must hold for the one with $p$ predictors, or $\Omega = \{0,1\}^p$, as well. To this end, again let $\pi(\cdot)$ be any distribution on $\{0,1\}^p$, and let $\Omega_{(-p)} = \{0,1\}^{p-1} \times \{0\}$ be the collection of models that do not involve $X_p$. Let us define a new distribution $\pi^*(\cdot)$ on $\Omega_{(-p)}$ such that for each $\gamma \in \Omega_{(-p)}$,
$$\pi^*(\gamma) = \pi(\gamma) + \pi(\gamma^{+p}),$$

where $\gamma^{+p} \in \Omega$ is the model that adds an additional variable, $X_p$, into $\gamma$. It is easy to check that
$$\sum_{\gamma \in \Omega_{(-p)}} \pi^*(\gamma) = 1.$$
Because $\Omega_{(-p)}$ is isomorphic to $\{0,1\}^{p-1}$, $\pi^*(\cdot)$ can be considered a probability distribution on $\{0,1\}^{p-1}$. Thus by the inductive hypothesis, $\pi^*$ has a pfs representation with parameter mappings $\rho^*$ and $\lambda^*$ defined on $\{0,1\}^{p-1} \setminus \{(1,1,\ldots,1)\}$.

Now for any $\gamma \in \{0,1\}^p$, let $\gamma_{1:p-1} = (\gamma_1, \gamma_2, \ldots, \gamma_{p-1}) \in \{0,1\}^{p-1}$, and let $\rho'$ and $\lambda'$ be mappings defined on $\Omega_{(-p)}$ such that for any $\gamma \in \Omega_{(-p)}$: if $|\gamma_{1:p-1}| < p-1$, then $\rho'(\gamma) = \rho^*(\gamma_{1:p-1})$, $\lambda'_j(\gamma) = \lambda^*_j(\gamma_{1:p-1})$ for $j = 1,2,\ldots,p-1$, and $\lambda'_p(\gamma) = 0$; while if $|\gamma_{1:p-1}| = p-1$, then $\rho'(\gamma) = 1$, $\lambda'_j(\gamma) = 0$ for $j = 1,2,\ldots,p-1$, and $\lambda'_p(\gamma) = 1$.

Now consider the pfs procedure with $p$ predictors with mappings $\rho$ and $\lambda$ defined such that

(i) If $\gamma \in \Omega_{(-p)}$ and $\pi(\gamma^{+p}) > 0$,
$$\rho(\gamma) = \rho'(\gamma)\,\frac{\pi(\gamma)}{\pi^*(\gamma)}, \qquad
\lambda_j(\gamma) = \begin{cases}
\dfrac{1-\rho'(\gamma)}{1-\rho(\gamma)}\,\lambda'_j(\gamma) & \text{for } j = 1,2,\ldots,p-1,\\[1.5ex]
\dfrac{\pi(\gamma^{+p})}{\pi^*(\gamma)}\cdot\dfrac{\rho'(\gamma)}{1-\rho(\gamma)} & \text{for } j = p.
\end{cases}$$

(ii) If $\gamma \in \Omega_{(-p)}$ and $\pi(\gamma^{+p}) = 0$, $\rho(\gamma) = \rho'(\gamma)$ and $\lambda_j(\gamma) = \lambda'_j(\gamma)$ for $j = 1,2,\ldots,p$.

(iii) If $\gamma \in \Omega \setminus \Omega_{(-p)}$ and $|\gamma| < p$, $\rho(\gamma) = 1$ and $\lambda_j(\gamma) = \frac{1}{p - |\gamma|}\,\mathbf{1}_{\{\gamma_j = 0\}}$.

Under this pfs procedure, the $p$th predictor is always the last to be added. Now let us check that the marginal distribution of the final model $\gamma^{(p)}$ is indeed $\pi$. For any $\gamma \in \Omega_{(-p)}$ such that $\pi(\gamma) > 0$, by (i), (ii), and (iii) we have
$$\begin{aligned}
\sum_{\gamma^{(1)},\ldots,\gamma^{(p)}:\,\gamma^{(p)}=\gamma} \;\prod_{t=1}^{|\gamma|} \big[1-\rho(\gamma^{(t-1)})\big]\,\lambda_{j_t}(\gamma^{(t-1)}) \cdot \rho(\gamma)
&= \sum_{\gamma^{(1)},\ldots,\gamma^{(p)}:\,\gamma^{(p)}=\gamma} \;\prod_{t=1}^{|\gamma|} \big[1-\rho'(\gamma^{(t-1)})\big]\,\lambda'_{j_t}(\gamma^{(t-1)}) \cdot \rho'(\gamma)\,\frac{\pi(\gamma)}{\pi^*(\gamma)}\\
&= \pi^*(\gamma)\,\frac{\pi(\gamma)}{\pi^*(\gamma)} = \pi(\gamma),
\end{aligned}$$
where $j_1, j_2, \ldots, j_{|\gamma|}$ are the values of the selection variables $J_1, J_2, \ldots, J_{|\gamma|}$ that correspond to the sequence of models $\gamma^{(1)}, \ldots, \gamma^{(|\gamma|-1)}, \gamma^{(|\gamma|)} = \gamma$.

Similarly, for any $\gamma \in \Omega \setminus \Omega_{(-p)}$ such that $\pi(\gamma) > 0$, by (i), (ii), and (iii), the marginal probability for $\gamma^{(p)}$ to be $\gamma$ is
$$\begin{aligned}
&\sum_{\gamma^{(1)},\ldots,\gamma^{(p)}:\,\gamma^{(p)}=\gamma} \;\prod_{t=1}^{|\gamma|} \big[1-\rho(\gamma^{(t-1)})\big]\,\lambda_{j_t}(\gamma^{(t-1)}) \cdot \rho(\gamma)\\
&\quad= \sum_{\gamma^{(1)},\ldots,\gamma^{(p)}:\,\gamma^{(p)}=\gamma} \;\prod_{t=1}^{|\gamma|-1} \big[1-\rho'(\gamma^{(t-1)})\big]\,\lambda'_{j_t}(\gamma^{(t-1)}) \cdot \big[1-\rho(\gamma^{(|\gamma|-1)})\big]\cdot \frac{\pi(\gamma)}{\pi^*(\gamma^{(|\gamma|-1)})}\cdot\frac{\rho'(\gamma^{(|\gamma|-1)})}{1-\rho(\gamma^{(|\gamma|-1)})}\\
&\quad= \frac{\pi(\gamma)}{\pi^*(\gamma^{(|\gamma|-1)})}\cdot \pi^*(\gamma^{(|\gamma|-1)}) = \pi(\gamma).
\end{aligned}$$
The second equality follows because for $\gamma \in \Omega \setminus \Omega_{(-p)}$ such that $\pi(\gamma) > 0$, under (i), (ii), and (iii),

$$\sum_{\gamma^{(1)},\ldots,\gamma^{(p)}:\,\gamma^{(p)}=\gamma} \;\prod_{t=1}^{|\gamma|-1} \big[1-\rho'(\gamma^{(t-1)})\big]\,\lambda'_{j_t}(\gamma^{(t-1)}) \cdot \rho'(\gamma^{(|\gamma|-1)}) = \pi^*(\gamma^{(|\gamma|-1)}),$$
and with probability 1, $\gamma^{(|\gamma|-1)}$ is the model with the $p$th predictor removed from $\gamma$.

Proof of Theorem 3. Let $S_1, J_1, S_2, J_2, \ldots, S_p, J_p$ be the latent decision variables of the pfs representation of $\pi$ under consideration. We let $(\Omega_d, \mathcal{F}_d)$ be the probability space on which these decision variables are jointly defined. The sequence of models $\gamma^{(1)}, \gamma^{(2)}, \ldots, \gamma^{(p)}$ are functions of the decision variables and thus also measurable with respect to $(\Omega_d, \mathcal{F}_d)$. Fixing the data $D$, the marginal likelihood under the final model, $p(D \mid \gamma^{(p)})$, is also a random variable on $(\Omega_d, \mathcal{F}_d)$. For any $\gamma \in \Omega$, we define an event $U_\gamma$ on $(\Omega_d, \mathcal{F}_d)$ that $\gamma$ is a submodel of the final model $\gamma^{(p)}$, that is, $\gamma^{(p)}$ contains all of the predictors included in $\gamma$. Mathematically, this event can be expressed as
$$U_\gamma := \{\omega \in \Omega_d : \gamma^{(t)}(\omega) = \gamma \text{ for } t = |\gamma|\}.$$

Next, we define a mapping $\Phi : \Omega \to \mathbb{R}$ as follows. For each $\gamma \in \Omega$,
$$\Phi(\gamma) := \mathrm{E}_{\gamma^{(p)}}\big[\,p(D \mid \gamma^{(p)}) \mid U_\gamma\,\big],$$
where the data $D$ is fixed and the expectation is taken over the final model $\gamma^{(p)}$, or equivalently the decision variables, conditional on the event $U_\gamma$. Now for any $\gamma \in \Omega$, we claim that
$$\Phi(\gamma) = \begin{cases} p(D \mid \gamma) & \text{if } |\gamma| = p,\\[0.5ex] \rho(\gamma)\,p(D \mid \gamma) + (1-\rho(\gamma)) \displaystyle\sum_{j:\,\gamma_j = 0} \lambda_j(\gamma)\,\Phi(\gamma^{+j}) & \text{if } |\gamma| < p. \end{cases}$$

To see this, note that if $|\gamma| = p$, then conditional on $U_\gamma$ we have $\gamma^{(p)} = \gamma$, and so $\mathrm{E}_{\gamma^{(p)}}[\,p(D \mid \gamma^{(p)}) \mid U_\gamma\,] = p(D \mid \gamma)$. Now if $|\gamma| = t < p$, then by the tower property,
$$\begin{aligned}
\mathrm{E}_{\gamma^{(p)}}[\,p(D \mid \gamma^{(p)}) \mid U_\gamma\,]
&= \mathrm{E}_{\gamma^{(p)}}\big[\,\mathrm{E}_{\gamma^{(p)}}[\,p(D \mid \gamma^{(p)}) \mid S_{t+1}, U_\gamma\,] \mid U_\gamma\,\big]\\
&= \mathrm{E}_{\gamma^{(p)}}[\,p(D \mid \gamma^{(p)}) \mid S_{t+1} = 1, U_\gamma\,]\, \mathrm{P}(S_{t+1} = 1 \mid U_\gamma)\\
&\quad + \sum_{j:\,\gamma_j = 0} \mathrm{E}_{\gamma^{(p)}}[\,p(D \mid \gamma^{(p)}) \mid J_{t+1} = j, S_{t+1} = 0, U_\gamma\,]\, \mathrm{P}(J_{t+1} = j \mid S_{t+1} = 0, U_\gamma)\, \mathrm{P}(S_{t+1} = 0 \mid U_\gamma).
\end{aligned}$$
Now note that $S_{t+1} = 1$ and $U_\gamma$ together imply that $\gamma^{(p)} = \gamma$, and so $\mathrm{E}_{\gamma^{(p)}}[\,p(D \mid \gamma^{(p)}) \mid S_{t+1} = 1, U_\gamma\,] = p(D \mid \gamma)$. Also, for each $j$ such that $\gamma_j = 0$,
$$\{\omega \in \Omega_d : J_{t+1}(\omega) = j,\; S_{t+1}(\omega) = 0\} \cap U_\gamma \subset U_{\gamma^{+j}}.$$
Moreover, conditional on the event $U_{\gamma^{+j}}$, $\gamma^{(p)}$ is a function of $S_{t+2}, J_{t+2}, \ldots, S_p, J_p, S_{p+1}$ and so is independent of $S_1, J_1, \ldots, S_{t+1}, J_{t+1}$. Thus,
$$\mathrm{E}_{\gamma^{(p)}}[\,p(D \mid \gamma^{(p)}) \mid J_{t+1} = j, S_{t+1} = 0, U_\gamma\,] = \mathrm{E}_{\gamma^{(p)}}[\,p(D \mid \gamma^{(p)}) \mid J_{t+1} = j, S_{t+1} = 0, U_\gamma, U_{\gamma^{+j}}\,] = \mathrm{E}_{\gamma^{(p)}}[\,p(D \mid \gamma^{(p)}) \mid U_{\gamma^{+j}}\,] = \Phi(\gamma^{+j}).$$
Finally, since $\mathrm{P}(S_{t+1} = 1 \mid U_\gamma) = \rho(\gamma)$ and $\mathrm{P}(J_{t+1} = j \mid S_{t+1} = 0, U_\gamma) = \lambda_j(\gamma)$, putting the pieces together we have
$$\Phi(\gamma) = \rho(\gamma)\,p(D \mid \gamma) + (1-\rho(\gamma)) \sum_{j:\,\gamma_j = 0} \lambda_j(\gamma)\,\Phi(\gamma^{+j}).$$
This establishes the above claim about $\Phi$.

Given the mapping $\Phi$, we are now ready to establish the theorem. First, because under the pfs representation the data generative mechanism essentially forms an HMM, by Theorem 1 the model space posterior has a pfs representation with the mappings $\rho(\cdot \mid D)$ and $\lambda(\cdot \mid D)$ determined by the posterior distributions of the decision variables $S_1, J_1, \ldots, S_p, J_p$. So our proof strategy now is to simply find the posterior distributions of these decision variables. For any model $\gamma \in \Omega$ with $|\gamma| = t < p$,
$$\begin{aligned}
\rho(\gamma \mid D) = \mathrm{P}(S_{t+1} = 1 \mid U_\gamma, D)
&= \frac{\mathrm{P}(S_{t+1} = 1, D \mid U_\gamma)}{\mathrm{P}(D \mid U_\gamma)}\\
&= \frac{\mathrm{E}_{\gamma^{(p)}}[\,p(D \mid \gamma^{(p)}) \mid S_{t+1} = 1, U_\gamma\,]\, \mathrm{P}(S_{t+1} = 1 \mid U_\gamma)}{\mathrm{E}_{\gamma^{(p)}}[\,p(D \mid \gamma^{(p)}) \mid U_\gamma\,]}
= \rho(\gamma)\,p(D \mid \gamma)/\Phi(\gamma),
\end{aligned}$$
which is equal to 1 when $\rho(\gamma) = 1$. Similarly, if $\rho(\gamma) \neq 1$, then
$$\begin{aligned}
\lambda_j(\gamma \mid D) = \mathrm{P}(J_{t+1} = j \mid U_\gamma, S_{t+1} = 0, D)
&= \frac{\mathrm{P}(J_{t+1} = j, S_{t+1} = 0, D \mid U_\gamma)}{\mathrm{P}(S_{t+1} = 0, D \mid U_\gamma)}
= \frac{\mathrm{P}(J_{t+1} = j, S_{t+1} = 0, D \mid U_\gamma)}{\mathrm{P}(D \mid U_\gamma) - \mathrm{P}(S_{t+1} = 1, D \mid U_\gamma)}\\
&= \frac{\mathrm{E}_{\gamma^{(p)}}[\,p(D \mid \gamma^{(p)}) \mid J_{t+1} = j, S_{t+1} = 0, U_\gamma\,]\,\mathrm{P}(J_{t+1} = j \mid S_{t+1} = 0, U_\gamma)\,\mathrm{P}(S_{t+1} = 0 \mid U_\gamma)}{\mathrm{E}_{\gamma^{(p)}}[\,p(D \mid \gamma^{(p)}) \mid U_\gamma\,] - \mathrm{E}_{\gamma^{(p)}}[\,p(D \mid \gamma^{(p)}) \mid S_{t+1} = 1, U_\gamma\,]\,\mathrm{P}(S_{t+1} = 1 \mid U_\gamma)}\\
&= \frac{\Phi(\gamma^{+j})\,\lambda_j(\gamma)\,(1-\rho(\gamma))}{\Phi(\gamma) - p(D \mid \gamma)\,\rho(\gamma)}.
\end{aligned}$$
On the other hand, if $\rho(\gamma) = 1$, then given $U_\gamma$, $S_{t+1} = 0$ with probability 0 and the value of $J_t$ for $t > |\gamma|$ has no impact on the final model $\gamma^{(p)}$. So we can simply set $\lambda_j(\gamma \mid D) = \lambda_j(\gamma)$ for all $j$. The theorem now follows by letting $\phi(\gamma) = \Phi(\gamma)/p(D \mid \mathbf{0})$, where $\mathbf{0}$ denotes the null model.
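To make the recursion for $\Phi$ and the posterior updates above concrete, here is a minimal Python sketch (not code from the paper); it brute-forces the model lattice, so it is only meant for small $p$. The callables `rho`, `lam`, and `marglik` are hypothetical placeholders for the prior stop probabilities $\rho(\gamma)$, the selection probabilities $\lambda_j(\gamma)$, and the marginal likelihoods $p(D \mid \gamma)$ supplied by the user.

```python
from itertools import combinations
import random

def simulate_pfs(p, rho, lam):
    """Draw one final model gamma^(p) from the pfs generative procedure:
    start at the null model; at each step stop with probability rho(gamma),
    otherwise add a not-yet-included predictor j with probability lam(gamma, j)."""
    gamma = (0,) * p
    while sum(gamma) < p:
        if random.random() < rho(gamma):                      # S_{t+1} = 1: stop
            break
        free = [j for j in range(p) if gamma[j] == 0]
        j = random.choices(free, weights=[lam(gamma, k) for k in free])[0]  # J_{t+1}
        gamma = gamma[:j] + (1,) + gamma[j + 1:]
    return gamma

def phi_table(p, rho, lam, marglik):
    """Compute Phi(gamma) for every model by the backward recursion:
    Phi(gamma) = marglik(gamma) if |gamma| = p, and otherwise
    Phi(gamma) = rho(gamma)*marglik(gamma)
                 + (1 - rho(gamma)) * sum_j lam(gamma, j) * Phi(gamma^{+j})."""
    Phi = {}
    for size in range(p, -1, -1):                             # largest models first
        for idx in combinations(range(p), size):
            gamma = tuple(1 if j in idx else 0 for j in range(p))
            if size == p:
                Phi[gamma] = marglik(gamma)
            else:
                grow = sum(lam(gamma, j) * Phi[gamma[:j] + (1,) + gamma[j + 1:]]
                           for j in range(p) if gamma[j] == 0)
                Phi[gamma] = rho(gamma) * marglik(gamma) + (1 - rho(gamma)) * grow
    return Phi

def posterior_maps(gamma, p, rho, lam, marglik, Phi):
    """Posterior pfs parameters at a non-full model gamma:
    rho(gamma|D)   = rho(gamma) * p(D|gamma) / Phi(gamma)
    lam_j(gamma|D) = Phi(gamma^{+j}) * lam_j(gamma) * (1 - rho(gamma))
                     / (Phi(gamma) - rho(gamma) * p(D|gamma))."""
    rho_post = rho(gamma) * marglik(gamma) / Phi[gamma]
    denom = Phi[gamma] - rho(gamma) * marglik(gamma)
    lam_post = {}
    for j in (j for j in range(p) if gamma[j] == 0):
        if denom > 0:
            gp = gamma[:j] + (1,) + gamma[j + 1:]
            lam_post[j] = Phi[gp] * lam(gamma, j) * (1 - rho(gamma)) / denom
        else:                                 # rho(gamma) = 1: keep the prior value
            lam_post[j] = lam(gamma, j)
    return rho_post, lam_post
```

As a sanity check on such an implementation, `phi_table` evaluated at the null model should equal the model-averaged marginal likelihood $\sum_\gamma \pi(\gamma)\, p(D \mid \gamma)$, since $U_\gamma$ for the null model is the whole sample space.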

S2. Bayes factors under g and hyper-g priors

For many common priors on the regression coefficients, the BF term in the weight update can be computed either in closed form or well approximated numerically. Here let us consider two popular priors: the g-prior and the hyper-g prior.

Given a particular model $\gamma$, Zellner's g-prior in its most popular form is the following prior on the regression coefficients and the noise variance:
$$p(\phi) \propto 1/\phi \quad\text{and}\quad \beta_\gamma \mid \phi, \gamma \sim \mathrm{N}\big(\beta^0_\gamma,\; g(X^TX)^{-1}/\phi\big),$$
where $\beta^0_\gamma$ and $g$ are hyperparameters. Following the exposition in Liang et al. (2008), we assume without loss of generality that the predictor variables $X_1, X_2, \ldots, X_p$ have all been mean centered at zero. Then we can place a common non-informative flat prior on the intercept $\alpha$ for all models, so $p(\alpha, \phi) \propto 1/\phi$. Under this prior setup, one can show that the BF for a model $\gamma$ versus the null model is given by
$$\mathrm{BF}_0(\gamma) = \frac{(1+g)^{(n-1-|\gamma|)/2}}{\big(1 + g(1 - R^2_\gamma)\big)^{(n-1)/2}},$$
where $R^2_\gamma$ is the coefficient of determination for model $\gamma$.
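As a concrete illustration (not code from the paper), the g-prior BF depends only on $n$, $|\gamma|$, $g$, and $R^2_\gamma$, so it can be evaluated directly once $R^2_\gamma$ is available; the function name below is our own.

```python
def bf_gprior_null(n, size_gamma, r2_gamma, g):
    """Bayes factor of model gamma versus the null model under Zellner's g-prior:
    BF_0(gamma) = (1+g)^((n-1-|gamma|)/2) / (1 + g*(1-R^2_gamma))^((n-1)/2)."""
    return (1.0 + g) ** ((n - 1 - size_gamma) / 2.0) / \
           (1.0 + g * (1.0 - r2_gamma)) ** ((n - 1) / 2.0)
```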

To avoid undesirable features of the g-priors such as Bartlett's paradox and the information paradox (Berger and Pericchi, 2001), Liang et al. (2008) proposed the use of mixtures of g-priors. In particular, they introduced the hyper-g prior, which puts the following hyperprior on $g$:
$$\frac{g}{1+g} \sim \mathrm{Beta}(1,\, a/2 - 1).$$
This prior also renders a closed-form representation for the model-specific marginal likelihood, and thus for the corresponding BFs. In particular, Liang et al. (2008) showed that the BF of a model $\gamma$ versus the null model is given by
$$\mathrm{BF}_0(\gamma) = \frac{a-2}{|\gamma| + a - 2}\; {}_2F_1\big((n-1)/2,\, 1;\, (|\gamma| + a)/2;\, R^2_\gamma\big),$$
where ${}_2F_1(\cdot,\cdot;\cdot;\cdot)$ is the hypergeometric function. More specifically, in the notation of Liang et al. (2008),
$${}_2F_1(a, b; c; z) = \frac{\Gamma(c)}{\Gamma(b)\,\Gamma(c-b)} \int_0^1 \frac{t^{b-1}(1-t)^{c-b-1}}{(1-tz)^a}\, dt.$$
Therefore, with either the g-prior or the hyper-g prior, the BF in the weight update can be computed as
$$\mathrm{BF}\big(\gamma_i(t), \gamma_i(t-1)\big) = \frac{\mathrm{BF}_0\big(\gamma_i(t)\big)}{\mathrm{BF}_0\big(\gamma_i(t-1)\big)}.$$
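A corresponding sketch for the hyper-g BF and the BF term in the weight update (again illustrative only, not the paper's code); `scipy.special.hyp2f1` evaluates the Gaussian hypergeometric function, and the default $a = 3$ below is just an example value.

```python
from scipy.special import hyp2f1

def bf_hyperg_null(n, size_gamma, r2_gamma, a=3.0):
    """Bayes factor of model gamma versus the null model under the hyper-g prior:
    BF_0(gamma) = (a-2)/(|gamma|+a-2) * 2F1((n-1)/2, 1; (|gamma|+a)/2; R^2_gamma)."""
    return (a - 2.0) / (size_gamma + a - 2.0) * \
        hyp2f1((n - 1.0) / 2.0, 1.0, (size_gamma + a) / 2.0, r2_gamma)

def bf_weight_update(bf0_new, bf0_old):
    """BF(gamma_i(t), gamma_i(t-1)) = BF_0(gamma_i(t)) / BF_0(gamma_i(t-1))."""
    return bf0_new / bf0_old
```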

S3. Incorporating dilution under model space redundancy

In this section we show that the pfs representation allows much flexibility in incorporating prior information, and we illustrate this through an interesting phenomenon called the dilution effect, first noted by George (1999). Dilution occurs when there is redundancy in the model space. More specifically, consider the scenario where there is strong correlation among some of the predictors, and any one of these predictors captures virtually all of the association between them and the response. In this case, models that contain different members of this class but are otherwise identical are essentially the same. As a result, if, say, a symmetric prior specification is adopted, these models will receive more prior probability than they properly should. At the same time, other models that do not include members of this class will be down-weighted in the prior. In real data, this phenomenon occurs to varying degrees depending on the underlying correlation structure among the predictors.

Next, we present a very simple specification of the model space prior under the pfs representation that can effectively address this phenomenon. We do not claim that this approach is the best way to deal with dilution, but rather use it as an example to illustrate the flexibility rendered by the pfs representation. The specification can most simply be described in two steps.

Step I. Pre-clustering the predictors based on their correlation. First, we carry out a hierarchical clustering of the predictor variables using the (absolute) correlation as the similarity metric, which divides the predictors into $K$ clusters $C_1, C_2, \ldots, C_K$. We recommend using complete linkage for this purpose, as this will ensure that the variables within each cluster are all very close to each other. One needs to choose a correlation threshold $s$ for cutting the corresponding dendrogram into clusters; in the case of complete linkage, this is the minimum correlation for two variables to be in the same cluster. We recommend choosing a large $s$, such as 0.9, to place variables into the same basket only if they are very highly correlated.

Step II. Prior specification given the predictor clusters. Based on the predictor clusters, we assign prior selection probabilities for a model $\gamma$ to the variables not yet in the model in the following manner. First, we place equal total prior selection probability on each of the available clusters. Then within each cluster, we assign the selection probability evenly across the variables.

For example, consider the situation where there are a total of 10 predictors $X_1$ through $X_{10}$, and following Step I, they form four clusters $C_1 = \{X_1, X_2, X_3\}$, $C_2 = \{X_4, X_{10}\}$, $C_3 = \{X_5, X_7, X_9\}$, and $C_4 = \{X_6, X_8\}$. Let $\gamma$ be the model that contains variables $X_1$, $X_4$, $X_5$, $X_6$, and $X_8$, that is, $\gamma = (1, 0, 0, 1, 1, 1, 0, 1, 0, 0)$. If the pfs procedure reaches $\gamma$ and does not stop there, that is, $S(\gamma) = 0$, then five variables, $X_2, X_3, X_7, X_9, X_{10}$, from the three clusters $\{X_2, X_3\}$, $\{X_{10}\}$, and $\{X_7, X_9\}$, are available for further inclusion. In this case we choose the selection probabilities $\lambda(\gamma)$ to be
$$\lambda_1(\gamma) = \lambda_4(\gamma) = \lambda_5(\gamma) = \lambda_6(\gamma) = \lambda_8(\gamma) = 0, \qquad
\lambda_2(\gamma) = \lambda_3(\gamma) = \lambda_7(\gamma) = \lambda_9(\gamma) = \tfrac{1}{3}\cdot\tfrac{1}{2} = \tfrac{1}{6}, \qquad
\lambda_{10}(\gamma) = \tfrac{1}{3}.$$
Under such a specification, the predictors falling in the same cluster evenly share a fixed piece of the prior selection probability, which ensures that the prior weight on the other variables is not diluted.
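The two steps above can be sketched as follows (illustrative only, not the paper's code); `correlation_clusters` and `dilution_selection_probs` are our own helper names, and the default threshold $s = 0.9$ follows the recommendation above.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def correlation_clusters(X, s=0.9):
    """Step I: complete-linkage clustering of the columns of X using absolute
    correlation as similarity; cutting the dendrogram at distance 1 - s means
    every pair of variables within a cluster has |correlation| >= s."""
    dist = 1.0 - np.abs(np.corrcoef(X, rowvar=False))
    np.fill_diagonal(dist, 0.0)
    Z = linkage(squareform(dist, checks=False), method="complete")
    return fcluster(Z, t=1.0 - s, criterion="distance")   # cluster label per predictor

def dilution_selection_probs(gamma, labels):
    """Step II: split the total selection probability equally over the clusters
    that still contain predictors outside gamma, then equally within each cluster."""
    gamma, labels = np.asarray(gamma), np.asarray(labels)
    lam = np.zeros(len(gamma), dtype=float)
    available = gamma == 0
    open_clusters = np.unique(labels[available])
    for c in open_clusters:
        members = np.where(available & (labels == c))[0]
        lam[members] = (1.0 / len(open_clusters)) / len(members)
    return lam
```

Applied to the ten-predictor example above (with cluster labels corresponding to $C_1, \ldots, C_4$), this assigns $1/6$ to $X_2$, $X_3$, $X_7$, $X_9$ and $1/3$ to $X_{10}$, matching the specification of $\lambda(\gamma)$ just given.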

References

Berger, J. O. and L. R. Pericchi (2001). Objective Bayesian methods for model selection: Introduction and comparison. Lecture Notes-Monograph Series 38.

George, E. I. (1999). Sampling considerations for model averaging and model search. Invited discussion of "Model averaging and model search" by M. Clyde. In J. M. Bernardo, J. O. Berger, A. P. Dawid, and A. F. M. Smith (Eds.), Bayesian Statistics 6. Oxford, UK: Oxford University Press.

Liang, F., R. Paulo, G. Molina, M. A. Clyde, and J. O. Berger (2008). Mixtures of g-priors for Bayesian variable selection. Journal of the American Statistical Association 103(481).
