Bayesian Computational Methods: Monte Carlo Methods


© These notes are copyrighted by the authors. Unauthorized use is not permitted.

Noniterative Monte Carlo Methods

1. Direct Sampling

Approximations to integrals can be carried out by Monte Carlo integration, defined as follows. Suppose a random variable θ has a density generically denoted by p(θ), and we wish to compute E[f(θ)]. This is given by

γ = E[f(θ)] = ∫ f(θ) p(θ) dθ.

Then if θ_1, …, θ_N are i.i.d. samples from p(θ), we have

ˆγ = (1/N) Σ_{j=1}^N f(θ_j),

which converges to E[f(θ)] with probability 1 as N → ∞, by the strong law of large numbers. In Bayesian inference, p(θ) typically corresponds to the posterior distribution of θ, p(θ|x), and thus E[f(θ)] is the posterior mean of f(θ). Hence the computation of posterior expectations requires only a sample of size N from the posterior distribution; that is, we must be able to sample directly from the posterior in order to use the Monte Carlo approximation. Notice that as N increases, the quality of the approximation increases.

The Monte Carlo size N is within our control, whereas the sample size n is not. Another advantage over asymptotic methods is that we can evaluate the accuracy of the approximation for any fixed N. Since ˆγ is itself a sample mean of independent observations, we have Var(ˆγ) = (1/N) Var(f(θ)). But Var(f(θ)) can be estimated by the sample variance of the f(θ_j) values, so that a standard error estimate of ˆγ is given by

se(ˆγ) = sqrt( (1/(N(N−1))) Σ_{j=1}^N (f(θ_j) − ˆγ)² ).

Finally, the CLT implies that ˆγ ± 2 se(ˆγ) provides an approximate 95% interval for the true value of the posterior mean γ.

Remark: While it may seem strange for us to recommend use of a frequentist interval estimator here, Monte Carlo simulation provides one (and perhaps the only) setting where such intervals are clearly appropriate! In addition, note that we can also estimate quantities such as p = P(a < f(θ) < b | x) using Monte Carlo methods.
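
To make these formulas concrete, here is a minimal Python sketch (not part of the original notes) using NumPy and SciPy. It assumes a hypothetical conjugate setting, binomial data with a uniform prior, so the posterior is a Beta density we can sample directly; the function f(θ) and the interval (a, b) are likewise illustrative choices.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Hypothetical example: binomial data with a uniform prior, so the posterior
# of theta is Beta(1 + sum(y), 1 + n - sum(y)) and can be sampled directly.
n, ysum = 20, 13
posterior = stats.beta(1 + ysum, 1 + n - ysum)

N = 10_000                              # Monte Carlo size (under our control)
theta = posterior.rvs(size=N, random_state=rng)

f = np.log(theta / (1 - theta))         # f(theta): posterior log-odds, say

gamma_hat = f.mean()                    # Monte Carlo estimate of E[f(theta)|x]
se_hat = np.sqrt(np.sum((f - gamma_hat) ** 2) / (N * (N - 1)))  # simulation s.e.
ci = (gamma_hat - 2 * se_hat, gamma_hat + 2 * se_hat)           # approx. 95% interval

# Estimate p = P(a < f(theta) < b | x) and its simulation standard error
a, b = 0.0, 1.0
p_hat = np.mean((f > a) & (f < b))
se_p = np.sqrt(p_hat * (1 - p_hat) / (N - 1))

print(gamma_hat, se_hat, ci, p_hat, se_p)
```

The same recipe applies to any posterior we can sample directly; only the sampling line changes.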

An estimate of p is simply

ˆp = (1/N) Σ_{i=1}^N I( f(θ_i) ∈ (a, b) ),

where N is the number of Monte Carlo samples. The associated simulation standard error is

se(ˆp) = sqrt( ˆp(1 − ˆp) / (N − 1) ).

This suggests that a histogram of the sampled θ_j's would estimate the posterior itself, since the probability in each histogram bin converges to the true bin probability.

Alternatively, we could use a kernel density estimate to smooth the histogram:

ˆp(θ|x) = (1/(N k_N)) Σ_{j=1}^N K( (θ − θ_j) / k_N ),

where K is a kernel density (typically a normal or rectangular density) and k_N is a window width satisfying k_N → 0 and N k_N → ∞ as N → ∞. Details of density estimation are given in the book by Silverman (1986, Chapman and Hall).
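
A small sketch of this kernel density estimate with a normal kernel, using Silverman's rule-of-thumb window width as a stand-in for k_N; the posterior draws here are simulated placeholders, not from any model in these notes.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical posterior draws (a skewed Gamma stands in for p(theta|x))
theta = stats.gamma(a=3, scale=2).rvs(size=5000, random_state=rng)
N = len(theta)

k_N = 1.06 * theta.std(ddof=1) * N ** (-1 / 5)   # rule-of-thumb window width

grid = np.linspace(theta.min(), theta.max(), 200)
# p_hat(theta|x) = (1/(N*k_N)) * sum_j K((theta - theta_j)/k_N), normal kernel K
dens = stats.norm.pdf((grid[:, None] - theta[None, :]) / k_N).sum(axis=1) / (N * k_N)

print(grid[np.argmax(dens)])   # crude posterior mode estimate from the smoothed density
```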

Example: Suppose x_1, …, x_n are i.i.d. N(μ, σ²), τ = 1/σ², and π(μ, σ²) ∝ 1/σ². Recall that

μ | τ, x ~ N( x̄, 1/(nτ) ),
τ | x ~ Gamma( (n−1)/2, (n−1)s²/2 ),

where s² = (1/(n−1)) Σ_{i=1}^n (x_i − x̄)².

We can generate samples from the posterior of (μ, σ²) as follows:

1. Sample τ_j | x ~ Gamma( (n−1)/2, (n−1)s²/2 ).
2. Then sample μ_j | τ_j, x ~ N( x̄, 1/(nτ_j) ), for j = 1, …, N.

This process creates the set {(μ_j, σ²_j = 1/τ_j), j = 1, …, N} from p(μ, σ² | x). To estimate, for example, the posterior mean of μ, we would use Ê(μ|x) = (1/N) Σ_{j=1}^N μ_j, or, to obtain a 95% HPD interval for μ, we simply use the empirical .025 and .975 quantiles of the sample of μ_j values.
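
A direct translation of this two-step sampler into Python might look as follows; the data are simulated as a stand-in, and note that NumPy's gamma generator is parameterized by shape and scale, so the rate (n−1)s²/2 must be inverted.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data; x_1,...,x_n i.i.d. N(mu, sigma^2), prior pi(mu, sigma^2) ∝ 1/sigma^2
x = rng.normal(loc=10.0, scale=2.0, size=30)
n, xbar, s2 = len(x), x.mean(), x.var(ddof=1)

N = 10_000
# Step 1: tau_j ~ Gamma((n-1)/2, rate (n-1)s^2/2); numpy uses shape/scale, so scale = 1/rate
tau = rng.gamma(shape=(n - 1) / 2, scale=2 / ((n - 1) * s2), size=N)
# Step 2: mu_j | tau_j ~ N(xbar, 1/(n*tau_j))
mu = rng.normal(loc=xbar, scale=np.sqrt(1 / (n * tau)))
sigma2 = 1 / tau                                  # matching draws from p(sigma^2 | x)

post_mean_mu = mu.mean()                          # estimate of E(mu | x)
interval_95 = np.quantile(mu, [0.025, 0.975])     # empirical .025 and .975 quantiles

print(post_mean_mu, interval_95)
```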

Estimates of functions of the parameters are also easily obtained. For example, suppose we want to estimate the posterior distribution of γ = σ/μ = τ^{−1/2} μ^{−1}, which is known as the coefficient of variation. We simply define the transformed Monte Carlo samples γ_j = τ_j^{−1/2} μ_j^{−1}, j = 1, …, N, and create a histogram or kernel density estimate based on these values. As a final illustration, suppose we wanted to estimate

P(z > c | x) = ∫_Θ [ ∫_c^∞ p(z|θ) dz ] p(θ|x) dθ,

where z is a future observation.

Note that the inner integral can be written as

P(z > c | θ) = 1 − Φ( (c − μ)/σ ) = 1 − Φ( (c − μ) τ^{1/2} ),

where Φ(·) is the standard normal c.d.f. Thus

P(z > c | x) = E_{θ|x}[ 1 − Φ( (c − μ) τ^{1/2} ) ] ≈ (1/N) Σ_{j=1}^N [ 1 − Φ( (c − μ_j) τ_j^{1/2} ) ].
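
Continuing the same example, a sketch of the Monte Carlo approximation to P(z > c | x); the data, the cutoff c, and the sample sizes are again hypothetical.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)

# Regenerate the normal-model posterior draws so the snippet is self-contained
# (hypothetical data and sample size)
x = rng.normal(10.0, 2.0, size=30)
n, xbar, s2 = len(x), x.mean(), x.var(ddof=1)
N = 10_000
tau = rng.gamma((n - 1) / 2, scale=2 / ((n - 1) * s2), size=N)
mu = rng.normal(xbar, np.sqrt(1 / (n * tau)))

c = 14.0
# P(z > c | x) ≈ (1/N) * sum_j [1 - Phi((c - mu_j) * tau_j^{1/2})]
pred_prob = np.mean(1 - norm.cdf((c - mu) * np.sqrt(tau)))
print(pred_prob)
```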

2. Indirect Methods

If we cannot directly sample from the joint posterior (as we could in the previous example), then we cannot use direct Monte Carlo, and we need to search for ways to sample from the joint posterior indirectly. Several methods have been proposed in the literature for indirect sampling. We will discuss three of them:

i) importance sampling
ii) rejection sampling
iii) weighted bootstrap

Importance Sampling

This approach was outlined carefully by Hammersley and Handscomb (1964, Monte Carlo Methods, Chapman and Hall) and championed for Bayesian analysis by Geweke (1989, Econometrica). Suppose we wish to approximate a posterior expectation, say

E[f(θ) | x] = ∫ f(θ) L(θ) π(θ) dθ / ∫ L(θ) π(θ) dθ,

where L(θ) = p(x|θ) is the likelihood function of θ.

Suppose we can roughly approximate the normalized likelihood times the prior, c^{−1} L(θ) π(θ), by some density g(θ) from which we can easily sample, say a multivariate t density or some variation of one. Then, defining the weight function

ω(θ) = L(θ) π(θ) / g(θ),

we have

E[f(θ) | x] = ∫ f(θ) ω(θ) g(θ) dθ / ∫ ω(θ) g(θ) dθ ≈ [ (1/N) Σ_{j=1}^N f(θ_j) ω(θ_j) ] / [ (1/N) Σ_{j=1}^N ω(θ_j) ],

where the θ_j are i.i.d. samples from g(θ). Here, g(θ) is called the importance function. How closely g(θ) resembles c^{−1} L(θ) π(θ) (i.e., the normalized posterior) controls how good the approximation is. To see this, note that if g(θ) is a good approximation, the weights will all be roughly equal, which in turn will minimize the variance of the numerator and denominator in the approximation. If g(θ) is a poor approximation, many of the weights will be close to zero, so a few θ_j's will dominate the sums, producing an inaccurate approximation.
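
A sketch of self-normalized importance sampling for a posterior mean. The normal-likelihood, Cauchy-prior target and the Student-t importance function are illustrative choices, not from the notes; working with log weights, as below, avoids numerical underflow.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Hypothetical target: unnormalized posterior L(theta)*pi(theta) for a normal-mean
# problem with a Cauchy prior (no closed form for the normalizing constant c).
x = rng.normal(1.5, 1.0, size=25)

def log_L_times_pi(theta):
    loglik = stats.norm.logpdf(x[:, None], loc=theta, scale=1.0).sum(axis=0)
    logprior = stats.cauchy.logpdf(theta)
    return loglik + logprior

# Importance function g: a heavy-tailed t density roughly centered on the posterior
g = stats.t(df=4, loc=x.mean(), scale=1.0 / np.sqrt(len(x)))

N = 20_000
theta = g.rvs(size=N, random_state=rng)
log_w = log_L_times_pi(theta) - g.logpdf(theta)      # log of omega(theta_j)
w = np.exp(log_w - log_w.max())                      # stabilize before exponentiating

post_mean = np.sum(theta * w) / np.sum(w)            # self-normalized importance estimate
ess = w.sum() ** 2 / np.sum(w ** 2)                  # roughly equal weights give ess near N
print(post_mean, ess)
```

The effective sample size printed at the end is one simple way to check whether the weights are roughly equal, i.e. whether g is an adequate approximation.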

Rejection Sampling

Rejection sampling is an extremely general and quite common method of random variate generation. Excellent summaries are given in the books by Ripley (1987, John Wiley and Sons) and Devroye (1986, Springer-Verlag). In this method, instead of trying to approximate the normalized posterior

p(θ|x) = L(θ) π(θ) / ∫ L(θ) π(θ) dθ = L(θ) π(θ) / c,

we try to blanket it. That is, suppose there exists an identifiable constant M > 0 and a smooth density g(θ), called an envelope function, such that L(θ) π(θ) < M g(θ) for all θ. The rejection method proceeds as follows:

i) Generate θ_j ~ g(θ).
ii) Generate U ~ U(0, 1).
iii) If M U g(θ_j) < L(θ_j) π(θ_j), accept θ_j; otherwise, reject.
iv) Return to step i) and repeat until the desired number of samples N is obtained.

The members of {θ_j, j = 1, …, N} will then be random variables from p(θ|x). Intuition suggests that M should be chosen as small as possible, so as not to waste samples unnecessarily. This is easy to confirm, since if k denotes the number of iterations required to get one acceptable candidate θ_j, then k is a geometric random variable. That is, P(k = i) = (1 − p)^{i−1} p, where p is the probability of acceptance.

So P(k = i) decreases monotonically, and at an exponential rate. It can be shown that p = c/M, where c = ∫ L(θ) π(θ) dθ. Since E(k) = p^{−1} = M/c, we do need to minimize M. Note that if p(θ|x) were available as the g function, we would choose the minimum acceptable value M = c, obtaining an acceptance probability of 1. The envelope density g should be similar to the posterior in general appearance, but with heavier tails and sharper infinite peaks, in order to assure that there are sufficiently many rejection candidates available across its entire domain.
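
A minimal rejection sampler for an illustrative unnormalized posterior, a Beta(3, 5) kernel, chosen only because its maximum, and hence a valid M, is easy to find; the uniform envelope g and the small slack added to M are assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical unnormalized posterior L(theta)*pi(theta) on (0, 1): a Beta(3, 5) kernel
def L_times_pi(theta):
    return theta ** 2 * (1 - theta) ** 4

# Envelope: g = Uniform(0, 1); M must satisfy L(theta)*pi(theta) < M*g(theta) everywhere.
# The maximum of theta^2 (1 - theta)^4 occurs at theta = 1/3, so this M works (with slack).
M = L_times_pi(1 / 3) * 1.01

samples, draws = [], 0
while len(samples) < 5_000:
    theta = rng.uniform()                    # step i): theta_j ~ g
    u = rng.uniform()                        # step ii): U ~ U(0, 1)
    draws += 1
    if M * u * 1.0 < L_times_pi(theta):      # step iii): accept if M*U*g(theta_j) < L*pi
        samples.append(theta)

samples = np.array(samples)
print(samples.mean(), len(samples) / draws)  # acceptance rate estimates p = c/M
```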

Weighted Bootstrap

This method was presented by Smith and Gelfand (1992, American Statistician) and is very similar to the sampling-importance resampling (SIR) algorithm of Rubin (1988). Suppose an M appropriate for rejection sampling is not readily available, but we do have a sample θ_1, …, θ_N from some approximating density g(θ). Define

ω_i = L(θ_i) π(θ_i) / g(θ_i)   and   q_i = ω_i / Σ_{i=1}^N ω_i.

Now draw θ* from the discrete distribution over {θ_1, …, θ_N} which places mass q_i at θ_i. Then θ* is approximately a draw from p(θ|x), with the approximation improving as N → ∞. This is a weighted bootstrap: instead of resampling from the set with equally likely probabilities of selection, we are resampling some points more than others due to the unequal weighting. To see why this method works, note that for the standard bootstrap,

P(θ* ≤ a) = Σ_{i=1}^N (1/N) I(−∞ < θ_i ≤ a) → ∫_{−∞}^a g(θ) dθ as N → ∞,

so that θ* is approximately distributed as g(θ). For the weighted bootstrap,

P(θ* ≤ a) = Σ_{i=1}^N q_i I(−∞ < θ_i ≤ a) = [ (1/N) Σ_{i=1}^N ω_i I(−∞ < θ_i ≤ a) ] / [ (1/N) Σ_{i=1}^N ω_i ]

→ E_g[ (L(θ)π(θ)/g(θ)) I(−∞ < θ ≤ a) ] / E_g[ L(θ)π(θ)/g(θ) ]  as N → ∞

= [ ∫_{−∞}^a (L(θ)π(θ)/g(θ)) g(θ) dθ ] / [ ∫ (L(θ)π(θ)/g(θ)) g(θ) dθ ]

= [ ∫_{−∞}^a L(θ)π(θ) dθ ] / [ ∫ L(θ)π(θ) dθ ] = ∫_{−∞}^a p(θ|x) dθ.

Similar to importance sampling, we need to choose g(θ) ≈ p(θ|x), or else a very large N will be required to obtain acceptable accuracy.
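
A sketch of the weighted bootstrap (SIR) for the same illustrative Beta(3, 5) kernel used above, with a Beta(2, 2) approximating density g assumed for the initial sample.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

# Hypothetical unnormalized posterior, as before: L(theta)*pi(theta) ∝ theta^2 (1-theta)^4
def L_times_pi(theta):
    return theta ** 2 * (1 - theta) ** 4

# Approximating density g: a Beta(2, 2), from which we can sample easily
g = stats.beta(2, 2)

N = 50_000
theta = g.rvs(size=N, random_state=rng)
w = L_times_pi(theta) / g.pdf(theta)       # omega_i
q = w / w.sum()                            # normalized resampling weights q_i

# Weighted bootstrap: resample from {theta_1,...,theta_N} with probabilities q_i
resampled = rng.choice(theta, size=5_000, replace=True, p=q)
print(resampled.mean())                    # should be close to 3/8, the Beta(3,5) mean
```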

Markov Chain Monte Carlo Methods

Importance sampling, rejection sampling, and the weighted bootstrap are all noniterative methods: they draw a sample of size N, and stop. Hence there is no notion of the algorithm converging; we simply require N sufficiently large. But for many problems, especially high-dimensional ones, it may be difficult or even impossible to find an importance sampling density (or envelope function) which is an acceptably accurate approximation to the posterior, but still easy to sample from. This motivates the need for new algorithms.

Gibbs Sampling

Suppose we have a collection of k random variables denoted by U = (U_1, …, U_k). We assume the full conditional distributions {p(u_i | u_j, j ≠ i), i = 1, …, k} are available for sampling. Here, "available" means that samples may be generated by some method. Thus, we do not require the one-dimensional conditional distributions [U_i | U_j, j ≠ i], i = 1, …, k, to have closed form, but only that we be able to write them up to a normalizing constant.

Gibbs Sampling Algorithm

1. Suppose we have a set of arbitrary starting values {U_1^(0), …, U_k^(0)}.
2. Draw U_1^(1) from [U_1 | U_2^(0), …, U_k^(0)].
3. Draw U_2^(1) from [U_2 | U_1^(1), U_3^(0), …, U_k^(0)].
   ⋮
4. Draw U_k^(1) from [U_k | U_1^(1), …, U_{k−1}^(1)].

This completes one iteration of the Gibbs sampler. Thus after 1 iteration we have {U_1^(1), …, U_k^(1)}, and after t iterations we obtain {U_1^(t), …, U_k^(t)}.

Remark: For notational convenience, we let [X | Y] denote the conditional distribution of X given Y. We are led to the following theorem.

Theorem: For the Gibbs sampling algorithm outlined above,
a) {U_1^(t), …, U_k^(t)} → [U_1, …, U_k] in distribution as t → ∞.
b) The convergence in a) is exponential in t.

See Geman and Geman (1984, IEEE); Schervish and Carlin (1992, Journal of Comp. and Graph. Stat.); Gelfand and Smith (1990, JASA).

In the context of Bayesian analysis, we are interested in sampling from the joint posterior distribution [U_1, …, U_k | x]. The Gibbs sampler then requires draws from each of the univariate conditional distributions p(u_i | u_j, j ≠ i, x). A marginal density estimate of U_i is given by

ˆp(u_i | x) = (1/m) Σ_{j=1}^m p(u_i | u_{1,j}^(t), …, u_{i−1,j}^(t), u_{i+1,j}^(t), …, u_{k,j}^(t), x),

where j indexes the m Monte Carlo replications. This type of estimate has less variability than an estimate obtained by a kernel density.

Example: Suppose x_1, …, x_n are i.i.d. N(μ, σ²), τ = 1/σ², and π(μ, τ) ∝ τ^{−1}. We want to devise a Gibbs sampler to draw from the joint posterior density of (μ, τ). We have

μ | τ, x ~ N( x̄, 1/(nτ) ),
τ | μ, x ~ Gamma( n/2, Σ_i (x_i − μ)² / 2 ).

To obtain joint posterior samples from [μ, τ | x] we:

1. Start with arbitrary (μ^(0), τ^(0)).
2. Sample μ^(1) from [μ | τ^(0), x].
3. Sample τ^(1) from [τ | μ^(1), x].

Repeat until we have t iterations, yielding (μ^(1), τ^(1)), …, (μ^(t), τ^(t)).
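
A minimal Gibbs sampler for this normal example, with simulated data and arbitrary starting values; the burn-in length is an ad hoc choice of the sketch.

```python
import numpy as np

rng = np.random.default_rng(6)

# Hypothetical data; Gibbs sampler for (mu, tau) under pi(mu, tau) ∝ 1/tau
x = rng.normal(5.0, 3.0, size=40)
n, xbar = len(x), x.mean()

T = 5_000
mu = np.empty(T)
tau = np.empty(T)
mu_cur, tau_cur = 0.0, 1.0                 # arbitrary starting values

for t in range(T):
    # mu | tau, x ~ N(xbar, 1/(n*tau))
    mu_cur = rng.normal(xbar, np.sqrt(1 / (n * tau_cur)))
    # tau | mu, x ~ Gamma(n/2, rate sum((x_i - mu)^2)/2); numpy uses shape/scale
    rate = np.sum((x - mu_cur) ** 2) / 2
    tau_cur = rng.gamma(shape=n / 2, scale=1 / rate)
    mu[t], tau[t] = mu_cur, tau_cur

burn = 1_000                               # discard an initial transient
print(mu[burn:].mean(), np.sqrt(1 / tau[burn:]).mean())
```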

Example: Logistic Regression

Suppose y_1, …, y_n are independent Binomial(1, p_i), where

p_i = exp{β_0 + β_1 x_i} / (1 + exp{β_0 + β_1 x_i}), i = 1, …, n.

We have

L(β) = exp{ Σ_{i=1}^n [ y_i (β_0 + β_1 x_i) − log(1 + exp{β_0 + β_1 x_i}) ] }.

How do we set up the Gibbs sampler for sampling from the joint posterior [β_0, β_1 | x, y]? The univariate conditionals do not have closed form.

We have p(β_0 | β_1, x, y) ∝ L(β), where β_1 is treated as fixed, and p(β_1 | β_0, x, y) ∝ L(β), where β_0 is treated as fixed. Note that the univariate conditionals for β_0 and β_1 have exactly the same functional form, with different variables being treated as fixed. In general, a univariate conditional distribution is always proportional to the joint distribution.

That is, if θ = (θ_1, …, θ_k), then

p(θ_i | θ_j, j ≠ i, x) ∝ L(θ) π(θ), i = 1, …, k.

For the logistic regression problem, since the univariate conditionals do not have closed form, we need another algorithm to sample from them. In many situations, we need to use rejection sampling within Gibbs sampling; see Gelfand and Smith (1990), Gilks et al. (1993, JRSS-B), Gelfand et al. (1990, JASA), Wakefield et al. (1994, Appl. Stat.).

Metropolis-Hastings Algorithm

Like the Gibbs sampling algorithm, the Metropolis-Hastings algorithm is an MCMC (Markov chain Monte Carlo) method. The originator of the algorithm is Metropolis et al. (1953, J. Chem. Phys.); Hastings (1970, Biometrika) introduced the algorithm for statistical problems. Suppose we wish to sample from the joint distribution [U] = [U_1, …, U_k]. Denote the density of U by p(u).

Let q(u, v) be a density in u such that q(u, v) = q(v, u). The function q is called a candidate or proposal density. We generate values as follows.

Metropolis Algorithm
1. Draw v ~ q(·, u), where u = U^(t−1), the current state of the Metropolis algorithm.
2. Compute the odds ratio

r = p(v)/p(u) = L(v)π(v) / ( L(u)π(u) ).

3. Set U^(t) = v with probability min(r, 1) and U^(t) = u with probability 1 − min(r, 1).
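
As an illustration (returning to the logistic regression example above, whose conditionals have no closed form), here is a sketch of a random-walk Metropolis sampler for [β_0, β_1 | x, y] under a flat prior; the simulated data, the proposal standard deviation, and the burn-in are all assumptions of the sketch.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical logistic-regression data; flat prior, so p(beta | x, y) ∝ L(beta)
n = 200
x = rng.normal(size=n)
p_true = 1 / (1 + np.exp(-(-0.5 + 1.2 * x)))
y = rng.binomial(1, p_true)

def log_post(beta):
    eta = beta[0] + beta[1] * x
    return np.sum(y * eta - np.log1p(np.exp(eta)))   # log L(beta), flat prior

T = 20_000
beta = np.zeros((T, 2))
cur = np.zeros(2)
cur_lp = log_post(cur)
step = 0.2                                 # sd of the symmetric Gaussian proposal q
accepts = 0

for t in range(T):
    prop = cur + step * rng.normal(size=2)           # draw v ~ q(., u), symmetric in u and v
    prop_lp = log_post(prop)
    # accept with probability min(r, 1), r = p(v)/p(u); compare on the log scale
    if np.log(rng.uniform()) < prop_lp - cur_lp:
        cur, cur_lp = prop, prop_lp
        accepts += 1
    beta[t] = cur

print(beta[5_000:].mean(axis=0), accepts / T)        # posterior means and acceptance rate
```

Working with log p rather than p in the acceptance step is the same log-scale trick used later in the beetle mortality example.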

Theorem: For the Metropolis algorithm, under mild conditions, [U^(t)] → [U] as t → ∞.

Remark: The Gibbs sampling algorithm and the Metropolis-Hastings algorithm are Markov chain Monte Carlo algorithms, since the samples generated by these algorithms are samples from a certain Markov chain.

For continuous parameter settings, the most convenient choice for q is a N(θ^(t−1), Σ̃) density, where Σ̃ can be taken as

Σ̃ = [ −∂² log p*(θ|x) / ∂θ ∂θ′ ]^{−1}, evaluated at θ = θ^(t−1),

where p*(θ|x) = L(θ)π(θ). We note that this choice of q is easily sampled and clearly symmetric in v and θ^(t−1). A simple but important generalization of the Metropolis algorithm was provided by Hastings (1970, Biometrika). Hastings drops the requirement that q(u, v) be symmetric, and redefines the odds ratio as

r = p(v) q(u, v) / [ p(u) q(v, u) ].

With this modification, it can be shown that the algorithm converges to the required target distribution for any candidate density q. Chib and Greenberg (1995, American Statistician) give an excellent discussion of Metropolis-Hastings algorithms and discuss choices of q(u, v).

Example: Consider the data of Bliss (1935, Annals of Applied Biology). The data record the number of adult beetles killed after 5 hours of exposure to various levels of gaseous carbon disulphide (CS_2); for each dosage w_i, the data give the number killed y_i and the number exposed n_i.

Consider the model

P(Death | w_i) ≡ h(w_i) = [ exp(x_i) / (1 + exp(x_i)) ]^{m_1},

where m_1 > 0, w_i is the covariate (dose), and x_i = (w_i − μ)/σ, for μ ∈ R and σ > 0. For the prior distributions we take

m_1 ~ Gamma(a_0, b_0), μ ~ N(c_0, d_0), σ² ~ IG(e_0, f_0),

the last being the inverse gamma distribution.

Note that if z ~ IG(a, b), then p(z | a, b) ∝ z^{−(a+1)} e^{−1/(bz)} and z^{−1} ~ Gamma(a, b). We take m_1, μ, σ² to be independent a priori. The joint posterior is given by

p(m_1, μ, σ² | y) ∝ p(y | m_1, μ, σ²) π(m_1, μ, σ²)
∝ { Π_{i=1}^k [h(w_i)]^{y_i} [1 − h(w_i)]^{n_i − y_i} } m_1^{a_0−1} (σ²)^{−(e_0+1)} exp{ −(1/2)((μ − c_0)/d_0)² − m_1/b_0 − 1/(f_0 σ²) }.

Let θ = (θ_1, θ_2, θ_3) = (μ, (1/2) log σ², log m_1). This transforms the parameter space to R³, which is convenient if we want to work with a Gaussian proposal density. Upon making this transformation, we get

p(θ | y) ∝ { Π_{i=1}^k [h(w_i)]^{y_i} [1 − h(w_i)]^{n_i − y_i} } exp{ a_0 θ_3 − 2 e_0 θ_2 } exp{ −(1/2)((θ_1 − c_0)/d_0)² − exp(θ_3)/b_0 − exp(−2θ_2)/f_0 }.

Numerical stability is improved by working on the log scale when computing the Metropolis odds ratio, i.e.,

r = exp[ log p*(v | y) − log p*(θ^(t−1) | y) ].

The hyperparameters are chosen to be a_0 = 0.25 and b_0 = 4, so that m_1 has prior mean 1 (corresponding to the standard logit model) and prior standard deviation 2. Vague priors are specified for μ and σ², by setting c_0 = 2, d_0 = 10, e_0 = 2, and f_0 = …. We use a N_3(θ^(t−1), Σ̃) proposal density, with Σ̃ = Diag(…, 0.033, 0.10).

The chains mix very slowly, as can be seen from the high lag-1 sample autocorrelations, estimated from the chain 2 output and printed above the monitoring plots. The reason for this slow convergence is the high posterior correlation amongst the parameters, estimated as ˆρ(θ_1, θ_2) = 0.78, ˆρ(θ_1, θ_3) = 0.94, and ˆρ(θ_2, θ_3) = …. As a result the proposal acceptance rate is low (13.5%), and convergence is slow. Convergence can be accelerated by using a nondiagonal proposal covariance matrix designed to better mimic the posterior surface.

From the output of the initial chains, we obtain a better estimate of the posterior covariance matrix in the usual way as

ˆΣ = (1/N) Σ_{j=1}^N (θ_j − θ̄)(θ_j − θ̄)′,

where j indexes the Monte Carlo samples. Now consider the proposal covariance Σ̃ = 2ˆΣ. The results indicate improved convergence, with lower observed autocorrelations and a higher Metropolis acceptance rate (27.3%).
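
A sketch of this covariance re-estimation step; the pilot-chain draws below are a simulated stand-in for real Metropolis output, and the mean and covariance used to fake them are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(8)

# Stand-in for pilot-chain output theta_pilot of shape (N, 3) from an initial run
theta_pilot = rng.multivariate_normal(
    mean=[1.8, -0.6, 0.3],
    cov=[[0.01, 0.006, -0.008], [0.006, 0.03, -0.02], [-0.008, -0.02, 0.09]],
    size=2_000,
)

theta_bar = theta_pilot.mean(axis=0)
diff = theta_pilot - theta_bar
Sigma_hat = diff.T @ diff / len(theta_pilot)          # (1/N) * sum (theta_j - bar)(theta_j - bar)'
Sigma_proposal = 2 * Sigma_hat                        # proposal covariance 2 * Sigma_hat

# New proposals would then be drawn as N_3(theta^(t-1), Sigma_proposal):
proposal = rng.multivariate_normal(theta_bar, Sigma_proposal)
print(Sigma_proposal)
```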

Convergence Monitoring and Acceleration

When we say that an MCMC algorithm has converged at iteration T, we mean that its output can be safely thought of as coming from the true stationary distribution of the Markov chain for all t > T. The most common source of MCMC convergence difficulty is model overparameterization. When a model becomes overparameterized, the parameters become nonidentifiable. As an example, consider the problem of finding the posterior distribution of θ_1, where the likelihood of x_i is defined by x_i ~ N(θ_1 + θ_2, 1),

for i = 1, …, n, and we take π(θ_1, θ_2) ∝ 1. Here, only the sum of the two parameters is identified by the data, so without proper priors for θ_1 and θ_2, their marginal posterior distributions are improper as well. Unfortunately, a naive Gibbs sampler would not reveal this problem. In addition, models that are overparameterized typically lead to high posterior correlations amongst the parameters, or cross-correlations. The traditional remedy is to reparameterize; see Gelfand et al. (1995). Alternatively, convergence may be monitored by a diagnostic statistic.

Characteristics of a diagnostic statistic:

1. Theoretical basis - based on some expected criterion
2. Diagnostic goal, e.g. reducing variance and bias
3. Output format, e.g. graphical
4. Replication, e.g. comparing replicate chains
5. Dimensionality, considering different aspects of the posterior sampling domain
6. Applicability, depending on the algorithm
7. Ease of use, i.e. interpretability and computational efficiency

Convergence Diagnostic Statistics

The single most popular approach is due to Gelman and Rubin (1992, Statistical Science). In this approach we:

1. Run m parallel chains from different starting values.
2. These chains must be initially overdispersed with respect to the true posterior.
3. Running the m chains for 2N iterations each, we then compare the variation within each chain to the variation between the chains.

Specifically, we monitor convergence by the scaled reduction factor,

ˆR = [ (N − 1)/N + ((m + 1)/(mN)) (B/W) ] ( df / (df − 2) ),

where B/N is the variance between the means from the m parallel chains, W is the average of the m within-chain variances, and df is the degrees of freedom of an approximating t density to the posterior distribution. Gelman and Rubin (1992) show that ˆR → 1 as N → ∞, with a value near one suggesting good convergence.
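
A sketch of the scale reduction factor for one scalar parameter, assuming the last N iterations of m chains are stored in an array; the degrees of freedom df, which would ordinarily be estimated from the approximating t density, is passed in as a fixed value here.

```python
import numpy as np

def gelman_rubin(chains, df):
    """Scaled reduction factor for one parameter; `chains` has shape (m, N)."""
    m, N = chains.shape
    chain_means = chains.mean(axis=1)
    W = chains.var(axis=1, ddof=1).mean()             # average within-chain variance
    B_over_N = chain_means.var(ddof=1)                # B/N: variance between chain means
    return ((N - 1) / N + (m + 1) / (m * N) * (B_over_N * N) / W) * df / (df - 2)

rng = np.random.default_rng(9)
chains = rng.normal(0.0, 1.0, size=(4, 2_000))        # hypothetical well-mixed chains
# df would normally be estimated; a large value makes the df/(df-2) correction ≈ 1
print(gelman_rubin(chains, df=1000))                  # values near 1 suggest convergence
```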

This approach has been criticized because:

1. It is a univariate diagnostic, and must be applied to each parameter separately.
2. The approach focuses solely on the bias component of convergence, with no information on the accuracy of the estimates.
3. The method relies heavily on the user's ability to find a starting distribution that is actually overdispersed with respect to the true posterior distribution, a condition we cannot really check without knowledge of the latter.
