
1 Dynamic Macro: Bayesian Estimation. Petr Sedláček, Bonn University, Summer. 1 / 114

2 Overall plan Motivation Week 1: Use of computational tools, simple DSGE model Tools necessary to solve models and a solution method Week 2: function approximation and numerical integration Week 3: theory of perturbation (1st and higher-order) Tools necessary for, and principles of, estimation Week 4: Kalman filter and Maximum Likelihood estimation Week 5: principles of Bayesian estimation 2 / 114

3 Plan for today Bayesian estimation: the basic ideas extra information over ML: priors main challenge: evaluating the posterior Markov Chain Monte Carlo (MCMC) practical issues: acceptance rate, diagnostics implementation in Dynare 3 / 114

4 Frequentist vs. Bayesian views Bayes rule Bayesian estimation: basic concepts 4 / 114

5 Frequentist vs. Bayesian views Bayes rule Frequentist vs. Bayesian views Frequentist view: parameters are fixed, but unknown; the likelihood is a sampling distribution for the data; the realization of the observables Y^T is just one of many possible realizations from L(Y^T | Ψ); inferences about Ψ are based on probabilities of particular Y^T for given Ψ 5 / 114

6 Frequentist vs. Bayesian views Bayes rule Frequentist vs. Bayesian views Bayesian view: observations, not parameters, are taken as given; the parameters Ψ are viewed as random; inference about Ψ is based on probabilities of Ψ conditional on the data Y^T, i.e. P(Ψ | Y^T); the probabilistic view of Ψ enables incorporation of prior beliefs. Sims (2007): Bayesian inference is a way of thinking, not a basket of methods 6 / 114

7 Frequentist vs. Bayesian views Bayes rule Bayes rule / 114

8 Frequentist vs. Bayesian views Bayes rule Bayes rule The joint density of the data and parameters is P(Y^T, Ψ) = L(Y^T | Ψ) P(Ψ), or equivalently P(Y^T, Ψ) = P(Ψ | Y^T) P(Y^T). From the above we get Bayes rule: P(Ψ | Y^T) = L(Y^T | Ψ) P(Ψ) / P(Y^T) 8 / 114

9 Frequentist vs. Bayesian views Bayes rule Elements of Bayes rule what we're interested in, the posterior distribution: P(Ψ | Y^T); the likelihood of the data: L(Y^T | Ψ); our prior about the parameters: P(Ψ); the probability of the data: P(Y^T); for the distribution of Ψ, P(Y^T) is just a constant, so P(Ψ | Y^T) ∝ L(Y^T | Ψ) P(Ψ) 9 / 114

10 Frequentist vs. Bayesian views Bayes rule What is the challenge? getting the posterior is typically not such a big deal; the problem is that we often want to know more: conditional expected values of a function of the posterior, like the mean, variance, mode etc. 10 / 114

11 Frequentist vs. Bayesian views Bayes rule What is the challenge? E[g(Ψ)] = ∫ g(Ψ) P(Ψ | Y^T) dΨ / ∫ P(Ψ | Y^T) dΨ; E[g(Ψ)] is the weighted average of g(Ψ), where the weights are determined by the data (likelihood) and the prior 11 / 114

12 Frequentist vs. Bayesian views Bayes rule What is the challenge? we need to be able to evaluate the integral! Special/Simple case: we are able to draw Ψ from P(Ψ | Y^T) and can evaluate the integral via Monte Carlo integration; you won't be lucky enough to experience this case 12 / 114
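To make this special case concrete, here is a minimal Monte Carlo integration sketch (not from the slides), in which the "posterior" is, purely for illustration, a Gamma distribution we can sample from directly:

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical example: pretend the posterior P(Psi | Y^T) is a Gamma(2, 0.5)
# distribution that we can sample from directly.
draws = rng.gamma(shape=2.0, scale=0.5, size=100_000)

# Monte Carlo estimate of E[g(Psi)] for, e.g., g(Psi) = Psi^2
g = draws**2
print("MC estimate of E[Psi^2]:", g.mean())
# numerical standard error of the MC estimate (i.i.d. draws)
print("numerical s.e.:", g.std(ddof=1) / np.sqrt(len(g)))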

13 Frequentist vs. Bayesian views Bayes rule Our situation: we can calculate P(Ψ | Y^T), but we cannot draw from it. Solutions: numerical integration; Markov Chain Monte Carlo (MCMC) integration. What is the standard? although numerical integration is fast and accurate, its computational burden rises exponentially with dimension, so it is suited for low-dimension problems; use MCMC methods 13 / 114

14 14 / 114

15 Idea of priors summarize prior information: previous studies, data not used in estimation, pre-sample data, other countries etc.; don't be too restrictive; more on prior selection in extensions 15 / 114

16 Most commonly used distributions: normal; beta, support [0, 1], persistence parameters; (inverted-) gamma, support (0, ∞), volatility parameters; uniform 16 / 114

17 Prior predictive analysis check whether priors make sense use the prior as the posterior steady state? impulse response functions? 17 / 114

18 Some terminology Jeffreys prior: a non-informative prior; improper vs. proper priors: an improper prior is non-integrable (the integral is ∞); important to have proper distributions for model comparison 18 / 114

19 Some terminology (natural) conjugate priors family of prior distributions after multiplication with the likelihood produce a posterior of the same family Minnesota (Litterman) prior used in VARs for distribution of lags 19 / 114

20 Importance sampling Markov Chain Monte Carlo Gibbs algorithm Metropolis-Hastings algorithm Practical issues with MH algorithm 20 / 114

21 Importance sampling Markov Chain Monte Carlo Gibbs algorithm Metropolis-Hastings algorithm Practical issues with MH algorithm Starting point Aim is to be able to calculate something like E[g(Ψ)] = ∫ g(Ψ) P(Ψ | Y^T) dΨ / ∫ P(Ψ | Y^T) dΨ; we know how to calculate P(Ψ | Y^T) but we cannot draw from it; the system is too large for numerical integration 21 / 114

22 Importance sampling Markov Chain Monte Carlo Gibbs algorithm Metropolis-Hastings algorithm Practical issues with MH algorithm Principle of posterior evaluation We cannot draw from the target distribution, but 1. can draw from a different, stand-in, distribution 2. can evaluate both stand-in and target distributions 3. comparing the two, we can re-weigh the draw cleverly 22 / 114

23 Importance sampling Markov Chain Monte Carlo Gibbs algorithm Metropolis-Hastings algorithm Practical issues with MH algorithm Principle of posterior evaluation the above procedure is the idea of importance sampling; MCMC methods are effectively a version of importance sampling in which traveling through the parameter space and/or the acceptance probability is more sophisticated 23 / 114

24 Importance sampling Markov Chain Monte Carlo Gibbs algorithm Metropolis-Hastings algorithm Practical issues with MH algorithm A few simple examples Problem: we want to simulate x, where x comes from a truncated normal with mean µ and variance σ², and a < x < b. Solution: 1. draw y from N(µ, σ²); 2a. if y ∈ (a, b) then keep the draw (accept) and go back to 1; 2b. otherwise discard the draw (reject) and go back to 1 24 / 114

25 Importance sampling Markov Chain Monte Carlo Gibbs algorithm Metropolis-Hastings algorithm Practical issues with MH algorithm A few simple examples Problem: want to draw x from F(x), but we cannot; we can sample from G(x) and f(x) ≤ c g(x) for all x. Solution: 1. sample y from G(y); 2. accept the draw with probability f(y) / (c g(y)) and go back to 1. Note: the acceptance rate is higher for lower c; the optimal c is c = sup_x f(x)/g(x); the Metropolis-Hastings sampler (MCMC) is a generalization 25 / 114
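A small sketch of this accept/reject scheme under illustrative assumptions: the target f is taken to be a Beta(2, 5) density and the proposal g a uniform on [0, 1], with c approximated by the maximum of f/g on a grid.

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

f = stats.beta(2, 5).pdf          # target density f(x), illustrative choice
g = stats.uniform(0, 1).pdf       # proposal density g(x), easy to sample from
grid = np.linspace(0, 1, 10_001)
c = (f(grid) / g(grid)).max()     # approximate c = sup_x f(x)/g(x)

samples = []
while len(samples) < 10_000:
    y = rng.uniform(0, 1)                   # 1. sample y from G
    if rng.uniform() < f(y) / (c * g(y)):   # 2. accept with prob f(y)/(c g(y))
        samples.append(y)

print("theoretical acceptance rate 1/c:", 1 / c)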

26 Importance sampling Markov Chain Monte Carlo Gibbs algorithm Metropolis-Hastings algorithm Practical issues with MH algorithm Importance sampling Main idea very similar to the previous example: cannot draw from P(Ψ | Y^T) but can draw from H(Ψ); be smart in reweighting (accepting) the draws 26 / 114

27 Importance sampling Markov Chain Monte Carlo Gibbs algorithm Metropolis-Hastings algorithm Practical issues with MH algorithm Importance sampling E[g(Ψ)] = ∫ g(Ψ) [P(Ψ | Y^T)/h(Ψ)] h(Ψ) dΨ / ∫ [P(Ψ | Y^T)/h(Ψ)] h(Ψ) dΨ = ∫ g(Ψ) ω(Ψ) h(Ψ) dΨ / ∫ ω(Ψ) h(Ψ) dΨ, where ω(Ψ) = P(Ψ | Y^T)/h(Ψ) 27 / 114

28 Importance sampling Markov Chain Monte Carlo Gibbs algorithm Metropolis-Hastings algorithm Practical issues with MH algorithm Importance sampling Approximate the integral using MC integration: E[g(Ψ)] ≈ Σ_{m=1}^{M} ω(Ψ^(m)) g(Ψ^(m)) / Σ_{m=1}^{M} ω(Ψ^(m)), where M is the number of draws from the importance function h(Ψ) 28 / 114
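A sketch of this importance-sampling estimator under illustrative assumptions: the unnormalized target kernel is a standard normal restricted to positive values (standing in for a posterior kernel we can evaluate but not sample from), and the importance function h is a fat-tailed Student-t.

import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

def target_kernel(psi):
    # unnormalized target P(Psi | Y^T): standard normal truncated to psi > 0
    # (illustrative stand-in for a posterior kernel we can evaluate)
    return np.where(psi > 0, np.exp(-0.5 * psi**2), 0.0)

M = 200_000
psi = rng.standard_t(3, size=M)            # draws from fat-tailed h(.)
w = target_kernel(psi) / stats.t.pdf(psi, df=3)   # importance weights omega(Psi)

g = psi**2                                 # example g(Psi)
print("E[g(Psi)] approx:", np.sum(w * g) / np.sum(w))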

29 Importance sampling Markov Chain Monte Carlo Gibbs algorithm Metropolis-Hastings algorithm Practical issues with MH algorithm Importance sampling How to best choose h(.)? we'd like h(.) to have fatter tails compared to f(.); the normal distribution has rather thin tails and is often not a good importance function 29 / 114

30 Importance sampling Markov Chain Monte Carlo Gibbs algorithm Metropolis-Hastings algorithm Practical issues with MH algorithm Before we move on 3 doors, behind one of them is a car pick one I will open one of the remaining two without the car you can choose to stick with your choice or switch who stays and who switches? 30 / 114

31 Importance sampling Markov Chain Monte Carlo Gibbs algorithm Metropolis-Hastings algorithm Practical issues with MH algorithm Some preliminaries for MCMC Markov property: if for all k ≥ 1 and all t, P(x_{t+1} | x_t, x_{t-1}, ..., x_{t-k}) = P(x_{t+1} | x_t). Transition kernel: K(x, y) = P(x_{t+1} = y | x_t = x) for x, y ∈ X, where X is the sample space 31 / 114

32 Importance sampling Markov Chain Monte Carlo Gibbs algorithm Metropolis-Hastings algorithm Practical issues with MH algorithm Main idea behind MCMC methods as before, we'd like to sample from P(Ψ | Y^T), but we cannot; MCMC methods provide a way to create a Markov chain transition kernel (K) for Ψ that has the invariant density P(Ψ | Y^T); given K, simulate the Markov chain (iterating P' = KP) starting from some initial values P(Ψ_0); (eventually) the distribution of the Markov chain converges to P(Ψ | Y^T) 32 / 114

33 Importance sampling Markov Chain Monte Carlo Gibbs algorithm Metropolis-Hastings algorithm Practical issues with MH algorithm Main idea behind MCMC methods a principle of constructing such kernels Metropolis (-Hastings) algorithm (MH) the Gibbs sampler is a special case 33 / 114

34 Importance sampling Markov Chain Monte Carlo Gibbs algorithm Metropolis-Hastings algorithm Practical issues with MH algorithm Gibbs algorithm special case of the MH algorithm applies when can sample from each conditional distribution again, this will rarely be applicable in our case 34 / 114

35 Importance sampling Markov Chain Monte Carlo Gibbs algorithm Metropolis-Hastings algorithm Practical issues with MH algorithm Gibbs algorithm instead of draws of Ψ from P(Ψ | Y^T): partition Ψ into k blocks; sample each block from P(Ψ_j | Y^T, Ψ_{-j}) for j = 1, ..., k; iterate until convergence 35 / 114

36 Importance sampling Markov Chain Monte Carlo Gibbs algorithm Metropolis-Hastings algorithm Practical issues with MH algorithm Gibbs sampling Iterations (k = 2): initiate the sample with Ψ_0, then iterate according to: Ψ^1_{i+1} ~ P(Ψ^1 | Y^T, Ψ^2_i) and Ψ^2_{i+1} ~ P(Ψ^2 | Y^T, Ψ^1_{i+1}); one can prove that the above converges to P(Ψ | Y^T); discard the first B draws to eliminate the influence of Ψ_0 36 / 114
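A minimal Gibbs-sampler sketch for an illustrative two-block case (not the DSGE setting): Ψ = (Ψ^1, Ψ^2) is bivariate normal with correlation ρ, so both conditional distributions are known normals we can draw from.

import numpy as np

rng = np.random.default_rng(3)
rho = 0.8                 # illustrative correlation between the two blocks
n_draws, burn = 20_000, 2_000

psi1, psi2 = 0.0, 0.0     # Psi_0: initial values
chain = np.empty((n_draws, 2))
for i in range(n_draws):
    # Psi^1_{i+1} ~ P(Psi^1 | Y^T, Psi^2_i): here N(rho*psi2, 1 - rho^2)
    psi1 = rng.normal(rho * psi2, np.sqrt(1 - rho**2))
    # Psi^2_{i+1} ~ P(Psi^2 | Y^T, Psi^1_{i+1})
    psi2 = rng.normal(rho * psi1, np.sqrt(1 - rho**2))
    chain[i] = psi1, psi2

kept = chain[burn:]       # discard the first B draws
print("posterior means:", kept.mean(axis=0))
print("correlation between blocks:", np.corrcoef(kept.T)[0, 1])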

37 Importance sampling Markov Chain Monte Carlo Gibbs algorithm Metropolis-Hastings algorithm Practical issues with MH algorithm Gibbs sampling once the Markov chain has converged, proceed as if we could sample directly: E[g(Ψ)] ≈ (1/m) Σ_{i=1}^{m} g(Ψ_i) 37 / 114

38 Importance sampling Markov Chain Monte Carlo Gibbs algorithm Metropolis-Hastings algorithm Practical issues with MH algorithm Gibbs sampling however, the draws are serially correlated, so standard errors are higher: σ(E[g(Ψ)]) = [ (1/m) ( σ²_0 + 2 Σ_{l=1}^{m-1} ((m-l)/m) γ_l ) ]^{1/2}, where σ²_0 is the variance of g(Ψ) and γ_l is the l-th order autocovariance of g(Ψ) 38 / 114
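The corrected numerical standard error can be computed from the sample autocovariances of the draws; the sketch below truncates the sum at a bandwidth L, which is a practical assumption on our part rather than part of the formula above, and uses an AR(1) series as illustrative serially correlated draws.

import numpy as np

def mc_standard_error(g, L=None):
    # numerical s.e. of the mean of serially correlated draws g(Psi_i):
    # sqrt( (1/m) * ( gamma_0 + 2 * sum_l ((m-l)/m) * gamma_l ) ),
    # with the sum truncated at bandwidth L (a practical choice, not from the slides)
    g = np.asarray(g, dtype=float)
    m = len(g)
    if L is None:
        L = int(round(m ** (1 / 3)))        # rule-of-thumb bandwidth (assumption)
    dev = g - g.mean()
    gamma = [np.dot(dev[: m - l], dev[l:]) / m for l in range(L + 1)]
    long_run = gamma[0] + 2.0 * sum((1 - l / m) * gamma[l] for l in range(1, L + 1))
    return np.sqrt(long_run / m)

# illustrative serially correlated draws: an AR(1) process
rng = np.random.default_rng(4)
g = np.empty(5_000)
g[0] = 0.0
for t in range(1, len(g)):
    g[t] = 0.9 * g[t - 1] + rng.normal()

print("naive i.i.d. s.e.        :", g.std(ddof=1) / np.sqrt(len(g)))
print("autocorrelation-corrected:", mc_standard_error(g))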

39 Importance sampling Markov Chain Monte Carlo Gibbs algorithm Metropolis-Hastings algorithm Practical issues with MH algorithm Metropolis-Hastings algorithm Main idea same as with importance sampling: 1. draw from a stand-in distribution h(Ψ; θ), where θ explicitly shows the parameters of the stand-in distribution, e.g. mean (µ_h) and variance (σ²_h); 2. accept/reject based on probability q(Ψ_{i+1} | Ψ_i); 3. go back to 1: 3a. the stand-in density does not change (independence MH), 3b. the mean of the stand-in adjusts (random walk MH); one can show convergence to the target distribution 39 / 114

40 Importance sampling Markov Chain Monte Carlo Gibbs algorithm Metropolis-Hastings algorithm Practical issues with MH algorithm Acceptance probability Metropolis q(Ψ_{i+1} | Ψ_i) = min[ 1, P(Ψ_{i+1} | Y^T) / P(Ψ_i | Y^T) ], where Ψ_{i+1} is the new candidate draw from the stand-in distribution; if P(Ψ_{i+1} | Y^T) is high relative to P(Ψ_i | Y^T), the probability of Ψ_{i+1} is relatively high and we should accept 40 / 114

41 Importance sampling Markov Chain Monte Carlo Gibbs algorithm Metropolis-Hastings algorithm Practical issues with MH algorithm Acceptance probability Metropolis-Hastings q(Ψ_{i+1} | Ψ_i) = min[ 1, (P(Ψ_{i+1} | Y^T) / P(Ψ_i | Y^T)) × (h(Ψ_i; θ) / h(Ψ_{i+1}; θ)) ]; scale down by the relative likelihood in the stand-in density: a more common draw from the stand-in gets less weight, so q(Ψ_{i+1} | Ψ_i) is lowered 41 / 114

42 Importance sampling Markov Chain Monte Carlo Gibbs algorithm Metropolis-Hastings algorithm Practical issues with MH algorithm Acceptance probability Metropolis-Hastings q(Ψ_{i+1} | Ψ_i) = min[ 1, (P(Ψ_{i+1} | Y^T) / P(Ψ_i | Y^T)) × (h(Ψ_i; θ) / h(Ψ_{i+1}; θ)) ]; P(Ψ_{i+1} | Y^T)/h(Ψ_{i+1}; θ) high → high probability of Ψ_{i+1} in the target distribution → should accept → higher q(Ψ_{i+1} | Ψ_i); P(Ψ_i | Y^T)/h(Ψ_i; θ) high → lower q(Ψ_{i+1} | Ψ_i): the last draw was already in a likely part of the parameter space, so force the algorithm to explore less likely areas 42 / 114

43 Importance sampling Markov Chain Monte Carlo Gibbs algorithm Metropolis-Hastings algorithm Practical issues with MH algorithm Updating the stand-in density Independence chain variant stand-in distribution does not change it is independent across Monte Carlo replications this is also the case in importance-sampling 43 / 114

44 Importance sampling Markov Chain Monte Carlo Gibbs algorithm Metropolis-Hastings algorithm Practical issues with MH algorithm Updating the stand-in density Random walk variant candidate draws are obtained according to Ψ_{i+1} = Ψ_i + ε_{i+1}, with ε_i from a symmetric density around 0 with variance σ²_h; as if the mean of the stand-in density adjusts with each accepted draw: in θ, µ_h = Ψ_i 44 / 114
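A compact random-walk Metropolis sketch under illustrative assumptions: a correlated bivariate normal kernel stands in for the log-posterior log P(Y^T | Ψ) + log P(Ψ), and the proposal standard deviation σ_h is an arbitrary illustrative choice.

import numpy as np

rng = np.random.default_rng(5)

def log_posterior_kernel(psi):
    # stands in for log L(Y^T | Psi) + log P(Psi); illustrative bivariate normal
    cov_inv = np.linalg.inv(np.array([[1.0, 0.6], [0.6, 1.0]]))
    return -0.5 * psi @ cov_inv @ psi

n_draws, sigma_h = 50_000, 1.0        # sigma_h: proposal std dev (illustrative)
psi = np.zeros(2)                     # start at the (here known) mode
chain, accepted = np.empty((n_draws, 2)), 0
lp = log_posterior_kernel(psi)

for i in range(n_draws):
    cand = psi + rng.normal(0.0, sigma_h, size=2)   # Psi_{i+1} = Psi_i + eps
    lp_cand = log_posterior_kernel(cand)
    # Metropolis acceptance: min(1, P(cand | Y^T) / P(psi | Y^T)) via log difference
    if np.log(rng.uniform()) < lp_cand - lp:
        psi, lp = cand, lp_cand
        accepted += 1
    chain[i] = psi

print("acceptance rate:", accepted / n_draws)
print("posterior mean estimate:", chain[5_000:].mean(axis=0))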

45 Importance sampling Markov Chain Monte Carlo Gibbs algorithm Metropolis-Hastings algorithm Practical issues with MH algorithm Summary of MCMC with MH algorithm 1. maximize the log-posterior log P(Y^T | Ψ) + log P(Ψ); this yields the posterior mode Ψ̂; 2. draw from a stand-in distribution h(Ψ; θ), which should have fatter tails than the posterior; 3. accept/reject based on probability q(Ψ_{i+1} | Ψ_i), Metropolis vs. Metropolis-Hastings specification; 4. go back to 2: adjust (random walk variant) or do not adjust (independence variant) the stand-in distribution 45 / 114

46 Importance sampling Markov Chain Monte Carlo Gibbs algorithm Metropolis-Hastings algorithm Practical issues with MH algorithm Summary of MCMC with MH algorithm evaluation of the likelihood (step 1 and 3) requires computation of the steady state solution of the model constructing the likelihood function (via the Kalman filter) 46 / 114

47 Importance sampling Markov Chain Monte Carlo Gibbs algorithm Metropolis-Hastings algorithm Practical issues with MH algorithm Choice of stand-in density stand-in should have fatter tails variance parameter important for acceptance rate optimal acceptance rates: around 0.44 for estimation of 1 parameter around 0.23 for estimation of more than 5 parameters 47 / 114

48 Importance sampling Markov Chain Monte Carlo Gibbs algorithm Metropolis-Hastings algorithm Practical issues with MH algorithm Choice of stand-in density often, the stand-in is N(Ψ̂, c² Σ_Ψ), where Ψ̂ is the posterior mode and Σ_Ψ is the inverse (negative) Hessian at the mode; tip: start with c = 2.4/√d, where d is the number of estimated parameters, and increase (decrease) c if the acceptance rate is too high (low) 48 / 114

49 Importance sampling Markov Chain Monte Carlo Gibbs algorithm Metropolis-Hastings algorithm Practical issues with MH algorithm Convergence statistics theory says that distribution will converge to target when does this happen? diagnostic tests sequence of draws should be from the invariant distribution moments should not change within/between sequences 49 / 114

50 Importance sampling Markov Chain Monte Carlo Gibbs algorithm Metropolis-Hastings algorithm Practical issues with MH algorithm Brooks and Gelman statistics I draws and J sequences: W = 1/(J(I−1)) Σ_{j=1}^{J} Σ_{i=1}^{I} (Ψ_{i,j} − Ψ̄_j)², B = I/(J−1) Σ_{j=1}^{J} (Ψ̄_j − Ψ̄)²; B/I: estimate of the variance of the mean across sequences; W: estimate of the average variance within sequences 50 / 114

51 Importance sampling Markov Chain Monte Carlo Gibbs algorithm Metropolis-Hastings algorithm Practical issues with MH algorithm Brooks and Gelman statistics Combine the two measures of variance: V = ((I−1)/I) W + B/I; as the length of the simulation I increases, we want these statistics to settle down 51 / 114
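A short sketch of these within/between statistics for J parallel sequences of length I, where dispersed-start AR(1) chains stand in for MCMC output:

import numpy as np

def brooks_gelman(chains):
    # chains: array of shape (J, I) holding J sequences of I draws each
    J, I = chains.shape
    seq_means = chains.mean(axis=1)
    W = np.sum((chains - seq_means[:, None]) ** 2) / (J * (I - 1))   # within
    B = I * np.sum((seq_means - seq_means.mean()) ** 2) / (J - 1)    # between
    V = (I - 1) / I * W + B / I        # pooled variance estimate
    return W, B, V

# illustrative "MCMC output": J AR(1) sequences started from dispersed points
rng = np.random.default_rng(6)
J, I = 4, 10_000
chains = np.empty((J, I))
chains[:, 0] = rng.normal(0.0, 5.0, size=J)
for t in range(1, I):
    chains[:, t] = 0.9 * chains[:, t - 1] + rng.normal(size=J)

W, B, V = brooks_gelman(chains)
print("W =", W, " B/I =", B / I, " V =", V)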

52 Importance sampling Markov Chain Monte Carlo Gibbs algorithm Metropolis-Hastings algorithm Practical issues with MH algorithm Geweke statistic partition a sequence into 3 subsets s = {I, II, III}; compute means (Ψ̄_s) and standard errors (σ^s_Ψ); the s.e.'s must be corrected for serial correlation; then, under convergence, CD is distributed N(0, 1): CD = (Ψ̄_I − Ψ̄_III) / √((σ^I_Ψ)² + (σ^III_Ψ)²) 52 / 114
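A sketch of the Geweke diagnostic on a single simulated sequence; the first-10%/last-50% split and the fixed autocovariance bandwidth are illustrative choices, not taken from the slides.

import numpy as np

def nse(g, L=50):
    # serial-correlation-corrected numerical s.e. of the mean (bandwidth L is an assumption)
    g = np.asarray(g, dtype=float)
    m = len(g)
    dev = g - g.mean()
    gamma = [np.dot(dev[: m - l], dev[l:]) / m for l in range(L + 1)]
    return np.sqrt((gamma[0] + 2 * sum(gamma[1:])) / m)

def geweke_cd(draws, first=0.1, last=0.5):
    # compare the mean of an early subset with the mean of a late subset
    a = draws[: int(first * len(draws))]
    b = draws[-int(last * len(draws)):]
    return (a.mean() - b.mean()) / np.sqrt(nse(a) ** 2 + nse(b) ** 2)

# illustrative chain: AR(1) draws as a stand-in for one MCMC sequence
rng = np.random.default_rng(7)
x = np.empty(20_000)
x[0] = 0.0
for t in range(1, len(x)):
    x[t] = 0.9 * x[t - 1] + rng.normal()

print("Geweke CD (approx N(0,1) under convergence):", geweke_cd(x))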

53 Bayesian vs. frequentist inference Highest posterior density intervals Bayes factors Model comparison 53 / 114

54 Bayesian vs. frequentist inference Highest posterior density intervals Bayes factors Model comparison Bayesian vs. frequentist inference Bayesian inference cannot use frequentist principles t-test, F-test, LR-test etc. they have a frequentist justification of repeated sampling instead, there are two common Bayesian principles: Highest Posterior Density (HPD) interval Bayes factors (posterior odds) 54 / 114

55 Bayesian vs. frequentist inference Highest posterior density intervals Bayes factors Model comparison Highest posterior density intervals A 100(1−α)% posterior interval for Ψ is given by P(b_L < Ψ < b_U) = ∫_{b_L}^{b_U} P(Ψ | Y^T) dΨ = 1 − α; there exist many such intervals; the HPD interval is the smallest one of them 55 / 114
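A sketch of how the HPD interval can be approximated from posterior draws: among all intervals spanning a fraction 1−α of the sorted draws, keep the shortest one (the skewed Gamma "posterior" is purely illustrative).

import numpy as np

def hpd_interval(draws, alpha=0.10):
    # smallest interval containing a fraction (1 - alpha) of the sorted draws
    sorted_draws = np.sort(draws)
    n = len(sorted_draws)
    k = int(np.floor((1 - alpha) * n))
    widths = sorted_draws[k:] - sorted_draws[: n - k]
    j = np.argmin(widths)                  # shortest of all candidate intervals
    return sorted_draws[j], sorted_draws[j + k]

# illustrative skewed "posterior" draws
rng = np.random.default_rng(8)
draws = rng.gamma(shape=2.0, scale=1.0, size=100_000)

lo, hi = hpd_interval(draws, alpha=0.10)
eq = np.quantile(draws, [0.05, 0.95])      # equal-tailed interval, for comparison
print("90% HPD interval         :", lo, hi)
print("90% equal-tailed interval:", eq)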

56 Bayesian vs. frequentist inference Highest posterior density intervals Bayes factors Model comparison HPD tests the HPD test amounts to checking whether Ψ_i ∈ HPD_{1−α}; this is an informal way of comparing nested models, i.e. different parameter values; Bayesians can also compare non-nested models, more on this below 56 / 114

57 Bayesian vs. frequentist inference Highest posterior density intervals Bayes factors Model comparison Bayes factors B = [P(Y^T | Ψ_1) P(Ψ_1)] / [P(Y^T | Ψ_2) P(Ψ_2)], where Ψ_1 and Ψ_2 are two different sets of parameter values; if B > 1, Ψ_1 is a posteriori more likely than Ψ_2 57 / 114

58 Bayesian vs. frequentist inference Highest posterior density intervals Bayes factors Model comparison Model comparison posterior densities can be used to evaluate conditional probabilities of particular parameter values conditional probabilities of different model specifications use Bayes factors (posterior odds ratio) to compare models advantage is that all models are treated symmetrically there is no null model compared to an alternative 58 / 114

59 Bayesian vs. frequentist inference Highest posterior density intervals Bayes factors Model comparison Model comparison B_{A,B} = [P_A(Y^T | Ψ_A) P_A(Ψ_A)] / [P_B(Y^T | Ψ_B) P_B(Ψ_B)]; it is also possible to assign priors to models; the posterior odds ratio is then PO_{A,B} = P(A | Y^T) / P(B | Y^T) = B_{A,B} × P(A)/P(B) 59 / 114

60 Bayesian vs. frequentist inference Highest posterior density intervals Bayes factors Model comparison Model comparison the Bayes factor is related to the Bayesian information criterion (BIC): B_{A,B} ≈ [P_A(Y^T | Ψ̂_A) / P_B(Y^T | Ψ̂_B)] T^{(k_B − k_A)/2}, where the RHS is the BIC approximation, Ψ̂_i denote the ML estimates of the parameters and k_i denote the number of parameters; important to use proper priors: if not, one always prefers the model with fewer parameters 60 / 114

61 Bayesian vs. frequentist inference Highest posterior density intervals Bayes factors Model comparison How much information is in the Bayes factor? Kass and Raftery (1995): if the value of B_{A,B} is between 1 and 3, barely worth mentioning; between 3 and 20, positive evidence; between 20 and 150, strong evidence; over 150, very strong evidence 61 / 114

62 Preliminaries and steady state Estimation command Decomposition Output Example 62 / 114

63 Preliminaries and steady state Estimation command Decomposition Output Example Preliminaries setup is the same as with ML estimation always a good idea to solve model first some parameter values are likely to remain calibrated 63 / 114

64 Preliminaries and steady state Estimation command Decomposition Output Example Dynare: initialization initialize as usual: var c, k, z, y; varexo e; parameters beta, rho, alpha, nu, delta, sigma; set parameter values that are not estimated: alpha = 0.36; rho = 0.95; beta = 0.99; nu = 1; delta = 0.025; 64 / 114

65 Preliminaries and steady state Estimation command Decomposition Output Example Dynare: setting it up after the model part and the specification of the steady state, tell Dynare which parameters it should estimate: estimated_params; stderr e, inv_gamma_pdf, 0.01, inf; end; the above tells Dynare to estimate σ, the st. error of the productivity disturbance; the prior distribution is an inverted gamma, the prior mean is 0.01 and the prior st. error is infinite 65 / 114

66 Preliminaries and steady state Estimation command Decomposition Output Example Dynare: steady state the steady state is calculated for many different values of Ψ! solve for the steady state yourself (linearizing makes it easier); give the exact steady state to Dynare for the initial values; option to provide your own function that calculates the steady state: modfilename_steadystate.m or a steady_state_model; block 66 / 114

67 Preliminaries and steady state Estimation command Decomposition Output Example Dynare: estimation then also tell Dynare which are the observable variables: varobs y; estimation(options); options include: specify the data file for estimation: datafile=data; number of MH sequences: mh_nblocks; number of MH replications: mh_replic; parameter of the stand-in distribution variance (c): mh_jscale; variance of the initial draw: mh_init_scale; first observation (default first): first_obs; sample size (default all): nobs; many more! 67 / 114

68 Preliminaries and steady state Estimation command Decomposition Output Example Dynare: decomposition decompose endogenous variables into the contribution of shocks; possible also after stoch_simul; shock_decomposition(options) variables; options include e.g. parameter_set: use calibrated values: parameter_set=calibration; use prior/posterior mode: parameter_set=prior_mode / parameter_set=posterior_mode; variables specifies for which variables to run the decomposition 68 / 114

69 Preliminaries and steady state Estimation command Decomposition Output Example Dynare: output RESULTS FROM POSTERIOR MAXIMIZATION: most important is the mode, other output is based on normality assumptions (typically violated); when Dynare gets to the MCMC part it shows: which MCMC sequence you are in, which fraction has been completed, and the acceptance rate: adjust mh_jscale appropriately; remember that a low acceptance rate means the algorithm travels through a larger part of the Ψ domain 69 / 114

70 Preliminaries and steady state Estimation command Decomposition Output Example Dynare: plots priors; MCMC diagnostics; prior and posterior densities; shocks implied at the mode; observables and corresponding implied values 70 / 114

71 Preliminaries and steady state Estimation command Decomposition Output Example Estimating the neoclassical growth model use the neoclassical growth model as the data generating process; 265 observations of output; use Bayesian estimation to estimate σ, and then σ, ρ, δ, α 71 / 114

72 Preliminaries and steady state Estimation command Decomposition Output Example Estimating the neoclassical growth model Easy case: estimated_params; stderr e, inv_gamma_pdf, 0.01, inf; end; varobs y; estimation(datafile=y, mh_nblocks=1, mh_replic=10000, mh_jscale=3, mh_init_scale=12) c, k, y; 72 / 114

73 Preliminaries and steady state Estimation command Decomposition Output Example MCMC prior plots, easy case: [figure: prior density plot for SE_e] 73 / 114

74 Preliminaries and steady state Estimation command Decomposition Output Example Shocks, easy case: [figure: implied shock e] 74 / 114

75 Preliminaries and steady state Estimation command Decomposition Output Example Observables and implied values, easy case: [figure: observable y and corresponding implied values] 75 / 114

76 Preliminaries and steady state Estimation command Decomposition Output Example Posterior density plots, easy case: [figure: posterior density of SE_e] 76 / 114

77 Preliminaries and steady state Estimation command Decomposition Output Example Printed results - easy case Posterior mode: (0.0004) Average acceptance rate: 37.7% Diagnostic statistics (Geweke): p-values on equality of means in sub-samples (no taper) 0.33 (4% taper) 0.38 (8% taper) etc. Posterior mean and HPD interval: ( ) 77 / 114

78 Preliminaries and steady state Estimation command Decomposition Output Example What we did today Basic concept of Bayesian estimation priors evaluating the posterior Markov Chain Monte Carlo (MCMC) practical issues acceptance rate, diagnostics implementation in Dynare 78 / 114

79 Preliminaries and steady state Estimation command Decomposition Output Example What we did in the first half of course Motivation Week 1: Use of computational tools, simple DSGE model Tools necessary to solve models and a solution method Week 2: function approximation and numerical integration Week 3: theory of perturbation (1st and higher-order) Tools necessary for, and principles of, estimation Week 4: Kalman filter and Maximum Likelihood estimation Week 5: principles of Bayesian estimation 79 / 114

80 Preliminaries and steady state Estimation command Decomposition Output Example 80 / 114

81 Trends More on priors Alternatives Trends Problem: methodology works for stationary environments data has trends not clear which trend the model represents? 81 / 114

82 Trends More on priors Alternatives Trends [figure: output deviations from trend (%) under alternative detrending methods: HP(1600), HP(10^5), linear, quadratic, BP(6,32)] 82 / 114

83 Trends More on priors Alternatives Trends we could build in a trend within the model e.g. productivity is trending stationarize non-stationary variables within the model i.e. inspect variables relative to productivity however, not clear that data satisfies balanced growth 83 / 114

84 Trends More on priors Alternatives Trends [figure: Great ratios, real and nominal c/y and i/y, quarterly data (axis ticks at 1962, 1975, 1987, 2000)] 84 / 114

85 Trends More on priors Alternatives Trends Solutions: use differenced data highlights high-frequency movements (measurement error) detrend prior to estimation 85 / 114

86 Trends More on priors Alternatives Estimation on detrended data use e.g. a quadratic trend: y_t = a_0 + a_1 t + a_2 t² + u_t; each variable can have its own trend; using the HP or Band Pass filter: y^{obs,filtered}_t = B(L) y^{obs}_t, where B(L) is a 2-sided filter! this creates artificial serial correlation in the filtered data; apply the filter also to model data 86 / 114
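A sketch of removing a quadratic trend from an observed series prior to estimation (the simulated trending series is illustrative); an HP or band-pass filter would take the place of the polynomial regression here.

import numpy as np

rng = np.random.default_rng(9)

# illustrative observed series: quadratic trend plus a persistent cycle
T = 200
t = np.arange(T, dtype=float)
cycle = np.empty(T)
cycle[0] = 0.0
for s in range(1, T):
    cycle[s] = 0.8 * cycle[s - 1] + rng.normal(scale=0.5)
y_obs = 1.0 + 0.05 * t + 0.0004 * t**2 + cycle

# fit y_t = a0 + a1*t + a2*t^2 + u_t by OLS and keep the residual u_t
X = np.column_stack([np.ones(T), t, t**2])
a_hat, *_ = np.linalg.lstsq(X, y_obs, rcond=None)
y_detrended = y_obs - X @ a_hat           # series passed to the estimation step

print("estimated trend coefficients:", a_hat)
print("std dev of detrended series :", y_detrended.std())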

87 Trends More on priors Alternatives Estimation on detrended data the above implies that the model is fitted to low(er) frequencies only; Canova (2010) points out that the above can lead to: underestimated volatility of shocks; overestimated persistence of shocks; less perceived noise, so decision rules imply higher predictability; substitution and income effects may be distorted; due to the above, he proposes to estimate flexible trend specifications within the model 87 / 114

88 Trends More on priors Alternatives More on selecting priors what we've described is based on selecting (independent) priors about deep parameters; however, often we have priors about observables; moreover, reasonable independent priors may imply rather unreasonable properties of the model; solutions proposed in the literature: Del Negro, Schorfheide (2008); Andrle, Benes (2013); Jarocinsky, Marcet (2013) 88 / 114

89 Trends More on priors Alternatives Del Negro, Schorfheide (2008) more guidance for eliciting priors; three main issues with (independent) priors about deep parameters: they may lead to probability mass on unrealistic properties of the model; most exogenous shock processes are latent, i.e. difficult to form priors about; priors are often transferred to different models 89 / 114

90 Trends More on priors Alternatives Del Negro, Schorfheide (2008) they group parameters into three categories: those determining the steady state those determining exogenous shocks those determining the endogenous propagation mechanism 90 / 114

91 Trends More on priors Alternatives Del Negro, Schorfheide (2008) Parameters related to steady state relationships discount rate, depreciation, returns to scale, inflation target etc.; let S_D(Ψ_ss) be a vector of steady state relationships depending on a set of parameters Ψ_ss; then Ŝ = S_D(Ψ_ss) + η are measurements of those relationships with measurement error η; Ŝ has a probabilistic interpretation and therefore, using Bayes rule, one can write P(Ψ_ss | Ŝ) ∝ L(Ŝ | Ψ_ss) P(Ψ_ss); allows for overidentification 91 / 114

92 Trends More on priors Alternatives Del Negro, Schorfheide (2008) Exogenous processes volatility and persistence parameters; use implied moments of endogenous variables to back out priors; the above is done given values for Ψ_ss and Ψ_endo; valid for a particular model and should not be directly transferred across models 92 / 114

93 Trends More on priors Alternatives Del Negro, Schorfheide (2008) Endogenous propagation mechanisms price rigidity, labor supply elasticity etc.; one could use a similar principle as above; the authors suggest independent priors because researchers often have a relatively good idea; note that the joint prior induces non-trivial non-linear relationships between parameters; the joint prior becomes P(Ψ | Ŝ) ∝ L(Ŝ | Ψ_ss) P(Ψ_ss) P(Ψ_endo); requires an additional step in the MCMC algorithm 93 / 114

94 Trends More on priors Alternatives Andrle, Benes (2013) Andrle and Benes do not distinguish between groups of parameters their system priors are priors about concepts such as impulse response functions conditional correlations etc. 94 / 114

95 Trends More on priors Alternatives Andrle, Benes (2013) even sensible individual-parameter priors can lead to unintended properties of the aggregate model independence of priors can lead to substantial mass on such parameter regions call for careful prior-predictive analysis: IRFs, second moments... compare with posterior results is it the data or the model driving the results? 95 / 114

96 Trends More on priors Alternatives Andrle, Benes (2013) Candidates for system priors: steady states sensible values in levels or growth rates (un-)conditional moments cross-correlations (conditional on shocks) impulse response properties peak impacts, duration, horizon of monetary policy effectiveness etc. 96 / 114

97 Trends More on priors Alternatives Andrle, Benes (2013) Implementation: use Bayes rule again; specify model properties you care about, Z = h(Ψ); these can be characterized by a probabilistic model Z ~ D(Z_s), where D(Z_s) is a distribution function and Z_s are parameters of that function (hyper-parameters); its likelihood function (the system prior): P(Z_s | Ψ, h); composite joint prior: P(Ψ | Z_s, h) ∝ P(Z_s | Ψ, h) P(Ψ) 97 / 114

98 Trends More on priors Alternatives Andrle, Benes (2013) The posterior becomes P(Ψ | Y^T, Z_s) ∝ L(Y^T | Ψ) P(Z_s | Ψ, h) P(Ψ); evaluation is in principle the same as before, with the use of MCMC methods; the additional step in evaluating the system prior slows things down: one has to run MCMC on the prior (with the likelihood switched off) and then on the posterior 98 / 114

99 Trends More on priors Alternatives Jarocinsky, Marcet (2013) similar ideas as above, but in the context of Bayesian VARs; their point is that widely used priors about parameters can lead to behavior of observables that is counterfactual; it is always a good idea to do prior-predictive analysis of your model! 99 / 114

100 Trends More on priors Alternatives Alternatives to Bayesian estimation Maximum likelihood calibration GMM SMM & indirect inference 100 / 114

101 Trends More on priors Alternatives Maximum likelihood we've seen it yesterday; conceptually different from Bayesian estimation; the tools it requires are part of Bayesian estimation 101 / 114

102 Trends More on priors Alternatives Calibration widespread methodology at least since Kydland and Prescott (1982); prior to this, the state of the art was systems of simultaneous equations; those were viewed as true statistical models to be estimated 102 / 114

103 Trends More on priors Alternatives Calibration although calibration is also an empirical exercise it lacks the probabilistic interpretation the constraint is that the model mimics (a priori identified) features in the data Kydland and Prescott (1996): It is important to emphasize that the parameter values selected are not the ones that provide the best fit in some statistical sense. 103 / 114

104 Trends More on priors Alternatives Calibration Parameters are pinned down by a selection of real-world features long-run averages (labor share, hours worked) micro studies (preference parameters) certain business cycle properties of the data (shock parameters) etc. 104 / 114

105 Trends More on priors Alternatives Calibration compare different features of the data to model predictions; closely related to moment-matching (estimating models); however, calibration lacks the statistical formality; the above is a strong source of criticism of calibration: no formal rules on selecting the dimensions to which the model is fit; no formal rules for comparing alternatives, as models are necessarily misspecified; note that the last point does not hold for Bayesian model comparison 105 / 114

106 Trends More on priors Alternatives Matching moments (GMM, SMM, II) idea similar to calibration: a set of moments (features) of the data used to parameterize model a different set of moments used to judge the performance of model matching moments adds statistical rigor estimation hypothesis testing 106 / 114

107 Trends More on priors Alternatives Matching moments (GMM, SMM, II) as with calibration, moment matching is based on a selection of moments; often referred to as limited-information procedures, since the full range of statistical implications is contained in the model's likelihood function; disadvantages of limited-information procedures: potential loss of efficiency, inference potentially sensitive to the selected moments; advantages of limited-information procedures: no need to make distributional assumptions 107 / 114

108 Trends More on priors Alternatives Generalized method of moments attributed to Hansen (1982), generalization, asymptotic properties the main idea is to use orthogonality conditions (e.g. first-order-conditions) E[f (x t, Ψ)] = 0 x t is a vector of variables Ψ are model parameters 108 / 114

109 Trends More on priors Alternatives Generalized method of moments pick Ψ s.t. the sample analogs of the orthogonality conditions, g(X, Ψ) = (1/T) Σ_t f(x_t, Ψ), either hold exactly (exactly identified case: number of parameters = number of moment conditions) or are as close to zero as possible (overidentified case: number of parameters < number of moment conditions) 109 / 114

110 Trends More on priors Alternatives Generalized method of moments in the over-identified case: min_Ψ g(X, Ψ)' Ω g(X, Ψ), where Ω is a weighting matrix; the optimal weighting matrix is the inverse of the var-covar matrix of g(X, Ψ) 110 / 114
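A toy GMM sketch under illustrative assumptions: the "model" parameters are just a mean and variance, matched to three moment conditions so the system is overidentified, and the weighting matrix Ω is the identity rather than the optimal one.

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(10)
x = rng.normal(loc=1.5, scale=2.0, size=5_000)     # illustrative data x_t

def g_bar(psi, x):
    # sample analogs g(X, Psi) = (1/T) sum_t f(x_t, Psi) for three moment conditions
    mu, sigma2 = psi
    f = np.column_stack([
        x - mu,                      # E[x_t - mu] = 0
        (x - mu) ** 2 - sigma2,      # E[(x_t - mu)^2 - sigma^2] = 0
        (x - mu) ** 3,               # E[(x_t - mu)^3] = 0 (overidentifying condition)
    ])
    return f.mean(axis=0)

def objective(psi, x, W=np.eye(3)):
    g = g_bar(psi, x)
    return g @ W @ g                 # min_Psi g' Omega g (here Omega = I)

res = minimize(objective, x0=np.array([0.0, 1.0]), args=(x,), method="Nelder-Mead")
print("GMM estimates (mu, sigma^2):", res.x)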

111 Trends More on priors Alternatives Simulated method of moments in some cases the orthogonality conditions cannot be assessed analytically moment-matching estimation based on simulations retains asymptotic properties of GMM 111 / 114

112 Trends More on priors Alternatives Simulated method of moments let z_t be model variables corresponding to the data x_t; let the empirical targets be summarized by h(x_t); SMM estimation is based on E[h(x_t)] = E[h(z_t, Ψ)], i.e. f(x_t, Ψ) = h(x_t) − h(z_t, Ψ) 112 / 114

113 Trends More on priors Alternatives Indirect inference based on reduced-form models main idea is to use structural model to interpret reduced-form results can simulated data from a structural model replicate a reduced-form estimate using real-world data? i.e. it is a moment-matching exercise moments are clearly defined by prior reduced-form analysis 113 / 114

114 Trends More on priors Alternatives Indirect inference let δ be a vector of reduced-form estimates δ(x t ) are those in the data and δ(z t, Ψ) are those from the model pick Ψ s.t. δ(x t ) = δ(z t, Ψ) 114 / 114


More information

Estimating Deep Parameters: GMM and SMM

Estimating Deep Parameters: GMM and SMM Estimating Deep Parameters: GMM and SMM 1 Parameterizing a Model Calibration Choose parameters from micro or other related macro studies (e.g. coeffi cient of relative risk aversion is 2). SMM with weighting

More information

STA 294: Stochastic Processes & Bayesian Nonparametrics

STA 294: Stochastic Processes & Bayesian Nonparametrics MARKOV CHAINS AND CONVERGENCE CONCEPTS Markov chains are among the simplest stochastic processes, just one step beyond iid sequences of random variables. Traditionally they ve been used in modelling a

More information

TAKEHOME FINAL EXAM e iω e 2iω e iω e 2iω

TAKEHOME FINAL EXAM e iω e 2iω e iω e 2iω ECO 513 Spring 2015 TAKEHOME FINAL EXAM (1) Suppose the univariate stochastic process y is ARMA(2,2) of the following form: y t = 1.6974y t 1.9604y t 2 + ε t 1.6628ε t 1 +.9216ε t 2, (1) where ε is i.i.d.

More information

Computer intensive statistical methods

Computer intensive statistical methods Lecture 13 MCMC, Hybrid chains October 13, 2015 Jonas Wallin jonwal@chalmers.se Chalmers, Gothenburg university MH algorithm, Chap:6.3 The metropolis hastings requires three objects, the distribution of

More information

Markov chain Monte Carlo

Markov chain Monte Carlo Markov chain Monte Carlo Markov chain Monte Carlo (MCMC) Gibbs and Metropolis Hastings Slice sampling Practical details Iain Murray http://iainmurray.net/ Reminder Need to sample large, non-standard distributions:

More information

SAMPLING ALGORITHMS. In general. Inference in Bayesian models

SAMPLING ALGORITHMS. In general. Inference in Bayesian models SAMPLING ALGORITHMS SAMPLING ALGORITHMS In general A sampling algorithm is an algorithm that outputs samples x 1, x 2,... from a given distribution P or density p. Sampling algorithms can for example be

More information

Bayesian phylogenetics. the one true tree? Bayesian phylogenetics

Bayesian phylogenetics. the one true tree? Bayesian phylogenetics Bayesian phylogenetics the one true tree? the methods we ve learned so far try to get a single tree that best describes the data however, they admit that they don t search everywhere, and that it is difficult

More information

CS242: Probabilistic Graphical Models Lecture 7B: Markov Chain Monte Carlo & Gibbs Sampling

CS242: Probabilistic Graphical Models Lecture 7B: Markov Chain Monte Carlo & Gibbs Sampling CS242: Probabilistic Graphical Models Lecture 7B: Markov Chain Monte Carlo & Gibbs Sampling Professor Erik Sudderth Brown University Computer Science October 27, 2016 Some figures and materials courtesy

More information

On Bayesian Computation

On Bayesian Computation On Bayesian Computation Michael I. Jordan with Elaine Angelino, Maxim Rabinovich, Martin Wainwright and Yun Yang Previous Work: Information Constraints on Inference Minimize the minimax risk under constraints

More information

Bayesian Computations for DSGE Models

Bayesian Computations for DSGE Models Bayesian Computations for DSGE Models Frank Schorfheide University of Pennsylvania, PIER, CEPR, and NBER October 23, 2017 This Lecture is Based on Bayesian Estimation of DSGE Models Edward P. Herbst &

More information

Dynamics of Real GDP Per Capita Growth

Dynamics of Real GDP Per Capita Growth Dynamics of Real GDP Per Capita Growth Daniel Neuhoff Humboldt-Universität zu Berlin CRC 649 June 2015 Introduction Research question: Do the dynamics of the univariate time series of per capita GDP differ

More information

The Bayesian Choice. Christian P. Robert. From Decision-Theoretic Foundations to Computational Implementation. Second Edition.

The Bayesian Choice. Christian P. Robert. From Decision-Theoretic Foundations to Computational Implementation. Second Edition. Christian P. Robert The Bayesian Choice From Decision-Theoretic Foundations to Computational Implementation Second Edition With 23 Illustrations ^Springer" Contents Preface to the Second Edition Preface

More information

Bayes: All uncertainty is described using probability.

Bayes: All uncertainty is described using probability. Bayes: All uncertainty is described using probability. Let w be the data and θ be any unknown quantities. Likelihood. The probability model π(w θ) has θ fixed and w varying. The likelihood L(θ; w) is π(w

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 7 Approximate

More information

MODEL COMPARISON CHRISTOPHER A. SIMS PRINCETON UNIVERSITY

MODEL COMPARISON CHRISTOPHER A. SIMS PRINCETON UNIVERSITY ECO 513 Fall 2008 MODEL COMPARISON CHRISTOPHER A. SIMS PRINCETON UNIVERSITY SIMS@PRINCETON.EDU 1. MODEL COMPARISON AS ESTIMATING A DISCRETE PARAMETER Data Y, models 1 and 2, parameter vectors θ 1, θ 2.

More information

Who was Bayes? Bayesian Phylogenetics. What is Bayes Theorem?

Who was Bayes? Bayesian Phylogenetics. What is Bayes Theorem? Who was Bayes? Bayesian Phylogenetics Bret Larget Departments of Botany and of Statistics University of Wisconsin Madison October 6, 2011 The Reverand Thomas Bayes was born in London in 1702. He was the

More information

Bayesian Phylogenetics

Bayesian Phylogenetics Bayesian Phylogenetics Bret Larget Departments of Botany and of Statistics University of Wisconsin Madison October 6, 2011 Bayesian Phylogenetics 1 / 27 Who was Bayes? The Reverand Thomas Bayes was born

More information

System Priors for Econometric Time Series

System Priors for Econometric Time Series WP/16/231 System Priors for Econometric Time Series by Michal Andrle and Miroslav Plašil IMF Working Papers describe research in progress by the author(s) and are published to elicit comments and to encourage

More information

Bayesian model selection for computer model validation via mixture model estimation

Bayesian model selection for computer model validation via mixture model estimation Bayesian model selection for computer model validation via mixture model estimation Kaniav Kamary ATER, CNAM Joint work with É. Parent, P. Barbillon, M. Keller and N. Bousquet Outline Computer model validation

More information

A Bayesian Approach to Phylogenetics

A Bayesian Approach to Phylogenetics A Bayesian Approach to Phylogenetics Niklas Wahlberg Based largely on slides by Paul Lewis (www.eeb.uconn.edu) An Introduction to Bayesian Phylogenetics Bayesian inference in general Markov chain Monte

More information

Markov chain Monte Carlo

Markov chain Monte Carlo Markov chain Monte Carlo Feng Li feng.li@cufe.edu.cn School of Statistics and Mathematics Central University of Finance and Economics Revised on April 24, 2017 Today we are going to learn... 1 Markov Chains

More information

Markov Chain Monte Carlo methods

Markov Chain Monte Carlo methods Markov Chain Monte Carlo methods Tomas McKelvey and Lennart Svensson Signal Processing Group Department of Signals and Systems Chalmers University of Technology, Sweden November 26, 2012 Today s learning

More information

Lecture 7 and 8: Markov Chain Monte Carlo

Lecture 7 and 8: Markov Chain Monte Carlo Lecture 7 and 8: Markov Chain Monte Carlo 4F13: Machine Learning Zoubin Ghahramani and Carl Edward Rasmussen Department of Engineering University of Cambridge http://mlg.eng.cam.ac.uk/teaching/4f13/ Ghahramani

More information

1 Teaching notes on structural VARs.

1 Teaching notes on structural VARs. Bent E. Sørensen February 22, 2007 1 Teaching notes on structural VARs. 1.1 Vector MA models: 1.1.1 Probability theory The simplest (to analyze, estimation is a different matter) time series models are

More information

GMM and SMM. 1. Hansen, L Large Sample Properties of Generalized Method of Moments Estimators, Econometrica, 50, p

GMM and SMM. 1. Hansen, L Large Sample Properties of Generalized Method of Moments Estimators, Econometrica, 50, p GMM and SMM Some useful references: 1. Hansen, L. 1982. Large Sample Properties of Generalized Method of Moments Estimators, Econometrica, 50, p. 1029-54. 2. Lee, B.S. and B. Ingram. 1991 Simulation estimation

More information

Bayesian Regression Linear and Logistic Regression

Bayesian Regression Linear and Logistic Regression When we want more than point estimates Bayesian Regression Linear and Logistic Regression Nicole Beckage Ordinary Least Squares Regression and Lasso Regression return only point estimates But what if we

More information

Federal Reserve Bank of New York Staff Reports

Federal Reserve Bank of New York Staff Reports Federal Reserve Bank of New York Staff Reports Forming Priors for DSGE Models (and How It Affects the Assessment of Nominal Rigidities) Marco Del Negro Frank Schorfheide Staff Report no. 32 March 28 This

More information

Statistical Machine Learning Lecture 8: Markov Chain Monte Carlo Sampling

Statistical Machine Learning Lecture 8: Markov Chain Monte Carlo Sampling 1 / 27 Statistical Machine Learning Lecture 8: Markov Chain Monte Carlo Sampling Melih Kandemir Özyeğin University, İstanbul, Turkey 2 / 27 Monte Carlo Integration The big question : Evaluate E p(z) [f(z)]

More information

Physics 403. Segev BenZvi. Numerical Methods, Maximum Likelihood, and Least Squares. Department of Physics and Astronomy University of Rochester

Physics 403. Segev BenZvi. Numerical Methods, Maximum Likelihood, and Least Squares. Department of Physics and Astronomy University of Rochester Physics 403 Numerical Methods, Maximum Likelihood, and Least Squares Segev BenZvi Department of Physics and Astronomy University of Rochester Table of Contents 1 Review of Last Class Quadratic Approximation

More information