Bayesian Analysis of Matrix Normal Graphical Models


By HAO WANG and MIKE WEST
Department of Statistical Science, Duke University, Durham, North Carolina, U.S.A.

Summary

We develop Bayesian analysis of matrix-variate normal data with conditional independence graphical structuring of the characterising variance matrix parameters. This leads to fully Bayesian analysis of matrix normal graphical models, including discussion of novel prior specifications, the resulting problems of posterior computation addressed using Markov chain Monte Carlo methods, and graphical model assessment involving approximate evaluation of marginal likelihood functions under specified graphical models. Modelling and inference for spatial/image data via a novel class of Markov random fields, arising as natural examples of matrix normal graphical models, is discussed. This is complemented by the development of a broad class of dynamic models for matrix-variate time series within which the stochastic elements defining time series errors and structural changes over time are subject to graphical model structuring. Three examples illustrate these developments and highlight questions of graphical model uncertainty and comparison in matrix data contexts.

Some key words: Gaussian graphical models; Graphical model comparison; Hyper-inverse Wishart distributions; Markov random fields; Marginal likelihood evaluation; Matrix normal models; Matrix-variate dynamic graphical models; Posterior simulation.

1 Introduction

Matrix-variate normal distributions (Dawid, 1981; Gupta & Nagar, 2000) have been studied in analyses of two-factor linear models for cross-classified multivariate data (Finn, 1974; Galecki, 1994; Naik & Rao, 2001), in spatio-temporal models (Mardia & Goodall, 1993; Huizenga et al., 2002) and in other areas. Some computational and inferential developments, including iterative calculation of maximum likelihood estimates, have appeared (Dutilleul, 1999; Mitchell et al., 2005, 2006), and some empirical Bayesian methodology has recently been introduced for Procrustes analysis with matrix models (Theobald & Wuttke, 2006). The current paper develops a complete Bayesian analysis of matrix normal graphical models, i.e., matrix normal distributions in which each of the characterising variance matrices is constrained by a set of conditional independence restrictions consistent with an underlying graphical model (Whittaker, 1990; Lauritzen, 1996; Giudici, 1996; Giudici & Green, 1999; Jones et al., 2005). The framework includes fully Bayesian analysis of the matrix normal (full graphs) as a special case, and effective computational methods for marginal likelihood evaluation on a specified graphical model that underlies inference about conditional independence structures. The developments include novel Markov random field models, with potential utility in spatial and image analysis, that emerge naturally as a sub-class of matrix normal graphical models. The paper then extends the random sampling framework to matrix-variate time series models that inherit the graphical model structure to represent conditional independencies in matrix series over time.

Part of our motivation lies in the interest in scaling multivariate and matrix-variate models to deal with increasingly higher-dimensional problems, such as multiple economic indicators or assets measured across multiple funds, companies, sectors or countries. Applied contexts exist in many other areas in the natural and engineering as well as the economic and social sciences.

As variable dimension increases, the scientific and empirical rationale for increased sparsity of the structure of dependencies among variables becomes increasingly forceful. In parallel, increased parametric constraints reflecting that sparsity are vital if computations for model implementation are to scale satisfactorily. Our models combine sparsity-enabling graphical models in priors for each of the two variance matrices in a matrix normal model to address these issues. We focus on decomposable graphs in these initial studies to maintain the focus and scope of the paper, though the general ideas and approach extend to non-decomposable models.

Beginning with preliminaries and notation for matrix normal models in Section 2, we describe matrix normal graphical models in Section 3. This includes a novel extension of the hyper-Markov inverse Wishart (hereafter HIW) distribution that uses hyper-Markov properties (Dawid & Lauritzen, 1993) and parameter expansion (Liu et al., 1998; Liu & Wu, 1999) ideas to incorporate inherent parameter identification constraints. Section 4 describes Gibbs sampling for posterior computation, and a custom method of approximating model marginal likelihoods based on the posterior simulation. Section 5 provides a simple example of analysis of a simulated data set, illustrating aspects of the computation. Section 6 introduces a novel class of Markov random fields of potential utility in spatial, imaging and texture modelling; this is a specific example of a sub-class of matrix normal graphical models of interest in its own right, and it is also used here to illustrate marginal likelihood computation as a key component of model comparison. Section 7 shows how the matrix graphical structure can be naturally embedded in a broad class of matrix time series models, and develops a detailed analysis of a macro-economic data set for additional illustration of the effectiveness and utility of the new matrix-variate models.

2 Matrix Variate Normal Distribution and Notation

2.1 Matrix Variate Normal Distribution

The $q \times p$ random matrix $Y$ has a matrix-variate normal distribution with mean matrix $M$, left (or column) and right (or row) non-singular variance matrices $U = (u_{ij})$ of dimension $q \times q$ and $V = (v_{ij})$ of dimension $p \times p$ respectively, and defining notation $Y \sim N(M, U, V)$, when the density function is
$$p(Y) \equiv p(Y \mid U, V) = k(U, V)\,\mathrm{etr}\{-(Y - M)'U^{-1}(Y - M)V^{-1}/2\}$$
with $k(U, V) = (2\pi)^{-qp/2}|U|^{-p/2}|V|^{-q/2}$. Denote row $i = 1{:}q$ of $Y$ by $y_i'$ and column $j = 1{:}p$ by $y^j$, so that $Y = (y_{ij}) = (y^1, \ldots, y^p)$. Use the same notation for elements, rows and columns of $M$. Then $y_i \sim N(m_i, u_{ii}V)$ for each row $i = 1{:}q$, and $y^j \sim N(m^j, v_{jj}U)$ for each column $j = 1{:}p$.

2.2 Precision Matrices and Conditional Dependencies

Write $\Omega = U^{-1} = (\omega_{ij})$ and $\Lambda = V^{-1} = (\lambda_{ij})$ for the left and right precision matrices, respectively. For each row $i = 1{:}q$ we have a complete conditional $p$-variate normal distribution with
$$E(y_i \mid y_{-i}) = m_i - \omega_{ii}^{-1}\sum_{s \in (1:q)\setminus i} \omega_{is}(y_s - m_s), \qquad V(y_i \mid y_{-i}) = \omega_{ii}^{-1} V.$$
Similarly, for each column $j = 1{:}p$ we have a complete conditional $q$-variate normal distribution with
$$E(y^j \mid y^{-j}) = m^j - \lambda_{jj}^{-1}\sum_{t \in (1:p)\setminus j} \lambda_{tj}(y^t - m^t), \qquad V(y^j \mid y^{-j}) = \lambda_{jj}^{-1} U.$$
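The defining equivalence $\mathrm{vec}(Y) \sim N(\mathrm{vec}(M), V \otimes U)$ behind these row and column marginals is easy to exercise numerically. The following sketch is our illustration (not from the paper): it draws from $N(M, U, V)$ via Cholesky factors and checks the stated row and column marginals by simulation; all names and numerical settings are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
q, p, n = 3, 4, 100_000

# Arbitrary positive-definite left/right variance matrices U (q x q), V (p x p).
A = rng.standard_normal((q, q)); U = A @ A.T + q * np.eye(q)
B = rng.standard_normal((p, p)); V = B @ B.T + p * np.eye(p)
M = rng.standard_normal((q, p))

# Y ~ N(M, U, V) is equivalent to vec(Y) ~ N(vec(M), V kron U),
# so a draw is M + Lu Z Lv' with U = Lu Lu' and V = Lv Lv'.
Lu, Lv = np.linalg.cholesky(U), np.linalg.cholesky(V)
Z = rng.standard_normal((n, q, p))
Y = M + Lu @ Z @ Lv.T                  # n draws, each q x p

# Row i has covariance u_ii V; column j has covariance v_jj U (here i = j = 0).
row0, col0 = Y[:, 0, :], Y[:, :, 0]
print(np.allclose(np.cov(row0.T), U[0, 0] * V, atol=0.05 * np.abs(U[0, 0] * V).max()))
print(np.allclose(np.cov(col0.T), V[0, 0] * U, atol=0.05 * np.abs(V[0, 0] * U).max()))
```

Both checks print True up to Monte Carlo error, confirming the marginal forms quoted above.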

For scalar elements $y_{ij}$, the complete conditional univariate normals have
$$E(y_{ij} \mid y_{-(ij)}) = m_{ij} - \omega_{ii}^{-1}\sum_{s \in (1:q)\setminus i} \omega_{si}(y_{sj} - m_{sj}) - \lambda_{jj}^{-1}\sum_{t \in (1:p)\setminus j} \lambda_{tj}(y_{it} - m_{it}) - (\omega_{ii}\lambda_{jj})^{-1}\sum_{s \in (1:q)\setminus i}\,\sum_{t \in (1:p)\setminus j} \omega_{is}\lambda_{tj}(y_{st} - m_{st}),$$
$$V(y_{ij} \mid y_{-(ij)}) = (\omega_{ii}\lambda_{jj})^{-1}.$$
These equations show how zeros in the off-diagonal elements of the two precision matrices define conditional independencies. Zeros in the right precision matrix $\Lambda$ define, and are defined by, conditional independencies among columns: for $t \neq j$, $\lambda_{tj} = 0$ if and only if the complete conditional distribution of $y^j$ does not depend on $y^t$, that is, $y^j \perp y^t$ conditional on all $y^k$, $k \in (1:p)\setminus\{j, t\}$. A similar relationship exists between the left precision matrix $\Omega$ and the conditional independencies among rows. In terms of univariate elements, for $s \neq i$ and $t \neq j$, the elements $y_{ij}$ and $y_{st}$ may, conditional upon $y_{-(ij,st)}$, be dependent through either rows or columns; conditional independence is equivalent to: (a) at least one zero among $\lambda_{tj}$ and $\omega_{is}$ when $s \neq i$ and $j \neq t$; (b) $\omega_{is} = 0$ when $s \neq i$ and $j = t$; (c) $\lambda_{jt} = 0$ when $s = i$ and $j \neq t$. This structure is a key aspect of matrix normal models and, in particular, underlies our extension of Gaussian graphical models. Without loss of generality in the rest of this section, we assume the data are zero mean to develop the ideas.
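As a check on the scalar complete conditionals above, the following sketch (our illustration, with arbitrary test matrices) compares the three-sum formula against the standard Gaussian conditional computed from the joint precision $\Lambda \otimes \Omega$ of $\mathrm{vec}(Y)$ with columns stacked.

```python
import numpy as np

rng = np.random.default_rng(1)
q, p = 4, 3

# Positive-definite precision matrices Omega = U^{-1} and Lambda = V^{-1}.
A = rng.standard_normal((q, q)); Omega = A @ A.T + q * np.eye(q)
B = rng.standard_normal((p, p)); Lam = B @ B.T + p * np.eye(p)

# Joint precision of vec(Y), columns stacked: K = Lambda kron Omega.
K = np.kron(Lam, Omega)

# Gaussian complete conditional of element (i, j), M = 0: position k = j*q + i,
# variance 1/K_kk and mean -K_kk^{-1} K_{k,rest} y_rest.
i, j = 1, 2
k = j * q + i
y = rng.standard_normal((q, p))
rest = np.delete(np.arange(q * p), k)
mean_joint = -np.delete(K[k], k) @ y.T.ravel()[rest] / K[k, k]

# The same mean from the three-sum formula in the text.
s_, t_ = [s for s in range(q) if s != i], [t for t in range(p) if t != j]
mean_formula = (
    -sum(Omega[s, i] * y[s, j] for s in s_) / Omega[i, i]
    - sum(Lam[t, j] * y[i, t] for t in t_) / Lam[j, j]
    - sum(Omega[i, s] * Lam[t, j] * y[s, t] for s in s_ for t in t_)
      / (Omega[i, i] * Lam[j, j])
)
print(np.isclose(mean_joint, mean_formula))                      # True
print(np.isclose(1 / K[k, k], 1 / (Omega[i, i] * Lam[j, j])))    # True
```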

2.3 Graphical Model Structuring and Density Factorisations

Two graphical structures can be overlaid on a matrix normal model, one for each of the two precision matrices. Suppose two such undirected graphs are specified. Thus $\Lambda = V^{-1}$ has off-diagonal zeros, corresponding to conditional independencies of elements of $Y$ within any one row, that define a graph $G_V = (N_V, E_V)$. Here the node (or vertex) set $N_V = \{j = 1{:}p\}$ is just the column (within row) indicators, and the edge set $E_V$ contains only those pairs of column indices $(j, t)$ for which $\lambda_{jt} \neq 0$. Similarly, $G_U = (N_U, E_U)$ has node set $N_U = \{i = 1{:}q\}$ of row (within column) indicators, and the edge set $E_U$ contains only those pairs of row indices $(i, s)$ for which $\omega_{is} \neq 0$.

The theory, and aspects of methodology, of multivariate normal graphical models can now be overlaid. We focus here on decomposable models, so that each of the two graphs is assumed decomposable. Thus the row multivariate normal distribution is Markov with respect to the graph $G_U$, and the column multivariate normal is Markov with respect to the graph $G_V$. The Markovian factorisation translates directly to conditional factorisations of the matrix normal distribution. For example, over $G_V$ the density of $Y$ factorises as
$$p(Y \mid U, V, G_V) = \frac{\prod_{P_V \in \mathcal{P}_V} p(Y_{P_V} \mid U, V_{P_V})}{\prod_{S_V \in \mathcal{S}_V} p(Y_{S_V} \mid U, V_{S_V})}, \qquad (1)$$
where $\mathcal{P}_V$ is the set of complete prime components, or cliques, of the graph, and $\mathcal{S}_V$ is the set of separators. For each subgraph $g \in \{\mathcal{P}_V, \mathcal{S}_V\}$, $Y_g$ is the $q \times |g|$ matrix with variables from the $|g|$ columns of $Y$ defined by the subgraph, and $V_g$ is the corresponding sub-matrix of $V$. Each density term in the product of ratios of equation (1) is that of a matrix-variate normal
$$p(Y_g \mid U, V_g) = k(U, V_g)\,\mathrm{etr}\{-Y_g' U^{-1} Y_g V_g^{-1}/2\} \qquad (2)$$
with a full precision matrix; that is, $\Lambda_g = V_g^{-1}$ has no zero entries and no parametric constraints apart from positive-definiteness and symmetry. We can similarly represent the joint density in factorised form over the graph $G_U$.
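The factorisation (1) can be verified numerically on a small decomposable example. The sketch below is our illustration (the chain graph and numerical values are hypothetical): it evaluates the matrix normal log density directly and via the clique/separator terms of equation (1), which agree exactly.

```python
import numpy as np

def mn_logpdf(Y, U, V):
    """Log density of the zero-mean matrix normal N(0, U, V)."""
    q, p = Y.shape
    _, ldU = np.linalg.slogdet(U)
    _, ldV = np.linalg.slogdet(V)
    quad = np.trace(Y.T @ np.linalg.inv(U) @ Y @ np.linalg.inv(V))
    return -0.5 * (q * p * np.log(2 * np.pi) + p * ldU + q * ldV + quad)

rng = np.random.default_rng(2)
q, p = 5, 3
A = rng.standard_normal((q, q)); U = A @ A.T + q * np.eye(q)

# Right precision for the chain graph 1-2-3: lambda_13 = 0, so G_V has
# cliques {1,2}, {2,3} and separator {2}.
Lam = np.array([[2.0, 0.5, 0.0],
                [0.5, 2.0, 0.5],
                [0.0, 0.5, 2.0]])
V = np.linalg.inv(Lam)
Y = rng.standard_normal((q, p))    # any q x p matrix will do for the identity

cliques, seps = [[0, 1], [1, 2]], [[1]]
lhs = mn_logpdf(Y, U, V)
rhs = (sum(mn_logpdf(Y[:, g], U, V[np.ix_(g, g)]) for g in cliques)
       - sum(mn_logpdf(Y[:, g], U, V[np.ix_(g, g)]) for g in seps))
print(np.isclose(lhs, rhs))        # True: the factorisation (1) holds exactly
```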

2.4 Identification

Although the Kronecker product $V \otimes U$ is uniquely identified, $U$ and $V$ individually are not since, for any $c > 0$, $p(Y \mid U, V) = p(Y \mid cU, V/c)$. There are a number of potential approaches to imposing identification constraints, including constraints such as $\mathrm{tr}(V) = p$ (Theobald & Wuttke, 2006). However, the development of graphical model structuring complicates the matter, and leads us to develop model identification within the context of hyper-Markov priors over each of $U$ and $V$ on decomposable graphs. It then becomes natural to adopt a very simple mathematical constraint for identification, namely fixing the value of one diagonal entry in one of the variance matrices. We develop the analysis with $v_{11} = 1$.

3 Matrix Gaussian Graphical Model Form and Priors

For multivariate Gaussian graphical models, the general class of priors based on the hyper-Markov laws introduced in Dawid & Lauritzen (1993) has the desirable property that the laws are compatible and consistent across different graphs, as discussed with full details of notation in Jones et al. (2005) and Giudici & Green (1999). The resulting priors for variance matrices are hyper-inverse Wishart distributions, denoted by HIW. On decomposable graphs, the implied priors on sub-variance matrices on all components and separators are inverse Wishart.

To extend this class of priors to $U, V$ in the matrix normal graphical model, a first step is simply to adopt independent HIW priors for each of $U$ and $V$ separately. This maintains compatibility and consistency across graphs $G_U, G_V$. Then, to incorporate the identification constraint $v_{11} = 1$, we use a parameter expansion (PX) approach. The general PX idea involves expanding the parameter space by adding new nuisance parameters, and has been used simply algorithmically to accelerate Markov chain Monte Carlo samplers (Liu et al., 1998; Liu & Wu, 1999). However, as noted by Gelman (2004, 2006), PX can also usefully induce new families of prior distributions in which the added parameters are assigned informative priors.

Generalising the HIW priors for multivariate Gaussian graphical models, we thus define the matrix-variate graphical model as follows, assuming an initial random sampling context for $q \times p$ data matrices $Y_i$, $i = 1{:}n$.

We assume a matrix normal distribution and graphical models $G_U, G_V$ as above. Then
$$Y_i \mid U, V \sim \text{i.i.d. } N(0, U, V), \quad i = 1{:}n, \qquad (3)$$
$$U \sim \mathrm{HIW}_{G_U}(b, B), \qquad (4)$$
$$V^* \sim \mathrm{HIW}_{G_V}(d, D), \qquad (5)$$
$$V = V^*/v^*_{11}, \qquad (6)$$
with $U, V^*$ independent. Here we use the traditional HIW notation in which the degrees of freedom parameters $b, d$ are positive, while $B, D$ are variance matrices with the dimensions of $U, V$. The PX approach has introduced $V^*$ as an unconstrained right variance matrix, while $V$ satisfies $v_{11} = 1$ and $v^*_{11}$ represents the added parameter. The prior $p(U, V)$ is conditionally conjugate, in that the implied complete conditional posteriors for each of $U, V$ are HIW. This leads to straightforward Gibbs sampling as described in Section 4.1 below, based on the simple coupling of two independent HIW priors in an expanded format that integrates the identification constraint.

We can also interpret $v^*_{11}$: for each column $Y^j$ of $Y$, $V(Y^j) = v_{jj}U = (v^*_{jj}/v^*_{11})U$, so $v^*_{11}$ converts column scales to those relative to the scale of the first column. Note that the model maintains consistency and compatibility of the priors across graphs, since these are inherited directly from the coupled HIW priors in the expanded parameter space of $\{V^*, U\}$. The induced prior for $V$ maintains consistency over cliques, and as we move across graphs $G_V$ the priors $p(V \mid G_V)$ are compatible in the sense of having the same induced priors over correlation structures. That is, for two graphs $G_V$ and $G_V'$ with a common clique $C$, let $R_C$ be the correlation matrix of $V_C$; then the implied priors are equal, $p(R_C \mid G_V) = p(R_C \mid G_V')$. However, the induced laws are no longer in complete agreement for $V = V^*/v^*_{11}$, due to the different parameterisations and interpretation, and this is natural and appropriate. Each element of $\mathrm{diag}(V_C)$ now represents the scale of variance of that column relative to the variance of the first column. Therefore, if $G_V$ and $G_V'$ imply different conditional dependencies between the first column and the columns corresponding to $C$, then the induced priors over $V_C$ should indeed be different, to reflect such differences.

The prior density for $V$ on $G_V$ is obtained directly by transformation from that for $V^*$. Note that, on any graph $G_V$, $V$ is determined only by its free elements, i.e., those elements appearing in the sub-matrices corresponding to the cliques of the graph; the non-free elements of $V$ are deterministic functions of the free elements. Let $\nu$ be the number of free elements of $V$ on $G_V$. The transformation from $V^*$ to $(v^*_{11}, V)$, where $V = V^*/v^*_{11}$, then has Jacobian $J(V^* \to (v^*_{11}, V)) = (v^*_{11})^{\nu - 1}$, and so we obtain the joint density of the induced prior for $V$ and the PX parameter $v^*_{11}$ as
$$p(V, v^*_{11}) = \mathrm{HIW}_{G_V}(v^*_{11} V \mid d, D)\,(v^*_{11})^{\nu - 1}.$$
This, coupled with the HIW prior $p(U)$ on $G_U$, defines a class of conditionally conjugate priors, and the posterior analysis can be developed.

4 Posterior and Marginal Likelihood Computation

4.1 Gibbs Sampling

For specified graphs $G_U, G_V$, the full model of equations (3)-(6) with the parameter expansion yields the following set of complete conditional distributions:
$$(U \mid Y_{1:n}, V, v^*_{11}) \sim \mathrm{HIW}_{G_U}\Big(b + np,\; B + \sum_{i=1}^n Y_i V^{-1} Y_i'\Big),$$
$$(v^*_{11} \mid Y_{1:n}, V, U) \sim \mathrm{IG}\big(a/2 - \nu,\; \mathrm{tr}(DV^{-1})/2\big), \quad\text{where}\quad a = \sum_{P_V \in \mathcal{P}_V} |P_V|(2|P_V| + d) - \sum_{S_V \in \mathcal{S}_V} |S_V|(2|S_V| + d),$$
$$(V \mid Y_{1:n}, U, v^*_{11}) \sim \mathrm{HIW}_{G_V}\Big(d + nq,\; D/v^*_{11} + \sum_{i=1}^n Y_i' U^{-1} Y_i\Big)\,I(v_{11} = 1).$$
The last component is the HIW distribution conditioned on the $(1,1)$ element of the variance matrix set at unity. These form the basis of an efficient Gibbs sampler to generate from the full posterior $p(U, V, v^*_{11} \mid Y_{1:n})$.
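For concreteness, here is a minimal sketch of the inverse gamma draw for $v^*_{11}$, computing $a$ and $\nu$ from clique and separator lists. This is our Python illustration of the formulas above (the paper's own code is in Matlab), run on a hypothetical chain graph; the function name and settings are ours.

```python
import numpy as np
from scipy.stats import invgamma

def draw_v11(cliques, seps, d, D, V, rng):
    """Draw v*_11 from its complete conditional IG(a/2 - nu, tr(D V^{-1})/2)."""
    # nu = number of free elements of V on G_V (union of clique sub-matrices).
    nu = (sum(len(P) * (len(P) + 1) // 2 for P in cliques)
          - sum(len(S) * (len(S) + 1) // 2 for S in seps))
    a = (sum(len(P) * (2 * len(P) + d) for P in cliques)
         - sum(len(S) * (2 * len(S) + d) for S in seps))
    rate = np.trace(D @ np.linalg.inv(V)) / 2
    return invgamma.rvs(a / 2 - nu, scale=rate, random_state=rng)

# Chain graph on p = 3 columns, as in the Section 2.3 sketch.
rng = np.random.default_rng(3)
d, D = 3, np.eye(3)
V = np.linalg.inv(np.array([[2.0, 0.5, 0.0], [0.5, 2.0, 0.5], [0.0, 0.5, 2.0]]))
V = V / V[0, 0]                       # impose the v_11 = 1 constraint
print(draw_v11([[0, 1], [1, 2]], [[1]], d, D, V, rng))
```

For this graph $\nu = 5$ and $a = 23$, so the inverse gamma shape $a/2 - \nu = 6.5$ is positive, as required.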

The Gibbs iterates involve sampling from the HIW, inverse gamma and new conditional HIW distributions as the defining parameters change. Simulation of the HIW is based on Carvalho et al. (2007). Sampling conditional HIW distributions of the form required involves a simple modification: following Lemma 2.18 of Lauritzen (1996), we can always find a perfect ordering of the nodes in $G_V$ such that node 1 lies in the first clique, say $C$, and then initialise the HIW sampler of Carvalho et al. (2007) with a simulation of the implied conditional inverse Wishart distribution for the variance matrix on that first clique. Sampling $V_C$ from an inverse Wishart distribution conditional on the first diagonal element set to unity is straightforward.

4.2 Marginal Likelihood

Exploration of uncertainty about graphical model structures involves, in part, consideration of the marginal likelihood function over graphs $G_U, G_V$. For any given pair of graphs, the value of the marginal likelihood function is
$$p(Y_{1:n}) \equiv p(Y_{1:n} \mid G_U, G_V) = \int\!\!\int p(Y_{1:n} \mid U, V)\,p(U)\,p(V)\,dV\,dU, \qquad (7)$$
where the priors in the integrand depend on the graphs, although we drop that in the notation for clarity. Graphical model search and comparison involves computing posterior probabilities over graphs, and the value of $p(Y_{1:n})$ as graphs are varied provides the corresponding likelihood function. In traditional multivariate models this can be evaluated in closed form on decomposable graphs, and is a computational cornerstone of applied use of such models (Giudici, 1996; Giudici & Green, 1999; Jones et al., 2005; Carvalho & West, 2007a,b). Here, however, the integral cannot be evaluated in closed form, nor easily numerically. We can, however, generate useful approximations via a novel use of the so-called Candidate's formula (Besag, 1989; Chib, 1995).

Having obtained MCMC posterior samples, we note that Candidate's formula

can be applied in a number of ways to generate approximations to (7). We capitalise on this observation to generate two alternative approximations, so that we can compare them to gain insight into approximation quality.

(A) Observe that, for any values $(U, V, v^*_{11})$,
$$p(Y) = \frac{p(Y, U, V, v^*_{11})}{p(U, V, v^*_{11} \mid Y)},$$
directly from Bayes' theorem. The idea is to estimate the components of this equation that have no closed form, then plug in chosen values of $(U, V, v^*_{11})$, such as the MCMC-based approximate posterior means used in our examples, to provide an estimate of $p(Y)$. The numerator
$$p(Y, U, V, v^*_{11}) = p(Y \mid U, V, v^*_{11})\,p(U)\,p(V \mid v^*_{11})\,p(v^*_{11})$$
can be directly and easily computed. The denominator is
$$p(U, V, v^*_{11} \mid Y) = p(V \mid v^*_{11}, U, Y)\,p(v^*_{11}, U \mid Y),$$
where the first term has an easily evaluated closed form, as in the Gibbs sampling steps. The second term may be approximated by
$$p(v^*_{11}, U \mid Y) = \int p(v^*_{11} \mid Y, V)\,p(U \mid Y, V, v^*_{11})\,p(V \mid Y)\,dV \approx \frac{1}{M}\sum_{j=1}^M p(v^*_{11} \mid Y, V^{(j)})\,p(U \mid Y, V^{(j)}, v^*_{11}),$$
where the sum is over MCMC posterior draws $V^{(j)}$; this is easy to compute as it is a sum of products of inverse gamma and hyper-inverse Wishart density evaluations.

(B) In parallel, an alternative representation, again directly from Bayes' theorem, is
$$p(Y) = \frac{p(Y, V)}{p(V \mid Y)}.$$

The numerator can be analytically evaluated as
$$p(V, Y) = \int\!\!\int p(Y, U, V, v^*_{11})\,dU\,dv^*_{11} = q_V\,(2\pi)^{-nqp/2}\,\frac{H\big(b + np,\; B + \sum_{i=1}^n Y_i V^{-1} Y_i',\; G_U\big)\; H\big(a,\; \mathrm{tr}(DV^{-1}),\; G_V\big)}{H(b, B, G_U)\; H(d, D, G_V)},$$
where $q_V$ is the constant
$$q_V = \prod_{P_V \in \mathcal{P}_V} |V_{P_V}|^{-(nq + d + 2|P_V|)/2} \Big/ \prod_{S_V \in \mathcal{S}_V} |V_{S_V}|^{-(nq + d + 2|S_V|)/2},$$
the $H(\cdot, \cdot, G_\cdot)$ terms are normalising constants of the corresponding HIW and inverse gamma distributions, and
$$a = \sum_{P_V \in \mathcal{P}_V} |P_V|(2|P_V| + d) - \sum_{S_V \in \mathcal{S}_V} |S_V|(2|S_V| + d) - 2\nu.$$
The density function in the denominator is approximated as
$$p(V \mid Y) = \int\!\!\int p(V \mid v^*_{11}, U, Y)\,p(v^*_{11}, U \mid Y)\,dv^*_{11}\,dU \approx \frac{1}{M}\sum_{j=1}^M p\big(V \mid Y, U^{(j)}, v^{*(j)}_{11}\big),$$
where the sum over posterior MCMC draws $(U^{(j)}, v^{*(j)}_{11})$ can be easily performed, with terms given by conditional hyper-inverse Wishart density evaluations.

By comparing the two resulting marginal likelihood estimates we may assess how well the approximation performs, as illustrated in Section 5.
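For reference, here is a short sketch of how the HIW normalising constants $H(\cdot, \cdot, G)$ can be evaluated on a decomposable graph as a product of clique inverse Wishart constants over separator constants. This is our illustration, assuming the Dawid-style parameterisation in which the $m$-dimensional inverse Wishart with parameters $(\delta, S)$ has density proportional to $|\Sigma|^{-(\delta + 2m)/2}\,\mathrm{etr}(-S\Sigma^{-1}/2)$; the function names are ours.

```python
import numpy as np
from scipy.special import multigammaln

def log_H_iw(delta, S):
    """log normalising constant of the inverse Wishart with density
    proportional to |Sigma|^{-(delta + 2m)/2} etr(-S Sigma^{-1}/2)."""
    m = S.shape[0]
    nu = delta + m - 1                 # map to the usual IW degrees of freedom
    _, ld = np.linalg.slogdet(S)
    return nu * m / 2 * np.log(2) + multigammaln(nu / 2, m) - nu / 2 * ld

def log_H_hiw(delta, S, cliques, seps):
    """log H(delta, S, G) on a decomposable graph: product of clique IW
    constants divided by the separator IW constants."""
    return (sum(log_H_iw(delta, S[np.ix_(P, P)]) for P in cliques)
            - sum(log_H_iw(delta, S[np.ix_(Sp, Sp)]) for Sp in seps))

# Example: the chain graph on three nodes with S = I_3.
print(log_H_hiw(3, np.eye(3), [[0, 1], [1, 2]], [[1]]))
```

The remaining $H(a, \mathrm{tr}(DV^{-1}), G_V)$ term above is the inverse gamma constant $\Gamma(a/2)\{\mathrm{tr}(DV^{-1})/2\}^{-a/2}$ arising from integrating out $v^*_{11}$.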

5 Example: A Simulated Random Sample

The first example fixes ideas and provides an illustration of the performance of the MCMC and marginal likelihood approximation. Here $n = 48$ observations were drawn from the $(q = 8) \times (p = 7)$ dimensional $N(0, U, V)$ distribution with graphical structure in rows and columns; the two precisions $\Lambda = V^{-1}$ and $\Omega = U^{-1}$ were specified as sparse matrices whose zero off-diagonal elements display the underlying graphical structure. The analysis was developed using priors with $b = d = 3$, $B = 5I_8$ and $D = 5I_7$, and a simulation sample size of 8,000 after an initial, discarded burn-in of 2,000 iterations. Figure 1 presents some trace plots of the Monte Carlo samples. Figure 2 compares images of the true underlying precision matrices with Monte Carlo posterior estimates, the sample means of the simulated precision matrices. A parallel check assessing the dual approximation of the marginal likelihood is presented in Figure 3. Beyond its use as an implementation check (the code is in Matlab and available to interested readers from the authors' web site), this illustrates the concordance of the two parallel marginal likelihood estimates, which are close together even for rather small MCMC sample sizes and which differ negligibly on the log probability scale.
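The logic of the Candidate's formula estimates of Section 4.2 is easy to exercise in a toy conjugate model where the marginal likelihood is available exactly. The sketch below is our Python analogue (not the authors' Matlab code): a method-(A)-style estimate for a normal model with unknown mean and variance, a two-block Gibbs sampler, and a comparison against the closed-form normal-inverse-gamma answer.

```python
import numpy as np
from scipy.stats import norm, invgamma
from scipy.special import gammaln

rng = np.random.default_rng(4)
y = rng.normal(1.0, 2.0, size=40)
n, ybar, ss = len(y), y.mean(), ((y - y.mean()) ** 2).sum()
m0, k0, a0, b0 = 0.0, 1.0, 2.0, 2.0          # conjugate N-IG prior

# Exact log marginal likelihood under the conjugate prior.
kn, an = k0 + n, a0 + n / 2
bn = b0 + 0.5 * (ss + k0 * n * (ybar - m0) ** 2 / kn)
exact = (-n / 2 * np.log(2 * np.pi) + 0.5 * np.log(k0 / kn)
         + gammaln(an) - gammaln(a0) + a0 * np.log(b0) - an * np.log(bn))

# Two-block Gibbs sampler for (mu, sig2).
M, sig2 = 5000, y.var()
mu_draws, sig2_draws = np.empty(M), np.empty(M)
mn = (k0 * m0 + n * ybar) / kn
for j in range(M):
    mu = rng.normal(mn, np.sqrt(sig2 / kn))
    rate = b0 + 0.5 * (((y - mu) ** 2).sum() + k0 * (mu - m0) ** 2)
    sig2 = invgamma.rvs(a0 + (n + 1) / 2, scale=rate, random_state=rng)
    mu_draws[j], sig2_draws[j] = mu, sig2

# Candidate's (Chib's) formula at the posterior means (mu*, sig2*):
# p(y) = p(y, mu*, sig2*) / {p(mu* | y) p(sig2* | y, mu*)}.
mu_s, s2_s = mu_draws.mean(), sig2_draws.mean()
log_joint = (norm.logpdf(y, mu_s, np.sqrt(s2_s)).sum()
             + norm.logpdf(mu_s, m0, np.sqrt(s2_s / k0))
             + invgamma.logpdf(s2_s, a0, scale=b0))
log_post_mu = np.log(np.mean(norm.pdf(mu_s, mn, np.sqrt(sig2_draws / kn))))
rate_s = b0 + 0.5 * (((y - mu_s) ** 2).sum() + k0 * (mu_s - m0) ** 2)
log_post_s2 = invgamma.logpdf(s2_s, a0 + (n + 1) / 2, scale=rate_s)
print(exact, log_joint - log_post_mu - log_post_s2)   # should nearly agree
```

Close agreement of the two printed values on the log scale is the toy analogue of the concordance displayed in Figure 3.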

6 Markov Random Fields from Matrix Graphical Models

A rather interesting class of matrix graphical structures arises under autoregressive (AR) correlation specifications for the two precision matrices. This generates a novel (to our knowledge) class of Markov random field models of potential interest in application areas such as texture image modelling. We use this construction here for a second, much higher-dimensional synthetic example.

Take $U$ and $V$ as covariance matrices of stationary AR processes. For the example here, we choose $q = p = 60$, taking $U$ as the variance matrix of an AR(5) model with AR parameters $(0.91, 0.44, 0.38, 0.31, 0.22)$ and marginal variance 0.55, and $V$ as the variance matrix of an AR(4) model with AR parameters $(0.47, 0.23, 0.14, 0.19)$ and a specified marginal variance. This model is used to repeatedly simulate 50 observations, and each draw from the model is a sampled Markov random field: the columns of each sample are correlated realisations from the underlying AR(5) model, and the rows are correlated realisations of the AR(4) model. Figure 4 images the two underlying precision matrices along with two representative samples.

To illustrate model fitting and evaluation, we use a prior specified with $d = b = 3$, $D = (d + 2)I_{60}$ and $B = 0.01(b + 2)I_{60}$. The MCMC analysis uses a burn-in of 1,000 iterations and then saves 2,000 samples, starting with initial value $V = I_{60}$. The MCMC was run repeatedly across a range of models differing in the orders of the underlying AR models for rows and columns, exploring all combinations of AR(1) to AR(9) structures for each of the precision matrices. Applying the marginal likelihood approximation to each model allows us to evaluate model orders. Table 1 shows the top five models as selected by the largest log-marginal likelihood. As can be seen, the true model orders lead to the largest marginal likelihood and, more importantly in terms of assessing the effectiveness of the methodology, the two parallel marginal likelihood assessments are in concordance and differ negligibly on the scale of interest.
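A sketch of this construction follows. It is our illustration and simplifies the paper's specification: it ignores the stationary edge corrections for the first few grid points and replaces the marginal-variance scaling with unit innovation variances, building the banded AR precisions directly and then simulating the resulting random fields.

```python
import numpy as np

def ar_precision(T, phi):
    """Banded precision of an AR(p) with coefficients phi on t = 1:T, from the
    innovations form x_t - sum_k phi_k x_{t-k}. Edge corrections are ignored
    and the innovation variance is set to one, so this is only an
    approximation to the stationary AR precision. L'L is positive definite
    for any phi, so the sampler below runs regardless."""
    L = np.eye(T)
    for t in range(T):
        for k, ph in enumerate(phi, start=1):
            if t - k >= 0:
                L[t, t - k] = -ph
    return L.T @ L

q = p = 60
Omega = ar_precision(q, [0.91, 0.44, 0.38, 0.31, 0.22])   # left precision, AR(5)
Lam = ar_precision(p, [0.47, 0.23, 0.14, 0.19])           # right precision, AR(4)
U, V = np.linalg.inv(Omega), np.linalg.inv(Lam)

# Each draw Y ~ N(0, U, V) is a sampled Markov random field on the 60 x 60
# grid: columns behave as AR(5) paths and rows as AR(4) paths, as in Figure 4.
rng = np.random.default_rng(5)
Lu, Lv = np.linalg.cholesky(U), np.linalg.cholesky(V)
Y = np.stack([Lu @ rng.standard_normal((q, p)) @ Lv.T for _ in range(50)])
print(Y.shape)                                            # (50, 60, 60)
```

The banded structure of `Omega` and `Lam` is exactly the band structure imaged in the upper row of Figure 4.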

7 Dynamic Matrix-Variate Graphical Models for Time Series

7.1 General Model Class

One of our interests is in the development of models for matrix time series data, extending prior methodology for multivariate graphical model structuring, as in Carvalho & West (2007a,b) and examples in Carvalho et al. (2007), to the matrix context. We develop a first such extension in the class of matrix-variate dynamic linear models, or exchangeable time series models (Quintana & West, 1987; Quintana, 1992; West & Harrison, 1997), that has for some years been a central model context for financial time series and provides building blocks for more elaborate and highly structured models (Quintana et al., 2003; Carvalho & West, 2007a,b).

Consider $q \times p$ univariate time series $Y_{t,ij}$ following the matrix dynamic linear model (DLM) defined by
$$Y_t = (I_q \otimes F_t')\Theta_t + \nu_t, \qquad \nu_t \sim N(0, U, V), \qquad (8)$$
$$\Theta_t = (I_q \otimes G_t)\Theta_{t-1} + \Upsilon_t, \qquad \Upsilon_t \sim N(0, U \otimes \Sigma_t, V), \qquad (9)$$
for $t = 1, 2, \ldots$, where:

- $Y_t = (Y_{t,ij})$ is the $q \times p$ matrix observation at time $t$;
- $\Theta_t = (\Theta_{t,ij})$ is the $qs \times p$ state matrix comprised of $q \times p$ state vectors $\Theta_{t,ij}$, each of dimension $s \times 1$;
- $\Upsilon_t = (\omega_{t,ij})$ is the $qs \times p$ matrix of state evolution innovations comprised of $q \times p$ innovation vectors $\omega_{t,ij}$, each of dimension $s \times 1$;
- $\nu_t = (\nu_{t,ij})$ is the $q \times p$ matrix of observational errors;
- $\Sigma_t$ is an $s \times s$ variance matrix related to the scale and structure of innovations at time $t$;

- the regression $s$-vector $F_t$ and $s \times s$ state evolution matrix $G_t$ are known for each $t$.

In the examples below $F_t = F$ and $G_t = G$, as is common in many practical models, but the model class is more general and includes dynamic regressions when $F_t$ involves predictor variables. The matrix normal distributions for the innovations and errors as specified here are such that $\Upsilon_t$ follows a matrix-variate normal distribution with mean 0, left covariance matrix $U \otimes \Sigma_t$ and right covariance matrix $V$ (Dawid, 1981; Carvalho & West, 2007a,b). In terms of the scalar time series, the model comprises $pq$ univariate DLMs with individual $s$-vector state parameters, namely
$$\text{Observation:}\quad Y_{t,ij} = F_t'\Theta_{t,ij} + \nu_{t,ij}, \qquad \nu_{t,ij} \sim N(0, u_{ii}v_{jj}),$$
$$\text{Evolution:}\quad \Theta_{t,ij} = G_t\Theta_{t-1,ij} + \omega_{t,ij}, \qquad \omega_{t,ij} \sim N(0, u_{ii}v_{jj}\Sigma_t),$$
for each $t$, so it is clear how the covariance structures $V$, $U$ and $\Sigma_t$ separately reflect dependencies among the rows and columns of the state parameter matrices and the innovations that drive changes over time, as well as the observations. Importantly, each of the scalar series shares the same $F_t$ and $G_t$ elements, and the reference to the model as one of exchangeable time series reflects these symmetries.

Suppose now that $U$ and $V$ are constrained by graphs $G_U$ and $G_V$. Complete the model specification, conditional on $U, V$, with the initial prior
$$(\Theta_0 \mid U, V, D_0) \sim N(m_0, U \otimes C_0, V). \qquad (10)$$
Conditional on $U, V$, the model and prior setup imply a complete conjugate structure for sequential learning as data are processed, as follows.

Theorem 1. Under the initial prior of equation (10), and with data observed sequentially to update the information set as $D_t = \{D_{t-1}, Y_t\}$, the sequential updating for the matrix normal DLM proceeds as follows:

(i) Posterior at $t - 1$: $(\Theta_{t-1} \mid D_{t-1}, U, V) \sim N(m_{t-1}, U \otimes C_{t-1}, V)$.

(ii) Prior at $t$: $(\Theta_t \mid D_{t-1}, U, V) \sim N(a_t, U \otimes R_t, V)$, where $a_t = (I_q \otimes G_t)m_{t-1}$ and $R_t = G_tC_{t-1}G_t' + \Sigma_t$.

(iii) One-step forecast at $t - 1$: $(Y_t \mid D_{t-1}, U, V) \sim N(f_t, q_tU, V)$, where $f_t = (I_q \otimes F_t'G_t)m_{t-1}$ and $q_t = F_t'R_tF_t + 1$.

(iv) Posterior at $t$: $(\Theta_t \mid D_t, U, V) \sim N(m_t, U \otimes C_t, V)$, with
$$m_t = a_t + (I_q \otimes A_t)e_t \quad\text{and}\quad C_t = R_t - A_tA_t'q_t,$$
where $A_t = R_tF_t/q_t$ and $e_t = Y_t - f_t$.

Proof. This is a direct extension of the theory of multivariate DLMs applied to the data vectors $\mathrm{vec}(Y_t)$; see West & Harrison (1997).

The main novelty here concerns the separability of the covariance structures. That is:

(a) For all $t$, the distributions for state matrices have separable covariance structures; for example, $(\Theta_t \mid D_t, U, V)$ is such that $\mathrm{cov}\{\mathrm{vec}(\Theta_t) \mid D_t, U, V\} = V \otimes U \otimes C_t$.

(b) The $q \times p$ matrix of one-step ahead prediction errors, $e_t$, does not depend on $U$ and $V$.

(c) The sequential updating equations for the $qs \times p$ state matrices are implemented in parallel, based on computations for the component univariate DLMs, each of them involving the same scalar $q_t$, $s$-vector $A_t$ and $s \times s$ matrices $R_t, C_t$ at time $t$. This reduction in required computation is a direct consequence of the exchangeable structure of the model for the set of series.

In practical modelling, the sequence of variance matrices $\Sigma_t$ is typically controlled and highly structured using discount factors (West & Harrison, 1997). The most parsimonious structure uses one discount factor $\delta$, with $0 < \delta < 1$, and sets $\Sigma_t = C_{t-1}(1 - \delta)/\delta$. The discount factor represents a loss of information, or increase in uncertainty, between time points, corresponding to the stochastic innovation at time $t$. Slightly more elaborate structures are used in models in which the state vectors have identifiable sub-vector components, such as groups of regression parameters, trend parameters, seasonal parameters, and so forth. In such cases, two or more discount factors are usually applied to define $\Sigma_t$; we give an example of this in the data analysis below.
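As a concrete rendering of Theorem 1 with single-discount evolution, the following sketch (our Python illustration; the function name and toy dimensions are hypothetical) runs the shared scalar/vector recursions once per time point and applies them across all $q \times p$ series simultaneously. Note that it discounts $P_t = G C_{t-1} G'$, the standard one-discount recipe, which is one common reading of the $\Sigma_t$ specification above.

```python
import numpy as np

def matrix_dlm_filter(Y, F, G, M0, C0, delta=0.9):
    """Forward filter for the matrix DLM (8)-(9) with one discount factor.
    All q*p univariate series share q_t, A_t, R_t, C_t (Theorem 1).
    Shapes: Y (n,q,p); F (s,); G (s,s); M0 (q,p,s); C0 (s,s).
    Returns the one-step errors e_t and scales q_t."""
    n, q, p = Y.shape
    Mm, C = M0.copy(), C0.copy()
    e, qs = np.empty((n, q, p)), np.empty(n)
    for t in range(n):
        At_ = Mm @ G.T                  # a_t cell-wise: G m_{t-1,ij}
        R = G @ C @ G.T / delta         # R_t = P_t / delta (discounted evolution)
        f = At_ @ F                     # f_t,ij = F' G m_{t-1,ij}
        qt = F @ R @ F + 1.0
        A = R @ F / qt
        e[t] = Y[t] - f
        Mm = At_ + e[t][..., None] * A  # m_t,ij = a_t,ij + A_t e_t,ij
        C = R - np.outer(A, A) * qt
        qs[t] = qt
    return e, qs

# Toy run: a linear-trend model with s = 2, on a 4 x 3 matrix series.
rng = np.random.default_rng(6)
F, G = np.array([1.0, 0.0]), np.array([[1.0, 1.0], [0.0, 1.0]])
Y = rng.standard_normal((30, 4, 3))
e, qt = matrix_dlm_filter(Y, F, G, np.zeros((4, 3, 2)), 100 * np.eye(2))
print(e.shape, qt[:3])
```

The single pass over $(q_t, A_t, R_t, C_t)$, reused for every $(i, j)$, is exactly the computational reduction noted in point (c).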

7.2 MCMC and Marginal Likelihoods in Dynamic Models

Consider now the question of inference on $U, V$ in this model context. First, note that with any set of observations $Y_{1:n}$ over times $t = 1{:}n$, the sequential updating analysis on any graphs $G_U$ and $G_V$ leads to the full joint density
$$p(Y_{1:n} \mid U, V) = p(Y_n \mid U, V, D_{n-1}) \cdots p(Y_1 \mid U, V, D_0) = \prod_{t=1}^n N(e_t \mid 0, q_tU, V),$$
marginalised with respect to the sequence of state matrices. Thus, effectively, the sequence of one-step forecast error matrices represents a conditionally independent random sample from matrix normal distributions. Apart from the scalar factors $q_t$, this is essentially the framework of Sections 3 and 4, and both the MCMC analysis and the approximate computation of marginal likelihood values are immediately accessible. The analysis details involve only a trivial change to insert the $q_t$ constants, but otherwise apply directly. As a result, we are able to analyse data directly under such models for any specified graphs $G_U, G_V$, under a specified prior
$$U \sim \mathrm{HIW}_{G_U}(b, B), \qquad V^* \sim \mathrm{HIW}_{G_V}(d, D), \qquad V = V^*/v^*_{11},$$
with $U$ and $V^*$ independent. We can then re-analyse the model and approximately evaluate marginal likelihoods $p(Y_{1:n})$ on any specified pair of graphs by marginalisation over $U, V$, using the two parallel assessments developed earlier.
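A sketch of the resulting likelihood evaluation from filter output follows; it is our illustration, with `mn_logpdf0` a hypothetical helper repeating the matrix normal density used earlier and simulated stand-ins for the filter output $(e_t, q_t)$.

```python
import numpy as np

def mn_logpdf0(Y, U, V):
    """Zero-mean matrix normal log density N(0, U, V), as in Section 2."""
    q, p = Y.shape
    _, ldU = np.linalg.slogdet(U)
    _, ldV = np.linalg.slogdet(V)
    quad = np.trace(np.linalg.solve(U, Y) @ np.linalg.solve(V, Y.T))
    return -0.5 * (q * p * np.log(2 * np.pi) + p * ldU + q * ldV + quad)

def dlm_matrix_loglik(e, qt, U, V):
    """log p(Y_{1:n} | U, V) = sum_t log N(e_t | 0, q_t U, V) from filter output."""
    return sum(mn_logpdf0(et, q * U, V) for et, q in zip(e, qt))

# With (e, qt) from the filtering sketch of Section 7.1 (simulated here):
rng = np.random.default_rng(7)
e, qt = rng.standard_normal((30, 4, 3)), 1.0 + rng.random(30)
print(dlm_matrix_loglik(e, qt, np.eye(4), np.eye(3)))
```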

7.3 A Macroeconomic Data Example

The model developed above is illustrated here. The interest is in analysing the conditional dependence structure, over a period of years, in monthly series of observations on changes in labour market employment statistics. The data series considered are Current Employment Statistics (CES) for the 8 US states New Jersey (NJ), New York (NY), Massachusetts (MA), Georgia (GA), North Carolina (NC), Virginia (VA), Illinois (IL) and Ohio (OH). We explore these data across 9 industrial sectors, namely construction (C), manufacturing (M), transportation & utilities (T&U), information (I), financial activities (FA), professional & business services (P&BS), education & health services (E&H), leisure & hospitality (L&H) and government (G). In our model framework, we have $q = 8$, $p = 9$ and monthly data over several years. Then $U$ characterises the residual conditional dependencies among states while $V$ does the same for industrial sectors, all in the context of an overall model that incorporates time-varying state parameters for underlying trend and annual seasonal structure in the series. The trend and seasonal elements are represented in standard form, the former as random walks and the latter as randomly varying seasonal effects.

The details, in the model notation above, are as follows. In month $t$, the monthly employment change in state $i$ and industrial sector $j$ is $Y_{t,ij}$, modelled via a first-order polynomial/seasonal effects model (West & Harrison, 1997) with the state vector comprising a local level parameter and 12 seasonal factors, so that the state dimension is $s = 13$. The elements of the univariate DLMs of Section 7.1 are as follows:

- $F_t \equiv F = (1, 1, 0, \ldots, 0)'$ for all $t$, with two leading ones followed by 11 zeros;
- $G_t \equiv G = \mathrm{block\,diag}(1, P)$, where $P$ is the $12 \times 12$ cyclic permutation matrix that rotates the seasonal factors each month;
- $\Theta_{t,ij} = (\mu_{t,ij}, \phi_{t,ij}')'$, where $\mu_{t,ij}$ is the local level of the series and $\phi_{t,ij} = (\phi_{t,ij,k}, \phi_{t,ij,k+1}, \ldots, \phi_{t,ij,11}, \phi_{t,ij,0}, \ldots, \phi_{t,ij,k-1})'$ is the current vector of seasonal factors, satisfying the constraint $\mathbf{1}'\phi_{t,ij} = 0$ for all $i, j$ and $t$;
- $\Sigma_t$ is the variance matrix of trend/seasonal effects, structured in block diagonal form as $\Sigma_t = \mathrm{block\,diag}(\Sigma_{t,\mu}, \Sigma_{t,\phi})$,

where the univariate entry $\Sigma_{t,\mu}$ and the block $\Sigma_{t,\phi}$ are defined via two discount factors $\delta_l$ (for level) and $\delta_s$ (for seasonality) and the corresponding block components of $C_t$, as $\Sigma_{t,\mu} = C_{t-1,\mu}(1 - \delta_l)/\delta_l$ and $\Sigma_{t,\phi} = PC_{t-1,\phi}P'(1 - \delta_s)/\delta_s$. The discount factor $\delta_l$ determines the rate at which the trend parameters $\mu_{t,ij}$ are expected to vary between months, with $100(\delta_l^{-1} - 1)\%$ of the information (as measured by precision) about these parameters decaying each month. The factor $\delta_s$ plays the same role for the seasonal parameters. Our analysis uses $\delta_l = 0.9$ and a fixed value of $\delta_s$. For each model analysis, the initial prior is very vague, being specified via $m_0 = 0$, the zero matrix, and $C_0 = 100I_{13}$. The constraint that $\mathbf{1}'\phi_{t,ij} = 0$ is imposed by transforming $m_0$ and $C_0$ as discussed in West & Harrison (1997).

Applying this model, we aim to detect and estimate sustained movements and changes in trend and seasonality, generating the on-line detrended and deseasonalised error matrix series $e_t$ whose row and column covariance patterns are defined by the parameters $U, V$. As an example, the (NC, FA) data and some aspects of the sequential model fit are graphed in Figure 5.

It is beyond the scope of the current paper to develop and discuss computational methods of graphical model search to automate the process of generating candidate graphs $(G_U, G_V)$. This is an open research area, and one we are currently exploring using ideas from MCMC and stochastic search (Dobra et al., 2004; Jones et al., 2005; Rich et al., 2005; Hans et al., 2007). For our purposes here, we generated a collection of potentially interesting and relevant pairs of graphs via a direct, ad-hoc method, sketched in code below. That is:

- For each US state $i$, say NC, sample the inverse Wishart reference posterior for $V$.
- Explore which entries in $\Lambda = V^{-1}$ might plausibly be small or zero, simply by inspecting the implied sets of complete conditional regression coefficients derived from the reference posterior mean of the precision matrix $\Lambda$. Use this informally to assess a range of candidate graphs $G_V$; Figure 6 shows the 8 candidate graphs $G_V$ so generated.
- Apply a similar strategy across industrial sectors $j$ to obtain a few candidate graphs $G_U$; the 9 selected candidate graphs appear in Figure 7.
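A sketch of the informal screening step in the second bullet follows. It is our illustration: the helper name and threshold are hypothetical, and the paper's procedure was an informal visual inspection rather than a fixed rule.

```python
import numpy as np

def edge_screen(Lam, thresh=0.1):
    """Informal screen for candidate graphs: the complete conditional
    regression of column j on column t has coefficient -lambda_tj/lambda_jj,
    so small coefficients (in both directions) flag edges to consider
    removing. Returns an adjacency matrix for a candidate graph."""
    p = Lam.shape[0]
    B = -Lam / np.diag(Lam)[None, :]       # B[t, j]: coefficient of y_t for y_j
    keep = np.zeros((p, p), dtype=bool)
    for j in range(p):
        for t in range(j + 1, p):
            keep[t, j] = keep[j, t] = max(abs(B[t, j]), abs(B[j, t])) > thresh
    return keep

# Toy use with a posterior-mean precision estimate Lam_hat (simulated here).
rng = np.random.default_rng(8)
A = rng.standard_normal((9, 9))
Lam_hat = A @ A.T + 9 * np.eye(9)
print(edge_screen(Lam_hat).astype(int))
```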

For each of the resulting 72 models defined by the possible pairs $(G_U, G_V)$, the above analysis was run to generate approximate marginal likelihoods. An overall summary of the likelihood over graphs is given in Table 2. Figures 8 and 9 show the graph combination with the highest likelihood, and we note that both the states graph and the industrial sectors graph seem to reflect relevant dependencies in the econometric context. In particular, the state graph groups the tightly related northeastern US states together into one clique; it connects as statistical neighbours the mid-Atlantic/southeastern physical and close economic neighbours VA and NC; and it links the non-Atlantic-seaboard northeastern neighbours OH and IL. The business sector graph connects the core manufacturing, commodity production/flow and industrial support sectors M, T&U, C, P&BS, L&H and E&H in an intuitive chain, with the remaining sectors free-standing and isolated from this chain.

8 Discussion

This paper has presented and illustrated novel matrix-data modelling that incorporates conditional independence structures, in terms of Gaussian graphical models, in the rows and columns of matrix data. The innovations include fully Bayesian analysis of the resulting models for random samples using MCMC methods, and a first computational methodology for marginal likelihood evaluation to provide an entrée into the realm of graphical model uncertainty assessment. A cornerstone of the theoretical model structure is a class of novel priors for matrix normal graphical models, using a parameter expansion (PX) approach.

The posterior computation using a Gibbs sampler shows clean convergence and mixing of the Gibbs chains, as well as stable and accurate MCMC evaluations of marginal likelihood functions under specified graphical models. Recent work of Hobert & Marchev (2008) and Roy & Hobert (2007) provides theoretical support for PX Gibbs samplers; in our model, PX not only yields good estimation due to its good mixing and convergence properties, but is also fundamental to the new model/prior framework, addressing identification issues while yielding tractable and computationally accessible posteriors.

The first synthetic data example is of modest size but serves as an illustration of the ability of Markov chain Monte Carlo methods to estimate covariance matrices and approximate marginal likelihoods in this framework. The second example illustrates the statistical and computational methodology in a higher-dimensional problem, while introducing a novel class of Markov random field models that emerge quite naturally from the matrix graphical model context. The final example builds on our extensions of the modelling theory and computations to a broad class of matrix-variate dynamic models with structured dependencies among both rows and columns of time series data matrices using graphical model forms. Current interest lies in developing these new spatial and time series graphical models, both methodologically and in terms of applications, and in extending the methodology to more formal and automated graphical model search.

Acknowledgement

The authors acknowledge the support of grants from the U.S. National Science Foundation and National Institutes of Health.

References

Besag, J. (1989). A candidate's formula: A curious result in Bayesian prediction. Biometrika 76.

Carvalho, C., Massam, H. & West, M. (2007). Simulation of hyper-inverse Wishart distributions in graphical models. Biometrika 94.

Carvalho, C. M. & West, M. (2007a). Dynamic matrix-variate graphical models. Bayesian Analysis 2.

Carvalho, C. M. & West, M. (2007b). Dynamic matrix-variate graphical models - A synopsis. In Bayesian Statistics VIII, J. Bernardo, M. Bayarri, J. Berger, A. Dawid, D. Heckerman, A. Smith & M. West, eds. Oxford University Press.

Chib, S. (1995). Marginal likelihood from the Gibbs output. Journal of the American Statistical Association 90.

Dawid, A. P. (1981). Some matrix-variate distribution theory: Notational considerations and a Bayesian application. Biometrika 68.

Dawid, A. P. & Lauritzen, S. L. (1993). Hyper-Markov laws in the statistical analysis of decomposable graphical models. Annals of Statistics 21.

Dobra, A., Jones, B., Hans, C., Nevins, J. & West, M. (2004). Sparse graphical models for exploring gene expression data. Journal of Multivariate Analysis 90.

Dutilleul, P. (1999). The MLE algorithm for the matrix normal distribution. Journal of Statistical Computation and Simulation 64.

Finn, J. D. (1974). A General Model for Multivariate Analysis. New York: Holt, Rinehart and Winston.

Galecki, A. (1994). General class of covariance structures for two or more repeated factors in longitudinal data analysis. Communications in Statistics - Theory and Methods 23.

Gelman, A. (2004). Parameterization and Bayesian modeling. Journal of the American Statistical Association 99.

Gelman, A. (2006). Prior distributions for variance parameters in hierarchical models. Bayesian Analysis 1.

Giudici, P. (1996). Learning in graphical Gaussian models. In Bayesian Statistics 5, J. M. Bernardo, J. O. Berger, A. P. Dawid & A. F. M. Smith, eds. Oxford University Press.

Giudici, P. & Green, P. J. (1999). Decomposable graphical Gaussian model determination. Biometrika 86.

Gupta, A. K. & Nagar, D. K. (2000). Matrix Variate Distributions, vol. 104 of Monographs and Surveys in Pure and Applied Mathematics. London: Chapman & Hall/CRC.

Hans, C., Dobra, A. & West, M. (2007). Shotgun stochastic search in regression with many predictors. Journal of the American Statistical Association 102.

Hobert, J. P. & Marchev, D. (2008). A theoretical comparison of the data augmentation, marginal augmentation and PX-DA algorithms. Annals of Statistics, to appear.

Huizenga, H. M., de Munck, J. C., Waldorp, L. J. & Grasman, R. (2002). Spatiotemporal EEG/MEG source analysis based on a parametric noise covariance model. IEEE Transactions on Biomedical Engineering 49.

Jones, B., Carvalho, C., Dobra, A., Hans, C., Carter, C. & West, M. (2005). Experiments in stochastic computation for high-dimensional graphical models. Statistical Science 20.

Lauritzen, S. L. (1996). Graphical Models. Oxford: Clarendon Press.

Liu, C., Rubin, D. B. & Wu, Y. N. (1998). Parameter expansion to accelerate EM: The PX-EM algorithm. Biometrika.

Liu, J. S. & Wu, Y. N. (1999). Parameter expansion for data augmentation. Journal of the American Statistical Association 94.

Mardia, K. V. & Goodall, C. R. (1993). Spatial-temporal analysis of multivariate environmental monitoring data. In Multivariate Environmental Statistics, G. P. Patil & C. R. Rao, eds. Elsevier.

Mitchell, M. W., Genton, M. G. & Gumpertz, M. L. (2005). Testing for separability of space-time covariances. Environmetrics 16.

Mitchell, M. W., Genton, M. G. & Gumpertz, M. L. (2006). A likelihood ratio test for separability of covariances. Journal of Multivariate Analysis 97.

Naik, D. N. & Rao, S. S. (2001). Analysis of multivariate repeated measures data with a Kronecker product structured covariance matrix. Journal of Applied Statistics 29.

Quintana, J. (1992). Optimal portfolios of forward currency contracts. In Bayesian Statistics IV, J. Berger, J. Bernardo, A. Dawid & A. Smith, eds. Oxford University Press.

Quintana, J., Lourdes, V., Aguilar, O. & Liu, J. (2003). Global gambling. In Bayesian Statistics VII, J. Bernardo, M. Bayarri, J. Berger, A. Dawid, D. Heckerman, A. Smith & M. West, eds. Oxford University Press.

Quintana, J. & West, M. (1987). Multivariate time series analysis: New techniques applied to international exchange rate data. The Statistician 36.

Rich, J., Hans, C., Jones, B., Iversen, E., McClendon, R., Rashed, A., Dobra, A., Dressman, H., Bigner, D., Nevins, J. & West, M. (2005). Gene expression and genetic markers in glioblastoma survival. Cancer Research 65.

Roy, V. & Hobert, J. P. (2007). Convergence rates and asymptotic standard errors for MCMC algorithms for Bayesian probit regression. Journal of the Royal Statistical Society, Series B 69.

Theobald, D. L. & Wuttke, D. S. (2006). Empirical Bayes hierarchical models for regularizing maximum likelihood estimation in the matrix Gaussian Procrustes problem. Proceedings of the National Academy of Sciences 103.

West, M. & Harrison, P. J. (1997). Bayesian Forecasting and Dynamic Models, 2nd ed. New York: Springer-Verlag.

Whittaker, J. (1990). Graphical Models in Applied Multivariate Statistics. Chichester: John Wiley and Sons.

Table 1: Relative log-marginal likelihoods of the top five models in the MRF graphical model example. Columns give the graph structure (AR orders for V and U) and the two log-marginal likelihood estimates (A) and (B); the surviving row labels pair V with U as (AR(4), AR(5)), (AR(5), AR(5)), (AR(6), AR(5)) and (AR(4), AR(6)). Each entry is the estimated log-marginal likelihood relative to that of the most likely model under Candidate's method (A).

Table 2: Relative log-marginal likelihoods of the pairs of graphs (G_U, G_V) in the matrix dynamic graphical model applied to the US states by industrial sectors time series. Each entry is the estimated log-marginal likelihood relative to that of the most likely model, computed as the average of the two approximate values generated from the two parallel versions (A) and (B) of Candidate's formula. The differences between the two estimates of marginal likelihood are, in all cases, in the second decimal place or smaller.

Figure 1: MCMC trace plots of diagonal elements of V in the analysis of the simulated random sample of Section 5, illustrating the stability and fast mixing of the MCMC that is consistent across all parameters in U and V.

Figure 2: Images of the two true precision matrices Λ and Ω (panels a and c) in the model generating the simulated random sample of Section 5, together with the corresponding MCMC-based posterior means (panels b and d).

Figure 3: Log-marginal likelihood values on the two true graphs in the simulation example of Section 5. The two estimates, from methods (A) and (B) of Section 4.2, were successively re-evaluated at differing MCMC sample sizes. The plot confirms the concordance of the two estimates even at low MCMC sample sizes, and suggests accuracy in terms of the differences between the two estimates on the log-likelihood scale.

Figure 4: Images displaying the band structure of the two precision matrices (upper row) used in the MRF matrix graphical model example of Section 6, together with images of two simulated draws from the model (lower row).

Figure 5: Aspects of one data time series in the econometric example of Section 7.3: monthly changes in (NC, FA) employment together with the one-step ahead forecasts (upper frame); standardised one-step ahead forecast errors $e_t/\sqrt{q_t}$ (middle frame); on-line estimated seasonal pattern (lower frame).

Figure 6: Candidate graphs for V in the econometric example of Section 7.3, represented as edge adjacency matrices displayed as dots (1) and missing dots (0).

Figure 7: Candidate graphs for U in the econometric example of Section 7.3, represented as edge adjacency matrices displayed as dots (1) and missing dots (0).

Figure 8: Econometric example of Section 7.3: highest marginal likelihood graph, showing conditional dependencies among states for the CES data.

Figure 9: Econometric example of Section 7.3: highest marginal likelihood graph, showing conditional dependencies among industrial sectors for the CES data.


More information

Dynamic linear models (aka state-space models) 1

Dynamic linear models (aka state-space models) 1 Dynamic linear models (aka state-space models) 1 Advanced Econometris: Time Series Hedibert Freitas Lopes INSPER 1 Part of this lecture is based on Gamerman and Lopes (2006) Markov Chain Monte Carlo: Stochastic

More information

Bayesian Linear Regression

Bayesian Linear Regression Bayesian Linear Regression Sudipto Banerjee 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. September 15, 2010 1 Linear regression models: a Bayesian perspective

More information

Areal data models. Spatial smoothers. Brook s Lemma and Gibbs distribution. CAR models Gaussian case Non-Gaussian case

Areal data models. Spatial smoothers. Brook s Lemma and Gibbs distribution. CAR models Gaussian case Non-Gaussian case Areal data models Spatial smoothers Brook s Lemma and Gibbs distribution CAR models Gaussian case Non-Gaussian case SAR models Gaussian case Non-Gaussian case CAR vs. SAR STAR models Inference for areal

More information

The Bayesian Approach to Multi-equation Econometric Model Estimation

The Bayesian Approach to Multi-equation Econometric Model Estimation Journal of Statistical and Econometric Methods, vol.3, no.1, 2014, 85-96 ISSN: 2241-0384 (print), 2241-0376 (online) Scienpress Ltd, 2014 The Bayesian Approach to Multi-equation Econometric Model Estimation

More information

Posterior convergence rates for estimating large precision. matrices using graphical models

Posterior convergence rates for estimating large precision. matrices using graphical models Biometrika (2013), xx, x, pp. 1 27 C 2007 Biometrika Trust Printed in Great Britain Posterior convergence rates for estimating large precision matrices using graphical models BY SAYANTAN BANERJEE Department

More information

STA 414/2104: Machine Learning

STA 414/2104: Machine Learning STA 414/2104: Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistics! rsalakhu@cs.toronto.edu! http://www.cs.toronto.edu/~rsalakhu/ Lecture 9 Sequential Data So far

More information

Bayesian Inference in GLMs. Frequentists typically base inferences on MLEs, asymptotic confidence

Bayesian Inference in GLMs. Frequentists typically base inferences on MLEs, asymptotic confidence Bayesian Inference in GLMs Frequentists typically base inferences on MLEs, asymptotic confidence limits, and log-likelihood ratio tests Bayesians base inferences on the posterior distribution of the unknowns

More information

Lecture 5: Spatial probit models. James P. LeSage University of Toledo Department of Economics Toledo, OH

Lecture 5: Spatial probit models. James P. LeSage University of Toledo Department of Economics Toledo, OH Lecture 5: Spatial probit models James P. LeSage University of Toledo Department of Economics Toledo, OH 43606 jlesage@spatial-econometrics.com March 2004 1 A Bayesian spatial probit model with individual

More information

Online appendix to On the stability of the excess sensitivity of aggregate consumption growth in the US

Online appendix to On the stability of the excess sensitivity of aggregate consumption growth in the US Online appendix to On the stability of the excess sensitivity of aggregate consumption growth in the US Gerdie Everaert 1, Lorenzo Pozzi 2, and Ruben Schoonackers 3 1 Ghent University & SHERPPA 2 Erasmus

More information

Inequalities on partial correlations in Gaussian graphical models

Inequalities on partial correlations in Gaussian graphical models Inequalities on partial correlations in Gaussian graphical models containing star shapes Edmund Jones and Vanessa Didelez, School of Mathematics, University of Bristol Abstract This short paper proves

More information

Precision Engineering

Precision Engineering Precision Engineering 38 (2014) 18 27 Contents lists available at ScienceDirect Precision Engineering j o ur nal homep age : www.elsevier.com/locate/precision Tool life prediction using Bayesian updating.

More information

The Mixture Approach for Simulating New Families of Bivariate Distributions with Specified Correlations

The Mixture Approach for Simulating New Families of Bivariate Distributions with Specified Correlations The Mixture Approach for Simulating New Families of Bivariate Distributions with Specified Correlations John R. Michael, Significance, Inc. and William R. Schucany, Southern Methodist University The mixture

More information

CPSC 540: Machine Learning

CPSC 540: Machine Learning CPSC 540: Machine Learning MCMC and Non-Parametric Bayes Mark Schmidt University of British Columbia Winter 2016 Admin I went through project proposals: Some of you got a message on Piazza. No news is

More information

Cross-sectional space-time modeling using ARNN(p, n) processes

Cross-sectional space-time modeling using ARNN(p, n) processes Cross-sectional space-time modeling using ARNN(p, n) processes W. Polasek K. Kakamu September, 006 Abstract We suggest a new class of cross-sectional space-time models based on local AR models and nearest

More information

1 Bayesian Linear Regression (BLR)

1 Bayesian Linear Regression (BLR) Statistical Techniques in Robotics (STR, S15) Lecture#10 (Wednesday, February 11) Lecturer: Byron Boots Gaussian Properties, Bayesian Linear Regression 1 Bayesian Linear Regression (BLR) In linear regression,

More information

Contents. Part I: Fundamentals of Bayesian Inference 1

Contents. Part I: Fundamentals of Bayesian Inference 1 Contents Preface xiii Part I: Fundamentals of Bayesian Inference 1 1 Probability and inference 3 1.1 The three steps of Bayesian data analysis 3 1.2 General notation for statistical inference 4 1.3 Bayesian

More information

Introduction to Graphical Models

Introduction to Graphical Models Introduction to Graphical Models STA 345: Multivariate Analysis Department of Statistical Science Duke University, Durham, NC, USA Robert L. Wolpert 1 Conditional Dependence Two real-valued or vector-valued

More information

SUPPLEMENT TO MARKET ENTRY COSTS, PRODUCER HETEROGENEITY, AND EXPORT DYNAMICS (Econometrica, Vol. 75, No. 3, May 2007, )

SUPPLEMENT TO MARKET ENTRY COSTS, PRODUCER HETEROGENEITY, AND EXPORT DYNAMICS (Econometrica, Vol. 75, No. 3, May 2007, ) Econometrica Supplementary Material SUPPLEMENT TO MARKET ENTRY COSTS, PRODUCER HETEROGENEITY, AND EXPORT DYNAMICS (Econometrica, Vol. 75, No. 3, May 2007, 653 710) BY SANGHAMITRA DAS, MARK ROBERTS, AND

More information

An EM algorithm for Gaussian Markov Random Fields

An EM algorithm for Gaussian Markov Random Fields An EM algorithm for Gaussian Markov Random Fields Will Penny, Wellcome Department of Imaging Neuroscience, University College, London WC1N 3BG. wpenny@fil.ion.ucl.ac.uk October 28, 2002 Abstract Lavine

More information

Graphical Models for Collaborative Filtering

Graphical Models for Collaborative Filtering Graphical Models for Collaborative Filtering Le Song Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012 Sequence modeling HMM, Kalman Filter, etc.: Similarity: the same graphical model topology,

More information

Efficient Posterior Inference and Prediction of Space-Time Processes Using Dynamic Process Convolutions

Efficient Posterior Inference and Prediction of Space-Time Processes Using Dynamic Process Convolutions Efficient Posterior Inference and Prediction of Space-Time Processes Using Dynamic Process Convolutions Catherine A. Calder Department of Statistics The Ohio State University 1958 Neil Avenue Columbus,

More information

Principles of Bayesian Inference

Principles of Bayesian Inference Principles of Bayesian Inference Sudipto Banerjee University of Minnesota July 20th, 2008 1 Bayesian Principles Classical statistics: model parameters are fixed and unknown. A Bayesian thinks of parameters

More information

Introduction to Machine Learning CMU-10701

Introduction to Machine Learning CMU-10701 Introduction to Machine Learning CMU-10701 Markov Chain Monte Carlo Methods Barnabás Póczos & Aarti Singh Contents Markov Chain Monte Carlo Methods Goal & Motivation Sampling Rejection Importance Markov

More information

Geometric ergodicity of the Bayesian lasso

Geometric ergodicity of the Bayesian lasso Geometric ergodicity of the Bayesian lasso Kshiti Khare and James P. Hobert Department of Statistics University of Florida June 3 Abstract Consider the standard linear model y = X +, where the components

More information

Assessing Regime Uncertainty Through Reversible Jump McMC

Assessing Regime Uncertainty Through Reversible Jump McMC Assessing Regime Uncertainty Through Reversible Jump McMC August 14, 2008 1 Introduction Background Research Question 2 The RJMcMC Method McMC RJMcMC Algorithm Dependent Proposals Independent Proposals

More information

arxiv: v1 [stat.me] 26 Jul 2011

arxiv: v1 [stat.me] 26 Jul 2011 AUTOREGRESSIVE MODELS FOR VARIANCE MATRICES: STATIONARY INVERSE WISHART PROCESSES BY EMILY B. FOX AND MIKE WEST Duke University, Durham NC, USA arxiv:1107.5239v1 [stat.me] 26 Jul 2011 We introduce and

More information

A new Hierarchical Bayes approach to ensemble-variational data assimilation

A new Hierarchical Bayes approach to ensemble-variational data assimilation A new Hierarchical Bayes approach to ensemble-variational data assimilation Michael Tsyrulnikov and Alexander Rakitko HydroMetCenter of Russia College Park, 20 Oct 2014 Michael Tsyrulnikov and Alexander

More information

Bayesian Estimation of DSGE Models 1 Chapter 3: A Crash Course in Bayesian Inference

Bayesian Estimation of DSGE Models 1 Chapter 3: A Crash Course in Bayesian Inference 1 The views expressed in this paper are those of the authors and do not necessarily reflect the views of the Federal Reserve Board of Governors or the Federal Reserve System. Bayesian Estimation of DSGE

More information

Inference and estimation in probabilistic time series models

Inference and estimation in probabilistic time series models 1 Inference and estimation in probabilistic time series models David Barber, A Taylan Cemgil and Silvia Chiappa 11 Time series The term time series refers to data that can be represented as a sequence

More information

Bayesian model selection: methodology, computation and applications

Bayesian model selection: methodology, computation and applications Bayesian model selection: methodology, computation and applications David Nott Department of Statistics and Applied Probability National University of Singapore Statistical Genomics Summer School Program

More information

Factorization of Seperable and Patterned Covariance Matrices for Gibbs Sampling

Factorization of Seperable and Patterned Covariance Matrices for Gibbs Sampling Monte Carlo Methods Appl, Vol 6, No 3 (2000), pp 205 210 c VSP 2000 Factorization of Seperable and Patterned Covariance Matrices for Gibbs Sampling Daniel B Rowe H & SS, 228-77 California Institute of

More information

Markov Chain Monte Carlo in Practice

Markov Chain Monte Carlo in Practice Markov Chain Monte Carlo in Practice Edited by W.R. Gilks Medical Research Council Biostatistics Unit Cambridge UK S. Richardson French National Institute for Health and Medical Research Vilejuif France

More information

Simulation of truncated normal variables. Christian P. Robert LSTA, Université Pierre et Marie Curie, Paris

Simulation of truncated normal variables. Christian P. Robert LSTA, Université Pierre et Marie Curie, Paris Simulation of truncated normal variables Christian P. Robert LSTA, Université Pierre et Marie Curie, Paris Abstract arxiv:0907.4010v1 [stat.co] 23 Jul 2009 We provide in this paper simulation algorithms

More information

Partial factor modeling: predictor-dependent shrinkage for linear regression

Partial factor modeling: predictor-dependent shrinkage for linear regression modeling: predictor-dependent shrinkage for linear Richard Hahn, Carlos Carvalho and Sayan Mukherjee JASA 2013 Review by Esther Salazar Duke University December, 2013 Factor framework The factor framework

More information

Monetary and Exchange Rate Policy Under Remittance Fluctuations. Technical Appendix and Additional Results

Monetary and Exchange Rate Policy Under Remittance Fluctuations. Technical Appendix and Additional Results Monetary and Exchange Rate Policy Under Remittance Fluctuations Technical Appendix and Additional Results Federico Mandelman February In this appendix, I provide technical details on the Bayesian estimation.

More information

Computational statistics

Computational statistics Computational statistics Markov Chain Monte Carlo methods Thierry Denœux March 2017 Thierry Denœux Computational statistics March 2017 1 / 71 Contents of this chapter When a target density f can be evaluated

More information

Bayesian nonparametric estimation of finite population quantities in absence of design information on nonsampled units

Bayesian nonparametric estimation of finite population quantities in absence of design information on nonsampled units Bayesian nonparametric estimation of finite population quantities in absence of design information on nonsampled units Sahar Z Zangeneh Robert W. Keener Roderick J.A. Little Abstract In Probability proportional

More information

A short introduction to INLA and R-INLA

A short introduction to INLA and R-INLA A short introduction to INLA and R-INLA Integrated Nested Laplace Approximation Thomas Opitz, BioSP, INRA Avignon Workshop: Theory and practice of INLA and SPDE November 7, 2018 2/21 Plan for this talk

More information

Bayesian Statistical Methods. Jeff Gill. Department of Political Science, University of Florida

Bayesian Statistical Methods. Jeff Gill. Department of Political Science, University of Florida Bayesian Statistical Methods Jeff Gill Department of Political Science, University of Florida 234 Anderson Hall, PO Box 117325, Gainesville, FL 32611-7325 Voice: 352-392-0262x272, Fax: 352-392-8127, Email:

More information

Generalized Autoregressive Score Models

Generalized Autoregressive Score Models Generalized Autoregressive Score Models by: Drew Creal, Siem Jan Koopman, André Lucas To capture the dynamic behavior of univariate and multivariate time series processes, we can allow parameters to be

More information

Bayesian SAE using Complex Survey Data Lecture 4A: Hierarchical Spatial Bayes Modeling

Bayesian SAE using Complex Survey Data Lecture 4A: Hierarchical Spatial Bayes Modeling Bayesian SAE using Complex Survey Data Lecture 4A: Hierarchical Spatial Bayes Modeling Jon Wakefield Departments of Statistics and Biostatistics University of Washington 1 / 37 Lecture Content Motivation

More information

A Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models

A Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models A Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models Jeff A. Bilmes (bilmes@cs.berkeley.edu) International Computer Science Institute

More information

PACKAGE LMest FOR LATENT MARKOV ANALYSIS

PACKAGE LMest FOR LATENT MARKOV ANALYSIS PACKAGE LMest FOR LATENT MARKOV ANALYSIS OF LONGITUDINAL CATEGORICAL DATA Francesco Bartolucci 1, Silvia Pandofi 1, and Fulvia Pennoni 2 1 Department of Economics, University of Perugia (e-mail: francesco.bartolucci@unipg.it,

More information

Gibbs Sampling in Endogenous Variables Models

Gibbs Sampling in Endogenous Variables Models Gibbs Sampling in Endogenous Variables Models Econ 690 Purdue University Outline 1 Motivation 2 Identification Issues 3 Posterior Simulation #1 4 Posterior Simulation #2 Motivation In this lecture we take

More information

Bayesian time series classification

Bayesian time series classification Bayesian time series classification Peter Sykacek Department of Engineering Science University of Oxford Oxford, OX 3PJ, UK psyk@robots.ox.ac.uk Stephen Roberts Department of Engineering Science University

More information

Bayesian spatial hierarchical modeling for temperature extremes

Bayesian spatial hierarchical modeling for temperature extremes Bayesian spatial hierarchical modeling for temperature extremes Indriati Bisono Dr. Andrew Robinson Dr. Aloke Phatak Mathematics and Statistics Department The University of Melbourne Maths, Informatics

More information

Development of Stochastic Artificial Neural Networks for Hydrological Prediction

Development of Stochastic Artificial Neural Networks for Hydrological Prediction Development of Stochastic Artificial Neural Networks for Hydrological Prediction G. B. Kingston, M. F. Lambert and H. R. Maier Centre for Applied Modelling in Water Engineering, School of Civil and Environmental

More information

Supplementary Note on Bayesian analysis

Supplementary Note on Bayesian analysis Supplementary Note on Bayesian analysis Structured variability of muscle activations supports the minimal intervention principle of motor control Francisco J. Valero-Cuevas 1,2,3, Madhusudhan Venkadesan

More information

data lam=36.9 lam=6.69 lam=4.18 lam=2.92 lam=2.21 time max wavelength modulus of max wavelength cycle

data lam=36.9 lam=6.69 lam=4.18 lam=2.92 lam=2.21 time max wavelength modulus of max wavelength cycle AUTOREGRESSIVE LINEAR MODELS AR(1) MODELS The zero-mean AR(1) model x t = x t,1 + t is a linear regression of the current value of the time series on the previous value. For > 0 it generates positively

More information

On a multivariate implementation of the Gibbs sampler

On a multivariate implementation of the Gibbs sampler Note On a multivariate implementation of the Gibbs sampler LA García-Cortés, D Sorensen* National Institute of Animal Science, Research Center Foulum, PB 39, DK-8830 Tjele, Denmark (Received 2 August 1995;

More information

Hyperparameter estimation in Dirichlet process mixture models

Hyperparameter estimation in Dirichlet process mixture models Hyperparameter estimation in Dirichlet process mixture models By MIKE WEST Institute of Statistics and Decision Sciences Duke University, Durham NC 27706, USA. SUMMARY In Bayesian density estimation and

More information

Bayes methods for categorical data. April 25, 2017

Bayes methods for categorical data. April 25, 2017 Bayes methods for categorical data April 25, 2017 Motivation for joint probability models Increasing interest in high-dimensional data in broad applications Focus may be on prediction, variable selection,

More information

Bayesian Inference in the Multivariate Probit Model

Bayesian Inference in the Multivariate Probit Model Bayesian Inference in the Multivariate Probit Model Estimation of the Correlation Matrix by Aline Tabet A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF Master of Science

More information

Lecture Notes based on Koop (2003) Bayesian Econometrics

Lecture Notes based on Koop (2003) Bayesian Econometrics Lecture Notes based on Koop (2003) Bayesian Econometrics A.Colin Cameron University of California - Davis November 15, 2005 1. CH.1: Introduction The concepts below are the essential concepts used throughout

More information

Lecture 2: From Linear Regression to Kalman Filter and Beyond

Lecture 2: From Linear Regression to Kalman Filter and Beyond Lecture 2: From Linear Regression to Kalman Filter and Beyond Department of Biomedical Engineering and Computational Science Aalto University January 26, 2012 Contents 1 Batch and Recursive Estimation

More information

Coupled Hidden Markov Models: Computational Challenges

Coupled Hidden Markov Models: Computational Challenges .. Coupled Hidden Markov Models: Computational Challenges Louis J. M. Aslett and Chris C. Holmes i-like Research Group University of Oxford Warwick Algorithms Seminar 7 th March 2014 ... Hidden Markov

More information