1 Modelling the Variance: MCMC methods for fitting multilevel models with complex level 1 variation and extensions to constrained variance matrices. By Dr William Browne, Centre for Multilevel Modelling, Institute of Education, London.
2 Summary of Talk Background to Multilevel modelling project. What is complex level 1 variation? Tutorial dataset. Method 1 : Inverse Wishart proposals. Method 2 : Truncated Normal proposals. Log formulations. Extensions to the multivariate problem.
3 Multilevel modelling project Based at the Institute of Education. Headed by Professor Harvey Goldstein. Funded by ESRC, originally through the ALCD programme. 3 full-time research officers. 2 lecturers associated with the project. A network of project Fellows.
4 Aims of Project Modelling complex structures in social science data. Establishing forms of model structure. Developing methodology to fit models. Comparing alternative methodologies. Programming methodology into the computer package MLwiN. Disseminating ideas to the social science community.
5 MLwiN Software package Evolved from a chain of packages produced by the MMP. Forerunners include ML2, ML3 and MLn. Main programmer: Jon Rasbash. Consists of a user-friendly Windows interface on top of fast estimation engines. Over 3,000 users (mainly academic) worldwide. Estimation by IGLS, RIGLS, MCMC methods and bootstrapping. MCMC theory and programming by William Browne and David Draper.
6 Current research interests Cross-classified and multiple membership models. Missing data and measurement errors in multilevel modelling. Multilevel factor analysis modelling. Spatial modelling. Combining estimation procedures. Improving the user interface.
7 Univariate Normal model: $y_i \sim N(\mu, \sigma^2)$, $i = 1, \ldots, 4$. Can be written $y = (y_1, y_2, y_3, y_4)^T \sim MVN(\mu, V)$ where $\mu = (\mu, \mu, \mu, \mu)^T$ and $V = \sigma^2 I_4$. Normal linear model: $y_i \sim N(X_i\beta, \sigma^2)$, $i = 1, \ldots, 4$. Can be written $y = (y_1, y_2, y_3, y_4)^T \sim MVN(\mu, V)$ where $\mu = (X_1\beta, X_2\beta, X_3\beta, X_4\beta)^T$ and $V = \sigma^2 I_4$.
8 2 level variance components model: $y_{ij} = X_{ij}\beta + u_j + e_{ij}$, $u_j \sim N(0, \sigma^2_u)$, $e_{ij} \sim N(0, \sigma^2_e)$, $i = 1, \ldots, 2$, $j = 1, \ldots, 2$. Can be written $y = (y_{11}, y_{21}, y_{12}, y_{22})^T \sim MVN(\mu, V)$ where $\mu = (X_{11}\beta, X_{21}\beta, X_{12}\beta, X_{22}\beta)^T$ and
$$V = \begin{pmatrix} \sigma^2_u + \sigma^2_e & \sigma^2_u & 0 & 0 \\ \sigma^2_u & \sigma^2_u + \sigma^2_e & 0 & 0 \\ 0 & 0 & \sigma^2_u + \sigma^2_e & \sigma^2_u \\ 0 & 0 & \sigma^2_u & \sigma^2_u + \sigma^2_e \end{pmatrix}.$$
9 Complex Variation. Definition: a model where the variance depends on predictor variables. $y_{ij} = X_{ij}\beta + Z_{ij}u_j + X^C_{ij}e_{ij}$, $u_j \sim MVN(0, \Omega_u)$, $e_{ij} \sim MVN(0, \Omega_e)$. The $V$ matrix now has diagonal elements of the form $V_{ij,ij} = \sigma^2_{eij} + \sigma^2_{uij}$, where $\sigma^2_{eij} = X^{CT}_{ij}\Omega_e X^C_{ij}$ and $\sigma^2_{uij} = Z^T_{ij}\Omega_u Z_{ij}$. The off-diagonal elements of $V$ are zero if they correspond to observations in different level 2 units, and otherwise $V_{ij,i'j} = Z^T_{ij}\Omega_u Z_{i'j}$.
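The quadratic form above is easy to check numerically. The following sketch (illustrative values only, not MLwiN code) computes a level 1 variance $\sigma^2_{eij} = X^{CT}_{ij}\Omega_e X^C_{ij}$ for a design vector of intercept plus LRT score:

```python
import numpy as np

# Level 1 variance for one observation: the quadratic form x' Omega_e x,
# where x is the observation's design vector X^C_ij and Omega_e the
# level 1 covariance matrix.
def level1_variance(x, omega_e):
    x = np.asarray(x, dtype=float)
    return float(x @ omega_e @ x)

# Example: intercept plus LRT score, so x = (1, LRT).
omega_e = np.array([[0.55, -0.01],
                    [-0.01, 0.003]])   # illustrative values only
var_at_lrt = level1_variance([1.0, 2.0], omega_e)
# sigma2 = 0.55 + 2*2.0*(-0.01) + 2.0^2*0.003 = 0.522
```

Expanding the quadratic form reproduces exactly the $\sigma_{e00} + 2X1_{ij}\sigma_{e01} + X1^2_{ij}\sigma_{e11}$ pattern used in the examples below.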
10 Example: Tutorial Dataset. Dataset of school exam results at age 16, comprising 4059 pupils from 65 schools. Response variable is total GCSE score. Main predictor variable is LRT (London Reading Test) score. Other predictor of interest is gender.
11 Partitioning the dataset. Here we see the N, mean and variance for different partitions of the dataset: the whole dataset; boys; girls; and the LRT bands LRT < −1, −1 < LRT < −0.5, −0.5 < LRT < −0.1, −0.1 < LRT < 0.3, 0.3 < LRT < 0.7, 0.7 < LRT < 1.1 and 1.1 < LRT. [Table values not recovered in this transcription.]
12 A 1 Level Model: $y_{ij} \sim N(\beta_0 + X1_{ij}\beta_1, V)$, $\sigma^2_{eij} = \sigma_{e00} + 2X1_{ij}\sigma_{e01} + X1^2_{ij}\sigma_{e11}$ (1), where $X1$ is London Reading Test (LRT) score. This graph mimics the results from partitioning the data. [Graph: level 1 variance against standardised LRT score.]
13 A 2 level model with a constant variance at level 2: $y_{ij} \sim N(\beta_0 + X1_{ij}\beta_1, V)$, $\sigma^2_{uij} = \sigma_{u00}$, $\sigma^2_{eij} = \sigma_{e00} + 2X1_{ij}\sigma_{e01} + X1^2_{ij}\sigma_{e11}$ (2), where $X1$ is London Reading Test (LRT) score. [Graph: level 1 and level 2 variances against standardised LRT score.]
14 A 2 Level model with complex variation at both levels 1 and 2: $y_{ij} \sim N(\beta_0 + X1_{ij}\beta_1, V)$, $\sigma^2_{uij} = \sigma_{u00} + 2X1_{ij}\sigma_{u01} + X1^2_{ij}\sigma_{u11}$, $\sigma^2_{eij} = \sigma_{e00} + 2X1_{ij}\sigma_{e01} + X1^2_{ij}\sigma_{e11}$ (3), where $X1$ is London Reading Test (LRT) score. [Graph: level 1 and level 2 variances against standardised LRT score.]
15 A 2 level model with a more complicated variance structure at level 1: $y_{ij} \sim N(\beta_0 + X1_{ij}\beta_1 + X2_{ij}\beta_2, V)$, $\sigma^2_{uij} = \sigma_{u00} + 2X1_{ij}\sigma_{u01} + X1^2_{ij}\sigma_{u11}$, $\sigma^2_{eij} = \sigma_{e00} + 2X1_{ij}\sigma_{e01} + 2X1_{ij}X2_{ij}\sigma_{e12} + X2_{ij}\sigma_{e22}$ (4), where $X1$ is London Reading Test (LRT) score and $X2$ is 1 for boys and 0 for girls. [Graph: level 1 variance for boys, level 1 variance for girls and level 2 variance against standardised LRT score.]
16 Two possible formulations. We can write a general two level Normal model with complex level 1 variation in two similar but not identical formulations. Firstly, $y_{ij} = X_{ij}\beta + Z_{ij}u_j + X^C_{ij}e_{ij}$ where $u_j \sim MVN(0, \Omega_u)$ and $e_{ij} \sim MVN(0, \Omega_e)$; and secondly, $y_{ij} = X_{ij}\beta + Z_{ij}u_j + e^*_{ij}$ where $u_j \sim MVN(0, \Omega_u)$, $e^*_{ij} \sim N(0, \sigma^2_{eij})$, with $e^*_{ij} = X^C_{ij}e_{ij}$ and $\sigma^2_{eij} = X^{CT}_{ij}\Omega_e X^C_{ij}$.
17 Gibbs Sampling steps for both methods. In a Gibbs sampling algorithm we construct conditional posterior distributions for each parameter (or group of parameters) in turn. This constructs a chain of values for each parameter which, upon convergence, will be a sample from the joint posterior distribution. Here we find: Step 1: $p(\beta \mid y, u, \Omega_u, \Omega_e) \sim MVN(\hat\beta, \hat D)$. Step 2: $p(u_j \mid y, \beta, \Omega_u, \Omega_e) \sim MVN(\hat u_j, \hat D_j)$. Step 3: $p(\Omega_u \mid y, \beta, u, \Omega_e) \sim$ Inverse Wishart$(\hat\nu, \hat S)$. Step 4: $p(\Omega_e \mid y, \beta, u, \Omega_u)$? The distribution in Step 4 does not have a `nice' form under either formulation, so for this step we will use the Metropolis-Hastings (MH) sampler.
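The hybrid scheme above can be sketched as a Gibbs sweep in which one step is replaced by a Metropolis-Hastings update. The helper below is a generic MH step (all names are illustrative, not from MLwiN), shown on a toy univariate target:

```python
import math, random

# One Metropolis-Hastings update. propose(current) returns a candidate;
# log_q(a, b) is log q(a | b); for a symmetric proposal log_q may be None,
# in which case this reduces to a plain Metropolis step.
def mh_step(current, log_post, propose, log_q=None):
    cand = propose(current)
    log_alpha = log_post(cand) - log_post(current)
    if log_q is not None:  # Hastings correction for asymmetric proposals
        log_alpha += log_q(current, cand) - log_q(cand, current)
    if math.log(random.random()) < log_alpha:
        return cand
    return current

# Toy usage: sample a N(0, 1) target with a symmetric random-walk proposal.
random.seed(1)
x, chain = 0.0, []
for _ in range(2000):
    x = mh_step(x, lambda t: -0.5 * t * t, lambda t: t + random.gauss(0, 1))
    chain.append(x)
mean_est = sum(chain) / len(chain)  # should be near 0
```

In the sampler of the talk, Steps 1-3 would be exact conditional draws and `mh_step` would play the role of Step 4, with `log_post` the (unnormalised) conditional for $\Omega_e$.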
18 Method 1: Inverse Wishart proposals. In formulation 1 we know that $\Omega_e$ is a variance matrix, so values of $\Omega_e$ must form a positive definite matrix. To use Metropolis-Hastings we therefore require a proposal distribution that generates positive definite matrices; we will use an inverse Wishart distribution. Let $\Omega \sim$ inverse Wishart$_k(\nu, S)$, so that $E(\Omega) = (\nu - k - 1)^{-1}S$. At timestep $t+1$ draw $\Omega^*_e$ from the proposal distribution $p(\Omega^*_e) \sim$ inverse Wishart$_k(w + k + 1, w\Omega^{(t)}_e)$. This has mean the current estimate of $\Omega_e$, and $w$ is a tuning constant.
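A minimal sketch of this proposal, using SciPy's inverse Wishart (function and parameter names here are illustrative): with df $= w + k + 1$ and scale $= w\Omega^{(t)}_e$, the proposal mean $\text{scale}/(\text{df} - k - 1)$ equals the current value.

```python
import numpy as np
from scipy.stats import invwishart

# Method 1 style proposal: an inverse Wishart centred at the current Omega_e.
def propose_inv_wishart(omega_current, w, rng):
    k = omega_current.shape[0]
    return invwishart.rvs(df=w + k + 1, scale=w * omega_current,
                          random_state=rng)

rng = np.random.default_rng(0)
omega = np.array([[1.0, 0.2], [0.2, 0.5]])
w = 50.0  # tuning constant: larger w -> proposals closer to the current value
draws = [propose_inv_wishart(omega, w, rng) for _ in range(2000)]
mean_draw = np.mean(draws, axis=0)  # should be close to omega
```

Every draw is automatically positive definite, which is exactly why the inverse Wishart is a convenient proposal family here.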
19 Method 1 continued. As the inverse Wishart proposal distribution is not symmetric, we have to work out the Hastings ratio. If the current value of $\Omega_e$ is $A$ and we propose a move to $B$, then the Hastings ratio is the ratio of the proposal densities for the reverse and forward moves:
$$hr = \frac{p(A; IW_k(w+k+1, wB))}{p(B; IW_k(w+k+1, wA))} = \frac{|B|^{(2w+3k+3)/2}}{|A|^{(2w+3k+3)/2}}\exp\left(\frac{w}{2}\left[tr(AB^{-1}) - tr(BA^{-1})\right]\right).$$
Our Step 4 now becomes: set $\Omega^{(t+1)}_e = \Omega^*_e$ with probability $\min\left(1, hr \cdot p(\Omega^*_e \mid y, \ldots)/p(\Omega^{(t)}_e \mid y, \ldots)\right)$, and $\Omega^{(t+1)}_e = \Omega^{(t)}_e$ otherwise, where $\Omega^*_e$ is drawn from an inverse Wishart$_k(w + k + 1, w\Omega^{(t)}_e)$ distribution.
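The determinant/trace expression is cheap to evaluate on the log scale. A sketch (written with the orientation q(A given B)/q(B given A), for current matrix A and proposed matrix B; the names are illustrative):

```python
import numpy as np

# log Hastings ratio for the inverse Wishart proposal:
# log hr = ((2w + 3k + 3)/2) * (log|B| - log|A|)
#          + (w/2) * (tr(A B^{-1}) - tr(B A^{-1}))
def log_hastings_ratio(A, B, w):
    k = A.shape[0]
    _, logdet_a = np.linalg.slogdet(A)
    _, logdet_b = np.linalg.slogdet(B)
    power = (2 * w + 3 * k + 3) / 2.0
    trace_term = np.trace(A @ np.linalg.inv(B)) - np.trace(B @ np.linalg.inv(A))
    return power * (logdet_b - logdet_a) + (w / 2.0) * trace_term

A = np.array([[1.0, 0.3], [0.3, 2.0]])
B = np.array([[1.5, 0.1], [0.1, 1.0]])
lr_same = log_hastings_ratio(A, A, w=5.0)   # ratio is 1 when B == A
# Reversing the move flips the sign of the log ratio:
antisym = log_hastings_ratio(A, B, w=5.0) + log_hastings_ratio(B, A, w=5.0)
```

The two sanity checks (zero at A = B, antisymmetry under swapping A and B) are exactly the properties a correct Hastings ratio must have.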
20 Earlier Example 3. Here we look again at the third earlier example model: $y_{ij} \sim N(\beta_0 + X1_{ij}\beta_1, V)$, $\sigma^2_{uij} = \sigma_{u00} + 2X1_{ij}\sigma_{u01} + X1^2_{ij}\sigma_{u11}$, $\sigma^2_{eij} = \sigma_{e00} + 2X1_{ij}\sigma_{e01} + X1^2_{ij}\sigma_{e11}$ (1), where $y$ is the (normalised) GCSE score and $X1$ is the (standardised) LRT score. Results: estimates and standard errors for $\beta_0$, $\beta_1$ and the elements of $\Omega_u$ and $\Omega_e$ under IGLS, MCMC Method 1 and MCMC Method 2. [Table values not recovered in this transcription.]
21 [Figure: MCMC output for parameter $\sigma_{e11}$ using the inverse Wishart method.]
22 Method 2: Truncated Normal proposals. Our second formulation of the model was as follows: $y_{ij} = X_{ij}\beta + Z_{ij}u_j + e^*_{ij}$ where $u_j \sim MVN(0, \Omega_u)$, $e^*_{ij} \sim N(0, \sigma^2_{eij})$, with $e^*_{ij} = X^C_{ij}e_{ij}$ and $\sigma^2_{eij} = X^{CT}_{ij}\Omega_e X^C_{ij}$. So rather than a positive definite constraint on the matrix $\Omega_e$ we instead have the weaker constraint $\sigma^2_{eij} = X^{CT}_{ij}\Omega_e X^C_{ij} > 0 \; \forall i, j$. Note that this would be identical to a positive definite constraint if $X^C_{ij}$ took all possible values, but in practice it doesn't. This constraint looks quite difficult, but we will consider the elements of $\Omega_e$ one at a time.
23 Method 2: Updating the diagonal terms $\sigma_{ekk}$. At time $t$ we require $\sigma^2_{eij} = (X^C_{ij})^T\Omega^{(t)}_e X^C_{ij} = (X^C_{ij(k)})^2\sigma^{(t)}_{ekk} - d^C_{ij(kk)} > 0 \; \forall i, j$, where $d^C_{ij(kk)} = (X^C_{ij(k)})^2\sigma^{(t)}_{ekk} - (X^C_{ij})^T\Omega^{(t)}_e X^C_{ij}$. So $\sigma^{(t)}_{ekk} > \max_{ekk}$, where $\max_{ekk} = \max_{i,j}\left(d^C_{ij(kk)}/(X^C_{ij(k)})^2\right)$.
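The truncation point for a diagonal element can be computed directly from the data. A sketch with illustrative names and values: for each observation with design vector $x$, isolating the $\Omega_e[k,k]$ term of $\sigma^2 = x^T\Omega_e x$ gives a bound $\Omega_e[k,k] > d/x_k^2$ with $d = x_k^2\,\Omega_e[k,k] - x^T\Omega_e x$, and the truncation point is the maximum of these bounds over observations.

```python
import numpy as np

# Lower truncation point for the diagonal element Omega_e[k, k].
def diag_truncation_point(X, omega_e, k):
    bounds = []
    for x in X:
        sigma2 = x @ omega_e @ x
        d = x[k] ** 2 * omega_e[k, k] - sigma2  # does not depend on Omega_e[k,k]
        if x[k] != 0:            # observations with x[k] = 0 impose no bound
            bounds.append(d / x[k] ** 2)
    return max(bounds)

omega_e = np.array([[0.5, 0.05], [0.05, 0.02]])
X = np.array([[1.0, -2.0], [1.0, 0.0], [1.0, 3.0]])
lower = diag_truncation_point(X, omega_e, k=0)
# Any Omega_e[0, 0] above `lower` keeps every observation's variance positive.
```

Note that $d$ cancels the $\Omega_e[k,k]$ contribution, so the bound depends only on the other elements of $\Omega_e$; this is what makes a truncated proposal for a single element feasible.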
24 Method 2: Updating the off-diagonal terms $\sigma_{ekl}$. This step is similar to the step given for the diagonal terms, except that this time $\sigma_{ekl}$ is multiplied by $X^C_{ij(k)}X^C_{ij(l)}$, which can be negative. This means there will be two truncation points (a maximum and a minimum) rather than one. Step 4 of the algorithm becomes (repeated for all $k$ and $l$): set $\sigma^{(t+1)}_{ekl} = \sigma^*_{ekl}$ with probability $\min\left(1, hr \cdot p(\sigma^*_{ekl} \mid y, \ldots)/p(\sigma^{(t)}_{ekl} \mid y, \ldots)\right)$, and $\sigma^{(t+1)}_{ekl} = \sigma^{(t)}_{ekl}$ otherwise, where $\sigma^*_{ekl}$ is drawn from a truncated Normal distribution with truncation points that maintain a positive variance for every observation.
25 Calculating the Hastings ratios for Method 2. With current value $A$, proposed value $B$, proposal standard deviation $s_{kl}$ and truncation points $\min_{ekl}$ and $\max_{ekl}$,
$$hr = \frac{\Phi((\max_{ekl} - A)/s_{kl}) - \Phi((\min_{ekl} - A)/s_{kl})}{\Phi((\max_{ekl} - B)/s_{kl}) - \Phi((\min_{ekl} - B)/s_{kl})}.$$
Figure 1: Plots of truncated univariate Normal proposal distributions for a parameter, in four panels (i)-(iv). $A$ is the current value and $B$ is the proposed new value; $M$ is the maximum and $m$ the minimum truncation point. The distributions in (i) and (iii) have mean $A$, while the distributions in (ii) and (iv) have mean $B$.
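Because the Normal kernel itself is symmetric, only the normalising constants of the two truncated proposals survive in the ratio. A sketch using SciPy's Normal CDF (parameter names are illustrative):

```python
from scipy.stats import norm

# Hastings ratio for a Normal proposal centred at the current value and
# truncated to (lo, hi): the ratio of the normalising constants for the
# reverse move (centred at A) and the forward move (centred at B).
def hastings_ratio_truncnorm(A, B, lo, hi, s):
    num = norm.cdf((hi - A) / s) - norm.cdf((lo - A) / s)
    den = norm.cdf((hi - B) / s) - norm.cdf((lo - B) / s)
    return num / den

# Moving from the centre of the interval towards a truncation point
# gives a ratio above 1.
hr = hastings_ratio_truncnorm(A=0.0, B=0.5, lo=-1.0, hi=1.0, s=0.4)
```

When both values sit well inside the interval the two normalisers are nearly equal and the ratio is close to 1, recovering the ordinary symmetric Metropolis case.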
26 [Figure: MCMC output for parameter $\sigma_{e11}$ using the truncated Normal method.]
27 Example 2. Our model is as follows: $y_{ij} \sim N(\beta_0 + girl_{ij}\beta_1, V)$, $V_{ij,ij} = \sigma_{u00} + \sigma_{e00} + 2girl_{ij}\sigma_{e01}$ (2). This model fits a variance for boys and a term that represents the difference in variance between boys and girls. Results: estimates and standard errors for $\beta_0$, $\beta_1$, $\sigma_{u00}$, $\sigma_{e00}$ and $\sigma_{e01}$ under IGLS, RIGLS and MCMC. [Table values not recovered in this transcription.]
28 Summary so far. In this talk I have introduced 2 MCMC methods for fitting models with complex level 1 variation. Below is a summary of their respective advantages and disadvantages. Method 1 does not allow the variance to be negative for unobserved predictors. Method 1 allows easy specification of informative prior distributions. Method 2 mimics the existing ML methods. Method 2 allows more flexibility in the specification of level 1 variance functions. Method 2 can be extended to include log specifications.
29 Log variance/precision formulation. As an alternative we can write a general two level Normal model with complex level 1 variation in the following formulation, as used by Spiegelhalter et al. (1996): $y_{ij} = X_{ij}\beta + Z_{ij}u_j + e_{ij}$ where $u_j \sim MVN(0, \Omega_u)$, $e_{ij} \sim N(0, 1/\tau_{ij})$ and $\log(\tau_{ij}) = X^C_{ij}\beta_e$. This results in a multiplicative variance function: $\sigma^2_{eij} = \exp(-X^C_{1ij}\beta_{e1}) \times \cdots \times \exp(-X^C_{nij}\beta_{en})$. The main advantage is that the parameters are unconstrained; the main disadvantage is the difficulty of interpreting the individual parameters.
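The multiplicative behaviour is easy to see numerically. A sketch with illustrative coefficient values: the variance is $1/\tau_{ij} = \prod_m \exp(-x_m\beta_{em})$, so each predictor scales the variance by a constant factor per unit.

```python
import math

# Level 1 variance under the log precision formulation:
# log(tau) = x' beta_e, so the variance is exp(-x' beta_e).
def level1_variance_log_form(x, beta_e):
    log_tau = sum(xm * bm for xm, bm in zip(x, beta_e))
    return math.exp(-log_tau)

beta_e = [0.6, -0.25]        # illustrative values: intercept and LRT slope
v0 = level1_variance_log_form([1.0, 0.0], beta_e)   # variance at LRT = 0
v1 = level1_variance_log_form([1.0, 1.0], beta_e)   # variance at LRT = 1
ratio = v1 / v0              # = exp(0.25): constant multiplicative factor
```

This illustrates both points of the slide: any real-valued $\beta_e$ yields a positive variance (no constraints), but an individual coefficient only has meaning as a log-scale multiplier, which is harder to interpret than the additive terms of Method 2.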
30 A comparison of four possible models. In the following graph we plot the variance function for four possible formulations of the level 1 variance for the tutorial dataset: $\sigma^2_{eij} = \sigma_{e00} + 2X1_{ij}\sigma_{e01}$ (1); $\sigma^2_{eij} = \sigma_{e00} + 2X1_{ij}\sigma_{e01} + X1^2_{ij}\sigma_{e11}$ (2); $\sigma^2_{eij} = \exp(-\beta_{e00} - 2X1_{ij}\beta_{e01})$ (3); $\sigma^2_{eij} = \exp(-\beta_{e00} - 2X1_{ij}\beta_{e01} - X1^2_{ij}\beta_{e11})$ (4), where $X1$ is London Reading Test (LRT) score. [Graph: level 1 variance against standardised LRT score for the linear, quadratic, exp. linear and exp. quadratic models.]
31 Comparison of speed and efficiency. Here we look at the four models illustrated in the graph and compare the speed and efficiency of the MH truncated Normal approach and the adaptive rejection (AR) approach used in WinBUGS. The time is based on running each method on the model for 50,000 iterations, and the efficiency is the maximum Raftery-Lewis $\hat N$ statistic, firstly for a level 1 variance parameter and secondly for any model parameter. Recoverable results: Linear, MH 28 mins; Quadratic, MH 30 mins; Exp. Linear, MH 34 mins vs AR 143 mins, $\hat N$ 14.8k/16.8k (MH) vs 3.8k/17.1k (AR); Exp. Quadratic, MH 38 mins vs AR 340 mins, $\hat N$ 16.2k/17.5k (MH) vs 4.7k/17.7k (AR). [Remaining table values not recovered in this transcription.]
32 Model Comparison (work in progress). The DIC diagnostic (Spiegelhalter et al. 2001) is a measure that can be used for comparing complex models fitted using MCMC methods. It can be thought of as a generalization of the AIC diagnostic and combines a measure of fit based on a deviance function $D(\theta)$ with a measure of complexity based on an `effective' number of parameters $p_D$: $DIC = D(\bar\theta) + 2p_D$. The following table gives $D(\bar\theta)$, $p_D$ and DIC values for the variance functions Constant, Quadratic, Linear, Exp. Linear and Exp. Quadratic. [Table values not recovered in this transcription.]
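From an MCMC run the DIC is straightforward to assemble: $p_D = \bar D - D(\bar\theta)$, so $DIC = D(\bar\theta) + 2p_D = \bar D + p_D$. A sketch for a simple Normal model with toy draws (all values illustrative):

```python
import math

# Deviance of an i.i.d. N(mu, sigma2) model: -2 * log likelihood.
def deviance(y, mu, sigma2):
    n = len(y)
    sse = sum((yi - mu) ** 2 for yi in y)
    return n * math.log(2 * math.pi * sigma2) + sse / sigma2

def dic(y, mu_draws, sigma2_draws):
    devs = [deviance(y, m, s2) for m, s2 in zip(mu_draws, sigma2_draws)]
    dbar = sum(devs) / len(devs)                 # posterior mean deviance
    mu_bar = sum(mu_draws) / len(mu_draws)       # posterior mean of mu
    s2_bar = sum(sigma2_draws) / len(sigma2_draws)
    d_at_bar = deviance(y, mu_bar, s2_bar)       # deviance at posterior means
    p_d = dbar - d_at_bar                        # effective parameters
    return d_at_bar + 2 * p_d, p_d

y = [0.1, -0.4, 0.7, 0.2]
dic_val, p_d = dic(y, mu_draws=[0.1, 0.2, 0.15],
                   sigma2_draws=[0.2, 0.25, 0.22])
```

Lower DIC indicates a better trade-off between fit and complexity, which is how the table above would be read.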
33 Extension to multivariate problems (work in progress). We will here ignore the multilevel problems considered so far and stick to a one level problem, as the multilevel analogue is a simple extension. Assume that for each of $P$ individuals we have an $N$-vector response $y_i$, and that this response comes from a multivariate Normal distribution: $y_i \sim MVN(\mu_i, V_i)$. Assume we wish to update a variance matrix $V_i$ with the Metropolis-Hastings algorithm. Normally we would have $V_i = V$ and use Gibbs sampling, but let us assume that each individual has a unique variance matrix; for example, that the $(j,k)$th element $V_i[j,k]$ is a function of a predictor $X_i$. In this talk so far we have considered the case $N = 1$ for this problem.
34 Constraints to maintain positive definiteness. We could consider using the truncated Normal method, but this means calculating all the parameter constraints required to retain a positive definite matrix $V_i \; \forall i$. $N = 1$: $V_i > 0 \; \forall i$. $N = 2$: $V_i[0,0] > 0$, $V_i[1,1] > 0$ and $-1 < V_i[0,1]/\sqrt{V_i[0,0]V_i[1,1]} < 1$. $N = 3$: 3 variance constraints, 3 correlation constraints and a 3-way correlation constraint. Generally, for an $N \times N$ matrix there are in total $2^N - 1$ constraints; each variance parameter is involved in $2^{N-1}$ constraints and each covariance in $2^{N-2}$. So even though some constraints are redundant, evaluating all constraints is impractical for large $N$. Solution: use univariate Normal proposals with no truncation!
35 Univariate Normal proposals: Metropolis method. Explanation: a Metropolis step is (generally) easier than a Gibbs sampling step, as it involves evaluating a posterior distribution at two points rather than calculating the full form of the conditional posterior distribution. Similarly, the Normal proposal is easier than a truncated Normal proposal, as it involves checking whether a proposed value satisfies the positive definite constraints rather than fully calculating these constraints in advance of generating the proposed value. Any values that do not satisfy the constraints have probability 0 and are automatically rejected. The univariate Normal proposal also has the advantage of being symmetric, so we do not need to worry about calculating Hastings ratios.
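The check-then-reject idea can be sketched as follows (illustrative names, not MLwiN code): perturb one element of the matrix with a symmetric Normal step, then test positive definiteness via a Cholesky factorisation; a failed factorisation means the proposal has posterior probability 0 and is rejected immediately, with no truncation points computed in advance.

```python
import numpy as np

# Positive definiteness test: the Cholesky factorisation succeeds
# if and only if the (symmetric) matrix is positive definite.
def is_positive_definite(V):
    try:
        np.linalg.cholesky(V)
        return True
    except np.linalg.LinAlgError:
        return False

# Perturb element (j, k) of V with a symmetric Normal step, mirroring the
# (k, j) element; return None to signal an automatic rejection.
def propose_element(V, j, k, step_sd, rng):
    cand = V.copy()
    delta = rng.normal(0.0, step_sd)
    cand[j, k] += delta
    if j != k:
        cand[k, j] += delta       # keep the matrix symmetric
    return cand if is_positive_definite(cand) else None

rng = np.random.default_rng(42)
V = np.array([[1.0, 0.9], [0.9, 1.0]])
cand = propose_element(V, 0, 1, step_sd=0.3, rng=rng)
# cand is either a valid positive definite matrix or None (auto-reject).
```

Because the proposal is symmetric, an accepted candidate only needs the usual posterior ratio, with no Hastings correction.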
36 Applications. Multivariate response models where elements of the variance matrix are functions of predictor variables (as above). Factor analysis models with correlated factors (see Goldstein and Browne 2001). Mixed Normal and Binomial response (with probit link) models (see various work by Chib).
37 Useful web sites. - Project home page that contains general information on multilevel modelling and information about MLwiN, including bug listings and downloads of the latest version of MLwiN plus the documentation. - Contains drafts of all my publications, including papers awaiting publication. - Contains downloads of recent presentations I have given. - Training materials in social sciences site containing a free teaching version of MLwiN.
Ronald H. Heck 1 EDEP 606: Multivariate Methods (S2013) April 7, 2013 Model Estimation Example As we have moved through the course this semester, we have encountered the concept of model estimation. Discussions
More informationLecture 7 and 8: Markov Chain Monte Carlo
Lecture 7 and 8: Markov Chain Monte Carlo 4F13: Machine Learning Zoubin Ghahramani and Carl Edward Rasmussen Department of Engineering University of Cambridge http://mlg.eng.cam.ac.uk/teaching/4f13/ Ghahramani
More informationReport and Opinion 2016;8(6) Analysis of bivariate correlated data under the Poisson-gamma model
Analysis of bivariate correlated data under the Poisson-gamma model Narges Ramooz, Farzad Eskandari 2. MSc of Statistics, Allameh Tabatabai University, Tehran, Iran 2. Associate professor of Statistics,
More informationAn Introduction to Path Analysis
An Introduction to Path Analysis PRE 905: Multivariate Analysis Lecture 10: April 15, 2014 PRE 905: Lecture 10 Path Analysis Today s Lecture Path analysis starting with multivariate regression then arriving
More informationThree-Level Multiple Imputation: A Fully Conditional Specification Approach. Brian Tinnell Keller
Three-Level Multiple Imputation: A Fully Conditional Specification Approach by Brian Tinnell Keller A Thesis Presented in Partial Fulfillment of the Requirements for the Degree Master of Arts Approved
More informationUsing Model Selection and Prior Specification to Improve Regime-switching Asset Simulations
Using Model Selection and Prior Specification to Improve Regime-switching Asset Simulations Brian M. Hartman, PhD ASA Assistant Professor of Actuarial Science University of Connecticut BYU Statistics Department
More informationCPSC 540: Machine Learning
CPSC 540: Machine Learning MCMC and Non-Parametric Bayes Mark Schmidt University of British Columbia Winter 2016 Admin I went through project proposals: Some of you got a message on Piazza. No news is
More informationA Review of Pseudo-Marginal Markov Chain Monte Carlo
A Review of Pseudo-Marginal Markov Chain Monte Carlo Discussed by: Yizhe Zhang October 21, 2016 Outline 1 Overview 2 Paper review 3 experiment 4 conclusion Motivation & overview Notation: θ denotes the
More informationBayesian Linear Models
Bayesian Linear Models Sudipto Banerjee September 03 05, 2017 Department of Biostatistics, Fielding School of Public Health, University of California, Los Angeles Linear Regression Linear regression is,
More informationLecture 8: The Metropolis-Hastings Algorithm
30.10.2008 What we have seen last time: Gibbs sampler Key idea: Generate a Markov chain by updating the component of (X 1,..., X p ) in turn by drawing from the full conditionals: X (t) j Two drawbacks:
More information1 Data Arrays and Decompositions
1 Data Arrays and Decompositions 1.1 Variance Matrices and Eigenstructure Consider a p p positive definite and symmetric matrix V - a model parameter or a sample variance matrix. The eigenstructure is
More informationBayesian Prediction of Code Output. ASA Albuquerque Chapter Short Course October 2014
Bayesian Prediction of Code Output ASA Albuquerque Chapter Short Course October 2014 Abstract This presentation summarizes Bayesian prediction methodology for the Gaussian process (GP) surrogate representation
More informationA quick introduction to Markov chains and Markov chain Monte Carlo (revised version)
A quick introduction to Markov chains and Markov chain Monte Carlo (revised version) Rasmus Waagepetersen Institute of Mathematical Sciences Aalborg University 1 Introduction These notes are intended to
More informationGeneralized Exponential Random Graph Models: Inference for Weighted Graphs
Generalized Exponential Random Graph Models: Inference for Weighted Graphs James D. Wilson University of North Carolina at Chapel Hill June 18th, 2015 Political Networks, 2015 James D. Wilson GERGMs for
More informationBayes methods for categorical data. April 25, 2017
Bayes methods for categorical data April 25, 2017 Motivation for joint probability models Increasing interest in high-dimensional data in broad applications Focus may be on prediction, variable selection,
More informationBayesian Methods for Machine Learning
Bayesian Methods for Machine Learning CS 584: Big Data Analytics Material adapted from Radford Neal s tutorial (http://ftp.cs.utoronto.ca/pub/radford/bayes-tut.pdf), Zoubin Ghahramni (http://hunch.net/~coms-4771/zoubin_ghahramani_bayesian_learning.pdf),
More information19 : Slice Sampling and HMC
10-708: Probabilistic Graphical Models 10-708, Spring 2018 19 : Slice Sampling and HMC Lecturer: Kayhan Batmanghelich Scribes: Boxiang Lyu 1 MCMC (Auxiliary Variables Methods) In inference, we are often
More information1. Introduction. Hang Qian 1 Iowa State University
Users Guide to the VARDAS Package Hang Qian 1 Iowa State University 1. Introduction The Vector Autoregression (VAR) model is widely used in macroeconomics. However, macroeconomic data are not always observed
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 11 Project
More informationPart 7: Hierarchical Modeling
Part 7: Hierarchical Modeling!1 Nested data It is common for data to be nested: i.e., observations on subjects are organized by a hierarchy Such data are often called hierarchical or multilevel For example,
More informationAdvanced Introduction to Machine Learning
10-715 Advanced Introduction to Machine Learning Homework 3 Due Nov 12, 10.30 am Rules 1. Homework is due on the due date at 10.30 am. Please hand over your homework at the beginning of class. Please see
More informationModeling and Interpolation of Non-Gaussian Spatial Data: A Comparative Study
Modeling and Interpolation of Non-Gaussian Spatial Data: A Comparative Study Gunter Spöck, Hannes Kazianka, Jürgen Pilz Department of Statistics, University of Klagenfurt, Austria hannes.kazianka@uni-klu.ac.at
More informationStat260: Bayesian Modeling and Inference Lecture Date: February 10th, Jeffreys priors. exp 1 ) p 2
Stat260: Bayesian Modeling and Inference Lecture Date: February 10th, 2010 Jeffreys priors Lecturer: Michael I. Jordan Scribe: Timothy Hunter 1 Priors for the multivariate Gaussian Consider a multivariate
More informationStatistical Machine Learning Lecture 8: Markov Chain Monte Carlo Sampling
1 / 27 Statistical Machine Learning Lecture 8: Markov Chain Monte Carlo Sampling Melih Kandemir Özyeğin University, İstanbul, Turkey 2 / 27 Monte Carlo Integration The big question : Evaluate E p(z) [f(z)]
More informationeqr094: Hierarchical MCMC for Bayesian System Reliability
eqr094: Hierarchical MCMC for Bayesian System Reliability Alyson G. Wilson Statistical Sciences Group, Los Alamos National Laboratory P.O. Box 1663, MS F600 Los Alamos, NM 87545 USA Phone: 505-667-9167
More informationComputer Vision Group Prof. Daniel Cremers. 11. Sampling Methods
Prof. Daniel Cremers 11. Sampling Methods Sampling Methods Sampling Methods are widely used in Computer Science as an approximation of a deterministic algorithm to represent uncertainty without a parametric
More informationBiostat 2065 Analysis of Incomplete Data
Biostat 2065 Analysis of Incomplete Data Gong Tang Dept of Biostatistics University of Pittsburgh October 20, 2005 1. Large-sample inference based on ML Let θ is the MLE, then the large-sample theory implies
More informationBayesian Multivariate Logistic Regression
Bayesian Multivariate Logistic Regression Sean M. O Brien and David B. Dunson Biostatistics Branch National Institute of Environmental Health Sciences Research Triangle Park, NC 1 Goals Brief review of
More informationDIC, AIC, BIC, PPL, MSPE Residuals Predictive residuals
DIC, AIC, BIC, PPL, MSPE Residuals Predictive residuals Overall Measures of GOF Deviance: this measures the overall likelihood of the model given a parameter vector D( θ) = 2 log L( θ) This can be averaged
More informationLinear Regression. CSL603 - Fall 2017 Narayanan C Krishnan
Linear Regression CSL603 - Fall 2017 Narayanan C Krishnan ckn@iitrpr.ac.in Outline Univariate regression Multivariate regression Probabilistic view of regression Loss functions Bias-Variance analysis Regularization
More informationLecture 16: Mixtures of Generalized Linear Models
Lecture 16: Mixtures of Generalized Linear Models October 26, 2006 Setting Outline Often, a single GLM may be insufficiently flexible to characterize the data Setting Often, a single GLM may be insufficiently
More informationLinear Regression. CSL465/603 - Fall 2016 Narayanan C Krishnan
Linear Regression CSL465/603 - Fall 2016 Narayanan C Krishnan ckn@iitrpr.ac.in Outline Univariate regression Multivariate regression Probabilistic view of regression Loss functions Bias-Variance analysis
More informationMonte Carlo Lecture Notes II, Jonathan Goodman. Courant Institute of Mathematical Sciences, NYU. January 29, 1997
Monte Carlo Lecture Notes II, Jonathan Goodman Courant Institute of Mathematical Sciences, NYU January 29, 1997 1 Introduction to Direct Sampling We think of the computer \random number generator" as an
More informationStat 451 Lecture Notes Markov Chain Monte Carlo. Ryan Martin UIC
Stat 451 Lecture Notes 07 12 Markov Chain Monte Carlo Ryan Martin UIC www.math.uic.edu/~rgmartin 1 Based on Chapters 8 9 in Givens & Hoeting, Chapters 25 27 in Lange 2 Updated: April 4, 2016 1 / 42 Outline
More information