Bayesian inference for factor scores


Murray Aitkin and Irit Aitkin
School of Mathematics and Statistics, University of Newcastle, UK
October 2003

Abstract

Bayesian inference for the parameters of the factor model follows directly from the likelihood and the prior distributions for the model parameters. Inference about the factor scores themselves is more complex, but can be accommodated in the complete-data form of the model using Markov chain Monte Carlo methods. This approach has interesting connections to the EM algorithm approach to maximum likelihood estimation, and it casts light on the controversy over factor score estimation and factor indeterminacy.

1 Introduction

Bayesian methods are rapidly becoming more popular with the greatly increased power of Markov chain Monte Carlo (MCMC) methods for inference in complex models. Such features as missing or incomplete data, latent variables and non-conjugate prior distributions can be handled in a unified way. Bayesian methods solve inferential problems that are difficult to deal with in frequentist (repeated-sampling) theory: the inadequacy of asymptotic theory in small samples and the difficulties of second-order asymptotics; the difficulties of maximum likelihood methods with complex models and partial or incomplete data. The Bayesian solution of these problems, as for all models, requires a full prior specification for the model parameters and all other unobserved variables. Bayesian analysis of the factor model has been treated in considerable generality by Press and Shigemasu (1989, 1997), who give a good coverage of earlier work; in the context of this volume, Bayesian methods have been used for the problem of Heywood cases by Martin and McDonald (1975). A book by Rowe (2002) gives a comprehensive Bayesian treatment of the factor model. Data augmentation (DA) methods, including MCMC methods, are discussed in detail

in Tanner (1996), but these have apparently not been applied to the factor model apart from the maximum likelihood EM algorithm approach of Rubin and Thayer (1982). In this chapter we describe the fully Bayesian DA approach.

2 The single-factor model

We adopt the standard notation of upper-case letters for random variables and lower-case letters for their observed values. For the single common-factor model with p test variables Y and a single unobserved factor X,

$$Y \mid X = x \sim N(\mu + \lambda x, \Psi), \qquad X \sim N(0, 1),$$

and the marginal distribution of Y is

$$Y \sim N(\mu, \Sigma), \qquad \Sigma = \lambda\lambda' + \Psi,$$

where μ is a length-p column vector of means, λ is a length-p column vector of factor loadings, and Ψ = diag(ψ₁², ..., ψ_p²) is a p × p diagonal matrix of specific variances. We restrict X to be standard normal because of the unidentifiability of its mean and variance parameters. It follows immediately that the maximum likelihood estimate (MLE) of μ is ȳ. Maximum likelihood methods for the estimation of λ and Ψ from data y_i (i = 1, ..., n), together with large-sample standard errors from the information matrix, are implemented in many packages, and will not concern us here apart from the EM algorithm approach of Rubin and Thayer (1982). In this approach we regard the unobserved factor variables X_i as missing data; in the complete-data model in which the x_i are counterfactually observed, the complete-data log-likelihood is, omitting constants,

$$\ell = \log L(\mu, \lambda, \Psi) = -\frac{n}{2}\log|\Psi| - \frac{1}{2}\sum_{i=1}^n x_i^2 - \frac{1}{2}\sum_{i=1}^n (y_i - \mu - \lambda x_i)' \Psi^{-1} (y_i - \mu - \lambda x_i)$$

$$= -\frac{n}{2}\sum_{j=1}^p \log \psi_j^2 - \frac{1}{2}\sum_{i=1}^n x_i^2 - \frac{1}{2}\sum_{i=1}^n \sum_{j=1}^p (y_{ij} - \mu_j - \lambda_j x_i)^2/\psi_j^2,$$

which is equivalent to a sum of p separate log-likelihoods from the p regressions of Y_j on x with intercept μ_j, slope λ_j and variance ψ_j². The term Σ_i x_i² does not involve unknown parameters and can be omitted.
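The marginal covariance structure of the model can be checked directly by simulation. The following sketch uses numpy with hypothetical parameter values chosen purely for illustration (they are not the chapter's values):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameter values, for illustration only.
p, n = 4, 200_000
mu = np.zeros(p)                              # mean vector
lam = np.array([0.6, 0.8, 0.9, 0.2])          # factor loadings
psi = np.array([0.64, 0.36, 0.19, 0.96])      # specific variances (diagonal of Psi)

# Complete-data model: X ~ N(0, 1), then Y | X = x ~ N(mu + lam*x, Psi).
x = rng.standard_normal(n)
y = mu + np.outer(x, lam) + rng.standard_normal((n, p)) * np.sqrt(psi)

# The implied marginal covariance of Y is Sigma = lam lam' + Psi.
sigma = np.outer(lam, lam) + np.diag(psi)
max_err = np.abs(np.cov(y, rowvar=False) - sigma).max()
```

For large n the sample covariance of the simulated y reproduces Σ to within sampling error, confirming that the single factor induces all of the covariation between the variables.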

The sufficient statistics in these regressions involve the x_i and x_i²; in the E step of the algorithm these are replaced by their conditional expectations given the current estimates of the parameters. Standard calculations give the conditional distribution of X given Y = y and the parameters as

$$X \mid Y = y \sim N\big(\lambda' \Sigma^{-1}(y - \mu),\; 1 - \lambda' \Sigma^{-1} \lambda\big).$$

Here ρ² = λ′Σ⁻¹λ is the squared multiple correlation of the factor X with the variables Y, so the conditional variance of X given Y = y is 1 − ρ². In the E step of EM the unobserved x_i are replaced by

$$\tilde{x}_i = \lambda' \Sigma^{-1} (y_i - \mu)$$

and the unobserved x_i² are replaced by

$$\widetilde{x_i^2} = \tilde{x}_i^2 + 1 - \rho^2,$$

where the parameters are replaced by their current estimates. In the M step of EM new estimates of the parameters are obtained from the p regressions by solving the score equations

$$\hat{\lambda}_j = \sum_i (y_{ij} - \bar{y}_j)\tilde{x}_i \Big/ \Big[\sum_i \tilde{x}_i^2 + n(1 - \rho^2)\Big],$$

$$\hat{\psi}_j^2 = \sum_i (y_{ij} - \bar{y}_j - \hat{\lambda}_j \tilde{x}_i)^2/n + (1 - \rho^2)\hat{\lambda}_j^2.$$

The EM algorithm may converge very slowly if the regression of Y on X is weak, and further numerical work is needed for the information matrix and (large-sample) standard errors of the parameter estimates.

3 Bayesian analysis

Bayesian analysis of the factor model, as of any other model, requires prior distributions for the model parameters; the product of the joint prior and the likelihood gives the joint posterior distribution (after normalization), and any marginal posterior distribution of interest may be computed by integrating out the remaining parameters. Prior distributions may be diffuse, representing little information, or informative, representing real external information relevant to the current sample. Conjugate prior distributions are widely used where they exist; by setting the ("hyper-") parameters in these distributions at appropriate values, they can be made to represent a range of information from diffuse to precise. Since the factor model is essentially a set of conditionally independent linear regressions, diffuse priors are the same as for a regression model: flat on μ, λ and log ψ_j².
Conjugate priors are normal for μ and λ, and inverse gamma for ψ_j².
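The E and M steps described above can be sketched numerically. The following is a minimal illustration of the Rubin-Thayer EM scheme (our own sketch, not the authors' code), run on data simulated with hypothetical parameter values:

```python
import numpy as np

def em_factor(y, n_iter=500):
    """EM for the single-factor model: the E step imputes the conditional
    moments of X, the M step solves p independent regression score equations."""
    n, p = y.shape
    yc = y - y.mean(axis=0)               # mu-hat = y-bar is fixed throughout
    lam = np.full(p, 0.5)                 # crude starting values
    psi = yc.var(axis=0)
    for _ in range(n_iter):
        # E step: x-tilde_i = lam' Sigma^{-1}(y_i - y-bar), rho^2 = lam' Sigma^{-1} lam
        b = np.linalg.solve(np.outer(lam, lam) + np.diag(psi), lam)
        rho2 = lam @ b
        xt = yc @ b
        sum_x2 = xt @ xt + n * (1.0 - rho2)       # sum of E(X_i^2 | y_i)
        # M step: slope and residual variance of each regression of y_j on x
        lam = (yc.T @ xt) / sum_x2
        psi = ((yc - np.outer(xt, lam)) ** 2).sum(axis=0) / n + (1.0 - rho2) * lam ** 2
    return lam, psi

# Hypothetical example: recover the loadings from simulated data.
rng = np.random.default_rng(1)
lam0 = np.array([0.6, 0.8, 0.9, 0.2])
psi0 = np.array([0.64, 0.36, 0.19, 0.96])
x0 = rng.standard_normal(5000)
y = np.outer(x0, lam0) + rng.standard_normal((5000, 4)) * np.sqrt(psi0)
lam_hat, psi_hat = em_factor(y)
```

Note that the sign of λ is not identified (λ, x and -λ, -x give the same likelihood), so the estimates recover the loadings only up to a common sign flip.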

The mean μ is of no inferential interest, so it is convenient to integrate it out immediately from the posterior distribution. The multivariate normal likelihood can be written

$$L(\mu, \Sigma) = \frac{1}{|\Sigma|^{n/2}} \exp\Big[-\frac{1}{2}\sum_{i=1}^n (y_i - \mu)'\Sigma^{-1}(y_i - \mu)\Big]$$

$$= \frac{1}{|\Sigma|^{1/2}} \exp\Big[-\frac{n}{2}(\bar{y} - \mu)'\Sigma^{-1}(\bar{y} - \mu)\Big] \cdot \frac{1}{|\Sigma|^{(n-1)/2}} \exp\Big[-\frac{1}{2}\sum_{i=1}^n (y_i - \bar{y})'\Sigma^{-1}(y_i - \bar{y})\Big].$$

A flat prior on μ leaves this unchanged, and integrating out μ gives directly the marginal likelihood

$$M(\Sigma) = \frac{1}{|\Sigma|^{(n-1)/2}} \exp\Big[-\frac{n}{2}\,\mathrm{tr}\,S\Sigma^{-1}\Big],$$

where S = Σ_{i=1}^n (y_i − ȳ)(y_i − ȳ)′/n is the sample covariance matrix; as in frequentist theory, the analysis of the factor model may be based on this matrix. Because the structure Σ = Ψ + λλ′ does not lead to any simple form for the posterior distributions of the λ_j and ψ_j², it is simpler to approach the posterior distributions of these parameters indirectly, through the complete-data model. Since, conditional on the x_i, the regressions of y_j on x are independent with unrelated parameters, it follows immediately from standard Bayesian results for regression models that

$$(\mu_j, \lambda_j)' \mid x, \psi_j^2 \sim N\big((\hat{\mu}_j, \hat{\lambda}_j)',\; \psi_j^2 (X'X)^{-1}\big), \qquad (n-2)s_j^2/\psi_j^2 \mid x \sim \chi^2_{n-2},$$

where x = (x₁, ..., x_n)′, X = (1, x) is the n × 2 design matrix with

$$X'X = \begin{pmatrix} n & \sum_i x_i \\ \sum_i x_i & \sum_i x_i^2 \end{pmatrix},$$

and

$$\hat{\mu}_j = \bar{y}_j - \hat{\lambda}_j\bar{x}, \qquad \hat{\lambda}_j = S_{jx}/S_{xx}, \qquad (n-2)s_j^2 = S_{jj} - S_{jx}^2/S_{xx},$$

$$S_{jj} = \sum_i (y_{ij} - \bar{y}_j)^2, \qquad S_{jx} = \sum_i (y_{ij} - \bar{y}_j)x_i, \qquad S_{xx} = \sum_i (x_i - \bar{x})^2.$$

Since the individual λ_j given x are conditionally independent, the joint conditional distribution of the λ_j given x and the ψ_j² is multivariate normal with a

diagonal covariance matrix, so integrating out the ψ_j², the joint distribution of the λ_j given x is multivariate t, with the marginal distributions of the individual λ_j, given x, being

$$\frac{(\lambda_j - \hat{\lambda}_j)\sqrt{S_{xx}}}{s_j} \sim t_{n-2}.$$

We cannot proceed further analytically: integrating out x as well gives an intractable marginal distribution for the λ_j and ψ_j² because of the complex appearance of x in the conditional distributions.

3.1 Inference about the factor scores

One standard approach to factor score estimation in repeated-sampling theory is to use the conditional mean x̃_i as the estimate of x_i: the regression estimate of the factor score. This estimate requires the ML estimates of μ, λ and Ψ to be substituted for the true values, introducing uncertainty which is difficult to allow for; though the delta method may be used to find large-sample standard errors for nonlinear functions like λ′Σ⁻¹, it does not give a reliable representation of uncertainty in small to medium samples. A further difficulty is that the regression estimate is only the (conditional) mean of the factor score distribution: the conditional variance is ignored in this representation. This underlies the criticism of regression estimates by Guttman and others, discussed below. In Bayes theory, inference about X, like that about the model parameters, is based on its posterior distribution. The conditional distribution of X given the parameters is normal, as given in Section 2, but in the Bayesian analysis we have to integrate the parameters out of this conditional distribution (not substitute the MLEs for the unknowns), with respect to their conditional distribution given the data. Unfortunately, integrating out λ and Ψ from the conditional distribution of X given these parameters is intractable, though Press and Shigemasu (1997) showed, in a more general model than ours, that the marginal posterior distribution of X is asymptotically multivariate matrix T.
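The plug-in regression estimate and the conditional variance it suppresses are straightforward to compute once estimates of λ and Ψ are available. The values below are hypothetical stand-ins for the MLEs, chosen only to illustrate the calculation:

```python
import numpy as np

# Hypothetical stand-ins for the ML estimates of lambda, Psi and mu.
lam_hat = np.array([0.6, 0.8, 0.9, 0.2])
psi_hat = np.array([0.64, 0.36, 0.19, 0.96])
mu_hat = np.zeros(4)

sigma_hat = np.outer(lam_hat, lam_hat) + np.diag(psi_hat)
b = np.linalg.solve(sigma_hat, lam_hat)       # lambda' Sigma^{-1}
rho2 = lam_hat @ b                            # squared multiple correlation

y_obs = np.array([1.2, 0.8, 1.1, 0.3])        # one subject's (made-up) scores
x_reg = b @ (y_obs - mu_hat)                  # regression estimate E(X | y)
cond_var = 1.0 - rho2                         # conditional variance, suppressed by the point estimate
```

A useful identity here (from the Woodbury formula) is λ′Σ⁻¹ = λ′Ψ⁻¹/(1 + λ′Ψ⁻¹λ), which shows that 0 < ρ² < 1 whenever the specific variances are positive.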
The attraction of Bayesian methods is that they can give exact (to within simulation error) posterior distributions without asymptotic approximations. We now consider simulation methods to obtain these.

3.2 Data augmentation

The close parallel between the EM algorithm approach to maximum likelihood, using the complete-data model, and the Bayesian analysis of the same model can be turned to advantage using a simulation approach, called Data Augmentation (DA) by Tanner and Wong (1987). We augment the observed data by the unobserved factor x, using the same complete-data model as for the EM algorithm. Write θ = (μ, λ, Ψ) for the full vector of parameters. Then the conditional posterior distribution of θ given

x and y is π(θ | x, y), and the conditional distribution of x given θ and y is π(x | θ, y). Our object is to compute the marginal posterior distribution π(θ | y) of θ given y, and the predictive distribution π(x | y) of x. The data augmentation (DA) algorithm achieves this by cycling between the two conditionals, much as the EM algorithm cycles between the E and M steps. However, in DA we perform not expectations as in the E step, but full simulations from the conditional distributions (Tanner 1996), and convergence is in distribution, rather than to a function maximum. One full cycle of the DA algorithm consists of:

Imputation step: generate a sample of M values x^[1], ..., x^[M] from the current approximation to the predictive distribution π(x | y).

Posterior step: update the current approximation to π(θ | y) to be the mixture of conditional posteriors of θ, given y and the augmented data x^[1], ..., x^[M]:

$$\pi(\theta \mid y) = \frac{1}{M}\sum_{m=1}^M \pi(\theta \mid x^{[m]}, y).$$

Generate a sample of M values θ^[1], ..., θ^[M] from π(θ | y). For each θ^[m], generate a random value x^[m] of x from π(x | θ^[m], y).

We repeat these cycles until the posterior distributions of θ and x converge, or stabilise. To assess this stability we track summary statistics of the posterior distributions of the parameters; we illustrate below with the medians and quartiles of the model parameters. These cycles can be carried out with relatively small M, like M = 5, to save computing time. Once the posterior distributions have converged, M may be increased to a larger number to give the posterior distribution to high accuracy. We use kernel density estimation to give a smooth graphical picture of the posteriors, though the values themselves may be used to make any needed probability statements about the parameters.
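For the single-factor model the two conditional distributions that DA cycles between can be written down directly from the results of Section 3. The following sketch (our own illustration, assuming diffuse priors) implements each draw; with M = 1 the scheme reduces to a Gibbs sampler:

```python
import numpy as np

rng = np.random.default_rng(2)

def draw_theta_given_x(y, x):
    """Posterior step: draw (mu_j, lambda_j, psi_j^2), j = 1..p, from their
    conditional posteriors given the imputed factor scores x (diffuse priors)."""
    n, p = y.shape
    X = np.column_stack([np.ones(n), x])          # design matrix [1, x]
    XtX_inv = np.linalg.inv(X.T @ X)
    mu, lam, psi = np.empty(p), np.empty(p), np.empty(p)
    for j in range(p):
        beta_hat = XtX_inv @ (X.T @ y[:, j])
        rss = np.sum((y[:, j] - X @ beta_hat) ** 2)
        psi[j] = rss / rng.chisquare(n - 2)       # scaled inverse-chi-square draw
        mu[j], lam[j] = rng.multivariate_normal(beta_hat, psi[j] * XtX_inv)
    return mu, lam, psi

def draw_x_given_theta(y, mu, lam, psi):
    """Imputation step: x_i ~ N(lambda' Sigma^{-1}(y_i - mu), 1 - rho^2)."""
    b = np.linalg.solve(np.outer(lam, lam) + np.diag(psi), lam)
    rho2 = lam @ b
    return (y - mu) @ b + np.sqrt(1.0 - rho2) * rng.standard_normal(len(y))
```

Cycling these two draws and retaining the values after burn-in gives simulated values from the joint posterior of θ and x.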
This process can be substantially accelerated by starting from an approximate posterior for θ based on the ML estimates θ̂ and information matrix I of the parameters from an ML routine, though it can also start from a random starting point, as we show in the example. At convergence we have the full (marginal over x) posterior distribution of θ, and the (marginal over θ) posterior distribution of X = (X₁, ..., X_n)′, so the marginal posterior distribution of any individual factor score X_i follows immediately. An obvious, but fundamental, point is that the inference about X_i is its distribution, conditional on the observed responses y_ij. From the M simulated values of X_i, we could compute the posterior mean and variance, and these would be corrected for the underestimation of variability in the plug-in conditional mean and variance. But this is unnecessary, because we have the full

(simulated) distribution of X: the distribution of the simulated values represents the real information about X. That is, X is not a parameter which can be estimated by ML with a standard error, but a random variable.

4 An example

We illustrate the DA analysis with n observations from a four-variate example, with μ = (0, 0, 0, 0)′, λ = ( , , , )′ and diag Ψ = (.36, , , .96). We generate values x_i randomly from N(0, 1), and compute the data values

$$y_{ij} = \mu_j + \lambda_j x_i + e_{ij}, \qquad i = 1, \ldots, n; \; j = 1, \ldots, 4,$$

where the e_ij are randomly generated from N(0, ψ_j²). To illustrate the power and capabilities of the DA approach, we do not use an ML factor analysis package to get initial estimates of the parameters, but begin with a set of M random values x_im, m = 1, ..., M, generated from N(0, 1) for each observation i. For each m we fit the regression of each y_j on x_m, obtaining MLEs of the model parameters. We then draw, for each m and each j, random values μ_jm, λ_jm and ψ_jm² from the respective current conditional posterior distributions of these parameters given y and x. The values of all the parameters for each m are conceptually assigned mass 1/M in the discrete joint posterior distribution of all these parameters; the updated posterior is the unweighted mean of the M individual conditional posteriors. This completes one posterior step of the DA algorithm. To generate random parameter values from the current marginal posterior, we draw a random integer m* in the range (1, M), and select the corresponding parameter vector indexed m* from the above discrete posterior distribution. Given the parameter vector (μ^[m*], λ^[m*], ψ²^[m*]) we compute the posterior distribution of x given the y_ij, and draw one random vector x^[m*] from this distribution. We repeat this process of random integer drawing and random generation of x M times, obtaining x_im, m = 1, ..., M. This completes one full cycle of the DA algorithm.
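Putting the steps of this section together, one possible self-contained implementation of the M-imputation scheme is the following sketch (with made-up parameter values and diffuse priors; it is our own illustration, not the authors' STATA program):

```python
import numpy as np

rng = np.random.default_rng(3)

def da_cycle(y, xs):
    """One full cycle of the DA algorithm: a posterior step over the M current
    imputations, then M imputation draws using randomly selected parameter vectors."""
    n, p = y.shape
    M = len(xs)
    thetas = []
    for x in xs:                                   # posterior step: p regressions per imputation
        X = np.column_stack([np.ones(n), x])
        XtX_inv = np.linalg.inv(X.T @ X)
        mu, lam, psi = np.empty(p), np.empty(p), np.empty(p)
        for j in range(p):
            beta_hat = XtX_inv @ (X.T @ y[:, j])
            rss = np.sum((y[:, j] - X @ beta_hat) ** 2)
            psi[j] = rss / rng.chisquare(n - 2)
            mu[j], lam[j] = rng.multivariate_normal(beta_hat, psi[j] * XtX_inv)
        thetas.append((mu, lam, psi))
    new_xs = []
    for _ in range(M):                             # imputation step: random index m*, then x | theta, y
        mu, lam, psi = thetas[rng.integers(M)]
        b = np.linalg.solve(np.outer(lam, lam) + np.diag(psi), lam)
        new_xs.append((y - mu) @ b + np.sqrt(1.0 - lam @ b) * rng.standard_normal(n))
    return thetas, new_xs

# Hypothetical data, and random starting imputations as in the chapter's example.
lam0, psi0 = np.array([0.6, 0.8, 0.9, 0.2]), np.array([0.64, 0.36, 0.19, 0.96])
n, M = 1000, 20
y = np.outer(rng.standard_normal(n), lam0) + rng.standard_normal((n, 4)) * np.sqrt(psi0)
xs = [rng.standard_normal(n) for _ in range(M)]
for _ in range(100):
    thetas, xs = da_cycle(y, xs)
```

Because the likelihood is invariant to the sign flip (λ, x) → (-λ, -x), different imputations may settle in opposite-sign modes; the magnitudes of the loadings and the specific variances are unaffected.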
We show in Figure 1 the median and upper and lower quartiles of the posterior distributions of the factor loadings λ_j and specific variances ψ_j² over the iterations of the DA algorithm. The distributions of the intercepts μ_j are very stable around zero and we do not show them. The iterations together required about 5 hours of computing time on a Dell workstation. All programming was done in STATA 8. It is clear that convergence of the algorithm for most of the λ_j requires a modest number of iterations (this initial period is known as the burn-in period in the Bayesian literature), but many more iterations are needed for the variances ψ_j², especially for the smallest variance and the corresponding loading: as for the EM algorithm, convergence of a parameter value near a zero boundary is much slower. The rate

of convergence is parametrization-dependent, an important issue in large-scale Markov chains for complex models; convergence may be substantially improved by choosing near-orthogonal parametrizations. From the retained draws for each parameter we compute a kernel density using a Gaussian kernel, choosing the bandwidth to give a smooth density. The kernel densities for all the parameters are shown in Figure 2. The density for the smallest specific variance is shown on the log scale, as the kernel method does not restrict the density estimate to positive values of ψ². All posterior densities have the true parameter values within the 95% credible region, though some are near the edges. The densities of the loadings λ_j are slightly skewed and have slightly longer tails than the normal distribution; those for the μ_j are very close to normality, and those for the ψ_j² are quite skewed as expected, especially for small ψ_j². We show in Figure 3 the kernel posterior density for x₁ (solid curve), together with the empirical Bayes normal density N(λ̂′V̂⁻¹(y − μ̂), 1 − ρ̂²) (dashed curve) using the maximum likelihood estimates of the parameters from a standard factor analysis routine; these are given in Table 1.

Table 1: MLEs of the parameters μ̂_j, λ̂_j, ψ̂_j²

For reference, the true value of x₁ is .94. The much greater dispersion of the posterior density is quite marked, showing the importance of allowing correctly for uncertainty in the parameters. The figure also shows the 2.5% (−.537) and 97.5% points of the x₁ distribution, giving the 95% credible interval for x₁. The corresponding interval from the plug-in normal distribution is much shorter: its coverage is only 76% under the true posterior distribution of x₁. We remark again that preliminary estimates of the parameters are not required: it is quite striking that the DA algorithm converges to stable posterior distributions relatively quickly, given the well-known slow convergence of the EM algorithm in this model.
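Summaries like those in Figures 2 and 3 come straight from the retained draws. The following sketch uses simulated stand-in draws (not the chapter's actual posterior sample) to show the credible interval, a hand-rolled Gaussian kernel density estimate, and the posterior of a comparison between two factor scores:

```python
import numpy as np

rng = np.random.default_rng(4)

# Stand-in posterior draws for two factor scores (in practice, DA output).
x1 = -0.9 + 0.55 * rng.standard_normal(10_000)
x2 = 0.3 + 0.60 * rng.standard_normal(10_000)

# 95% equal-tailed credible interval for x_1 from the draws.
lo, hi = np.percentile(x1, [2.5, 97.5])

# Gaussian kernel density estimate on a grid (normal reference bandwidth).
h = 1.06 * x1.std() * len(x1) ** (-0.2)
grid = np.linspace(x1.min(), x1.max(), 200)
dens = np.exp(-0.5 * ((grid[:, None] - x1[None, :]) / h) ** 2).sum(axis=1) \
       / (len(x1) * h * np.sqrt(2 * np.pi))

# The posterior of any comparison, e.g. x_1 - x_2, is the distribution of
# that contrast over the draws.
prob_lower = (x1 - x2 < 0).mean()
```

Any probability statement about a factor score, or a function of several factor scores, is computed the same way: as a proportion or quantile of the simulated values.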
5 Relation to factor indeterminacy

The DA analysis casts light on the controversy over factor score estimation and factor indeterminacy (Guttman 1955, McDonald 1974). The standard practice at the time for factor score estimation was to use the mean of the conditional distribution of X | Y = y as a point estimate of x. The estimates of the parameters were substituted for their true values in this approach.

This substitution approach is well-documented in modern random effect models as the empirical Bayes approach: the posterior distribution (typically normal) of the random effects, depending on the unknown model parameters, is estimated by the same distribution with the unknown parameters replaced by their MLEs, called plug-in estimates. This distribution is used to make full distributional statements about the random effects, not just the mean. The distributional statements are deficient because the uncertainty in the MLEs is not allowed for, and so the true variance of the posterior distribution is underestimated (as in Figure 3). Guttman was however concerned with a different issue: the behavior of the regression estimate as a random variable in repeated sampling. He considered the sampling distribution of the regression estimate

$$\tilde{X} = \lambda' V^{-1}(Y - \mu),$$

where V = Σ is the marginal covariance matrix, averaged over the distribution of Y. Conditionally on Y = y,

$$X \mid Y = y \sim N\big(\lambda' V^{-1}(y - \mu),\; 1 - \rho^2\big),$$

but if we average the distribution of X̃ over Y, we have

$$\tilde{X} \sim N(0, \lambda' V^{-1} \lambda),$$

which is N(0, ρ²). So the unconditional distribution of X̃, averaged across varying sample values of Y, has zero mean but variance ρ², not 1. If regression estimates were really estimates of the factor scores, then it seemed axiomatic that they should have the right distribution: that of the true factor scores. They clearly failed this requirement. Guttman expressed this failure through a conceptual simulation experiment: given the true values of all the parameters, generate a random error term ε from N(0, 1 − ρ²) and add this to the regression estimate, giving a new estimate Z₁ = X̃ + ε. Then the unconditional distribution of Z₁ would be N(0, 1), like that of X. Now imagine two such error terms with opposite signs: given ε, we could equally well have observed −ε, and this could have been used to obtain Z₂ = X̃ − ε (with the same ε).
The correlation of Z₁ and Z₂, in repeated random generations, would be

$$r = \frac{\mathrm{cov}(Z_1, Z_2)}{\sqrt{\mathrm{var}(Z_1)\,\mathrm{var}(Z_2)}} = \mathrm{var}(\tilde{X}) - \mathrm{var}(\epsilon) = \rho^2 - (1 - \rho^2) = 2\rho^2 - 1,$$

which is negative for ρ² < .5, a high value for any regression model in psychology, let alone a factor model. Guttman argued that two equally plausible values of X which correlated negatively would cast doubt on the whole factor model, and he coined the term factor indeterminacy for this aspect of the factor model. Since the primary

role of the factor model was to make statements about individual unobserved abilities, Guttman concluded that the model could not be used in this way except for models with very high variable-factor score correlations. Factor analysis was based on a shaky assumption.

The Bayesian framework helps to explain why this criticism was overstated. The essential feature of Bayesian analysis is that the information about the model parameters and any other unobserved quantities is expressed through posterior distributions, conditional on the observed data. This applies to the unobserved factor variables, and provides a very natural conditional distribution for an individual subject's factor score. The regression estimate, the mean of this conditional distribution, is indeed an unsuitable summary representation of the factor score, as it suppresses the conditional variance, and more generally the whole conditional distribution, of the factor score. Guttman's criticism, viewed in this light, makes just this point, that the variance is being overlooked: conditional means cannot be used as a surrogate for the true values without some way of representing the conditional uncertainty. But this failure to represent correctly the uncertainty about factor scores does not cast doubt on the factor model, merely on the technical tools with which the uncertainty in factor scores is represented. One such representation would be to present the plug-in conditional variance as well as the plug-in conditional mean. But the Bayesian representation is more informative: it gives the full distribution of the uncertain factor score, and allows properly for the estimation of the parameters on which this distribution is based; the uncertainty in the factor loading and specific variance parameters is also correctly represented, in a much richer way than by the MLE and information matrix.
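Guttman's conceptual experiment is easy to reproduce by simulation. In this sketch ρ² = 0.4 is an arbitrary illustrative value below the .5 threshold at which the correlation turns negative:

```python
import numpy as np

rng = np.random.default_rng(5)

rho2 = 0.4                                        # hypothetical squared multiple correlation
n = 200_000
xt = np.sqrt(rho2) * rng.standard_normal(n)       # regression estimate: Xtilde ~ N(0, rho^2)
eps = np.sqrt(1 - rho2) * rng.standard_normal(n)  # completing error: N(0, 1 - rho^2)
z1, z2 = xt + eps, xt - eps                       # two equally plausible completions

# Both are N(0, 1) marginally, but their correlation is 2*rho^2 - 1,
# which is negative whenever rho^2 < 0.5.
r = np.corrcoef(z1, z2)[0, 1]
```

Here r comes out near 2(0.4) - 1 = -0.2, while each of z1 and z2 has unit variance, exactly the tension Guttman pointed to.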
The simulation approach is even more general than we have demonstrated: it can be applied to differences among individuals in ability, by simply computing the M values of any comparison x_i − x_i′ of interest. Such comparisons are widely used in small-area estimation and other empirical Bayes applications in multi-level models, where differences among regions, areas or centres are of importance. Such differences are always overstated by empirical Bayes methods because of the systematic underestimation of variability from the use of plug-in ML estimates. The full joint posterior distribution of the model parameters provides even more information, for which there is no frequentist equivalent. Figure 4 is a composite plot of each parameter against every other, for a random sub-sample of values drawn from the posterior (the sub-sampling is necessary for clarity). It is immediately clear that correlations are very high between the three smaller loadings, and that the factor loading and specific variance of one variable are highly negatively correlated. The correlation matrix of the parameters (from the full set of values), shown in Table 2, bears this out.

Table 2: Correlation coefficients of the parameters

6 Conclusion

The data augmentation algorithm, and more general Markov chain Monte Carlo methods, provide the Bayesian analysis of the factor model. No new issues arise in the general multiple common-factor case except for the rotational invariance problem: given a fixed covariance matrix for the factors, their posterior distribution can be simulated in the same way as for a single factor. The computing time required for the full Bayesian analysis is substantial, but the richness of information from the full joint posterior distribution more than compensates for the computational effort. MCMC packages like BUGS (Bayesian inference Using Gibbs Sampling) are widely available: we can look forward confidently to their future use with factor models, and other latent variable models, of much greater complexity than the simple model discussed here.

7 Acknowledgement

We have benefited from discussions with our colleague Darren Wilkinson, and from editorial suggestions from Albert Maydeu-Olivares.

8 References

Guttman, L. (1955) The determinacy of factor score matrices with implications for five other basic problems of common factor theory. British Journal of Statistical Psychology 8, 65-81.

McDonald, R.P. (1974) The measurement of factor indeterminacy. Psychometrika 39, 203-222.

Martin, J.K. and McDonald, R.P. (1975) Bayesian estimation in unrestricted factor analysis: a treatment for Heywood cases. Psychometrika 40, 505-517.

Press, S.J. and Shigemasu, K. (1989) Bayesian inference in factor analysis. In Gleser, L.J. et al. (eds.), Contributions to Probability and Statistics: Essays in Honor of Ingram Olkin. Springer-Verlag, New York.

Press, S.J. and Shigemasu, K. (1997) Bayesian inference in factor analysis - revised. Technical Report No. 43, Department of Statistics, University of

California, Riverside.

Rubin, D.B. and Thayer, D. (1982) EM algorithms for ML factor analysis. Psychometrika 47, 69-76.

Rowe, D.B. (2002) Multivariate Bayesian Statistics: Models for Source Separation and Signal Unmixing. CRC Press, Boca Raton.

Tanner, M.A. (1996) Tools for Statistical Inference: Methods for the Exploration of Posterior Distributions and Likelihood Functions. Springer, New York.

Tanner, M.A. and Wong, W.H. (1987) The calculation of posterior distributions by data augmentation. Journal of the American Statistical Association 82, 528-540.

Figure 1: Iteration history

Figure 2: Kernel densities

Figure 3: Posterior density of x₁ and empirical Bayes density

Figure 4: Scatter plots of all parameters


More information

Part 6: Multivariate Normal and Linear Models

Part 6: Multivariate Normal and Linear Models Part 6: Multivariate Normal and Linear Models 1 Multiple measurements Up until now all of our statistical models have been univariate models models for a single measurement on each member of a sample of

More information

Bayesian inference. Fredrik Ronquist and Peter Beerli. October 3, 2007

Bayesian inference. Fredrik Ronquist and Peter Beerli. October 3, 2007 Bayesian inference Fredrik Ronquist and Peter Beerli October 3, 2007 1 Introduction The last few decades has seen a growing interest in Bayesian inference, an alternative approach to statistical inference.

More information

Markov Chain Monte Carlo

Markov Chain Monte Carlo Markov Chain Monte Carlo Recall: To compute the expectation E ( h(y ) ) we use the approximation E(h(Y )) 1 n n h(y ) t=1 with Y (1),..., Y (n) h(y). Thus our aim is to sample Y (1),..., Y (n) from f(y).

More information

Introduction to Bayesian Statistics and Markov Chain Monte Carlo Estimation. EPSY 905: Multivariate Analysis Spring 2016 Lecture #10: April 6, 2016

Introduction to Bayesian Statistics and Markov Chain Monte Carlo Estimation. EPSY 905: Multivariate Analysis Spring 2016 Lecture #10: April 6, 2016 Introduction to Bayesian Statistics and Markov Chain Monte Carlo Estimation EPSY 905: Multivariate Analysis Spring 2016 Lecture #10: April 6, 2016 EPSY 905: Intro to Bayesian and MCMC Today s Class An

More information

PIRLS 2016 Achievement Scaling Methodology 1

PIRLS 2016 Achievement Scaling Methodology 1 CHAPTER 11 PIRLS 2016 Achievement Scaling Methodology 1 The PIRLS approach to scaling the achievement data, based on item response theory (IRT) scaling with marginal estimation, was developed originally

More information

CS281 Section 4: Factor Analysis and PCA

CS281 Section 4: Factor Analysis and PCA CS81 Section 4: Factor Analysis and PCA Scott Linderman At this point we have seen a variety of machine learning models, with a particular emphasis on models for supervised learning. In particular, we

More information

Markov Chain Monte Carlo methods

Markov Chain Monte Carlo methods Markov Chain Monte Carlo methods By Oleg Makhnin 1 Introduction a b c M = d e f g h i 0 f(x)dx 1.1 Motivation 1.1.1 Just here Supresses numbering 1.1.2 After this 1.2 Literature 2 Method 2.1 New math As

More information

Bayesian Inference for the Multivariate Normal

Bayesian Inference for the Multivariate Normal Bayesian Inference for the Multivariate Normal Will Penny Wellcome Trust Centre for Neuroimaging, University College, London WC1N 3BG, UK. November 28, 2014 Abstract Bayesian inference for the multivariate

More information

Density Estimation. Seungjin Choi

Density Estimation. Seungjin Choi Density Estimation Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr http://mlg.postech.ac.kr/

More information

Statistical Methods. Missing Data snijders/sm.htm. Tom A.B. Snijders. November, University of Oxford 1 / 23

Statistical Methods. Missing Data  snijders/sm.htm. Tom A.B. Snijders. November, University of Oxford 1 / 23 1 / 23 Statistical Methods Missing Data http://www.stats.ox.ac.uk/ snijders/sm.htm Tom A.B. Snijders University of Oxford November, 2011 2 / 23 Literature: Joseph L. Schafer and John W. Graham, Missing

More information

The Mixture Approach for Simulating New Families of Bivariate Distributions with Specified Correlations

The Mixture Approach for Simulating New Families of Bivariate Distributions with Specified Correlations The Mixture Approach for Simulating New Families of Bivariate Distributions with Specified Correlations John R. Michael, Significance, Inc. and William R. Schucany, Southern Methodist University The mixture

More information

Part 8: GLMs and Hierarchical LMs and GLMs

Part 8: GLMs and Hierarchical LMs and GLMs Part 8: GLMs and Hierarchical LMs and GLMs 1 Example: Song sparrow reproductive success Arcese et al., (1992) provide data on a sample from a population of 52 female song sparrows studied over the course

More information

Estimation of Operational Risk Capital Charge under Parameter Uncertainty

Estimation of Operational Risk Capital Charge under Parameter Uncertainty Estimation of Operational Risk Capital Charge under Parameter Uncertainty Pavel V. Shevchenko Principal Research Scientist, CSIRO Mathematical and Information Sciences, Sydney, Locked Bag 17, North Ryde,

More information

eqr094: Hierarchical MCMC for Bayesian System Reliability

eqr094: Hierarchical MCMC for Bayesian System Reliability eqr094: Hierarchical MCMC for Bayesian System Reliability Alyson G. Wilson Statistical Sciences Group, Los Alamos National Laboratory P.O. Box 1663, MS F600 Los Alamos, NM 87545 USA Phone: 505-667-9167

More information

Basics of Modern Missing Data Analysis

Basics of Modern Missing Data Analysis Basics of Modern Missing Data Analysis Kyle M. Lang Center for Research Methods and Data Analysis University of Kansas March 8, 2013 Topics to be Covered An introduction to the missing data problem Missing

More information

Bayesian Inference. Chapter 4: Regression and Hierarchical Models

Bayesian Inference. Chapter 4: Regression and Hierarchical Models Bayesian Inference Chapter 4: Regression and Hierarchical Models Conchi Ausín and Mike Wiper Department of Statistics Universidad Carlos III de Madrid Advanced Statistics and Data Mining Summer School

More information

Stable Limit Laws for Marginal Probabilities from MCMC Streams: Acceleration of Convergence

Stable Limit Laws for Marginal Probabilities from MCMC Streams: Acceleration of Convergence Stable Limit Laws for Marginal Probabilities from MCMC Streams: Acceleration of Convergence Robert L. Wolpert Institute of Statistics and Decision Sciences Duke University, Durham NC 778-5 - Revised April,

More information

Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features. Yangxin Huang

Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features. Yangxin Huang Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features Yangxin Huang Department of Epidemiology and Biostatistics, COPH, USF, Tampa, FL yhuang@health.usf.edu January

More information

Lecture Notes based on Koop (2003) Bayesian Econometrics

Lecture Notes based on Koop (2003) Bayesian Econometrics Lecture Notes based on Koop (2003) Bayesian Econometrics A.Colin Cameron University of California - Davis November 15, 2005 1. CH.1: Introduction The concepts below are the essential concepts used throughout

More information

Ridge regression. Patrick Breheny. February 8. Penalized regression Ridge regression Bayesian interpretation

Ridge regression. Patrick Breheny. February 8. Penalized regression Ridge regression Bayesian interpretation Patrick Breheny February 8 Patrick Breheny High-Dimensional Data Analysis (BIOS 7600) 1/27 Introduction Basic idea Standardization Large-scale testing is, of course, a big area and we could keep talking

More information

Principles of Bayesian Inference

Principles of Bayesian Inference Principles of Bayesian Inference Sudipto Banerjee University of Minnesota July 20th, 2008 1 Bayesian Principles Classical statistics: model parameters are fixed and unknown. A Bayesian thinks of parameters

More information

FREQUENTIST BEHAVIOR OF FORMAL BAYESIAN INFERENCE

FREQUENTIST BEHAVIOR OF FORMAL BAYESIAN INFERENCE FREQUENTIST BEHAVIOR OF FORMAL BAYESIAN INFERENCE Donald A. Pierce Oregon State Univ (Emeritus), RERF Hiroshima (Retired), Oregon Health Sciences Univ (Adjunct) Ruggero Bellio Univ of Udine For Perugia

More information

Empirical Validation of the Critical Thinking Assessment Test: A Bayesian CFA Approach

Empirical Validation of the Critical Thinking Assessment Test: A Bayesian CFA Approach Empirical Validation of the Critical Thinking Assessment Test: A Bayesian CFA Approach CHI HANG AU & ALLISON AMES, PH.D. 1 Acknowledgement Allison Ames, PhD Jeanne Horst, PhD 2 Overview Features of the

More information

Bayesian philosophy Bayesian computation Bayesian software. Bayesian Statistics. Petter Mostad. Chalmers. April 6, 2017

Bayesian philosophy Bayesian computation Bayesian software. Bayesian Statistics. Petter Mostad. Chalmers. April 6, 2017 Chalmers April 6, 2017 Bayesian philosophy Bayesian philosophy Bayesian statistics versus classical statistics: War or co-existence? Classical statistics: Models have variables and parameters; these are

More information

ABC methods for phase-type distributions with applications in insurance risk problems

ABC methods for phase-type distributions with applications in insurance risk problems ABC methods for phase-type with applications problems Concepcion Ausin, Department of Statistics, Universidad Carlos III de Madrid Joint work with: Pedro Galeano, Universidad Carlos III de Madrid Simon

More information

Supplement to A Hierarchical Approach for Fitting Curves to Response Time Measurements

Supplement to A Hierarchical Approach for Fitting Curves to Response Time Measurements Supplement to A Hierarchical Approach for Fitting Curves to Response Time Measurements Jeffrey N. Rouder Francis Tuerlinckx Paul L. Speckman Jun Lu & Pablo Gomez May 4 008 1 The Weibull regression model

More information

Marginal Specifications and a Gaussian Copula Estimation

Marginal Specifications and a Gaussian Copula Estimation Marginal Specifications and a Gaussian Copula Estimation Kazim Azam Abstract Multivariate analysis involving random variables of different type like count, continuous or mixture of both is frequently required

More information

Gibbs Sampling in Linear Models #2

Gibbs Sampling in Linear Models #2 Gibbs Sampling in Linear Models #2 Econ 690 Purdue University Outline 1 Linear Regression Model with a Changepoint Example with Temperature Data 2 The Seemingly Unrelated Regressions Model 3 Gibbs sampling

More information

Bayesian Methods in Multilevel Regression

Bayesian Methods in Multilevel Regression Bayesian Methods in Multilevel Regression Joop Hox MuLOG, 15 september 2000 mcmc What is Statistics?! Statistics is about uncertainty To err is human, to forgive divine, but to include errors in your design

More information

Bayesian SAE using Complex Survey Data Lecture 4A: Hierarchical Spatial Bayes Modeling

Bayesian SAE using Complex Survey Data Lecture 4A: Hierarchical Spatial Bayes Modeling Bayesian SAE using Complex Survey Data Lecture 4A: Hierarchical Spatial Bayes Modeling Jon Wakefield Departments of Statistics and Biostatistics University of Washington 1 / 37 Lecture Content Motivation

More information

Bayesian linear regression

Bayesian linear regression Bayesian linear regression Linear regression is the basis of most statistical modeling. The model is Y i = X T i β + ε i, where Y i is the continuous response X i = (X i1,..., X ip ) T is the corresponding

More information

Bayesian Analysis of Latent Variable Models using Mplus

Bayesian Analysis of Latent Variable Models using Mplus Bayesian Analysis of Latent Variable Models using Mplus Tihomir Asparouhov and Bengt Muthén Version 2 June 29, 2010 1 1 Introduction In this paper we describe some of the modeling possibilities that are

More information

Bayesian Inference. Chapter 4: Regression and Hierarchical Models

Bayesian Inference. Chapter 4: Regression and Hierarchical Models Bayesian Inference Chapter 4: Regression and Hierarchical Models Conchi Ausín and Mike Wiper Department of Statistics Universidad Carlos III de Madrid Master in Business Administration and Quantitative

More information

Multiple Imputation for Missing Data in Repeated Measurements Using MCMC and Copulas

Multiple Imputation for Missing Data in Repeated Measurements Using MCMC and Copulas Multiple Imputation for Missing Data in epeated Measurements Using MCMC and Copulas Lily Ingsrisawang and Duangporn Potawee Abstract This paper presents two imputation methods: Marov Chain Monte Carlo

More information

ASSESSING A VECTOR PARAMETER

ASSESSING A VECTOR PARAMETER SUMMARY ASSESSING A VECTOR PARAMETER By D.A.S. Fraser and N. Reid Department of Statistics, University of Toronto St. George Street, Toronto, Canada M5S 3G3 dfraser@utstat.toronto.edu Some key words. Ancillary;

More information

Rank Regression with Normal Residuals using the Gibbs Sampler

Rank Regression with Normal Residuals using the Gibbs Sampler Rank Regression with Normal Residuals using the Gibbs Sampler Stephen P Smith email: hucklebird@aol.com, 2018 Abstract Yu (2000) described the use of the Gibbs sampler to estimate regression parameters

More information

1 Data Arrays and Decompositions

1 Data Arrays and Decompositions 1 Data Arrays and Decompositions 1.1 Variance Matrices and Eigenstructure Consider a p p positive definite and symmetric matrix V - a model parameter or a sample variance matrix. The eigenstructure is

More information

Learning Gaussian Process Models from Uncertain Data

Learning Gaussian Process Models from Uncertain Data Learning Gaussian Process Models from Uncertain Data Patrick Dallaire, Camille Besse, and Brahim Chaib-draa DAMAS Laboratory, Computer Science & Software Engineering Department, Laval University, Canada

More information

Comparing Non-informative Priors for Estimation and Prediction in Spatial Models

Comparing Non-informative Priors for Estimation and Prediction in Spatial Models Environmentrics 00, 1 12 DOI: 10.1002/env.XXXX Comparing Non-informative Priors for Estimation and Prediction in Spatial Models Regina Wu a and Cari G. Kaufman a Summary: Fitting a Bayesian model to spatial

More information

σ(a) = a N (x; 0, 1 2 ) dx. σ(a) = Φ(a) =

σ(a) = a N (x; 0, 1 2 ) dx. σ(a) = Φ(a) = Until now we have always worked with likelihoods and prior distributions that were conjugate to each other, allowing the computation of the posterior distribution to be done in closed form. Unfortunately,

More information

Lecture 8: The Metropolis-Hastings Algorithm

Lecture 8: The Metropolis-Hastings Algorithm 30.10.2008 What we have seen last time: Gibbs sampler Key idea: Generate a Markov chain by updating the component of (X 1,..., X p ) in turn by drawing from the full conditionals: X (t) j Two drawbacks:

More information

Variational Principal Components

Variational Principal Components Variational Principal Components Christopher M. Bishop Microsoft Research 7 J. J. Thomson Avenue, Cambridge, CB3 0FB, U.K. cmbishop@microsoft.com http://research.microsoft.com/ cmbishop In Proceedings

More information

Bayesian inference for multivariate extreme value distributions

Bayesian inference for multivariate extreme value distributions Bayesian inference for multivariate extreme value distributions Sebastian Engelke Clément Dombry, Marco Oesting Toronto, Fields Institute, May 4th, 2016 Main motivation For a parametric model Z F θ of

More information

Alternative implementations of Monte Carlo EM algorithms for likelihood inferences

Alternative implementations of Monte Carlo EM algorithms for likelihood inferences Genet. Sel. Evol. 33 001) 443 45 443 INRA, EDP Sciences, 001 Alternative implementations of Monte Carlo EM algorithms for likelihood inferences Louis Alberto GARCÍA-CORTÉS a, Daniel SORENSEN b, Note a

More information

STA 294: Stochastic Processes & Bayesian Nonparametrics

STA 294: Stochastic Processes & Bayesian Nonparametrics MARKOV CHAINS AND CONVERGENCE CONCEPTS Markov chains are among the simplest stochastic processes, just one step beyond iid sequences of random variables. Traditionally they ve been used in modelling a

More information

Factor Analysis. Robert L. Wolpert Department of Statistical Science Duke University, Durham, NC, USA

Factor Analysis. Robert L. Wolpert Department of Statistical Science Duke University, Durham, NC, USA Factor Analysis Robert L. Wolpert Department of Statistical Science Duke University, Durham, NC, USA 1 Factor Models The multivariate regression model Y = XB +U expresses each row Y i R p as a linear combination

More information

Ronald Christensen. University of New Mexico. Albuquerque, New Mexico. Wesley Johnson. University of California, Irvine. Irvine, California

Ronald Christensen. University of New Mexico. Albuquerque, New Mexico. Wesley Johnson. University of California, Irvine. Irvine, California Texts in Statistical Science Bayesian Ideas and Data Analysis An Introduction for Scientists and Statisticians Ronald Christensen University of New Mexico Albuquerque, New Mexico Wesley Johnson University

More information

Online appendix to On the stability of the excess sensitivity of aggregate consumption growth in the US

Online appendix to On the stability of the excess sensitivity of aggregate consumption growth in the US Online appendix to On the stability of the excess sensitivity of aggregate consumption growth in the US Gerdie Everaert 1, Lorenzo Pozzi 2, and Ruben Schoonackers 3 1 Ghent University & SHERPPA 2 Erasmus

More information

Slice Sampling with Adaptive Multivariate Steps: The Shrinking-Rank Method

Slice Sampling with Adaptive Multivariate Steps: The Shrinking-Rank Method Slice Sampling with Adaptive Multivariate Steps: The Shrinking-Rank Method Madeleine B. Thompson Radford M. Neal Abstract The shrinking rank method is a variation of slice sampling that is efficient at

More information

Nonparametric Drift Estimation for Stochastic Differential Equations

Nonparametric Drift Estimation for Stochastic Differential Equations Nonparametric Drift Estimation for Stochastic Differential Equations Gareth Roberts 1 Department of Statistics University of Warwick Brazilian Bayesian meeting, March 2010 Joint work with O. Papaspiliopoulos,

More information

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout

More information

Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone Missing Data

Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone Missing Data Journal of Multivariate Analysis 78, 6282 (2001) doi:10.1006jmva.2000.1939, available online at http:www.idealibrary.com on Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone

More information

Fractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling

Fractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling Fractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling Jae-Kwang Kim 1 Iowa State University June 26, 2013 1 Joint work with Shu Yang Introduction 1 Introduction

More information

TAMS39 Lecture 10 Principal Component Analysis Factor Analysis

TAMS39 Lecture 10 Principal Component Analysis Factor Analysis TAMS39 Lecture 10 Principal Component Analysis Factor Analysis Martin Singull Department of Mathematics Mathematical Statistics Linköping University, Sweden Content - Lecture Principal component analysis

More information

Markov Chain Monte Carlo (MCMC)

Markov Chain Monte Carlo (MCMC) Markov Chain Monte Carlo (MCMC Dependent Sampling Suppose we wish to sample from a density π, and we can evaluate π as a function but have no means to directly generate a sample. Rejection sampling can

More information

Inferring biological dynamics Iterated filtering (IF)

Inferring biological dynamics Iterated filtering (IF) Inferring biological dynamics 101 3. Iterated filtering (IF) IF originated in 2006 [6]. For plug-and-play likelihood-based inference on POMP models, there are not many alternatives. Directly estimating

More information

STA 4273H: Sta-s-cal Machine Learning

STA 4273H: Sta-s-cal Machine Learning STA 4273H: Sta-s-cal Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 2 In our

More information

BAYESIAN METHODS FOR VARIABLE SELECTION WITH APPLICATIONS TO HIGH-DIMENSIONAL DATA

BAYESIAN METHODS FOR VARIABLE SELECTION WITH APPLICATIONS TO HIGH-DIMENSIONAL DATA BAYESIAN METHODS FOR VARIABLE SELECTION WITH APPLICATIONS TO HIGH-DIMENSIONAL DATA Intro: Course Outline and Brief Intro to Marina Vannucci Rice University, USA PASI-CIMAT 04/28-30/2010 Marina Vannucci

More information

Online Appendix to: Marijuana on Main Street? Estimating Demand in Markets with Limited Access

Online Appendix to: Marijuana on Main Street? Estimating Demand in Markets with Limited Access Online Appendix to: Marijuana on Main Street? Estating Demand in Markets with Lited Access By Liana Jacobi and Michelle Sovinsky This appendix provides details on the estation methodology for various speci

More information

Measurement Error and Linear Regression of Astronomical Data. Brandon Kelly Penn State Summer School in Astrostatistics, June 2007

Measurement Error and Linear Regression of Astronomical Data. Brandon Kelly Penn State Summer School in Astrostatistics, June 2007 Measurement Error and Linear Regression of Astronomical Data Brandon Kelly Penn State Summer School in Astrostatistics, June 2007 Classical Regression Model Collect n data points, denote i th pair as (η

More information

Modeling and Interpolation of Non-Gaussian Spatial Data: A Comparative Study

Modeling and Interpolation of Non-Gaussian Spatial Data: A Comparative Study Modeling and Interpolation of Non-Gaussian Spatial Data: A Comparative Study Gunter Spöck, Hannes Kazianka, Jürgen Pilz Department of Statistics, University of Klagenfurt, Austria hannes.kazianka@uni-klu.ac.at

More information

STA414/2104 Statistical Methods for Machine Learning II

STA414/2104 Statistical Methods for Machine Learning II STA414/2104 Statistical Methods for Machine Learning II Murat A. Erdogdu & David Duvenaud Department of Computer Science Department of Statistical Sciences Lecture 3 Slide credits: Russ Salakhutdinov Announcements

More information

TAKEHOME FINAL EXAM e iω e 2iω e iω e 2iω

TAKEHOME FINAL EXAM e iω e 2iω e iω e 2iω ECO 513 Spring 2015 TAKEHOME FINAL EXAM (1) Suppose the univariate stochastic process y is ARMA(2,2) of the following form: y t = 1.6974y t 1.9604y t 2 + ε t 1.6628ε t 1 +.9216ε t 2, (1) where ε is i.i.d.

More information

Quantile POD for Hit-Miss Data

Quantile POD for Hit-Miss Data Quantile POD for Hit-Miss Data Yew-Meng Koh a and William Q. Meeker a a Center for Nondestructive Evaluation, Department of Statistics, Iowa State niversity, Ames, Iowa 50010 Abstract. Probability of detection

More information

ST 740: Markov Chain Monte Carlo

ST 740: Markov Chain Monte Carlo ST 740: Markov Chain Monte Carlo Alyson Wilson Department of Statistics North Carolina State University October 14, 2012 A. Wilson (NCSU Stsatistics) MCMC October 14, 2012 1 / 20 Convergence Diagnostics:

More information

Principles of Bayesian Inference

Principles of Bayesian Inference Principles of Bayesian Inference Sudipto Banerjee and Andrew O. Finley 2 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. 2 Department of Forestry & Department

More information

Multistate Modeling and Applications

Multistate Modeling and Applications Multistate Modeling and Applications Yang Yang Department of Statistics University of Michigan, Ann Arbor IBM Research Graduate Student Workshop: Statistics for a Smarter Planet Yang Yang (UM, Ann Arbor)

More information

Lecture 16: Mixtures of Generalized Linear Models

Lecture 16: Mixtures of Generalized Linear Models Lecture 16: Mixtures of Generalized Linear Models October 26, 2006 Setting Outline Often, a single GLM may be insufficiently flexible to characterize the data Setting Often, a single GLM may be insufficiently

More information

Scaling up Bayesian Inference

Scaling up Bayesian Inference Scaling up Bayesian Inference David Dunson Departments of Statistical Science, Mathematics & ECE, Duke University May 1, 2017 Outline Motivation & background EP-MCMC amcmc Discussion Motivation & background

More information

Introduction to Machine Learning

Introduction to Machine Learning Introduction to Machine Learning Brown University CSCI 1950-F, Spring 2012 Prof. Erik Sudderth Lecture 25: Markov Chain Monte Carlo (MCMC) Course Review and Advanced Topics Many figures courtesy Kevin

More information

Part III. A Decision-Theoretic Approach and Bayesian testing

Part III. A Decision-Theoretic Approach and Bayesian testing Part III A Decision-Theoretic Approach and Bayesian testing 1 Chapter 10 Bayesian Inference as a Decision Problem The decision-theoretic framework starts with the following situation. We would like to

More information

Quantitative Trendspotting. Rex Yuxing Du and Wagner A. Kamakura. Web Appendix A Inferring and Projecting the Latent Dynamic Factors

Quantitative Trendspotting. Rex Yuxing Du and Wagner A. Kamakura. Web Appendix A Inferring and Projecting the Latent Dynamic Factors 1 Quantitative Trendspotting Rex Yuxing Du and Wagner A. Kamakura Web Appendix A Inferring and Projecting the Latent Dynamic Factors The procedure for inferring the latent state variables (i.e., [ ] ),

More information

Principles of Bayesian Inference

Principles of Bayesian Inference Principles of Bayesian Inference Sudipto Banerjee 1 and Andrew O. Finley 2 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. 2 Department of Forestry & Department

More information

Stat 5101 Lecture Notes

Stat 5101 Lecture Notes Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random

More information

Bayesian nonparametric estimation of finite population quantities in absence of design information on nonsampled units

Bayesian nonparametric estimation of finite population quantities in absence of design information on nonsampled units Bayesian nonparametric estimation of finite population quantities in absence of design information on nonsampled units Sahar Z Zangeneh Robert W. Keener Roderick J.A. Little Abstract In Probability proportional

More information

Bayesian Linear Regression

Bayesian Linear Regression Bayesian Linear Regression Sudipto Banerjee 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. September 15, 2010 1 Linear regression models: a Bayesian perspective

More information

Bayesian Inference in the Multivariate Probit Model

Bayesian Inference in the Multivariate Probit Model Bayesian Inference in the Multivariate Probit Model Estimation of the Correlation Matrix by Aline Tabet A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF Master of Science

More information