Factor Analysis and Indian Buffet Process

1 Factor Analysis and Indian Buffet Process. Lecture 4. Peixian Chen (pchenac@cse.ust.hk), Department of Computer Science and Engineering, Hong Kong University of Science and Technology. March 25, 2013.

2 Factor Analysis. A statistical method for dimensionality reduction. It represents correlated observed variables with a smaller number of latent variables referred to as factors, and is used either to explore the underlying dimensions of the data (Exploratory Factor Analysis) or to test specific hypotheses (Confirmatory Factor Analysis).

3 Outline. 1 Factor Models: Two Linear Models. 2 Exploratory Factor Analysis and Confirmatory Factor Analysis: Exploratory Factor Analysis. 3 Indian Buffet Process: Finite to infinite binary matrices; Equivalence Class; The Indian Buffet Process; Applications.

4 Two Linear Models. For component analysis: x_i = a_{i1} z_1 + a_{i2} z_2 + ... + a_{in} z_n, (i = 1, 2, ..., n), where x_i is the ith observed variable, z_1, ..., z_n are n uncorrelated components, and a_{i1}, ..., a_{in} are the correlations of the n components with the ith variable. Each component makes a maximum contribution to the sum of the variances of the n variables. All the components are required to reproduce the correlations among the variables, but only a few components may be retained in a practical problem if they account for a large percentage of the total variance.
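
The component-analysis model above is what principal component analysis computes: each observed variable is an exact linear combination of n uncorrelated components obtained from the eigendecomposition of the correlation matrix. A minimal numpy sketch, for illustration only; the toy data and variable names are my own assumptions, not from the lecture:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 5)) @ rng.standard_normal((5, 5))  # toy correlated data
X = (X - X.mean(0)) / X.std(0)                                   # standardize the variables

R = np.corrcoef(X, rowvar=False)          # correlation matrix of the observed variables
eigval, eigvec = np.linalg.eigh(R)        # eigendecomposition (ascending order)
order = np.argsort(eigval)[::-1]
eigval, eigvec = eigval[order], eigvec[:, order]

Z = X @ eigvec / np.sqrt(eigval)          # uncorrelated, unit-variance components
A = eigvec * np.sqrt(eigval)              # loadings a_ik = Corr(x_i, z_k)

print(np.allclose(X, Z @ A.T))            # all n components reproduce x_i exactly
print(np.cumsum(eigval) / eigval.sum())   # variance explained: keep only the first few
```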

5 Two Linear Models. For (classical) factor analysis: x_i = a_{i1} z_1 + a_{i2} z_2 + ... + a_{im} z_m + u_i, (i = 1, 2, ..., n), where x_i is the ith observed variable, z_1, ..., z_m are m common factors (normally m < n), u_i is a unique factor, and a_{i1}, ..., a_{im} are factor loadings. The classical factor analysis model is designed to maximally reproduce the correlations. Each of the n observed variables is described linearly in terms of the m common factors and one unique factor. The common factors account for the correlations among the variables, while each unique factor accounts for the remaining variance (including error).

6 Difference between Factors and Components. Components are real: in PCA the components are actual linear combinations of the variables, and the loadings are the correlations of these combinations with the observed variables. Common factors are hypothetical: they have to be estimated from the actual variables, and to obtain them mathematical procedures must be used that specify the factors in terms of common variance. Factor analysis separates common and unique variance; excluding the latter is beneficial because the unique variance is generally of no scientific interest. PCA tries to account for both.

7 Factor Model. FA assumes that there is a set of latent factors z_j which, acting in combination, generate the observed variables x. The goal of FA is to characterize the dependency among the observed variables by means of a smaller number of factors. We simplify the model as x_i − u_i = a_{i1} z_1 + a_{i2} z_2 + ... + a_{im} z_m + ε_i, (i = 1, 2, ..., n), or in vector-matrix form x − u = Az + ε, where x_i is the ith observed variable, z_1, ..., z_m are m common factors (normally m < n), ε_i is a noise/error term playing the role of the unique factor of the ith observed variable, a_{i1}, ..., a_{im} are factor loadings, and u_i = E[x_i].

8 Factor Model Assumptions. E[x] = u, Cov(x) = Σ. Without loss of generality, we assume u = 0; E[z_i] = 0, Var(z_i) = 1, and Cov(z_i, z_j) = 0 for i ≠ j (standardized, mutually uncorrelated factors); E[ε_i] = 0, Var(ε_i) = φ_i, Cov(ε_i, ε_j) = 0 for i ≠ j; Cov(ε, z) = 0.

9 Covariance Matrix. Given that Var(z_i) = 1 and Var(ε_i) = φ_i, Σ = Cov(x) = Cov(Az + ε) = Cov(Az) + Cov(ε) = A Cov(z) A^T + Φ = AA^T + Φ, where Φ = diag(φ_1, ..., φ_n).
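
A quick way to see this identity is to simulate from the model and compare the empirical covariance with AA^T + Φ. A minimal sketch; the dimensions, loadings, and noise variances below are arbitrary assumptions, not from the lecture:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, N = 6, 2, 200_000                 # observed variables, factors, samples

A = rng.standard_normal((n, m))         # factor loadings
phi = rng.uniform(0.2, 1.0, size=n)     # unique (noise) variances

Z = rng.standard_normal((N, m))         # standardized, uncorrelated factors
E = rng.standard_normal((N, n)) * np.sqrt(phi)
X = Z @ A.T + E                         # x = A z + eps  (with u = 0)

Sigma_model = A @ A.T + np.diag(phi)
Sigma_hat = np.cov(X, rowvar=False)
print(np.max(np.abs(Sigma_hat - Sigma_model)))   # small, and shrinks as N grows
```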

10 A Single Common Factor Example. This is a two-variable, one-common-factor model: x_1 = a_1 z + ε_1, x_2 = a_2 z + ε_2. Then Cov(z, x_1) = E[(z − E[z])(x_1 − E[x_1])] = E[z x_1] = E[z(a_1 z + ε_1)] = a_1 E[z^2] + E[z ε_1] = a_1 Var(z) + Cov(z, ε_1) = a_1 Var(z) = a_1. Similarly, Cov(z, x_2) = a_2. Extending this conclusion to the multi-factor model, the factor loadings A represent the covariances between the variables and the factors. Note that if all variables are standardized to have unit variance, the factor loadings equal the correlations between factors and variables when only a single common factor is involved, or when multiple common factors are orthogonal to each other.

11 A Single Common Factor Example. So Cov(x_1, x_2) = a_1 a_2: Cov(x_1, x_2) = E[(x_1 − E[x_1])(x_2 − E[x_2])] = E[(a_1 z + ε_1)(a_2 z + ε_2)] = E[a_1 a_2 z^2 + a_1 z ε_2 + a_2 z ε_1 + ε_1 ε_2] = a_1 a_2 Var(z) = a_1 a_2. Note that in models with more common factors, the covariance between two observed variables is more complex; for example, Cov(x_1, x_2) = a_{11} a_{21} + a_{12} a_{22} in a two-factor model.

12 Outline. 1 Factor Models: Two Linear Models. 2 Exploratory Factor Analysis and Confirmatory Factor Analysis: Exploratory Factor Analysis. 3 Indian Buffet Process: Finite to infinite binary matrices; Equivalence Class; The Indian Buffet Process; Applications.

13 Brief Introduction to EFA and CFA. Exploratory Factor Analysis (EFA): used to explore the dimensionality of a measurement instrument by finding the smallest number of interpretable factors needed to explain the correlations among a set of variables. It is exploratory in the sense that it places no structure on the linear relationships between the observed variables and the factors, but only specifies the number of latent variables. Confirmatory Factor Analysis (CFA): used to study how well a hypothesized factor model fits a new sample from the same population or a sample from a different population, and characterized by allowing restrictions on the parameters of the model.

14 Main Steps in Exploratory Factor Analysis. (1) Collect and explore data: choose relevant variables. (2) Extract initial factors (via principal components). (3) Rotate and interpret. (4) (a) Decide if changes need to be made; (b) if so, repeat (3). (5) Construct scales and use them in further analysis.

15 (1) Data Matrix. Factor analysis is totally dependent on correlations between variables, so a covariance matrix or correlation matrix should first be prepared if it is not already available. Kim and Mueller suggest that one may rely on a correlation matrix in EFA, because (1) many existing computer programs do not accept the covariance matrix as basic input data, and (2) almost all of the examples in the literature are based on correlation matrices.

16 (2) Extracting Initial Factors. The goal is to find the number of factors that can adequately explain the observed correlations among the observed variables. Typical approaches: the maximum likelihood method, the least-squares method, alpha factoring, image factoring, and principal components analysis. At this stage of the analysis one should not be concerned with whether the underlying factors are orthogonal or oblique; all the initial solutions are based on the orthogonal solution. Nor should one be too concerned with whether the extracted factors are interpretable or meaningful. The chief concern is whether a smaller number of factors can account for the covariation among a much larger number of variables.
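
As an illustration of the extraction step, scikit-learn's FactorAnalysis estimator fits the factor model by maximum likelihood. This is a hedged sketch rather than the procedure used in the lecture, and the toy data are invented:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(2)
# Toy data: 8 observed variables driven by 2 common factors plus unique noise.
A_true = rng.standard_normal((8, 2))
Z = rng.standard_normal((1000, 2))
X = Z @ A_true.T + 0.5 * rng.standard_normal((1000, 8))

fa = FactorAnalysis(n_components=2)     # maximum-likelihood extraction
fa.fit(X)

print(fa.components_.T)                 # estimated loadings A (variables x factors)
print(fa.noise_variance_)               # estimated unique variances phi_i
```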

17 (2) Extracting Initial Factors. An initial solution must provide either the number of common factors to be extracted or an objective criterion for choosing the number of factors.

18 (3) Rotation to a Terminal Solution. On the initial solution, certain restrictions are imposed: there are k common factors, the underlying factors are orthogonal to each other, the first factor accounts for as much variance as possible, the second factor accounts for as much of the residual variance left unexplained by the first factor as possible, and so on. After choosing the number of factors to retain, we want to spread variability more evenly among the factors, so we rotate the factors: redefine them such that loadings on the various factors tend to be either very high (near −1 or 1) or very low (near 0). Intuitively, this makes sharper distinctions in the meanings of the factors. Note that we rotate factor-analysis solutions, NOT principal components!
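
Varimax is the most widely used orthogonal rotation that pushes loadings toward 0 or ±1. The lecture does not name a specific criterion, so the sketch below is one standard implementation, not necessarily the one intended:

```python
import numpy as np

def varimax(loadings, gamma=1.0, max_iter=100, tol=1e-6):
    """Rotate a (variables x factors) loading matrix with the varimax criterion."""
    L = np.asarray(loadings, dtype=float)
    p, k = L.shape
    R = np.eye(k)                        # accumulated rotation matrix
    d_old = 0.0
    for _ in range(max_iter):
        Lr = L @ R
        # Gradient of the varimax objective, solved via SVD (Kaiser's algorithm).
        u, s, vt = np.linalg.svd(
            L.T @ (Lr**3 - (gamma / p) * Lr @ np.diag((Lr**2).sum(axis=0)))
        )
        R = u @ vt
        d_new = s.sum()
        if d_new < d_old * (1 + tol):    # converged: objective stopped increasing
            break
        d_old = d_new
    return L @ R

# Example: rotating the loadings estimated in the previous sketch would be
# rotated = varimax(fa.components_.T)
```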

19 (3) Rotation (continued). The unrotated solution is based on the idea that each factor tries to maximize the variance explained, conditional on the previous factors. What if we take that away? Then there is not one best solution: all rotated solutions fit equally well. The goal is simple structure, and most construct validation assumes a simple (typically rotated) structure. Rotation does NOT improve fit!

20 (3) Rotation (continued). [Figure-only slide; not transcribed.]

21 Outline. 1 Factor Models: Two Linear Models. 2 Exploratory Factor Analysis and Confirmatory Factor Analysis: Exploratory Factor Analysis. 3 Indian Buffet Process: Finite to infinite binary matrices; Equivalence Class; The Indian Buffet Process; Applications.

22 Finite to Infinite Binary Matrices. Clustering algorithms (e.g. using mixture models) represent data in terms of which cluster each data point belongs to. But clustering models are restrictive. Consider modelling people's movie preferences (the Netflix problem). A movie might be described using features such as "is science fiction", "has Charlton Heston", "was made in the US", "was made in the 1970s", "has apes in it", and so on; these features may be unobserved (latent). The number of potential latent features for describing a movie (or person, news story, image, gene, speech waveform, etc.) is unlimited.

23 Finite to Infinite Binary Matrices. We derive a distribution on infinite binary matrices by starting with a simple model that assumes K features, and then taking the limit as K → ∞. The resulting distribution corresponds to a simple generative process, which we term the Indian buffet process.

24 Infinite Binary Matrices. Let F = [f_1^T f_2^T ... f_N^T]^T be the matrix of latent feature values for all N objects. A prior on F can be defined by specifying priors for Z and V separately, with p(F) = P(Z) p(V): a binary matrix Z indicates which features are possessed by each object, with z_ik = 1 if object i has feature k and 0 otherwise, and a second matrix V indicates the value of each feature for each object. F can then be expressed as the elementwise (Hadamard) product of Z and V, F = Z ∘ V.

25 A Finite Feature Model. Probability model: π_k | α ~ Beta(α/K, 1), z_ik | π_k ~ Bernoulli(π_k). The z_ik form a binary N × K feature matrix Z. Each object possesses feature k with probability π_k, the features are generated independently, and each π_k can take any value in [0, 1].
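
A minimal sketch of this finite beta-Bernoulli model, sampling a feature matrix Z for N objects and K features; the parameter values are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
N, K, alpha = 10, 20, 2.0

pi = rng.beta(alpha / K, 1.0, size=K)          # pi_k | alpha ~ Beta(alpha/K, 1)
Z = (rng.random((N, K)) < pi).astype(int)      # z_ik | pi_k ~ Bernoulli(pi_k)

m = Z.sum(axis=0)                              # m_k: number of objects possessing feature k
print(Z.sum(), "non-zero entries; expected about", N * alpha / (1 + alpha / K))
```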

26 A Finite Feature Model. P(Z | π) = ∏_{k=1}^{K} ∏_{i=1}^{N} P(z_ik | π_k) = ∏_{k=1}^{K} π_k^{m_k} (1 − π_k)^{N − m_k}, where m_k = ∑_{i=1}^{N} z_ik is the number of objects possessing feature k.

27 A Finite Feature Model. Integrating out π, P(Z) = ∏_{k=1}^{K} ∫ (∏_{i=1}^{N} P(z_ik | π_k)) p(π_k) dπ_k = ∏_{k=1}^{K} B(m_k + α/K, N − m_k + 1) / B(α/K, 1) = ∏_{k=1}^{K} (α/K) Γ(m_k + α/K) Γ(N − m_k + 1) / Γ(N + 1 + α/K).   (1) The result follows from conjugacy between the binomial and beta distributions. This distribution is exchangeable, depending only on the counts m_k = ∑_{i=1}^{N} z_ik.
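
Equation (1) is easy to evaluate numerically in log space; a small sketch (the helper name is my own):

```python
import numpy as np
from scipy.special import gammaln

def log_prob_Z_finite(Z, alpha):
    """log P(Z) for the finite beta-Bernoulli model, Equation (1)."""
    N, K = Z.shape
    a = alpha / K
    m = Z.sum(axis=0)
    return np.sum(
        np.log(a)
        + gammaln(m + a)
        + gammaln(N - m + 1)
        - gammaln(N + 1 + a)
    )

# Exchangeability check: log_prob_Z_finite(Z, alpha) equals
# log_prob_Z_finite(Z[::-1], alpha) for any binary Z, since only the m_k matter.
```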

28 A Finite Feature Model. The expected number of non-zero entries in the matrix Z, E[1^T Z 1] = E[∑_{ik} z_ik], has an upper bound that is independent of K. Each column of Z is independent, so first compute E[1^T z_k] = ∑_{i=1}^{N} E(z_ik) = ∑_{i=1}^{N} ∫_0^1 π_k p(π_k) dπ_k = N (α/K) / (1 + α/K). Then E[1^T Z 1] = K E[1^T z_k] = Nα / (1 + α/K), with upper bound Nα. Even in the K → ∞ limit, the matrix is expected to have a finite number of non-zero entries.

29 Equivalence Class. Define the function lof(·) that maps binary matrices to left-ordered binary matrices: lof(Z) is obtained by ordering the columns of the binary matrix Z from left to right by the magnitude of the binary number each column expresses, taking the first row as the most significant bit.
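
The lof(·) map is straightforward to implement; a sketch (the helper and the example matrix are my own, not from the lecture):

```python
import numpy as np

def lof(Z):
    """Left-ordered form: sort columns by the binary number they encode,
    with the first row as the most significant bit, largest first."""
    Z = np.asarray(Z)
    N = Z.shape[0]
    weights = 2 ** np.arange(N - 1, -1, -1)     # row 0 is the most significant bit
    keys = weights @ Z                           # binary value of each column
    order = np.argsort(-keys, kind="stable")     # descending, stable for ties
    return Z[:, order]

Z = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1]])
print(lof(Z))
```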

30 History. The history of feature k is the full column (z_1k, z_2k, ..., z_Nk). K_h is the number of features possessing history h, K_0 is the number of features for which m_k = 0, and K_+ = ∑_{h=1}^{2^N − 1} K_h is the number of features for which m_k > 0, so K = K_0 + K_+.

31 Cardinality of [Z]. The cardinality of [Z] is the number of matrices that map to the same left-ordered form; it is reduced when Z contains identical columns. The cardinality of [Z] is K! / ∏_{h=0}^{2^N − 1} K_h!.

32 Taking the Infinite Limit. From Equation (1), P([Z]) = ∑_{Z ∈ [Z]} P(Z) = (K! / ∏_{h=0}^{2^N − 1} K_h!) ∏_{k=1}^{K} (α/K) Γ(m_k + α/K) Γ(N − m_k + 1) / Γ(N + 1 + α/K).   (2) We then divide the columns of Z into two subsets: m_k > 0 for k ≤ K_+ and m_k = 0 otherwise.

33 Taking the Infinite Limit. Then
∏_{k=1}^{K} (α/K) Γ(m_k + α/K) Γ(N − m_k + 1) / Γ(N + 1 + α/K)
= [ (α/K) Γ(α/K) Γ(N + 1) / Γ(N + 1 + α/K) ]^{K − K_+} ∏_{k=1}^{K_+} (α/K) Γ(m_k + α/K) Γ(N − m_k + 1) / Γ(N + 1 + α/K)
= [ (α/K) Γ(α/K) Γ(N + 1) / Γ(N + 1 + α/K) ]^{K} ∏_{k=1}^{K_+} Γ(m_k + α/K) Γ(N − m_k + 1) / ( Γ(α/K) Γ(N + 1) )
= [ N! / ∏_{j=1}^{N} (j + α/K) ]^{K} (α/K)^{K_+} ∏_{k=1}^{K_+} (N − m_k)! ∏_{j=1}^{m_k − 1} (j + α/K) / N!   (3)

34 Taking the Infinite Limit. Substituting Equation (3) into Equation (2) and rearranging terms, we take the limit K → ∞:
P([Z]) = lim_{K→∞} [ α^{K_+} / ∏_{h=1}^{2^N − 1} K_h! ] · [ K! / (K_0! K^{K_+}) ] · [ N! / ∏_{j=1}^{N} (j + α/K) ]^{K} · ∏_{k=1}^{K_+} (N − m_k)! ∏_{j=1}^{m_k − 1} (j + α/K) / N!
= [ α^{K_+} / ∏_{h=1}^{2^N − 1} K_h! ] exp{−α H_N} ∏_{k=1}^{K_+} (N − m_k)! (m_k − 1)! / N!   (4)
where H_N is the Nth harmonic number, H_N = ∑_{j=1}^{N} 1/j. Again, this distribution is exchangeable: neither the number of identical columns nor the column sums are affected by the ordering of the objects.
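
Equation (4) can be evaluated directly for any binary matrix; the sketch below computes log P([Z]) under this limiting form (the helper name is my own):

```python
import numpy as np
from scipy.special import gammaln
from collections import Counter

def log_prob_lof_ibp(Z, alpha):
    """log P([Z]) under the IBP, Equation (4)."""
    N = Z.shape[0]
    m = Z.sum(axis=0)
    m = m[m > 0]                                   # keep only non-empty columns
    K_plus = len(m)
    H_N = np.sum(1.0 / np.arange(1, N + 1))        # harmonic number H_N
    # K_h! terms: count how many non-empty columns share each history.
    hist_counts = Counter(tuple(col) for col in Z.T if col.any())
    log_Kh_fact = sum(gammaln(c + 1) for c in hist_counts.values())
    return (K_plus * np.log(alpha) - log_Kh_fact - alpha * H_N
            + np.sum(gammaln(N - m + 1) + gammaln(m) - gammaln(N + 1)))
```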

35 The Indian Buffet Process. N customers enter a restaurant one after another. Each customer encounters a buffet consisting of infinitely many dishes arranged in a line. The first customer starts at the left of the buffet and takes a serving from each dish, stopping after a Poisson(α) number of dishes. The ith customer moves along the buffet, sampling dishes in proportion to their popularity, serving himself dish k with probability m_k / i, where m_k is the number of previous customers who have sampled that dish. Having reached the end of all previously sampled dishes, the ith customer then tries a Poisson(α/i) number of new dishes.
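
The generative process translates directly into code. A minimal simulation sketch; the function and variable names are my own:

```python
import numpy as np

def sample_ibp(N, alpha, rng=None):
    """Simulate the Indian buffet process for N customers; returns a binary Z (N x K+)."""
    if rng is None:
        rng = np.random.default_rng()
    dishes = []                                   # m_k counts for each sampled dish
    rows = []
    for i in range(1, N + 1):
        row = []
        for k, m_k in enumerate(dishes):
            take = rng.random() < m_k / i         # existing dish k with prob m_k / i
            row.append(int(take))
            dishes[k] += int(take)
        k_new = rng.poisson(alpha / i)            # customer i tries Poisson(alpha/i) new dishes
        row.extend([1] * k_new)
        dishes.extend([1] * k_new)
        rows.append(row)
    Z = np.zeros((N, len(dishes)), dtype=int)
    for i, row in enumerate(rows):
        Z[i, :len(row)] = row
    return Z

Z = sample_ibp(N=10, alpha=2.0, rng=np.random.default_rng(4))
print(Z.shape, Z.sum(axis=0))                     # about alpha * H_N columns on average
```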

36 The Indian Buffet Process. Using K_1^{(i)} to denote the number of new dishes sampled by the ith customer, the probability of any particular matrix being produced by this process is P(Z) = [ α^{K_+} / ∏_{i=1}^{N} K_1^{(i)}! ] exp{−α H_N} ∏_{k=1}^{K_+} (N − m_k)! (m_k − 1)! / N!. This matrix is not in left-ordered form, and customers are not exchangeable under this distribution. ∏_{i=1}^{N} K_1^{(i)}! / ∏_{h=1}^{2^N − 1} K_h! matrices generated via this process map to the same left-ordered form; P([Z]) is obtained by multiplying P(Z) by this quantity.

37 Inference by Gibbs Sampling. Conditional distribution: P(z_ik = 1 | z_{−i,k}) = ∫_0^1 P(z_ik | π_k) p(π_k | z_{−i,k}) dπ_k = (m_{−i,k} + α/K) / (N + α/K). When K → ∞, this becomes P(z_ik = 1 | z_{−i,k}) = m_{−i,k} / N. Similarly, the number of new features associated with object i should be drawn from a Poisson(α/N) distribution.

38 Applications: Modelling Data. Latent variable model: let X be the N × D matrix of observed data and Z be the N × K matrix of binary latent features, with P(X, Z | α) = P(X | Z) P(Z | α). By combining the IBP with different likelihood functions we can get different kinds of models: models for graph structures (w/ Wood, Griffiths, 2006); models for protein complexes (w/ Chu, Wild, 2006); models for overlapping clusters (w/ Heller, 2007); models for choice behaviour (Görür, Jäkel & Rasmussen, 2006); models for users in collaborative filtering (w/ Meeds, Roweis, Neal, 2006); sparse latent factor models (w/ Knowles, 2007).

39 Applications: Posterior Inference in IBPs. Gibbs sampling: P(Z, α | X) ∝ P(X | Z) P(Z | α) P(α), and P(z_nk = 1 | Z_{−(nk)}, X, α) ∝ P(z_nk = 1 | Z_{−(nk)}, α) P(X | Z). If m_{−n,k} > 0, then P(z_nk = 1 | z_{−n,k}) = m_{−n,k} / N. For the infinitely many k such that m_{−n,k} = 0, use Metropolis steps with truncation to sample the number of new features for each object. If α has a Gamma prior then its posterior is also Gamma, so α can be Gibbs sampled. Conjugate sampler: assumes that P(X | Z) can be computed. Non-conjugate sampler: P(X | Z) = ∫ P(X | Z, θ) P(θ) dθ cannot be computed, which requires sampling the latent θ as well (cf. Neal 2000, non-conjugate DPM samplers). Slice sampler: handles the non-conjugate case, is not approximate, and has an adaptive truncation level using a stick-breaking construction of the IBP (Teh et al., 2007). Particle filter: (Wood & Griffiths, 2007). Accelerated Gibbs sampling: maintains a probability distribution over some of the variables (Doshi-Velez & Ghahramani, 2009). Variational inference: (Doshi-Velez, Miller, Van Gael & Teh, 2009).
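
A sketch of one Gibbs sweep over the existing features, combining the m_{−n,k}/N prior with a user-supplied log-likelihood log P(X | Z); the likelihood interface is my own assumption, and the new-feature (Poisson) step and removal of emptied columns are omitted for brevity:

```python
import numpy as np

def gibbs_sweep_existing(Z, log_lik, rng):
    """One Gibbs sweep over existing features of an IBP latent feature model.
    log_lik(Z) must return log P(X | Z) for the current binary matrix Z."""
    N, K = Z.shape
    for i in range(N):
        for k in range(K):
            m = Z[:, k].sum() - Z[i, k]          # m_{-i,k}: others possessing feature k
            if m == 0:
                continue                         # handled by the new-feature step (not shown)
            log_p = np.empty(2)
            for v in (0, 1):                     # score both settings of z_ik
                Z[i, k] = v
                prior = m / N if v == 1 else 1.0 - m / N
                log_p[v] = np.log(prior) + log_lik(Z)
            p1 = 1.0 / (1.0 + np.exp(log_p[0] - log_p[1]))   # normalized P(z_ik = 1 | rest)
            Z[i, k] = int(rng.random() < p1)
    return Z
```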

40 Applications: An Application of IBPs. A Non-Parametric Bayesian Method for Inferring Hidden Causes (Wood, Griffiths, Ghahramani, 2006): inferring stroke localization from patient symptoms (50 stroke patients, 56 symptoms/signs). The IBP models the graph structure connecting hidden causes to symptoms.

41 Applications: Infinite Sparse Latent Factor Models. Model: Y = G(Z ∘ X) + E, where Y is the data matrix, G is the factor loading matrix, Z ~ IBP(α, β) is a binary mask matrix, X contains heavy-tailed factors, ∘ denotes the elementwise product, and E is Gaussian noise. The IBP models the sparsity structure in the latent variables (w/ Knowles, 2007).
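
For intuition, here is a sketch that generates synthetic data from a model of this kind. All dimensions, the Student-t choice of heavy-tailed factors, and the finite beta-Bernoulli mask (standing in for the two-parameter IBP(α, β) prior) are my own assumptions:

```python
import numpy as np

rng = np.random.default_rng(5)
N, D, K, alpha = 100, 12, 30, 3.0

# Sparse binary mask; a finite beta-Bernoulli prior stands in for the IBP here.
pi = rng.beta(alpha / K, 1.0, size=K)
Z = (rng.random((N, K)) < pi).astype(float)

X = rng.standard_t(df=3, size=(N, K))         # heavy-tailed latent factors
G = rng.standard_normal((D, K))               # factor loading matrix
E = 0.1 * rng.standard_normal((N, D))         # Gaussian noise

Y = (Z * X) @ G.T + E                         # row-wise Y = G (Z o X) + E
print(Y.shape, "average active features per object:", Z.sum(axis=1).mean())
```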
