A hierarchical group ICA model for assessing covariate effects on brain functional networks


1 A hierarchical group ICA model for assessing covariate effects on brain functional networks. Ying Guo, joint work with Ran Shi. Department of Biostatistics and Bioinformatics, Emory University. 06/03/2013.

2 Outline
- Introduction to independent component analysis (ICA) for fMRI data
- A hierarchical group PICA model for modeling covariate effects on brain functional networks
- Estimation methods for the proposed group ICA model
- Simulation studies
- An fMRI data example
- Summary

3 Source Separation in fMRI Studies
Goal: decompose observed fMRI data as a linear combination of spatio-temporal processes of underlying source signals.

4 Single-subject ICA

Classic noise-free ICA:
$$Y = AS, \quad (1)$$
where
- $Y$ ($T \times V$) is the observed fMRI data;
- $A$ ($T \times q$) is the mixing matrix; each column represents the time series associated with an IC;
- $S$ ($q \times V$) contains the statistically independent spatial source signals in its rows. The distribution of the spatial source signals is non-Gaussian.

Probabilistic ICA (PICA):
$$Y = AS + E,$$
where $E$ ($T \times V$) is a noise term representing residual variability in the data that is not explained by the extracted ICs.
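The following minimal sketch illustrates the noise-free decomposition on simulated data using scikit-learn's FastICA; it is not the estimation method proposed in this talk, and all dimensions and variable names are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import FastICA

T, V, q = 200, 1000, 3            # time points, voxels, number of ICs
rng = np.random.default_rng(0)

# Simulate non-Gaussian spatial sources (rows of S) and a random mixing matrix A
S = rng.laplace(size=(q, V))      # heavy-tailed, hence non-Gaussian
A = rng.normal(size=(T, q))
Y = A @ S                         # noise-free model Y = AS

# FastICA expects samples x features; treat voxels as samples, time as features
ica = FastICA(n_components=q, random_state=0)
S_hat = ica.fit_transform(Y.T).T  # estimated spatial sources (q x V)
A_hat = ica.mixing_               # estimated mixing matrix (T x q)
```

The estimated sources are recovered only up to permutation and scaling, which is the usual ICA indeterminacy.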

5 Schematic plot for single-subject ICA: the data matrix $Y$ (time $1,\dots,T$ by voxels $1,\dots,V$) is decomposed as $A$ (one column of time points per IC $1,\dots,q$) times $S$ (one row of spatial values per IC $1,\dots,q$). The spatial sources are statistically independent: $f(s_1,\dots,s_q) = f(s_1)\cdots f(s_q)$.

6 Existing Group ICA Methods
- Temporal-concatenation group ICA (TC-GICA), e.g. GIFT (Calhoun et al., 2001)
- Tensor PICA model (Beckmann and Smith, 2005)
- A general group PICA model (Guo and Pagnoni, 2008; Guo, 2011)
These methods assume common IC spatial maps across subjects in the group ICA decomposition.
- A hierarchical group PICA (Guo and Tang, 2013) incorporates subject-specific random effects in the IC spatial maps when decomposing multi-subject fMRI data.

7 Modeling Covariate Effects on Brain Functional Networks
Brain functional networks may differ depending on subjects' clinical and demographic characteristics. For example, Greicius et al. (2007) reported increased default-mode network functional connectivity in subjects with major depression.
Greicius MD et al. (2007). Biol Psychiatry 62(5).

8 Existing Approaches for Assessing Covariate Effects in ICA
- Single-subject ICA approach (e.g. Greicius et al., 2007): run single-subject ICA on each subject; select the IC in each subject that most closely matches the functional network of interest; perform group analysis based on the subjects' best-matched IC images.
- Existing group ICA: estimate subject-specific spatial maps via back-reconstruction, dual regression, or model-based estimation; perform group analysis based on the subject-specific IC maps.

9 A Hierarchical Group PICA Regression Model

The first-level model:
$$Y_i(v) = A_i s_i(v) + \gamma_i^{(1)}(v),$$
The second-level model:
$$s_i(v) = s(v) + \beta(v)' x_i + \gamma_i^{(2)}(v),$$
where
- $Y_i(v)$ ($T \times 1$): the observed fMRI data from subject $i$ at voxel $v$;
- $s_i(v)$ ($q \times 1$): spatial source signals for subject $i$ at voxel $v$;
- $A_i$ ($T \times q$): mixing matrix for subject $i$;
- $\gamma_i^{(1)}(v)$: residual variability in the observed data, $\sim N(0, \Sigma^{(1)})$;
- $s(v)$ ($q \times 1$): statistically independent population spatial source signals;
- $x_i$ ($p \times 1$): $p$ covariates of subject $i$;
- $\beta(v)$ ($p \times q$): regression parameters of the $p$ covariates for the $q$ ICs;
- $\gamma_i^{(2)}(v)$: between-subject random variability, $\sim N(0, \Sigma^{(2)})$.
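As a quick way to see how the two levels fit together, the sketch below simulates data from the model; the source distributions and all dimensions are illustrative assumptions, not the settings used in the talk's simulations.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T, V, q, p = 10, 50, 200, 3, 2   # subjects, time points, voxels, ICs, covariates

s_pop = rng.laplace(size=(q, V))                  # population spatial maps s(v)
beta = rng.normal(scale=0.5, size=(p, q, V))      # covariate effects beta(v)
X = np.column_stack([rng.uniform(-1, 1, N),       # covariates x_i
                     rng.binomial(1, 0.5, N)])

sigma1, sigma2 = 0.5, 0.3                         # level-1 and level-2 noise SDs
Y = np.empty((N, T, V))
for i in range(N):
    A_i = rng.normal(size=(T, q))                 # subject-specific mixing matrix
    # Second level: s_i(v) = s(v) + beta(v)' x_i + gamma_i^(2)(v)
    s_i = s_pop + np.einsum('p,pqv->qv', X[i], beta) \
          + rng.normal(scale=sigma2, size=(q, V))
    # First level: Y_i(v) = A_i s_i(v) + gamma_i^(1)(v)
    Y[i] = A_i @ s_i + rng.normal(scale=sigma1, size=(T, V))
```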

10 Source Distribution Model

Based on the characteristics of fMRI signals, we model $s(v) = [s_1(v),\dots,s_q(v)]'$ with a mixture of Gaussian distributions:
$$f(s_l(v); \phi_l) = \sum_{j=1}^{m} \pi_{lj}\, g\big(s_l(v); \mu_{lj}, \sigma_{lj}^2\big),$$
where $g(\cdot)$ is the Gaussian pdf. Define $z_l(v)$ as the latent state variable for the $l$th IC at voxel $v$, with $p(z_l(v) = j) = \pi_{lj}$. Given $z_l(v)$, we have
$$p\big(s_l(v) \mid z_l(v) = j\big) = g\big(s_l(v); \mu_{lj}, \sigma_{lj}^2\big).$$
Let $z(v) = [z_1(v),\dots,z_q(v)]'$ be the $q \times 1$ latent state vector for the $q$ ICs at voxel $v$.
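A small sketch of this mixture-of-Gaussians source model for a single IC follows; the two-state setup (background vs. activated voxels) and all parameter values are assumptions for illustration.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical m = 2 states for IC l: background and activated voxels
pi_l = np.array([0.9, 0.1])    # state probabilities pi_{lj}
mu_l = np.array([0.0, 3.0])    # state means mu_{lj}
sd_l = np.array([1.0, 1.0])    # state SDs sigma_{lj}

def source_pdf(s):
    """Mixture density f(s_l(v); phi_l) = sum_j pi_{lj} g(s; mu_{lj}, sigma_{lj}^2)."""
    return sum(p * norm.pdf(s, m, sd) for p, m, sd in zip(pi_l, mu_l, sd_l))

rng = np.random.default_rng(2)
V = 1000
z = rng.choice(len(pi_l), size=V, p=pi_l)   # latent states z_l(v)
s = rng.normal(mu_l[z], sd_l[z])            # s_l(v) | z_l(v)=j ~ N(mu_{lj}, sigma_{lj}^2)
```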

11 Estimation: Complete-Data Log-Likelihood Function

$$\log \ell(\theta \mid Y, S_I, S, Z) = \sum_{v=1}^{V} \Bigg\{ \sum_{i=1}^{N} \Big[ \log g\big(Y_i(v); A_i s_i(v), \Sigma^{(1)}\big) + \log g\big(s_i(v); s(v) + \beta(v)' x_i, \Sigma^{(2)}\big) \Big] + \log g\big(s(v); \mu_{z(v)}, \Sigma^{(3)}_{z(v)}\big) + \sum_{l=1}^{q} \log \pi_{l z_l(v)} \Bigg\},$$

where $Y = \{Y_i(v)\}$, $S_I = \{s_i(v)\}$, $S = \{s(v)\}$, $Z = \{z(v)\}$, and $\theta = \{\{A_i\}, \Sigma^{(1)}, \Sigma^{(2)}, \phi\}$, with $\phi$ containing the parameters of the source distribution model.
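For concreteness, a direct (unoptimized) evaluation of this complete-data log-likelihood might look as follows; the array layouts, and the assumption that $\Sigma^{(3)}_{z(v)}$ is diagonal with entries $\sigma^2_{l, z_l(v)}$, are mine and only a sketch.

```python
import numpy as np
from scipy.stats import multivariate_normal as mvn

def complete_loglik(Y, X, Si, S, Z, A, beta, Sig1, Sig2, mu, sig2, pi):
    """Sketch of the complete-data log-likelihood.
    Shapes (assumed): Y (N,T,V); X (N,p); Si (N,q,V); S (q,V);
    Z (q,V) integer states; A (N,T,q); beta (p,q,V);
    Sig1 (T,T); Sig2 (q,q); mu, sig2, pi (q,m) per-IC mixture parameters."""
    N, T, V = Y.shape
    q = S.shape[0]
    idx = np.arange(q)
    ll = 0.0
    for v in range(V):
        for i in range(N):
            # First level: log g(Y_i(v); A_i s_i(v), Sigma^(1))
            ll += mvn.logpdf(Y[i, :, v], A[i] @ Si[i, :, v], Sig1)
            # Second level: log g(s_i(v); s(v) + beta(v)' x_i, Sigma^(2))
            ll += mvn.logpdf(Si[i, :, v], S[:, v] + beta[:, :, v].T @ X[i], Sig2)
        zv = Z[:, v]
        # Source model: log g(s(v); mu_{z(v)}, Sigma^(3)_{z(v)}), diagonal Sigma^(3) assumed
        ll += mvn.logpdf(S[:, v], mu[idx, zv], np.diag(sig2[idx, zv]))
        # Latent state probabilities: sum_l log pi_{l, z_l(v)}
        ll += np.log(pi[idx, zv]).sum()
    return ll
```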

12 An EM Algorithm

E-step: find the conditional expectation of the complete-data log-likelihood,
$$E_{S, S_I, Z \mid Y, \hat\theta^{(k)}}\big[\ell(\theta \mid Y, S, S_I, Z)\big] = Q_1(\theta \mid \hat\theta^{(k)}) + Q_2(\theta \mid \hat\theta^{(k)}) + Q_3(\theta \mid \hat\theta^{(k)}) + Q_4(\theta \mid \hat\theta^{(k)}).$$

13 The EM Algorithm, Cont.

$$Q_1(\theta \mid \hat\theta^{(k)}) = \sum_{v=1}^{V} \sum_{i=1}^{N} E_{S,S_I,Z \mid Y, \theta^{(k)}}\Big[\log g\big(Y_i(v); A_i s_i(v), \Sigma^{(1)}\big)\Big]$$
$$Q_2(\theta \mid \hat\theta^{(k)}) = \sum_{v=1}^{V} \sum_{i=1}^{N} E_{S,S_I,Z \mid Y, \theta^{(k)}}\Big[\log g\big(s_i(v); s(v) + \beta(v)' x_i, \Sigma^{(2)}\big)\Big]$$
$$Q_3(\theta \mid \hat\theta^{(k)}) = \sum_{v=1}^{V} E_{S,S_I,Z \mid Y, \theta^{(k)}}\Big[\log g\big(s(v); \mu_{z(v)}, \Sigma^{(3)}_{z(v)}\big)\Big]$$
$$Q_4(\theta \mid \hat\theta^{(k)}) = \sum_{v=1}^{V} \sum_{l=1}^{q} E_{S,S_I,Z \mid Y, \theta^{(k)}}\big[\log \pi_{l z_l(v)}\big]$$

14 The EM Algorithm, Cont.

Derive the conditional distributions needed for $Q(\theta \mid \hat\theta^{(k)})$: $p(s_i \mid y, \theta)$, $p(s \mid y, \theta)$, $p(s, s_i \mid y, \theta)$ and $p(z \mid y, \theta)$.
- We first derive $p(s_i \mid z, y, \theta)$, $p(s \mid z, y, \theta)$ and $p(s, s_i \mid z, y, \theta)$. Conditioning on $z(v)$, rewrite the source distribution model as the third-level model
$$s(v) = G_{z(v)} \mu + \gamma^{(3)}(v),$$
where $\mu = [\mu_{11}, \dots, \mu_{qm}]'$, $G_{z(v)}$ is a $q \times mq$ binary indicator matrix, and $\gamma^{(3)}(v) \sim MVN\big(0, \Sigma^{(3)}_{z(v)}\big)$.
- We then derive $p(z \mid y, \theta)$.
- Finally, obtain $p(s_i \mid y, \theta)$, $p(s \mid y, \theta)$, and $p(s, s_i \mid y, \theta)$.
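One plausible construction of the indicator matrix $G_{z(v)}$ is sketched below, assuming $\mu$ stacks the $m$ state means of IC 1 first, then IC 2, and so on; that ordering is an assumption.

```python
import numpy as np

def G(z, m):
    """Build the q x (m*q) binary indicator matrix G_{z(v)} so that
    G_{z(v)} @ mu picks mu_{l, z_l(v)} for each IC l from
    mu = [mu_11, ..., mu_1m, mu_21, ..., mu_qm]' (assumed ordering)."""
    q = len(z)
    Gz = np.zeros((q, m * q))
    for l, j in enumerate(z):
        Gz[l, l * m + j] = 1.0    # select state j of IC l
    return Gz

# Example: q = 3 ICs, m = 2 states each, z(v) = (0, 1, 0)
mu = np.array([0.0, 3.0, 0.0, 2.5, 0.0, 4.0])
print(G([0, 1, 0], m=2) @ mu)     # -> [0.  2.5 0. ]
```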

15 The EM Algorithm, Cont.

M-step: parameter updates for the first-level model:
$$\hat A_i^{(k+1)} = \Bigg[\sum_{v=1}^{V} Y_i(v)\, E\big(s_i(v)' \mid Y(v); \hat\theta^{(k)}\big)\Bigg] \Bigg[\sum_{v=1}^{V} E\big(s_i(v) s_i(v)' \mid Y(v); \hat\theta^{(k)}\big)\Bigg]^{-1}$$
$$\hat\Sigma^{(1)(k+1)} = \frac{1}{NV} \sum_{v=1}^{V} \sum_{i=1}^{N} \Big\{ Y_i(v) Y_i(v)' - 2\, Y_i(v)\, E\big(s_i(v)' \mid Y(v); \hat\theta^{(k)}\big) \hat A_i^{(k+1)\prime} + \hat A_i^{(k+1)} E\big(s_i(v) s_i(v)' \mid Y(v); \hat\theta^{(k)}\big) \hat A_i^{(k+1)\prime} \Big\}$$

16 The EM Algorithm, Cont.

For the second-level model, the covariate effects are updated by
$$\hat\beta(v)^{(k+1)} = \Bigg(\sum_{i=1}^{N} x_i x_i'\Bigg)^{-1} \sum_{i=1}^{N} x_i \Big[E\big(s_i(v) \mid Y(v); \hat\theta^{(k)}\big) - E\big(s(v) \mid Y(v); \hat\theta^{(k)}\big)\Big]',$$
while $\Sigma^{(2)} = \mathrm{diag}(\nu_1^2, \nu_2^2, \dots, \nu_q^2)$ is updated by
$$\hat\nu_l^{2(k+1)} = \frac{1}{NV} \sum_{v=1}^{V} \sum_{i=1}^{N} \Big\{ E\big(s_{il}(v)^2 \mid Y(v); \hat\theta^{(k)}\big) + E\big(s_l(v)^2 \mid Y(v); \hat\theta^{(k)}\big) - 2\, E\big(s_{il}(v) s_l(v) \mid Y(v); \hat\theta^{(k)}\big) + \hat\beta_l(v)^{(k+1)\prime} x_i x_i' \hat\beta_l(v)^{(k+1)} + 2 \Big(E\big(s_l(v) \mid Y(v); \hat\theta^{(k)}\big) - E\big(s_{il}(v) \mid Y(v); \hat\theta^{(k)}\big)\Big) x_i' \hat\beta_l(v)^{(k+1)} \Big\}. \quad (2)$$
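The $\beta(v)$ update is an ordinary least-squares regression of the subject-level conditional means on the covariates; a compact sketch follows, with array layouts assumed.

```python
import numpy as np

def update_beta(X, Esi, Es):
    """beta(v) update at one voxel (a sketch).
    X   : (N, p) covariates x_i
    Esi : (N, q) E[s_i(v) | Y(v)] for each subject i
    Es  : (q,)   E[s(v)   | Y(v)]
    Returns the (p, q) estimate of beta(v)."""
    # (sum_i x_i x_i')^{-1} sum_i x_i [E(s_i(v)|Y) - E(s(v)|Y)]'
    return np.linalg.solve(X.T @ X, X.T @ (Esi - Es))
```

np.linalg.solve is used instead of an explicit matrix inverse for numerical stability; Esi - Es broadcasts the population-level mean across subjects.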

17 The EM Algorithm, Cont.

Parameter updates for the source distribution model:
$$\hat\pi_{lj}^{(k+1)} = \frac{1}{V} \sum_{v=1}^{V} p\big(z_l(v) = j \mid Y(v); \hat\theta^{(k)}\big),$$
$$\hat\mu_{lj}^{(k+1)} = \frac{\sum_{v=1}^{V} p\big(z_l(v) = j \mid Y(v); \hat\theta^{(k)}\big)\, E\big(s_l(v) \mid z_l(v) = j, Y(v); \hat\theta^{(k)}\big)}{V \hat\pi_{lj}^{(k+1)}},$$
$$\hat\sigma_{lj}^{2(k+1)} = \frac{\sum_{v=1}^{V} p\big(z_l(v) = j \mid Y(v); \hat\theta^{(k)}\big)\, E\big(s_l(v)^2 \mid z_l(v) = j, Y(v); \hat\theta^{(k)}\big)}{V \hat\pi_{lj}^{(k+1)}} - \big(\hat\mu_{lj}^{(k+1)}\big)^2.$$
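In code, these three updates for one IC reduce to weighted averages of posterior quantities; a sketch under assumed array layouts:

```python
import numpy as np

def update_source_params(resp, Es, Es2):
    """M-step updates for the mixture source model of one IC l (a sketch).
    resp : (V, m) posterior state probabilities p(z_l(v)=j | Y(v))
    Es   : (V, m) conditional means   E[s_l(v)   | z_l(v)=j, Y(v)]
    Es2  : (V, m) conditional moments E[s_l(v)^2 | z_l(v)=j, Y(v)]"""
    V = resp.shape[0]
    pi_new = resp.mean(axis=0)                                       # pi_{lj}
    mu_new = (resp * Es).sum(axis=0) / (V * pi_new)                  # mu_{lj}
    sig2_new = (resp * Es2).sum(axis=0) / (V * pi_new) - mu_new**2   # sigma_{lj}^2
    return pi_new, mu_new, sig2_new
```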

18 Steps for the Model Estimation

Step 1. Start with initial parameter values $\hat\theta^{(0)} = \big(\{\hat A_i^{(0)}\}, \hat\Sigma^{(1,0)}, \hat\Sigma^{(2,0)}, \hat\phi^{(0)}\big)$ and $\hat\beta^{(0)}$.
Step 2. E-step: compute the conditional expectation function $Q(\theta \mid \hat\theta^{(k)})$.
Step 3. M-step: update the parameter estimates using the explicit solutions,
$$\hat\theta^{(k+1)} = \arg\max_{\theta} Q(\theta \mid \hat\theta^{(k)}).$$
Step 4. Iterate between Steps 2 and 3 until convergence.
Step 5. Estimate IC spatial maps based on $\hat\theta$ for specified covariate values $x$:
$$\hat s(v) = E\big(s(v) \mid Y, \hat\theta\big) + \hat\beta(v)' x.$$

19 Dimension Reduction and Whitening

Prior to ICA, dimension reduction and whitening are performed:
$$\tilde Y_i(v) = \big(\Lambda_{i,q} - \sigma_{i,q}^2 I_q\big)^{-1/2} U_{i,q}'\, Y_i(v), \quad (3)$$
where $U_{i,q}$ and $\Lambda_{i,q}$ contain the first $q$ eigenvectors and eigenvalues from the SVD of $[Y_i(1), \dots, Y_i(V)]_{T \times V}$.

The first-level model becomes
$$\tilde Y_i(v) = \tilde A_i s_i(v) + \tilde\gamma_i^{(1)}(v),$$
where $\tilde Y_i(v) = H_i Y_i(v)$, $\tilde A_i = H_i A_i$, $\tilde\gamma_i^{(1)}(v) = H_i \gamma_i^{(1)}(v)$, and $H_i = (\Lambda_{i,q} - \sigma_{i,q}^2 I_q)^{-1/2} U_{i,q}'$ is the transformation matrix.
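A per-subject implementation sketch of this whitening step follows; estimating $\sigma_{i,q}^2$ from the trailing eigenvalues is an assumption on my part.

```python
import numpy as np

def reduce_and_whiten(Yi, q):
    """Dimension reduction and whitening for one subject (a sketch of Eq. (3)).
    Yi : (T, V) fMRI data matrix; returns the (q, V) whitened data and H_i."""
    lam, U = np.linalg.eigh(Yi @ Yi.T)    # eigendecomposition of the T x T Gram matrix
    order = np.argsort(lam)[::-1]         # sort eigenvalues in decreasing order
    lam, U = lam[order], U[:, order]
    sigma2 = lam[q:].mean()               # residual variance from trailing eigenvalues
    # H_i = (Lambda_{i,q} - sigma_{i,q}^2 I_q)^(-1/2) U_{i,q}'
    H = np.diag((lam[:q] - sigma2) ** -0.5) @ U[:, :q].T
    return H @ Yi, H
```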

20 Simulation Studies

We compare the performance of our method (Covariate ICA) with an existing method (dual regression) under the following simulation settings:
- simulate fMRI data from $q = 3$ source signals and consider sample sizes of $n = 10, 20, 40$ subjects;
- for each IC: a 3D population spatial map and a time course of length $T = 200$;
- simulate two covariates $X = (X_1, X_2)$, with $X_1 \sim U(-1, 1)$ and $X_2 \sim \mathrm{Bernoulli}(0.5)$;
- generate multi-subject fMRI data with low and high between-subject variability in the spatial distributions of the source signals;
- simulate subject-specific time courses for each source signal.

21 Simulation Study: Group-Level IC Maps
[Figure: group-level IC maps; panels show the true group ICs and the estimates from Covariate ICA and dual regression.]

22 Simulation Study: Covariate Effect Maps for $X_1$
[Figure: $\beta(v)$ maps for $X_1$; panels show the truth and the estimates from Covariate ICA and dual regression.]

23 Simulation Study: Covariate Effect Maps for $X_2$
[Figure: $\beta(v)$ maps for $X_2$; panels show the truth and the estimates from Covariate ICA and dual regression.]

24 Simulation Studies

Table: Simulation results: population-level maps and covariate effects.

Btw-subj var        Population-level spatial maps, Corr(SD)    Covariate effects, MSE(SD)
                    Cov.ICA        Dual.Reg.                   Cov.ICA        Dual.Reg.
Low     N=10        (0.003)        0.953(0.017)                0.048(0.019)   0.154(0.055)
        N=20        (0.002)        0.955(0.002)                0.021(0.003)   0.127(0.044)
        N=40        (0.002)        0.952(0.003)                0.012(0.001)   0.111(0.030)

$\mathrm{MSE}(\hat\beta)$ is defined as $\frac{1}{V} E\big(\sum_{v=1}^{V} \|\hat\beta(v) - \beta(v)\|_F\big)$.

25 An Approximated EM Algorithm

The proposed exact EM algorithm:
- Pros: provides explicit solutions for the E-step and M-step.
- Cons: the exact estimation becomes computationally expensive when the number of ICs increases.
We propose an approximated EM algorithm for faster computation. It is derived based on the fact that fMRI source signals are generally sparse and well separated in their distributed patterns.

26 Simulation Studies: the Exact EM vs. the Approximated EM

                 Population-level spatial maps, Corr(SD)    Covariate effects, MSE(SD)
Num. of ICs      Exact EM       Approx EM                   Exact EM       Approx EM
q=               (0.003)        0.981(0.001)                0.048(0.020)   0.048(0.019)
q=               (0.006)        0.980(0.006)                0.069(0.024)   0.070(0.022)
q=               (0.019)        0.970(0.018)                0.097(0.035)   0.103(0.035)

Computation time (in minutes) was also compared between the exact EM and the approximated EM for each number of ICs.

27 A Data Example

A Zen meditation study:
- 24 subjects: Zen meditation group (n=12) and control group (n=12). The two groups were matched for gender, age, and education level.
- A series of 520 functional MRI images were acquired with a 3.0 Tesla Siemens Magnetom Trio scanner.
- Task: word or non-word items were presented visually in random order; subjects pressed a button with the left hand (index finger = yes, middle finger = no) and focused on their breathing at other time points.
- We fit the proposed ICA regression model with a covariate indicating group membership: $x = 1$ for meditators and $x = 0$ for controls.

28 Figure: Model-based estimation of subgroup spatial maps for functional networks. Panels show the task-related network and the default-mode network for the control and meditator groups, with the temporal correlation of each group's estimated time course with the task time series (0.86 for the meditator group).

29 Summary
- We propose a hierarchical group PICA regression model.
- The proposed model provides a formal statistical framework for modeling covariate effects in group ICA.
- Our model can provide model-based estimation of the spatially distributed patterns of brain functional networks based on subjects' demographic and clinical characteristics.
- We propose an exact EM algorithm and an approximated EM algorithm for model estimation.
- Simulation results show that our proposed model performs better than the existing method.

30 Acknowledgements
Giuseppe Pagnoni, PhD, Department of Biomedical Sciences and Technologies, University of Modena and Reggio Emilia, Modena, Italy.
Emory University URC grant and NIH grant R01-MH.

31 References

Beckmann, C.F. and Smith, S.M. (2005). Tensorial extensions of independent component analysis for multisubject FMRI analysis. NeuroImage 25.
Calhoun, V.D., Adali, T., Pearlson, G.D. and Pekar, J.J. (2001). A method for making group inferences from functional MRI data using independent component analysis. Human Brain Mapping 14.
Guo, Y. and Pagnoni, G. (2008). A unified framework for group independent component analysis for multi-subject fMRI data. NeuroImage 42.
Guo, Y. (2011). A general probabilistic model for group independent component analysis and its estimation methods. Biometrics 67.
Guo, Y. and Tang, L. A hierarchical model for probabilistic independent component analysis of multi-subject fMRI studies. Biometrics (accepted).
