DCM: Advanced topics. Klaas Enno Stephan. SPM Course Zurich 06 February 2015

Similar documents
Dynamic causal modeling for fmri

DCM for fmri Advanced topics. Klaas Enno Stephan

Dynamic Causal Modelling : Advanced Topics

Effective Connectivity & Dynamic Causal Modelling

Wellcome Trust Centre for Neuroimaging, UCL, UK.

Dynamic causal modeling for fmri

Anatomical Background of Dynamic Causal Modeling and Effective Connectivity Analyses

Dynamic Causal Modelling for fmri

Bayesian inference J. Daunizeau

Will Penny. SPM short course for M/EEG, London 2015

A prior distribution over model space p(m) (or hypothesis space ) can be updated to a posterior distribution after observing data y.

Dynamic Causal Modelling for EEG/MEG: principles J. Daunizeau

Bayesian inference J. Daunizeau

An introduction to Bayesian inference and model comparison J. Daunizeau

Dynamic Causal Models

Stochastic Dynamic Causal Modelling for resting-state fmri

Bayesian inference. Justin Chumbley ETH and UZH. (Thanks to Jean Denizeau for slides)

Will Penny. SPM short course for M/EEG, London 2013

Will Penny. DCM short course, Paris 2012

Dynamic Causal Modelling for EEG and MEG. Stefan Kiebel

Models of effective connectivity & Dynamic Causal Modelling (DCM)

Dynamic Causal Modelling for evoked responses J. Daunizeau

Dynamic Causal Modelling for EEG and MEG

Dynamic Modeling of Brain Activity

Will Penny. SPM for MEG/EEG, 15th May 2012

Model Comparison. Course on Bayesian Inference, WTCN, UCL, February Model Comparison. Bayes rule for models. Linear Models. AIC and BIC.

Overview of SPM. Overview. Making the group inferences we want. Non-sphericity Beyond Ordinary Least Squares. Model estimation A word on power

Unsupervised Learning

Bayesian Inference Course, WTCN, UCL, March 2013

Experimental design of fmri studies

Models of effective connectivity & Dynamic Causal Modelling (DCM)

The General Linear Model (GLM)

Dynamic causal models of neural system dynamics:current state and future extensions.

Experimental design of fmri studies

Group Analysis. Lexicon. Hierarchical models Mixed effect models Random effect (RFX) models Components of variance

Experimental design of fmri studies

9/12/17. Types of learning. Modeling data. Supervised learning: Classification. Supervised learning: Regression. Unsupervised learning: Clustering

PATTERN RECOGNITION AND MACHINE LEARNING

Experimental design of fmri studies & Resting-State fmri

Lecture 6: Graphical Models: Learning

Bayesian modeling of learning and decision-making in uncertain environments

Modelling temporal structure (in noise and signal)

Causal modeling of fmri: temporal precedence and spatial exploration

Modelling behavioural data

Comparing hemodynamic models with DCM

Comparing Families of Dynamic Causal Models

Experimental design of fmri studies

Mathematical Formulation of Our Example

Bayesian Inference. Chris Mathys Wellcome Trust Centre for Neuroimaging UCL. London SPM Course

NeuroImage. Nonlinear dynamic causal models for fmri

Lecture 6: Model Checking and Selection

Principles of DCM. Will Penny. 26th May Principles of DCM. Will Penny. Introduction. Differential Equations. Bayesian Estimation.

CSC2535: Computation in Neural Networks Lecture 7: Variational Bayesian Learning & Model Selection

Bayesian Inference. Sudhir Shankar Raman Translational Neuromodeling Unit, UZH & ETH. The Reverend Thomas Bayes ( )

The General Linear Model (GLM)

ECE521 week 3: 23/26 January 2017

Machine Learning Summer School

arxiv: v1 [cs.sy] 13 Feb 2018

Machine Learning! in just a few minutes. Jan Peters Gerhard Neumann

Variational Scoring of Graphical Model Structures

Dynamic causal models of neural system dynamics. current state and future extensions

Empirical Bayes for DCM: A Group Inversion Scheme

Cheng Soon Ong & Christian Walder. Canberra February June 2018

Variational Methods in Bayesian Deconvolution

Mixed effects and Group Modeling for fmri data

Will Penny. 21st April The Macroscopic Brain. Will Penny. Cortical Unit. Spectral Responses. Macroscopic Models. Steady-State Responses

Non-Parametric Bayes

Continuous Time Particle Filtering for fmri

MIXED EFFECTS MODELS FOR TIME SERIES

M/EEG source analysis

Group analysis. Jean Daunizeau Wellcome Trust Centre for Neuroimaging University College London. SPM Course Edinburgh, April 2010

Dynamic Data Modeling, Recognition, and Synthesis. Rui Zhao Thesis Defense Advisor: Professor Qiang Ji

Statistics 203: Introduction to Regression and Analysis of Variance Course review

Model Selection for Gaussian Processes

Cheng Soon Ong & Christian Walder. Canberra February June 2018

CS-E3210 Machine Learning: Basic Principles

HGF Workshop. Christoph Mathys. Wellcome Trust Centre for Neuroimaging at UCL, Max Planck UCL Centre for Computational Psychiatry and Ageing Research

Statistical Inference

Semi-analytical approximations to statistical moments. of sigmoid and softmax mappings of normal variables

Interpretable Latent Variable Models

Classification for High Dimensional Problems Using Bayesian Neural Networks and Dirichlet Diffusion Trees

Machine Learning Basics: Maximum Likelihood Estimation

Active and Semi-supervised Kernel Classification

Computer Vision Group Prof. Daniel Cremers. 3. Regression

The General Linear Model. Guillaume Flandin Wellcome Trust Centre for Neuroimaging University College London

COMP 551 Applied Machine Learning Lecture 20: Gaussian processes

An Implementation of Dynamic Causal Modelling

CS6220: DATA MINING TECHNIQUES

CS6220: DATA MINING TECHNIQUES

CS6220: DATA MINING TECHNIQUES

Gentle Introduction to Infinite Gaussian Mixture Modeling

Introduction to Machine Learning. Introduction to ML - TAU 2016/7 1

A graph contains a set of nodes (vertices) connected by links (edges or arcs)

Study Notes on the Latent Dirichlet Allocation

Bayesian Methods for Machine Learning

SRNDNA Model Fitting in RL Workshop

Bayesian Learning in Undirected Graphical Models

Outline Introduction OLS Design of experiments Regression. Metamodeling. ME598/494 Lecture. Max Yi Ren

ICML Scalable Bayesian Inference on Point processes. with Gaussian Processes. Yves-Laurent Kom Samo & Stephen Roberts

Probabilistic Models in Theoretical Neuroscience

Transcription:

DCM: Advanced topics Klaas Enno Stephan SPM Course Zurich 06 February 205

Overview DCM a generative model Evolution of DCM for fmri Bayesian model selection (BMS) Translational Neuromodeling

Generative model p( y, m) p( m) p( y, m). enforces mechanistic thinking: how could the data have been caused? 2. generate synthetic data (observations) by sampling from the prior can model explain certain phenomena at all? 3. inference about model structure: formal approach to disambiguating mechanisms p(y m) 4. inference about parameters p( y) 5. basis for predictions about interventions control theory

Bayesian system identification u(t) Design experimental inputs Neural dynamics dx dt f ( x, u, ) Define likelihood model Observer function y g( x, ) p( y, m) N( g( ), ( )) p(, m) N(, ) Specify priors Inference on model structure Inference on parameters p( y p( m) y, m) p( y, m) p( ) d p( y, m) p(, m) p( y m) Invert model Make inferences

Variational Bayes (VB) Idea: find an approximation q(θ) to the true posterior p θ y. Often done by assuming a particular form for q and then optimizing its sufficient statistics (fixed form VB). true posterior p θ y divergence KL q p best proxy q θ hypothesis class

Variational Bayes ln p(y) = KL[q p divergence 0 (unknown) + F q, y neg. free energy (easy to evaluate for a given q) ln p y KL[q p ln p y F q, y KL[q p F q, y is a functional wrt. the approximate posterior q θ. F q, y Maximizing F q, y is equivalent to: minimizing KL[q p tightening F q, y as a lower bound to the log model evidence When F q, y is maximized, q θ is our best estimate of the posterior. initialization convergence

Alternative optimisation schemes Global optimisation schemes for DCM: MCMC Gaussian processes Papers submitted in SPM soon

Overview DCM a generative model Evolution of DCM for fmri Bayesian model selection (BMS) Translational Neuromodeling

The evolution of DCM in SPM DCM is not one specific model, but a framework for Bayesian inversion of dynamic system models The implementation in SPM has been evolving over time, e.g. improvements of numerical routines (e.g., optimisation scheme) change in priors to cover new variants (e.g., stochastic DCMs) changes of hemodynamic model To enable replication of your results, you should ideally state which SPM version (release number) you are using when publishing papers.

Factorial structure of model specification Three dimensions of model specification: bilinear vs. nonlinear single-state vs. two-state (per region) deterministic vs. stochastic

y activity x (t) y activity x 2 (t) y activity x 3 (t) neuronal states BOLD y λ x hemodynamic model modulatory input u 2 (t) integration driving input u (t) The classical DCM: a deterministic, one-state, bilinear model t t Neural state equation endogenous connectivity modulation of connectivity direct inputs x ( j) ( A u B ) x Cu j x A x ( j) B u x C u j x x

bilinear DCM modulation non-linear DCM driving input driving input modulation Two-dimensional Taylor series (around x 0 =0, u 0 =0): 2 2 2 dx f f f f x f ( x, u ) f ( x0,0) x u ux...... 2 dt x u x u x 2 Bilinear state equation: Nonlinear state equation: dx dt A m i u B i ( i) x Cu dx dt m n ( i) A uib x i j j D ( j) x Cu

u 2 0.4 0.3 Neural population activity 0.2 0. 0 0 0 20 30 40 50 60 70 80 90 00 0.6 u x 3 0.4 0.2 0 0 0 20 30 40 50 60 70 80 90 00 0.3 0.2 0. x x 2 3 2 0 0 20 30 40 50 60 70 80 90 00 fmri signal change (%) 0 0 Nonlinear Dynamic Causal Model for fmri dx dt A m n ( i) uib i j x j D ( j) x Cu 4 3 2 0 0 0 20 30 40 50 60 70 80 90 00-0 0 20 30 40 50 60 70 80 90 00 3 2 Stephan et al. 2008, NeuroImage 0 0 0 20 30 40 50 60 70 80 90 00

u input Single-state DCM x Intrinsic (within-region) coupling Extrinsic (between-region) coupling N NN N N ij ij ij x x x ub A Cu x x Two-state DCM E x I N E N I E II NN IE NN EE NN EE NN EE N II IE EE N EI EE ij ij ij ij x x x x x ub A Cu x x 0 0 0 0 0 0 ) exp( I x I E x x Two-state DCM Marreiros et al. 2008, NeuroImage

Stochastic DCM dx dt v ( A u B ) x Cv u ( v) j ( j) ( x) all states are represented in generalised coordinates of motion j random state fluctuations w (x) account for endogenous fluctuations, have unknown precision and smoothness two hyperparameters fluctuations w (v) induce uncertainty about how inputs influence neuronal activity can be fitted to "resting state" data 0.5 0-0.5 Estimates of hidden causes and states (Generalised filtering) inputs or causes - V2-0 200 400 600 800 000 200 0. 0.05 0-0.05 hidden states - neuronal -0. 0 200 400 600 800 000 200.3.2. 0.9 hidden states - hemodynamic 0.8 0 200 400 600 800 000 200 2 0 predicted BOLD signal excitatory signal flow volume dhb observed predicted - Li et al. 20, NeuroImage -2-3 0 200 400 600 800 000 200 time (seconds)

Spectral DCM deterministic model that generates predicted crossed spectra in a distributed neuronal network or graph finds the effective connectivity among hidden neuronal states that best explains the observed functional connectivity among hemodynamic responses advantage: replaces an optimisation problem wrt. stochastic differential equations with a deterministic approach from linear systems theory computationally very efficient disadvantages: assumes stationarity Friston et al. 204, NeuroImage

cross-spectra = inverse Fourier transform of crosscorrelation cross-correlation = generalized form of correlation (at zero lag, this is the conventional measure of functional connectivity) Friston et al. 204, NeuroImage

Overview DCM a generative model Evolution of DCM for fmri Bayesian model selection (BMS) Translational Neuromodeling

Model comparison and selection Given competing hypotheses on structure & functional mechanisms of a system, which model is the best? Which model represents the best balance between model fit and model complexity? For which model m does p(y m) become maximal? Pitt & Miyung (2002) TICS

p(y m) Bayesian model selection (BMS) Model evidence: Gharamani, 2004 p( y m) p( y, m) p( m) d accounts for both accuracy and complexity of the model y all possible datasets allows for inference about model structure Various approximations, e.g.: - negative free energy, AIC, BIC McKay 992, Neural Comput. Penny et al. 2004a, NeuroImage

Approximations to the model evidence in DCM Logarithm is a monotonic function Maximizing log model evidence = Maximizing model evidence Log model evidence = balance between fit and complexity log p( y m) accuracy ( m) complexity( m) log p( y, m) complexity( m) No. of parameters Akaike Information Criterion: Bayesian Information Criterion: AIC BIC log p( y, m) log p( y, m) p p 2 log N No. of data points Penny et al. 2004a, NeuroImage

The (negative) free energy approximation F Neg. free energy is a lower bound on log model evidence: log p( y m) F KL q, p y, m ln p y m KL[q p F q, y Like AIC/BIC, F is an accuracy/complexity tradeoff: F log p( y, m) KL q, p m accuracy complexity

The complexity term in F In contrast to AIC & BIC, the complexity term of the negative free energy F accounts for parameter interdependencies. KL q( ), p( m) 2 ln C 2 ln C y 2 T C y y The complexity term of F is higher the more independent the prior parameters ( effective DFs) the more dependent the posterior parameters the more the posterior mean deviates from the prior mean

Bayes factors To compare two models, we could just compare their log evidences. But: the log evidence is just some number not very intuitive! A more intuitive interpretation of model comparisons is made possible by Bayes factors: B 2 p( y p( y Kass & Raftery classification: Kass & Raftery 995, J. Am. Stat. Assoc. m m 2 ) ) positive value, [0;[ B 2 p(m y) Evidence to 3 50-75% weak 3 to 20 75-95% positive 20 to 50 95-99% strong 50 99% Very strong

Fixed effects BMS at group level Group Bayes factor (GBF) for...k subjects: GBF ij BF ij k ( k ) Average Bayes factor (ABF): ABF ij K k BF ( k ) ij Problems: - blind with regard to group heterogeneity - sensitive to outliers

Random effects BMS for heterogeneous groups Dirichlet parameters = occurrences of models in the population r ~ Dir( r; ) Dirichlet distribution of model probabilities r m y k ~ p( mk p) mk ~ p( mk p) mk ~ p( mk p) m ~ Mult( m;, r ~ p( y m ) y ~ p( y m ) y2 ~ p( y2 m2 ) y ~ p( y m ) ) Multinomial distribution of model labels m Measured data y Model inversion by Variational Bayes (VB) or MCMC Stephan et al. 2009a, NeuroImage Penny et al. 200, PLoS Comp. Biol.

m 2 LD LD LVF m MOG FG FG MOG MOG FG FG MOG LD RVF LD LVF LD LD LG LG LG LG RVF stim. LD LVF stim. RVF stim. LD RVF LVF stim. Subjects m 2 m Data: Stephan et al. 2003, Science Models: Stephan et al. 2007, J. Neurosci. -35-30 -25-20 -5-0 -5 0 5 Log model evidence differences

5 p(r >0.5 y) = 0.997 4.5 4 3.5 m 2 p r r 99.7% 2 m 3 p(r y) 2.5 2.5 0.5 2 r 2 2.2 5.7%.8 0 0 0. 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 r r 84.3% Stephan et al. 2009a, NeuroImage

When does an EP signal deviation from chance? EPs express our confidence that the posterior probabilities of models are different under the hypothesis H that models differ in probability: r k /K does not account for possibility "null hypothesis" H 0 : r k =/K Bayesian omnibus risk (BOR) of wrongly accepting H over H 0 : protected EP: Bayesian model averaging over H 0 and H : Rigoux et al. 204, NeuroImage

Overfitting at the level of models #models risk of overfitting solutions: regularisation: definition of model space = setting p(m) family-level BMS Bayesian model averaging (BMA) posterior model probability: p m y p y m p( m) p y m p( m) m BMA: p m y p y, m p m y

FFX RFX 80 log GBF Summed log evidence (rel. to RBML) 60 40 20 2 0 0 nonlinear linear CBM N CBM N (ε) RBM N RBM N (ε) CBM L CBM L (ε) RBM L RBM L (ε) 5 4.5 4 m 2 Model space partitioning: comparing model families p(r >0.5 y) = 0.986 m alpha 8 6 4 2 p(r y) 3.5 3 2.5 p r r 2 98.6% 0 CBM N CBM N (ε) RBM N RBM N (ε) CBM L CBM L (ε) RBM L RBM L (ε) 2.5 Model space partitioning alpha 6 4 2 0 8 6 m m 2 * 4 k k 0.5 r2 26.5% r 73.5% 0 0 0. 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 r 4 2 0 nonlinear models * 2 8 k k5 linear models Stephan et al. 2009, NeuroImage

Bayesian Model Averaging (BMA) abandons dependence of parameter inference on a single model and takes into account model uncertainty represents a particularly useful alternative when none of the models (or model subspaces) considered clearly outperforms all others when comparing groups for which the optimal model differs single-subject BMA: p m y p y, m p m y group-level BMA: p m n y.. N p y, m p m y n n.. N NB: p(m y..n ) can be obtained by either FFX or RFX BMS Penny et al. 200, PLoS Comput. Biol.

Schmidt et al. 203, JAMA Psychiatry

Schmidt et al. 203, JAMA Psychiatry

Prefrontal-parietal connectivity during working memory in schizophrenia 7 ARMS, 2 first-episode (3 non-treated), 20 controls Schmidt et al. 203, JAMA Psychiatry

definition of model space inference on model structure or inference on model parameters? inference on individual models or model space partition? inference on parameters of an optimal model or parameters of all models? optimal model structure assumed to be identical across subjects? comparison of model families using FFX or RFX BMS optimal model structure assumed to be identical across subjects? BMA yes no yes no FFX BMS RFX BMS FFX BMS RFX BMS FFX analysis of parameter estimates (e.g. BPA) RFX analysis of parameter estimates (e.g. t-test, ANOVA) Stephan et al. 200, NeuroImage

Overview DCM a generative model Evolution of DCM for fmri Bayesian model selection (BMS) Translational Neuromodeling

Model-based predictions for single patients model structure DA BMS parameter estimates generative embedding

Synaesthesia projectors experience color externally colocalized with a presented grapheme associators report an internally evoked association across all subjects: no evidence for either model but BMS results map precisely onto projectors (bottom-up mechanisms) and associators (top-down) van Leeuwen et al. 20, J. Neurosci.

Generative embedding (unsupervised): detecting patient subgroups Brodersen et al. 204, NeuroImage: Clinical

Generative embedding: DCM and variational Gaussian Mixture Models Supervised: SVM classification Unsupervised: GMM clustering 7% number of clusters number of clusters 42 controls vs. 4 schizophrenic patients fmri data from working memory task (Deserno et al. 202, J. Neurosci) Brodersen et al. (204) NeuroImage: Clinical

Detecting subgroups of patients in schizophrenia Optimal cluster solution three distinct subgroups (total N=4) subgroups differ (p < 0.05) wrt. negative symptoms on the positive and negative symptom scale (PANSS) Brodersen et al. (204) NeuroImage: Clinical

A hierarchical model for unifying unsupervised generative embedding and empirical Bayes Raman et al., submitted

Dissecting spectrum disorders Differential diagnosis model validation (longitudinal patient studies) BMS Generative embedding (unsupervised) Computational assays Generative models of behaviour & brain activity BMS Generative embedding (supervised) optimized experimental paradigms (simple, robust, patient-friendly) initial model validation (basic science studies) Stephan & Mathys 204, Curr. Opin. Neurobiol.

Methods papers: DCM for fmri and BMS part Brodersen KH, Schofield TM, Leff AP, Ong CS, Lomakina EI, Buhmann JM, Stephan KE (20) Generative embedding for model-based classification of fmri data. PLoS Computational Biology 7: e002079. Brodersen KH, Deserno L, Schlagenhauf F, Lin Z, Penny WD, Buhmann JM, Stephan KE (204) Dissecting psychiatric spectrum disorders by generative embedding. NeuroImage: Clinical 4: 98- Daunizeau J, David, O, Stephan KE (20) Dynamic Causal Modelling: A critical review of the biophysical and statistical foundations. NeuroImage 58: 32-322. Daunizeau J, Stephan KE, Friston KJ (202) Stochastic Dynamic Causal Modelling of fmri data: Should we care about neural noise? NeuroImage 62: 464-48. Friston KJ, Harrison L, Penny W (2003) Dynamic causal modelling. NeuroImage 9:273-302. Friston K, Stephan KE, Li B, Daunizeau J (200) Generalised filtering. Mathematical Problems in Engineering 200: 62670. Friston KJ, Li B, Daunizeau J, Stephan KE (20) Network discovery with DCM. NeuroImage 56: 202 22. Friston K, Penny W (20) Post hoc Bayesian model selection. Neuroimage 56: 2089-2099. Friston KJ, Kahan J, Biswal B, Razi A (204) A DCM for resting state fmri. Neuroimage 94:396-407. Kiebel SJ, Kloppel S, Weiskopf N, Friston KJ (2007) Dynamic causal modeling: a generative model of slice timing in fmri. NeuroImage 34:487-496. Li B, Daunizeau J, Stephan KE, Penny WD, Friston KJ (20). Stochastic DCM and generalised filtering. NeuroImage 58: 442-457 Marreiros AC, Kiebel SJ, Friston KJ (2008) Dynamic causal modelling for fmri: a two-state model. NeuroImage 39:269-278. Penny WD, Stephan KE, Mechelli A, Friston KJ (2004a) Comparing dynamic causal models. NeuroImage 22:57-72.

Methods papers: DCM for fmri and BMS part 2 Penny WD, Stephan KE, Mechelli A, Friston KJ (2004b) Modelling functional integration: a comparison of structural equation and dynamic causal models. NeuroImage 23 Suppl :S264-274. Penny WD, Stephan KE, Daunizeau J, Joao M, Friston K, Schofield T, Leff AP (200) Comparing Families of Dynamic Causal Models. PLoS Computational Biology 6: e000709. Penny WD (202) Comparing dynamic causal models using AIC, BIC and free energy. Neuroimage 59: 39-330. Rigoux L, Stephan KE, Friston KJ, Daunizeau J (204). Bayesian model selection for group studies revisited. NeuroImage 84: 97-985. Stephan KE, Harrison LM, Penny WD, Friston KJ (2004) Biophysical models of fmri responses. Curr Opin Neurobiol 4:629-635. Stephan KE, Weiskopf N, Drysdale PM, Robinson PA, Friston KJ (2007) Comparing hemodynamic models with DCM. NeuroImage 38:387-40. Stephan KE, Harrison LM, Kiebel SJ, David O, Penny WD, Friston KJ (2007) Dynamic causal models of neural system dynamics: current state and future extensions. J Biosci 32:29-44. Stephan KE, Weiskopf N, Drysdale PM, Robinson PA, Friston KJ (2007) Comparing hemodynamic models with DCM. NeuroImage 38:387-40. Stephan KE, Kasper L, Harrison LM, Daunizeau J, den Ouden HE, Breakspear M, Friston KJ (2008) Nonlinear dynamic causal models for fmri. NeuroImage 42:649-662. Stephan KE, Penny WD, Daunizeau J, Moran RJ, Friston KJ (2009a) Bayesian model selection for group studies. NeuroImage 46:004-07. Stephan KE, Tittgemeyer M, Knösche TR, Moran RJ, Friston KJ (2009b) Tractography-based priors for dynamic causal models. NeuroImage 47: 628-638. Stephan KE, Penny WD, Moran RJ, den Ouden HEM, Daunizeau J, Friston KJ (200) Ten simple rules for Dynamic Causal Modelling. NeuroImage 49: 3099-309. Stephan KE, Mathys C (204). Computational approaches to psychiatry. Current Opinion in Neurobiology 25: 85-92.

Thank you