The General Linear Model (GLM)


The General Linear Model (GLM). Klaas Enno Stephan. Translational Neuromodeling Unit (TNU), Institute for Biomedical Engineering, University of Zurich & ETH Zurich; Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London. With many thanks to the FIL Methods group. Zurich SPM Course 2012, 15-17 February

Overview of SPM. [Pipeline figure: image time-series → realignment → smoothing (kernel) → general linear model (design matrix) → parameter estimates → statistical inference (Gaussian field theory, p < 0.05) → statistical parametric map (SPM); spatial normalisation uses a template.]

A very simple fMRI experiment. One session: passive word listening versus rest; 7 cycles of rest and listening; blocks of 6 scans with 7 s TR. Stimulus function. Question: Is there a change in the BOLD response between listening and rest?

Modelling the measured data. Why? To make inferences about effects of interest. How? 1. Decompose the data into effects and error. 2. Form a statistic using the estimates of effects and error. stimulus function + data → linear model → effects estimate and error estimate → statistic

Voxel-wise time series analysis. [Figure: a single voxel's BOLD time series (over time) → model specification → parameter estimation → hypothesis → statistic → SPM]

Single voxel regression model. The BOLD signal y of a single voxel (over time) is modelled as y = x₁β₁ + x₂β₂ + e

Mass-univariate analysis: voxel-wise GLM. y = Xβ + e, with e ~ N(0, σ²I), where y is the N×1 data vector, X the N×p design matrix, and β the p×1 parameter vector (N: number of scans; p: number of regressors). The model is specified by 1. the design matrix X and 2. the assumptions about e. The design matrix embodies all available knowledge about experimentally controlled factors and potential confounds.
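The design matrix for the word-listening experiment can be sketched in a few lines of NumPy. The block structure (7 cycles of 6 rest scans and 6 listening scans, so N = 84) comes from the example earlier in the slides; the variable names are my own.

```python
import numpy as np

# Toy design for the word-listening experiment: 7 cycles of
# 6 rest scans followed by 6 listening scans, plus a constant.
N = 84
cycle = np.r_[np.zeros(6), np.ones(6)]        # one rest + listening cycle
listening = np.tile(cycle, 7)                 # boxcar stimulus regressor
X = np.column_stack([listening, np.ones(N)])  # design matrix (N x p)

print(X.shape)  # (84, 2): N scans, p = 2 regressors
```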

The GLM assumes Gaussian spherical (i.i.d.) errors. Sphericity = i.i.d.: the error covariance is a scalar multiple of the identity matrix, Cov(e) = σ²I. Examples of non-sphericity: non-identity, e.g. Cov(e) = [4 0; 0 1]; non-independence, e.g. Cov(e) = [2 1; 1 2].

Parameter estimation. y = Xβ + e. Objective: estimate the parameters so as to minimize the sum of squared errors Σ_{t=1..N} e_t². Ordinary least squares (OLS) estimation (assuming i.i.d. errors): β̂ = (XᵀX)⁻¹Xᵀy
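The normal equations above can be checked numerically on simulated data; this is a minimal sketch with made-up data, comparing the explicit (XᵀX)⁻¹Xᵀy solution against NumPy's least-squares solver.

```python
import numpy as np

rng = np.random.default_rng(0)
N, p = 100, 3
X = np.column_stack([rng.standard_normal((N, p - 1)), np.ones(N)])
beta_true = np.array([2.0, -1.0, 0.5])
y = X @ beta_true + 0.1 * rng.standard_normal(N)

# Normal equations: solve (X'X) beta = X'y instead of forming the inverse
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# np.linalg.lstsq gives the same answer, more stably for ill-conditioned X
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
```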

A geometric perspective on the GLM. OLS estimates: β̂ = (XᵀX)⁻¹Xᵀy, ŷ = Xβ̂. Design space defined by the columns x₁, x₂ of X. Projection matrix P = X(XᵀX)⁻¹Xᵀ, with ŷ = Py. Residual-forming matrix R = I − P, with e = Ry.

Deriving the OLS equation: Xᵀe = 0 ⇒ Xᵀ(y − Xβ̂) = 0 ⇒ Xᵀy − XᵀXβ̂ = 0 ⇒ XᵀXβ̂ = Xᵀy ⇒ β̂ = (XᵀX)⁻¹Xᵀy (the OLS estimate).
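The geometric picture can be verified directly: P and R are idempotent projections, and the residuals are orthogonal to the design space (which is exactly the Xᵀe = 0 condition used in the derivation). A small sketch with arbitrary data:

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.column_stack([rng.standard_normal(20), np.ones(20)])
y = rng.standard_normal(20)

P = X @ np.linalg.inv(X.T @ X) @ X.T  # projection onto the design space
R = np.eye(20) - P                    # residual-forming matrix
y_hat, e = P @ y, R @ y

# P and R are idempotent, residuals are orthogonal to X,
# and fitted values plus residuals reconstruct the data
assert np.allclose(P @ P, P) and np.allclose(R @ R, R)
assert np.allclose(X.T @ e, 0)
assert np.allclose(y_hat + e, y)
```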

Correlated and orthogonal regressors. Model: y = x₁β₁ + x₂β₂ + e, with true β₁ = β₂ = 1. After orthogonalizing x₂ with respect to x₁: y = x₁β₁* + x₂*β₂* + e, with β₂* = β₂ = 1 but β₁* > 1. Correlated regressors = explained variance is shared between regressors. When x₂ is orthogonalized with regard to x₁, only the parameter estimate for x₁ changes, not that for x₂!
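The orthogonalization claim can be demonstrated numerically; this sketch uses simulated correlated regressors (the data and coefficients are made up for illustration).

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
x1 = rng.standard_normal(n)
x2 = 0.7 * x1 + 0.3 * rng.standard_normal(n)  # x2 correlated with x1
y = x1 + x2 + 0.1 * rng.standard_normal(n)    # true beta1 = beta2 = 1

def ols(X, y):
    return np.linalg.solve(X.T @ X, X.T @ y)

b = ols(np.column_stack([x1, x2]), y)

# orthogonalize x2 with respect to x1 (remove its projection onto x1)
x2o = x2 - (x1 @ x2) / (x1 @ x1) * x1
bo = ols(np.column_stack([x1, x2o]), y)

# the estimate for x2 is unchanged; the estimate for x1 absorbs the
# previously shared variance and grows
print(np.allclose(b[1], bo[1]))  # True
```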

What are the problems of this model? 1. BOLD responses have a delayed and dispersed form (HRF). 2. The BOLD signal includes substantial amounts of low-frequency noise. 3. The data are serially correlated (temporally autocorrelated); this violates the assumptions of the noise model in the GLM.

Problem 1: Shape of the BOLD response. Solution: convolution model with a hemodynamic response function (HRF). (f ⊗ g)(t) = ∫₀ᵗ f(τ) g(t − τ) dτ. The response of a linear time-invariant (LTI) system is the convolution of the input with the system's response to an impulse (delta function): expected BOLD response = input function ⊗ impulse response function (HRF).

Convolution model of the BOLD response. Convolve the stimulus function with a canonical hemodynamic response function (HRF): (f ⊗ g)(t) = ∫₀ᵗ f(τ) g(t − τ) dτ
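The convolution model can be sketched as follows. The HRF here is a crude difference-of-gammas approximation, not SPM's exact canonical HRF, and the block timing is illustrative.

```python
import numpy as np
from math import gamma as G

def gamma_pdf(t, k):
    """Gamma density with shape k, unit scale (zero for t <= 0)."""
    t = np.asarray(t, dtype=float)
    out = np.zeros_like(t)
    pos = t > 0
    out[pos] = t[pos] ** (k - 1) * np.exp(-t[pos]) / G(k)
    return out

dt = 0.1
t = np.arange(0, 32, dt)
# crude canonical-HRF sketch: positive peak minus a small late undershoot
hrf = gamma_pdf(t, 6) - gamma_pdf(t, 16) / 6.0
hrf /= hrf.sum() * dt  # normalise to unit integral

# boxcar stimulus function: 6 s on, 6 s off, repeated
stim = np.tile(np.r_[np.ones(60), np.zeros(60)], 4)

# expected BOLD response = stimulus function convolved with the HRF
bold = np.convolve(stim, hrf)[: len(stim)] * dt
```

The convolved regressor `bold` is what enters the design matrix in place of the raw boxcar, capturing the delayed and dispersed shape of the response.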

Problem 2: Low-frequency noise. Solution: high-pass filtering. Sy = SXβ + Se, where S is the residual-forming matrix of a discrete cosine transform (DCT) set.
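High-pass filtering with a DCT set can be sketched like this. The formula for the number of regressors K follows the usual SPM convention (cutoff in seconds), but treat the constants here as assumptions for illustration.

```python
import numpy as np

N, TR = 128, 7.0
cutoff = 128.0                              # high-pass cutoff in seconds
K = int(np.floor(2 * N * TR / cutoff + 1))  # number of DCT regressors

# discrete cosine transform (DCT) basis spanning low-frequency drifts
n = np.arange(N)
C = np.array([np.cos(np.pi * k * (2 * n + 1) / (2 * N))
              for k in range(K)]).T         # N x K

# residual-forming matrix of the DCT set: S removes the drift subspace
S = np.eye(N) - C @ np.linalg.pinv(C)

# any drift lying in the DCT subspace is filtered out completely
rng = np.random.default_rng(3)
drift = C @ rng.standard_normal(K)
assert np.allclose(S @ drift, 0, atol=1e-8)
```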

High-pass filtering: example. blue = data; black = mean + low-frequency drift; green = predicted response, taking the low-frequency drift into account; red = predicted response, NOT taking the low-frequency drift into account.

Problem 3: Serial correlations. 1st-order autoregressive process, AR(1): e_t = a·e_{t−1} + ε_t, with ε_t ~ N(0, σ²). [Figure: autocovariance function; N × N matrix Cov(e)]
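Simulating an AR(1) process shows the autocovariance structure that violates the i.i.d. assumption; the coefficient a = 0.4 below is an arbitrary illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(4)
N, a, sigma = 100_000, 0.4, 1.0

# simulate a first-order autoregressive process: e_t = a*e_{t-1} + eps_t
eps = sigma * rng.standard_normal(N)
e = np.zeros(N)
e[0] = eps[0]
for t in range(1, N):
    e[t] = a * e[t - 1] + eps[t]

# stationary theory: Var(e_t) = sigma^2 / (1 - a^2),
# and Cov(e_t, e_{t-1}) = a * Var(e_t) -- nonzero, so errors are dependent
var_theory = sigma**2 / (1 - a**2)
lag1_theory = a * var_theory
```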

Dealing with serial correlations. Pre-colouring: impose some known autocorrelation structure on the data (filtering with a matrix W) and use the Satterthwaite correction for the degrees of freedom. Pre-whitening: 1. Use an enhanced noise model with multiple error covariance components, i.e. e ~ N(0, σ²V) instead of e ~ N(0, σ²I). 2. Use the estimated serial correlation to specify a filter matrix W for whitening the data: Wy = WXβ + We

How do we define W? Enhanced noise model: e ~ N(0, σ²V). Remember the linear transform of a Gaussian: if x ~ N(µ, σ²) and y = ax, then y ~ N(aµ, a²σ²). Choose W such that the error covariance becomes spherical: We ~ N(0, σ²WVWᵀ), so we require WVWᵀ = I, i.e. W = V^(−1/2). Then Wy = WXβ + We. Conclusion: W is a simple function of V, so how do we estimate V?
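Computing W = V^(−1/2) and checking that it spheres the errors can be sketched with a symmetric eigendecomposition; the AR(1)-shaped V below is an illustrative assumption.

```python
import numpy as np

# toy error covariance V with AR(1)-like correlations a^|i-j|
N, a = 50, 0.4
i = np.arange(N)
V = a ** np.abs(i[:, None] - i[None, :])

# W = V^{-1/2} via the symmetric eigendecomposition V = U diag(lam) U'
lam, U = np.linalg.eigh(V)
W = U @ np.diag(1.0 / np.sqrt(lam)) @ U.T

# after whitening, the error covariance is spherical: W V W' = I
assert np.allclose(W @ V @ W.T, np.eye(N), atol=1e-8)
```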

Estimating V: multiple covariance components. Enhanced noise model: e ~ N(0, σ²V), with the error covariance decomposed into components Qᵢ and hyperparameters λᵢ: V = Σᵢ λᵢQᵢ, e.g. V = λ₁Q₁ + λ₂Q₂. The hyperparameters λ are estimated with ReML (restricted maximum likelihood).
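Assembling V from covariance components can be sketched as below. In SPM the weights λ are estimated with ReML; here they are simply fixed to illustrative values, and the component shapes (identity plus AR(1)-like off-diagonal correlations) are assumptions.

```python
import numpy as np

N, a = 50, 0.4
i = np.arange(N)

# two covariance components: Q1 = white noise (identity),
# Q2 = AR(1)-shaped serial correlations (off-diagonal part only)
Q1 = np.eye(N)
Q2 = a ** np.abs(i[:, None] - i[None, :]) - np.eye(N)

# V is a weighted sum of components; the hyperparameters lambda would
# be estimated by ReML, here we just pick values for illustration
lam1, lam2 = 1.0, 0.5
V = lam1 * Q1 + lam2 * Q2
```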

Contrasts & statistical parametric maps. Q: activation during listening? Contrast vector c = [1 0 0 0 0 0 0 0 0 0 0]ᵀ. Null hypothesis: β₁ = 0. t = cᵀβ̂ / Std(cᵀβ̂)

t-statistic based on ML estimates. Wy = WXβ + We, with W = V^(−1/2) and V = Cov(e) = Σᵢ λᵢQᵢ (ReML estimates). β̂ = (WX)⁺Wy, where for brevity (WX)⁺ = ((WX)ᵀWX)⁻¹(WX)ᵀ. Contrast c = [1 0 0 0 0 0 0 0 0 0 0]ᵀ. t = cᵀβ̂ / std(cᵀβ̂), with std(cᵀβ̂)² = σ̂² cᵀ(WX)⁺((WX)⁺)ᵀc, σ̂² = (Wy − WXβ̂)ᵀ(Wy − WXβ̂) / tr(R), and R = I − WX(WX)⁺.
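The t-statistic formulas can be sketched end to end on simulated block-design data. For simplicity this assumes V = I (so W = I); with serial correlations one would first apply W = V^(−1/2) to both y and X. All data and effect sizes are made up.

```python
import numpy as np

rng = np.random.default_rng(5)
N = 80
# boxcar regressor (rest/listening blocks of 5 scans) plus a constant
X = np.column_stack([np.tile(np.r_[np.zeros(5), np.ones(5)], 8), np.ones(N)])
y = X @ np.array([1.0, 0.0]) + rng.standard_normal(N)

# assume V = I here, so W = I and WX = X
WX, Wy = X, y
pinv = np.linalg.pinv(WX)          # (WX)^+
beta_hat = pinv @ Wy

R = np.eye(N) - WX @ pinv          # residual-forming matrix
res = Wy - WX @ beta_hat
sigma2 = res @ res / np.trace(R)   # error variance estimate

c = np.array([1.0, 0.0])           # contrast: listening > rest
t = c @ beta_hat / np.sqrt(sigma2 * c @ pinv @ pinv.T @ c)
```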

Physiological confounds: head movements; arterial pulsations (particularly bad in the brain stem); breathing; eye blinks (visual cortex); adaptation effects, fatigue, fluctuations in concentration, etc.

Outlook: further challenges. Correction for multiple comparisons; variability in the HRF across voxels; slice timing; limitations of frequentist statistics → Bayesian analyses; the GLM ignores interactions among voxels → models of effective connectivity. These issues are discussed in future lectures.

Correction for multiple comparisons. Mass-univariate approach: we apply the GLM to each of a huge number of voxels (usually > 100,000). At a threshold of p < 0.05, more than 5000 voxels come out significant by chance alone: a massive multiple-comparisons problem! Solution: Gaussian random field theory.
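The "5000 voxels by chance" figure is just the expected false-positive count under independence:

```python
# expected false positives among 100,000 independent voxel tests
# under the null hypothesis, at an uncorrected threshold of p < 0.05
n_voxels, alpha = 100_000, 0.05
print(round(n_voxels * alpha))  # 5000
```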

Variability in the HRF. The HRF varies substantially across voxels and subjects; for example, its latency can differ by ±1 second. Solution: use multiple basis functions. See the talk on event-related fMRI.

Summary. Mass-univariate approach: the same GLM for each voxel; the GLM includes all known experimental effects and confounds; convolution with a canonical HRF; high-pass filtering to account for low-frequency drifts; estimation of multiple variance components (e.g. to account for serial correlations).

Bibliography. Friston, Ashburner, Kiebel, Nichols, Penny (2007) Statistical Parametric Mapping: The Analysis of Functional Brain Images. Elsevier. Christensen R (1996) Plane Answers to Complex Questions: The Theory of Linear Models. Springer. Friston KJ et al. (1995) Statistical parametric maps in functional imaging: a general linear approach. Human Brain Mapping 2: 189-210.

Supplementary slides

Convolution step-by-step (from Wikipedia): 1. Express each function in terms of a dummy variable τ. 2. Reflect one of the functions: g(τ) → g(−τ). 3. Add a time offset t, which allows g(t − τ) to slide along the τ-axis. 4. Start t at −∞ and slide it all the way to +∞. Wherever the two functions intersect, find the integral of their product. In other words, compute a sliding, weighted average of the function f(τ), where the weighting function is g(−τ). The resulting waveform (not shown here) is the convolution of the functions f and g. If f(t) is a unit impulse, the result of this process is simply g(t), which is therefore called the impulse response.