The general linear model and Statistical Parametric Mapping I: Introduction to the GLM

Similar documents
Experimental Design Initial GLM Intro. This Time

Contents. Introduction & recap. The General Linear Model, Part II. F-test and added variance Good & bad models Improvedmodel

Modelling of Clock Behaviour. Don Percival. Applied Physics Laboratory University of Washington Seattle, Washington, USA

CHAPTER 24: INFERENCE IN REGRESSION. Chapter 24: Make inferences about the population from which the sample data came.

Lecture 2: Supervised vs. unsupervised learning, bias-variance tradeoff

Lecture 2: Supervised vs. unsupervised learning, bias-variance tradeoff

Comparing Several Means: ANOVA. Group Means and Grand Mean

Contents. Introduction The General Linear Model. General Linear Linear Model Model. The General Linear Model, Part I. «Take home» message

PSU GISPOPSCI June 2011 Ordinary Least Squares & Spatial Linear Regression in GeoDa

Part 3 Introduction to statistical classification techniques

The General Linear Model (GLM)

Simple Linear Regression (single variable)

Resampling Methods. Chapter 5. Chapter 5 1 / 52

A Matrix Representation of Panel Data

Principal Components

Computational modeling techniques

T Algorithmic methods for data mining. Slide set 6: dimensionality reduction

Distributions, spatial statistics and a Bayesian perspective

Internal vs. external validity. External validity. This section is based on Stock and Watson s Chapter 9.

Statistical Inference

Statistics, Numerical Models and Ensembles

Pattern Recognition 2014 Support Vector Machines

CAUSAL INFERENCE. Technical Track Session I. Phillippe Leite. The World Bank

Resampling Methods. Cross-validation, Bootstrapping. Marek Petrik 2/21/2017

INSTRUMENTAL VARIABLES

More Tutorial at

COMP 551 Applied Machine Learning Lecture 4: Linear classification

Smoothing, penalized least squares and splines

Midwest Big Data Summer School: Machine Learning I: Introduction. Kris De Brabanter

Group Analysis: Hands-On

Statistical Inference

x 1 Outline IAML: Logistic Regression Decision Boundaries Example Data

Unit 1: Introduction to Biology

3.4 Shrinkage Methods Prostate Cancer Data Example (Continued) Ridge Regression

COMP 551 Applied Machine Learning Lecture 5: Generative models for linear classification

Inference in the Multiple-Regression

FIELD QUALITY IN ACCELERATOR MAGNETS

Hypothesis Tests for One Population Mean

Bootstrap Method > # Purpose: understand how bootstrap method works > obs=c(11.96, 5.03, 67.40, 16.07, 31.50, 7.73, 11.10, 22.38) > n=length(obs) >

The standards are taught in the following sequence.

Data mining/machine learning large data sets. STA 302 or 442 (Applied Statistics) :, 1

5 th grade Common Core Standards

Computational modeling techniques

ENGI 4430 Parametric Vector Functions Page 2-01

Chapter 3: Cluster Analysis

Excessive Social Imbalances and the Performance of Welfare States in the EU. Frank Vandenbroucke, Ron Diris and Gerlinde Verbist

ENSC Discrete Time Systems. Project Outline. Semester

CHAPTER 8 ANALYSIS OF DESIGNED EXPERIMENTS

Tree Structured Classifier

1 The limitations of Hartree Fock approximation

What is Statistical Learning?

AP Statistics Practice Test Unit Three Exploring Relationships Between Variables. Name Period Date

Surface and Contact Stress

(1.1) V which contains charges. If a charge density ρ, is defined as the limit of the ratio of the charge contained. 0, and if a force density f

CN700 Additive Models and Trees Chapter 9: Hastie et al. (2001)

Elements of Machine Intelligence - I

Computational modeling techniques

The General Linear Model. Guillaume Flandin Wellcome Trust Centre for Neuroimaging University College London

Keysight Technologies Understanding the Kramers-Kronig Relation Using A Pictorial Proof

Data Analysis, Statistics, Machine Learning

k-nearest Neighbor How to choose k Average of k points more reliable when: Large k: noise in attributes +o o noise in class labels

Physics 2B Chapter 23 Notes - Faraday s Law & Inductors Spring 2018

You need to be able to define the following terms and answer basic questions about them:

IN a recent article, Geary [1972] discussed the merit of taking first differences

Determining the Accuracy of Modal Parameter Estimation Methods

Time, Synchronization, and Wireless Sensor Networks

The General Linear Model (GLM)

Aircraft Performance - Drag

22.54 Neutron Interactions and Applications (Spring 2004) Chapter 11 (3/11/04) Neutron Diffusion

CHAPTER 4 DIAGNOSTICS FOR INFLUENTIAL OBSERVATIONS

Lecture 10, Principal Component Analysis

Aerodynamic Separability in Tip Speed Ratio and Separability in Wind Speed- a Comparison

Name: Block: Date: Science 10: The Great Geyser Experiment A controlled experiment

Misc. ArcMap Stuff Andrew Phay

Physics 2010 Motion with Constant Acceleration Experiment 1

SUPPLEMENTARY MATERIAL GaGa: a simple and flexible hierarchical model for microarray data analysis

Eric Klein and Ning Sa

7 TH GRADE MATH STANDARDS

Introduction to Regression

Chapter 5: The Keynesian System (I): The Role of Aggregate Demand

Computational Statistics

SAMPLING DYNAMICAL SYSTEMS

ChE 471: LECTURE 4 Fall 2003

Application of ILIUM to the estimation of the T eff [Fe/H] pair from BP/RP

Multiple Source Multiple. using Network Coding

MATCHING TECHNIQUES. Technical Track Session VI. Emanuela Galasso. The World Bank

Kinetic Model Completeness

NOTE ON THE ANALYSIS OF A RANDOMIZED BLOCK DESIGN. Junjiro Ogawa University of North Carolina

Group analysis. Jean Daunizeau Wellcome Trust Centre for Neuroimaging University College London. SPM Course Edinburgh, April 2010

Weathering. Title: Chemical and Mechanical Weathering. Grade Level: Subject/Content: Earth and Space Science

Lead/Lag Compensator Frequency Domain Properties and Design Methods

Physics 212. Lecture 12. Today's Concept: Magnetic Force on moving charges. Physics 212 Lecture 12, Slide 1

Study Group Report: Plate-fin Heat Exchangers: AEA Technology

OTHER USES OF THE ICRH COUPL ING CO IL. November 1975

Introduction to Quantitative Genetics II: Resemblance Between Relatives

Neuroimaging for Machine Learners Validation and inference

Lecture 3: Principal Components Analysis (PCA)

Numerical Simulation of the Thermal Resposne Test Within the Comsol Multiphysics Environment

Lecture 5: Equilibrium and Oscillations

NGSS High School Physics Domain Model

Transcription:

The general linear mdel and Statistical Parametric Mapping I: Intrductin t the GLM Alexa Mrcm and Stefan Kiebel, Rik Hensn, Andrew Hlmes & J-B J Pline

Overview Intrductin Essential cncepts Mdelling Design matrix Parameter estimates Simple cntrasts Summary

Sme terminlgy SPM is based n a mass univariate apprach that fits a mdel at each vxel Is there an effect at lcatin X? Investigate lcalisatin f functin r functinal specialisatin Hw des regin X interact with Y and Z? Investigate behaviur f netwrks r functinal integratin A General(ised) Linear Mdel Effects are linear and additive If errrs are nrmal (Gaussian), General (SPM99) If errrs are nt nrmal, Generalised (SPM2)

Classical statistics... Parametric ne sample t-test tw sample t-test paired t-test ANOVA ANCOVA crrelatin linear regressin multiple regressin F-tests etc all cases f the (univariate) General Linear Mdel Or, with nn-nrmal errrs, the Generalised Linear Mdel

Classical statistics... Parametric ne sample t-test tw sample t-test paired t-test ANOVA ANCOVA crrelatin linear regressin multiple regressin F-tests etc all cases f the (univariate) General Linear Mdel Or, with nn-nrmal errrs, the Generalised Linear Mdel

Classical statistics... Parametric ne sample t-test tw sample t-test paired t-test ANOVA ANCOVA crrelatin linear regressin multiple regressin F-tests etc Multivariate? all cases f the (univariate) General Linear Mdel Or, with nn-nrmal errrs, the Generalised Linear Mdel fi PCA/ SVD, MLM

Classical statistics... Parametric ne sample t-test tw sample t-test paired t-test ANOVA ANCOVA crrelatin linear regressin multiple regressin F-tests etc Multivariate? Nn-parametric? all cases f the (univariate) General Linear Mdel Or, with nn-nrmal errrs, the Generalised Linear Mdel fi PCA/ SVD, MLM fi SnPM

Why mdelling? Why? Make inferences abut effects f f interest Hw? 1. 1. Decmpse data data int int effects and and errr errr 2. 2. Frm statistic using estimates f f effects and and errr errr Mdel? Use Use any any available knwledge data data mdel mdel effects effects estimate errr errr estimate statistic statistic

Chse yur mdel A very very general general mdel mdel All mdels are wrng, but sme are useful Gerge E.P.Bx Variance N N nrmalisatin N N smthing Massive Massive mdel mdel Captures signal but...... nt much sensitivity default default SPM SPM Bias Lts Lts f f nrmalisatin Lts Lts f f smthing Sparse Sparse mdel mdel High sensitivity but...... may miss signal

Mdelling with SPM Functinal data data Design Design matrix matrix Cntrasts Preprcessing Smthed nrmalised data data Generalised linear mdel Parameter estimates SPMs Templates Variance cmpnents Threshlding

Mdelling with SPM Design Design matrix matrix Cntrasts Preprcessed data: data: single single vxel vxel Generalised linear mdel Parameter estimates SPMs Variance cmpnents

fmri example One One sessin Passive wrd listening versus rest rest Time Time series series f f BOLD BOLD respnses in in ne ne vxel vxel 7 cycles f f rest rest and and listening Each Each epch 6 scans with with 7 sec sec TR TR Questin: Is Is there there a a change change in in the the BOLD BOLD respnse between listening and and rest? rest? Stimulus functin functin

GLM essentials The mdel Design matrix: Effects f interest Design matrix: Cnfunds (aka effects f n interest) Residuals (errr measures f the whle mdel) Estimate effects and errr fr data Parameter estimates (aka betas) Quantify specific effects using cntrasts f parameter estimates Statistic Cmpare estimated effects the cntrasts with apprpriate errr measures Are the effects surprisingly large?

GLM essentials The mdel Design matrix: Effects f interest Design matrix: Cnfunds (aka effects f n interest) Residuals (errr measures f the whle mdel) Estimate effects and errr fr data Parameter estimates (aka betas) Quantify specific effects using cntrasts f parameter estimates Statistic Cmpare estimated effects the cntrasts with apprpriate errr measures Are the effects surprisingly large?

Regressin mdel General case Time = b + 1 b + 2 errr e~n(0, s 2 I) (errr is nrmal and independently and identically distributed) Intensity x 1 x 2 e Questin: Is Is there there a change in in the the BOLD respnse between listening and and rest? rest? Hypthesis test: β 1 = 0? (using t-statistic)

Regressin mdel General case ˆβ ˆβ 1 2 = + + ˆ Y x + β ˆ x + εˆ = β 1 1 2 2 Mdel is is specified by by 1. 1. Design matrix X 2. 2. Assumptins abut ε

Matrix frmulatin Y i = b 1 x i + b 2 + e i i = 1,2,3 Y 1 = b 1 x 1 + b 2 1 + e 1 Y 2 = b 1 x 2 + b 2 1 + e 2 Y 3 = b 1 x 3 + b 2 1 + e 3 dummy variables Y 1 x 1 1 b 1 e 1 Y 2 = x 2 1 + e 2 Y 3 x 3 1 b 1 e 3 Y = X b + e

Linear regressin Matrix frmulatin ^ Hats Hats = estimates Y i = b 1 x i + b 2 + e i i = 1,2,3 Y 1 = b 1 x 1 + b 2 1 + e 1 Y 2 = b 1 x 2 + b 2 1 + e 2 y ^ b 2 Y i 1 ^ Y i ^b1 Y 3 = b 1 x 3 + b 2 1 + e 3 Y 1 x 1 1 b 1 e 1 Y 2 = x 2 1 + e 2 Y 3 x 3 1 b 1 e 3 Y = X b + e dummy variables Parameter estimates ^ ^ b 1 & b 2 Fitted values Y ^ 1 & Y ^ 2 Residuals e^ 1 & ^e 2 x

Gemetrical perspective Y = Y 1 x 1 1 b 1 e 1 Y 2 = x 2 1 + e 2 Y 3 x 3 1 b 2 e 3 DATA (Y 1, Y 2, Y 3 ) b 1 X 1 +b 2 X 2 + e Y X 1 (x1, x2, x3) O X 2 (1,1,1) design space

GLM essentials The mdel Design matrix: Effects f interest Design matrix: Cnfunds (aka effects f n interest) Residuals (errr measures f the whle mdel) Estimate effects and errr fr data Parameter estimates (aka betas) Quantify specific effects using cntrasts f parameter estimates Statistic Cmpare estimated effects the cntrasts with apprpriate errr measures Are the effects surprisingly large?

Parameter estimatin Ordinary least squares Y = Xβ + ε βˆ = ( X X ) T 1 X T Y εˆ = Y Xβˆ Parameter estimates residuals Estimate parameters N such such that that t= 1 ˆε 2 t minimal Least squares parameter estimate

Parameter estimatin Ordinary least squares Y = Xβ + ε βˆ = ( X X ) T 1 X T Y εˆ = Y Xβˆ Parameter estimates residuals = rr Errr variance s 2 2 = (sum (sum f) f) squared residuals standardised fr fr df df = rr T T r r // df df (sum (sum f f squares) where degrees f f freedm df df (assuming iid): iid): = N --rank(x) (=N-P if if X full full rank)

Estimatin (gemetrical) Y = b Y 1 x 1 1 b 1 e 1 Y 2 = x 2 1 + e 2 Y 3 x 3 1 e 3 2 Y b 1 X 1 +b 2 X 2 + e DATA (Y 1, Y 2, Y 3 ) RESIDUALS ^e = (e^ 1,e ^ 2,e^ 3 ) T X 1 (x1, x2, x3) Y^ (Y^ 1,Y ^ 2,Y ^ 3 ) O X 2 (1,1,1) design space

GLM essentials The mdel Design matrix: Effects f interest Design matrix: Cnfunds (aka effects f n interest) Residuals (errr measures f the whle mdel) Estimate effects and errr fr data Parameter estimates (aka betas) Quantify specific effects using cntrasts f parameter estimates Statistic Cmpare estimated effects the cntrasts with apprpriate errr measures Are the effects surprisingly large?

Inference - cntrasts Y = Xβ + ε A cntrast = a linear cmbinatin f parameters: c x b spm_cn*img t-test: ne-dimensinal cntrast, difference in means/ difference frm zer bxcar parameter > 0? Null Null hypthesis: β 1 = 0 spmt_000*.img SPM{t} map F-test: tests multiple linear hyptheses des subset f mdel accunt fr significant variance Des bxcar parameter mdel anything? Null Null hypthesis: variance f f tested effects = errr errr variance ess_000*.img SPM{F} map

t-statistic - example Y = Xβ + ε c c = = 1 1 0 0 0 0 0 0 0 0 0 0 0 t = T c βˆ ˆ T Std( c βˆ) Cntrast f f parameter estimates Variance estimate X Standard errr errr f f cntrast depends n n the the design, and and is is larger with with greater residual errr and and greater cvariance/ autcrrelatin Degrees f f freedm d.f. d.f. then then = n-p, n-p, where n bservatins, p parameters Tests fr fr a directinal difference in in means

F-statistic - example D D mvement parameters (r (r ther cnfunds) accunt fr fr anything? H 0 : True mdel is X 0 H 0 : b 3-9 = (0 0 0 0...) test H 0 : c x b = 0? X 0 X 1 (b 3-9 ) X 0 This mdel? Or this ne? Null Null hypthesis H 0 : 0 : That That all all these betas b 3-9 are 3-9 are zer, zer, i.e. i.e. that that n n linear cmbinatin f f the the effects accunts fr fr significant variance This This is is a nn-directinal test test

F-statistic - example D D mvement parameters (r (r ther cnfunds) accunt fr fr anything? H 0 : True mdel is X 0 H 0 : b 3-9 = (0 0 0 0...) test H 0 : c x b = 0? X 0 X 1 (b 3-9 ) X 0 c = 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 This mdel? Or this ne? SPM{F}

Summary s far The essential mdel cntains Effects f interest A better mdel? A better mdel (within reasn) means smaller residual variance and mre significant statistics Capturing the signal later Add cnfunds/ effects f n interest Example f mvement parameters in fmri A further example (mainly relevant t PET)

Example PET experiment b 1 b 2 b 3 b 4 rank(x)=3 12 12 scans, 3 cnditins (1-way ANOVA) y j = j x 1j b 1j 1 + 1 x 2j b 2j 2 + 2 x 3j b 3j 3 + 3 x 4j b 4j 4 + 4 e j j where (dummy) variables: x 1j = 1j [0,1] = cnditin A (first (first 4 scans) x 2j = 2j [0,1] = cnditin B (secnd 4 scans) x 3j = 3j [0,1] [0,1] = cnditin C (third 4 scans) x 4j = 4j [1] [1] = grand mean (sessin cnstant)

Glbal effects May May be be variatin in in PET PET tracer dse dse frm scan scan t t scan scan Such glbal changes in in image intensity (gcbf) cnfund lcal lcal // reginal (rcbf) changes f f experiment Adjust fr fr glbal effects by: by: --AnCva (Additive Mdel) --PET? --Prprtinal Scaling --fmri? Can Can imprve statistics when rthgnal t t effects f f interest but can can als als wrsen when effects f f interest crrelated with with glbal rcbf (adj) rcbf x x x x x x x rcbf rcbf a k 2 a k 1 rcbf 0 0 AnCva Scaling x x x x x x x x x x x x x x x x 1 x x x x g.. z k x x x x x x 50 glbal gcbf glbal gcbf glbal gcbf

Glbal effects (AnCva) b 1 b 2 b 3 b 4 b 5 rank(x)=4 12 12 scans, 3 cnditins, 1 cnfunding cvariate y j = j x 1j b 1j 1 + 1 x 2j b 2j 2 + 2 x 3j b 3j 3 + 3 x 4j b 4j 4 + 4 x 5j b 5j 5 + 5 e j j where (dummy) variables: x 1j = 1j [0,1] = cnditin A (first (first 4 scans) x 2j = 2j [0,1] = cnditin B (secnd 4 scans) x 3j = 3j [0,1] [0,1] = cnditin C (third 4 scans) x 4j = 4j [1] [1] = grand mean (sessin cnstant) x 5j = 5j glbal signal (mean ver ver all all vxels)

Glbal effects (AnCva) N N Glbal Orthgnal glbal Crrelated glbal b 1 b 2 b 3 b 4 b 1 b 2 b 3 b 4 b 5 b 1 b 2 b 3 b 4 b 5 rank(x)=3 rank(x)=4 rank(x)=4 Glbal effects nt nt accunted fr fr Maximum degrees f f freedm (glbal uses uses ne) ne) Glbal effects independent f f effects f f interest Smaller residual variance Larger T statistic Mre significant Glbal effects crrelated with with effects f f interest Smaller effect &/r &/r larger residuals Smaller T statistic Less Less significant

Glbal effects (scaling) Tw Tw types f f scaling: Grand Mean scaling and and Glbal scaling -- Grand Mean scaling is is autmatic, glbal scaling is is ptinal -- Grand Mean scales by by 100/mean ver ver all all vxels and and ALL ALL scans (i.e, (i.e, single number per per sessin) -- Glbal scaling scales by by 100/mean ver ver all all vxels fr fr EACH scan scan (i.e, (i.e, a different scaling factr every scan) Prblem with with glbal scaling is is that that TRUE glbal is is nt nt (nrmally) knwn nly nly estimated by by the the mean ver ver vxels -- S S if if there there is is a large large signal change ver ver many vxels, the the glbal estimate will will be be cnfunded by by lcal lcal changes -- This This can can prduce artifactual deactivatins in in ther regins after after glbal scaling Since mst surces f f glbal variability in in fmri are are lw lw frequency (drift), high-pass filtering may may be be sufficient, and and many peple d d nt nt use use glbal scaling

Summary General(ised) linear mdel partitins data int Effects f interest & cnfunds/ effects f n interest Errr Least squares estimatin Minimises difference between mdel & data T d this, assumptins made abut errrs mre later Inference at every vxel Test hypthesis using cntrast mre later Inference can be Bayesian as well as classical Next: Applying the GLM t fmri