Decoding conceptual representations


Decoding conceptual representations. Marcel van Gerven. Computational Cognitive Neuroscience Lab (www.ccnlab.net), Artificial Intelligence Department, Donders Centre for Cognition, Donders Institute for Brain, Cognition and Behaviour

Conceptual representations: luminance, contrast, orientations, contours, shape, semantics

Stimulus-response relationships: features X (pixels, edges, objects, scene) relate to neural responses Y over time

Decoding: mapping from neural responses Y back to features X

Whole-brain approach: decoding uses responses Y from the whole brain

Connectivity-based approach: decoding based on connectivity between neural responses

Region-of-interest approach: decoding restricted to responses within a predefined region

Searchlight approach: decoding repeated within small spheres moved across the brain

Generative approach: encoding models (M1, ..., M6) map features X (pixels, edges, objects, scene) to neural responses Y; decoding inverts this mapping

Decoding flavors: generative versus discriminative (Kay and Gallant, Nature, 2009)

Discriminative approach: $\hat{x} = f(y) = \arg\max_x p(x \mid y, \theta)$. Figure courtesy of Kai Brodersen.
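As a minimal illustration of this direct mapping, one can fit a regularized classifier whose argmax over classes gives the MAP estimate; the data below are simulated and all names and shapes are illustrative, not from the original study:

```python
# Discriminative decoding sketch: learn p(x | y, theta) directly with a
# regularized classifier; the predicted class is argmax_x p(x | y, theta).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
Y = rng.standard_normal((200, 500))   # 200 trials x 500 voxels (simulated)
x = rng.integers(0, 2, 200)           # binary stimulus label per trial

clf = LogisticRegression(penalty="l2", C=1.0, max_iter=1000)
acc = cross_val_score(clf, Y, x, cv=5).mean()
print(f"cross-validated decoding accuracy: {acc:.2f}")
```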

Example: decoding amodal conceptual representations. Animal and tool stimuli presented in four different modalities: pictures, sounds, spoken words, and written words. Which brain regions respond to concept information independent of modality? Can we decode conceptual representations from these areas during free recall?

Searchlight approach (subjects s1, ..., sn). Group-level significance maps, step 1: non-parametric permutation test. For each sphere in each subject: randomly relabel the trials, run the classifier (cross-validated), record the accuracy, and repeat hundreds of times; the resulting null distribution gives the accuracy cutoff at level α against which the observed accuracy is compared.
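A sketch of this permutation scheme for a single sphere, assuming a generic cross-validated classifier (function and variable names are illustrative):

```python
# Permutation test for one searchlight sphere: build a null distribution of
# accuracies by relabelling trials, then take the (1 - alpha) quantile.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC

def permutation_cutoff(Y_sphere, labels, alpha=0.05, n_perm=500, seed=0):
    rng = np.random.default_rng(seed)
    null_acc = np.empty(n_perm)
    for i in range(n_perm):
        shuffled = rng.permutation(labels)            # randomly relabel trials
        null_acc[i] = cross_val_score(LinearSVC(), Y_sphere,
                                      shuffled, cv=5).mean()
    return np.quantile(null_acc, 1 - alpha)           # accuracy cutoff at alpha

# A sphere counts as significant if its observed accuracy exceeds the cutoff.
```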

Searchlight approach, step 2: binomial test over subjects. Each subject contributes a binary outcome per sphere (1 if significant at level α, 0 otherwise); with k significant subjects out of n, $p = 1 - \mathrm{binocdf}(k-1, n, \alpha)$.

Searchlight approach, step 3: selection of significant spheres using p-values FDR-corrected for the number of spheres. Sort the group-level spheres by p-value, $(p_1, \ldots, p_M)$; find the largest $m$ such that $p_m \leq \alpha m / M$; keep spheres $1, \ldots, m$.
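Steps 2 and 3 admit a compact sketch; `binom.sf(k-1, n, alpha)` equals the slide's `1 - binocdf(k-1, n, alpha)`, and the selection rule is Benjamini-Hochberg:

```python
# Group-level inference: binomial test over subjects, then FDR selection.
import numpy as np
from scipy.stats import binom

def binomial_p(k, n, alpha=0.05):
    """Probability of k or more significant subjects out of n when each is
    significant with probability alpha under the null."""
    return binom.sf(k - 1, n, alpha)      # = 1 - binocdf(k-1, n, alpha)

def fdr_select(p_values, alpha=0.05):
    """Benjamini-Hochberg: keep spheres 1..m for the largest m with
    p_(m) <= alpha * m / M."""
    p = np.asarray(p_values)
    order = np.argsort(p)                 # sort spheres by p-value
    M = len(p)
    below = p[order] <= alpha * np.arange(1, M + 1) / M
    if not below.any():
        return np.array([], dtype=int)
    m = np.nonzero(below)[0].max()
    return order[:m + 1]                  # indices of retained spheres
```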

Group-level map for animal versus tool images. Decoding of animals versus tools is driven by early visual areas; however, it could be driven by low-level visual properties.

From single modalities to amodal representations: train on one modality, test on another. Simanova, I., Hagoort, P., Oostenveld, R., & van Gerven, M. (2014). Modality-independent decoding of semantic information from the human brain. Cerebral Cortex.

Decoding amodal conceptual representations: primary sensorimotor areas, convergence zones, and top-down control; decoding also probed during free recall.

Probing spatiotemporal dynamics. Van de Nieuwenhuijzen et al. (2013). MEG-based decoding of the spatiotemporal dynamics of visual category perception. NeuroImage. Also see e.g.: Harrison & Tong (2009). Nature; Sudre et al. (2012). NeuroImage; Carlson et al. (2013). JoV; King & Dehaene (2014). TiCS; Albers et al. (2013). Current Biology; Isik et al. (2014). J. Neurophys.

Decoding representations (discriminative approach): map directly from neural responses $y$ to stimulus $x$.

Decoding representations (generative approach): an encoding model $p(y \mid x)$ is combined with a stimulus prior $p(x)$, and decoding solves $\hat{x} = \arg\max_x \{ p(y \mid x)\, p(x) \}$.

Gaussian framework: encoding model $p(y \mid x) = \mathcal{N}(y; B^\top x, \Sigma)$, stimulus prior $p(x) = \mathcal{N}(x; 0, R)$, decoding $\hat{x} = \arg\max_x \{ p(y \mid x)\, p(x) \}$.

Linear Gaussian encoding model. Without confounds and ignoring the HRF, we get $y_k = X\beta_k + \epsilon$. This boils down to a linear Gaussian model: $p(y_k \mid X, \beta_k, \sigma^2) = \mathcal{N}(y_k; X\beta_k, \sigma^2 I_N)$. The regularized least-squares solution for $\beta_k$, given a Gaussian prior on $\beta_k$, is $\hat{\beta}_k = (X^\top X + \lambda I)^{-1} X^\top y_k$. The matrix $B$ of regression coefficients is given by $[\beta_1, \ldots, \beta_K]$.
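A minimal sketch of this per-voxel ridge solution; the data shapes and regularization strength are illustrative, not from the original study:

```python
# Ridge-regularized linear Gaussian encoding model, solved for all voxels
# at once.
import numpy as np

def fit_encoding_model(X, Y, lam=1.0):
    """Return B = [beta_1, ..., beta_K] with
    beta_k = (X^T X + lam I)^{-1} X^T y_k."""
    P = X.shape[1]
    A = X.T @ X + lam * np.eye(P)
    return np.linalg.solve(A, X.T @ Y)     # P x K coefficient matrix

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 50))         # 100 trials x 50 stimulus features
Y = rng.standard_normal((100, 200))        # 100 trials x 200 voxels
B = fit_encoding_model(X, Y)
```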

Gaussian image prior. For a Gaussian image prior $\mathcal{N}(x; 0, R)$, we compute the covariance matrix $R$, the covariance between each pair of pixels $1, \ldots, M$, using a separate set of handwritten images.
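As a small sketch, the pixel covariance can be estimated from a held-out image set; the image count and pixel dimension below are illustrative stand-ins for the handwritten-image data:

```python
# Estimate the Gaussian prior covariance R from a separate image set,
# each image flattened to an M-pixel vector (random stand-in data).
import numpy as np

images = np.random.default_rng(0).standard_normal((1000, 784))  # M = 784
R = np.cov(images, rowvar=False)   # M x M covariance over pixels
```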

Gaussian decoding. The posterior is given by $p(x \mid y) = \mathcal{N}(x; m, Q)$ with mean $m = Q B \Sigma^{-1} y$ and covariance $Q = (R^{-1} + B \Sigma^{-1} B^\top)^{-1}$. It immediately follows that the most probable stimulus is $\hat{x} = m = (R^{-1} + B \Sigma^{-1} B^\top)^{-1} B \Sigma^{-1} y$. Also see Thirion et al. (2006). NeuroImage.
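A sketch of the posterior-mean computation under these definitions; $\Sigma^{-1}$ and $R^{-1}$ are assumed precomputed, and all shapes are illustrative:

```python
# Gaussian decoding: posterior mean of p(x | y) under y = B^T x + noise.
# B is M pixels x K voxels, Sigma_inv is K x K, R_inv is M x M.
import numpy as np

def gaussian_decode(y, B, Sigma_inv, R_inv):
    """p(x | y) = N(x; m, Q) with Q = (R^{-1} + B Sigma^{-1} B^T)^{-1}
    and m = Q B Sigma^{-1} y; m is the most probable stimulus."""
    Q_inv = R_inv + B @ Sigma_inv @ B.T
    return np.linalg.solve(Q_inv, B @ Sigma_inv @ y)
```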

Example: handwritten characters. Schoenmakers, S., Barth, M., Heskes, T., & van Gerven, M. A. J. (2013). Linear reconstruction of perceived images from human brain activity. NeuroImage, 83, 951–961.

Building better encoding models: $y = f(\varphi(x)) + \epsilon$, where $x$ is the stimulus, $\varphi(x)$ a (non)linear feature space (e.g. a Gabor wavelet pyramid), and $f(\cdot)$ the forward model. What is the optimal $f(\cdot)$? What is the optimal $\epsilon$? What is the optimal $\varphi(x)$? See Naselaris et al. (2011). Encoding and decoding in fMRI. NeuroImage.

What is the optimal $\varphi(x)$? Conceptual representations as instantiations of $\varphi(x)$: $x$ = image pixels (Gaussian model); $\varphi_1(x)$ = local contrast energy; $\varphi_2(x)$ = edges and contours; ...; $\varphi_M(x)$ = objects. How do we obtain these $\varphi_i(x)$?

φ(x) as a manual labelling. Huth, A. G., Nishimoto, S., Vu, A. T., & Gallant, J. L. (2012). A continuous semantic space describes the representation of thousands of object and action categories across the human brain. Neuron. Labelling of objects and actions in each movie frame using 1364 WordNet terms; labour intensive.

φ(x) as a canonical transformation: the Gabor wavelet pyramid (GWP). Kay et al., Nature, 2008. How can this strategy be extended to more complex transformations?
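As a sketch of the building block, a single Gabor wavelet can be generated as a sinusoidal carrier under a Gaussian envelope; all parameter values here are illustrative:

```python
# One Gabor wavelet; a pyramid stacks such filters over scales,
# orientations, and positions, and phi(x) collects the filter responses.
import numpy as np

def gabor(size=32, freq=0.1, theta=0.0, sigma=6.0, phase=0.0):
    half = size // 2
    yy, xx = np.mgrid[-half:half, -half:half]
    xr = xx * np.cos(theta) + yy * np.sin(theta)      # rotated coordinate
    envelope = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    carrier = np.cos(2 * np.pi * freq * xr + phase)
    return envelope * carrier
```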

φ(x) as a result of unsupervised learning:
- neurons are adapted to the statistical properties of their environment
- different brain regions respond to different statistical properties
- nonlinear feature spaces improve encoding
- can we further improve results via unsupervised learning of nonlinear feature spaces?

Two-layer topographic sparse coding model: $y = B^\top \varphi(x)$ with $\varphi(x) = \varphi_2(\varphi_1(x))$. Pipeline: image $x$ → PCA whitening (PW) → simple cells (W) → energy (E) → static nonlinearity (SN) → complex cells (H) → voxels (β). Güçlü, U., & van Gerven, M. A. J. (2014). Unsupervised Feature Learning Improves Prediction of Human Brain Activity in Response to Natural Images. PLoS Comp. Biol. In Press.

Two-layer topographic sparse coding model. Simple-cell activations are given by a linear transformation of whitened image patches $z$: $\varphi_1(x) = Wz$. Complex-cell activations are derived from the pooled energy of simple-cell activations: $\varphi_2(s) = \log(1 + Hs^2)$, where $H$ is a neighbourhood matrix for a square grid with circular boundary conditions.
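A sketch of the resulting transform $\varphi(x) = \varphi_2(\varphi_1(x))$, with random stand-ins for the learned filters $W$ and the pooling matrix $H$:

```python
# Two-layer transform: simple cells (linear filters on whitened patches),
# then complex cells (log of locally pooled energy). W, H, z are stand-ins.
import numpy as np

def two_layer_features(z, W, H):
    s = W @ z                        # simple-cell activations, phi_1
    return np.log1p(H @ s**2)        # complex cells: log(1 + H s^2), phi_2

rng = np.random.default_rng(0)
z = rng.standard_normal(256)         # whitened 16x16 patch, flattened
W = rng.standard_normal((100, 256))  # 100 simple-cell filters
H = (rng.random((100, 100)) < 0.05).astype(float)  # sparse pooling graph
phi = two_layer_features(z, W, H)
```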

Learning of simple-cell features: the matrix $W$ is learned from randomly sampled image patches. [Figure: simple- and complex-cell response curves as a function of phase (°), location (pixels), orientation (°), and spatial frequency (cycles/pixel).]

Sparse coding versus Gabor wavelet pyramid basis. A: sparse coding (SC). B: Gabor wavelet pyramid (GWP).

Analysis of voxel receptive fields. [Figure: receptive-field size (°) as a function of eccentricity (°) for areas V1, V2, and V3.]

Encoding results. [Figure: encoding performance; asterisks mark significant differences.]

Decoding results: image identification and image reconstruction.

Deep learning: a hierarchy of feature spaces $x \to \varphi_1(x) \to \varphi_2(x) \to \cdots \to \varphi_M(x)$.
- unsupervised learning of nonlinear feature spaces provides better encoding and decoding
- deep neural networks offer a model of hierarchically structured representations in visual cortex
- deep belief networks as generative models that capture complex nonlinear stimulus properties

Decoding with deep belief networks. Conditional restricted Boltzmann machine: $E(v, h \mid z) = -h^\top W v - z^\top C v - z^\top B h$ (shallow learning). van Gerven et al. (2010). Neural decoding with hierarchical generative models. Neural Computation, 1–16.

Decoding with deep belief networks. Conditional restricted Boltzmann machine: $E(v, h \mid z) = -h^\top W v - z^\top C v - z^\top B h$ (deep learning). van Gerven et al. (2010). Neural decoding with hierarchical generative models. Neural Computation, 1–16.
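The energy function on the slide translates directly into code; the shapes and random values below are illustrative, and bias terms are omitted as on the slide:

```python
# Conditional RBM energy E(v, h | z) = -h^T W v - z^T C v - z^T B h.
import numpy as np

def crbm_energy(v, h, z, W, C, B):
    return -(h @ W @ v + z @ C @ v + z @ B @ h)

rng = np.random.default_rng(0)
v = rng.integers(0, 2, 50)           # visible units (e.g. stimulus pixels)
h = rng.integers(0, 2, 20)           # hidden units
z = rng.integers(0, 2, 10)           # conditioning units (e.g. voxel data)
W, C, B = (rng.standard_normal(s) for s in [(20, 50), (10, 50), (10, 20)])
print(crbm_energy(v, h, z, W, C, B))
```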

High-throughput model testing. Yamins et al. (2014). Performance-optimized hierarchical models predict neural responses in higher visual cortex. PNAS.

Representational similarity analysis. Yamins et al. (2014). Performance-optimized hierarchical models predict neural responses in higher visual cortex. PNAS. Also see: Kriegeskorte et al. (2008). Neuron; Kriegeskorte & Kievit (2013). TiCS; Pantazis et al. (2014). Nature.

Conclusions:
- Discriminative approaches allow probing of representations
- Generative approaches make our assumptions explicit
- The linear Gaussian model as a baseline for generative decoding
- Unsupervised deep learning for high-throughput analysis

Acknowledgements: Alexander Backus; Ali Bahramisharif; Markus Barth; Christian Doeller; Umut Güçlü; Peter Hagoort; Tom Heskes; Ole Jensen; Floris de Lange; Marieke van de Nieuwenhuijzen; Robert Oostenveld; Sanne Schoenmakers; Irina Simanova