A new look at state-space models for neural data

A new look at state-space models for neural data
Liam Paninski
Department of Statistics and Center for Theoretical Neuroscience, Columbia University
http://www.stat.columbia.edu/~liam
liam@stat.columbia.edu
May 16, 2010
Support: CRCNS, Sloan Fellowship, NSF CAREER, McKnight Scholar award.

State-space models

Unobserved state $q_t$ with Markov dynamics $p(q_{t+1} \mid q_t)$ (i.e., $q_t$ is a noisy dynamical system). Observed $y_t$: $p(y_t \mid q_t)$ (noisy, partial observations). Goal: infer $p(q_t \mid Y_{0:T})$. Dozens of applications in neuroscience (Paninski et al., 2010).
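
A minimal sketch of this generative structure, assuming for concreteness a one-dimensional linear-Gaussian model (the parameter values and variable names below are illustrative, not from the talk):

```python
import numpy as np

# Minimal linear-Gaussian state-space simulation (illustrative parameters).
# State:       q_{t+1} = a*q_t + eps_t,   eps_t ~ N(0, sigma_q^2)   (Markov dynamics)
# Observation: y_t     = q_t + nu_t,      nu_t  ~ N(0, sigma_y^2)   (noisy observations)
rng = np.random.default_rng(0)
T, a, sigma_q, sigma_y = 200, 0.95, 0.5, 1.0

q = np.zeros(T)
for t in range(1, T):
    q[t] = a * q[t - 1] + sigma_q * rng.standard_normal()
y = q + sigma_y * rng.standard_normal(T)
```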

Example: nonstationary spike sorting

$q_t$: mean waveform; $y_t$: observed spikes (Calabrese '10); data from (Harris et al., 2000).

Basic paradigm: the forward recursion

We want $p(q_t \mid Y_{1:t}) \propto p(q_t, Y_{1:t})$. We know that
$$p(Q, Y) = p(Q)\,p(Y \mid Q) = p(q_1) \left( \prod_{t=2}^{T} p(q_t \mid q_{t-1}) \right) \left( \prod_{t=1}^{T} p(y_t \mid q_t) \right).$$
To compute $p(q_t, Y_{1:t})$ recursively, just write out the marginal and pull constants out of the integrals:
$$p(q_t, Y_{1:t}) = \int_{q_1} \cdots \int_{q_{t-1}} p(q_{1:t}, Y_{1:t}) = \int_{q_1} \cdots \int_{q_{t-1}} p(q_1) \left( \prod_{i=2}^{t} p(q_i \mid q_{i-1}) \right) \left( \prod_{i=1}^{t} p(y_i \mid q_i) \right)$$
$$= p(y_t \mid q_t) \int_{q_{t-1}} p(q_t \mid q_{t-1})\, p(y_{t-1} \mid q_{t-1}) \cdots \int_{q_2} p(q_3 \mid q_2)\, p(y_2 \mid q_2) \int_{q_1} p(q_2 \mid q_1)\, p(y_1 \mid q_1)\, p(q_1).$$
So, just recurse
$$p(q_t, Y_{1:t}) = p(y_t \mid q_t) \int_{q_{t-1}} p(q_t \mid q_{t-1})\, p(q_{t-1}, Y_{1:t-1}).$$
Linear-Gaussian (Kalman) case: requires $O(\dim(q)^3 T)$ time; just matrix algebra. Approximate solutions in the more general case, e.g., Gaussian approximations (Brown et al., 1998), or Monte Carlo ("particle filtering"). Key point: efficient recursive computations $\Rightarrow$ $O(T)$ time.
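
A sketch of this forward pass in the linear-Gaussian (Kalman) case, continuing the scalar model simulated above; each update is constant-cost here ($O(\dim(q)^3)$ in general), so the full pass is $O(T)$. The function name and interface are mine, not from the talk:

```python
import numpy as np

def kalman_forward(y, a, sigma_q, sigma_y, mu0=0.0, var0=1.0):
    """Forward recursion for p(q_t | Y_{1:t}) in a scalar linear-Gaussian model."""
    T = len(y)
    mu = np.zeros(T)    # filtered means  E[q_t | Y_{1:t}]
    var = np.zeros(T)   # filtered variances
    m_pred, v_pred = mu0, var0
    for t in range(T):
        # p(q_t, Y_{1:t}) = p(y_t|q_t) * integral of p(q_t|q_{t-1}) p(q_{t-1}, Y_{1:t-1})
        K = v_pred / (v_pred + sigma_y**2)       # Kalman gain
        mu[t] = m_pred + K * (y[t] - m_pred)     # update with observation y_t
        var[t] = (1.0 - K) * v_pred
        m_pred = a * mu[t]                       # predict the next state
        v_pred = a**2 * var[t] + sigma_q**2
    return mu, var

mu_filt, var_filt = kalman_forward(y, a, sigma_q, sigma_y)
```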

Computing the MAP path

We often want to compute the MAP estimate $\hat{Q} = \arg\max_Q p(Q \mid Y)$. In the standard Kalman setting, forward-backward gives the MAP path (because $E(Q \mid Y)$ and $\hat{Q}$ coincide in the Gaussian case). More generally, extended Kalman-based methods give an approximate MAP path, but are non-robust: the forward distribution $p(q_t \mid Y_{0:t})$ may be highly non-Gaussian even if the full joint distribution $p(Q \mid Y)$ is nice and unimodal.

Write out the posterior:
$$\log p(Q \mid Y) = \log p(Q) + \log p(Y \mid Q) + \text{const.} = \sum_t \log p(q_{t+1} \mid q_t) + \sum_t \log p(y_t \mid q_t) + \text{const.}$$
Two basic observations: If $\log p(q_{t+1} \mid q_t)$ and $\log p(y_t \mid q_t)$ are concave, then so is $\log p(Q \mid Y)$. The Hessian $H$ of $\log p(Q \mid Y)$ is block-tridiagonal: $\log p(y_t \mid q_t)$ contributes a block-diagonal term, and $\log p(q_{t+1} \mid q_t)$ contributes a block-tridiagonal term. Now recall Newton's method: iteratively solve $H Q_{\mathrm{dir}} = \nabla$ and step along $Q_{\mathrm{dir}}$. Solving block-tridiagonal systems requires $O(T)$ time, so computing the MAP path by Newton's method requires $O(T)$ time, even in highly non-Gaussian cases. Newton here acts as an iteratively reweighted Kalman smoother (Fahrmeir and Kaufmann, 1991; Davis and Rodriguez-Yam, 2005; Jungbacker and Koopman, 2007); all sufficient statistics may be obtained in $O(T)$ time.
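
A sketch of the key computational point: with a tridiagonal Hessian, each Newton step is a banded solve costing $O(T)$. The observation model below (Poisson counts with log-rate $q_t$, scalar AR(1) prior) and the function name are illustrative choices of mine, and the undamped Newton step would want a line search in practice:

```python
import numpy as np
from scipy.linalg import solveh_banded

def map_path_newton(y, a=0.95, sigma_q=0.5, n_iter=20):
    """Newton's method for the MAP path under a scalar AR(1) prior
    (q_1 ~ N(0, sigma_q^2), q_{t+1} = a*q_t + noise) and Poisson observations
    y_t ~ Poisson(exp(q_t)); the Hessian is tridiagonal, so each step is O(T)."""
    T = len(y)
    q = np.zeros(T)
    # Tridiagonal prior precision J (the negative Hessian of log p(Q)).
    diag_J = np.full(T, (1.0 + a**2) / sigma_q**2)
    diag_J[-1] = 1.0 / sigma_q**2
    off_J = np.full(T - 1, -a / sigma_q**2)
    for _ in range(n_iter):
        lam = np.exp(q)
        Jq = diag_J * q + np.r_[0.0, off_J * q[:-1]] + np.r_[off_J * q[1:], 0.0]
        grad = (y - lam) - Jq                  # gradient of the log posterior
        ab = np.zeros((2, T))                  # banded (upper) storage for -Hessian
        ab[0, 1:] = off_J                      # superdiagonal (prior coupling)
        ab[1] = diag_J + lam                   # diagonal: prior + likelihood curvature
        q = q + solveh_banded(ab, grad)        # O(T) banded Newton step (undamped)
    return q

# Example: recover a smooth log-rate path from simulated Poisson counts.
rng = np.random.default_rng(1)
q_true = np.cumsum(0.1 * rng.standard_normal(500))
q_map = map_path_newton(rng.poisson(np.exp(q_true)))
```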

Constrained optimization

In many cases we need to impose constraints (e.g., nonnegativity) on $q_t$. These are easy to incorporate here via interior-point (barrier) methods:
$$\arg\max_{Q \in C} \log p(Q \mid Y) = \lim_{\epsilon \searrow 0} \arg\max_{Q} \left\{ \log p(Q \mid Y) + \epsilon \sum_t f(q_t) \right\} = \lim_{\epsilon \searrow 0} \arg\max_{Q} \sum_t \left\{ \log p(q_{t+1} \mid q_t) + \log p(y_t \mid q_t) + \epsilon f(q_t) \right\},$$
where $f(\cdot)$ is concave and approaches $-\infty$ near the boundary of the constraint set $C$. The Hessian remains block-tridiagonal and negative semidefinite for all $\epsilon > 0$, so the optimization still requires just $O(T)$ time.
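
A compact sketch of the barrier idea for a nonnegativity constraint $q_t \ge 0$: maximize $\log p(Q \mid Y) + \epsilon \sum_t \log q_t$ for a decreasing schedule of $\epsilon$. For brevity this uses a generic bounded optimizer on a toy model (AR(1) prior, Gaussian observations; all names and parameters are mine); the banded Newton machinery above is what gives the $O(T)$ cost quoted on the slide:

```python
import numpy as np
from scipy.optimize import minimize

def map_nonneg_barrier(y, a=0.95, sigma_q=0.5, eps_schedule=(1.0, 0.1, 0.01, 0.001)):
    """Barrier-method sketch: maximize log p(Q|Y) + eps*sum(log q_t) over q_t > 0,
    shrinking eps toward zero.  Toy model: AR(1) prior, y_t = q_t + Gaussian noise."""
    T = len(y)

    def neg_obj(q, eps):
        log_prior = -0.5 * np.sum((q[1:] - a * q[:-1]) ** 2) / sigma_q**2
        log_lik = -0.5 * np.sum((y - q) ** 2)
        barrier = eps * np.sum(np.log(q))      # concave, -inf at the boundary q_t = 0
        return -(log_prior + log_lik + barrier)

    q = np.maximum(y, 0.1)                     # strictly feasible starting point
    for eps in eps_schedule:                   # warm-start each solve at the last one
        q = minimize(neg_obj, q, args=(eps,), method="L-BFGS-B",
                     bounds=[(1e-9, None)] * T).x
    return q
```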

Example: computing the MAP subthreshold voltage given superthreshold spikes

Leaky, noisy integrate-and-fire model:
$$V(t + dt) = V(t) - dt\, V(t)/\tau + \sigma \sqrt{dt}\, \epsilon_t, \quad \epsilon_t \sim \mathcal{N}(0, 1).$$
Observations: $y_t = 0$ (no spike) if $V_t < V_{th}$; $y_t = 1$ if $V_t = V_{th}$ (Paninski, 2006).
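
A sketch simulating this leaky, noisy integrate-and-fire model in discrete time (function name, reset rule, and parameter values are illustrative choices of mine, picked so that threshold crossings actually occur); the observed data are just the spike indicators, and the task on this slide is to infer the subthreshold voltage path between them:

```python
import numpy as np

def simulate_lif(T=2000, dt=1e-3, tau=0.02, sigma=5.0, v_th=1.0, v_reset=0.0, seed=0):
    """Leaky, noisy integrate-and-fire:
    V(t+dt) = V(t) - dt*V(t)/tau + sigma*sqrt(dt)*eps_t, with reset at threshold."""
    rng = np.random.default_rng(seed)
    V = np.zeros(T)
    spikes = np.zeros(T, dtype=int)
    for t in range(1, T):
        V[t] = V[t - 1] - dt * V[t - 1] / tau + sigma * np.sqrt(dt) * rng.standard_normal()
        if V[t] >= v_th:        # superthreshold: record a spike, reset the voltage
            spikes[t] = 1
            V[t] = v_reset
    return V, spikes

V, spikes = simulate_lif()
```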

Example: inferring presynaptic input

$$I_t = \sum_j g_j(t)\,(V_j - V_t), \qquad g_j(t + dt) = g_j(t) - dt\, g_j(t)/\tau_j + N_j(t), \quad N_j(t) > 0.$$

[Figure: true conductances $g$, injected current $I(t)$, and estimated conductances $g$, plotted against time (sec).]

Example: inferring spike times from slow, noisy calcium data

$$C(t + dt) = C(t) - dt\, C(t)/\tau + N_t; \quad N_t > 0; \quad y_t = C_t + \epsilon_t.$$
Nonnegative deconvolution is a recurring problem in signal processing; many other possible applications (Vogelstein et al., 2008).
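
A sketch of the nonnegative deconvolution problem this slide describes: since $C$ is the nonnegative input $N$ filtered by a decaying exponential, $N$ can be recovered from noisy $y$ by nonnegativity-constrained least squares. For brevity this builds a dense convolution matrix and uses scipy.optimize.nnls, which is only practical for short traces; the tridiagonal interior-point approach of Vogelstein et al. (2008) is the $O(T)$ version:

```python
import numpy as np
from scipy.optimize import nnls

def deconvolve_calcium(y, dt=1e-2, tau=0.5):
    """Recover nonnegative inputs N_t from y_t = C_t + noise, where
    C(t+dt) = C(t) - dt*C(t)/tau + N_t."""
    T = len(y)
    decay = 1.0 - dt / tau
    # Lower-triangular convolution matrix: C_t = sum_{s <= t} decay**(t - s) * N_s
    A = np.tril(decay ** np.subtract.outer(np.arange(T), np.arange(T)))
    N_hat, _ = nnls(A, y)
    return N_hat
```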

Optimal control of spike timing

Optimal experimental design and neural prosthetics applications require us to perturb the network at will. How can we make a neuron fire exactly when we want it to? Assume bounded inputs; otherwise the problem is trivial. Start with a simple model: $\lambda_t = f(k \cdot I_t + h_t)$. Now we can just optimize the likelihood of the desired spike train as a function of the input $I_t$, with $I_t$ bounded. Concave objective function over a convex set of possible inputs $I_t$, plus a banded Hessian $\Rightarrow$ $O(T)$ optimization.
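
A sketch of that optimization for the simple model with $f = \exp$ and a scalar stimulus filter: maximize the point-process log-likelihood of the target spike train over the box-constrained input $I_t$. The objective is concave in $I$; for brevity a generic bounded optimizer is used here rather than the banded $O(T)$ Newton solver, and all names and parameters are illustrative:

```python
import numpy as np
from scipy.optimize import minimize

def design_input(target_spikes, k=1.0, h=None, I_max=2.0, dt=1e-3):
    """Choose bounded input I_t maximizing the likelihood of a target spike train
    under the point-process model lambda_t = exp(k*I_t + h_t)."""
    T = len(target_spikes)
    h = np.zeros(T) if h is None else h

    def neg_loglik(I):
        log_lam = k * I + h
        # log-likelihood (up to a constant): sum_t [ y_t*log(lambda_t) - lambda_t*dt ]
        return -np.sum(target_spikes * log_lam - np.exp(log_lam) * dt)

    res = minimize(neg_loglik, np.zeros(T), method="L-BFGS-B",
                   bounds=[(-I_max, I_max)] * T)
    return res.x

# Example: ask for spikes at bins 100, 250, and 400 of a 500-bin train.
target = np.zeros(500); target[[100, 250, 400]] = 1
I_opt = design_input(target)
```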

Optimal electrical control of spike timing

[Figure: target response, optimal stimulus, and elicited responses, plotted over 0-800 ms.]

Example: intracellular control of spike timing

[Figure: target spikes and responses under input bounds $I_{max}$ = 0.76, 1.26, 1.76, and 2.04, plotted over 0-500 ms.] (Ahmadian et al., 2010)

Optical conductance-based control of spiking

$$V_{t+dt} = V_t + dt\left( -g V_t + g^i_t (V_i - V_t) + g^e_t (V_e - V_t) \right) + \sqrt{dt}\,\sigma\epsilon_t, \quad \epsilon_t \sim \mathcal{N}(0, 1)$$
$$g^i_{t+dt} = g^i_t + dt\left( -\frac{g^i_t}{\tau_i} + a_{ii} L^i_t + a_{ie} L^e_t \right); \qquad g^e_{t+dt} = g^e_t + dt\left( -\frac{g^e_t}{\tau_e} + a_{ee} L^e_t + a_{ei} L^i_t \right)$$

[Figure: target spike train, voltage trace, light intensity (E and I channels), and induced spike trains, plotted over 0-200 ms.]

Conclusions

GLM and state-space approaches provide flexible, powerful methods for answering key questions in neuroscience. There are close relationships between the forward-backward methods familiar from state-space theory and the banded matrices familiar from spline theory. Log-concavity and banded-matrix methods make the computations very tractable.

References

Brown, E., Frank, L., Tang, D., Quirk, M., and Wilson, M. (1998). A statistical paradigm for neural spike train decoding applied to position prediction from ensemble firing patterns of rat hippocampal place cells. Journal of Neuroscience, 18:7411-7425.

Davis, R. and Rodriguez-Yam, G. (2005). Estimation for state-space models: an approximate likelihood approach. Statistica Sinica, 15:381-406.

Fahrmeir, L. and Kaufmann, H. (1991). On Kalman filtering, posterior mode estimation and Fisher scoring in dynamic exponential family regression. Metrika, 38:37-60.

Harris, K., Henze, D., Csicsvari, J., Hirase, H., and Buzsaki, G. (2000). Accuracy of tetrode spike separation as determined by simultaneous intracellular and extracellular measurements. J. Neurophys., 84:401-414.

Jungbacker, B. and Koopman, S. (2007). Monte Carlo estimation for nonlinear non-Gaussian state space models. Biometrika, 94:827-839.

Paninski, L. (2006). The most likely voltage path and large deviations approximations for integrate-and-fire neurons. Journal of Computational Neuroscience, 21:71-87.

Paninski, L., Ahmadian, Y., Ferreira, D., Koyama, S., Rahnama, K., Vidne, M., Vogelstein, J., and Wu, W. (2010). A new look at state-space models for neural data. Journal of Computational Neuroscience, in press.

Vogelstein, J., Babadi, B., Watson, B., Yuste, R., and Paninski, L. (2008). Fast nonnegative deconvolution via tridiagonal interior-point methods, applied to calcium fluorescence data. Statistical analysis of neural data (SAND) conference.