Extracellular Recording and Spike Train Analysis
1 Extracellular Recording and Spike Train Analysis. Christophe Pouzat, CNRS UMR 8118 and Paris-Descartes University, Paris, France. ENP 2009: July 3rd.
2 Outline
3 Outline
4 Extracellular multi-electrode recordings are attractive. Why? They produce a (whole) lot of data with moderate tissue damage. They potentially provide sub-millisecond resolution of neuronal spiking activity. They are relatively cheap to implement. But... They require a lot of processing. They cannot provide some very important pieces of information, like the neuronal cell type, the sub-threshold activity, or the morphology of the recorded cells.
5 Outline
6 Multi-Electrode In Vivo Recordings in Insects. From the outside, the neuronal activity appears as brief electrical impulses: the action potentials or spikes. Left, the brain and the probe with 16 electrodes (bright spots). Width of one probe shank: 80 µm. Right, 1 sec of raw data from 4 electrodes. The local extrema are the action potentials.
7 More info on the experimental setting. The brain shown in the figure belongs to a locust (Schistocerca americana). The part of the brain close to the probe is the antennal lobe, the first olfactory relay of the insect brain. Its diameter is 400 µm. The thickness of the probe's shanks is µm. The electrodes are squares of 13 × 13 µm². The center of one tetrode is separated by 150 µm from its two nearest neighbors. The data shown on the right-hand side of the figure come from the lowest right tetrode, which was located approximately 100 µm below the antennal lobe surface. The data were filtered between 300 Hz and 5 kHz.
8 Why Are Tetrodes Used? The last 200 ms of the previous figure. With the upper site only, it would be difficult to properly classify the first two large spikes (**). With the lower site only, it would be difficult to properly classify the two spikes labeled ##.
9 Tetrodes are not always useful. You should not conclude from the last slide that tetrodes will always improve the classification of the recorded events. The gain due to tetrodes depends a lot on the tissue and/or on the species. For the antennal lobe, for instance, tetrodes are required to get good classification in the locust, Schistocerca americana; they do not bring anything in the cockroach, Periplaneta americana, where the density of active neurites is much lower, so that isolated sites already perform well. In the honeybee, Apis mellifera, the density of active neurites is so high that the signal-to-noise ratio on each site makes the data useless.
10 Other Experimental Techniques Can Also Be Used. A single-neuron patch-clamp recording coupled to calcium imaging. Data from Moritz Paehler and Peter Kloppenburg (Cologne University).
11 Limitations. The last slide reminds you that it is always a good idea to use several techniques on the same preparation. Extracellular recordings have many advantages, but this comes at the price of harder-to-interpret data. In particular, the question of which cell type is / was recorded is very difficult to answer by observing only the spike trains. It is then very useful to get other viewpoints on your favorite system, for instance with patch or sharp electrode recordings combined with a morphological study.
12 Outline
13 Data Preprocessing: Spike Sorting. To exploit our data we must first: Find out how many neurons are recorded. For each neuron, estimate some features like the spike waveform, the discharge statistics, etc. For each detected event, find the probability with which each neuron could have generated it. Find an automatic method to answer these questions.
14 A Similar Problem (1). Think of a room with many seated people who are talking to each other using a language we do not know. Assume that microphones were placed in the room and that their signals are given to us. Our task is to isolate the discourse of each person.
15 A Similar Problem (2) We have therefore a situation like... With apologies to Brueghel...
16 A Similar Problem (3). To fulfill our task we could make use of the following features: Some people have a low-pitch voice while others have a high-pitch one. Some people speak loudly while others do not. One person can be close to one microphone and far from another, so that his or her speech is simultaneously recorded by the two with different amplitudes. Some people speak all the time while others just utter a comment here and there; that is, the discourse statistics change from person to person.
17 Spike Sorting as a Set of Standard Statistical Problems. Efficient spike sorting requires: 1. Event detection followed by event-space dimension reduction. 2. A clustering stage. This can be partially or fully automated depending on the data. 3. Event classification.
18 Avoid losing time. As I just wrote, spike sorting involves a sequence of standard statistical problems. So do not lose (too much) time with the dedicated spike-sorting literature, whose average quality is appallingly low! A good reference on classification / clustering problems is Brian Ripley's book (1996) Pattern Recognition and Neural Networks. For statistics in general, A. C. Davison (2003) Statistical Models is an excellent start. It is also very important to choose good software implementing all the classical clustering algorithms: k-means, Gaussian mixtures, etc. Don't hesitate: go for R.
19 Detection illustration
20 Dimension Reduction Site 1 Site 2 Site 3 Site 4 Left, 100 spikes. Right, 1000 spikes projected on the subspace defined by the first 4 principal components.
21 Clustering Visualisation. Gaussian mixture clustering (number of clusters minimizing the BIC) with mclust. Projection plane selected with the projection pursuit of GGobi.
22 Spikes Attributed to Two Neurons of the Model. Spikes of neurons 3 and 10, shown on sites 1 to 4.
23 Outline
24 After the rather heavy spike-sorting pre-processing stage, spike trains are obtained.
25 Studying spike trains per se A central working hypothesis of systems neuroscience is that action potential or spike occurrence times, as opposed to spike waveforms, are the sole information carrier between brain regions. This hypothesis legitimates and leads to the study of spike trains per se. It also encourages the development of models whose goal is to predict the probability of occurrence of a spike at a given time, without necessarily considering the biophysical spike generation mechanisms.
26 Spike trains are not Poisson processes. The raw data of one bursty neuron of the cockroach antennal lobe: 1 minute of spontaneous activity.
27 Spike trains are not renewal processes. Some renewal tests applied to the previous data.
28 A counting process formalism (1). Probabilists and statisticians working on series of events whose only (or most prominent) feature is their occurrence time (car accidents, earthquakes) use a formalism based on the following three quantities (Brillinger, 1988, Biol Cybern 59:189). Counting Process: For points {t_j} randomly scattered along a line, the counting process N(t) gives the number of points observed in the interval (0, t]: N(t) = #{t_j : 0 < t_j ≤ t}, where # stands for the cardinality (number of elements) of a set.
29 A counting process formalism (2). History: The history, H_t, consists of the variates determined up to and including time t that are necessary to describe the evolution of the counting process. Conditional Intensity: For the process N and history H_t, the conditional intensity at time t is defined as: λ(t | H_t) = lim_{h→0} Prob{event in (t, t+h] | H_t} / h. For small h one has the interpretation: Prob{event in (t, t+h] | H_t} ≈ λ(t | H_t) h.
30 Goodness-of-fit tests for counting processes. All goodness-of-fit tests derive from a mapping, or time transformation, of the observed process realization. Namely, one introduces the integrated conditional intensity: Λ(t) = ∫₀ᵗ λ(u | H_u) du. If Λ is correct, it is not hard to show that the process defined by {t_1, ..., t_n} → {Λ(t_1), ..., Λ(t_n)} is a Poisson process with rate 1.
31 An illustration with simulated data. Time transformation illustrated
32 A goodness-of-fit test based on Donsker's theorem. Y. Ogata (1988, JASA, 83:9) introduced several procedures testing the time-transformed event sequence against the uniform Poisson hypothesis. We propose an additional test built as follows: X_j = Λ(t_{j+1}) − Λ(t_j) − 1, S_m = Σ_{j=1}^{m} X_j, W_n(t) = S_{⌊nt⌋} / √n. Donsker's theorem (Billingsley, 1999, pp 86-91) states that if Λ is correct then W_n converges weakly to a standard Wiener process. We therefore test whether the observed W_n stays within the tight confidence bands obtained by Kendall et al (2007, Statist Comput 17:1) for standard Wiener processes.
33 Illustration of the proposed test. The proposed test applied to the simulated data. The boundaries have the form: f(t; a, b) = a + b √t.
34 Where Are We? We are now in the fairly unusual situation (from the neuroscientist's viewpoint) of knowing how to show that the model we entertain is wrong, without having an explicit expression for this model... We need a way to find candidates for the conditional intensity λ(t | H_t).
35 What Do We Put in H_t? It is common to summarize the stationary discharge of a neuron by its inter-spike interval (ISI) histogram. If the latter is not a pure decreasing mono-exponential, then λ(t | H_t) will at least depend on the elapsed time since the last spike: t − t_l. For the real data we saw previously, we also expect at least a dependence on the length of the previous inter-spike interval, isi₁. We would then have: λ(t | H_t) = λ(t − t_l, isi₁).
36 What About The Functional Form? We haven't even started yet and we are already considering a function of at least 2 variables: t − t_l and isi₁. What about its functional form? Following Brillinger (1988), we discretize our time axis into bins of size h small enough to contain at most 1 spike per bin. We are then led to a binomial regression problem. For analytical and computational convenience we use the logistic transform: log[ λ(t − t_l, isi₁) h / (1 − λ(t − t_l, isi₁) h) ] = η(t − t_l, isi₁).
37 Smoothing spline (0). Since cellular biophysics does not provide much guidance on how to build η(t − t_l, isi₁), we have chosen the nonparametric smoothing-spline approach implemented in the gss package. η(t − t_l, isi₁) is then uniquely decomposed as: η(t − t_l, isi₁) = η_∅ + η_l(t − t_l) + η_1(isi₁) + η_{l,1}(t − t_l, isi₁), where for instance: ∫ η_1(u) du = 0, the integral being evaluated on the definition domain of the variable isi₁.
38 Smoothing spline (1). Given data: Y_i = η(x_i) + ε_i, i = 1, ..., n, where x_i ∈ [0, 1] and ε_i ~ N(0, σ²), we want to find η_ρ minimizing: (1/n) Σ_{i=1}^{n} (Y_i − η_ρ(x_i))² + ρ ∫₀¹ (d²η_ρ/dx²)² dx.
39 Smoothing spline (2). A simple example with simulated data.
40 Smoothing spline (3). It can be shown (but it is not exactly trivial) that, for a given ρ, the solution of the functional minimization problem can be expressed on a finite basis: η_ρ(x) = Σ_{ν=0}^{m−1} d_ν φ_ν(x) + Σ_{i=1}^{n} c_i R₁(x_i, x), where the functions φ_ν() and R₁(x_i, ·) are known.
41 Smoothing spline (4). Figure: the unpenalized constant and linear terms, together with several penalized basis functions (e.g., #40 and #60).
42 Smoothing spline (5): What about ρ? Fits obtained with λ = 0.5, 0.05, ..., 5e−04, 5e−05 and 5e−06.
43 Smoothing spline (6): Cross-validation. Ideally we would like the ρ minimizing: (1/n) Σ_{i=1}^{n} (η_ρ(x_i) − η(x_i))², but we don't know the true η. So we choose the ρ minimizing: V₀(ρ) = (1/n) Σ_{i=1}^{n} (η_ρ^{[i]}(x_i) − Y_i)², where η_ρ^{[k]} is the minimizer of the delete-one functional: (1/n) Σ_{i≠k} (Y_i − η_ρ(x_i))² + ρ ∫₀¹ (d²η_ρ/dx²)² dx.
44 Smoothing spline (7). Fits obtained for several values of λ.
45 Smoothing spline (8). The RSS, GCV and exact criteria as functions of log(λ).
46 Smoothing spline (9). The theory (worked out by Grace Wahba) also gives us confidence bands. Figure: data, confidence band and actual η.
47 Application to real data. We fitted to the last 30 s of the data set the following additive model: event ∼ (t − t_l) + isi₁.
48 The tests applied to the first 30 s
49 The functional forms
50 Outline
51 I would like to warmly thank: Matthieu Delescluse, Ofer Mazor, Gilles Laurent, Jean Diebolt and Pascal Viot for working with (or allowing me to work) on the spike sorting problem. Antoine Chaffiol and the whole Kloppenburg lab (Univ. of Cologne) for providing high-quality data and for being patient enough with a slow developer like myself. Chong Gu, Pierre Bohec, Vilmos Prokaj, Olivier Faugeras, Jonathan Touboul and Carl van Vreeswijk for working with me or providing feedback on spike train analysis. The AFM / Decrypthon project and the CNRS GDR 2904 for funding my research. The statisticians for showing me an example of a healthy scientific community. You for listening to / reading me up to this point.
More informationICA. Independent Component Analysis. Zakariás Mátyás
ICA Independent Component Analysis Zakariás Mátyás Contents Definitions Introduction History Algorithms Code Uses of ICA Definitions ICA Miture Separation Signals typical signals Multivariate statistics
More informationSections 18.6 and 18.7 Analysis of Artificial Neural Networks
Sections 18.6 and 18.7 Analysis of Artificial Neural Networks CS4811 - Artificial Intelligence Nilufer Onder Department of Computer Science Michigan Technological University Outline Univariate regression
More informationCS 188: Artificial Intelligence Fall 2011
CS 188: Artificial Intelligence Fall 2011 Lecture 20: HMMs / Speech / ML 11/8/2011 Dan Klein UC Berkeley Today HMMs Demo bonanza! Most likely explanation queries Speech recognition A massive HMM! Details
More informationMachine Learning Lecture 7
Course Outline Machine Learning Lecture 7 Fundamentals (2 weeks) Bayes Decision Theory Probability Density Estimation Statistical Learning Theory 23.05.2016 Discriminative Approaches (5 weeks) Linear Discriminant
More information3.4 Linear Least-Squares Filter
X(n) = [x(1), x(2),..., x(n)] T 1 3.4 Linear Least-Squares Filter Two characteristics of linear least-squares filter: 1. The filter is built around a single linear neuron. 2. The cost function is the sum
More informationDeep Learning for Gravitational Wave Analysis Results with LIGO Data
Link to these slides: http://tiny.cc/nips arxiv:1711.03121 Deep Learning for Gravitational Wave Analysis Results with LIGO Data Daniel George & E. A. Huerta NCSA Gravity Group - http://gravity.ncsa.illinois.edu/
More informationOnline Learning and Sequential Decision Making
Online Learning and Sequential Decision Making Emilie Kaufmann CNRS & CRIStAL, Inria SequeL, emilie.kaufmann@univ-lille.fr Research School, ENS Lyon, Novembre 12-13th 2018 Emilie Kaufmann Online Learning
More informationClassification: The rest of the story
U NIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN CS598 Machine Learning for Signal Processing Classification: The rest of the story 3 October 2017 Today s lecture Important things we haven t covered yet Fisher
More informationSTATISTICAL METHODS FOR EXPLORING NEURONAL INTERACTIONS
STATISTICAL METHODS FOR EXPLORING NEURONAL INTERACTIONS by Mengyuan Zhao B.S. Probability & Statistics, Peking University, Beijing, China 2005 M.A. Statistics, University of Pittsburgh, Pittsburgh, PA
More information7 Rate-Based Recurrent Networks of Threshold Neurons: Basis for Associative Memory
Physics 178/278 - David Kleinfeld - Fall 2005; Revised for Winter 2017 7 Rate-Based Recurrent etworks of Threshold eurons: Basis for Associative Memory 7.1 A recurrent network with threshold elements The
More informationComputational Genomics
Computational Genomics http://www.cs.cmu.edu/~02710 Introduction to probability, statistics and algorithms (brief) intro to probability Basic notations Random variable - referring to an element / event
More informationProbabilistic modeling. The slides are closely adapted from Subhransu Maji s slides
Probabilistic modeling The slides are closely adapted from Subhransu Maji s slides Overview So far the models and algorithms you have learned about are relatively disconnected Probabilistic modeling framework
More informationIntroduction to Convolutional Neural Networks 2018 / 02 / 23
Introduction to Convolutional Neural Networks 2018 / 02 / 23 Buzzword: CNN Convolutional neural networks (CNN, ConvNet) is a class of deep, feed-forward (not recurrent) artificial neural networks that
More informationHigh-frequency data modelling using Hawkes processes
Valérie Chavez-Demoulin joint work with High-frequency A.C. Davison data modelling and using A.J. Hawkes McNeil processes(2005), J.A EVT2013 McGill 1 /(201 High-frequency data modelling using Hawkes processes
More informationRandom variables. DS GA 1002 Probability and Statistics for Data Science.
Random variables DS GA 1002 Probability and Statistics for Data Science http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall17 Carlos Fernandez-Granda Motivation Random variables model numerical quantities
More informationAlgorithm Independent Topics Lecture 6
Algorithm Independent Topics Lecture 6 Jason Corso SUNY at Buffalo Feb. 23 2009 J. Corso (SUNY at Buffalo) Algorithm Independent Topics Lecture 6 Feb. 23 2009 1 / 45 Introduction Now that we ve built an
More information10/8/2014. The Multivariate Gaussian Distribution. Time-Series Plot of Tetrode Recordings. Recordings. Tetrode
10/8/014 9.07 INTRODUCTION TO STATISTICS FOR BRAIN AND COGNITIVE SCIENCES Lecture 4 Emery N. Brown The Multivariate Gaussian Distribution Case : Probability Model for Spike Sorting The data are tetrode
More informationHigh-frequency data modelling using Hawkes processes
High-frequency data modelling using Hawkes processes Valérie Chavez-Demoulin 1 joint work J.A McGill 1 Faculty of Business and Economics, University of Lausanne, Switzerland Boulder, April 2016 Boulder,
More informationKernel methods and the exponential family
Kernel methods and the exponential family Stéphane Canu 1 and Alex J. Smola 2 1- PSI - FRE CNRS 2645 INSA de Rouen, France St Etienne du Rouvray, France Stephane.Canu@insa-rouen.fr 2- Statistical Machine
More informationIntroduction to Machine Learning Midterm Exam
10-701 Introduction to Machine Learning Midterm Exam Instructors: Eric Xing, Ziv Bar-Joseph 17 November, 2015 There are 11 questions, for a total of 100 points. This exam is open book, open notes, but
More informationSections 18.6 and 18.7 Artificial Neural Networks
Sections 18.6 and 18.7 Artificial Neural Networks CS4811 - Artificial Intelligence Nilufer Onder Department of Computer Science Michigan Technological University Outline The brain vs. artifical neural
More informationThe Curse of Dimensionality for Local Kernel Machines
The Curse of Dimensionality for Local Kernel Machines Yoshua Bengio, Olivier Delalleau & Nicolas Le Roux April 7th 2005 Yoshua Bengio, Olivier Delalleau & Nicolas Le Roux Snowbird Learning Workshop Perspective
More informationMachine Learning, Fall 2009: Midterm
10-601 Machine Learning, Fall 009: Midterm Monday, November nd hours 1. Personal info: Name: Andrew account: E-mail address:. You are permitted two pages of notes and a calculator. Please turn off all
More informationEEG- Signal Processing
Fatemeh Hadaeghi EEG- Signal Processing Lecture Notes for BSP, Chapter 5 Master Program Data Engineering 1 5 Introduction The complex patterns of neural activity, both in presence and absence of external
More informationSYDE 372 Introduction to Pattern Recognition. Probability Measures for Classification: Part I
SYDE 372 Introduction to Pattern Recognition Probability Measures for Classification: Part I Alexander Wong Department of Systems Design Engineering University of Waterloo Outline 1 2 3 4 Why use probability
More information