Correlations in Populations: Information-Theoretic Limits

Correlations in Populations: Information-Theoretic Limits
Don H. Johnson and Ilan N. Goodman
dhj@rice.edu
Department of Electrical & Computer Engineering, Rice University, Houston, Texas

Population coding
Describe a population as parallel point-process channels.
Variations: separate inputs, a common input, dependence among channels.
What do information-theoretic considerations suggest is best?
[Figure: parallel channel diagrams with inputs X (or X_1, X_2, ..., X_N) and outputs Y_1, Y_2, ..., Y_N]

Modeling approach
We would like to use point-process models for the outputs, but it is technically very difficult to describe connection-induced dependencies that way.
Instead, use simpler Bernoulli models, which are capable of describing complex correlation structures.
Assume homogeneous populations, with $p = P(X_j = 1)$ the same for every neuron:
$$P(X_1, X_2, \ldots, X_N) = P(X_1) P(X_2) \cdots P(X_N) \left[ 1 + \rho^{(2)} \sum_{i>j} \frac{(X_i - p)(X_j - p)}{p(1-p)} + \rho^{(3)} \sum_{i>j>k} \frac{(X_i - p)(X_j - p)(X_k - p)}{[p(1-p)]^{3/2}} + \cdots \right]$$
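
To make the Bernoulli model concrete, here is a minimal sketch (not from the talk) that builds the joint pmf of N homogeneous Bernoulli neurons from the pairwise term of the expansion above; the function name sarmanov_pmf and the parameter values are illustrative assumptions.

```python
# Minimal sketch: pairwise-only Sarmanov-Lancaster expansion for N homogeneous
# Bernoulli(p) neurons. Names and parameter values are illustrative, not the talk's.
import itertools
import numpy as np

def sarmanov_pmf(N, p, rho):
    """Joint pmf over {0,1}^N: product of Bernoulli(p) marginals times
    [1 + rho * sum_{i>j} (x_i - p)(x_j - p) / (p(1-p))]."""
    pmf = {}
    for x in itertools.product((0, 1), repeat=N):
        xa = np.array(x)
        indep = np.prod(p**xa * (1 - p)**(1 - xa))
        z = (xa - p) / np.sqrt(p * (1 - p))          # standardized spike indicators
        pair_sum = (z.sum()**2 - (z**2).sum()) / 2   # equals sum over i>j of z_i z_j
        pmf[x] = indep * (1 + rho * pair_sum)
    return pmf

pmf = sarmanov_pmf(N=3, p=0.1, rho=0.2)
print(sum(pmf.values()))                 # 1.0: the correction terms sum to zero
print(min(pmf.values()) >= 0)            # True only for a restricted range of rho
```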

A note on modeling
Correlation (orthogonal) model:
$$P(X_1, X_2, \ldots, X_N) = P(X_1) P(X_2) \cdots P(X_N) \left[ 1 + \rho^{(2)} \sum_{i>j} \frac{(X_i - p_i)(X_j - p_j)}{\sqrt{p_i(1-p_i)\,p_j(1-p_j)}} + \rho^{(3)} \sum_{i>j>k} \frac{(X_i - p_i)(X_j - p_j)(X_k - p_k)}{\sqrt{p_i(1-p_i)\,p_j(1-p_j)\,p_k(1-p_k)}} + \cdots \right]$$
Exponential model:
$$P(X_1, X_2, \ldots, X_N) \propto \exp\left( \sum_i \theta_i X_i + \sum_{i,j} \theta_{ij} X_i X_j + \sum_{i,j,k} \theta_{ijk} X_i X_j X_k + \cdots \right)$$
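
For comparison, a minimal sketch of the exponential parameterization truncated at pairwise terms, normalized by brute-force enumeration; the parameter arrays theta1 and theta2 are illustrative assumptions, not values from the talk.

```python
# Sketch: pairwise exponential model P(x) proportional to
# exp(sum_i theta1[i] x_i + sum_{i<j} theta2[i,j] x_i x_j). Parameters are illustrative.
import itertools
import numpy as np

def exponential_pmf(theta1, theta2):
    N = len(theta1)
    states = list(itertools.product((0, 1), repeat=N))
    weights = []
    for s in states:
        x = np.array(s, dtype=float)
        weights.append(np.exp(theta1 @ x + x @ np.triu(theta2, k=1) @ x))
    Z = sum(weights)                                 # partition function
    return {s: w / Z for s, w in zip(states, weights)}

theta1 = np.array([-2.0, -2.0, -2.0])                # biases give low firing probabilities
theta2 = np.full((3, 3), 0.5)                        # positive pairwise coupling
pmf = exponential_pmf(theta1, theta2)
print(sum(pmf.values()))                             # 1.0 by construction
```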

Fisher information analysis
How should the stimulus be encoded in spike rate to achieve constant Fisher information?
Input structure not important.
[Figure: encoding functions of the stimulus α for a two-neuron Bernoulli model (p, ρ), a three-neuron Bernoulli model (p, ρ, ρ3), and a Poisson counting model (total rate, common rate)]
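
As a rough illustration of the Fisher information calculation (not the talk's models or parameter values), the sketch below computes F(α) numerically for two correlated Bernoulli neurons, assuming the stimulus α directly sets the spike probability p = α and a fixed pairwise correlation ρ.

```python
# Sketch: numerical Fisher information of a two-neuron correlated Bernoulli model
# with respect to a stimulus parameter alpha. The encoding p(alpha) = alpha and
# rho = 0.1 are illustrative assumptions, not the talk's choices.
import itertools

def two_neuron_pmf(p, rho):
    """Pairwise Sarmanov-Lancaster pmf for two Bernoulli(p) neurons."""
    pmf = {}
    for x in itertools.product((0, 1), repeat=2):
        indep = (p if x[0] else 1 - p) * (p if x[1] else 1 - p)
        corr = rho * (x[0] - p) * (x[1] - p) / (p * (1 - p))
        pmf[x] = indep * (1 + corr)
    return pmf

def fisher_information(alpha, rho, d=1e-5):
    """F(alpha) = sum_x (dp(x;alpha)/dalpha)^2 / p(x;alpha), via central differences."""
    p0 = two_neuron_pmf(alpha, rho)
    pp, pm = two_neuron_pmf(alpha + d, rho), two_neuron_pmf(alpha - d, rho)
    return sum(((pp[x] - pm[x]) / (2 * d))**2 / p0[x] for x in p0)

print(fisher_information(alpha=0.3, rho=0.1))
```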

Kullback-Leibler distance and data analysis
$$D_X(\alpha_1 \| \alpha_0) = \sum_x p(x;\alpha_1) \log \frac{p(x;\alpha_1)}{p(x;\alpha_0)}$$
$\alpha_0$, $\alpha_1$: two different stimulus conditions; $p(x;\alpha)$: response probabilities.
The K-L distance is the exponential rate of a Neyman-Pearson classifier's false-alarm probability: $P_F \sim 2^{-N D_X(\alpha_1 \| \alpha_0)}$ for fixed $P_M$.
The distance resulting from small stimulus perturbations is proportional to the Fisher information: $D_X(\alpha_0 + \delta\alpha \,\|\, \alpha_0) \propto F(\alpha_0)\,(\delta\alpha)^2$.
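
A small numerical sketch of the two facts above, assuming a toy population of N independent Bernoulli neurons whose spike probability equals the stimulus α (an illustrative encoding, not the talk's data): the K-L distance in bits, and its quadratic growth under small stimulus perturbations.

```python
# Sketch: K-L distance (bits) between responses of N independent Bernoulli(alpha)
# neurons under two stimulus conditions, plus a check that the distance grows
# quadratically in a small perturbation (i.e., proportional to Fisher information).
import numpy as np

def kl_population(a1, a0, N):
    per_neuron = a1 * np.log2(a1 / a0) + (1 - a1) * np.log2((1 - a1) / (1 - a0))
    return N * per_neuron

alpha0, N = 0.3, 10
print(kl_population(0.4, alpha0, N))                       # distance for a sizable change
for d in (0.04, 0.02, 0.01):
    print(d, kl_population(alpha0 + d, alpha0, N) / d**2)  # ratio approaches a constant
```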

Data Processing Theorem Redux
$$\gamma_{X,Y}(\alpha_0, \alpha_1) = \frac{D_Y(\alpha_1 \| \alpha_0)}{D_X(\alpha_1 \| \alpha_0)}$$
Systems cannot create information: $0 \le \gamma_{X,Y}(\alpha_0, \alpha_1) \le 1$.
Basis for a system theory for information processing and for determining which structures are inherently more effective.
[Figure: system block with input X and output Y]
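
A toy illustration of the information transfer ratio, assuming the "system" is a binary symmetric channel with crossover probability eps acting on a single Bernoulli input; this stand-in channel is an assumption for illustration, not the populations studied in the talk.

```python
# Sketch: information transfer ratio gamma = D_Y / D_X when X ~ Bernoulli(alpha)
# passes through a binary symmetric channel with crossover eps (illustrative system).
import numpy as np

def kl_bernoulli(p1, p0):
    """K-L distance (bits) between Bernoulli(p1) and Bernoulli(p0)."""
    return p1 * np.log2(p1 / p0) + (1 - p1) * np.log2((1 - p1) / (1 - p0))

def bsc_output(p, eps):
    """P(Y = 1) when X ~ Bernoulli(p) goes through a BSC with crossover eps."""
    return p * (1 - eps) + (1 - p) * eps

a0, a1, eps = 0.3, 0.4, 0.1            # the stimulus conditions set P(X = 1)
gamma = kl_bernoulli(bsc_output(a1, eps), bsc_output(a0, eps)) / kl_bernoulli(a1, a0)
print(gamma)                           # lies in [0, 1]: the system cannot create information
```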

Population encoding properties from a K-L distance perspective
Individual inputs don't necessarily achieve maximal information transfer: $\gamma_{X,Y}(N) \ge \max_i \gamma_{X_i, Y_i}$.
Explicitly indicating that the inputs encode a single quantity reveals that perfect fidelity is possible:
$$\gamma_{X,Y}(N) \ge 1 - \frac{k}{N} \quad\text{or}\quad \gamma_{X,Y}(N) \ge 1 - e^{-kN}$$
[Figure: a single input X driving outputs Y_1, Y_2, ..., Y_N; plot of $\gamma_{X,Y}(N)$ rising from 0 toward 1 as N increases]

Another viewpoint: Channel Capacity
Capacity for the stationary point-process channel is known. If $0 \le \lambda(t) \le \lambda_{\max}$ is the power constraint,
$$C \;(\text{bits/s}) = \frac{\lambda_{\max}}{e \ln 2} = \frac{\lambda_{\max}}{1.88417}$$
If we additionally constrain the average rate $\bar{\lambda}$,
$$C \;(\text{bits/s}) = \begin{cases} \dfrac{\lambda_{\max}}{e \ln 2}, & \bar{\lambda} > \lambda_0 \\[1ex] \dfrac{\bar{\lambda}}{\ln 2} \ln \dfrac{\lambda_{\max}}{\bar{\lambda}}, & \bar{\lambda} < \lambda_0 \end{cases} \qquad \lambda_0 = \frac{\lambda_{\max}}{e}$$
Capacity is achieved by a Poisson process driven by a random telegraph wave.
[Figure: random telegraph wave taking values 0 and $\lambda_{\max}$ as a function of t]
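
A small helper, written to match the capacity formulas above, computing the single-channel capacity in bits/s for a given peak rate and optional average-rate constraint; the numerical rates in the example are arbitrary.

```python
# Sketch: single Poisson-channel capacity (bits/s) under a peak-rate constraint
# lam_max and, optionally, an average-rate constraint lam_bar. Example rates are arbitrary.
import numpy as np

def poisson_capacity_bits(lam_max, lam_bar=None):
    lam_0 = lam_max / np.e                       # threshold average rate lambda_0
    if lam_bar is None or lam_bar > lam_0:
        return lam_max / (np.e * np.log(2))      # = lam_max / 1.88417...
    return lam_bar * np.log2(lam_max / lam_bar)  # = (lam_bar / ln 2) * ln(lam_max / lam_bar)

print(poisson_capacity_bits(100.0))              # peak constraint only: about 53.1 bits/s
print(poisson_capacity_bits(100.0, 10.0))        # average rate below lam_max/e: about 33.2 bits/s
```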

Channel capacity of populations
Use a Bernoulli model and investigate the small-probability limit to determine the capacity of parallel Poisson channels.
The two input structures have the same capacity:
$$C(N) = N\,C(1) = N\,\frac{\lambda_{\max}}{e \ln 2}$$
[Figure: N parallel channels drawn once with separate inputs $X_1, \ldots, X_N$ and once with a single common input X]

Imposing connection dependence changes the story
Using Bernoulli models, connection dependence can be added (caveat: we are modeling Poisson processes).
Interesting restrictions arise:
Capacity depends only on pairwise correlations (dependencies).
Only positive pairwise correlations are possible.
The range of correlation values is restricted:
For homogeneous populations: $0 \le \rho \le \dfrac{1}{N-1}$
For inhomogeneous populations: $0 \le \rho \le \rho_{\max}$
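
A numerical sanity check of the homogeneous-population range, using the pairwise Bernoulli expansion from the modeling slide with a very small spiking probability (approximating the Poisson limit); the specific N and p are illustrative choices.

```python
# Sketch: check that the pairwise Bernoulli expansion stays a valid pmf only for
# 0 <= rho <= 1/(N-1) in the small-probability limit. N and p are illustrative.
import itertools
import numpy as np

def min_probability(N, p, rho):
    """Smallest value assigned by the pairwise expansion; negative means rho is infeasible."""
    worst = np.inf
    for x in itertools.product((0, 1), repeat=N):
        xa = np.array(x)
        z = (xa - p) / np.sqrt(p * (1 - p))
        pair_sum = (z.sum()**2 - (z**2).sum()) / 2
        worst = min(worst, np.prod(p**xa * (1 - p)**(1 - xa)) * (1 + rho * pair_sum))
    return worst

N, p = 4, 1e-3
print(min_probability(N, p, 1 / (N - 1) - 0.01) >= 0)   # True: within the allowed range
print(min_probability(N, p, 1 / (N - 1) + 0.01) >= 0)   # False: correlation too large
```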

Capacity results
Capacity achieved with a homogeneous population.
Correlation affects the two input structures differently.
[Figure: capacity (bits/s, 0 to 3.5) versus correlation (0 to 1) for M = 2 and M = 3, comparing separate inputs with a single, common input]
Qualitatively similar to Gaussian channel results.

However...
As population size increases, introducing connection-induced dependence reduces capacity.
Capacity unaffected by input- or connection-induced dependence.
Fits with previous results derived using Fisher information.

Poisson vs. Non-Poisson Models
These results were derived using a Poisson assumption. What about non-Poisson models?
It is probably impossible to extend the Bernoulli approach to interesting non-Poisson cases, but Kabanov showed that the single-channel Poisson capacity bounds the capacity of all other point-process models.
Does this bound apply to multi-channel processes as well?

Connection-induced dependence
The Bernoulli model is vague about how correlations are induced.
If internal feedback is used, feedback can increase capacity: M. Lexa has shown that internal feedback can increase the performance of distributed classifiers.
[Figure: a single input X feeding outputs $Y_1$, $Y_2$, $Y_3$]

Conclusions
From two theoretical viewpoints, connection-induced dependence is not required to increase capacity.
Specific forms of dependence may increase a population's processing power.
The capacity afforded by non-Poisson models is probably bounded by the Poisson result, but this has not been shown in detail.