Principles of Neural Information Theory

Principles of Neural Information Theory
Computational Neuroscience and the Efficient Coding Hypothesis
James V Stone

Title: Principles of Neural Information Theory
Author: James V Stone
© 2017 Sebtel Press. All rights reserved. No part of this book may be reproduced or transmitted in any form without written permission from the author. The author asserts his moral right to be identified as the author of this work in accordance with the Copyright, Designs and Patents Act.
First Edition. Typeset in LaTeX2e. First printing. File: main NeuralInfoTheory v34.tex. ISBN
Cover based on photograph of a Purkinje cell from mouse cerebellum injected with Lucifer Yellow. Courtesy of the National Center for Microscopy and Imaging Research, University of California.

For Teleri

Contents

1. All That We See: Introduction; Efficient Coding; In Search of General Principles; Information Theory; Neurons, Signals and Noise; An Overview of Chapters.

2. Information Theory: Introduction; Finding a Route, Bit by Bit; Information and Entropy; Entropy and Uncertainty; Maximum Entropy Distributions; Channel Capacity; Mutual Information; The Gaussian Channel; Bandwidth, Capacity and Power; Summary.

3. Measuring Neural Information: Introduction; Spikes; Why Spikes?; Neural Information; Gaussian Firing Rates; Information About What?; Does Timing Precision Matter?; Rate Codes and Timing Codes; Summary.

4. Pricing Neural Information: Introduction; Paying With Spikes; Paying With Hardware; Paying With Energy; Diminishing Returns in Flies; Paying With Reluctance; Optimal Axon Diameter; Optimal Distribution of Axon Diameters; Axon Diameter and Spike Speed; Optimal Mean Firing Rate; The Marginal Value Theorem; Optimal Distribution of Firing Rates; Optimal Synaptic Conductance; Summary.

5. Encoding Colour: Introduction; The Eye; How Aftereffects Occur; The Problem With Colour; A Neural Encoding Strategy; Encoding Colour; Why Aftereffects Occur; Measuring Mutual Information; Maximising Mutual Information; Principal Component Analysis; PCA and Mutual Information; Evidence for Colour Encoding; Summary.

6. Encoding Time: Introduction; Linear Models; Neurons and Wine Glasses; The LNP Model; Estimating LNP Parameters; The Predictive Coding Model; Estimating Predictive Coding Parameters; Predictive Coding and Information; Evidence for Predictive Coding; Summary.

7. Encoding Space: Introduction; Spatial Frequency; Do Ganglion Cells Decorrelate Images?; Are Receptive Field Structures Optimal?; Predictive Coding of Images; Evidence For Predictive Coding; Are Receptive Field Sizes Optimal?; Summary.

8. Encoding Visual Contrast: Introduction; Not Wasting Entropy; Anatomy of a Fly's Eye; Maximum Entropy Encoding; Evidence for Maximum Entropy Encoding; Summary.

9. Reading the Neural Code: Introduction; Bayes Rule; Bayesian Decoding; Encoding vs Decoding; Nonlinear Encoding and Linear Decoding; Linear Decodability; How Bayes Adds Bits; Bayesian Brains; Summary.

10. Not Even How, But Why: Introduction; The Efficient Coding Principle; Counter Arguments; Crossing the Neural Rubicon.

Further Reading
Appendices
A. Glossary
B. Mathematical Symbols
C. Correlation and Independence
D. A Vector Matrix Tutorial
E. Key Equations
References
Index

Preface

Some scientists consider the brain to be a collection of heuristics or hacks, which have accumulated over evolutionary history. Others think that the brain relies on a small number of fundamental principles, which can be used to explain the diverse systems within the brain. This book provides a rigorous account of how Shannon's mathematical theory of information can be used to test one such principle, which (for historical reasons) is called the efficient coding hypothesis.

From Words to Mathematics. The methods used to explore the efficient coding hypothesis lie in the realms of mathematical modelling. Mathematical models demand a precision unattainable with purely verbal accounts of brain function. With this precision comes an equally precise quantitative predictive power. In contrast, the predictions of purely verbal models can be vague, and this vagueness also makes them virtually indestructible, because predictive failures can often be explained away. No such luxury exists for mathematical models. In this respect, mathematical models are easy to test, and if they are weak models then they are easy to disprove. So, in the Darwinian world of mathematical modelling, survivors tend to be few, but those few tend to be supremely fit. This is not to suggest that purely verbal models are always inferior. Such models are a necessary first step in understanding. But continually refining a verbal model into ever more rarefied forms is not scientific progress. Eventually, a purely verbal model should evolve to the point where it can be reformulated as a mathematical model, with predictions that can be tested against measurable physical quantities. Happily, most branches of neuroscience reached this state of scientific maturity some time ago.

Signposts. The writing style adopted here in Principles of Neural Information Theory describes the raw science of neural information theory, unfettered by the conventions of standard textbooks. Accordingly, key concepts are introduced informally, before being

described mathematically; and each equation is accompanied by explanatory text. However, such an informal style can easily be misinterpreted as poor scholarship, because informal writing is often sloppy writing. But the way we write is usually only loosely related to the way we speak when giving a lecture, for example. A good lecture includes many asides and hints about what is, and is not, important. In contrast, scientific writing is usually formal, bereft of signposts about where the main theoretical structures are to be found, and how to avoid the many pitfalls which can mislead the unwary. So, unlike most textbooks, and like the best lectures, this book is intended to be both informal and rigorous, with prominent signposts as to where the main insights are to be found, and many warnings about where they are not.

Originality. Evidence is presented in the form of scientific papers and books, which are cited using a conventional format (e.g. Smith and Jones (1959)). When the sheer number of cited sources is large, numbered superscripts are used in order to prevent the text from becoming unreadable. Occasionally, facts are presented without evidence, either because they are reasonably self-evident, or because they can be found in standard texts. Consequently, the reader may wonder if the ideas being presented are derived from other scientists, or if they are unique to this book. In such cases, be reassured that none of the ideas in this book belong to the author. Indeed, like most books, this book represents a synthesis of ideas from many sources, but the general approach is inspired principally by these texts: Vision (1982) 76 by Marr, Spikes (1997) 95 by Rieke, Warland, de Ruyter van Steveninck and Bialek, Biophysics (2012) 17 by Bialek, and Principles of Neural Design (2015) 110 by Sterling and Laughlin.

What Is Not Included. A short book like this cannot cover all aspects of a subject in detail, and choosing what to leave out is as important as choosing what to include. In order to compensate for this necessity, pointers to material not included, or not covered in detail, can be found in the References and (annotated) Further Reading section.

Who Should Read This Book? The material presented in this book is intended to be accessible to readers with a basic scientific education at undergraduate level. In essence, this book is intended for those who wish to understand how the fundamental ingredients of inert matter, energy and information have been forged by evolution to produce a particularly efficient computational machine, the brain. Understanding how the elements of this triumvirate interact demands knowledge from a wide variety of academic disciplines, but principally from biology and mathematics. Accordingly, reading this book will require some patience from mathematicians to navigate the conceptual shift involved in re-interpreting the mathematics of signal processing in terms of neural computation, and it will require sustained effort from biologists to understand the mathematical material. Even though some of the mathematical material is fairly sophisticated, care has been taken to ensure it is explained from a reasonably low level. Additionally, key concepts are introduced using diagrams wherever possible, so that readers can gain insight into the basic ideas being presented.

PowerPoint Slides of Figures. Most of the figures used in this book can be downloaded from

Corrections. Please email corrections to j.v.stone@sheffield.ac.uk. A list of corrections can be found at

Acknowledgments. Thanks to all those involved in developing the freely available LaTeX2e software, which was used to typeset this book. Shashank Vatedka deserves a special mention for checking the mathematics in a final draft of this book. (NOT DONE YET) Thanks to Caroline Orr for meticulous copy-editing and proofreading. (NOT DONE YET). Thanks to Kevin Gurney, John de Pledge, Royston Sellman, and Steve Snow for interesting discussions on the role of information in biology. For reading draft versions of this book, I am very grateful to David Attwell, Ilona Box, Karl Friston, Raymond Lister, Stuart Wilson and...

Jim Stone, Sheffield, England, 2017.

To understand life, one has to understand not just the flow of energy, but also the flow of information. W Bialek, 2012.

Chapter 1
All That We See

When we see, we are not interpreting the pattern of light intensity that falls on our retina; we are interpreting the pattern of spikes that the million cells of our optic nerve send to the brain. Rieke, Warland, De Ruyter van Steveninck, and Bialek

1.1. Introduction

All that we see begins with an image formed on the eye's retina (Figure 1.1). Initially, this image is recorded by 126 million photoreceptors within the retina. The outputs of these photoreceptors are then repackaged or encoded, via a series of intermediate connections, into a sequence of digital pulses or spikes that travel through the one million neurons of the optic nerve which connect the eye to the brain. The fact that we see so well suggests that the retina must be extraordinarily accurate when it encodes the image into spikes, and the brain must be equally accurate when it decodes those spikes into all that we see (Figure 1.2). But the eye and brain are not only good at translating the world into spikes, and spikes into perception, they are also efficient at transmitting information from the eye to the brain whilst expending as little energy as possible. Precisely how efficient is the subject of this book.

1.2. Efficient Coding

Neurons communicate information, and that is pretty much all that they do. But neurons are energetically expensive to make, maintain, and run 70. Half of a child's total energy budget, and a fifth of an adult's budget, is required just to keep the brain ticking over 106 (Figure 1.3). For children and adults, half the total energy budget of the brain is used for neuronal information processing, and the rest is used for basic maintenance 9. The high cost of using neurons probably accounts for the fact that only 2-4% of them are active at any one time 71.

Given that neurons and spikes are so expensive, we should be unsurprised to find that when the visual data from the eye is encoded as a series of spikes, each neuron and each spike conveys as much information as possible. These considerations have given rise to the efficient coding hypothesis 8;14;17;37;53;95;122;123, an idea developed over many years by Horace Barlow (1959) 13. The efficient coding hypothesis is conventionally interpreted to mean that neurons encode sensory data to transmit as much information as possible. Even though it is not usually made explicit, if data are encoded efficiently as described above then this often implies that the amount of energy paid for information is as small as possible. In order to avoid confusion, we adopt a specific interpretation of the efficient coding hypothesis here: neurons transmit as much information as possible per Joule of energy expended, defined here as energy efficiency 11;52;62;64;82;87;88;110;121.

Figure 1.1. Cross section of eye (labels: lens, retina, optic nerve).

There are a number of different methods which collectively fall under the umbrella term efficient coding. However, to a first approximation, the results of applying these various methods tend to be quite similar 95, even though the methods themselves appear quite different. These methods include sparse coding 46, principal component analysis, independent component analysis 16;111, information maximisation (infomax) 73, predictive coding 92;109 and redundancy reduction 6. We will encounter most of these broadly similar methods throughout this book, but we place special emphasis on predictive coding because it is based on a single principle, and it has a wide range of applicability.

1.3. In Search of General Principles

The test of a theory is not just whether or not it accounts for a body of data, but also how complex the theory is in relation to the complexity of the data being explained. Clearly, if a theory is, in some sense, more convoluted than the phenomenon it explains then it is not much of a theory. As an extreme example, if each of the 86 billion neurons in the brain required its own unique theory then the resultant collective theory of brain function would be almost as complex as the brain itself. This is why we favour theories that explain a vast range of phenomena with the minimum of words or equations.

Figure 1.2. Encoding and decoding. A rapidly changing luminance (bold curve in b) is encoded as a neuronal spike train (a), which is decoded to yield an estimate of the luminance (thin curve in b).

A prime example of such a parsimonious theory is Newton's theory of gravitation, which explains (amongst other things) how a ball falls to Earth, how atmospheric pressure varies with height above the Earth, and how the Earth orbits the Sun. In essence, we favour theories which rely on a general principle to explain a diverse range of physical phenomena.

If we want to understand how the brain works then we need more than a theory which is expressed in mere words. For example, if the theory of gravitation were stated only in words then we could say that each planet has an approximately circular orbit, but we would have to use many words to prove precisely why each orbit must be elliptical, and to state exactly how elliptical each orbit is. In contrast, a few equations express these facts exactly, and without ambiguity. Thus, whereas words are required to provide theoretical context, mathematics imposes a degree of precision which is extremely difficult, if not impossible, to achieve with words alone. To quote one of the first great scientists,

The universe is written in this grand book, which stands continually open to our gaze, but it cannot be understood unless one first learns to comprehend the language in which it is written. It is written in the language of mathematics, without which it is humanly impossible to understand a single word of it. Galileo Galilei

In the spirit of Galileo's recommendation, a rigorous theory of information processing in the brain should begin with a quantitative definition of information (see Chapter 2).

Figure 1.3. a) The human brain weighs 1300 g, contains about 10^10 neurons, and consumes 12 Watts of power. The outer surface seen here is the neocortex. b) Each neuron (plus its support structures) therefore accounts for an average of 1200 pJ/s (1 pJ/s = 10^-12 J/s).
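As a quick check of the arithmetic implied by the caption (not a calculation given explicitly in the text): 12 W shared between about 10^10 neurons gives 12 / 10^10 = 1.2 × 10^-9 J/s, which is roughly 1200 pJ/s per neuron.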

1.4. Information Theory

Here in the 21st century, where we rely on computers for almost every aspect of our daily lives, it seems obvious that information is important. However, it would have been impossible for us to know just how important it is before Claude Shannon almost single-handedly created information theory in the 1940s. Since that time, it has become increasingly apparent that information, and the energy cost of each bit of information, imposes fundamental, unbreachable limits on the form and function of biological mechanisms. In this book, we concentrate on one particular function: information processing in the brain.

Claude Shannon's theory, first published in 1948, heralded a transformation in our understanding of information. Before 1948, information had been regarded as a kind of poorly defined miasmic fluid. But afterwards, it became apparent that information is a well-defined and, above all, measurable quantity. Shannon's theory of information provides a mathematical definition of information, and describes precisely how much information can be communicated between different elements of a system. This may not sound like much, but Shannon's theory underpins our understanding of how signals and noise are related, and why there are definite limits to the rate at which information can be processed within any system, whether man-made or biological.

1.5. Neurons, Signals and Noise

When a question is typed into a computer search engine, the results provide useful information, but this is buried in a sea of mostly useless data. In this internet age, it is easy for us to appreciate the difference between information and mere data, and we have learned to treat the information as useful signal and the rest as useless noise. This experience is now so commonplace that phrases like signal to noise ratio are becoming part of everyday language. Even though most people are unaware of the precise meaning of this phrase, they know intuitively that data comprise a combination of signal and noise.

This type of scenario is ubiquitous in the natural world, where the ability of eyes and ears to extract useful signals from noisy sensory data, and to encode those signals efficiently, is the key to survival 113. Indeed, the efficient coding hypothesis suggests that the evolution of sense organs, and of the brains that process data from those organs, is primarily driven by the need to minimise the energy expended for each bit of information acquired from the environment. And, because information theory tells us how to measure information precisely, it provides an objective benchmark against which the performance of neurons can be compared.

The maximum rate at which information can be transmitted through a neuron can be increased in a number of different ways. However, whichever way we (or evolution) choose to do this, doubling the information rate costs more than a doubling in neuronal hardware, and more than twice the amount of power (energy per second) 110. This is a universal phenomenon, which implies a diminishing information return on every additional micrometre of neuron diameter, and on every additional Joule of energy invested in transmitting spikes along a neuron.

Information theory does not place any conditions on what type of mechanism implements information processing; in other words, on exactly how it is to be achieved. However, unless there are unlimited amounts of power available, relatively little information will reach the brain without some form of encoding. In other words, information theory does not specify how any biological function, such as vision, is implemented, but it does set fundamental limits on what is achievable by any implementation, biological or otherwise. Once we acknowledge that these limits are unbreachable, and that they effectively extort such a high price, we should be unsurprised that Nature has managed to evolve brains which are exquisitely sensitive to the many trade-offs between time, neuronal hardware, energy and information. As we shall see, whenever such a trade-off is encountered, the brain seems to maximise the amount of information gained for each Joule of energy expended.
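The disproportionate cost of extra information can be illustrated with Shannon's formula for the capacity of a Gaussian channel, C = (1/2) log2(1 + S/N) bits (see Appendix E, equation E.24). The short Python sketch below is my own illustration rather than anything from the book: it treats "power" simply as the signal variance S, fixes the noise variance N at 1, and asks how much S must grow in order to double the capacity.

import numpy as np

def capacity(S, N=1.0):
    # Capacity of a Gaussian channel in bits per sample: C = 0.5 * log2(1 + S/N)
    return 0.5 * np.log2(1.0 + S / N)

S1 = 10.0                     # initial signal variance (noise variance N = 1)
C1 = capacity(S1)             # about 1.73 bits per sample
# Solve 0.5 * log2(1 + S2) = 2 * C1 for the signal variance S2 that doubles capacity
S2 = 2.0 ** (4.0 * C1) - 1.0  # here S2 = 120, i.e. 12 times the original signal power
print(C1, capacity(S2), S2 / S1)

Doubling the capacity requires squaring (1 + S/N), so the required signal power grows far faster than the information gained, which is the diminishing return described above.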

The distinction between a function and the mechanism which implements that function is a cornerstone of David Marr's (1982) 76 computational theory of vision. Marr's approach stresses the interplay of theory and practice, and is summarised in a single quotation: Trying to understand perception by studying only neurons is like trying to understand bird flight by studying only feathers: It just cannot be done. Even though Marr did not address the role of information theory directly, his analytic approach has served as a source of inspiration, not only for the approach adopted in this book, but also for much of the progress made within computational neuroscience.

1.6. An Overview of Chapters

This section contains technical terms which are explained fully in the relevant chapters, and in the Glossary. To fully appreciate the importance of information theory for neural computation, some familiarity with the basic elements of information theory is required; these elements are presented in Chapter 2 (which can be skipped on a first reading of the book).

In Chapter 3, we consider how to apply information theory to the problem of measuring the amount of information in the output of a spiking neuron, and how much of this information (i.e. mutual information) is related to the neuron's input. This leads to an analysis of the nature of the neural code; specifically, whether it is a rate code or a spike timing code.

In Chapter 4, we discover that one of the consequences of information theory (specifically, Shannon's noisy coding theorem) is that the cost of information rises inexorably and disproportionately with information rate. This steep rise explains why parameters like axon diameter, the distribution of axon diameters, mean firing rate, and synaptic conductance appear to be tuned to minimise the cost of information.

In Chapter 5, we consider how the correlations between the outputs of photoreceptors sensitive to similar colours threaten to reduce information rates, and how this can be ameliorated by synaptic pre-processing in the retina.

This pre-processing explains not only how, but also why, there is a red-green aftereffect, but no red-blue aftereffect. A more formal account involves using principal component analysis to show which particular synaptic connection values maximise neuronal information throughput.

In Chapter 6, the lessons learned so far are applied to the problem of encoding time-varying, correlated visual inputs. We explore how a standard (LNP) neuron model can be used for efficient coding of the temporal structure of retinal images, and how a model based on predictive coding yields similar results to this standard model. We also show how predictive coding is a viable candidate for implementing efficient coding.

In Chapter 7, we consider how information theory predicts the receptive field structures of retinal ganglion cells across a range of luminance conditions. Evidence is presented that these receptive field structures are also obtained using predictive coding.

Once colour, spatial and temporal structure has been encoded by a neuron, the resultant signals must pass through the neuron's non-linear input/output (transfer) function. Accordingly, Chapter 8 is based on a classic paper by Simon Laughlin (1981) 68, which predicts the precise form that this transfer function should adopt in order to maximise information throughput; a form which matches the transfer function found in visual neurons.

Having considered how to encode sensory inputs, the problem of how to decode neuronal outputs is addressed in Chapter 9, where the importance of prior knowledge or experience is explored in the context of Bayes theorem. We also consider how often a neuron should produce a spike so that each spike conveys as much information as possible, and we find that the answer coincides with a vital property of efficient communication (i.e. linear decodability).

A defining property of the computational approach adopted in this text is that, within each chapter, we explore particular neuronal mechanisms, how they work, and (most importantly) why they work in the way they do. As a result, each chapter documents evidence that the design of biological mechanisms is determined largely by the need to process information efficiently.

Chapter 2
Information Theory

A basic idea in information theory is that information can be treated very much like a physical quantity, such as mass or energy. C Shannon

2.1. Introduction

Just as a bird cannot fly without obeying the laws of physics, so a brain cannot function without obeying the laws of information. And, just as the shape of a bird's wing is ultimately determined by the laws of physics, so the structure of a neuron is ultimately determined by the laws of information. In order to understand how these laws are related to neural computation, it is necessary to have some familiarity with Shannon's theory of information.

Every image formed on the retina, and every sound that reaches the ear, is sensory data, which contains information about some aspect of the world. For an owl, the sound of a mouse rustling a leaf may indicate a meal is below; for the mouse, a flickering shadow overhead may indicate it is about to become a meal. Precisely how much information is gained by a receiver from sensory data depends on three things. First, and self-evidently, it depends on the amount of information in the data. Second, it depends on the relative amounts of relevant information or signal, and irrelevant information or noise, in the data. Third, it depends on the ability of the receiver to separate the signal from the noise.
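To make the idea of a measurable quantity of information concrete, the definitions used throughout the book (see the Glossary: the information conveyed by an outcome with probability p is log2(1/p) bits, and entropy is the average information of a distribution) can be turned into a few lines of Python. This sketch is my own illustration, not code from the book.

import numpy as np

def surprisal(p):
    # Shannon information of an outcome with probability p, in bits: log2(1/p)
    return np.log2(1.0 / p)

def entropy(probs):
    # Average information (entropy) of a discrete distribution, in bits
    probs = np.asarray(probs, dtype=float)
    return float(np.sum(probs * np.log2(1.0 / probs)))

print(surprisal(0.5))        # 1 bit: a fair coin flip
print(surprisal(1.0 / 8.0))  # 3 bits: one of 8 equally probable alternatives
print(entropy([0.5, 0.5]))   # 1 bit per flip for a fair coin
print(entropy([0.9, 0.1]))   # about 0.47 bits per flip for a biased coin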

Further Reading

Bialek (2012) 17. Biophysics. This recent text is a comprehensive and rigorous account of the physics which underpins biological processes, including computational neuroscience and morphogenesis. Bialek adopts the information-theoretic approach so successfully applied in his previous book (Spikes, see below). The writing style is fairly informal, but this book requires a high level of mathematical competence to be fully appreciated. However, it is so full of valuable insights that it will reward careful reading by less numerate readers. Highly recommended.

Dayan P and Abbott LF (2001) 31. Theoretical Neuroscience. This classic text is a comprehensive and rigorous account of computational neuroscience, which demands a high level of mathematical competence.

Doya K, Ishii S, Pouget A and Rao RPN (2007) 38. Bayesian Brain: Probabilistic Approaches to Neural Coding. MIT Press. This book of edited chapters contains some superb introductions to Bayes rule in the context of vision and brain function, as well as more advanced material.

Gerstner W and Kistler WM (2002) 49. Spiking Neuron Models: Single Neurons, Populations, Plasticity. Excellent text on models of the detailed dynamics of neurons.

Land MF and Nilsson DE (2002). Animal Eyes. New York: Oxford University Press. A landmark book by two of the most authoritative exponents of the intricacies of eye design.

Marr D (1982, reprinted 2010). Vision: A computational investigation into the human representation and processing of visual information. This classic book inspired a generation of vision scientists to explore the computational approach, succinctly summarised in this famous quotation: Trying to understand perception by studying only neurons is like trying to understand bird flight by studying only feathers: It just cannot be done. Marr's book is included here because his computational framework underpins much of the work reported in this text.

Reza FM (1961) 93. An Introduction to Information Theory. Comprehensive and mathematically rigorous, with a reasonably geometric approach.

Rieke F, Warland D, van Steveninck RR and Bialek W (1997) 95. Spikes: Exploring the Neural Code. First modern text to formulate questions about the functions of single neurons in terms of information theory. Superbly written in a tutorial style, well argued, and fearless in its pursuit of solid answers to hard questions. Research papers by these authors are highly recommended for their clarity.

Eliasmith C (2004) 42. Neural Engineering. Technical account which stresses the importance of linearisation using opponent processing, and nonlinear encoding in the presence of linear decoding.

Sterling P and Laughlin S (2015) 110. Principles of Neural Design. Comprehensive and detailed account of how information theory constrains the design of neural systems. Sterling and Laughlin interpret physiological findings from a wide range of organisms in terms of a single unifying framework: Shannon's mathematical theory of information. A remarkable book, highly recommended.

Zhaoping L (2014) 123. Understanding Vision. Contemporary account of vision based mainly on the efficient coding hypothesis. Even though this book is technically demanding, the introductory chapters give a good overview of the approach.

Tutorial Material

Byrne JH, Neuroscience Online, McGovern Medical School, University of Texas.
Laughlin SB (2006). The Hungry Eye: Energy, Information and Retinal Function.
Lay DC (1997). Linear Algebra and its Applications.
Pierce JR (1980). An Introduction to Information Theory: Symbols, Signals and Noise.
Riley KF, Hobson MP and Bence SJ (2006). Mathematical Methods for Physics and Engineering.
Scholarpedia. This online encyclopedia includes excellent tutorials on computational neuroscience.
Smith S (2013). Digital Signal Processing: A Practical Guide for Engineers and Scientists. Freely available at
Stone JV (2012). Vision and Brain: How We See The World.
Stone JV (2013). Bayes Rule: A Tutorial Introduction.
Stone JV (2015). Information Theory: A Tutorial Introduction.

Appendix A
Glossary

ATP Adenosine triphosphate: molecule responsible for energy transfer.
autocorrelation function Given a sequence x = (x_1, ..., x_n), the autocorrelation function specifies how quickly the correlation between nearby values x_i and x_{i+d} diminishes with distance d.
average Given a variable x, the average, mean or expected value of a sample of n values of x is x̄ = (1/n) Σ_{j=1}^n x_j.
Bayes rule Given an observed value y_1 of the variable y, Bayes rule states that the posterior probability that the variable x has the value x_1 is p(x_1|y_1) = p(y_1|x_1) p(x_1) / p(y_1).
binary digit A binary digit can be either a 0 or a 1.
bit Information required to choose between two equally probable alternatives.
capacity The maximum rate at which a communication channel can communicate information from its input to its output.
central limit theorem As the number of samples from almost any distribution increases, the distribution of sample means becomes increasingly Gaussian.
coding capacity The maximum amount of information that could be conveyed by a neuron with a given firing rate.
conditional probability The probability that the value of one random variable x has the value x_1 given that the value of another random variable y has the value y_1, written as p(x_1|y_1).
conditional entropy Given two random variables x and y, the average uncertainty regarding the value of x when the value of y is known, H(x|y) = E[log(1/p(x|y))] bits.
convolution A filter can be used to change (blur or sharpen) a signal by convolving the signal with the filter, as defined in Section 6.4.
correlation See Appendix C.

cross-correlation function Given two sequences x = (x_1, ..., x_n) and y = (y_1, ..., y_n), the cross-correlation function specifies how quickly the correlation between x_i and y_{i+d} diminishes with d.
cumulative distribution function The cumulative area under the probability density function (pdf) of a variable.
decoding Translating neural outputs (e.g. firing rates) to inputs (e.g. stimulus values). Modelled using a decoding kernel v.
decoding function Maps neuronal outputs y to inputs x_MAP, where x_MAP is the most probable value of x given y.
decorrelated If two Gaussian variables are decorrelated (or, equivalently, uncorrelated) then each variable provides no information about the other variable. See Appendix C.
dynamic linearity If a neuron's response to a sinusoidal input is a sinusoidal output then it has dynamic linearity.
efficient coding hypothesis States that sensory data is encoded so as to maximise the amount of information transmitted. In this book, the efficient coding hypothesis is interpreted to mean that sensory data is encoded so as to maximise energy efficiency.
encoding Translating neural inputs (e.g. stimulus values) to outputs (e.g. firing rates). Modelled using an encoding kernel w.
encoding function Maps neuronal inputs x to outputs y_MAP, where y_MAP is the most probable value of y given x. Equal to the transfer function if the distribution of noise in y is symmetric.
energy efficiency The amount of information (i.e. coding capacity) or mutual information obtained per Joule of energy, ε.
entropy The entropy of a variable y is a measure of its variability. If y adopts m possible values with probabilities p(y_1), ..., p(y_m) then its entropy is H(y) = Σ_{i=1}^m p(y_i) log_2(1/p(y_i)) bits.
equivariant Having equal variance.
expected value See average.
filter See Wiener kernel.
Gaussian variable A variable with a Gaussian distribution of values.
iid If a variable x has values which are sampled independently from the same probability distribution then x is independent and identically distributed (iid).
independence If two variables x and y are independent then the value x provides no information about the value y, and vice versa.
information The information conveyed by a variable x which adopts the value x = x_1 with probability p(x_1) is log_2(1/p(x_1)) bits.
joint probability The probability that two or more variables simultaneously adopt specified values.

Joule One Joule is the energy required to raise 100 g by 1 metre.
kernel See filter.
linear See static linearity and dynamic linearity.
linear filter See kernel.
linear decodability If a continuous signal s, which has been encoded as a spike train y, can be reconstructed using a linear filter then it is linearly decodable.
static linearity If a system has a static linearity then the response to an input value x_t is proportional to x_t.
logarithm If x is expressed as a logarithm y with base a then y = log_a x is the power to which we have to raise a in order to obtain x.
mean See average.
micron One micron equals 10^-6 m, or one micrometre (1 µm).
mitochondrion Cell organelle responsible for generating ATP.
monotonic If y = f(x) is a monotonic function of x then a change in x always induces an increase or it always induces a decrease in y.
mutual information The reduction in uncertainty I(y, x) regarding the value of one variable x induced by knowing the value of another variable y. Mutual information is symmetric, so I(y, x) = I(x, y).
noise The random jitter that is part of a measured quantity.
natural statistics The statistical distribution of values of a physical parameter (e.g. contrast) observed in the natural world.
neocortex The most recent, in evolutionary terms, part of the mammalian brain; it is a 2-3 mm thick layer of cells comprising six layers of neurons.
orthogonal Perpendicular.
phase The phase of a signal can be considered as its left-right position. A sine and a cosine wave have a phase difference of 90 degrees.
power The rate at which energy is expended (Joules/s).
power spectrum A histogram of frequency versus the power at each frequency for a given signal is the power spectrum of that signal.
probability distribution Given a variable x which can adopt the values {x_1, ..., x_n}, the probability distribution is p(x) = {p(x_1), ..., p(x_n)}, where p(x_i) is the probability that x = x_i.
probability function (pf) A function p(x) of a discrete random variable x defines the probability of each value of x. The probability that x = x_1 is p(x = x_1) or, more succinctly, p(x_1).
pyramidal neuron Regarded as the computational engine within various brain regions, especially the neocortex. Each pyramidal neuron receives thousands of synaptic inputs, and has an axon which is about 4 cm in length.

random variable (RV) The concept of a random variable x can be understood from a simple example, like the throw of a die. Each physical outcome is a number x_i, which is the value of the random variable, so that x = x_i. The probability of each value is defined by a probability distribution p(x) = {p(x_1), p(x_2), ...}.
redundancy Given an ordered set of values of a variable (e.g. in an image or sound), if a value can be obtained from a knowledge of other values then it is redundant.
signal In this book, the word signal usually refers to a noiseless variable, whereas the word variable is used to refer to noisy and noiseless variables.
signal to noise ratio Given a variable y = x + η which is a mixture of a signal x with variance S and noise η with variance N, the signal to noise ratio of y is SNR = S/N (see the numerical sketch at the end of this glossary).
standard deviation The square root of the variance of a variable.
theorem A mathematical statement which has been proven to be true.
timing precision If firing rate is measured using a timing interval of Δt = 0.001 s then the timing precision is ι = 1/Δt = 1,000 s^-1.
transfer function The sigmoidal function E[y] = g(x) which maps neuronal inputs x to mean output values. Also known as a static nonlinearity in signal processing. See encoding function.
tuning function Defines how firing rate changes as a function of a physical parameter, such as luminance, contrast, or colour.
uncertainty In this text, uncertainty refers to the surprisal (i.e. log(1/p(y))) of a variable y.
uncorrelated See decorrelated.
variable A variable is like a container, usually for a number. Continuous variables (e.g. temperature) can adopt any value, whereas discrete variables (e.g. a die) adopt certain values only.
variance The variance of x is var(x) = E[(x − x̄)²], and is a measure of how spread out the values of a variable are.
vector A vector is an ordered list of m numbers x = (x_1, ..., x_m).
white By analogy with white light, a signal which contains an equal proportion of all frequencies is white, so it has a flat power spectrum. A Gaussian signal with uncorrelated values is white.
Wiener kernel Spatial or temporal filter, which has dynamic linearity in this text.
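As a numerical illustration of the signal to noise ratio, variance and noise entries above (a sketch of my own, not an example from the book), the following Python fragment builds a noisy variable y = x + η from a Gaussian signal x and Gaussian noise η, then estimates the SNR from samples.

import numpy as np

rng = np.random.default_rng(0)
n = 100_000
S, N = 4.0, 1.0                        # true signal and noise variances
x = rng.normal(0.0, np.sqrt(S), n)     # signal
eta = rng.normal(0.0, np.sqrt(N), n)   # noise
y = x + eta                            # noisy variable y = x + eta

snr_estimate = x.var() / eta.var()     # close to S/N = 4
print(snr_estimate, y.var())           # y.var() is close to S + N = 5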

Appendix B
Mathematical Symbols

∗ convolution operator. See pages 106 and 130.
Δ (upper case letter delta) represents a small increment.
Δt small timing interval used to measure firing rate.
∇ (nabla, also called del) represents a vector valued gradient.
ˆ (hat) used to indicate an estimated value. For example, v̂_x is an estimate of the variance v_x.
|x| indicates the absolute value of x (e.g. if x = −3 then |x| = 3).
≤ if x ≤ y then x is less than or equal to y.
≥ if x ≥ y then x is greater than or equal to y.
≈ means approximately equal to.
∼ if a random variable x has a distribution p(x) then this is written as x ∼ p(x).
∝ indicates proportional to.
E[·] the mean or expectation of a variable.
Σ (capital sigma) represents summation. For example, if we represent the n = 3 numbers 2, 5 and 7 as x_1 = 2, x_2 = 5, x_3 = 7 then their sum x_sum is

x_sum = Σ_{i=1}^n x_i = x_1 + x_2 + x_3 = 14.

The variable i is counted up from 1 to n, and, for each i, the term x_i adopts a new value and is added to a running total.

Π (capital pi) represents multiplication. For example, if we use the values defined above then the product of these n = 3 integers is

x_prod = Π_{i=1}^n x_i = x_1 x_2 x_3 = 2 × 5 × 7 = 70.   (B.1)

The variable i is counted up from 1 to n, and, for each i, the term x_i adopts a new value, and is multiplied by a running total.
ϵ (epsilon) coding efficiency, which is the mutual information between neuronal input and output expressed as a proportion of the coding capacity of the neuron's output.
ε (Latin letter epsilon) energy efficiency, which is the number of bits of information (coding capacity) or mutual information obtained per Joule of energy.
ι (iota) denotes timing precision, ι = 1/Δt s^-1.
λ (lambda) mean and variance of a Poisson distribution.
λ_s (lambda) space constant used in exponential decay of membrane voltage, the distance over which voltage decays by 67%.
η (eta) ganglion cell noise.
ξ (ksi) photoreceptor noise.
µ (mu) the mean of a variable.
ρ(x, y) (rho) the correlation between x and y.
σ (sigma) the standard deviation of a distribution.
corr(x, y) the estimated correlation between x and y.
cov(x, y) the estimated covariance between x and y.
C channel capacity, the maximum information that can be transmitted through a channel, usually measured in bits per second (bits/s).
C m × m covariance matrix of the output values of m cones.
C_t m × m temporal covariance matrix of the output values of m cones.
C_s m × m spatial covariance matrix of the output values of m cones.

C_η m × m covariance matrix of noise values.
e constant, equal to 2.718...
E the mean, average, or expected value of a variable x, written as E[x].
g encoding function, which transforms a signal s = (s_1, ..., s_k) into channel inputs x = (x_1, ..., x_n), so x = g(s).
H(x) entropy of x, which is the average Shannon information of the probability distribution p(x) of the random variable x.
H(x|y) conditional entropy of the conditional probability distribution p(x|y). This is the average uncertainty in the value of x after the value of y is observed.
H(y|x) conditional entropy of the conditional probability distribution p(y|x). This is the average uncertainty in the value of y after the value of x is observed.
H(x, y) entropy of the joint probability distribution p(x, y).
I(x, y) mutual information between x and y, the average number of bits provided by each value of y about the value of x, and vice versa.
ln x natural logarithm (log to the base e) of x.
log x logarithm of x. Logarithms use base 2 in this text, and the base is indicated with a subscript if the base is unclear (e.g. log_2 x).
m number of different possible messages or input values.
M number of bins in a histogram.
N noise variance in Shannon's fundamental equation for the capacity of a Gaussian channel, C = (1/2) log(1 + P/N).
n the number of observations in a data set (e.g. coin flip outcomes).
p(x) the probability distribution of the random variable x.
p(x_i) the probability (density) that the random variable x = x_i.
p(x, y) joint probability distribution of the random variables x and y.
p(x_i|y_i) the conditional probability that x = x_i given that y = y_i.
R the entropy of a spike train, such that R_max ≥ R ≥ I(x, y) ≥ R_info.

R_info estimate of the mutual information I(x, y) between neuronal input and output.
R_max neural coding capacity, the maximum rate at which neuronal information can be communicated.
s stimulus values, s = (s_1, ..., s_n).
Δt small interval of time; usually, Δt = 0.001 s for neurons.
v_x if x has mean µ then the variance of x is v_x = σ_x² = E[(µ − x)²].
w vector of m weights comprising a linear encoding kernel; w is usually an estimate of the optimal kernel w.
v vector of m weights comprising a linear decoding kernel; v is usually an estimate of the optimal kernel v.
x cone output values, x = (x_1, ..., x_n).
x vector variable; x_t is the value of the vector variable x at time t.
X vector variable x expressed as an m × n data matrix.
y neuron output values.
y model neuron output values.
y vector variable; y_t is the value of the vector variable y at time t.
z ganglion cell input values, z = (z_1, ..., z_n).
z vector variable; z_t is the value of the vector variable z at time t.

Appendix C
Correlation and Independence

Correlation and Covariance. The similarity between two variables x and y is measured in terms of a quantity called the correlation. If x has a value x_t at time t and y has a value y_t at the same time t then the correlation between x and y is defined as

ρ(x, y) = c_xy E[(x_t − x̄)(y_t − ȳ)],   (C.1)

where x̄ is the average value of x, ȳ is the average value of y, and c_xy is a constant which ensures that the correlation has a value between -1 and +1. Given a sample of n pairs of values, the correlation is estimated as

corr(x, y) = (c_xy / n) Σ_{t=1}^n (x_t − x̄)(y_t − ȳ).   (C.2)

We are not usually concerned with the value of the constant c_xy, but for completeness, it is defined as c_xy = 1/√(var(x) var(y)), where var(x) and var(y) are the variances of x and y, respectively. Variance is a measure of the spread in the values of a variable. For example, the variance in x is estimated as

var(x) = (1/n) Σ_{t=1}^n (x_t − x̄)².   (C.3)

In fact, it is conventional to use the un-normalised version of correlation, called the covariance, which is estimated as

cov(x, y) = (1/n) Σ_{t=1}^n (x_t − x̄)(y_t − ȳ).   (C.4)

Because covariance is proportional to correlation, these terms are often used interchangeably.

It will greatly simplify notation if we assume all variables have a mean of zero. For example, this allows us to express the covariance more succinctly as

cov(x, y) = (1/n) Σ_{t=1}^n x_t y_t   (C.5)
          = E[x_t y_t],   (C.6)

and the correlation as

corr(x, y) = c_xy E[x_t y_t].   (C.7)

Decorrelation and Independence. If two variables are independent then the value of one variable provides no information about the corresponding value of the other variable. More precisely, if two variables x and y are independent then their joint distribution p(x, y) is given by the product of the distributions p(x) and p(y),

p(x, y) = p(x) p(y).   (C.8)

Given this definition, we can also consider decorrelation as a special case of independence. For example, if two zero-mean variables x and y are independent then, for any non-zero values of α and β, the correlation between x_t^α and y_t^β is zero,

corr(x^α, y^β) = c_xy E[x_t^α y_t^β] = 0.   (C.9)

Figure C.1. (a) Joint probability density function p(x, y) for correlated Gaussian variables x and y. The probability density p(x, y) is indicated by the density of points on the ground plane at (x, y). The marginal distributions p(y) and p(x) are on the side axes. (b) Joint probability density function p(x, y) for independent Gaussian variables, for which p(x, y) = p(x)p(y).
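The estimators in equations C.2-C.4, and the link between independence and zero correlation, are easy to check numerically. The Python sketch below is my own illustration (the variables and coefficients are made up for the example): it applies the (1/n)-normalised formulas to an independent pair and to a correlated pair of zero-mean Gaussian variables.

import numpy as np

rng = np.random.default_rng(1)
n = 100_000

def corr_and_cov(x, y):
    # Covariance (C.4) and correlation (C.2, with c_xy = 1/sqrt(var(x) var(y)))
    cov = np.mean((x - x.mean()) * (y - y.mean()))
    corr = cov / np.sqrt(x.var() * y.var())
    return corr, cov

x = rng.normal(size=n)
y_indep = rng.normal(size=n)                  # independent of x
y_corr = 0.8 * x + 0.6 * rng.normal(size=n)   # correlated with x

print(corr_and_cov(x, y_indep))   # correlation and covariance both close to 0
print(corr_and_cov(x, y_corr))    # correlation close to 0.8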

Appendix D
A Vector Matrix Tutorial

The single key fact about vectors and matrices is that each vector represents a point located in space, and a matrix moves that point to a different location. Everything else is just details.

Vectors. A number, such as 1.234, is known as a scalar, and a vector is an ordered list of scalars. Here is an example of a vector with two components a and b: w = (a, b). Note that vectors are written in bold type. The vector w can be represented as a single point in a graph, where the location of this point is by convention a distance of a from the origin along the horizontal axis and a distance of b from the origin along the vertical axis.

Adding Vectors. The vector sum of two vectors is the addition of their corresponding elements. Consider the addition of two pairs of scalars (x_1, x_2) and (a, b):

((a + x_1), (b + x_2)).   (D.1)

Clearly, (x_1, x_2) and (a, b) can be written as vectors:

z = ((a + x_1), (b + x_2))   (D.2)
  = (x_1, x_2) + (a, b) = x + w.   (D.3)

Thus the sum of two vectors is another vector, which is known as the resultant of those two vectors.

Subtracting Vectors. Subtracting vectors is similarly implemented by the subtraction of corresponding elements, so that

z = x − w   (D.4)
  = ((x_1 − a), (x_2 − b)).   (D.5)

Multiplying Vectors. Consider the sum given by the multiplication of two pairs of scalars (x_1, x_2) and (a, b):

y = a x_1 + b x_2.   (D.6)

Clearly, (x_1, x_2) and (a, b) can be written as vectors

y = (x_1, x_2).(a, b) = x.w,   (D.7)

where equation (D.7) is to be interpreted as equation (D.6). This multiplication of corresponding vector elements is known as the inner, scalar or dot product, and is often denoted with a dot, as here.

Vector Length. First, as each vector represents a point in space it must have a distance from the origin, and this distance is known as the vector's length or modulus, denoted as |x| for a vector x. For a vector x = (x_1, x_2) with two components this distance is given by the length of the hypotenuse of a triangle with sides x_1 and x_2, so that

|x| = √(x_1² + x_2²).   (D.8)

Angle between Vectors. The angle θ between two vectors x and w is

cos θ = x.w / (|x| |w|).   (D.9)

Crucially, if θ = 90 degrees then the inner product is zero, because cos 90 = 0, irrespective of the lengths of the vectors. Vectors at 90 degrees to each other are known as orthogonal vectors.

Row and Column Vectors. Vectors come in two basic flavours, row vectors and column vectors. A simple notational device to transform a row vector (x_1, x_2) into a column vector (or vice versa) is the transpose operator, T:

(x_1, x_2)^T = [x_1; x_2],   (D.10)

where the semicolon indicates that x_1 and x_2 are stacked as a column. The reason for having row and column vectors is because it is often necessary to combine several vectors into a single matrix which is then used to multiply a single vector x, defined here as

x = (x_1, x_2)^T.   (D.11)

In such cases it is necessary to keep track of which vectors are row vectors and which are column vectors.

If we redefine w as a column vector, w = (a, b)^T, then the inner product w.x can be written as

y = w^T x   (D.12)
  = (a, b) [x_1; x_2]   (D.13)
  = x_1 w_1 + x_2 w_2.   (D.14)

Here, each element of the row vector w^T is multiplied by the corresponding element of the column x, and the results are summed. Writing the inner product in this way allows us to specify many pairs of such products as a vector-matrix product. If x is a vector variable such that x_1 and x_2 have been measured n times (e.g. at n consecutive time steps) then y is a variable with n values

(y_1, y_2, ..., y_n) = (a, b) [x_11, x_12, ..., x_1n; x_21, x_22, ..., x_2n].   (D.15)

Here, each (single element) column y_1t is given by the inner product of the corresponding column in x with the row vector w. This can now be rewritten succinctly as y = w^T x.

Vector Matrix Multiplication. If we reset the number of times x has been measured to N = 1 for now then we can consider the simple case of how two scalar values y_1 and y_2 are given by the inner products

y_1 = w_1^T x   (D.16)
y_2 = w_2^T x,   (D.17)

where w_1 = (a, b)^T and w_2 = (c, d)^T. If we consider the pair of values y_1 and y_2 as a vector y = (y_1, y_2)^T then we can rewrite equations (D.16) and (D.17) as

(y_1, y_2)^T = (w_1^T x, w_2^T x)^T.   (D.18)

If we combine the column vectors w_1 and w_2 then we can define a matrix W

W = (w_1, w_2)^T   (D.19)
  = [a, b; c, d].   (D.20)

We can now rewrite equation (D.18) as

(y_1, y_2)^T = [a, b; c, d] (x_1, x_2)^T.   (D.21)

This can be written more succinctly as y = W x. This defines the standard syntax for vector-matrix multiplication. Note that the column vector (x_1, x_2)^T is multiplied by the first row in W to obtain y_1 and is multiplied by the second row in W to obtain y_2. Just as the vector x represents a point on a plane, so the point y represents a (usually different) point on the plane. Thus the matrix W implements a linear geometric transformation of points from x to y.

If n > 1 then the tth column (y_1t, y_2t)^T in y is obtained as the product of the tth column (x_1t, x_2t)^T in x with the row vectors in W

[y_11, y_12, ..., y_1n; y_21, y_22, ..., y_2n] = [a, b; c, d] [x_11, x_12, ..., x_1n; x_21, x_22, ..., x_2n]
  = (w_1, w_2)^T (x_1, x_2)^T   (D.22)
  = W x.   (D.23)

Each (single element) column in y_1 is a scalar value which is obtained by taking the inner product of the corresponding column in x with the first row vector w_1^T in W. Similarly, each column in y_2 is obtained by taking the inner product of the corresponding column in x with the second row vector w_2^T in W.

Transpose of Vector-Matrix Product. It is useful to note that if y = W x then the transpose y^T of this vector-matrix product is

y^T = (W x)^T = x^T W^T,   (D.24)

where the transpose of a matrix is defined by

W^T = [a, b; c, d]^T = [a, c; b, d].   (D.25)

Matrix Inverse. By analogy with scalar algebra, if y = W x then x = W^(-1) y, where W^(-1) is the inverse of W. If the columns of a matrix are orthogonal then W^(-1) = W^T.
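The operations described in this appendix map directly onto NumPy, which may help readers who prefer to experiment numerically. The sketch below is my own illustration (the numerical values are arbitrary); it reproduces the inner product, vector length, vector-matrix product and matrix inverse used above.

import numpy as np

w = np.array([1.0, 2.0])              # w = (a, b) with a = 1, b = 2
x = np.array([3.0, 4.0])              # x = (x1, x2)

inner = w @ x                         # inner product w.x = a*x1 + b*x2 = 11
length = np.linalg.norm(x)            # |x| = sqrt(x1^2 + x2^2) = 5

W = np.array([[1.0, 2.0],             # W = [a, b; c, d]; rows are w1^T and w2^T
              [3.0, 5.0]])
y = W @ x                             # y = Wx, as in equation D.21
x_back = np.linalg.inv(W) @ y         # x = W^(-1) y recovers the original point

print(inner, length)                  # 11.0 5.0
print(y, x_back)                      # [11. 29.] [3. 4.]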

Appendix E
Key Equations

Logarithms use base 2 unless stated otherwise.

Entropy

H(y) = Σ_{i=1}^m p(y_i) log(1/p(y_i)) bits   (E.1)

H(y) = ∫_y p(y) log(1/p(y)) dy bits   (E.2)

Joint entropy

H(x, y) = Σ_{i=1}^m Σ_{j=1}^m p(x_i, y_j) log(1/p(x_i, y_j)) bits   (E.3)

H(x, y) = ∫_y ∫_x p(x, y) log(1/p(x, y)) dx dy bits   (E.4)

H(x, y) = I(x, y) + H(x|y) + H(y|x) bits   (E.5)

Conditional Entropy

H(x|y) = Σ_{i=1}^m Σ_{j=1}^m p(x_i, y_j) log(1/p(x_i|y_j)) bits   (E.6)

H(y|x) = Σ_{i=1}^m Σ_{j=1}^m p(x_i, y_j) log(1/p(y_j|x_i)) bits   (E.7)

H(x|y) = ∫_y ∫_x p(x, y) log(1/p(x|y)) dx dy bits   (E.8)

H(y|x) = ∫_y ∫_x p(x, y) log(1/p(y|x)) dx dy bits   (E.9)

H(x|y) = H(x, y) − H(y) bits   (E.10)

H(y|x) = H(x, y) − H(x) bits   (E.11)

from which we obtain the chain rule for entropy

H(x, y) = H(x) + H(y|x) bits   (E.12)
        = H(y) + H(x|y) bits   (E.13)

Marginalisation

p(x_i) = Σ_{j=1}^m p(x_i, y_j),   p(y_j) = Σ_{i=1}^m p(x_i, y_j)   (E.14)

p(x) = ∫_y p(x, y) dy,   p(y) = ∫_x p(x, y) dx   (E.15)

Mutual Information

I(x, y) = Σ_{i=1}^m Σ_{j=1}^m p(x_i, y_j) log [ p(x_i, y_j) / (p(x_i) p(y_j)) ] bits   (E.16)

I(x, y) = ∫_y ∫_x p(x, y) log [ p(x, y) / (p(x) p(y)) ] dx dy bits   (E.17)

I(x, y) = H(x) + H(y) − H(x, y)   (E.18)
        = H(x) − H(x|y)   (E.19)
        = H(y) − H(y|x)   (E.20)
        = H(x, y) − [H(x|y) + H(y|x)] bits   (E.21)

If y = x + η, with x and y (possibly non-white) Gaussian variables with bandwidth W, where the Fourier components of the signal x_f and noise η_f have variance S_f and N_f (respectively), then

I(x, y) = ∫_{f=0}^W (1/2) log(1 + S_f/N_f) df bits,   (E.22)

where S_f/N_f is the signal to noise ratio at frequency f.

Channel Capacity

C = max_{p(x)} I(x, y) bits.   (E.23)

If the channel input x has variance S, the noise η has variance N, and both x and η are white and Gaussian then

C = (1/2) log_2(1 + S/N) bits,   (E.24)

where the ratio of variances S/N is the signal to noise ratio.
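The discrete identities above are easy to verify numerically. The Python sketch below is my own illustration (the 2 × 2 joint distribution is made up for the example): it computes the entropies and mutual information of two binary variables and checks equations E.14, E.18 and E.19.

import numpy as np

# A made-up joint distribution p(x, y) over two binary variables
p_xy = np.array([[0.4, 0.1],
                 [0.1, 0.4]])

def H(p):
    # Entropy in bits of the probabilities in array p (equations E.1 and E.3)
    p = p[p > 0]
    return float(np.sum(p * np.log2(1.0 / p)))

p_x = p_xy.sum(axis=1)    # marginal p(x), equation E.14
p_y = p_xy.sum(axis=0)    # marginal p(y), equation E.14

I_xy = H(p_x) + H(p_y) - H(p_xy)    # mutual information, equation E.18
H_x_given_y = H(p_xy) - H(p_y)      # conditional entropy, equation E.10

print(H(p_x), H(p_y), H(p_xy))      # 1.0, 1.0, about 1.72 bits
print(I_xy, H(p_x) - H_x_given_y)   # both about 0.28 bits, confirming E.19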

References

[1] EH Adelson and JR Bergen. Spatiotemporal energy models for the perception of motion. JOSA A, 2(2).
[2] ED Adrian. The impulses produced by sensory nerve endings: Part I. Journal of Physiology, 61:49-72, 1926.
[3] L Aitchison and M Lengyel. The Hamiltonian brain: Efficient probabilistic inference with excitatory-inhibitory neural circuit dynamics. PLOS Computational Biology, 12(12), 2016.
[4] RM Alexander. Optima for Animals. Princeton University Press.
[5] JJ Atick, Z Li, and AN Redlich. Understanding retinal color coding from first principles. Neural Computation, 4(4).
[6] JJ Atick and AN Redlich. Towards a theory of early visual processing. Neural Computation, 2(3), 1990.
[7] JJ Atick and AN Redlich. What does the retina know about natural scenes? Neural Computation, 4(2), 1992.
[8] F Attneave. Some informational aspects of visual perception. Psychological Review, 1954.
[9] D Attwell and SB Laughlin. An energy budget for signaling in the grey matter of the brain. J Cerebral Blood Flow and Metabolism, 21(10).
[10] R Baddeley, LF Abbott, MCA Booth, F Sengpiel, T Freeman, EA Wakeman, and ET Rolls. Responses of neurons in primary and inferior temporal visual cortices to natural scenes. Proc Royal Society London, B, 264, 1997.
[11] V Balasubramanian and MJ Berry. A test of metabolically efficient coding in the retina. Network: Computation in Neural Systems, 13(4).
[12] V Balasubramanian and P Sterling. Receptive fields and functional architecture in the retina. J Physiology, 587(12).
[13] HB Barlow. Sensory mechanisms, the reduction of redundancy, and intelligence. In DV Blake and AM Uttley, editors, Proc Symp Mechanization of Thought Processes (Vol 2), 1959.
[14] HB Barlow. Possible principles underlying the transformations of sensory messages. Sensory Communication.
[15] HB Barlow, R Fitzhugh, and SW Kuffler. Change of organization in the receptive fields of the cat's retina during dark adaptation. Journal of Physiology, 137, 1957.
[16] AJ Bell and TJ Sejnowski. An information-maximization approach to blind separation and blind deconvolution. Neural Computation, 7.
[17] W Bialek. Biophysics: Searching for Principles. Princeton University Press, 2012.

[18] W Bialek, M DeWeese, F Rieke, and D Warland. Bits and brains: Information flow in the nervous system. Physica A: Statistical Mechanics and its Applications, 200(1-4), 1993.
[19] R Bogacz. A tutorial on the free-energy framework for modelling perception and learning. J. of Mathematical Psychology.
[20] BG Borghuis, CP Ratliff, RG Smith, P Sterling, and V Balasubramanian. Design of a neuronal array. J Neuroscience, 28(12).
[21] A Borst and FE Theunissen. Information theory and neural coding. Nature Neuroscience, 2(11), 1999.
[22] DH Brainard, P Longere, PB Delahunt, WT Freeman, JM Kraft, and B Xiao. Bayesian model of human color constancy. Journal of Vision, 6(11).
[23] W Brendel, R Bourdoukan, P Vertechi, CK Machens, and S Denéve. Learning to represent signals spike by spike. ArXiv e-prints.
[24] N Brenner, W Bialek, and R de Ruyter van Steveninck. Adaptive rescaling maximizes information transmission. Neuron, 26.
[25] G Buchsbaum and Gottschalk. Trichromacy, opponent colours coding and optimum colour information transmission in the retina. Proc Roy Soc London, B, 220(1218):89-113, 1983.
[26] J Burge and WS Geisler. Optimal disparity estimation in natural stereo images. Journal of Vision, 14(2):1, 2014.
[27] J Bussgang. Cross-correlation function of amplitude-distorted Gaussian signals. Lab. Elect., Mass. Inst. Technol., Cambridge, MA, USA, Tech. Rep. 216, 1952.
[28] EL Charnov. Optimal foraging, the marginal value theorem. Theoretical Population Biology, 9(2), 1976.
[29] P Dagum and M Luby. Approximating probabilistic inference in Bayesian belief networks is NP-hard. Artificial Intelligence, 60(1).
[30] NB Davies, JR Krebs, and SA West. An Introduction to Behavioural Ecology. Wiley, 2012.
[31] P Dayan and LF Abbott. Theoretical Neuroscience. MIT Press, New York, NY, USA, 2001.
[32] MH DeGroot. Probability and Statistics, 2nd edition. Addison-Wesley, UK.
[33] A Dettner, S Münzberg, and T Tchumatchenko. Temporal pairwise spike correlations fully capture single-neuron information. Nature Communications, 7, 2016.
[34] MR DeWeese. Optimization principles for the neural code. PhD thesis, Princeton University.
[35] MR DeWeese and W Bialek. Information flow in sensory neurons. Il Nuovo Cimento D, 17(7-8), 1995.
[36] E Doi, JL Gauthier, GD Field, J Shlens, A Sher, M Greschner, TA Machado, LH Jepson, K Mathieson, DE Gunning, AM Litke, L Paninski, EJ Chichilnisky, and EP Simoncelli. Efficient coding of spatial information in the primate retina. J. Neuroscience, 32(46).
[37] E Doi and MS Lewicki. A simple model of optimal population coding for sensory systems. PLoS Comp Biol, 10(8).
[38] K Doya, S Ishii, A Pouget, and R Rao. The Bayesian Brain. MIT Press, MA, 2007.

41 References [39] KR Du y and DH Hubel. Receptive field properties of neurons in the primary visual cortex under photopic and scotopic lighting conditions. Vision Research, 47: ,2007. [40] TE Duncan. Evaluation of likelihood functions. Information and Control, 13(1):62 74,1968. [41] P Dyan and LF Abbott. Theoretical Neuroscience. MIT Press, [42] CEliasmith.Neural Engineering. MIT Press, [43] AL Fairhall, GD Lewen, W Bialek, and RR de Ruyter van Steveninck. E ciency and ambiguity in an adaptive neural code. Nature, 412(6849): , [44] LA Finelli, S Haney, M Bazhenov, M Stopfer, and TJ Sejnowski. Synaptic learning rules and sparse coding in a model sensory system. PLoS Comp Biol, [45] BJ Fischer and JL Pena. Owl s behavior and neural representation predicted by Bayesian inference. Nature Neuroscience, 14: , [46] P Foldiak and M Young. Sparse coding in the primate cortex. in The Handbook of Brain Theory and Neural Networks, ed. Arbib, MA, pages , [47] P Garrigan, CP Ratli, JM Klein, P Sterling, DH Brainard, and V Balasubramanian. Design of a trichromatic cone array. PLoS Comput Biol, 6(2):e ,2010. [48] JL Gauthier, GD Field, A Sher, M Greschner, J Shlens, AM Litke, and EJ Chichilnisky. Receptive fields in primate retina are coordinated to sample visual space more uniformly. Public Library of Science (PLoS) Biology, 7(4),2009. [49] W Gerstner and WM Kistler. Spiking neuron models: Single neurons, populations, plasticity. CUP,2002. [50] AR Girshick, MS Landy, and EP Simoncelli. Cardinal rules: visual orientation perception reflects knowledge of environmental statistics. Nature neuroscience, 14(7): ,2011. [51] D Guo, S Shamai, and S Verdú. Mutual information and minimum mean-square error in Gaussian channels. IEEE Transactions on Information Theory, 51(4): ,2005. [52] JJ Harris, R Jolivet, and D Attwell. Synaptic energy use and supply. Neuron, 75(5): ,2012. [53] JJ Harris, R Jolivet, E Engl, and D Attwell. Energy-e cient information transfer by visual pathway synapses. Current Biology, 25(24): , [54] HK Hartline. Visual receptors and retinal interaction. Nobel lecture, pages , [55] BY Hayden, JM Pearson, and ML Platt. Neuronal basis of sequential foraging decisions in a patchy environment. Nature neuroscience, 14(7): , [56] A Jarvstad, SK Rushton, PA Warren, and U Hahn. Knowing when to move on cognitive and perceptual decisions in time. Psychological science, 23(6): ,2012. [57] EC Johnson, DL Jones, and R Ratnam. A minimum-error, energyconstrained neural code is an instantaneous-rate code. Journal of computational neuroscience, 40(2): ,2016. [58] E Kaplan. The M, P, and K pathways of the primate visual system. In LM Chalupa and JS Werner, editors, The Visual Neurosciences, Volume 1, pages MIT Press,

42 References [59] DC Knill. Robust cue integration: A Bayesian model and evidence from cue-conflict studies with stereoscopic and figure cues to slant. Journal of Vision, 7:1 24,2007. [60] DC Knill and R Richards. Perception as Bayesian inference. Cambridge University Press, New York, NY, USA, [61] DC Knill and JA Saunders. Do humans optimally integrate stereo and texture information for judgments of surface slant? Vision Research, 43: , [62] K Koch, J McLean, M Berry, P Sterling, V Balasubramanian, and MA Freed. E ciency of information transmission by retinal ganglion cells. Current Biology, 14(17): ,2004. [63] K Koch, J McLean, R Segev, MA Freed, MJ Berry, V Balasubramanian, and P Sterling. How much the eye tells the brain. Current biology, 16(14): , [64] L Kostal, P Lansky, and MD McDonnell. Metabolic cost of neuronal information in an empirical stimulus-response model. Biological cybernetics, 107(3): ,2013. [65] L Kostal, P Lansky, and J-P Rospars. E cient olfactory coding in the pheromone receptor neuron of a moth. PLoS Computational Biology, 4(4), [66] SW Ku er. Discharge patterns and functional organization of mammalian retina. Journal of Neurophysiology, 16:37 68, [67] MF Land and DE Nilsson. Animal eyes. OUP,2002. [68] SB Laughlin. A simple coding procedure enhances a neuron s information capacity. ZNaturforsch, 36c: , [69] SB Laughlin. Matching coding to scenes to enhance e ciency. In O.J. Braddick and A.C. Sleigh, editors, Physical and biological processing of images, pages Springer, Berlin, [70] SB Laughlin, RR de Ruyter van Steveninck, and JC Anderson. The metabolic cost of neural information. Nature Neuro, 1(1):36 41, [71] P Lennie. The cost of cortical computation. Current Biology, 13: , [72] WB Levy and RA Baxter. Energy e cient neural codes. Neural computation, 8(3): ,1996. [73] R Linsker. Self-organization in perceptual network. Computer, pages , [74] BF Logan Jr. Information in the zero crossings of bandpass signals. AT&T Technical Journal, 56:487510,1977. [75] ZF Mainen and TJ Sejnowski. Reliability of spike timing in neocortical neurons. Science, 268(5216):1503, [76] D Marr. Vision: A computational investigation into the human representation and processing of visual information. 1982,2010. [77] M Meister and MJ Berry. The neural code of the retina. Neuron, 22: , [78] A Moujahid, A D Anjou, and M Graña. Energy demands of diverse spiking cells from the neocortex, hippocampus, and thalamus. Frontiers in computational neuroscience, 8:41,2014. [79] INemenman,GDLewen,WBialek,andRRdeRuytervanSteveninck. Neural coding of natural stimuli: Information at sub-millisecond resolution. PLoS Comp Biology, 4(3), [80] S Nirenberg, SM Carcieri, AL Jacobs, and PE Latham. Retinal ganglion cells act largely as independent encoders. Nature, 208

43 References 411(6838): , June [81] JE Niven, JC Anderson, and SB Laughlin. Fly photoreceptors demonstrate energy-information trade-o s in neural coding. PLoS Biology, 5(4), [82] JE Niven and SB Laughlin. Energy limitation as a selective pressure on the evolution of sensory systems. JExpBiol, 211: , [83] H. Nyquist. Certain topics in telegraph transmission theory. Proceedings of the IEEE, 90(2): ,1928. [84] BA Olshausen. 20 years of learning about vision: Questions answered, questions unanswered, and questions not yet asked. In 20 Years of Computational Neuroscience, pages [85] S Pajevic and PJ Basser. An optimum principle predicts the distribution of axon diameters in PLoS ONE, [86] L Paninski, JW Pillow, and EP Simoncelli. Maximum likelihood estimation of a stochastic integrate-and-fire neural encoding model. Neural computation, 16(12): ,2004. [87] JA Perge, K Koch, R Miller, P Sterling, and V Balasubramanian. How the optic nerve allocates space, energy capacity, and information. JNeuroscience,29(24): ,2009. [88] JA Perge, JE Niven, E Mugnaini, V Balasubramanian, and P Sterling. Why do axons di er in caliber? JNeuroscience,32(2): ,2012. [89] E Persi, D Hansel, L Nowak, P Barone, and C van Vreeswijk. Powerlaw input-output transfer functions explain the contrast-response and tuning properties of neurons in visual cortex. PLoS Comput Biol, 7(2):e , [90] MD Plumbley. E cient information transfer and anti-hebbian neural networks. Neural Networks, 6: , [91] K Rajan and W Bialek. Maximally informative stimulus energies in the analysis of neural responses to natural signals. PloS one, 8(11):e71959, [92] RPN Rao and DH Ballard. Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field e ects. Nature Neuroscience, 2:79 87,1999. [93] FM Reza. Information Theory. New York, McGraw-Hill, [94] F Rieke, DA Bodnar, and W Bialek. Naturalistic stimuli increase the rate and e ciency of information transmission by primary auditory a erents. Proc Roy Soc Lond, 262(1365): , [95] F Rieke, D Warland, RR de Ruyter van Steveninck, and W Bialek. Spikes: Exploring the Neural Code. MIT Press, Cambridge, MA, [96] KF Riley, MP Hobson, and SJ Bence. Mathematical methods for physics and engineering. Cambridge University Press, [97] TW Ruderman DL, Cronin and C Chiao. Statistics of cone responses to natural images: Implications for visual coding. Journal of the Optical Society of America, 15: ,1998. [98] AB Saul and AL Humphrey. Spatial and temporal response properties of lagged and nonlagged cells in cat lateral geniculate nucleus. Journal of neurophysiology, 64(1): ,1990. [99] B Sengupta, MB Stemmler, and KJ Friston. Information and e ciency in the nervous system - a synthesis. PLoS Computuational Biology, July [100] CE Shannon. A mathematical theory of communication. Bell System Technical Journal,27: ,

44 References [101] CE Shannon and W Weaver. The Mathematical Theory of Communication. University of Illinois Press, [102] T Sharpee, NC Rust, and W Bialek. Analyzing neural responses to natural signals: maximally informative dimensions. Neural computation, 16(2): ,2004. [103] TO Sharpee. Computational identification of receptive fields. Annual review of neuroscience, 36: ,2013. [104] EC Smith and MS Lewicki. E cient auditory coding. Nature, 439(7079): , [105] SSmith.Digital signal processing: A practical guide for engineers and scientists. Newnes, [106] L Sokolo. Circulation and energy metabolism of the brain. Basic neurochemistry, 2: ,1989. [107] SG Solomon and P Lennie. The machinery of colour vision. Nature Reviews Neuroscience, 8(4): ,2007. [108] WW Sprague, EA Cooper, I Tošić, and MS Banks. Stereopsis is adaptive for the natural environment. Science Advances, 1(4), [109] MV Srinivasan, SB Laughlin, and A Dubs. Predictive coding: A fresh view of inhibition in the retina. Proc Roy Soc London, B., 216(1205): , [110] P Sterling and S Laughlin. Principles of Neural Design. MIT Press, [111] JV Stone. Independent Component Analysis: A Tutorial Introduction. MIT Press, Boston, [112] JV Stone. Footprints sticking out of the sand (Part II): Children s Bayesian priors for lighting direction and convexity. Perception, 40(2): , [113] JV Stone. Vision and Brain. MIT Press, [114] JV Stone. Bayes Rule: A Tutorial Introduction to Bayesian Analysis. Sebtel Press, She eld, England, [115] JV Stone, IS Kerrigan, and J Porrill. Where is the light? Bayesian perceptual priors for lighting direction. Proceedings Royal Society London (B), 276: ,2009. [116] SP Strong, RR de Ruyter van Steveninck, W Bialek, and R Koberle. On the application of information theory to neural spike trains. In Pac Symp Biocomput, volume621,page32,1998. [117] P Tiesinga, JM Fellous, and TJ Sejnowski. Regulation of spike timing in visual cortical circuits. Nature reviews neuroscience, 9(2):97 107, [118] JH van Hateren. A theory of maximizing sensory information. Biological Cybernetics, 68:23 29,1992. [119] BT Vincent and RJ Baddeley. Synaptic energy e ciency in retinal processing. Vision research, 43(11): , [120] MJ Wainwright. Visual adaptation as optimal information transmission. Vision research, 39(23): , [121] Z Wang, X Wei, A Stocker, and D Lee. E cient neural codes under metabolic constraints. In NIPS2016, [122] X Wei and AA Stocker. Mutual information, Fisher information, and e cient coding. Neural Computation, 28(2): ,2016. [123] LZhaoping. Understanding Vision: Theory, Models, and Data. OUP Oxford,

Index

action potential, 28, 29
adenosine triphosphate (ATP), 47, 189
aftereffect, 65
autocorrelation function, 189
average, 189
axon, 28
band-pass filter, 139
bandwidth, 25, 26
Bayes rule, 16, 165, 189
binary digits vs bits, 12
bit, 10, 12, 189
Bussgang's theorem, 102, 115
calculus of variations, 54
capacity, 17, 189
central limit theorem, 36, 63, 189
chain rule for entropy, 204
channel
    Gaussian, 23
channel capacity, 10, 17, 18, 189
characteristic equation, 89
chromatic aberration, 67
coding capacity, 32, 43, 58, 189
coding efficiency, 38
colour aftereffect, 65
communication channel, 10
compound eye, 153
computational neuroscience, 6
conditional entropy, 21, 178, 189
conditional probability, 189
conduction velocity, 28, 54
cone, 66
contrast
    definition, 155
convolution operator, 104, 129
cornea, 153
correlation, 189, 197
correlation time, 176
covariance matrix, 88, 112
Cramer-Rao bound, 99
cricket, 39
cross-correlation function, 113, 190
cumulative distribution function, 163, 190
cut-off frequency, 140
dark noise, 73
data matrix, 78
decodability
    linear, 191
decoding, 190
decoding function, 190
decorrelated, 190
dendrite, 27
determinant, 89
die, 14
difference-of-Gaussian, 131
differential entropy, 23
Dirac delta function, 105
direct method, 37
dot product, 79, 105
dynamic linearity, 95, 190
efficient coding hypothesis, i, 2, 6, 49,
efficient coding model, 97
efficient coding principle, 185
empirical Bayes, 171, 180
encoding, 10, 158, 160, 190
    neural, 103
encoding function, 190
energy efficiency, 2, 51, 56, 118, 122, 190
entropy, 13, 14, 190
entropy vs information, 15
equivariant, 85, 190
expected value, 190
exponential decay, 30
exponential distribution, 17
fan diagram, 20
filter, 190
firing rate, 29
Fisher information, 22
fly's eye, 153
Fourier analysis, 26, 126
Fourier transform, 136
Galileo, 4
ganglion cell, 67, 92, 116, 122, 129
Gaussian distribution, 17, 36
Gaussian variable, 190
glial cells, 58
identity matrix, 89
iid, 190
independence, 190
information, 12, 15, 190
information theory, 5
information vs entropy, 15
inner product, 79, 105
Jensen's inequality, 86
Joule, 43, 191
kernel, 96, 127, 191
Lagrange multipliers, 60
large monopolar cells, 48, 155
least squares estimate, 99, 108, 170
linear, 191
    decodability, 176, 191
    decoding vs decodability, 177
    filter, 38, 96, 119, 142, 191
    function, 101
    superposition, 97, 101
    system, 97
    time invariant, 97
linearity
    dynamic, 95
    static, 95
LMC, 155
LNP neuron model, 103
Logan's theorem, 141
logarithm, 11, 12, 191
low-pass filter, 127, 140
LSE and information, 115
marginal value theorem, 56, 61
Marr, David, 7
matrix, 78, 79
maximum a posteriori, 168
maximum entropy, 17
maximum entropy efficiency, 58
maximum entropy model, 151, 164
maximum likelihood, 99, 168
measurement precision, 40
Mexican hat, 130
micron, 191
midget ganglion cell, 92
mimic model, 96
mitochondria, 47
mitochondrial volume, 50
mitochondrion, 191
model
    efficient coding, 97
    linear filter, 96
    mimic, 96
    predictive coding, 97
monotonic, 191
mutual information, 21,
    Gaussian channel, 23
myelin, 47
myelin sheath, 29
natural statistics, 164, 180, 191
neocortex, 191
neural arithmetic, 74
neural encoding, 103
neural superposition, 153
neuron, 1, 27
noise, 5, 191
noisy channel coding theorem, 22, 26
Nyquist rate, 25
off-centre cell, 130
ommatidia, 153
on-centre cell, 130
opponent process, 69, 92, 131
optic nerve, 44, 46
organelle, 47
orthogonal, 191
parasol ganglion cell, 92
phase, 136, 191
photopsin, 73
Poisson distribution, 36
power, 25, 191
power density, 52
power spectrum, 137, 191
predictive coding, 2, 97, 99, 116, 121, 141, 185
prior probability, 168
probability
    distribution, 14, 191
    function, 191
    mass function, 191
pyramidal cell, 27, 59
pyramidal neuron, 191
random variable, 13, 192
receptive field, 67
    on-centre, 130
receptors, 30
redundancy, 192
retinal ganglion cell, 67
reverse correlation, 106
rhabdom, 153
rhodopsin, 66, 73
rotation matrix, 82
sample rate, 25
scalar variable, 79
scale invariance, 148
Shannon, C, 5, 9
signal to noise ratio, 25, 192
source coding theorem, 19
space constant, 31
spatial frequency, 126
spike, 1, 28, 29
spike cost, 47
spike speed, 54
spike timing precision, 40
spike train, 32
spike-triggered average, 105
standard deviation, 192
static linearity, 95, 191
surprisal, 12
synapse, 30
theorem, 10, 192
timing precision, 40, 192
transfer function, 29, 103, 104, 115, 151, 152, 192
transpose, 79, 202
tuning function, 192
uncertainty, 13, 15, 192
uncorrelated, 192
uniform distribution, 17
variable, 192
    scalar, 79
    vector, 78
vector, 78, 192, 199
    variable, 78
white, 106, 137, 192
whitening filter, 137
Wiener kernel, 96, 104, 122,

Dr James Stone is an Honorary Reader in Vision and Computational Neuroscience at the University of Sheffield, England.

Seeing: The Computational Approach to Biological Vision, second edition. John P. Frisby and James V. Stone.
