Correlations in Populations: Information-Theoretic Limits
|
|
- Maryann Carroll
- 5 years ago
- Views:
Transcription
1 Correlations in Populations: Information-Theoretic Limits Don H. Johnson Ilan N. Goodman Department of Electrical & Computer Engineering Rice University, Houston, Texas
2 Population coding Describe a population as parallel point process channels Variations Separate inputs Common input Dependence among channels What do information theoretic considerations suggest is best? X X 1 X 2 X N Y 1 Y 2 Y N
3 Modeling approach We would like to use point process models for the outputs Technically very difficult to describe connection-induced dependencies Use simpler Bernoulli models, capable of describing complex correlation structures Assume homogeneous populations P( X 1, X 2,K, X N )= P( X 1 )P( X 2 )LP( X N ) P(X j =1) ' # (2) $( X " 1+ i % p)( X j % p) ) & p(1% p) () i> j # (3) * $( X + i % p)( X j % p)( X k % p) &, i> j>k [ p(1% p)] 3/ 2 +,
4 A note on modeling Correlation, orthogonal model P( X 1, X 2,K, X N ) = P( X 1 )P( X 2 )L P( X N ) 1 + Exponential model P( X 1, X 2,K, X N ) " exp % & ' + & ( ( ' % i > j > k % i > j " (2) # ( X i $ p i )( X j $ p j ) p i (1$ p i ) p j (1$ p j ) " (3) # ( X i $ p)( X j $ p)( X k $ p) p i (1 $ p i ) p j (1$ p j )p k (1$ p k ) $ # i X i + $ # ij X i X j + $ # ijk X i X j X k + L i i, j i, j,k ( ) * ) + + *
5 Fisher information analysis How should the stimulus be encoded in spike rate to achieve constant Fisher information? Input structure not important 0.5 Two-neuron Bernoulli Model p ρ 0.5 Three-neuron Bernoulli Model p ρ ρ3 10 Poisson Counting Model Total rate Common rate α α 0 0 α 1
6 Kullback-Leibler distance and data analysis D X (" 1 " 0 ) = # p(x;" 1 )log p(x;" 1 ) x p(x;" 0 ) α 0, α 1 two different stimulus conditions p (x; α) - response probabilities K-L distance is the exponential rate of a Neyman-Pearson classifier s false-alarm probability P F ~ 2!ND X (" 1 " 0 ) for fixed PM Distance resulting from information perturbations is proportional to Fisher information D X (" 0 + #" " 0 ) $ F(" 0 ) % (#") 2
7 Data Processing Theorem Redux X " X,Y (# 0,# 1 ) = D Y (# 1 # 0 ) D X (# 1 # 0 ) Systems cannot create information 0 $ " X,Y (# 0,# 1 ) $ 1 Basis for a system theory for information processing and determining which structures are inherently more effective Y
8 Population encoding properties from a K-L distance perspective Individual inputs don t necessarily achieve maximal information transfer " X,Y (N ) # max Explicitly indicating that the inputs encode a single quantity reveals that perfect fidelity is possible X. Y 1 Y 2 Y N 1 γ X,Y (N) i " X i,y i 0 N 1 " X,Y (N ) # 1$ k N or " X,Y (N ) # 1$ e $kn
9 Another viewpoint: Channel Capacity Capacity for the stationary point process channel is known If 0 λ t λ max is the power constraint C (bits/s) = " max eln 2 = " max If we additionally constrain average rate #" max % C (bits/s) = eln 2, " > " o $ " ln 2 ln ", " o = " max % max e," < " & o " Capacity achieved by a Poisson process driven by a random telegraph wave λ max t
10 Channel capacity of populations Use a Bernoulli model and investigate the small probability limit to determine capacity for parallel Poisson channels The two input structures have the same capacity C ( N ) = NC (1) = N " # max eln 2 X 1 Y 1 Y 1 X 2 Y 2 X Y 2 X N Y N Y N
11 Imposing connection dependence changes the story Using Bernoulli models, connection dependence can be added Caveat: modeling Poisson processes Interesting restrictions arise Capacity depends only on pairwise correlations (dependencies) Only positive pairwise correlations possible Restricted range of correlation values For homogenous populations: 0 " # " 1 N $1 For inhomogenous populations: 0 " # " # max
12 3.5 3 Capacity results Capacity achieved with a homogeneous population Correlation affects the two input structures differently M = Separate inputs M = Separate inputs Single, common input 1.5 Single, common input Correlation Correlation Qualitatively similar to Gaussian channel results
13 However As population size increases, introducing connection-induced dependence reduces capacity Capacity unaffected by input- or connectioninduced dependence Fits with previous results derived using Fisher information
14 Poisson vs. Non-Poisson Models Results derived using a Poisson assumption How about non-poisson models? Probably impossible to extend Bernoulli approach to interesting non-poisson cases, but Kabanov showed that the single-channel Poisson capacity bounded the capacity of all other point process models Does this bound apply to multi-channel processes as well?
15 Connection-induced dependence Bernoulli model vague about how correlations are induced If internal feedback is used Feedback can increase capacity M. Lexa has shown that internal feedback can increase the performance of distributed classifiers Y 1 X Y 2 Y 3
16 Conclusions From two theoretical viewpoints, connectioninduced dependence not required to increase capacity Specific forms of dependence may increase a population s processing power Capacity afforded by non-poisson models probably bounded by Poisson result, but not in detail
Limits of population coding
Limits of population coding Don H. Johnson Department of Electrical & Computer Engineering, MS 366 Rice University 6100 Main Street Houston, Texas 77251 1892 dhj@rice.edu Abstract To understand whether
More informationINFORMATION PROCESSING ABILITY OF BINARY DETECTORS AND BLOCK DECODERS. Michael A. Lexa and Don H. Johnson
INFORMATION PROCESSING ABILITY OF BINARY DETECTORS AND BLOCK DECODERS Michael A. Lexa and Don H. Johnson Rice University Department of Electrical and Computer Engineering Houston, TX 775-892 amlexa@rice.edu,
More informationInferring the Capacity of the Vector Poisson Channel with a Bernoulli Model
Inferring the with a Bernoulli Model Don H. Johnson and Ilan N. Goodman Electrical & Computer Engineering Department, MS380 Rice University Houston, Texas 77005 1892 {dhj,igoodman}@rice.edu Third Revision
More informationInformation Theory. Mark van Rossum. January 24, School of Informatics, University of Edinburgh 1 / 35
1 / 35 Information Theory Mark van Rossum School of Informatics, University of Edinburgh January 24, 2018 0 Version: January 24, 2018 Why information theory 2 / 35 Understanding the neural code. Encoding
More informationWhen does interval coding occur?
When does interval coding occur? Don H. Johnson Λ and Raymon M. Glantz y Λ Department of Electrical & Computer Engineering, MS 366 y Department of Biochemistry and Cell Biology Rice University 6 Main Street
More informationTexas , USA
This article was downloaded by:[rice University] On: 27 February 2008 Access Details: [subscription number 776098947] Publisher: Informa Healthcare Informa Ltd Registered in England and Wales Registered
More informationMultivariate Dependence and the Sarmanov-Lancaster Expansion
Multivariate Dependence and the Sarmanov-Lancaster Expansion Ilan N. Goodman and Don H. Johnson ECE Department, Rice University Houston, TX 7751-189 igoodman@rice.edu, dhj@rice.edu July 5 Abstract We extend
More informationDistributed Structures, Sequential Optimization, and Quantization for Detection
IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL., NO., JANUARY Distributed Structures, Sequential Optimization, and Quantization for Detection Michael A. Lexa, Student Member, IEEE and Don H. Johnson, Fellow,
More informationSymmetrizing the Kullback-Leibler distance
Symmetrizing the Kullback-Leibler distance Don H. Johnson Λ and Sinan Sinanović Computer and Information Technology Institute Department of Electrical and Computer Engineering Rice University Houston,
More informationDETECTION theory deals primarily with techniques for
ADVANCED SIGNAL PROCESSING SE Optimum Detection of Deterministic and Random Signals Stefan Tertinek Graz University of Technology turtle@sbox.tugraz.at Abstract This paper introduces various methods for
More informationBasics of Information Processing
Abstract Don H. Johnson Computer & Information Technology Institute Department of Electrical & Computer Engineering Rice University, MS366 Houston, TX 77251 I describe the basics of probability theory,
More informationData Complexity and Success Probability for Various Cryptanalyses
Data Complexity and Success Probability for Various Cryptanalyses Céline Blondeau, Benoît Gérard and Jean Pierre Tillich INRIA project-team SECRET, France Blondeau, Gérard and Tillich. Data Complexity
More informationApplications of Information Geometry to Hypothesis Testing and Signal Detection
CMCAA 2016 Applications of Information Geometry to Hypothesis Testing and Signal Detection Yongqiang Cheng National University of Defense Technology July 2016 Outline 1. Principles of Information Geometry
More information!) + log(t) # n i. The last two terms on the right hand side (RHS) are clearly independent of θ and can be
Supplementary Materials General case: computing log likelihood We first describe the general case of computing the log likelihood of a sensory parameter θ that is encoded by the activity of neurons. Each
More informationChapter 2. Binary and M-ary Hypothesis Testing 2.1 Introduction (Levy 2.1)
Chapter 2. Binary and M-ary Hypothesis Testing 2.1 Introduction (Levy 2.1) Detection problems can usually be casted as binary or M-ary hypothesis testing problems. Applications: This chapter: Simple hypothesis
More informationThe homogeneous Poisson process
The homogeneous Poisson process during very short time interval Δt there is a fixed probability of an event (spike) occurring independent of what happened previously if r is the rate of the Poisson process,
More informationJointly Poisson processes
Jointly Poisson processes Don H. Johnson and Ilan N. Goodman Electrical & Computer Engineering Department, MS380 Rice University Houston, Texas 77005 1892 {dhj,igoodman}@rice.edu Abstract What constitutes
More informationNeural Population Structures and Consequences for Neural Coding
Journal of Computational Neuroscience 16, 69 80, 2004 c 2004 Kluwer Academic Publishers. Manufactured in The Netherlands. Neural Population Structures and Consequences for Neural Coding DON H. JOHNSON
More informationExponential Family and Maximum Likelihood, Gaussian Mixture Models and the EM Algorithm. by Korbinian Schwinger
Exponential Family and Maximum Likelihood, Gaussian Mixture Models and the EM Algorithm by Korbinian Schwinger Overview Exponential Family Maximum Likelihood The EM Algorithm Gaussian Mixture Models Exponential
More informationCSE/NB 528 Final Lecture: All Good Things Must. CSE/NB 528: Final Lecture
CSE/NB 528 Final Lecture: All Good Things Must 1 Course Summary Where have we been? Course Highlights Where do we go from here? Challenges and Open Problems Further Reading 2 What is the neural code? What
More informationEncoding or decoding
Encoding or decoding Decoding How well can we learn what the stimulus is by looking at the neural responses? We will discuss two approaches: devise and evaluate explicit algorithms for extracting a stimulus
More informationSignal detection theory
Signal detection theory z p[r -] p[r +] - + Role of priors: Find z by maximizing P[correct] = p[+] b(z) + p[-](1 a(z)) Is there a better test to use than r? z p[r -] p[r +] - + The optimal
More informationInformation Theory. Coding and Information Theory. Information Theory Textbooks. Entropy
Coding and Information Theory Chris Williams, School of Informatics, University of Edinburgh Overview What is information theory? Entropy Coding Information Theory Shannon (1948): Information theory is
More informationEE/Stats 376A: Homework 7 Solutions Due on Friday March 17, 5 pm
EE/Stats 376A: Homework 7 Solutions Due on Friday March 17, 5 pm 1. Feedback does not increase the capacity. Consider a channel with feedback. We assume that all the recieved outputs are sent back immediately
More information+ + ( + ) = Linear recurrent networks. Simpler, much more amenable to analytic treatment E.g. by choosing
Linear recurrent networks Simpler, much more amenable to analytic treatment E.g. by choosing + ( + ) = Firing rates can be negative Approximates dynamics around fixed point Approximation often reasonable
More informationIntelligent Systems Statistical Machine Learning
Intelligent Systems Statistical Machine Learning Carsten Rother, Dmitrij Schlesinger WS2015/2016, Our model and tasks The model: two variables are usually present: - the first one is typically discrete
More informationAdvanced Machine Learning & Perception
Advanced Machine Learning & Perception Instructor: Tony Jebara Topic 6 Standard Kernels Unusual Input Spaces for Kernels String Kernels Probabilistic Kernels Fisher Kernels Probability Product Kernels
More informationIf there exists a threshold k 0 such that. then we can take k = k 0 γ =0 and achieve a test of size α. c 2004 by Mark R. Bell,
Recall The Neyman-Pearson Lemma Neyman-Pearson Lemma: Let Θ = {θ 0, θ }, and let F θ0 (x) be the cdf of the random vector X under hypothesis and F θ (x) be its cdf under hypothesis. Assume that the cdfs
More informationNeural coding Ecological approach to sensory coding: efficient adaptation to the natural environment
Neural coding Ecological approach to sensory coding: efficient adaptation to the natural environment Jean-Pierre Nadal CNRS & EHESS Laboratoire de Physique Statistique (LPS, UMR 8550 CNRS - ENS UPMC Univ.
More informationSYDE 372 Introduction to Pattern Recognition. Probability Measures for Classification: Part I
SYDE 372 Introduction to Pattern Recognition Probability Measures for Classification: Part I Alexander Wong Department of Systems Design Engineering University of Waterloo Outline 1 2 3 4 Why use probability
More informationOn Design Criteria and Construction of Non-coherent Space-Time Constellations
On Design Criteria and Construction of Non-coherent Space-Time Constellations Mohammad Jaber Borran, Ashutosh Sabharwal, and Behnaam Aazhang ECE Department, MS-366, Rice University, Houston, TX 77005-89
More information3.3 Population Decoding
3.3 Population Decoding 97 We have thus far considered discriminating between two quite distinct stimulus values, plus and minus. Often we are interested in discriminating between two stimulus values s
More informationSean Escola. Center for Theoretical Neuroscience
Employing hidden Markov models of neural spike-trains toward the improved estimation of linear receptive fields and the decoding of multiple firing regimes Sean Escola Center for Theoretical Neuroscience
More informationExercises. Chapter 1. of τ approx that produces the most accurate estimate for this firing pattern.
1 Exercises Chapter 1 1. Generate spike sequences with a constant firing rate r 0 using a Poisson spike generator. Then, add a refractory period to the model by allowing the firing rate r(t) to depend
More informationPopulation Coding. Maneesh Sahani Gatsby Computational Neuroscience Unit University College London
Population Coding Maneesh Sahani maneesh@gatsby.ucl.ac.uk Gatsby Computational Neuroscience Unit University College London Term 1, Autumn 2010 Coding so far... Time-series for both spikes and stimuli Empirical
More informationIntelligent Systems Statistical Machine Learning
Intelligent Systems Statistical Machine Learning Carsten Rother, Dmitrij Schlesinger WS2014/2015, Our tasks (recap) The model: two variables are usually present: - the first one is typically discrete k
More informationLecture 5 Channel Coding over Continuous Channels
Lecture 5 Channel Coding over Continuous Channels I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw November 14, 2014 1 / 34 I-Hsiang Wang NIT Lecture 5 From
More informationLearning with Noisy Labels. Kate Niehaus Reading group 11-Feb-2014
Learning with Noisy Labels Kate Niehaus Reading group 11-Feb-2014 Outline Motivations Generative model approach: Lawrence, N. & Scho lkopf, B. Estimating a Kernel Fisher Discriminant in the Presence of
More informationA minimalist s exposition of EM
A minimalist s exposition of EM Karl Stratos 1 What EM optimizes Let O, H be a random variables representing the space of samples. Let be the parameter of a generative model with an associated probability
More informationExternal field h ext. Supplementary Figure 1. Performing the second-order maximum entropy fitting procedure on an
Fisher information I = Susceptibility d s /dh ext 1.2 s 0.8 0.6 0.4 0.2 40 30 20 10 0 4 2 0 2 4 h ext Fit to sampled independent data Indep. model 4 2 0 2 4 External field h ext Supplementary Figure 1.
More informationHypothesis testing:power, test statistic CMS:
Hypothesis testing:power, test statistic The more sensitive the test, the better it can discriminate between the null and the alternative hypothesis, quantitatively, maximal power In order to achieve this
More informationThe Poisson Channel with Side Information
The Poisson Channel with Side Information Shraga Bross School of Enginerring Bar-Ilan University, Israel brosss@macs.biu.ac.il Amos Lapidoth Ligong Wang Signal and Information Processing Laboratory ETH
More informationA DYNAMICAL STATE UNDERLYING THE SECOND ORDER MAXIMUM ENTROPY PRINCIPLE IN NEURONAL NETWORKS
COMMUN. MATH. SCI. Vol. 15, No. 3, pp. 665 692 c 2017 International Press A DYNAMICAL STATE UNDERLYING THE SECOND ORDER MAXIMUM ENTROPY PRINCIPLE IN NEURONAL NETWORKS ZHI-QIN JOHN XU, GUOQIANG BI, DOUGLAS
More informationToday. Calculus. Linear Regression. Lagrange Multipliers
Today Calculus Lagrange Multipliers Linear Regression 1 Optimization with constraints What if I want to constrain the parameters of the model. The mean is less than 10 Find the best likelihood, subject
More informationClustering means geometry in sparse graphs. Dmitri Krioukov Northeastern University Workshop on Big Graphs UCSD, San Diego, CA, January 2016
in sparse graphs Dmitri Krioukov Northeastern University Workshop on Big Graphs UCSD, San Diego, CA, January 206 Motivation Latent space models Successfully used in sociology since the 70ies Recently shown
More informationHorizontal mergers: Merger in the salmon market Unilateral effects in Cournot markets with differentiated products
Horizontal mergers: Unilateral effects in Cournot markets with differentiated products 1 1 Conseil de la Concurrence, Paris This presentation represents a personal view and does not necessarily reflect
More informationChapter 5 continued. Chapter 5 sections
Chapter 5 sections Discrete univariate distributions: 5.2 Bernoulli and Binomial distributions Just skim 5.3 Hypergeometric distributions 5.4 Poisson distributions Just skim 5.5 Negative Binomial distributions
More informationCourse 495: Advanced Statistical Machine Learning/Pattern Recognition
Course 495: Advanced Statistical Machine Learning/Pattern Recognition Lecturer: Stefanos Zafeiriou Goal (Lectures): To present discrete and continuous valued probabilistic linear dynamical systems (HMMs
More informationCS Lecture 19. Exponential Families & Expectation Propagation
CS 6347 Lecture 19 Exponential Families & Expectation Propagation Discrete State Spaces We have been focusing on the case of MRFs over discrete state spaces Probability distributions over discrete spaces
More informationCPSC 340: Machine Learning and Data Mining
CPSC 340: Machine Learning and Data Mining MLE and MAP Original version of these slides by Mark Schmidt, with modifications by Mike Gelbart. 1 Admin Assignment 4: Due tonight. Assignment 5: Will be released
More informationNaïve Bayes Introduction to Machine Learning. Matt Gormley Lecture 3 September 14, Readings: Mitchell Ch Murphy Ch.
School of Computer Science 10-701 Introduction to Machine Learning aïve Bayes Readings: Mitchell Ch. 6.1 6.10 Murphy Ch. 3 Matt Gormley Lecture 3 September 14, 2016 1 Homewor 1: due 9/26/16 Project Proposal:
More informationSpatial point process. Odd Kolbjørnsen 13.March 2017
Spatial point process Odd Kolbjørnsen 13.March 2017 Today Point process Poisson process Intensity Homogeneous Inhomogeneous Other processes Random intensity (Cox process) Log Gaussian cox process (LGCP)
More informationEfficient Coding. Odelia Schwartz 2017
Efficient Coding Odelia Schwartz 2017 1 Levels of modeling Descriptive (what) Mechanistic (how) Interpretive (why) 2 Levels of modeling Fitting a receptive field model to experimental data (e.g., using
More informationInformation Theory in Intelligent Decision Making
Information Theory in Intelligent Decision Making Adaptive Systems and Algorithms Research Groups School of Computer Science University of Hertfordshire, United Kingdom June 7, 2015 Information Theory
More informationStatistics for Particle Physics. Kyle Cranmer. New York University. Kyle Cranmer (NYU) CERN Academic Training, Feb 2-5, 2009
Statistics for Particle Physics Kyle Cranmer New York University 91 Remaining Lectures Lecture 3:! Compound hypotheses, nuisance parameters, & similar tests! The Neyman-Construction (illustrated)! Inverted
More informationEECS 750. Hypothesis Testing with Communication Constraints
EECS 750 Hypothesis Testing with Communication Constraints Name: Dinesh Krithivasan Abstract In this report, we study a modification of the classical statistical problem of bivariate hypothesis testing.
More informationConcerns of the Psychophysicist. Three methods for measuring perception. Yes/no method of constant stimuli. Detection / discrimination.
Three methods for measuring perception Concerns of the Psychophysicist. Magnitude estimation 2. Matching 3. Detection/discrimination Bias/ Attentiveness Strategy/Artifactual Cues History of stimulation
More informationTHE QUEEN S UNIVERSITY OF BELFAST
THE QUEEN S UNIVERSITY OF BELFAST 0SOR20 Level 2 Examination Statistics and Operational Research 20 Probability and Distribution Theory Wednesday 4 August 2002 2.30 pm 5.30 pm Examiners { Professor R M
More information2. What are the tradeoffs among different measures of error (e.g. probability of false alarm, probability of miss, etc.)?
ECE 830 / CS 76 Spring 06 Instructors: R. Willett & R. Nowak Lecture 3: Likelihood ratio tests, Neyman-Pearson detectors, ROC curves, and sufficient statistics Executive summary In the last lecture we
More informationUse of the likelihood principle in physics. Statistics II
Use of the likelihood principle in physics Statistics II 1 2 3 + Bayesians vs Frequentists 4 Why ML does work? hypothesis observation 5 6 7 8 9 10 11 ) 12 13 14 15 16 Fit of Histograms corresponds This
More informationSpring 2012 Math 541B Exam 1
Spring 2012 Math 541B Exam 1 1. A sample of size n is drawn without replacement from an urn containing N balls, m of which are red and N m are black; the balls are otherwise indistinguishable. Let X denote
More informationCS839: Probabilistic Graphical Models. Lecture 7: Learning Fully Observed BNs. Theo Rekatsinas
CS839: Probabilistic Graphical Models Lecture 7: Learning Fully Observed BNs Theo Rekatsinas 1 Exponential family: a basic building block For a numeric random variable X p(x ) =h(x)exp T T (x) A( ) = 1
More informationUniversity of Illinois ECE 313: Final Exam Fall 2014
University of Illinois ECE 313: Final Exam Fall 2014 Monday, December 15, 2014, 7:00 p.m. 10:00 p.m. Sect. B, names A-O, 1013 ECE, names P-Z, 1015 ECE; Section C, names A-L, 1015 ECE; all others 112 Gregory
More informationInformation Theory and Hypothesis Testing
Summer School on Game Theory and Telecommunications Campione, 7-12 September, 2014 Information Theory and Hypothesis Testing Mauro Barni University of Siena September 8 Review of some basic results linking
More informationDetection theory 101 ELEC-E5410 Signal Processing for Communications
Detection theory 101 ELEC-E5410 Signal Processing for Communications Binary hypothesis testing Null hypothesis H 0 : e.g. noise only Alternative hypothesis H 1 : signal + noise p(x;h 0 ) γ p(x;h 1 ) Trade-off
More informationMark your answers ON THE EXAM ITSELF. If you are not sure of your answer you may wish to provide a brief explanation.
CS 189 Spring 2015 Introduction to Machine Learning Midterm You have 80 minutes for the exam. The exam is closed book, closed notes except your one-page crib sheet. No calculators or electronic items.
More informationVariational Inference (11/04/13)
STA561: Probabilistic machine learning Variational Inference (11/04/13) Lecturer: Barbara Engelhardt Scribes: Matt Dickenson, Alireza Samany, Tracy Schifeling 1 Introduction In this lecture we will further
More informationSPIKE TRIGGERED APPROACHES. Odelia Schwartz Computational Neuroscience Course 2017
SPIKE TRIGGERED APPROACHES Odelia Schwartz Computational Neuroscience Course 2017 LINEAR NONLINEAR MODELS Linear Nonlinear o Often constrain to some form of Linear, Nonlinear computations, e.g. visual
More informationi=1 h n (ˆθ n ) = 0. (2)
Stat 8112 Lecture Notes Unbiased Estimating Equations Charles J. Geyer April 29, 2012 1 Introduction In this handout we generalize the notion of maximum likelihood estimation to solution of unbiased estimating
More informationProbabilistic classification CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2016
Probabilistic classification CE-717: Machine Learning Sharif University of Technology M. Soleymani Fall 2016 Topics Probabilistic approach Bayes decision theory Generative models Gaussian Bayes classifier
More information15-388/688 - Practical Data Science: Basic probability. J. Zico Kolter Carnegie Mellon University Spring 2018
15-388/688 - Practical Data Science: Basic probability J. Zico Kolter Carnegie Mellon University Spring 2018 1 Announcements Logistics of next few lectures Final project released, proposals/groups due
More informationFisher Information in Gaussian Graphical Models
Fisher Information in Gaussian Graphical Models Jason K. Johnson September 21, 2006 Abstract This note summarizes various derivations, formulas and computational algorithms relevant to the Fisher information
More informationEECS564 Estimation, Filtering, and Detection Exam 2 Week of April 20, 2015
EECS564 Estimation, Filtering, and Detection Exam Week of April 0, 015 This is an open book takehome exam. You have 48 hours to complete the exam. All work on the exam should be your own. problems have
More informationLecture 2: Repetition of probability theory and statistics
Algorithms for Uncertainty Quantification SS8, IN2345 Tobias Neckel Scientific Computing in Computer Science TUM Lecture 2: Repetition of probability theory and statistics Concept of Building Block: Prerequisites:
More informationComments. x > w = w > x. Clarification: this course is about getting you to be able to think as a machine learning expert
Logistic regression Comments Mini-review and feedback These are equivalent: x > w = w > x Clarification: this course is about getting you to be able to think as a machine learning expert There has to be
More informationSupport Vector Machines
Support Vector Machines Le Song Machine Learning I CSE 6740, Fall 2013 Naïve Bayes classifier Still use Bayes decision rule for classification P y x = P x y P y P x But assume p x y = 1 is fully factorized
More informationAPC486/ELE486: Transmission and Compression of Information. Bounds on the Expected Length of Code Words
APC486/ELE486: Transmission and Compression of Information Bounds on the Expected Length of Code Words Scribe: Kiran Vodrahalli September 8, 204 Notations In these notes, denotes a finite set, called the
More informationLecture #11: Classification & Logistic Regression
Lecture #11: Classification & Logistic Regression CS 109A, STAT 121A, AC 209A: Data Science Weiwei Pan, Pavlos Protopapas, Kevin Rader Fall 2016 Harvard University 1 Announcements Midterm: will be graded
More informationc 2010 Christopher John Quinn
c 2010 Christopher John Quinn ESTIMATING DIRECTED INFORMATION TO INFER CAUSAL RELATIONSHIPS BETWEEN NEURAL SPIKE TRAINS AND APPROXIMATING DISCRETE PROBABILITY DISTRIBUTIONS WITH CAUSAL DEPENDENCE TREES
More informationEE 4TM4: Digital Communications II. Channel Capacity
EE 4TM4: Digital Communications II 1 Channel Capacity I. CHANNEL CODING THEOREM Definition 1: A rater is said to be achievable if there exists a sequence of(2 nr,n) codes such thatlim n P (n) e (C) = 0.
More information9 Classification. 9.1 Linear Classifiers
9 Classification This topic returns to prediction. Unlike linear regression where we were predicting a numeric value, in this case we are predicting a class: winner or loser, yes or no, rich or poor, positive
More informationLecture 5: Likelihood ratio tests, Neyman-Pearson detectors, ROC curves, and sufficient statistics. 1 Executive summary
ECE 830 Spring 207 Instructor: R. Willett Lecture 5: Likelihood ratio tests, Neyman-Pearson detectors, ROC curves, and sufficient statistics Executive summary In the last lecture we saw that the likelihood
More informationNovel spectrum sensing schemes for Cognitive Radio Networks
Novel spectrum sensing schemes for Cognitive Radio Networks Cantabria University Santander, May, 2015 Supélec, SCEE Rennes, France 1 The Advanced Signal Processing Group http://gtas.unican.es The Advanced
More informationData complexity and success probability of statisticals cryptanalysis
Data complexity and success probability of statisticals cryptanalysis Céline Blondeau SECRET-Project-Team, INRIA, France Joint work with Benoît Gérard and Jean-Pierre Tillich aaa C.Blondeau Data complexity
More informationNatural Image Statistics and Neural Representations
Natural Image Statistics and Neural Representations Michael Lewicki Center for the Neural Basis of Cognition & Department of Computer Science Carnegie Mellon University? 1 Outline 1. Information theory
More informationMachine Learning, Fall 2012 Homework 2
0-60 Machine Learning, Fall 202 Homework 2 Instructors: Tom Mitchell, Ziv Bar-Joseph TA in charge: Selen Uguroglu email: sugurogl@cs.cmu.edu SOLUTIONS Naive Bayes, 20 points Problem. Basic concepts, 0
More informationProbabilistic modeling. The slides are closely adapted from Subhransu Maji s slides
Probabilistic modeling The slides are closely adapted from Subhransu Maji s slides Overview So far the models and algorithms you have learned about are relatively disconnected Probabilistic modeling framework
More informationSuperiorized Inversion of the Radon Transform
Superiorized Inversion of the Radon Transform Gabor T. Herman Graduate Center, City University of New York March 28, 2017 The Radon Transform in 2D For a function f of two real variables, a real number
More informationLecture 7 Introduction to Statistical Decision Theory
Lecture 7 Introduction to Statistical Decision Theory I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw December 20, 2016 1 / 55 I-Hsiang Wang IT Lecture 7
More informationMLPR: Logistic Regression and Neural Networks
MLPR: Logistic Regression and Neural Networks Machine Learning and Pattern Recognition Amos Storkey Amos Storkey MLPR: Logistic Regression and Neural Networks 1/28 Outline 1 Logistic Regression 2 Multi-layer
More informationOutline. MLPR: Logistic Regression and Neural Networks Machine Learning and Pattern Recognition. Which is the correct model? Recap.
Outline MLPR: and Neural Networks Machine Learning and Pattern Recognition 2 Amos Storkey Amos Storkey MLPR: and Neural Networks /28 Recap Amos Storkey MLPR: and Neural Networks 2/28 Which is the correct
More informationNeural Encoding: Firing Rates and Spike Statistics
Neural Encoding: Firing Rates and Spike Statistics Dayan and Abbott (21) Chapter 1 Instructor: Yoonsuck Choe; CPSC 644 Cortical Networks Background: Dirac δ Function Dirac δ function has the following
More informationEconomics 520. Lecture Note 19: Hypothesis Testing via the Neyman-Pearson Lemma CB 8.1,
Economics 520 Lecture Note 9: Hypothesis Testing via the Neyman-Pearson Lemma CB 8., 8.3.-8.3.3 Uniformly Most Powerful Tests and the Neyman-Pearson Lemma Let s return to the hypothesis testing problem
More informationSpace-Time Coding for Multi-Antenna Systems
Space-Time Coding for Multi-Antenna Systems ECE 559VV Class Project Sreekanth Annapureddy vannapu2@uiuc.edu Dec 3rd 2007 MIMO: Diversity vs Multiplexing Multiplexing Diversity Pictures taken from lectures
More informationMACHINE LEARNING. Methods for feature extraction and reduction of dimensionality: Probabilistic PCA and kernel PCA
1 MACHINE LEARNING Methods for feature extraction and reduction of dimensionality: Probabilistic PCA and kernel PCA 2 Practicals Next Week Next Week, Practical Session on Computer Takes Place in Room GR
More informationMarkov Chains. X(t) is a Markov Process if, for arbitrary times t 1 < t 2 <... < t k < t k+1. If X(t) is discrete-valued. If X(t) is continuous-valued
Markov Chains X(t) is a Markov Process if, for arbitrary times t 1 < t 2
More informationExponential Families
Exponential Families David M. Blei 1 Introduction We discuss the exponential family, a very flexible family of distributions. Most distributions that you have heard of are in the exponential family. Bernoulli,
More informationPosterior Regularization
Posterior Regularization 1 Introduction One of the key challenges in probabilistic structured learning, is the intractability of the posterior distribution, for fast inference. There are numerous methods
More informationExpectation. DS GA 1002 Probability and Statistics for Data Science. Carlos Fernandez-Granda
Expectation DS GA 1002 Probability and Statistics for Data Science http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall17 Carlos Fernandez-Granda Aim Describe random variables with a few numbers: mean,
More informationChapter 4 No. 4.0 Answer True or False to the following. Give reasons for your answers.
MATH 434/534 Theoretical Assignment 3 Solution Chapter 4 No 40 Answer True or False to the following Give reasons for your answers If a backward stable algorithm is applied to a computational problem,
More information