Encoding or decoding
|
|
- Todd Jackson
- 5 years ago
- Views:
Transcription
1 Encoding or decoding
2 Decoding How well can we learn what the stimulus is by looking at the neural responses? We will discuss two approaches: devise and evaluate explicit algorithms for extracting a stimulus estimate directly quantify the relationship between stimulus and response using information theory
3 The optimal linear estimator Let s start with a rate response, r(t) and a stimulus, s(t). The optimal linear estimator is closest to satisfying Want to solve for K. Multiply by s(t-t ) and integrate over t:
4 The optimal linear estimator produced terms which are simply correlation functions: Given a convolution, Fourier transform: Now we have a straightforward algebraic equation for K(w): Solving for K(t),
5 The optimal linear estimator For white noise, the correlation function C ss (t) = s 2 d(t), So K(t) is simply C rs (t).
6 Stimulus reconstruction K t
7 Stimulus reconstruction
8 Stimulus reconstruction
9 Reading minds: the LGN Yang Dan, UC Berkeley
10 Other decoding approaches
11 Binary choice tasks Britten et al. 92: measured both behavior + neural responses
12 Behavioral performance
13 Predictable from neural activity? Discriminability: d = ( <r> + - <r> - )/ s r
14 Signal detection theory z p(r -) p(r +) <r> - <r> + Decoding corresponds to comparing test, r, to threshold, z. a(z) = P[ r z -] false alarm rate, size b(z) = P[ r z +] hit rate, power Find z by maximizing P[correct] = p[+] b(z) + p[-](1 a(z))
15 ROC curves summarize performance of test for different thresholds z Want b 1, a 0.
16 ROC: two alternative forced choice Threshold z is the result from the first presentation The area under the ROC curve corresponds to P[correct]
17 Is there a better test to use than r? The optimal test function is the likelihood ratio, l(r) = p[r +] / p[r -]. (Neyman-Pearson lemma) Recall a(z) = P[ r z -] b(z) = P[ r z +] false alarm rate, size hit rate, power Then l(z) = (db/dz) / (da/dz) = db/da i.e. slope of ROC curve
18 The logistic function If p[r +] and p[r -] are both Gaussian, one can show that P[correct] = ½ erfc(-d /2). To interpret results as two-alternative forced choice, need simultaneous responses from + neuron and from neuron. Simulate - neuron responses from same neuron in response to stimulus. Ideal observer: performs as area under ROC curve.
19 More d Again, if and p[r -] and p[r +] are Gaussian, p[+] and p[-] are equal, P[+ r] = 1/ [1 + exp(-d (r - <r>)/ s)]. d is the slope of the sigmoidal fitted to P[+ r]
20 Neurons vs organisms Close correspondence between neural and behaviour.. Why so many neurons? Correlations limit performance.
21 Priors z p[r -] p[r +] <r> - <r> + Role of priors: Find z by maximizing P[correct] = p[+] b(z) + p[-](1 a(z))
22 The wind or a tiger? Classification of noisy data: single photon responses Rieke
23 Nonlinear separation of signal and noise Classification of noisy data: single photon responses P(I noise) P(I signal) I Rieke
24 Nonlinear separation of signal and noise Classification of noisy data: single photon responses P(I noise) P(I signal) I Rieke
25 Nonlinear separation of signal and noise Classification of noisy data: single photon responses P(I noise) P(I signal) I Rieke
26 Nonlinear separation of signal and noise Classification of noisy data: single photon responses P(I noise) P(noise) P(I signal) P(signal) I Rieke
27 How about costs? P(I noise) P(noise) P(I signal) P(signal) I
28 Building in cost Penalty for incorrect answer: L +, L - For an observation r, what is the expected loss? Loss - = L - P[+ r] Loss + = L + P[- r] Cut your losses: answer + when Loss + < Loss - i.e. Using Bayes, L + P[- r] < L - P[+ r]. P[+ r] = p[r +]P[+]/p(r); P[- r] = p[r -]P[-]/p(r); l(r) = p[r +]/p[r -] > L + P[-] / L - P[+].
29 Relationship of likelihood to tuning curves For small stimulus differences s and s + ds like comparing to threshold
30 Decoding from many neurons: population codes Population code formulation Methods for decoding: population vector Bayesian inference maximum likelihood maximum a posteriori Fisher information
31 Cricket cercal cells Jacobs G A et al. J Exp Biol 2008;211: by The Company of Biologists Ltd
32 Cricket cercal cells
33 Population vector RMS error in estimate Theunissen & Miller, 1991
34 Population coding in M1 r 0 Hand reaching direction Cosine tuning curve of a motor cortical neuron
35 Population coding in M1 Cosine tuning: Pop. vector: For sufficiently large N, is parallel to the direction of arm movement
36 Population coding in M1 Cosine tuning: Pop. vector: Difficulties with this coding scheme?
37 Is this the best one can do? The population vector is neither general nor optimal. Optimal : make use of all information in the stimulus/response distributions
38 Bayesian inference Bayes law: likelihood function conditional distribution prior distribution a posteriori distribution marginal distribution
39 Bayesian estimation Want an estimator s Bayes Introduce a cost function, L(s,s Bayes ); minimize mean cost. For least squares cost, L(s,s Bayes ) = (s s Bayes ) 2 ; solution is the conditional mean.
40 Bayesian inference By Bayes law, likelihood function a posteriori distribution
41 Maximum likelihood Find maximum of P[r s] over s More generally, probability of the data given the model Model = stimulus assume parametric form for tuning curve
42 Bayesian inference By Bayes law, likelihood function a posteriori distribution
43 Population vector RMS error in estimate Theunissen & Miller, 1991
44 MAP and ML ML: s* which maximizes p[r s] MAP: s* which maximizes p[s r] Difference is the role of the prior: differ by factor p[s]/p[r] For cercal data:
45 Decoding an arbitrary continuous stimulus Work through a specific example assume independence assume Poisson firing Noise model: Poisson distribution P T [k] = (lt) k exp(-lt)/k!
46 Decoding an arbitrary continuous stimulus E.g. Gaussian tuning curves
47 Need to know full P[r s] Assume Poisson: Assume independent: Population response of 11 cells with Gaussian tuning curves
48 ML Apply ML: maximize ln P[r s] with respect to s Set derivative to zero, use sum = constant From Gaussianity of tuning curves, If all s same
49 MAP Apply MAP: maximise ln p[s r] with respect to s Set derivative to zero, use sum = constant From Gaussianity of tuning curves,
50 Given this data: Prior with mean -2, variance 1 MAP: Constant prior
51 How good is our estimate? For stimulus s, have estimated s est Bias: Variance: Mean square error: Cramer-Rao bound: Fisher information (ML is unbiased: b = b = 0)
52 Fisher information Alternatively: Quantifies local stimulus discriminability
53 Fisher information for Gaussian tuning curves For the Gaussian tuning curves w/poisson statistics:
54 Are narrow or broad tuning curves better? Approximate: Thus, Narrow tuning curves are better But not in higher dimensions!..what happens in 2D?
55 Fisher information and discrimination Recall d' = mean difference/standard deviation Can also decode and discriminate using decoded values. Trying to discriminate s and s+ds: Difference in ML estimate is Ds (unbiased) variance in estimate is 1/I F (s).
56 Limitations of these approaches Tuning curve/mean firing rate Correlations in the population
57 The importance of correlation Shadlen and Newsome, 98
58 The importance of correlation
59 The importance of correlation
60 Entropy and Shannon information Model-based vs model free
61 Entropy and Shannon information For a random variable X with distribution p(x), the entropy is H[X] = - S x p(x) log 2 p(x) Information is defined as I[X] = - log 2 p(x)
62 Mutual information Typically, information = mutual information: how much knowing the value of one random variable r (the response) reduces uncertainty about another random variable s (the stimulus). Variability in response is due both to different stimuli and to noise. How much response variability is useful, i.e. can represent different messages, depends on the noise. Noise can be specific to a given stimulus.
63 Mutual information Information quantifies how independent r and s are: I(s;r) = D KL [P(r,s), P(r)P(s)] Alternatively: I(s;r) = H[P(r)] S s P(s) H[P(r s)].
64 Mutual information Mutual information is the difference between the total response entropy and the mean noise entropy: I(s;r) = H[P(r)] S s P(s) H[P(r s)]. Need to know the conditional distribution P(s r) or P(r s). Take a particular stimulus s=s 0 and repeat many times to obtain P(r s 0 ). Compute variability due to noise: noise entropy
65 Mutual information Information is symmetric in r and s Examples: response is unrelated to stimulus: p[r s] =?, MI =? response is perfectly predicted by stimulus: p[r s] =?
66 Simple example r + encodes stimulus +, r - encodes stimulus - but with a probability of error: P(r + +) = 1- p P(r - -) = 1- p What is the response entropy H[p]? What is the noise entropy?
67 Entropy and Shannon information Entropy Information H[p] = -p + log p + (1-p + )log(1-p + ) When p + = ½, H[P(r s)] = -p log p (1-p)log(1-p)
Signal detection theory
Signal detection theory z p[r -] p[r +] - + Role of priors: Find z by maximizing P[correct] = p[+] b(z) + p[-](1 a(z)) Is there a better test to use than r? z p[r -] p[r +] - + The optimal
More information3.3 Population Decoding
3.3 Population Decoding 97 We have thus far considered discriminating between two quite distinct stimulus values, plus and minus. Often we are interested in discriminating between two stimulus values s
More informationNeural Decoding. Chapter Encoding and Decoding
Chapter 3 Neural Decoding 3.1 Encoding and Decoding In chapters 1 and 2, we considered the problem of predicting neural responses to known stimuli. The nervous system faces the reverse problem, determining
More information3 Neural Decoding. 3.1 Encoding and Decoding. (r 1, r 2,..., r N ) for N neurons is a list of spike-count firing rates, although,
3 Neural Decoding 3.1 Encoding and Decoding In chapters 1 and 2, we considered the problem of predicting neural responses to known stimuli. The nervous system faces the reverse problem, determining what
More information+ + ( + ) = Linear recurrent networks. Simpler, much more amenable to analytic treatment E.g. by choosing
Linear recurrent networks Simpler, much more amenable to analytic treatment E.g. by choosing + ( + ) = Firing rates can be negative Approximates dynamics around fixed point Approximation often reasonable
More informationPopulation Coding. Maneesh Sahani Gatsby Computational Neuroscience Unit University College London
Population Coding Maneesh Sahani maneesh@gatsby.ucl.ac.uk Gatsby Computational Neuroscience Unit University College London Term 1, Autumn 2010 Coding so far... Time-series for both spikes and stimuli Empirical
More informationExercises. Chapter 1. of τ approx that produces the most accurate estimate for this firing pattern.
1 Exercises Chapter 1 1. Generate spike sequences with a constant firing rate r 0 using a Poisson spike generator. Then, add a refractory period to the model by allowing the firing rate r(t) to depend
More informationCSE/NB 528 Final Lecture: All Good Things Must. CSE/NB 528: Final Lecture
CSE/NB 528 Final Lecture: All Good Things Must 1 Course Summary Where have we been? Course Highlights Where do we go from here? Challenges and Open Problems Further Reading 2 What is the neural code? What
More information!) + log(t) # n i. The last two terms on the right hand side (RHS) are clearly independent of θ and can be
Supplementary Materials General case: computing log likelihood We first describe the general case of computing the log likelihood of a sensory parameter θ that is encoded by the activity of neurons. Each
More informationThe Bayesian Brain. Robert Jacobs Department of Brain & Cognitive Sciences University of Rochester. May 11, 2017
The Bayesian Brain Robert Jacobs Department of Brain & Cognitive Sciences University of Rochester May 11, 2017 Bayesian Brain How do neurons represent the states of the world? How do neurons represent
More information1/12/2017. Computational neuroscience. Neurotechnology.
Computational neuroscience Neurotechnology https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-core-concepts/ 1 Neurotechnology http://www.lce.hut.fi/research/cogntech/neurophysiology Recording
More informationInformation Theory. Mark van Rossum. January 24, School of Informatics, University of Edinburgh 1 / 35
1 / 35 Information Theory Mark van Rossum School of Informatics, University of Edinburgh January 24, 2018 0 Version: January 24, 2018 Why information theory 2 / 35 Understanding the neural code. Encoding
More informationencoding and estimation bottleneck and limits to visual fidelity
Retina Light Optic Nerve photoreceptors encoding and estimation bottleneck and limits to visual fidelity interneurons ganglion cells light The Neural Coding Problem s(t) {t i } Central goals for today:
More informationConcerns of the Psychophysicist. Three methods for measuring perception. Yes/no method of constant stimuli. Detection / discrimination.
Three methods for measuring perception Concerns of the Psychophysicist. Magnitude estimation 2. Matching 3. Detection/discrimination Bias/ Attentiveness Strategy/Artifactual Cues History of stimulation
More informationExercise Sheet 4: Covariance and Correlation, Bayes theorem, and Linear discriminant analysis
Exercise Sheet 4: Covariance and Correlation, Bayes theorem, and Linear discriminant analysis Younesse Kaddar. Covariance and Correlation Assume that we have recorded two neurons in the two-alternative-forced
More informationEstimation of information-theoretic quantities
Estimation of information-theoretic quantities Liam Paninski Gatsby Computational Neuroscience Unit University College London http://www.gatsby.ucl.ac.uk/ liam liam@gatsby.ucl.ac.uk November 16, 2004 Some
More informationWhat is the neural code? Sekuler lab, Brandeis
What is the neural code? Sekuler lab, Brandeis What is the neural code? What is the neural code? Alan Litke, UCSD What is the neural code? What is the neural code? What is the neural code? Encoding: how
More informationPATTERN RECOGNITION AND MACHINE LEARNING
PATTERN RECOGNITION AND MACHINE LEARNING Chapter 1. Introduction Shuai Huang April 21, 2014 Outline 1 What is Machine Learning? 2 Curve Fitting 3 Probability Theory 4 Model Selection 5 The curse of dimensionality
More informationEfficient coding of natural images with a population of noisy Linear-Nonlinear neurons
Efficient coding of natural images with a population of noisy Linear-Nonlinear neurons Yan Karklin and Eero P. Simoncelli NYU Overview Efficient coding is a well-known objective for the evaluation and
More informationBayesian Inference. 2 CS295-7 cfl Michael J. Black,
Population Coding Now listen to me closely, young gentlemen. That brain is thinking. Maybe it s thinking about music. Maybe it has a great symphony all thought out or a mathematical formula that would
More informationDecoding. How well can we learn what the stimulus is by looking at the neural responses?
Decoding How well can we learn what the stimulus is by looking at the neural responses? Two approaches: devise explicit algorithms for extracting a stimulus estimate directly quantify the relationship
More informationNeural coding Ecological approach to sensory coding: efficient adaptation to the natural environment
Neural coding Ecological approach to sensory coding: efficient adaptation to the natural environment Jean-Pierre Nadal CNRS & EHESS Laboratoire de Physique Statistique (LPS, UMR 8550 CNRS - ENS UPMC Univ.
More informationParametric Models. Dr. Shuang LIANG. School of Software Engineering TongJi University Fall, 2012
Parametric Models Dr. Shuang LIANG School of Software Engineering TongJi University Fall, 2012 Today s Topics Maximum Likelihood Estimation Bayesian Density Estimation Today s Topics Maximum Likelihood
More informationMathematical Tools for Neuroscience (NEU 314) Princeton University, Spring 2016 Jonathan Pillow. Homework 8: Logistic Regression & Information Theory
Mathematical Tools for Neuroscience (NEU 34) Princeton University, Spring 206 Jonathan Pillow Homework 8: Logistic Regression & Information Theory Due: Tuesday, April 26, 9:59am Optimization Toolbox One
More informationLecture 7 Introduction to Statistical Decision Theory
Lecture 7 Introduction to Statistical Decision Theory I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw December 20, 2016 1 / 55 I-Hsiang Wang IT Lecture 7
More informationMODULE -4 BAYEIAN LEARNING
MODULE -4 BAYEIAN LEARNING CONTENT Introduction Bayes theorem Bayes theorem and concept learning Maximum likelihood and Least Squared Error Hypothesis Maximum likelihood Hypotheses for predicting probabilities
More informationBayesian Decision Theory
Bayesian Decision Theory Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Fall 2017 CS 551, Fall 2017 c 2017, Selim Aksoy (Bilkent University) 1 / 46 Bayesian
More informationThe exam is closed book, closed notes except your one-page (two sides) or two-page (one side) crib sheet.
CS 189 Spring 013 Introduction to Machine Learning Final You have 3 hours for the exam. The exam is closed book, closed notes except your one-page (two sides) or two-page (one side) crib sheet. Please
More informationVariations. ECE 6540, Lecture 10 Maximum Likelihood Estimation
Variations ECE 6540, Lecture 10 Last Time BLUE (Best Linear Unbiased Estimator) Formulation Advantages Disadvantages 2 The BLUE A simplification Assume the estimator is a linear system For a single parameter
More informationMachine Learning. Lecture 4: Regularization and Bayesian Statistics. Feng Li. https://funglee.github.io
Machine Learning Lecture 4: Regularization and Bayesian Statistics Feng Li fli@sdu.edu.cn https://funglee.github.io School of Computer Science and Technology Shandong University Fall 207 Overfitting Problem
More informationPattern Recognition and Machine Learning. Bishop Chapter 6: Kernel Methods
Pattern Recognition and Machine Learning Chapter 6: Kernel Methods Vasil Khalidov Alex Kläser December 13, 2007 Training Data: Keep or Discard? Parametric methods (linear/nonlinear) so far: learn parameter
More informationMachine Learning Lecture 5
Machine Learning Lecture 5 Linear Discriminant Functions 26.10.2017 Bastian Leibe RWTH Aachen http://www.vision.rwth-aachen.de leibe@vision.rwth-aachen.de Course Outline Fundamentals Bayes Decision Theory
More informationSome slides from Carlos Guestrin, Luke Zettlemoyer & K Gajos 2
Logistics CSE 446: Point Estimation Winter 2012 PS2 out shortly Dan Weld Some slides from Carlos Guestrin, Luke Zettlemoyer & K Gajos 2 Last Time Random variables, distributions Marginal, joint & conditional
More informationCS 630 Basic Probability and Information Theory. Tim Campbell
CS 630 Basic Probability and Information Theory Tim Campbell 21 January 2003 Probability Theory Probability Theory is the study of how best to predict outcomes of events. An experiment (or trial or event)
More informationPART I INTRODUCTION The meaning of probability Basic definitions for frequentist statistics and Bayesian inference Bayesian inference Combinatorics
Table of Preface page xi PART I INTRODUCTION 1 1 The meaning of probability 3 1.1 Classical definition of probability 3 1.2 Statistical definition of probability 9 1.3 Bayesian understanding of probability
More informationCheng Soon Ong & Christian Walder. Canberra February June 2018
Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2018 (Many figures from C. M. Bishop, "Pattern Recognition and ") 1of 305 Part VII
More informationLecture 2: August 31
0-704: Information Processing and Learning Fall 206 Lecturer: Aarti Singh Lecture 2: August 3 Note: These notes are based on scribed notes from Spring5 offering of this course. LaTeX template courtesy
More informationMultivariate statistical methods and data mining in particle physics
Multivariate statistical methods and data mining in particle physics RHUL Physics www.pp.rhul.ac.uk/~cowan Academic Training Lectures CERN 16 19 June, 2008 1 Outline Statement of the problem Some general
More informationComputer Vision Group Prof. Daniel Cremers. 3. Regression
Prof. Daniel Cremers 3. Regression Categories of Learning (Rep.) Learnin g Unsupervise d Learning Clustering, density estimation Supervised Learning learning from a training data set, inference on the
More informationLecture 1: Introduction, Entropy and ML estimation
0-704: Information Processing and Learning Spring 202 Lecture : Introduction, Entropy and ML estimation Lecturer: Aarti Singh Scribes: Min Xu Disclaimer: These notes have not been subjected to the usual
More informationMidterm Review CS 7301: Advanced Machine Learning. Vibhav Gogate The University of Texas at Dallas
Midterm Review CS 7301: Advanced Machine Learning Vibhav Gogate The University of Texas at Dallas Supervised Learning Issues in supervised learning What makes learning hard Point Estimation: MLE vs Bayesian
More informationMachine Learning Linear Classification. Prof. Matteo Matteucci
Machine Learning Linear Classification Prof. Matteo Matteucci Recall from the first lecture 2 X R p Regression Y R Continuous Output X R p Y {Ω 0, Ω 1,, Ω K } Classification Discrete Output X R p Y (X)
More information10-704: Information Processing and Learning Fall Lecture 24: Dec 7
0-704: Information Processing and Learning Fall 206 Lecturer: Aarti Singh Lecture 24: Dec 7 Note: These notes are based on scribed notes from Spring5 offering of this course. LaTeX template courtesy of
More informationECE521 week 3: 23/26 January 2017
ECE521 week 3: 23/26 January 2017 Outline Probabilistic interpretation of linear regression - Maximum likelihood estimation (MLE) - Maximum a posteriori (MAP) estimation Bias-variance trade-off Linear
More informationStatistical Signal Processing Detection, Estimation, and Time Series Analysis
Statistical Signal Processing Detection, Estimation, and Time Series Analysis Louis L. Scharf University of Colorado at Boulder with Cedric Demeure collaborating on Chapters 10 and 11 A TT ADDISON-WESLEY
More informationStephen Scott.
1 / 35 (Adapted from Ethem Alpaydin and Tom Mitchell) sscott@cse.unl.edu In Homework 1, you are (supposedly) 1 Choosing a data set 2 Extracting a test set of size > 30 3 Building a tree on the training
More informationNeural Encoding Models
Neural Encoding Models Maneesh Sahani maneesh@gatsby.ucl.ac.uk Gatsby Computational Neuroscience Unit University College London Term 1, Autumn 2011 Studying sensory systems x(t) y(t) Decoding: ˆx(t)= G[y(t)]
More informationEstimation Tasks. Short Course on Image Quality. Matthew A. Kupinski. Introduction
Estimation Tasks Short Course on Image Quality Matthew A. Kupinski Introduction Section 13.3 in B&M Keep in mind the similarities between estimation and classification Image-quality is a statistical concept
More informationPATTERN CLASSIFICATION
PATTERN CLASSIFICATION Second Edition Richard O. Duda Peter E. Hart David G. Stork A Wiley-lnterscience Publication JOHN WILEY & SONS, INC. New York Chichester Weinheim Brisbane Singapore Toronto CONTENTS
More informationIntroduction to Signal Detection and Classification. Phani Chavali
Introduction to Signal Detection and Classification Phani Chavali Outline Detection Problem Performance Measures Receiver Operating Characteristics (ROC) F-Test - Test Linear Discriminant Analysis (LDA)
More informationIntroduction to Statistical Learning Theory
Introduction to Statistical Learning Theory In the last unit we looked at regularization - adding a w 2 penalty. We add a bias - we prefer classifiers with low norm. How to incorporate more complicated
More informationPrimer on statistics:
Primer on statistics: MLE, Confidence Intervals, and Hypothesis Testing ryan.reece@gmail.com http://rreece.github.io/ Insight Data Science - AI Fellows Workshop Feb 16, 018 Outline 1. Maximum likelihood
More informationMidterm Review CS 6375: Machine Learning. Vibhav Gogate The University of Texas at Dallas
Midterm Review CS 6375: Machine Learning Vibhav Gogate The University of Texas at Dallas Machine Learning Supervised Learning Unsupervised Learning Reinforcement Learning Parametric Y Continuous Non-parametric
More informationCheng Soon Ong & Christian Walder. Canberra February June 2018
Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2018 (Many figures from C. M. Bishop, "Pattern Recognition and ") 1of 254 Part V
More informationCheng Soon Ong & Christian Walder. Canberra February June 2018
Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2018 Outlines Overview Introduction Linear Algebra Probability Linear Regression
More informationBayesian Decision Theory
Introduction to Pattern Recognition [ Part 4 ] Mahdi Vasighi Remarks It is quite common to assume that the data in each class are adequately described by a Gaussian distribution. Bayesian classifier is
More informationIntro. ANN & Fuzzy Systems. Lecture 15. Pattern Classification (I): Statistical Formulation
Lecture 15. Pattern Classification (I): Statistical Formulation Outline Statistical Pattern Recognition Maximum Posterior Probability (MAP) Classifier Maximum Likelihood (ML) Classifier K-Nearest Neighbor
More informationLecture 5: GPs and Streaming regression
Lecture 5: GPs and Streaming regression Gaussian Processes Information gain Confidence intervals COMP-652 and ECSE-608, Lecture 5 - September 19, 2017 1 Recall: Non-parametric regression Input space X
More informationChapter 2. Binary and M-ary Hypothesis Testing 2.1 Introduction (Levy 2.1)
Chapter 2. Binary and M-ary Hypothesis Testing 2.1 Introduction (Levy 2.1) Detection problems can usually be casted as binary or M-ary hypothesis testing problems. Applications: This chapter: Simple hypothesis
More informationLinear & nonlinear classifiers
Linear & nonlinear classifiers Machine Learning Hamid Beigy Sharif University of Technology Fall 1394 Hamid Beigy (Sharif University of Technology) Linear & nonlinear classifiers Fall 1394 1 / 34 Table
More informationIntelligent Systems Statistical Machine Learning
Intelligent Systems Statistical Machine Learning Carsten Rother, Dmitrij Schlesinger WS2014/2015, Our tasks (recap) The model: two variables are usually present: - the first one is typically discrete k
More informationIntelligent Systems Discriminative Learning, Neural Networks
Intelligent Systems Discriminative Learning, Neural Networks Carsten Rother, Dmitrij Schlesinger WS2014/2015, Outline 1. Discriminative learning 2. Neurons and linear classifiers: 1) Perceptron-Algorithm
More informationWeek 5: Logistic Regression & Neural Networks
Week 5: Logistic Regression & Neural Networks Instructor: Sergey Levine 1 Summary: Logistic Regression In the previous lecture, we covered logistic regression. To recap, logistic regression models and
More informationCh 4. Linear Models for Classification
Ch 4. Linear Models for Classification Pattern Recognition and Machine Learning, C. M. Bishop, 2006. Department of Computer Science and Engineering Pohang University of Science and echnology 77 Cheongam-ro,
More informationBayesian Learning (II)
Universität Potsdam Institut für Informatik Lehrstuhl Maschinelles Lernen Bayesian Learning (II) Niels Landwehr Overview Probabilities, expected values, variance Basic concepts of Bayesian learning MAP
More informationVariable selection and feature construction using methods related to information theory
Outline Variable selection and feature construction using methods related to information theory Kari 1 1 Intelligent Systems Lab, Motorola, Tempe, AZ IJCNN 2007 Outline Outline 1 Information Theory and
More informationBayesian probability theory and generative models
Bayesian probability theory and generative models Bruno A. Olshausen November 8, 2006 Abstract Bayesian probability theory provides a mathematical framework for peforming inference, or reasoning, using
More informationMixture of Gaussians Models
Mixture of Gaussians Models Outline Inference, Learning, and Maximum Likelihood Why Mixtures? Why Gaussians? Building up to the Mixture of Gaussians Single Gaussians Fully-Observed Mixtures Hidden Mixtures
More informationLinear Models for Classification
Linear Models for Classification Oliver Schulte - CMPT 726 Bishop PRML Ch. 4 Classification: Hand-written Digit Recognition CHINE INTELLIGENCE, VOL. 24, NO. 24, APRIL 2002 x i = t i = (0, 0, 0, 1, 0, 0,
More informationMachine Learning 4771
Machine Learning 4771 Instructor: Tony Jebara Topic 11 Maximum Likelihood as Bayesian Inference Maximum A Posteriori Bayesian Gaussian Estimation Why Maximum Likelihood? So far, assumed max (log) likelihood
More informationFinal Overview. Introduction to ML. Marek Petrik 4/25/2017
Final Overview Introduction to ML Marek Petrik 4/25/2017 This Course: Introduction to Machine Learning Build a foundation for practice and research in ML Basic machine learning concepts: max likelihood,
More informationMachine learning - HT Maximum Likelihood
Machine learning - HT 2016 3. Maximum Likelihood Varun Kanade University of Oxford January 27, 2016 Outline Probabilistic Framework Formulate linear regression in the language of probability Introduce
More informationEEL 851: Biometrics. An Overview of Statistical Pattern Recognition EEL 851 1
EEL 851: Biometrics An Overview of Statistical Pattern Recognition EEL 851 1 Outline Introduction Pattern Feature Noise Example Problem Analysis Segmentation Feature Extraction Classification Design Cycle
More informationChapter 2 Signal Processing at Receivers: Detection Theory
Chapter Signal Processing at Receivers: Detection Theory As an application of the statistical hypothesis testing, signal detection plays a key role in signal processing at receivers of wireless communication
More informationLecture : Probabilistic Machine Learning
Lecture : Probabilistic Machine Learning Riashat Islam Reasoning and Learning Lab McGill University September 11, 2018 ML : Many Methods with Many Links Modelling Views of Machine Learning Machine Learning
More informationSupport Vector Machines
Support Vector Machines Le Song Machine Learning I CSE 6740, Fall 2013 Naïve Bayes classifier Still use Bayes decision rule for classification P y x = P x y P y P x But assume p x y = 1 is fully factorized
More informationAn introduction to Bayesian inference and model comparison J. Daunizeau
An introduction to Bayesian inference and model comparison J. Daunizeau ICM, Paris, France TNU, Zurich, Switzerland Overview of the talk An introduction to probabilistic modelling Bayesian model comparison
More informationLecture 3: Pattern Classification
EE E6820: Speech & Audio Processing & Recognition Lecture 3: Pattern Classification 1 2 3 4 5 The problem of classification Linear and nonlinear classifiers Probabilistic classification Gaussians, mixtures
More informationBasics of Information Processing
Abstract Don H. Johnson Computer & Information Technology Institute Department of Electrical & Computer Engineering Rice University, MS366 Houston, TX 77251 I describe the basics of probability theory,
More informationApplication: Can we tell what people are looking at from their brain activity (in real time)? Gaussian Spatial Smooth
Application: Can we tell what people are looking at from their brain activity (in real time? Gaussian Spatial Smooth 0 The Data Block Paradigm (six runs per subject Three Categories of Objects (counterbalanced
More informationOverview. Probabilistic Interpretation of Linear Regression Maximum Likelihood Estimation Bayesian Estimation MAP Estimation
Overview Probabilistic Interpretation of Linear Regression Maximum Likelihood Estimation Bayesian Estimation MAP Estimation Probabilistic Interpretation: Linear Regression Assume output y is generated
More informationINFORMATION PROCESSING ABILITY OF BINARY DETECTORS AND BLOCK DECODERS. Michael A. Lexa and Don H. Johnson
INFORMATION PROCESSING ABILITY OF BINARY DETECTORS AND BLOCK DECODERS Michael A. Lexa and Don H. Johnson Rice University Department of Electrical and Computer Engineering Houston, TX 775-892 amlexa@rice.edu,
More informationArtificial Neural Networks Examination, June 2004
Artificial Neural Networks Examination, June 2004 Instructions There are SIXTY questions (worth up to 60 marks). The exam mark (maximum 60) will be added to the mark obtained in the laborations (maximum
More informationLecture 3. Linear Regression II Bastian Leibe RWTH Aachen
Advanced Machine Learning Lecture 3 Linear Regression II 02.11.2015 Bastian Leibe RWTH Aachen http://www.vision.rwth-aachen.de/ leibe@vision.rwth-aachen.de This Lecture: Advanced Machine Learning Regression
More informationLecture 4: Types of errors. Bayesian regression models. Logistic regression
Lecture 4: Types of errors. Bayesian regression models. Logistic regression A Bayesian interpretation of regularization Bayesian vs maximum likelihood fitting more generally COMP-652 and ECSE-68, Lecture
More informationIf there exists a threshold k 0 such that. then we can take k = k 0 γ =0 and achieve a test of size α. c 2004 by Mark R. Bell,
Recall The Neyman-Pearson Lemma Neyman-Pearson Lemma: Let Θ = {θ 0, θ }, and let F θ0 (x) be the cdf of the random vector X under hypothesis and F θ (x) be its cdf under hypothesis. Assume that the cdfs
More informationDigital Transmission Methods S
Digital ransmission ethods S-7.5 Second Exercise Session Hypothesis esting Decision aking Gram-Schmidt method Detection.K.K. Communication Laboratory 5//6 Konstantinos.koufos@tkk.fi Exercise We assume
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 3 Linear
More informationIntroduction to Machine Learning Midterm Exam
10-701 Introduction to Machine Learning Midterm Exam Instructors: Eric Xing, Ziv Bar-Joseph 17 November, 2015 There are 11 questions, for a total of 100 points. This exam is open book, open notes, but
More informationProblem Set 2. MAS 622J/1.126J: Pattern Recognition and Analysis. Due: 5:00 p.m. on September 30
Problem Set 2 MAS 622J/1.126J: Pattern Recognition and Analysis Due: 5:00 p.m. on September 30 [Note: All instructions to plot data or write a program should be carried out using Matlab. In order to maintain
More informationAdvanced statistical methods for data analysis Lecture 1
Advanced statistical methods for data analysis Lecture 1 RHUL Physics www.pp.rhul.ac.uk/~cowan Universität Mainz Klausurtagung des GK Eichtheorien exp. Tests... Bullay/Mosel 15 17 September, 2008 1 Outline
More informationMachine Learning
Machine Learning 10-701 Tom M. Mitchell Machine Learning Department Carnegie Mellon University February 1, 2011 Today: Generative discriminative classifiers Linear regression Decomposition of error into
More informationBayesian inference J. Daunizeau
Bayesian inference J. Daunizeau Brain and Spine Institute, Paris, France Wellcome Trust Centre for Neuroimaging, London, UK Overview of the talk 1 Probabilistic modelling and representation of uncertainty
More informationPattern Recognition and Machine Learning
Christopher M. Bishop Pattern Recognition and Machine Learning ÖSpri inger Contents Preface Mathematical notation Contents vii xi xiii 1 Introduction 1 1.1 Example: Polynomial Curve Fitting 4 1.2 Probability
More informationMinimum Error-Rate Discriminant
Discriminants Minimum Error-Rate Discriminant In the case of zero-one loss function, the Bayes Discriminant can be further simplified: g i (x) =P (ω i x). (29) J. Corso (SUNY at Buffalo) Bayesian Decision
More informationMachine Learning, Midterm Exam
10-601 Machine Learning, Midterm Exam Instructors: Tom Mitchell, Ziv Bar-Joseph Wednesday 12 th December, 2012 There are 9 questions, for a total of 100 points. This exam has 20 pages, make sure you have
More informationNeural Decoding. Mark van Rossum. School of Informatics, University of Edinburgh. January 2012
Neural Decoding Mark van Rossum School of Informatics, University of Edinburgh January 2012 0 Acknowledgements: Chris Williams and slides from Gatsby Liam Paninski. Version: January 31, 2018 1 / 63 2 /
More informationLecture 9: PGM Learning
13 Oct 2014 Intro. to Stats. Machine Learning COMP SCI 4401/7401 Table of Contents I Learning parameters in MRFs 1 Learning parameters in MRFs Inference and Learning Given parameters (of potentials) and
More informationArtificial Neural Networks Examination, March 2004
Artificial Neural Networks Examination, March 2004 Instructions There are SIXTY questions (worth up to 60 marks). The exam mark (maximum 60) will be added to the mark obtained in the laborations (maximum
More information