Achievable rates for pattern recognition
Slide 1: Achievable rates for pattern recognition. M. Brandon Westover and Joseph A. O'Sullivan, Washington University in Saint Louis, Departments of Physics and Electrical Engineering.
Slide 2: Goals of information theory: models, bounds, design?
Slide 3: The pattern recognition problem. Model 1: the naïve view.
Slide 4: [Block diagram] The recognition environment: a training set X_1, X_2, ..., X_Mc; a selection process picks one member as the test pattern x_h.
Slide 5: [Block diagram] The environment of slide 4 plus the recognition system: a memory holding X_1, X_2, ..., X_Mc and a recognition module g that outputs ĥ. Objective: g(x_h) = h.
Slide 6: [Block diagram] The same system, stripped of labels: X_1, X_2, ..., X_Mc; select one as x_h; memory {X_1, X_2, ..., X_Mc}; g outputs ĥ. Objective: g(x_h) = h.
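To make the naïve view concrete, here is a minimal sketch (hypothetical parameters; the talk gives no code): the memory stores the training patterns verbatim and g recognizes by exact match, which works only because the test pattern is an uncorrupted copy.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical instance: M_c binary patterns of length n, stored verbatim.
M_c, n = 8, 16
patterns = rng.integers(0, 2, size=(M_c, n))       # X_1, ..., X_Mc
while len(np.unique(patterns, axis=0)) < M_c:      # assume distinct patterns
    patterns = rng.integers(0, 2, size=(M_c, n))

h = int(rng.integers(M_c))                         # selection process picks one
x_h = patterns[h]                                  # test pattern

def g(x, memory):
    """Naive recognizer: index of the stored pattern identical to x."""
    return int(np.flatnonzero((memory == x).all(axis=1))[0])

assert g(x_h, patterns) == h                       # objective: g(x_h) = h
```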
Slide 7: [Figure from Mumford, 2002.]
Slide 8: [Figure from Mumford, 2002.]
Slide 9: "Easy things are hard." (M. Minsky) What makes recognition problems hard? Problem-intrinsic challenges: data ambiguity, data complexity.
Slide 10: [Figure from Mumford, 1995.]
Slide 11: The infinity of signature variation. [Figure from Kersten, 1998.]
Slide 12: [Figure.]
Slide 13: [Figure from Yuille and Kersten, 2003.]
Slide 14: The pattern recognition problem. Model 2: the probabilistic view.
Slide 15: [Block diagram] As in slide 6: X_1, X_2, ..., X_Mc; select one as x_h; memory {X_1, X_2, ..., X_Mc}; g outputs ĥ. Objective: g(x_h) = h.
Slide 16: [Block diagram] The probabilistic view: a data model p(x) generates X_1, X_2, ..., X_Mc; an index h is selected according to p(h); an imaging model p(y|x) maps x_h to the observation y; g combines y with the memory {X_1, X_2, ..., X_Mc} to produce ĥ. Objective: Pr{ĥ = h} > 1 - ε.
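With an explicit p(y|x), the recognition module can be a likelihood decoder rather than an exact matcher. A minimal sketch, assuming a hypothetical bit-flipping imaging model and a uniform prior (my choices for illustration, not the talk's):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical model: p(h) uniform, p(x) i.i.d. Bernoulli(1/2),
# p(y|x) flips each bit independently with probability `flip`.
M_c, n, flip = 8, 64, 0.1
patterns = rng.integers(0, 2, size=(M_c, n))
h = int(rng.integers(M_c))                         # h ~ p(h)
y = patterns[h] ^ (rng.random(n) < flip)           # y ~ p(y | x_h)

def g(y, memory):
    """MAP decoder: for flip < 1/2 and a uniform prior, maximizing
    p(y|x_i) over stored patterns = minimizing Hamming distance."""
    return int(np.argmin((memory != y).sum(axis=1)))

print(g(y, patterns) == h)   # true with probability > 1 - eps for large n
```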
Slide 17: [Figure.]
Slide 18: "Easy things are hard." (M. Minsky) What makes pattern recognition hard? Problem-intrinsic challenges: data ambiguity, data complexity. Problem-solver-intrinsic challenges: faulty components, data storage limitations, data processing limitations.
Slide 19: The simplest recognition circuit (Barlow, 1959).
Slide 20: Redundancy: absolute, perceptual. Data compression: data volume ~10^8 bits/sec; data complexity; data cost ~10^9 ATP/bit; cortical energy budget ~10^20 ATP/sec. Storage capacity: accessibility, simplicity, stability. Modeling: prediction, inference.
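Taking the slide's figures at face value, a quick back-of-the-envelope check (my arithmetic, not the talk's) shows the pressure toward compression:

```python
# Back-of-the-envelope arithmetic on the slide's quoted figures.
data_rate    = 1e8     # bits/sec of raw sensory data
cost_per_bit = 1e9     # ATP molecules per bit handled
budget       = 1e20    # cortical energy budget, ATP/sec

raw_power = data_rate * cost_per_bit   # 1e17 ATP/sec
print(raw_power / budget)              # 1e-3: a thousandth of the entire
                                       # cortical budget just to move one
                                       # raw stream, before any storage
                                       # or downstream processing

seconds_per_day = 86_400
print(data_rate * seconds_per_day)     # ~8.6e12 bits/day if stored raw
```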
Slide 21: The pattern recognition problem. Model 3: the information-theoretic view.
Slide 22: [Block diagram] As in slide 16: data model p(x), prior p(h), imaging model p(y|x), observation y; g outputs ĥ. Objective: Pr{ĥ = h} > 1 - ε.
Slide 23: [Block diagram] The information-theoretic view: a memory encoder f maps the training patterns to the memory representation U_1, U_2, ..., U_Mx; a sensory encoder φ maps the observation y to the sensory representation V; g combines the two to produce ĥ. Objective: Pr{g(V) = h} > 1 - ε.
Slide 24: [Block diagram] As in slide 23, with rate constraints: the memory encoder f operates at rate R_x and the sensory encoder φ at rate R_y. Objective: Pr{g(V) = h} > 1 - ε, subject to the rate triple R = (R_c, R_x, R_y).
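Operationally, a rate constraint R_x means each length-n pattern must be described by about nR_x bits, i.e., by an index into a codebook of 2^(nR_x) entries. A minimal nearest-codeword sketch (a hypothetical quantizer for illustration, not the random-coding construction used in the proofs):

```python
import numpy as np

rng = np.random.default_rng(2)

# Rate R_x: each length-n pattern is replaced by one of M_x = 2^(n R_x)
# codewords, so the memory spends n*R_x bits per pattern.
n, R_x = 8, 0.5
M_x = 2 ** int(n * R_x)                    # 16 codewords
codebook = rng.random((M_x, n))            # hypothetical codebook U_1..U_Mx

def f(x):
    """Memory encoder: index of the nearest codeword (n*R_x bits)."""
    return int(np.argmin(((codebook - x) ** 2).sum(axis=1)))

x = rng.random(n)
U = codebook[f(x)]                         # memory representation of x
```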
Slide 25: Goals of information theory: models, bounds, design?
Slide 26: Problem statement: given the data model p(x), the prior p(h), and the imaging model p(y|x), determine the admissible rates R_x, R_y, and R_c for reliable recognition. [Block diagram: x_h → p(y|x) → y; memory encoder f at rate R_x; sensory encoder φ at rate R_y producing v; g operates on v and U_1, U_2, ..., U_Mx to output ĥ.]
Slide 27: Pattern recognition codes: definitions. [Block diagram: patterns x_1, x_2, x_3, ..., x_Mc drawn from p(x); index h selected by p(h); observation y from p(y|x); memory encoder f with codewords u(i), i = 1, ..., M_x; sensory encoder φ with codewords v(i), i = 1, ..., M_y; decoders g_1 and g_2.]
Slide 28: Achievable rates.
Slide 29: Characterizing the achievable rate region.
Slide 30: Characterizing the achievable rate region.
Slide 31: Outer bound, proof strategy: given codes (f, φ, g) achieving R, construct a distribution p**(x, y, u, v) satisfying U-X-Y, X-Y-V, etc. [Diagram: the set of all p(x, y, u, v).]
Slide 32: Characterizing the achievable rate region.
Slide 33: Inner bound, proof strategy: given a distribution p*(x, y, u, v) satisfying U-X-Y-V, etc., construct codes (f, φ, g) achieving R. [Diagram: the set of all p(x, y, u, v).]
Slide 34: The achievable region: R_x > I(X;U), R_y > I(Y;V), R_c < I(U;V) - I(U;V|X,Y). [Figure: the region plotted in the (I(X;U), I(Y;V)) plane, with axes running from 0 to H(X) and from 0 to H(Y).]
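All three quantities in the region are functionals of the joint pmf p(x, y, u, v), so they are straightforward to evaluate numerically. A minimal sketch for discrete variables (the toy pmf is arbitrary, for illustration only, and does not enforce the Markov structure the coding theorem assumes, so the R_c bound can even come out negative):

```python
import numpy as np

def H(p):
    """Entropy in bits of a pmf array of any shape."""
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def I2(pab):
    """Mutual information I(A;B) from a 2-D joint pmf."""
    return H(pab.sum(axis=1)) + H(pab.sum(axis=0)) - H(pab)

# Toy joint pmf over binary (x, y, u, v), axes in that order.
p = np.random.default_rng(3).random((2, 2, 2, 2))
p /= p.sum()

I_XU = I2(p.sum(axis=(1, 3)))              # marginal p(x, u)
I_YV = I2(p.sum(axis=(0, 2)))              # marginal p(y, v)
I_UV = I2(p.sum(axis=(0, 1)))              # marginal p(u, v)

# I(U;V|X,Y) = H(X,Y,U) + H(X,Y,V) - H(X,Y,U,V) - H(X,Y)
I_UV_XY = H(p.sum(axis=3)) + H(p.sum(axis=2)) - H(p) - H(p.sum(axis=(2, 3)))

print(f"Rx > {I_XU:.3f}, Ry > {I_YV:.3f}, Rc < {I_UV - I_UV_XY:.3f}")
```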
Slide 35: On the border of the region, U-X-Y-V holds, so R* = R** = R (the inner and outer bounds coincide). [Same figure, annotated with U = X at I(X;U) = H(X) and V = Y at I(Y;V) = H(Y).]
Slide 36: Under the Markov chain U-X-Y-V the conditional term vanishes, and the bound becomes R_c < I(U;V). [Same figure, annotated U = X, V = Y.]
Slide 37: Unlimited U, V capacity (U = X, V = Y): R_c < I(X;Y). Channel coding! [Figure: the corner of the region at (H(X), H(Y)), where the bound reads R_c = I(X;Y).]
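As a concrete instance (my numbers, not the talk's): if the imaging model were a binary symmetric channel with crossover probability p and the patterns i.i.d. Bernoulli(1/2), the corner bound R_c < I(X;Y) = 1 - h(p) is exactly Shannon's capacity calculation:

```python
import numpy as np

def h2(p):
    """Binary entropy function, in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return float(-p * np.log2(p) - (1 - p) * np.log2(1 - p))

p = 0.1                  # hypothetical crossover probability
print(1 - h2(p))         # ~0.531 bits/symbol: roughly 2^(0.531 n)
                         # patterns can be recognized reliably
```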
Slide 38: Poor memory (U = 0): R_c < I(0;V) = 0. Poor senses (V = 0): R_c < I(U;0) = 0. [Figure: along either axis the R_c bound collapses to zero.]
Slide 39: Unlimited V capacity (V = Y): R_c < I(X;Y) - I(X;Y|U). [Figure: along the edge I(Y;V) = H(Y), the bound reads R_c = I(X;Y) - I(X;Y|U).]
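Since the memory encoder induces the Markov chain U-X-Y, this bound can be rewritten as a single mutual information. By the chain rule, I(U,X;Y) = I(U;Y) + I(X;Y|U) = I(X;Y) + I(U;Y|X), and I(U;Y|X) = 0 under U-X-Y, so I(X;Y) - I(X;Y|U) = I(U;Y). In words: with perfect senses, reliable recognition is limited by how much the compressed memory representation still says about the observation.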
Slide 40: Unlimited U capacity (U = X): R_c < I(X;Y) - I(X;Y|V). [Figure: along the edge I(X;U) = H(X), the bound reads R_c = I(X;Y) - I(X;Y|V).]
Slide 41: [Summary figure: the value of the R_c bound around the boundary of the region: 0 along each axis, I(X;Y) - I(X;Y|U) along the V = Y edge, I(X;Y) - I(X;Y|V) along the U = X edge, and I(X;Y) at the corner U = X, V = Y.]
Slide 42: Revisiting the gap.
Slide 43: The gap: the outer bound is a union over distributions satisfying U-X-Y and X-Y-V, etc., while the inner bound is a union only over distributions satisfying U-X-Y-V, etc. [Diagram: the set of all p(x, y, u, v), with the Markov families nested inside.]
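Spelled out with the conditions from slide 34 (notation reconstructed from the slides): both bounds take the set { (R_x, R_y, R_c) : R_x > I(X;U), R_y > I(Y;V), R_c < I(U;V) - I(U;V|X,Y) } and form a union over joint distributions p(x, y, u, v). The outer bound R** ranges over all p satisfying U-X-Y and X-Y-V; the inner bound R* ranges only over the smaller family satisfying U-X-Y-V. Hence R* ⊆ R ⊆ R**, and the gap is exactly the difference between the two families of distributions.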
Slide 44: A related gap: the distributed source coding problem. [Diagram: a source p(x, y); X is encoded by f and Y by φ; a joint decoder g produces (U, V).] Problem: characterize the achievable tuples (R_x, R_y, D_x, D_y). Posed in the early 1970s; only partial solutions so far.
Slide 45: Comparison of the gaps: pattern recognition vs. distributed source coding.
Slide 46: Closing comments. An objective framework for normalizing recognition-system performance and guiding system design. Open directions: closing the gap; extensions for finite n; learning codebooks for real examples; connections to the information bottleneck framework, which shares a similar philosophy: distortion should be defined by the task!
Slide 47: [Figure.]
Slide 48: References for borrowed images:
- David Mumford, 1995, "Neuronal Architectures for Pattern-Theoretic Problems," in Large-Scale Neuronal Theories of the Brain, edited by C. Koch and J. L. Davis.
- Dan Kersten, 1998, slide from the NIPS tutorial "Computational Vision: Principles of Perceptual Inference."
- Kersten, D., Mamassian, P., and Yuille, A., 2003 (in press), "Object perception as Bayesian inference," Annual Review of Psychology.
- David Mumford and Agnes Desolneux, 2002, introductory chapter of Pattern Theory Through Examples.