ECE 566 INFORMATION THEORY. Fall 2011 Prof. Thinh Nguyen


ECE 566 - INFORMATION THEORY Fall 2011 Prof. Thinh Nguyen

LOGISTICS Prerequisites: ECE 353, strong mathematical background. Office hours: M, W 11-Noon, KEC3115. Course Content: Entropy, Relative Entropy, and Mutual Information; Asymptotic Equipartition Property; Entropy Rates of a Stochastic Process; Data Compression; Shannon Coding Theorems; Channel Capacity; Differential Entropy; Gaussian Channel; Rate Distortion Theory (time permitting, most likely not).

LOGISTICS Textbook: Elements of Information Theory, Thomas Cover and Joy Thomas, Wiley, second edition. Recommended reading: Information Theory, Inference, and Learning Algorithms, D. MacKay, http://www.inference.phy.cam.ac.uk/mackay/itila/ ; A Mathematical Theory of Communication, C. E. Shannon, Vol. 27, pp. 379-423, 623-656, July, October 1948, http://cm.bell-labs.com/cm/ms/what/shannonday/shannon1948.pdf Class email list: ece566-f11@engr.orst.edu

LOGISTICS Grading Policy: Scribing 5%, Homework 30%, Midterm (take-home) 30%, Final exam/project (take-home) 35%.

WHAT IS INFORMATION? Information theory deals with the concept of information, its measurement, and its applications.

ORIGIN OF INFORMATION THEORY Two schools of thought. British: semantic information, related to the meaning of messages; pragmatic information, related to the usage and effect of messages. American: syntactic information, related to the symbols from which messages are composed and their interrelations.

ORIGIN OF INFORMATION THEORY Consider the following sentences: 1. Jacqueline was awarded the gold medal by the judges at the national skating competition. 2. The judges awarded Jacqueline the gold medal at the national skating competition. 3. There is a traffic jam on I-5 between Corvallis and Eugene in Oregon. 4. There is a traffic jam on I-5 in Oregon. (1) and (2): syntactically different but semantically and pragmatically identical. (3) and (4): syntactically and semantically different; (3) gives more precise information than (4). (3) and (4) are irrelevant for Californians.

ORIGIN OF INFORMATION THEORY The British tradition is closely related to philosophy, psychology, and biology. Influential scientists include MacKay, Carnap, Bar-Hillel, ... Very hard, if not impossible, to quantify. The American tradition deals with the syntactic aspects of information. Influential scientists include Shannon, Renyi, Gallager, Csiszar, ... Can be rigorously quantified by mathematics.

ORIGIN OF INFORMATION THEORY Basic questions of information theory in the American tradition involve the measurement of syntactic information: the fundamental limits on the amount of information that can be transmitted; the fundamental limits on the compression of information that can be achieved; and how to build information processing systems approaching these limits.

ORIGIN OF INFORMATION THEORY H. Nyquist (1924) published an article in which he discussed how messages (characters) could be sent over a telegraph channel with the maximum possible speed, but without distortion. R. Hartley (1928) first defined a measure of information as the logarithm of the number of distinguishable messages one can represent using n characters, each of which can take one of s possible letters.

ORIGIN OF INFORMATION THEORY For a message of length n, Hartley's measure of information is $H(s^n) = \log(s^n) = n \log(s)$. For a message of length 1, Hartley's measure of information is $H(s^1) = \log(s^1) = 1 \cdot \log(s)$. This definition corresponds with the intuitive idea that a message consisting of n symbols contains n times as much information as a message consisting of only one symbol.
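As a quick added check of this formula: with s = 2 possible letters per character and the logarithm taken base 2, a message of n = 8 characters has $H(2^8) = 8 \log_2(2) = 8$ bits, i.e., one bit per binary character.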

ORIGIN OF INFORMATION THEORY Note that any function $f$ satisfying $f(s^n) = n f(s)$ would satisfy our intuition. One can show that the only functions that satisfy this equation are of the form $f(s) = \log_a(s)$; indeed, $f(s^n) = \log_a(s^n) = n \log_a(s) = n f(s)$. The choice of a is arbitrary, and is more a matter of normalization.

AND NOW, SHANNON'S INFORMATION THEORY (a.k.a. communication theory, mathematical information theory, or, in short, information theory)

CLAUDE SHANNON (1916-2001) "The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point." (Claude Shannon, 1948) Channel Coding Theorem: It is possible to achieve near-perfect communication of information over a noisy channel. In this course we will: define what we mean by information; show how we can compress the information in a source to its theoretically minimum value, and show the tradeoff between data compression and distortion; prove the Channel Coding Theorem and derive the information capacity of different channels.

The little formula that starts it all: $H(X) = E[-\log_2 p(X)] = -\sum_{x \in A} p(x) \log_2 p(x)$

SOME FUNDAMENTAL QUESTIONS (that can be addressed by information theory) What is the minimum number of bits that can be used to represent the following text? "There is a wisdom that is woe; but there is a woe that is madness. And there is a Catskill eagle in some souls that can alike dive down into the blackest gorges, and soar out of them again and become invisible in the sunny spaces. And even if he for ever flies within the gorge, that gorge is in the mountains; so that even in his lowest swoop the mountain eagle is still higher than other birds upon the plain, even though they soar." - Moby Dick, Herman Melville

SOME FUNDAMENTAL QUESTIONS (that can be addressed by information theory) How fast can information be sent reliably over an error-prone channel? "Thera is a wisdem that is woe; but thera is a woe that is madness. And there is a Catskill eagle in sme souls that can alike divi down into the blackst goages, and soar out of thea agein and become invisble in the sunny speces. And even if he for ever flies within the gorge, that gorge is in the mountains; so that even in his lotest swoap the mountin eagle is still highhr than other berds upon the plain, even though h they soar." - Moby Dick, Herman Melville

FROM QUESTIONS TO APPLICATIONS

SOME NOT SO FUNDAMENTAL QUESTIONS (that can be addressed by information theory) How to make money? (Kelly's Portfolio Theory) What is the optimal dating strategy?

FUNDAMENTAL ASSUMPTION OF SHANNON'S INFORMATION THEORY Physical Model (Laws): e.g., the Newtonian universe is deterministic; all events can be predicted with 100% accuracy by Newton's laws, given the initial conditions. (figure: the UNIVERSE) If there is no irreducible randomness, the description/history of our universe can therefore be compressed into a few equations.

FUNDAMENTAL ASSUMPTION OF SHANNON'S INFORMATION THEORY Probabilistic model: some well-defined stochastic process that generates all the observations. (figure: the UNIVERSE) Thus, in a certain sense, Shannon information theory is somewhat unsatisfactory, since it is based on the "ignorance" model. Nevertheless, it is a good approximation, depending on how accurately the probabilistic model describes the phenomenon of interest.

FUNDAMENTAL ASSUMPTION OF SHANNON S INFORMATION THEORY "There is a wisdom that is woe; but there is a woe that is madness. And there is a Catskill eagle in some souls that can alike dive down into the blackest gorges, and soar out of them again and become invisible in the sunny spaces. And even if he for ever flies within the gorge, that gorge is in the mountains; so that even in his lowest swoop the mountain eagle is still higher than other birds upon the plain, even though they soar." - Moby Dick, Herman Melville We know the texts above is not random, but we can build a probabilistic model for it. For example, a first order approximation could be to build a histogram of different letters in the text, t then approximate the probability that a particular letter appears based on the histogram (assuming iid). 22

WHY IS SHANNON INFORMATION SOMEWHAT UNSATISFACTORY? How much Shannon information does this picture contain? (picture not transcribed)

ALGORITHMIC INFORMATION THEORY (KOLMOGOROV COMPLEXITY) The Kolmogorov complexity K(x) of a sequence x is the minimum size of the program and its inputs needed to generate x. Example: If x were a sequence of all ones, a highly compressible sequence, the program would simply be a print statement in a loop. On the other extreme, if x were a random sequence with no structure, then the only program that could generate it would contain the sequence itself.
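As an added illustration (not from the slides), the two extremes can be contrasted informally in Python; the sequences and names below are placeholders:

# Highly compressible: a million ones come from a tiny program,
# essentially "a print statement in a loop".
def all_ones(n=1_000_000):
    return "1" * n

# Incompressible (informally): a structureless sequence can only be
# reproduced by a program that contains the sequence itself.
RANDOM_LOOKING = "0110100110010110..."  # placeholder; imagine the full sequence written out
def structureless():
    return RANDOM_LOOKING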

REVIEW OF BASIC PROBABILITY A discrete random variable X takes a value x from the alphabet A with probability p(x).

EXPECTED VALUE If g(x) is real-valued and defined on A, then $E[g(X)] = \sum_{x \in A} p(x) g(x)$.
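A small added example: if A = {1, 2, 3} with p = (1/2, 1/4, 1/4) and g(x) = x^2, then E[g(X)] = (1/2)(1) + (1/4)(4) + (1/4)(9) = 3.75.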

SHANNON INFORMATION CONTENT The Shannon Information Content of an outcome with probability p is $-\log_2 p$.
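For example (added for concreteness): an outcome with probability p = 1/8 has Shannon information content $-\log_2(1/8) = 3$ bits, while a certain outcome (p = 1) carries 0 bits.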

MINESWEEPER Where is the bomb? 16 possibilities - needs 4 bits to specify.
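This is the Shannon information content of the answer (an added connection to the previous slide): each of the 16 cells is equally likely, so p = 1/16 and $-\log_2(1/16) = 4$ bits; equivalently, four yes/no questions, each halving the set of possibilities, pin down the bomb.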

ENTROPY $H(X) = E[-\log_2 p(X)] = -\sum_{x \in A} p(x) \log_2 p(x)$
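A minimal Python sketch of this definition (added here for illustration; the example distribution is hypothetical):

import math

def entropy(probs):
    """Entropy in bits of a distribution given as a list of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.25, 0.25, 0.25, 0.25]))  # 2.0 bits: four equally likely outcomes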

ENTROPY EXAMPLES Bernoulli Random Variable
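For a Bernoulli(p) random variable, the definition gives the binary entropy function (an added worked formula): $H(p) = -p \log_2 p - (1-p) \log_2 (1-p)$. It is maximized at p = 1/2, where H = 1 bit, and falls to 0 bits at p = 0 or p = 1; for example, H(0.1) is roughly 0.469 bits.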

ENTROPY EXAMPLES Four Colored Shapes A = [four colored shapes (symbols not transcribed)]; p(x) = [1/2; 1/4; 1/8; 1/8]
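Plugging these probabilities into the definition (an added check): H(X) = (1/2) log2(2) + (1/4) log2(4) + (1/8) log2(8) + (1/8) log2(8) = 0.5 + 0.5 + 0.375 + 0.375 = 1.75 bits, which is the value the entropy() sketch above returns for [1/2, 1/4, 1/8, 1/8].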

DERIVATION OF SHANNON ENTROPY $H(X) = -\sum_{i=1}^{n} p_i \log_2 p_i$ Intuitive Requirements: 1. We want H to be a continuous function of the probabilities $p_i$; that is, a small change in $p_i$ should only cause a small change in H. 2. If all events are equally likely, that is, $p_i = 1/n$ for all i, then H should be a monotonically increasing function of n. The more possible outcomes there are, the more information should be contained in the occurrence of any particular outcome. 3. It does not matter how we divide the group of outcomes; the total amount of information should be the same.

DERIVATION OF SHANNON ENTROPY (continued; the derivation steps on the remaining slides are not in this transcription)