3F1 Information Theory, Lecture 3
Jossy Sayir, Department of Engineering
Michaelmas 2013, 29 November 2013
Encoding the output of a source

Binary Memoryless Source: X_1, X_2, X_3, ... independent and identically distributed, with P_{X_i}(1) = 1 − P_{X_i}(0) = 0.01.

What's the optimal binary variable-length code for X_i? With a binary source alphabet, the best prefix-free code simply assigns one code digit per source symbol, so E[W] = 1, but H(X) = h(0.01) ≈ 0.0808 bits. Redundancy: ρ_1 = E[W] − H(X) ≈ 0.919. Can we do better?
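As a quick numeric check of these figures, here is a short Python sketch (not from the lecture) that computes the binary entropy and the resulting redundancy:

```python
import math

def h(p):
    """Binary entropy function h(p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

H = h(0.01)   # entropy of one source symbol
EW = 1.0      # a prefix-free binary code needs at least one digit per symbol
print(f"H(X) = {H:.4f} bits, redundancy = {EW - H:.4f}")
# prints: H(X) = 0.0808 bits, redundancy = 0.9192
```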
Parsing the source into blocks of 2

x_1 x_2 | P_{X_1 X_2}(x_1, x_2) | Codeword (one optimal Huffman assignment)
0 0     | 0.9801                | 0
0 1     | 0.0099                | 10
1 0     | 0.0099                | 110
1 1     | 0.0001                | 111

E[W] ≈ 1.0299, H(X_1 X_2) = 2 h(0.01) ≈ 0.1616 bits, where
H(X_1 X_2) = − Σ_{x_1, x_2} P_{X_1 X_2}(x_1, x_2) log P_{X_1 X_2}(x_1, x_2)

Redundancy per symbol: ρ_2 = (E[W] − H(X_1 X_2)) / 2 ≈ 0.434. Can we do better?
Encoding a block of length N

If we encode a block of length N using Huffman or Shannon-Fano coding, then
H(X_1 ... X_N) ≤ E[W] < H(X_1 ... X_N) + 1
where
H(X_1 ... X_N) = − Σ_{x_1, ..., x_N} P(x_1, ..., x_N) log P(x_1, ..., x_N).
Thus,
ρ_N = (E[W] − H(X_1 ... X_N)) / N < (H(X_1 ... X_N) + 1 − H(X_1 ... X_N)) / N = 1/N
and
lim_{N→∞} ρ_N = 0,
i.e., the redundancy tends to zero.
The problem with block compression

The alphabet size grows exponentially in the block length N, so Huffman's and Fano's algorithms become infeasible for large N, as does storing the codebook. In Shannon's version of Shannon-Fano coding, however, the probability and the cumulative probability can be computed recursively:
P(x_1, ..., x_N) = P(x_1, ..., x_{N−1}) P(x_N)
F(x_1, ..., x_N) = F(x_1, ..., x_{N−1}) + F(x_N) P(x_1, ..., x_{N−1})
This lets us compute the cumulative probability for a specific block without computing all the others, as the sketch below shows. If only there weren't the need for the alphabet to be ordered...
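The recursion is easy to state in code. Below is a minimal sketch for the binary memoryless source above; the dictionary names p and F are illustrative, not from the lecture:

```python
# Recursive computation of the block probability P(x1..xN) and the
# cumulative probability F(x1..xN) for a memoryless source.
p = {'0': 0.99, '1': 0.01}   # P(X = x)
F = {'0': 0.0,  '1': 0.99}   # F(x) = P(X < x)

def block_interval(block):
    """Return (F, P) for the block, built up one symbol at a time."""
    Fb, Pb = 0.0, 1.0
    for x in block:
        Fb += F[x] * Pb   # F(x1..xn) = F(x1..x_{n-1}) + F(xn) P(x1..x_{n-1})
        Pb *= p[x]        # P(x1..xn) = P(x1..x_{n-1}) P(xn)
    return Fb, Pb

print(block_interval('0010'))   # no need to enumerate the other blocks
```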
Reminder: Shannon's version of S-F Coding

x | P_X(x) | P(X < x) | P(X < x) in binary | ⌈−log2 P_X(x)⌉ | Codeword
F | 0.30   | 0.00     | 0.0000...          | 2              | 00
D | 0.20   | 0.30     | 0.0100...          | 3              | 010
E | 0.20   | 0.50     | 0.1000...          | 3              | 100
C | 0.15   | 0.70     | 0.1011...          | 3              | 101
B | 0.10   | 0.85     | 0.1101...          | 4              | 1101
A | 0.05   | 0.95     | 0.1111...          | 5              | 11110

Order the symbols by non-increasing probability. Compute the cumulative probabilities. Express the cumulative probabilities in D-ary. The codeword is the fractional part of the cumulative probability, truncated to length ⌈−log_D P_X(x)⌉.
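A sketch of the procedure for the binary case (D = 2), using the probabilities from the table above:

```python
import math

# Shannon's version of Shannon-Fano coding, binary case.
probs = {'F': 0.30, 'D': 0.20, 'E': 0.20, 'C': 0.15, 'B': 0.10, 'A': 0.05}

def shannon_code(probs):
    codes, cum = {}, 0.0
    for x, p in sorted(probs.items(), key=lambda kv: -kv[1]):
        w = math.ceil(-math.log2(p))   # codeword length ceil(-log2 P(x))
        frac, bits = cum, ''
        for _ in range(w):             # binary expansion of P(X < x), truncated to w digits
            frac *= 2
            bits += str(int(frac))
            frac -= int(frac)
        codes[x] = bits
        cum += p
    return codes

print(shannon_code(probs))
# {'F': '00', 'D': '010', 'E': '100', 'C': '101', 'B': '1101', 'A': '11110'}
```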
Source intervals / code intervals

[Figure: the unit interval [0, 1] partitioned into source intervals in probability order (F, D, E, C, B, A), with the dyadic code interval of each codeword drawn inside its source interval.]
What if we don't order the source probabilities?

[Figure: the same construction with the source intervals drawn in alphabetical order (A, B, C, D, E, F) instead of probability order.]
New approach

x | [a, b]
A | [0.00, 0.05]
B | [0.05, 0.15]
C | [0.15, 0.30]
D | [0.30, 0.50]
E | [0.50, 0.70]
F | [0.70, 1.00]

where a = P(X < x) and b = P(X < x) + P(X = x).

Draw the source intervals in no particular order. For each symbol, pick the largest dyadic interval [k 2^{−w_i}, (k + 1) 2^{−w_i}] that fits in its source interval. How large is w_i? w_i ≤ ⌈−log2 p_i⌉ + 1.
Arithmetic Coding

Recursive computation of the source interval; at the end of the source block, generate the shortest possible codeword whose code interval fits inside the computed source interval. This achieves
E[W] < H(X_1 ... X_N) + 2.
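A minimal end-to-end sketch of the encoder, using the six-symbol alphabet from the earlier slides (floats stand in for infinite precision, so this only works for short blocks):

```python
import math

p = {'A': 0.05, 'B': 0.10, 'C': 0.15, 'D': 0.20, 'E': 0.20, 'F': 0.30}
F = {'A': 0.00, 'B': 0.05, 'C': 0.15, 'D': 0.30, 'E': 0.50, 'F': 0.70}

def encode(block):
    lo, width = 0.0, 1.0
    for x in block:                     # recursive source-interval refinement
        lo += F[x] * width
        width *= p[x]
    w = 1                               # shortest dyadic interval that fits
    while True:
        k = math.ceil(lo * 2 ** w)      # smallest k with k 2^-w >= lo
        if (k + 1) / 2 ** w <= lo + width:
            return format(k, f'0{w}b')  # w-digit binary codeword
        w += 1

print(encode('FDC'))
```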
Arithmetic Coding = Recursive Shannon-Fano Coding
(Claude E. Shannon, Robert L. Fano, Peter Elias, Richard C. Pasco, Jorma Rissanen)

H(X_1 ... X_N) ≤ E[W] < H(X_1 ... X_N) + 2, hence ρ_N ≤ 2/N.

There is no need to wait until the whole source block has been received before starting to generate code symbols. Arithmetic coding is an infinite state machine whose state needs ever-growing precision, so a finite-precision implementation requires a few tricks (and loses some performance).
English Text

[Table: relative frequencies of the letters a–z in English text.]

H(X) ≈ 4.2 bits. Under this memoryless letter model,
P("and") = p_a p_n p_d and P("eee") = p_e^3,
and since e is by far the most frequent letter, the model says P("eee") >> P("and"). Is this right? No! Can we do better than H(X) for sources with memory?
Discrete Stationary Source (DSS) properties

Entropy rate of order N: H_N(X) := (1/N) H(X_1 ... X_N)
H(X_N | X_1 ... X_{N−1}) ≤ H_N(X)
H(X_N | X_1 ... X_{N−1}) and H_N(X) are non-increasing functions of N
lim_{N→∞} H(X_N | X_1 ... X_{N−1}) = lim_{N→∞} H_N(X) := H_∞(X)
H_∞(X) is the entropy rate of a DSS.

Shannon's converse source coding theorem for a DSS:
E[W] / N ≥ H_∞(X) / log D
Coding for discrete stationary sources

Arithmetic coding can use conditional probabilities: the intervals are then different at every step, depending on the source context. Prediction by Partial Matching (PPM) and Context Tree Weighting (CTW) are techniques that build the context-tree source model on the fly, achieving compression rates of approximately 2.2 binary symbols per ASCII character. What is H_∞(X) for English text? (This assumes language is a stationary source, which is a disputed proposition.)

[Diagram: identical predictors at the source encoder and the source decoder, between source and destination.]
Markov Chain

Andrey Andreyevich Markov

Stationary state random process S_1, S_2, ... with
P(s_N | s_1 ... s_{N−1}) = P(s_N | s_{N−1}).
Markov information source: the states S_i are mapped into source symbols X_i.
Unifilar information source: from any state, all neighbouring states map to distinct symbols.
Unifilar Markov Source

P_{X_2|X_1}(1|0) = 1 − P_{X_2|X_1}(0|0) = 0.9
P_{X_2|X_1}(1|1) = 1 − P_{X_2|X_1}(0|1) = 0.8

Can we compute P_{X_1}(1) = 1 − P_{X_1}(0)? Stationarity implies P_{X_1}(1) = P_{X_2}(1), and thus
P_{X_1}(1) = P_{X_2}(1) = P_{X_1 X_2}(0, 1) + P_{X_1 X_2}(1, 1)
           = P_{X_2|X_1}(1|0) P_{X_1}(0) + P_{X_2|X_1}(1|1) P_{X_1}(1)
Define the matrix
T = [ P_{X_2|X_1}(0|0)  P_{X_2|X_1}(0|1)
      P_{X_2|X_1}(1|0)  P_{X_2|X_1}(1|1) ]
and the vector P = [P_{X_1}(0), P_{X_1}(1)]^T. Then we are looking for the solution P of the equation P = TP, i.e., the eigenvector of T for the eigenvalue 1. Note that since T is a stochastic matrix (its columns sum to 1), it will always have 1 as an eigenvalue.
P = [0.1 0.2; 0.9 0.8] P implies [−0.9 0.2; 0.9 −0.2] P = 0, which, together with the constraint [1 1] P = 1 (probabilities sum to 1), yields
[−0.9 0.2; 1 1] P = [0; 1]
and finally
P = [P_{X_1}(0); P_{X_1}(1)] = [2/11; 9/11] ≈ [0.1818; 0.8182].

Entropy rate of the source:
H_∞(X) = lim_{N→∞} H(X_N | X_1 ... X_{N−1}) = H(X_N | X_{N−1}) = H(X_2 | X_1)
       = H(X_2 | X_1 = 0) P_{X_1}(0) + H(X_2 | X_1 = 1) P_{X_1}(1)
       = h(0.1) · (2/11) + h(0.2) · (9/11) ≈ 0.676 bits
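A numeric check with numpy (a sketch, not lecture code): solve P = TP under the normalization constraint, then evaluate the entropy rate.

```python
import numpy as np

T = np.array([[0.1, 0.2],
              [0.9, 0.8]])   # T[x2, x1] = P(X2 = x2 | X1 = x1), columns sum to 1

# Solve (T - I) P = 0 together with sum(P) = 1 as a least-squares system.
A = np.vstack([T - np.eye(2), np.ones(2)])
b = np.array([0.0, 0.0, 1.0])
P, *_ = np.linalg.lstsq(A, b, rcond=None)
print(P)                     # [0.1818..., 0.8181...] = [2/11, 9/11]

def h(p):                    # binary entropy in bits
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

H_rate = h(0.1) * P[0] + h(0.2) * P[1]
print(H_rate)                # approx 0.676 bits
```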
Encoding a unifilar Markov Source

Encode the source output sequence 0, 1, 1, 1, 1, 1, 1, 1.

[Figure, built up over several slides: starting from [0, 1], the interval is refined recursively, one source symbol at a time, using the conditional probabilities of the current state; the first symbol 0 selects [0, 0.1818], and each subsequent 1 selects the upper P(1 | state) fraction of the current interval, ending with the source interval [0.1389, 0.1818]. A sketch of this computation follows.]
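A sketch of the interval computation for this example, assuming the state is simply the previous symbol and the first symbol uses the stationary probabilities from the earlier slide (variable names are illustrative):

```python
# P(next symbol = 1 | state); state None = start of the block,
# which uses the stationary distribution P(1) = 9/11.
p1 = {None: 9/11, '0': 0.9, '1': 0.8}

def source_interval(seq):
    lo, width, state = 0.0, 1.0, None
    for x in seq:
        q1 = p1[state]            # conditional P(1 | current state)
        if x == '1':              # '0' owns the lower part of the interval
            lo += (1 - q1) * width
            width *= q1
        else:
            width *= (1 - q1)
        state = x
    return lo, lo + width

print(source_interval('01111111'))   # approx (0.1389, 0.1818)
```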
Determining the codeword

Source interval [0.1389, 0.1818], in binary: [0.0010001..., 0.0010111...]_b.
The probability of the source sequence is
P_{X_1...X_8}(0, 1, 1, 1, 1, 1, 1, 1) = (2/11) × 0.9 × 0.8^6 ≈ 0.0429
−log2 P_{X_1...X_8}(0, 1, 1, 1, 1, 1, 1, 1) ≈ 4.543,
therefore we can truncate after either 5 or 6 digits, depending on whether the resulting code sequence is contained in the source interval.
No 5-digit code sequence corresponds to a code interval contained in our source interval: the length-5 codeword intervals overlapping it are [4/32, 5/32] = [0.125, 0.15625] and [5/32, 6/32] = [0.15625, 0.1875], and neither lies inside [0.1389, 0.1818].
The 6-digit code sequence 001010 corresponds to the code interval [0.001010, 0.001011]_b = [0.15625, 0.171875], which is fully contained in the source interval and therefore satisfies the prefix condition.
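A quick brute-force check of the codeword-length claim (a sketch that simply enumerates all dyadic intervals of width 2^-5 and 2^-6):

```python
lo, hi = 0.138922, 0.181818          # the source interval from above
for w in (5, 6):
    fits = [format(k, f'0{w}b') for k in range(2 ** w)
            if lo <= k / 2 ** w and (k + 1) / 2 ** w <= hi]
    print(w, fits)
# 5 []
# 6 ['001001', '001010']   (the slide picks 001010)
```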
Decoding a unifilar Markov Source

Decode the code sequence 0, 0, 1, 0, 1, 0, corresponding to the codeword interval [0.001010, 0.001011]_b = [0.15625, 0.171875]. Decoding rule: always pick the sub-interval that contains the codeword interval. Applying this rule symbol by symbol (shown over several slides) yields the result: 0, 1, 1, 1, 1, 1, 1, 1. A decoder sketch follows.
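A decoder sketch matching the encoder sketch above (same illustrative model; since a valid codeword interval never straddles a sub-interval boundary, comparing its lower end against the split point suffices):

```python
p1 = {None: 9/11, '0': 0.9, '1': 0.8}   # same model as the encoder

def decode(codeword, n_symbols):
    k = int(codeword, 2)
    c_lo = k / 2 ** len(codeword)        # lower end of the codeword interval
    lo, width, state, out = 0.0, 1.0, None, []
    for _ in range(n_symbols):
        q1 = p1[state]
        split = lo + (1 - q1) * width    # boundary between the '0' and '1' sub-intervals
        if c_lo >= split:                # codeword interval lies in the upper part
            lo, width, sym = split, width * q1, '1'
        else:
            width, sym = width * (1 - q1), '0'
        out.append(sym)
        state = sym
    return ''.join(out)

print(decode('001010', 8))               # '01111111'
```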
Shannon's Twin Experiment

[Diagram: a source feeds an encoder and a decoder feeds the sink; identical 'twin' predictors sit at the encoder and at the decoder, so only the guess counts need to be transmitted.]
Shannon's twin experiment

Shannon 1 and Shannon 2 are hypothetical, fully identical twins (who look alike, talk alike, and think exactly alike). An operator at the transmitter asks Shannon 1 to guess the next source symbol based on the context. The operator counts the number of guesses until Shannon 1 gets it right and transmits this number. An operator at the receiver asks Shannon 2 to guess the next source symbol based on the same context; Shannon 2 arrives at the same answer as Shannon 1 after the same number of guesses. An upper bound on the entropy rate of English is therefore the entropy of the number of guesses. A better bound would take dependencies between successive guess counts into account: if Shannon 1 needed many guesses for a symbol, chances are he will need many for the next one as well, whereas if he guessed right the first time, chances are he's in the middle of a word and will guess the next symbol correctly too.