3F1: Signals and Systems INFORMATION THEORY Examples Paper Solutions


Engineering Tripos Part IIA, Third Year
3F1: Signals and Systems, INFORMATION THEORY, Examples Paper Solutions

1. Let the joint probability mass function of two binary random variables X and Y be given in the following table:

    x, y    P_XY(x, y)
    0, 0    0.2
    0, 1    0.3
    1, 0    0.1
    1, 1    0.4

Express the following quantities in terms of the binary entropy function h(x) = -x log₂ x - (1-x) log₂(1-x).

(a) H(X)

We marginalise P_X(0) = P_XY(0,0) + P_XY(0,1) = 1/2 and hence H(X) = h(1/2) = 1 bit.

(b) H(Y)

Similarly, P_Y(0) = P_XY(0,0) + P_XY(1,0) = 0.3 and hence H(Y) = h(0.3).

(c) H(X|Y)

We compute

    P_X|Y(0|0) = P_XY(0,0) / P_Y(0) = 0.2/0.3 = 2/3

and

    P_X|Y(0|1) = P_XY(0,1) / P_Y(1) = 0.3/0.7 = 3/7

and hence

    H(X|Y) = P_Y(0) H(X|Y=0) + P_Y(1) H(X|Y=1) = 0.3 h(2/3) + 0.7 h(3/7).

(d) H(Y|X)

We compute

    P_Y|X(0|0) = P_XY(0,0) / P_X(0) = 0.2/0.5 = 2/5

and

    P_Y|X(0|1) = P_XY(1,0) / P_X(1) = 0.1/0.5 = 1/5

and hence

    H(Y|X) = P_X(0) H(Y|X=0) + P_X(1) H(Y|X=1) = 0.5 h(2/5) + 0.5 h(1/5).

(e) H(XY)

Using the chain rule, we can compute either

    H(XY) = H(X) + H(Y|X) = 1 + 0.5 h(2/5) + 0.5 h(1/5) = 1.8464 bits

or

    H(XY) = H(Y) + H(X|Y) = h(0.3) + 0.3 h(2/3) + 0.7 h(3/7) = 1.8464 bits,

which, as can be observed, yields the same result.

(f) I(X;Y)

We can write either

    I(X;Y) = H(X) - H(X|Y) = 1 - 0.3 h(2/3) - 0.7 h(3/7) = 0.0349 bits

or

    I(X;Y) = H(Y) - H(Y|X) = h(0.3) - 0.5 h(2/5) - 0.5 h(1/5) = 0.0349 bits,

which, as can again be observed, yields the same result.

2. Let an N-ary random variable X be distributed as follows:

    P_X(1) = p,
    P_X(k) = (1-p)/(N-1)   for k = 2, 3, ..., N.

Express the entropy of X in terms of the binary entropy function h(·).

    H(X) = -p log₂ p - (N-1) [(1-p)/(N-1)] log₂[(1-p)/(N-1)]
         = -p log₂ p - (1-p) log₂[(1-p)/(N-1)]
         = -p log₂ p - (1-p) log₂(1-p) + (1-p) log₂(N-1)
         = h(p) + (1-p) log₂(N-1).
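The quantities in Questions 1 and 2 are easy to verify numerically. The following short script (an illustrative cross-check of my own, not part of the original paper; all function and variable names are mine) recomputes the entropies and the mutual information from the joint table, and spot-checks the Question 2 formula for one arbitrary choice of p and N.

    # Numerical cross-check of Questions 1 and 2 (illustrative only).
    import math

    def h(p):
        """Binary entropy in bits."""
        if p in (0.0, 1.0):
            return 0.0
        return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

    # Question 1: joint pmf P_XY(x, y)
    P = {(0, 0): 0.2, (0, 1): 0.3, (1, 0): 0.1, (1, 1): 0.4}
    Px = {x: sum(P[x, y] for y in (0, 1)) for x in (0, 1)}
    Py = {y: sum(P[x, y] for x in (0, 1)) for y in (0, 1)}

    HX = h(Px[0])                                      # 1.0
    HY = h(Py[0])                                      # h(0.3) = 0.8813
    HXY = -sum(p * math.log2(p) for p in P.values())   # 1.8464
    HX_given_Y = HXY - HY                              # 0.3 h(2/3) + 0.7 h(3/7)
    HY_given_X = HXY - HX                              # 0.5 h(2/5) + 0.5 h(1/5)
    IXY = HX + HY - HXY                                # 0.0349
    print(HX, HY, HXY, HX_given_Y, HY_given_X, IXY)

    # Question 2: H(X) = h(p) + (1-p) log2(N-1), checked for one (p, N)
    p, N = 0.3, 5
    probs = [p] + [(1 - p) / (N - 1)] * (N - 1)
    assert abs(-sum(q * math.log2(q) for q in probs)
               - (h(p) + (1 - p) * math.log2(N - 1))) < 1e-12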

3. A discrete memoryless source has an alphabet of eight letters, x_i, i = 1, ..., 8, with probabilities 0.25, 0.2, 0.15, 0.12, 0.1, 0.08, 0.05 and 0.05.

(a) Use the Huffman encoding to determine a binary code for the source output.

The working merges the two smallest probabilities at each step (merge order shown in brackets):

    (1) 0.05 + 0.05 = 0.1
    (2) 0.08 + 0.1  = 0.18   (merging with the newly formed 0.1)
    (3) 0.1  + 0.12 = 0.22
    (4) 0.15 + 0.18 = 0.33
    (5) 0.2  + 0.22 = 0.42
    (6) 0.25 + 0.33 = 0.58
    (7) 0.42 + 0.58 = 1.0

Labelling the two branches of each merge 0 and 1 gives a prefix code with codeword lengths 2, 2, 3, 3, 3, 4, 5, 5, for example:

    Symbol   Code
    x1       00
    x2       01
    x3       100
    x4       101
    x5       110
    x6       1110
    x7       11110
    x8       11111

Note that other labellings are possible, and there is also a choice at the second merge since there are two probabilities of 0.1 at that time.

(b) Determine the average codeword length L.

Average codeword length:

    L = 2(0.25) + 2(0.2) + 3(0.15) + 3(0.12) + 3(0.1) + 4(0.08) + 5(0.05) + 5(0.05) = 2.83 bits.
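As a cross-check of parts (a) and (b), the sketch below builds a Huffman code with Python's heapq module (this is my own illustration, not part of the original paper). Because of the tie noted above, the individual codeword lengths it returns may differ from the hand-worked tree, but the average length is 2.83 bits in every case.

    # Sketch: Huffman codeword lengths for the eight-letter source of Question 3.
    import heapq

    def huffman_lengths(probs):
        """Return a Huffman codeword length for each symbol probability."""
        # Heap entries: (probability, tie-breaker, symbol indices below this node)
        heap = [(p, i, [i]) for i, p in enumerate(probs)]
        heapq.heapify(heap)
        lengths = [0] * len(probs)
        tie = len(probs)
        while len(heap) > 1:
            p1, _, s1 = heapq.heappop(heap)
            p2, _, s2 = heapq.heappop(heap)
            for s in s1 + s2:      # every symbol under this merge gains one bit
                lengths[s] += 1
            heapq.heappush(heap, (p1 + p2, tie, s1 + s2))
            tie += 1
        return lengths

    probs = [0.25, 0.2, 0.15, 0.12, 0.1, 0.08, 0.05, 0.05]
    lengths = huffman_lengths(probs)
    print(lengths)                                     # depends on tie-breaking
    print(sum(l * p for l, p in zip(lengths, probs)))  # 2.83 in all cases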

(c) Determine the entropy of the source and hence its efficiency.

    H(S) = -(0.25 log₂ 0.25 + 0.2 log₂ 0.2 + 0.15 log₂ 0.15 + 0.12 log₂ 0.12 + 0.1 log₂ 0.1 + 0.08 log₂ 0.08 + 0.05 log₂ 0.05 + 0.05 log₂ 0.05) = 2.798 bits.

The average codeword length is greater than the entropy, as expected.

    Efficiency = 2.798 / 2.83 = 98.9%.

4. Show that for statistically independent events

    H(X_1, X_2, ..., X_n) = Σ_{i=1}^{n} H(X_i).

First expand out H(X_1, X_2, ..., X_n):

    H(X_1, ..., X_n) = - Σ_{i_1=1}^{N_1} Σ_{i_2=1}^{N_2} ... Σ_{i_n=1}^{N_n} P(x_{i_1}, x_{i_2}, ..., x_{i_n}) log₂ P(x_{i_1}, x_{i_2}, ..., x_{i_n}).

The probabilities are independent, so this is

    = - Σ_{i_1=1}^{N_1} ... Σ_{i_n=1}^{N_n} [ Π_{j=1}^{n} P(x_{i_j}) ] log₂ [ Π_{j=1}^{n} P(x_{i_j}) ].

The last summation only indexes i_n, so we can move all the other factors of the product to the left of it. We can also replace the log of a product with the sum of the logs:

    = - Σ_{i_1=1}^{N_1} ... Σ_{i_{n-1}=1}^{N_{n-1}} [ Π_{j=1}^{n-1} P(x_{i_j}) ] Σ_{i_n=1}^{N_n} P(x_{i_n}) [ log₂ Π_{j=1}^{n-1} P(x_{i_j}) + log₂ P(x_{i_n}) ].

In the right-hand sum the first term does not depend on i_n, and since Σ_{i_n=1}^{N_n} P(x_{i_n}) = 1 we can transform

    Σ_{i_n=1}^{N_n} P(x_{i_n}) [ log₂ Π_{j=1}^{n-1} P(x_{i_j}) + log₂ P(x_{i_n}) ] = log₂ Π_{j=1}^{n-1} P(x_{i_j}) + Σ_{i_n=1}^{N_n} P(x_{i_n}) log₂ P(x_{i_n}).

Now the right-hand term is independent of i_1, ..., i_{n-1}, and

    Σ_{i_1=1}^{N_1} ... Σ_{i_{n-1}=1}^{N_{n-1}} Π_{j=1}^{n-1} P(x_{i_j}) = 1,

so we have

    H(X_1, ..., X_n) = - Σ_{i_n=1}^{N_n} P(x_{i_n}) log₂ P(x_{i_n}) - Σ_{i_1=1}^{N_1} ... Σ_{i_{n-1}=1}^{N_{n-1}} [ Π_{j=1}^{n-1} P(x_{i_j}) ] log₂ Π_{j=1}^{n-1} P(x_{i_j})
                     = H(X_n) + H(X_1, X_2, ..., X_{n-1});

by induction this gives

    H(X_1, X_2, ..., X_n) = Σ_{i=1}^{n} H(X_i).
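A small numerical illustration of this additivity (my own, not from the paper): the joint entropy computed directly from the product pmf of independent variables equals the sum of the marginal entropies. The three example marginals are arbitrary.

    # Illustration of Question 4: joint entropy of independent variables.
    import itertools
    import math

    def entropy(probs):
        return -sum(p * math.log2(p) for p in probs if p > 0)

    marginals = [
        [0.5, 0.5],
        [0.7, 0.2, 0.1],
        [0.25, 0.25, 0.25, 0.25],
    ]

    # Joint pmf of independent variables = product of the marginals
    joint = [math.prod(ps) for ps in itertools.product(*marginals)]

    print(entropy(joint))                      # H(X1, X2, X3)
    print(sum(entropy(m) for m in marginals))  # sum of H(Xi): the same value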

5. A five-level non-uniform quantizer for a zero-mean signal results in the 5 levels -b, -a, 0, a, b with corresponding probabilities of occurrence p(-b) = p(b) = 0.05, p(-a) = p(a) = 0.1 and p(0) = 0.7.

(a) Design a Huffman code that encodes one signal sample at a time and determine the average bit rate per sample.

Merging the two smallest probabilities at each step gives

    0.05 + 0.05 = 0.1,   0.1 + 0.1 = 0.2,   0.2 + 0.1 = 0.3,   0.3 + 0.7 = 1.0,

so the level 0 gets a 1-bit codeword and the other four levels get 3-bit codewords. One possible labelling of the branches gives:

    Symbol   Code
    0        0
    -a       100
    a        101
    -b       110
    b        111

Average bit rate = 1(0.7) + 3(0.1 + 0.1 + 0.05 + 0.05) = 1.6 bits.

Another version of the code would give 2 bits to -a or to a and 4 bits each to -b and b. It has the same mean length.

(b) Design a Huffman code that encodes two output samples at a time and determine the average bit rate per sample.

We combine the 5 levels into 25 pairs of levels, in descending order of probability, and form a Huffman code tree in the same way. The pair probabilities are: 0.49 for (0, 0); 0.07 for each of the four pairs of 0 with ±a; 0.035 for each of the four pairs of 0 with ±b; 0.01 for each of the four pairs of ±a with ±a; 0.005 for each of the eight pairs of ±a with ±b; and 0.0025 for each of the four pairs of ±b with ±b.

Note that there are many different possible versions of this tree, as many of the nodes have equal probabilities and so can be taken in arbitrary order. However, the mean length, and hence the performance of the code, is the same in all cases. The codeword length of each pair is found by counting the number of branch points passed in going from the root of the tree to that pair. One such tree gives: 1 bit for (0, 0); 4 bits for each pair of 0 with ±a; 5 bits for each pair of 0 with ±b; 6 bits for two of the pairs of ±a with ±a and 7 bits for the other two; 7 bits for each pair of ±a with ±b; and 8 bits for each pair of ±b with ±b.

Summing probabilities for codewords of the same length:

    Av. codeword length = 1(0.49) + 4(0.28) + 5(0.14) + 6(0.02) + 7(0.06) + 8(0.01) = 2.93 bits per pair of samples.

Hence code rate = 2.93/2 = 1.465 bits/sample.

(c) What are the efficiencies of these two codes?

Entropy of the message source = -(0.7 log₂ 0.7 + 2 × 0.1 log₂ 0.1 + 2 × 0.05 log₂ 0.05) = 1.4568 bits/sample.

Hence:

    Efficiency of code 1 = 1.4568/1.6 = 91.05%
    Efficiency of code 2 = 1.4568/1.465 = 99.44%

Code 2 represents a significant improvement, because it eliminates the zero state of code 1, which has a probability well above 0.5.
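The rates and efficiencies above can be reproduced with a short script (an illustrative cross-check of my own, not part of the original paper; it reuses the same heapq construction as the Question 3 sketch).

    # Cross-check of Question 5: Huffman rates for single samples and for pairs.
    import heapq
    import itertools
    import math

    def huffman_average_length(probs):
        """Average codeword length of a Huffman code for the given pmf."""
        heap = [(p, i, [i]) for i, p in enumerate(probs)]
        heapq.heapify(heap)
        lengths = [0] * len(probs)
        tie = len(probs)
        while len(heap) > 1:
            p1, _, s1 = heapq.heappop(heap)
            p2, _, s2 = heapq.heappop(heap)
            for s in s1 + s2:
                lengths[s] += 1
            heapq.heappush(heap, (p1 + p2, tie, s1 + s2))
            tie += 1
        return sum(l * p for l, p in zip(lengths, probs))

    single = [0.7, 0.1, 0.1, 0.05, 0.05]            # levels 0, -a, a, -b, b
    pairs = [p * q for p, q in itertools.product(single, repeat=2)]

    rate1 = huffman_average_length(single)          # 1.6 bits/sample
    rate2 = huffman_average_length(pairs) / 2       # 1.465 bits/sample
    H = -sum(p * math.log2(p) for p in single)      # 1.4568 bits/sample
    print(rate1, rate2, H, H / rate1, H / rate2)    # efficiencies 0.9105, 0.9944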

6. While we cover in 3F1 and 4F5 the application of Shannon's theory to data compression and transmission, Shannon also applied the concepts of entropy and mutual information to the study of secrecy systems. The figure below shows a cryptographic scenario where Alice wants to transmit a secret plaintext message X to Bob and they share a secret key Z, while the enemy Eve has access to the public message Y.

[Figure: Alice's encrypter maps the plaintext X from the message source and the key Z from the key source to the public ciphertext Y; Bob's decrypter recovers X from Y and Z and passes it to the message destination; Eve, the enemy cryptanalyst, observes only Y.]

(a) Write out two conditions using conditional entropies involving X, Y and Z to enforce the deterministic encryptability and decryptability of the messages. Hint: the entropy of a function given its argument is zero, e.g., for any random variable X, H(f(X)|X) = 0.

The conditions are H(Y|XZ) = 0, because the ciphertext is a function of the secret plaintext message and the secret key, and H(X|YZ) = 0, because the secret plaintext message can be inferred from the ciphertext and the key.

(b) Shannon made the notion of an unbreakable cryptosystem precise by saying that a cryptosystem provides perfect secrecy if the enemy's observation is statistically independent of the plaintext, i.e., I(X;Y) = 0. Show that this implies Shannon's much-cited bound on key size H(Z) ≥ H(X), i.e., perfect secrecy can only be attained if the entropy of the key (and hence its compressed length) is at least as large as the entropy of the secret plaintext.

Since I(X;Y) = 0, we have

    H(X) = H(X|Y)
         = H(X,Z|Y) - H(Z|X,Y)
         ≤ H(X,Z|Y)
         = H(Z|Y) + H(X|Z,Y)
         = H(Z|Y)
         ≤ H(Z),

where H(X|Z,Y) = 0 by the decryptability condition of part (a), and the last inequality holds because conditioning cannot increase entropy.
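The bound can also be illustrated by brute force on toy alphabets (this example is entirely my own and not part of the paper). With a uniform three-letter plaintext, H(X) = log₂ 3 ≈ 1.58 bits, so a one-bit uniform key (H(Z) = 1 bit) should never give perfect secrecy, whereas a uniform three-valued key used as a modular shift does. The ciphertext alphabet is restricted to three symbols purely to keep the search small.

    # Brute-force illustration of the key-size bound H(Z) >= H(X) on toy alphabets.
    from itertools import permutations, product

    M = [0, 1, 2]   # plaintext alphabet, taken uniform, so H(X) = log2(3)
    C = [0, 1, 2]   # ciphertext alphabet (restricted to 3 symbols for tractability)

    def perfectly_secret(enc_tables, key_probs):
        """enc_tables[k][m] = ciphertext; perfect secrecy needs P(Y|X=x) equal for all x."""
        dists = []
        for x in M:
            d = {y: 0.0 for y in C}
            for k, table in enumerate(enc_tables):
                d[table[x]] += key_probs[k]
            dists.append(d)
        return all(d == dists[0] for d in dists)

    # 1) One-bit uniform key: decryptability forces each per-key map to be injective,
    #    so try every pair of permutations of C.
    found = any(perfectly_secret([t0, t1], [0.5, 0.5])
                for t0, t1 in product(permutations(C), repeat=2))
    print(found)   # False: no such cipher achieves perfect secrecy

    # 2) Three-valued uniform key, shift cipher y = (x + k) mod 3: perfectly secret.
    shift = [tuple((x + k) % 3 for x in M) for k in range(3)]
    print(perfectly_secret(shift, [1/3, 1/3, 1/3]))   # True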

(c) Vernam's cipher assumes a binary secret plaintext message X with any probability distribution P_X(0) = p = 1 - P_X(1) and a binary secret key Z that is uniform, P_Z(0) = P_Z(1) = 1/2, and independent of X. The encrypter simply adds the secret key to the plaintext modulo 2, and the decrypter, by adding the same key to the ciphertext, can recover the plaintext. Show that Vernam's cipher achieves perfect secrecy, i.e., I(X;Y) = 0.

We have

    P_Y(0) = Σ_x Σ_z P_XYZ(x, 0, z) = Σ_x Σ_z P_Y|XZ(0|x,z) P_X(x) P_Z(z)

and note that P_Y|XZ(0|x,z) = 1 if x + z mod 2 = 0, and P_Y|XZ(0|x,z) = 0 otherwise, and so the expression above becomes

    P_Y(0) = p (1/2) + (1-p) (1/2) = 1/2

and hence H(Y) = 1 bit. Furthermore,

    P_Y|X(0|0) = P_XY(0,0) / P_X(0)
               = [P_XYZ(0,0,0) + P_XYZ(0,0,1)] / P_X(0)
               = [P_X(0) P_Z(0) P_Y|XZ(0|0,0) + P_X(0) P_Z(1) P_Y|XZ(0|0,1)] / P_X(0)
               = (p/2 + 0)/p = 1/2

and similarly

    P_Y|X(0|1) = P_XY(1,0) / P_X(1)
               = [P_XYZ(1,0,0) + P_XYZ(1,0,1)] / P_X(1)
               = [P_X(1) P_Z(0) P_Y|XZ(0|1,0) + P_X(1) P_Z(1) P_Y|XZ(0|1,1)] / P_X(1)
               = (0 + (1-p)/2)/(1-p) = 1/2

and hence

    H(Y|X) = P_X(0) H(Y|X=0) + P_X(1) H(Y|X=1) = p h(1/2) + (1-p) h(1/2) = 1 bit,

and so, finally, I(X;Y) = H(Y) - H(Y|X) = 1 - 1 = 0 bits.
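A quick numerical confirmation of this result (my own, not part of the paper): build the joint distribution of X and Y = X xor Z for an arbitrary plaintext bias p and a uniform independent key, and evaluate I(X;Y) directly.

    # Numerical check of Question 6(c): I(X;Y) = 0 for the Vernam cipher.
    import math

    def mutual_information(joint):
        """I(X;Y) in bits from a joint pmf given as {(x, y): probability}."""
        px, py = {}, {}
        for (x, y), q in joint.items():
            px[x] = px.get(x, 0.0) + q
            py[y] = py.get(y, 0.0) + q
        return sum(q * math.log2(q / (px[x] * py[y]))
                   for (x, y), q in joint.items() if q > 0)

    p = 0.9                          # arbitrary plaintext distribution, P_X(0) = p
    joint_xy = {}
    for x, p_x in [(0, p), (1, 1 - p)]:
        for z in (0, 1):             # uniform key, independent of X
            y = x ^ z
            joint_xy[(x, y)] = joint_xy.get((x, y), 0.0) + p_x * 0.5

    print(mutual_information(joint_xy))   # 0.0 (up to floating-point rounding)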

7. What is the entropy of the following continuous probability density functions?

(a) P(x) = 0 for x < -2, P(x) = 0.25 for -2 < x < 2, P(x) = 0 for x > 2.

    H(X) = -∫_{-2}^{2} 0.25 log₂ 0.25 dx = -log₂ 0.25 = log₂ 4 = 2 bits.

(b) P(x) = (λ/2) e^(-λ|x|)

Working in natural logarithms,

    H(X) = -∫_{-∞}^{∞} (λ/2) e^(-λ|x|) ln[(λ/2) e^(-λ|x|)] dx
         = -∫_{-∞}^{∞} (λ/2) e^(-λ|x|) [ln(λ/2) - λ|x|] dx
         = -ln(λ/2) ∫_{-∞}^{∞} (λ/2) e^(-λ|x|) dx + λ ∫_{-∞}^{∞} |x| (λ/2) e^(-λ|x|) dx
         = -ln(λ/2) + λ ∫_{0}^{∞} x λ e^(-λx) dx;

integrating by parts with u = x and v' = λ e^(-λx), so v = -e^(-λx):

         = -ln(λ/2) + λ ( [-x e^(-λx)]_{0}^{∞} + ∫_{0}^{∞} e^(-λx) dx )
         = -ln(λ/2) + 1
         = ln(2e/λ) nats = log₂(2e/λ) bits.

(c) P(x) = (1/(σ√(2π))) e^(-x²/(2σ²))

Writing p(x) for this density,

    H(X) = -∫_{-∞}^{∞} p(x) ln p(x) dx
         = ln(σ√(2π)) ∫_{-∞}^{∞} p(x) dx + (1/(2σ²)) ∫_{-∞}^{∞} x² p(x) dx
         = ln(σ√(2π)) + (1/2) ∫_{-∞}^{∞} x (x/σ²) p(x) dx;

integrating by parts with u = x and v' = (x/σ²) p(x), giving v = -p(x):

    ∫_{-∞}^{∞} x (x/σ²) p(x) dx = [-x p(x)]_{-∞}^{∞} + ∫_{-∞}^{∞} p(x) dx = 0 + 1 = 1,

so

    H(X) = ln(σ√(2π)) + 1/2 = log₂(σ√(2π)) + (1/2) log₂ e = log₂(σ√(2πe)) bits.
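These closed forms can be checked by numerical integration (an illustrative sketch of my own; it assumes scipy is available, and the parameter values λ = 1.5 and σ = 2 are arbitrary).

    # Numerical check of Question 7: differential entropies in bits.
    import math
    from scipy.integrate import quad

    def diff_entropy(pdf, lo, hi):
        """-integral of p(x) log2 p(x) dx."""
        return quad(lambda x: -pdf(x) * math.log2(pdf(x)) if pdf(x) > 0 else 0.0,
                    lo, hi)[0]

    # (a) uniform, density 0.25 on (-2, 2): expect log2(4) = 2 bits
    print(diff_entropy(lambda x: 0.25, -2, 2))

    # (b) Laplacian (lam/2) exp(-lam |x|): expect log2(2 e / lam)
    lam = 1.5
    print(diff_entropy(lambda x: 0.5 * lam * math.exp(-lam * abs(x)),
                       -math.inf, math.inf),
          math.log2(2 * math.e / lam))

    # (c) Gaussian with standard deviation sigma: expect log2(sigma sqrt(2 pi e))
    sigma = 2.0
    gauss = lambda x: math.exp(-x**2 / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))
    print(diff_entropy(gauss, -math.inf, math.inf),
          math.log2(sigma * math.sqrt(2 * math.pi * math.e)))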

8. Continuous variables X and Y are normally distributed with standard deviation σ = 1:

    P(x) = (1/√(2π)) e^(-x²/2),    P(y) = (1/√(2π)) e^(-y²/2).

A variable Z is defined by z = x + y. What is the mutual information of X and Z?

    I(Z;X) = H(Z) - H(Z|X).

Z is a normally distributed variable with standard deviation σ = √2, since it is the sum of two independent variables each with standard deviation σ = 1. Hence, by 7(c) above,

    H(Z) = log₂(√2 √(2π)) + (1/2) log₂ e.

If X is known then Z is still normally distributed, but now the mean is x and the standard deviation is 1, since z = x + y and Y has zero mean and standard deviation σ = 1. So

    H(Z|X) = log₂ √(2π) + (1/2) log₂ e.

So

    I(Z;X) = log₂(√2 √(2π)) - log₂ √(2π) = log₂ √2 = 1/2 = 0.5 bit.
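A quick Monte Carlo sanity check (my own, not part of the paper): estimate H(Z) and H(Z|X) from sample standard deviations using the Gaussian entropy formula of 7(c); their difference approaches 0.5 bit.

    # Monte Carlo sanity check of Question 8.
    import math
    import numpy as np

    rng = np.random.default_rng(0)
    n = 1_000_000
    x = rng.standard_normal(n)
    y = rng.standard_normal(n)
    z = x + y

    def gaussian_entropy(std):
        """h = log2(sigma sqrt(2 pi e)) bits, from Question 7(c)."""
        return math.log2(std * math.sqrt(2 * math.pi * math.e))

    h_z = gaussian_entropy(z.std())                # sigma_Z is close to sqrt(2)
    h_z_given_x = gaussian_entropy((z - x).std())  # the remaining noise y has sigma = 1
    print(h_z - h_z_given_x)                       # approximately 0.5 bit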

9. A symmetric binary communications channel operates with signalling levels of ±1 volts at the detector in the receiver, and the rms noise level at the detector is 0.25 volts. The binary symbol rate is 100 kbit/s.

(a) Determine the probability of error on this channel and hence, based on mutual information, calculate the theoretical capacity of this channel for error-free communication.

Using notation and results from the 3F1 Random Processes course and from section 3.5 of the 3F1 Information Theory course:

Probability of bit error:

    p_e = Q(v/σ) = Q(1/0.25) = Q(4) = 3.17 × 10⁻⁵.

The mutual information between the channel input X and the channel output Y is then

    I(Y;X) = H(Y) - H(Y|X) = H(Y) - H[p_e, 1-p_e].

Now, assuming equiprobable random bits at the input and output of the channel,

    H(Y) = -[0.5 log₂ 0.5 + 0.5 log₂ 0.5] = 1 bit/sym

and, from the above p_e,

    H[p_e, 1-p_e] = -[3.17×10⁻⁵ log₂(3.17×10⁻⁵) + (1 - 3.17×10⁻⁵) log₂(1 - 3.17×10⁻⁵)]
                  = 4.739×10⁻⁴ + 4.57×10⁻⁵ = 5.196×10⁻⁴ bit/sym.

So I(Y;X) = 1 - 5.196×10⁻⁴ = 0.99948 bit/sym.

Hence the error-free capacity of the binary channel is Symbol rate × I(Y;X) = 100k × 0.99948 = 99.948 kbit/s.

(b) If the binary signalling were replaced by symbols drawn from a continuous process with a Gaussian (normal) pdf with zero mean and the same mean power at the detector, determine the theoretical capacity of this new channel, assuming the symbol rate remains at 100 ksym/s and the noise level is unchanged.

For symbols with a Gaussian pdf, section 4.4 of the course shows that the mutual information of the channel is now given by

    I(Y;X) = h(Y) - h(N) = log₂(σ_Y/σ_N) = (1/2) log₂(1 + σ_X²/σ_N²)

since the noise is uncorrelated with the signal, so σ_Y² = σ_X² + σ_N². For the given channel, σ_X/σ_N = 1/0.25 = 4. Hence

    I(Y;X) = (1/2) log₂(1 + 4²) = 2.0437 bit/sym

and so the theoretical capacity of the Gaussian channel is C_G = 100k × 2.0437 = 204.37 kbit/s, over twice that of the binary channel.
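The numbers in parts (a) and (b) can be reproduced as follows (a sketch of my own, using the complementary error function to evaluate Q(x); not part of the original paper).

    # Cross-check of Question 9(a) and 9(b).
    import math

    def Q(x):
        """Gaussian tail probability, via the complementary error function."""
        return 0.5 * math.erfc(x / math.sqrt(2))

    def h(p):
        """Binary entropy in bits."""
        return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

    rate = 100e3          # symbols per second
    p_e = Q(1 / 0.25)     # +-1 V signalling, 0.25 V rms noise -> Q(4) = 3.17e-5

    C_binary = rate * (1 - h(p_e))                 # about 99.95 kbit/s
    C_gauss = rate * 0.5 * math.log2(1 + 4**2)     # about 204.4 kbit/s
    print(p_e, C_binary, C_gauss)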

(c) The three capacities are plotted below. There is no loss at low SNR for using binary signalling as long as the output remains continuous. As the SNR increases, all binary signalling methods hit a 1 bit/use ceiling, whereas the capacity for continuous signalling continues to grow unbounded.

[Figure: Channel capacity in bits per use versus SNR in dB for three cases: continuous input with continuous output, binary input with binary output, and binary input with continuous output.]
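A sketch of how curves like these can be computed (my own reconstruction, not the script used for the original figure; it assumes scipy is available). The soft-decision curve evaluates the mutual information of an equiprobable ±A input in unit-variance Gaussian noise by numerical integration.

    # Capacity per channel use versus SNR for the three cases in the figure:
    # (i) Gaussian input / continuous output, (ii) binary input / binary output,
    # (iii) binary input / continuous output.
    import math
    from scipy.integrate import quad

    def Q(x):
        return 0.5 * math.erfc(x / math.sqrt(2))

    def h(p):
        return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

    def c_gaussian(snr):
        return 0.5 * math.log2(1 + snr)

    def c_binary_hard(snr):
        return 1 - h(Q(math.sqrt(snr)))

    def c_binary_soft(snr):
        # Equiprobable +-A input, unit-variance Gaussian noise, A = sqrt(snr).
        a = math.sqrt(snr)
        norm = lambda y, m: math.exp(-(y - m) ** 2 / 2) / math.sqrt(2 * math.pi)
        p_y = lambda y: 0.5 * (norm(y, a) + norm(y, -a))
        h_y = quad(lambda y: -p_y(y) * math.log2(p_y(y)), -a - 10, a + 10, limit=200)[0]
        h_n = 0.5 * math.log2(2 * math.pi * math.e)   # differential entropy of the noise
        return h_y - h_n

    for snr_db in range(-10, 26, 5):
        snr = 10 ** (snr_db / 10)
        print(snr_db, round(c_gaussian(snr), 3),
              round(c_binary_hard(snr), 3), round(c_binary_soft(snr), 3))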