One Lesson of Information Theory

One Lesson of Information Theory
Prof. Dr.-Ing. Volker Kühn
Institute of Communications Engineering, University of Rostock, Germany
Email: volker.kuehn@uni-rostock.de
http://www.int.uni-rostock.de/
September 2010

Outline of Lectures

Lesson 1: One Lesson of Information Theory
- Principal structure of communication systems
- Definitions of entropy and mutual information
- Channel coding theorem of Shannon

Lesson 2: Introduction to Error Correcting Codes
- Basics of error correcting codes
- Linear block codes
- Convolutional codes (if time permits)

Lesson 3: State-of-the-Art Channel Coding
- Coding strategies to approach the capacity limits
- Definition of soft information and the turbo decoding principle
- Examples of state-of-the-art error correcting codes

Literature
- Lin, Costello: Error Control Coding: Fundamentals and Applications
- Bossert: Channel Coding
- Johannesson, Zigangirov: Fundamentals of Convolutional Codes
- Richardson, Urbanke: Modern Coding Theory
- Neubauer, Freudenberger, Kühn: Coding Theory: Algorithms, Architectures, and Applications
- Johannesson: Information Theory
- Cover, Thomas: Elements of Information Theory

Principal Structure of a Digital Communication System
(Block diagram: analog source → source encoder; both together form the digital source, which delivers the data vector u of length k.)
- The source generates an analog signal (e.g. voice, video).
- Source coding samples, quantizes, and compresses the analog signal.
- Digital source: comprises the analog source and the source coding; it delivers the digital data vector u of length k.

Principal Structure of a Digital Communication System
(Block diagram: digital source → u (length k) → channel encoder → x (length n).)
- The channel encoder adds redundancy to u, resulting in the code word x of length n.
- The channel encoder may consist of several constituent codes.
- Code rate: $R_c = k/n$

Principal Structure of a Digital Communication System
(Block diagram: digital source → channel encoder ($R_c = k/n$) → x → modulator → physical channel → demodulator → y; modulator, physical channel, and demodulator together form the time-discrete channel.)
- The modulator maps the discrete vector x onto an analog waveform and moves it into the transmission band.
- The physical channel represents the transmission medium: multipath propagation causes intersymbol interference (ISI); time-varying fading, i.e. deep fades; additive noise.
- Demodulator: moves the signal back into the baseband and performs lowpass filtering, sampling, and quantization.
- Time-discrete channel: comprises the analog part of the modulator, the physical channel, and the analog part of the demodulator.

Principal Structure of a Digital Communication System
(Block diagram: channel encoder ($R_c = k/n$) → x → modulator → physical channel → demodulator → y → channel decoder → û.)
- Channel decoder: estimation of u on the basis of the received vector y.
- y need not consist of hard-quantized values {0, 1}.
- Since the encoder may consist of several parts, the decoder may also consist of several modules.

Principal Structure of a Digital Communication System
(Complete block diagram: digital source → u → channel encoder ($R_c = k/n$) → x → modulator → physical channel → demodulator → y → channel decoder → û → source decoder → digital sink; modulator, physical channel, and demodulator form the time-discrete channel.)
Quotation from Massey: "The purpose of the modulation system is to create a good discrete channel from the modulator input to the demodulator output, and the purpose of the coding system is to transmit the information bits reliably through this discrete channel at the highest practicable rate."

Time-Discrete Channel
The time-discrete channel comprises the analog parts of the modulator and the demodulator as well as the physical transmission medium.
- Input $x_i \in \mathcal{X}$: discrete input alphabet $\mathcal{X} = \{X_0, \ldots, X_{|\mathcal{X}|-1}\}$
- Output $y_i \in \mathcal{Y}$: discrete or continuous output alphabet $\mathcal{Y} = \{Y_0, \ldots, Y_{|\mathcal{Y}|-1}\}$ or $\mathcal{Y} = \mathbb{R}$
- Probabilities / probability densities: $\Pr\{X_\nu\}$, $\Pr\{Y_\mu\}$ / $p(y)$
- Joint probability of an event: $\Pr\{X_\nu, Y_\mu\}$ / $p(x = X_\nu, y)$
- Conditional probabilities: $\Pr\{Y_\mu \mid X_\nu\}$ / $p(y \mid x = X_\nu)$
- A posteriori probabilities: $\Pr\{X_\nu \mid Y_\mu\}$ / $\Pr\{X_\nu \mid y\}$

AWGN: Additive White Gaussian Noise
Channel model: $y_i = x_i + n_i$ with the conditional density
$$p(y \mid x = X_\nu) = \frac{1}{\sqrt{2\pi\sigma_N^2}}\, e^{-\frac{(y - X_\nu)^2}{2\sigma_N^2}}$$
(Figure: conditional densities $p(y \mid x = -1)$ and $p(y \mid x = +1)$ together with the mixture $p(y)$, plotted over $y$ for signal-to-noise ratios $E_s/N_0$ of 2 dB and 6 dB.)
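For illustration, here is a minimal Python sketch (not part of the original slides) that evaluates the conditional densities and the mixture $p(y)$ for equiprobable BPSK symbols; it assumes unit-energy symbols so that $\sigma_N^2 = 1/(2\,E_s/N_0)$, which is a normalization choice and not stated on the slide.

```python
import numpy as np

def awgn_densities(es_n0_db, symbols=(-1.0, +1.0)):
    """Conditional densities p(y|x) and mixture p(y) for BPSK over AWGN.

    Assumes unit-energy symbols, so sigma_N^2 = 1 / (2 * Es/N0).
    """
    es_n0 = 10.0 ** (es_n0_db / 10.0)
    sigma2 = 1.0 / (2.0 * es_n0)           # noise variance sigma_N^2
    y = np.linspace(-4.0, 4.0, 801)
    cond = {x: np.exp(-(y - x) ** 2 / (2.0 * sigma2)) / np.sqrt(2.0 * np.pi * sigma2)
            for x in symbols}
    p_y = sum(cond.values()) / len(symbols)   # mixture for equiprobable inputs
    return y, cond, p_y

y, cond, p_y = awgn_densities(2.0)   # Es/N0 = 2 dB, as in the left plot
print(p_y.max())
```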

Error Probability of AWGN Channel and BPSK
Symbol error probability expressed by the complementary error function:
$$P_s = \frac{1}{\sqrt{\pi}} \int_{\sqrt{E_s/N_0}}^{\infty} e^{-\xi^2}\, d\xi = \frac{1}{2}\,\mathrm{erfc}\!\left(\sqrt{\frac{E_s}{N_0}}\right)$$
(Figure: conditional densities for $X_0 = -1$ and $X_1 = +1$ with the decision threshold at $y = 0$ separating the decision regions $Y_0$ and $Y_1$.)
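As a quick numerical check (a sketch, not from the slides), the formula above can be evaluated directly with SciPy's erfc:

```python
import numpy as np
from scipy.special import erfc

def bpsk_error_probability(es_n0_db):
    """P_s = 0.5 * erfc(sqrt(Es/N0)) for BPSK over the AWGN channel."""
    es_n0 = 10.0 ** (es_n0_db / 10.0)
    return 0.5 * erfc(np.sqrt(es_n0))

for snr_db in (2.0, 6.0):
    print(f"Es/N0 = {snr_db} dB -> Ps = {bpsk_error_probability(snr_db):.3e}")
```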

Transition to Discrete Channels
- Discrete channels arise from quantization of the continuous channel output.
- We consider binary antipodal transmission: $\mathcal{X} = \{X_0, X_1\} = \{+1, -1\}$
- Generally, the channel output is continuously distributed: $\mathcal{Y} = \mathbb{R}$
- L-bit quantization, due to the finite precision of digital circuits, delivers the alphabet $\mathcal{Y} = \{Y_0, \ldots, Y_{2^L-1}\}$
- L = 1: hard decision: $\mathcal{Y} = \{Y_0, Y_1\} = \{+1, -1\} = \mathcal{X}$
- L = 2: four output symbols: $\mathcal{Y} = \{Y_0, Y_1, Y_2, Y_3\}$
- L = 3: eight output symbols: $\mathcal{Y} = \{Y_0, Y_1, \ldots, Y_7\}$

Discrete Channels (1)
Binary Symmetric Channel (BSC), obtained from hard decision (L = 1): $\mathcal{Y} = \{Y_0, Y_1\} = \{+1, -1\}$
(Transition diagram: $X_0 \to Y_0$ and $X_1 \to Y_1$ each with probability $1 - P_e$; $X_0 \to Y_1$ and $X_1 \to Y_0$ each with probability $P_e$.)
$$P_e = \frac{1}{2}\,\mathrm{erfc}\!\left(\sqrt{\frac{E_s}{N_0}}\right)$$

Discrete Channels (2)
Binary Symmetric Erasure Channel (BSEC):
(Transition diagram: $X_0 \to Y_0$ and $X_1 \to Y_1$ with probability $1 - P_e - P_q$; $X_0 \to Y_1$ and $X_1 \to Y_0$ with probability $P_e$; $X_0 \to Y_2$ and $X_1 \to Y_2$ (erasure) with probability $P_q$.)
(Quantizer: with inputs $X_0 = -1$ and $X_1 = +1$, the continuous output is mapped to $Y_0$ for $y < -a$, to the erasure symbol $Y_2$ for $-a \le y \le +a$, and to $Y_1$ for $y > +a$.)

Discrete Channels (3)
2-bit quantization (L = 2):
(Quantizer: thresholds at $-a$, $0$, and $+a$ partition the output into four regions $Y_0, Y_1, Y_2, Y_3$; the transition diagram connects each input $X_0, X_1$ to all four output symbols.)

Information, Entropy
- The amount of information should depend on the probability: $I(X_\nu) = f(\Pr\{X_\nu\})$
- For independent events, $\Pr\{X_\nu, Y_\mu\} = \Pr\{X_\nu\} \cdot \Pr\{Y_\mu\}$ should lead to $I(X_\nu, Y_\mu) = I(X_\nu) + I(Y_\mu)$.
- The logarithm is the sole function that maps a product onto a sum:
$$I(X_\nu) = -\log_2\left(\Pr\{X_\nu\}\right) = \log_2\left(\frac{1}{\Pr\{X_\nu\}}\right)$$
- Entropy:
$$H(\mathcal{X}) = -\sum_\nu \Pr\{X_\nu\}\,\log_2\left(\Pr\{X_\nu\}\right) = \mathrm{E}\left\{-\log_2\left(\Pr\{X\}\right)\right\}$$
- Entropy is a measure of uncertainty.

Examples for Entropy
Set of events: $\mathcal{X} = \{X_1, X_2, X_3, X_4, X_5\}$, where each event occurs with a certain probability:
- $\Pr\{X_1\} = 0.30$, $I(X_1) = 1.7370$
- $\Pr\{X_2\} = 0.20$, $I(X_2) = 2.3219$
- $\Pr\{X_3\} = 0.20$, $I(X_3) = 2.3219$
- $\Pr\{X_4\} = 0.15$, $I(X_4) = 2.7370$
- $\Pr\{X_5\} = 0.15$, $I(X_5) = 2.7370$
$$H(\mathcal{X}) = -\sum_\nu \Pr\{X_\nu\}\,\log_2\left(\Pr\{X_\nu\}\right) = 2.271\ \text{bit}$$
The entropy of a set is maximized when all M elements are equally likely:
$$\max_{\Pr\{X\}} H(\mathcal{X}) = H_{\text{equal}}(\mathcal{X}) = \sum_{\nu=0}^{M-1} \frac{1}{M}\,\log_2(M) = \log_2(M)\ \text{bit} = 2.32\ \text{bit}$$
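A small Python sketch (not part of the slides) that reproduces these numbers:

```python
import numpy as np

probs = np.array([0.30, 0.20, 0.20, 0.15, 0.15])

info = -np.log2(probs)                 # self-information I(X_nu) in bit
entropy = np.sum(probs * info)         # H(X) = 2.271 bit

print(info)                            # [1.737 2.3219 2.3219 2.737 2.737]
print(f"H(X)    = {entropy:.3f} bit")
print(f"log2(M) = {np.log2(len(probs)):.3f} bit  (maximum for M = 5)")
```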

Example: LCD for 10 Digits
A seven-segment display with segments a to g shows the digits 0 to 9; a 1 indicates an active segment:

digit  0 1 2 3 4 5 6 7 8 9
  a    1 0 0 0 1 1 1 0 1 1
  b    1 0 1 0 0 0 1 0 1 0
  c    1 0 1 1 0 1 1 1 1 1
  d    0 0 1 1 1 1 1 0 1 1
  e    1 0 1 1 0 1 1 0 1 1
  f    1 1 1 1 1 0 0 1 1 1
  g    1 1 0 1 1 1 1 1 1 1

- All digits occur with the same probability: $\Pr\{X_\nu\} = 0.1$
- Amount of information per digit: $I(X_\nu) = -\log_2(\Pr\{X_\nu\}) = \log_2(10) = 3.32$ bit
- Entropy of the alphabet: $H(\mathcal{X}) = \sum_\nu \Pr\{X_\nu\}\, I(X_\nu) = 3.32$ bit
- Absolute redundancy: $R = m - H(\mathcal{X}) = 7\ \text{bit} - 3.32\ \text{bit} = 3.68\ \text{bit}$
- Relative redundancy: $r = R/m = 3.68\ \text{bit} / 7\ \text{bit} = 52.54\%$

Binary Entropy Function
Set of events: $\mathcal{X} = \{X_1, X_2\}$ with event probabilities $\Pr\{X_1\} = P_1$ and $\Pr\{X_2\} = 1 - P_1$:
$$H(\mathcal{X}) = H_2(P_1) = -P_1\,\log_2(P_1) - (1 - P_1)\,\log_2(1 - P_1)$$
(Figure: $H_2(P_1)$ plotted over $P_1 \in [0, 1]$; it reaches its maximum of 1 bit at $P_1 = 0.5$.)

Illumination of Entropies
(Diagram relating $H(\mathcal{X})$, $H(\mathcal{Y})$, $H(\mathcal{X}|\mathcal{Y})$, $H(\mathcal{Y}|\mathcal{X})$, $H(\mathcal{X};\mathcal{Y})$, and $H(\mathcal{X},\mathcal{Y})$.)
- $H(\mathcal{X})$, $H(\mathcal{Y})$: entropies of the source and sink alphabets
- $H(\mathcal{X},\mathcal{Y})$: joint entropy of source and sink
- $H(\mathcal{X}|\mathcal{Y})$: equivocation: information lost during transmission
- $H(\mathcal{Y}|\mathcal{X})$: irrelevance: information not originating from the source
- $H(\mathcal{X};\mathcal{Y})$: mutual information: information correctly received at the sink

Joint Entropy, Equivocation, Irrelevance
Joint information:
$$I(X_\nu, Y_\mu) = -\log_2 \Pr\{X_\nu, Y_\mu\}$$
Joint entropy of source and sink:
$$H(\mathcal{X},\mathcal{Y}) = -\sum_\nu \sum_\mu \Pr\{X_\nu, Y_\mu\}\,\log_2 \Pr\{X_\nu, Y_\mu\} = \mathrm{E}\left\{-\log_2 \Pr\{X_\nu, Y_\mu\}\right\}$$
Equivocation: information lost during transmission:
$$H(\mathcal{X}|\mathcal{Y}) = H(\mathcal{X},\mathcal{Y}) - H(\mathcal{Y}) = -\sum_\nu \sum_\mu \Pr\{X_\nu, Y_\mu\}\,\log_2 \Pr\{X_\nu | Y_\mu\} = \mathrm{E}\left\{-\log_2 \Pr\{X_\nu | Y_\mu\}\right\}$$
Irrelevance:
$$H(\mathcal{Y}|\mathcal{X}) = H(\mathcal{X},\mathcal{Y}) - H(\mathcal{X}) = -\sum_\nu \sum_\mu \Pr\{X_\nu, Y_\mu\}\,\log_2 \Pr\{Y_\mu | X_\nu\} = \mathrm{E}\left\{-\log_2 \Pr\{Y_\mu | X_\nu\}\right\}$$
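A minimal numerical sketch (not from the slides) that evaluates these quantities for a given joint distribution; the BSC with $P_e = 0.1$ and uniform input is only an assumed example:

```python
import numpy as np

# Joint pmf Pr{X_nu, Y_mu}: rows = inputs, columns = outputs.
# Example (assumed): BSC with Pe = 0.1 and uniform input.
Pe = 0.1
P_xy = 0.5 * np.array([[1 - Pe, Pe],
                       [Pe, 1 - Pe]])

P_x = P_xy.sum(axis=1)                 # marginal Pr{X_nu}
P_y = P_xy.sum(axis=0)                 # marginal Pr{Y_mu}

def H(p):
    """Entropy in bit; 0*log2(0) is treated as 0."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

H_xy = H(P_xy.ravel())                 # joint entropy H(X,Y)
H_x_given_y = H_xy - H(P_y)            # equivocation H(X|Y)
H_y_given_x = H_xy - H(P_x)            # irrelevance  H(Y|X)

print(f"H(X,Y) = {H_xy:.4f} bit, H(X|Y) = {H_x_given_y:.4f} bit, H(Y|X) = {H_y_given_x:.4f} bit")
```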

Mutual Information
Definition of mutual information:
$$H(\mathcal{X};\mathcal{Y}) = H(\mathcal{X}) - H(\mathcal{X}|\mathcal{Y}) = H(\mathcal{Y}) - H(\mathcal{Y}|\mathcal{X}) = H(\mathcal{X}) + H(\mathcal{Y}) - H(\mathcal{X},\mathcal{Y})$$
$$= \sum_\nu \sum_\mu \Pr\{Y_\mu | X_\nu\}\,\Pr\{X_\nu\}\,\log_2\frac{\Pr\{Y_\mu | X_\nu\}}{\Pr\{Y_\mu\}} = \mathrm{E}\left\{\log_2\frac{\Pr\{Y_\mu | X_\nu\}}{\Pr\{Y_\mu\}}\right\} = \mathrm{E}\left\{\log_2\frac{\Pr\{X_\nu, Y_\mu\}}{\Pr\{X_\nu\}\,\Pr\{Y_\mu\}}\right\}$$
- Mutual information is the amount of information common to X and Y.
- Mutual information is the reduction of uncertainty in X due to the knowledge of Y.

Illustration of Channel Capacity
(Diagram: $H(\mathcal{X})$ enters the channel; the equivocation $H(\mathcal{X}|\mathcal{Y})$ is lost, the irrelevance $H(\mathcal{Y}|\mathcal{X})$ is added by the channel, and the mutual information $H(\mathcal{X};\mathcal{Y})$ arrives as part of $H(\mathcal{Y})$.)
Maximization of the mutual information with respect to the source statistics delivers the channel capacity:
$$C = \sup_{\Pr\{X\}} \sum_\nu \sum_\mu \Pr\{Y_\mu | X_\nu\}\,\Pr\{X_\nu\}\,\log_2\frac{\Pr\{Y_\mu | X_\nu\}}{\Pr\{Y_\mu\}}$$
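For a binary-input channel the supremum can be found numerically by sweeping the input probability $P_0$; the sketch below (an illustration with assumed example values, not part of the slides) does this for a BSC with $P_e = 0.1$:

```python
import numpy as np

def mutual_information(P_y_given_x, p_x):
    """I(X;Y) in bit for a discrete memoryless channel."""
    p_xy = p_x[:, None] * P_y_given_x          # joint pmf Pr{X,Y}
    p_y = p_xy.sum(axis=0)                     # output pmf Pr{Y}
    mask = p_xy > 0
    return np.sum(p_xy[mask] * np.log2(p_xy[mask] / (p_x[:, None] * p_y[None, :])[mask]))

Pe = 0.1
P_y_given_x = np.array([[1 - Pe, Pe],
                        [Pe, 1 - Pe]])         # BSC transition matrix

p0_grid = np.linspace(1e-6, 1 - 1e-6, 1001)
mi = [mutual_information(P_y_given_x, np.array([p0, 1 - p0])) for p0 in p0_grid]

print(f"C ≈ {max(mi):.4f} bit at P0 ≈ {p0_grid[int(np.argmax(mi))]:.2f}")
# For the BSC this matches C = 1 - H2(Pe), attained at the uniform input distribution.
```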

Channel Coding Theorem of Shannon
Shannon, 1948: "A Mathematical Theory of Communication"
- If a channel has the capacity C, there exists a code with rate $R_c \leq C$ for which the probability of a decoding error can be made arbitrarily small.
- Converse theorem: if a channel has the capacity C, reliable (error-free) communication cannot be achieved with codes of rate $R_c > C$.
- The theorems are not constructive, i.e. they do not provide a construction guideline for powerful codes.

Capacity of Binary Channels
(Transition diagram: $X_0 \to Y_0$ and $X_1 \to Y_1$, each with probability 1.)
Statistics of the channel:
$$\Pr\{X_\nu\} = \begin{cases} P_0 & \text{for } \nu = 0 \\ 1 - P_0 & \text{for } \nu = 1 \end{cases} \qquad \Pr\{Y_\mu | X_\nu\} = \begin{cases} 1 & \text{for } \mu = \nu \\ 0 & \text{for } \mu \neq \nu \end{cases} \qquad \Pr\{Y_\mu\} = \begin{cases} P_0 & \text{for } \mu = 0 \\ 1 - P_0 & \text{for } \mu = 1 \end{cases}$$
Mutual information:
$$H(\mathcal{X};\mathcal{Y}) = P_0\,\log_2\frac{1}{P_0} + (1 - P_0)\,\log_2\frac{1}{1 - P_0} = H_2(P_0) = H(\mathcal{X}) \qquad \text{(hint: } 0 \cdot \log_2(0) = 0\text{)}$$
Perfect transmission without any errors!

Capacity of Binary Channels
(Transition diagram: $X_0 \to Y_1$ and $X_1 \to Y_0$, each with probability 1, i.e. the channel always inverts the symbol.)
Statistics of the channel:
$$\Pr\{X_\nu\} = \begin{cases} P_0 & \text{for } \nu = 0 \\ 1 - P_0 & \text{for } \nu = 1 \end{cases} \qquad \Pr\{Y_\mu | X_\nu\} = \begin{cases} 0 & \text{for } \mu = \nu \\ 1 & \text{for } \mu \neq \nu \end{cases} \qquad \Pr\{Y_\mu\} = \begin{cases} 1 - P_0 & \text{for } \mu = 0 \\ P_0 & \text{for } \mu = 1 \end{cases}$$
Mutual information:
$$H(\mathcal{X};\mathcal{Y}) = P_0\,\log_2\frac{1}{P_0} + (1 - P_0)\,\log_2\frac{1}{1 - P_0} = H_2(P_0) = H(\mathcal{X}) \qquad \text{(hint: } 0 \cdot \log_2(0) = 0\text{)}$$
Perfect transmission without any errors: the inversion is deterministic and can simply be undone, so no information is lost.

Capacity of Binary Erasure Channel
(Transition diagram of the BEC: $X_0 \to Y_0$ and $X_1 \to Y_1$ with probability $1 - P_e$; $X_0 \to Y_2$ and $X_1 \to Y_2$ (erasure) with probability $P_e$.)
Statistics of the BEC:
$$\Pr\{X_\nu\} = \begin{cases} P_0 & \text{for } \nu = 0 \\ 1 - P_0 & \text{for } \nu = 1 \end{cases} \qquad \Pr\{Y_\mu\} = \begin{cases} P_0\,(1 - P_e) & \text{for } \mu = 0 \\ (1 - P_0)\,(1 - P_e) & \text{for } \mu = 1 \\ P_e\,(P_0 + 1 - P_0) = P_e & \text{for } \mu = 2 \end{cases}$$
Mutual information of the BEC:
$$I(\mathcal{X};\mathcal{Y}) = (1 - P_e)\,P_0\,\log_2\frac{1 - P_e}{P_0\,(1 - P_e)} + P_e\,P_0\,\log_2\frac{P_e}{P_e} + (1 - P_e)\,(1 - P_0)\,\log_2\frac{1 - P_e}{(1 - P_0)\,(1 - P_e)} + P_e\,(1 - P_0)\,\log_2\frac{P_e}{P_e} = (1 - P_e)\,H_2(P_0)$$
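The closed form can be checked against a direct evaluation of the mutual information; this is a sketch with assumed example values, not taken from the slides:

```python
import numpy as np

def h2(p):
    """Binary entropy function in bit."""
    return 0.0 if p in (0.0, 1.0) else -p * np.log2(p) - (1 - p) * np.log2(1 - p)

def bec_mutual_information(P0, Pe):
    """I(X;Y) of the binary erasure channel, computed from the definition."""
    p_x = np.array([P0, 1 - P0])
    # Transition matrix, columns: Y0, Y1, Y2 (erasure)
    P_y_given_x = np.array([[1 - Pe, 0.0, Pe],
                            [0.0, 1 - Pe, Pe]])
    p_xy = p_x[:, None] * P_y_given_x      # joint pmf
    p_y = p_xy.sum(axis=0)                 # output pmf
    mask = p_xy > 0
    ratio = P_y_given_x / p_y              # broadcasts to shape (2, 3)
    return np.sum(p_xy[mask] * np.log2(ratio[mask]))

P0, Pe = 0.3, 0.2                          # assumed example values
print(bec_mutual_information(P0, Pe), (1 - Pe) * h2(P0))   # both ≈ 0.705
```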

Capacity of Binary Erasure Channel
(Figure: mutual information of the BEC versus $P_e$ for different input statistics $\Pr(X_0) = 0.1, 0.2, 0.3, 0.4, 0.5$; each curve decreases linearly from $H_2(\Pr(X_0))$ at $P_e = 0$ to zero at $P_e = 1$.)
Capacity of the BEC for uniform input distribution:
$$C_{\text{BEC}} = 1 - P_e$$

Capacity of Binary Symmetric Channel
(Transition diagram of the BSC: $X_0 \to Y_0$ and $X_1 \to Y_1$ with probability $1 - P_e$; crossovers with probability $P_e$.)
Statistics of the BSC for uniform input distribution:
$$\Pr\{X_0\} = \Pr\{X_1\} = \frac{1}{2}, \qquad \Pr\{Y_\mu | X_\nu\} = \begin{cases} 1 - P_e & \text{for } \mu = \nu \\ P_e & \text{for } \mu \neq \nu \end{cases}, \qquad \Pr\{Y_0\} = \Pr\{Y_1\} = \frac{1}{2}$$
Mutual information of the BSC:
$$C_{\text{BSC}} = 2\,(1 - P_e)\,\frac{1}{2}\,\log_2\left(2\,(1 - P_e)\right) + 2\,P_e\,\frac{1}{2}\,\log_2\left(2\,P_e\right) = (1 - P_e)\left(1 + \log_2(1 - P_e)\right) + P_e\left(1 + \log_2(P_e)\right) = 1 + (1 - P_e)\,\log_2(1 - P_e) + P_e\,\log_2(P_e) = 1 - H_2(P_e)$$
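As a quick sketch (not from the slides), the BSC capacity can be evaluated together with the hard-decision crossover probability of the AWGN channel from the earlier slide:

```python
import numpy as np
from scipy.special import erfc

def h2(p):
    """Binary entropy function in bit (0*log2(0) treated as 0)."""
    p = np.clip(p, 1e-300, 1.0)
    return -p * np.log2(p) - (1 - p) * np.log2(np.clip(1 - p, 1e-300, 1.0))

def c_bsc(pe):
    """Capacity of the binary symmetric channel."""
    return 1.0 - h2(pe)

for es_n0_db in (0.0, 2.0, 6.0):
    pe = 0.5 * erfc(np.sqrt(10.0 ** (es_n0_db / 10.0)))   # hard-decision crossover
    print(f"Es/N0 = {es_n0_db:4.1f} dB: Pe = {pe:.4f}, C_BSC = {c_bsc(pe):.4f} bit")
```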

Capacity of Binary Symmetric Channel
(Figure: mutual information of the BSC versus $P_e$ for different input statistics $\Pr\{X_0\} = 0.1, 0.3, 0.5$; the uniform input $\Pr\{X_0\} = 0.5$ gives the largest value for every $P_e$.)
Capacity of the BSC for uniform input distribution:
$$C_{\text{BSC}} = 1 + P_e\,\log_2(P_e) + (1 - P_e)\,\log_2(1 - P_e) = 1 - H_2(P_e)$$

Binary Symmetric Erasure Channel (BSEC)
- The quantization parameter a has to be optimized with respect to the channel capacity C.
- The optimal choice depends on the signal-to-noise ratio $E_s/N_0$.
(Transition diagram and quantizer as on the slide "Discrete Channels (2)": correct transitions with probability $1 - P_e - P_q$, crossovers with probability $P_e$, erasures to $Y_2$ with probability $P_q$; thresholds at $\pm a$.)
$$C_{\text{BSEC}} = 1 - P_q + P_e\,\log_2(P_e) + (1 - P_e - P_q)\,\log_2(1 - P_e - P_q) - (1 - P_q)\,\log_2(1 - P_q)$$
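A sketch (not taken from the slides) that evaluates $C_{\text{BSEC}}$ and searches for a good threshold a numerically; it assumes unit-energy BPSK with $\sigma_N^2 = 1/(2\,E_s/N_0)$ and erasure thresholds at $\pm a$, which is one possible model for how $P_e$ and $P_q$ arise:

```python
import numpy as np
from scipy.special import erfc

def Q(x):
    return 0.5 * erfc(x / np.sqrt(2.0))

def bsec_capacity(a, es_n0_db):
    """C_BSEC for BPSK (+/-1) over AWGN with erasure thresholds at +/- a."""
    sigma = np.sqrt(1.0 / (2.0 * 10.0 ** (es_n0_db / 10.0)))
    Pe = Q((1.0 + a) / sigma)                      # wrong decision region
    Pq = Q((1.0 - a) / sigma) - Pe                 # erasure region
    xlog = lambda p: np.where(p > 0, p * np.log2(np.maximum(p, 1e-300)), 0.0)
    return 1.0 - Pq + xlog(Pe) + xlog(1.0 - Pe - Pq) - xlog(1.0 - Pq)

es_n0_db = 2.0
a_grid = np.linspace(0.0, 1.0, 501)
c = bsec_capacity(a_grid, es_n0_db)
print(f"best a ≈ {a_grid[int(np.argmax(c))]:.2f}, C_BSEC ≈ {c.max():.4f} bit "
      f"(a = 0 gives the BSC: C = {bsec_capacity(0.0, es_n0_db):.4f} bit)")
```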

Channel Capacity for BSC and BSEC
(Figure: $C_{\text{BSEC}}$ with optimized threshold a, $C_{\text{BSC}}$, and the optimal threshold $a_{\text{opt}}$ plotted over $E_s/N_0$ in dB, from -20 dB to 10 dB.)
Introducing the erasure zone (a > 0) leads only to a minor improvement of the channel capacity compared to the BSC.

Capacity of AWGN Channel
Additive white Gaussian noise channel: $y = x + n$ with $n \sim \mathcal{N}(0, \sigma_N^2)$, $x \sim \mathcal{N}(0, \sigma_X^2)$, $y \sim \mathcal{N}(0, \sigma_Y^2)$.
Differential entropy of a Gaussian random process:
$$h(X) = -\int p_X(\xi)\,\log_2 p_X(\xi)\, d\xi = \frac{1}{2}\,\log_2\left(2\pi e\,\sigma_X^2\right)$$
Capacity of the AWGN channel:
$$C = h(Y) - h(Y|X) = h(Y) - h(N) = \frac{1}{2}\,\log_2\left(2\pi e\,(\sigma_X^2 + \sigma_N^2)\right) - \frac{1}{2}\,\log_2\left(2\pi e\,\sigma_N^2\right) = \frac{1}{2}\,\log_2\left(1 + \frac{\sigma_X^2}{\sigma_N^2}\right)$$
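A one-line check of the formula (a sketch, not from the slides), with the SNR expressed as $\sigma_X^2/\sigma_N^2$ in dB:

```python
import numpy as np

def c_awgn(snr_db):
    """Capacity of the real-valued AWGN channel in bit per channel use."""
    snr = 10.0 ** (snr_db / 10.0)      # sigma_X^2 / sigma_N^2
    return 0.5 * np.log2(1.0 + snr)

for snr_db in (0.0, 10.0, 20.0):
    print(f"SNR = {snr_db:4.1f} dB -> C = {c_awgn(snr_db):.3f} bit/use")
```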

Channel Capacity of BPSK and AWGN
(Figure: capacity C over $E_s/N_0$ in dB, from -10 dB to 10 dB, showing the influence of the output quantization: curves for quantization with q = 1, 2, and 3 bit, the unquantized BPSK input, and the Gaussian-input AWGN capacity.)

Ultimate Communication Limit
Energy per information bit: $E_b = E_s / C \;\Leftrightarrow\; E_s = C \cdot E_b$
Capacity of the one-dimensional AWGN channel:
$$C = \frac{1}{2}\,\log_2\left(1 + 2\,\frac{E_s}{N_0}\right) = \frac{1}{2}\,\log_2\left(1 + 2\,C\,\frac{E_b}{N_0}\right)$$
Minimum signal-to-noise ratio:
$$\frac{E_b}{N_0} = \frac{2^{2C} - 1}{2C} \;\xrightarrow{\;C \to 0\;}\; \ln(2) \;\widehat{=}\; -1.59\ \text{dB}$$
(Figure: capacity C plotted over $E_b/N_0$ in dB; the curve approaches the ultimate limit of -1.59 dB as $C \to 0$.)
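A short sketch (not from the slides) evaluating the minimum $E_b/N_0$ as a function of the rate C and its limit:

```python
import numpy as np

def ebn0_min_db(C):
    """Minimum Eb/N0 in dB for reliable transmission at rate C over the 1-D AWGN channel."""
    return 10.0 * np.log10((2.0 ** (2.0 * C) - 1.0) / (2.0 * C))

for C in (1.0, 0.5, 0.1, 0.01):
    print(f"C = {C:5.2f} -> Eb/N0 >= {ebn0_min_db(C):6.2f} dB")

print(f"limit C -> 0: {10.0 * np.log10(np.log(2.0)):.2f} dB")   # -1.59 dB
```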

Thanks for your attention!