Introduction to Convolutional Codes, Part 1

Introduction to Convolutional Codes, Part 1 Frans M.J. Willems, Eindhoven University of Technology September 29, 2009

Outline: Elias, Father of Coding Theory; Textbook Encoder; Encoder Properties; Systematic Codes and Different Rates; Transmission.

Elias, Father of Coding Theory

Peter Elias (U.S., 1923-2001). [Figure: P. Elias.]

1955 - "Coding for Noisy Channels." Hamming had already introduced parity-check codes, but Peter went a giant step farther by showing for the binary symmetric channel that such linear codes suffice to exploit a channel to its fullest. In particular, he showed that error probability as a function of delay is bounded above and below by exponentials, whose exponents agree for a considerable range of values of the channel and the code parameters, and that these same results apply to linear codes. These exponential error bounds presaged those obtained for general channels ten years later by Gallager. In this same paper Peter introduced and named convolutional codes. His motivation was to show that it was in principle possible, by using a convolutional code with infinite constraint length, to transmit information at a rate equal to channel capacity with probability one that no decoded symbol will be in error. (by J.L. Massey)

Textbook Convolutional Encoder

We consider the encoder that appears in almost every elementary text on convolutional codes. It consists of two connected delay elements (a shift register) and two modulo-2 adders (EXORs). The output of a delay element at time t is equal to its input at time t - 1. The time index t is an integer.

[Figure: shift-register encoder with input u(t), delay elements D holding s_1(t) and s_2(t), and outputs v_1(t) and v_2(t).]

The motivation is that shift registers can be used to produce random-like sequences, and we have learned from Shannon that random codes reach capacity.

Finite-State Description

Let T be the number of symbols that is to be encoded. For the input u(t) of the encoder we assume that

u(t) ∈ {0, 1} for t = 1, 2, ..., T, and u(T + 1) = u(T + 2) = 0.   (1)

For the outputs v_1(t) and v_2(t) of the encoder we then have

v_1(t) = u(t) ⊕ s_2(t),
v_2(t) = u(t) ⊕ s_1(t) ⊕ s_2(t), for t = 1, 2, ..., T + 2,   (2)

while the states s_1(t) and s_2(t) satisfy

s_1(1) = s_2(1) = 0,
s_1(t) = u(t - 1), s_2(t) = s_1(t - 1), for t = 2, 3, ..., T + 3.   (3)

Note that the encoder starts and stops in the all-zero state (s_1, s_2) = (0, 0).
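
A minimal Python sketch may help make equations (1)-(3) concrete; the function name and list-based interface are illustrative choices, not part of the lecture.

```python
def encode(u):
    """Rate-1/2 textbook encoder: u is the list (u(1), ..., u(T)) of information bits."""
    s1, s2 = 0, 0                    # s1(1) = s2(1) = 0: start in the all-zero state
    codeword = []
    for ut in list(u) + [0, 0]:      # u(T+1) = u(T+2) = 0 drive the encoder back to state 00
        v1 = ut ^ s2                 # v1(t) = u(t) XOR s2(t)
        v2 = ut ^ s1 ^ s2            # v2(t) = u(t) XOR s1(t) XOR s2(t)
        codeword += [v1, v2]
        s1, s2 = ut, s1              # s1(t+1) = u(t), s2(t+1) = s1(t)
    return codeword

# Example used later in the lecture:
# encode([0, 1, 0, 0, 1, 0]) == [0,0, 1,1, 0,1, 1,1, 1,1, 0,1, 1,1, 0,0]
```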

Rate, Memory, Constraint Length

Codewords are now created as follows:

input word = u(1), u(2), ..., u(T),
codeword = v_1(1), v_2(1), v_1(2), v_2(2), ..., v_1(T + 2), v_2(T + 2).   (4)

The length of the input words is T, hence there are 2^T codewords. We assume that they all have probability 2^{-T}, or equivalently that U(1), U(2), ..., U(T) are all uniform. The length of the codewords is 2(T + 2), therefore the rate

R_T = log_2(2^T) / (2(T + 2)) = T / (2(T + 2)) = 1/2 - 1/(T + 2).   (5)

Note that R = lim_{T→∞} R_T = 1/2. We therefore call our encoder a rate-1/2 encoder.

The memory M associated with our code is 2. In general the encoder uses the past inputs u(t - M), ..., u(t - 1) and the current input u(t) to construct the outputs v_1(t), v_2(t). Related to this is the constraint length K = M + 1, since a new pair of outputs is determined by the M previous input symbols and the current one.
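
For example, for the input length T = 6 used later in the lecture, R_6 = 6 / (2 · 8) = 6/16 = 0.375, already fairly close to the asymptotic rate 1/2.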

Convolution, Linearity

Why do we call this a convolutional encoder? To see why, note that s_1(t) = u(t - 1) and s_2(t) = s_1(t - 1) = u(t - 2). Therefore

v_1(t) = u(t) ⊕ s_2(t) = u(t) ⊕ u(t - 2),
v_2(t) = u(t) ⊕ s_1(t) ⊕ s_2(t) = u(t) ⊕ u(t - 1) ⊕ u(t - 2).   (6)

Define the impulse responses h_1(t) and h_2(t) with coefficients

h_1(0), h_1(1), h_1(2) = 1, 0, 1,
h_2(0), h_2(1), h_2(2) = 1, 1, 1,   (7)

and all other coefficients equal to zero. Then

v_1(t) = Σ_{τ=0,1,...} u(t - τ) h_1(τ) = (u ∗ h_1)(t),
v_2(t) = Σ_{τ=0,1,...} u(t - τ) h_2(τ) = (u ∗ h_2)(t),   (8)

i.e. the outputs result from convolving u with h_1 and h_2 (with modulo-2 addition). This makes our convolutional code linear.
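
As a sanity check, the convolutional description (8) can be coded directly; this sketch (the helper names are mine) computes the outputs from the impulse responses h_1 = (1, 0, 1) and h_2 = (1, 1, 1) and agrees with the shift-register encoder above.

```python
def encode_by_convolution(u, T):
    """Outputs v1(t), v2(t) for t = 1..T+2 via the modulo-2 convolutions of (8)."""
    h1, h2 = [1, 0, 1], [1, 1, 1]

    def u_at(t):                     # u(t) = 0 outside 1..T, as assumed in (1)
        return u[t - 1] if 1 <= t <= T else 0

    codeword = []
    for t in range(1, T + 3):
        v1 = sum(u_at(t - tau) * h1[tau] for tau in range(3)) % 2
        v2 = sum(u_at(t - tau) * h2[tau] for tau in range(3)) % 2
        codeword += [v1, v2]
    return codeword

# encode_by_convolution([0, 1, 0, 0, 1, 0], 6) gives 00 11 01 11 11 01 11 00,
# the same codeword as the shift-register encoder.
```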

A Systematic Convolutional Code

[Figure: encoder with input u(t), three delay elements D, output v_1(t) = u(t) and output v_2(t).]

The encoder in the figure above is systematic since one of its outputs is equal to the input, i.e. v_1(t) = u(t). The rate R of this code is 1/2 and its memory is M = 3.

A Rate-2/3 Convolutional Code

[Figure: encoder with inputs u_1(t) and u_2(t), two delay elements D, and outputs v_1(t), v_2(t), v_3(t).]

For every k = 2 binary input symbols the encoder in the figure above produces n = 3 binary output symbols. Therefore its rate is R = k/n = 2/3. The memory of this encoder is M = 1, since only the past inputs u_1(t - 1) and u_2(t - 1) are used to produce v_1(t), v_2(t), and v_3(t).

Transmission via a BSC

Suppose that we use our example encoder and take the length T of the input words u = (u_1, u_2, ..., u_T) equal to 6. We then get codewords x = (x_1, x_2, ..., x_{2(T+2)}) = (v_1(1), v_2(1), v_1(2), ..., v_2(T + 2)) with codeword length 2(T + 2) = 16. We assume that all codewords are equiprobable. These codewords are transmitted over a binary symmetric channel (BSC) with cross-over probability 0 ≤ p ≤ 1/2.

[Figure: BSC with input x and output y; a bit is passed unchanged with probability 1 - p and flipped with probability p.]

Now suppose that we receive

y = (y_1, y_2, ..., y_{2(T+2)}) = (10, 11, 00, 11, 10, 11, 11, 00).   (9)

How should we efficiently decode this received sequence?
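
A BSC is easy to simulate; the following sketch (function name and seed are illustrative choices) flips each transmitted bit independently with probability p.

```python
import random

def bsc(x, p, rng=random.Random(1)):
    """Transmit the bit list x over a BSC with cross-over probability p."""
    return [bit ^ int(rng.random() < p) for bit in x]

# Example: y = bsc(codeword, p=0.1) yields a noisy 16-bit received word,
# differing from the transmitted codeword in a few positions, like the y in (9).
```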

Communicating a Message, Error Probability

[Figure: block diagram of message source P(m), encoder e(m), channel P(y|x), and decoder d(y).]

Consider the communication system in the figure. A message source produces message m ∈ M with a-priori probability Pr{M = m}. An encoder transforms the message into a channel input¹ x ∈ X, hence x = e(m). Now the channel output² y ∈ Y is received with probability Pr{Y = y | X = x}. The decoder observes y and produces an estimate m̂ ∈ M of the transmitted message, hence m̂ = d(y). How should the decoding rule d(·) be chosen such that the error probability

P_e = Pr{M̂ ≠ M}   (10)

is minimized?

¹ This is in general a sequence.
² Typically a sequence.

The Maximum A-Posteriori Probability (MAP) Rule

First we form an upper bound on the probability that no error occurs:

1 - P_e = Σ_y Pr{M = d(y), Y = y}
        = Σ_y Pr{Y = y} Pr{M = d(y) | Y = y}
        ≤ Σ_y Pr{Y = y} max_m Pr{M = m | Y = y}.   (11)

Observe that equality is achieved if and only if³

d(y) = arg max_m Pr{M = m | Y = y}, for all y that can occur.   (12)

Since {Pr{M = m | Y = y}, m ∈ M} are the a-posteriori probabilities after having received y, we call this rule the maximum a-posteriori probability rule (MAP rule).

³ It is possible that the maximum is not attained by a unique m.

The Maximum-Likelihood (ML) Rule

Suppose that all message probabilities are equal. Then

d(y) = arg max_m Pr{M = m | Y = y}
     = arg max_m Pr{M = m} Pr{Y = y | M = m} / Pr{Y = y}
     = arg max_m Pr{Y = y | M = m} / (|M| Pr{Y = y})
     = arg max_m Pr{Y = y | M = m}
     = arg max_m Pr{Y = y | X = e(m)}, for all y that can occur.   (13)

Since {Pr{Y = y | X = e(m)}, m ∈ M} are the likelihoods of receiving y, we call this rule the maximum-likelihood rule (ML rule).

The Minimum-Distance (MD) Rule

Suppose that all message probabilities are equal, and that e(m) is a binary codeword of length L for all m ∈ M. Moreover, let y be the output sequence of a binary symmetric channel with cross-over probability 0 ≤ p ≤ 1/2 when e(m) is its input sequence. Then

Pr{Y = y | X = e(m)} = p^{d_H(e(m),y)} (1 - p)^{L - d_H(e(m),y)} = (1 - p)^L (p/(1 - p))^{d_H(e(m),y)},   (14)

where d_H(·,·) denotes Hamming distance. Since p/(1 - p) < 1 for p < 1/2, maximizing (14) over m amounts to minimizing the Hamming distance, hence

d(y) = arg max_m Pr{Y = y | X = e(m)} = arg min_m d_H(e(m), y), for all y that can occur.   (15)

Since {d_H(e(m), y), m ∈ M} are the Hamming distances between the codewords e(m) and the received sequence y, we call this rule the minimum (Hamming) distance rule (MD rule).

The conclusion is that minimum Hamming distance decoding should be applied to decode y = (10, 11, 00, 11, 10, 11, 11, 00).
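
Before turning to efficient decoding, here is a brute-force sketch of the MD rule for our code: encode all 2^T input words and keep one at minimum Hamming distance from y (the helper names are mine).

```python
from itertools import product

def md_decode_exhaustive(y, T):
    """Exhaustive minimum-Hamming-distance decoding of the textbook rate-1/2 code."""
    def encode(u):                              # the shift-register encoder of (2)-(3)
        s1 = s2 = 0
        out = []
        for ut in list(u) + [0, 0]:
            out += [ut ^ s2, ut ^ s1 ^ s2]
            s1, s2 = ut, s1
        return out

    best_u, best_d = None, len(y) + 1
    for u in product([0, 1], repeat=T):         # all 2**T input words
        d = sum(a != b for a, b in zip(encode(u), y))
        if d < best_d:
            best_u, best_d = list(u), d
    return best_u, best_d

# For y = 10 11 00 11 10 11 11 00 (T = 6) the minimum distance is 4; the lecture
# identifies (0,1,0,0,1,0) and (0,0,0,1,1,0) as two equally good input words.
```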

Complexity of Exhaustive Decoding

We could do an exhaustive search: using minimum-distance (MD) decoding we could simply search all 2^T = 64 codewords. A serious disadvantage of this approach is that the search complexity increases exponentially in the number of input symbols T. We will therefore discuss an efficient method, called the Viterbi algorithm, whose complexity is linear in T.

Transition Table

For our textbook encoder we can determine the transition table. This table contains the output pair v_1 v_2(t) and the next state s_1 s_2(t + 1), given the current state s_1 s_2(t) and the input u(t):

u(t) | s_1 s_2(t) | v_1 v_2(t) | s_1 s_2(t+1)
  0  |    0,0     |    0,0     |    0,0
  1  |    0,0     |    1,1     |    1,0
  0  |    0,1     |    1,1     |    0,0
  1  |    0,1     |    0,0     |    1,0
  0  |    1,0     |    0,1     |    0,1
  1  |    1,0     |    1,0     |    1,1
  0  |    1,1     |    1,0     |    0,1
  1  |    1,1     |    0,1     |    1,1

This table leads to the state diagram.
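
The table can also be generated mechanically from equations (2)-(3); a small sketch (function name is illustrative):

```python
def transition_table():
    """Rows (u(t), s1 s2(t), v1 v2(t), s1 s2(t+1)) of the textbook encoder."""
    rows = []
    for s1, s2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:        # current state s1 s2(t)
        for u in (0, 1):                                   # input u(t)
            v1, v2 = u ^ s2, u ^ s1 ^ s2                   # outputs from (2)
            rows.append((u, (s1, s2), (v1, v2), (u, s1)))  # next state from (3)
    return rows
```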

State Diagram

In the state diagram (see the figure below) states are denoted by s_1 s_2(t). Along the branches that lead from the current state s_1 s_2(t) to the next state s_1 s_2(t + 1) we find the input/output labels u(t)/v_1 v_2(t).

[Figure: state diagram with states 00, 01, 10, 11 and branch labels 0/00, 1/11, 0/11, 1/00, 0/01, 1/10, 0/10, 1/01.]

Trellis Diagram

To see which sequences of states are possible as a function of the time t, we can take a look at the trellis (in Dutch: "hekwerk") diagram, see the figure below. The horizontal axis is the time axis. States are denoted by s_1 s_2(t), and along the branches are again the inputs and outputs u(t)/v_1 v_2(t).

[Figure: trellis diagram with state rows 00, 01, 10, 11 repeated over time, and branch labels 0/00, 1/11, 0/11, 1/00, 0/01, 1/10, 0/10, 1/01 between consecutive time instants.]

Truncated Trellis

Each codeword now corresponds to a path in the trellis diagram. This path starts at t = 1 in state s_1 s_2 = 00, traverses T + 2 branches, and ends at t = T + 3 in state s_1 s_2 = 00 again. The truncated trellis (see the figure below) contains only the states and branches that can actually occur.

[Figure: truncated trellis from START (state 00 at t = 1) to STOP (state 00 at t = T + 3), with the states and branches that cannot occur near the start and the end removed.]

Trellis with Branch Distances

After having received the channel output sequence y, to do MD decoding we must be able to compute Hamming distances d_H(x, y), where x is a codeword. Since

d_H(x, y) = Σ_{t=1}^{T+2} d_H(v_1 v_2(t), y(2t - 1) y(2t)),   (16)

we first determine all branch distances d_H(v_1 v_2(t), y(2t - 1) y(2t)), see the figure.

[Figure: truncated trellis for y = 10 11 00 11 10 11 11 00, each branch labeled with its output pair and its Hamming distance to the corresponding received pair.]
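
The branch distances used in (16) can be tabulated directly from the received word; a small sketch (names are illustrative):

```python
def branch_distances(y):
    """For each trellis section t, the Hamming distance from every possible
    branch label v1 v2 to the received pair y(2t-1) y(2t)."""
    sections = []
    for t in range(1, len(y) // 2 + 1):
        yt = (y[2 * t - 2], y[2 * t - 1])
        sections.append({(a, b): (a != yt[0]) + (b != yt[1])
                         for a in (0, 1) for b in (0, 1)})
    return sections
```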

Viterbi's Principle

Let s = s_1 s_2 and v = v_1 v_2.

[Figure: two states s′(t) and s″(t) at time t, connected by branches with outputs v′(t) and v″(t) to state s(t + 1).]

Assume that state s(t + 1) at time t + 1 can be reached only via the states s′(t) and s″(t) at time t, through branches with outputs v′(t) and v″(t) respectively. Then (Viterbi [1967]):

best path to s(t + 1) = best of (all paths to s′(t) extended by v′(t), all paths to s″(t) extended by v″(t))
                      = best of (best path to s′(t) extended by v′(t), best path to s″(t) extended by v″(t)).

This principle can be used recursively: first determine the best path leading from the start state s(1) to all states at time 2, then the best path to all states at time 3, etc., and finally determine the best path to the final state s(T + 3). Dijkstra's shortest-path algorithm [1959] is more general and more complex than the Viterbi method.

The Viterbi Algorithm

Define D_s(t) to be the total distance of a best path leading to state s at time t, and let B_s(t) denote such a best path. Define d_{s′,s}(t - 1, t) to be the distance corresponding to the branch connecting state s′ at time t - 1 to state s at time t, and let b_{s′,s}(t - 1, t) denote this branch.

1. Set t := 1. Set the total distance of the starting state D_00(1) := 0 and set the best path leading to it B_00(1) := ∅, i.e. equal to the empty path.

2. Increment t, i.e. t := t + 1. For all possible states s at time t, let A_s(t) be the set of states at time t - 1 that have a branch leading to state s at time t. Assume that s′ ∈ A_s(t) minimizes D_{s′}(t - 1) + d_{s′,s}(t - 1, t), i.e. survives. Then set

D_s(t) := D_{s′}(t - 1) + d_{s′,s}(t - 1, t),
B_s(t) := B_{s′}(t - 1) ∥ b_{s′,s}(t - 1, t),

where ∥ denotes concatenation.

3. If t = T + 3, output the best path B_00(t); otherwise go to step 2.
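
A Python sketch of this algorithm for the textbook encoder follows; it stores the surviving input sequence per state instead of performing a separate trace-back step, and the variable names are illustrative.

```python
def viterbi_decode(y, T):
    """Minimum-distance decoding of the rate-1/2 textbook code over the truncated trellis."""
    INF = float("inf")
    states = [(0, 0), (0, 1), (1, 0), (1, 1)]
    D = {s: INF for s in states}     # D[s]: distance of a best path into state s
    B = {s: [] for s in states}      # B[s]: input bits of that best path
    D[(0, 0)] = 0                    # start in the all-zero state
    for t in range(1, T + 3):
        yt = (y[2 * t - 2], y[2 * t - 1])
        newD = {s: INF for s in states}
        newB = {s: [] for s in states}
        for (s1, s2), dist in D.items():
            if dist == INF:
                continue
            for u in (0, 1):
                if t > T and u == 1:          # tail symbols: u(T+1) = u(T+2) = 0
                    continue
                v = (u ^ s2, u ^ s1 ^ s2)     # branch output v1 v2(t)
                nxt = (u, s1)                 # next state s1 s2(t+1)
                cand = dist + (v[0] != yt[0]) + (v[1] != yt[1])   # add
                if cand < newD[nxt]:          # compare, select the survivor
                    newD[nxt], newB[nxt] = cand, B[(s1, s2)] + [u]
        D, B = newD, newB
    return B[(0, 0)][:T], D[(0, 0)]  # decoded input word and its distance to y
```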

Forward: Add, Compare, and Select

Best-path metrics are denoted in the states. An asterisk denotes that either of the two incoming paths into a state can be chosen as survivor.

[Figure: truncated trellis for y = 10 11 00 11 10 11 11 00, with the branch distances along the branches and the accumulated best-path metrics written in the states; ties are marked with an asterisk.]

Note that there is a best path (codeword) at distance 4 from the received sequence.

Backward: Trace Back

Tracing back from the stop state results in the decoded path (00, 11, 01, 11, 11, 01, 11, 00). The corresponding input sequence is (0, 1, 0, 0, 1, 0). An equally good input sequence would be (0, 0, 0, 1, 1, 0). The second and fourth input digits are therefore not so reliable...

[Figure: the trellis of the previous slide with the traced-back best path x = 00 11 01 11 11 01 11 00 highlighted.]
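
Running the viterbi_decode sketch from the previous slide on the received word reproduces this result; which of the equally good inputs is returned depends on how ties among survivors are broken.

```python
y = [1,0, 1,1, 0,0, 1,1, 1,0, 1,1, 1,1, 0,0]   # the received word y of (9)
u_hat, dist = viterbi_decode(y, T=6)
# dist == 4; u_hat is an input word whose codeword lies at distance 4 from y,
# e.g. (0,1,0,0,1,0) or (0,0,0,1,1,0) as identified in the lecture.
```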

Complexity

Fortunately the complexity of the Viterbi algorithm is linear in the input length T (and hence in the codeword length). At each time instant we have to add, compare, and select (ACS) in every state. The complexity is therefore also linear in the number of states at each time instant, which is 4 in our case. In general the number of states is 2^m, where m is the number of delay elements in the encoder. Therefore Viterbi decoding is in practice only feasible (now) if m is not much larger than, say, 10, i.e. if the number of states is not much more than 2^10 = 1024. Later we shall see that the code performance improves for increasing values of m.

Exercise

We transmit an information word (x(1), x(2), x(3), x(4), x(5)) over an inter-symbol-interference (ISI) channel. This information word is preceded and followed by zeroes, hence

x(t) = 0 for integer t ∉ {1, 2, 3, 4, 5},
x(t) ∈ {-1, +1} for t ∈ {1, 2, 3, 4, 5}.

All 32 information words occur with equal probability. For the ISI channel we have, for integer times t,

y(t) = x(t) + x(t - 1) + n(t),

where the probability density function of the noise n(t) is given by

p(n) = (1/√(2π)) exp(-n²/2),

thus n(t) has a Gaussian density. The received sequence satisfies y(1) = +0.3, y(2) = +0.2, y(3) = +0.1, y(4) = -1.1, y(5) = +2.5, and y(6) = +0.5. Decode the information word with a decoder that minimizes the word-error probability. Show first that the decoder should minimize (squared) Euclidean distance.
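
A sketch of the requested decoder, assuming the reduction to squared Euclidean distance has already been justified: a two-state Viterbi pass whose trellis state is the previous symbol x(t - 1), with x(0) = x(6) = 0. The structure and names are my own, not the intended worked solution.

```python
def viterbi_isi(y):
    """Minimum squared-Euclidean-distance decoding of (x(1), ..., x(5)) from y(1..6)."""
    INF = float("inf")
    D, B = {0: 0.0}, {0: []}                    # start state x(0) = 0
    for t in range(1, 6):                       # information symbols x(1)..x(5)
        newD, newB = {}, {}
        for prev, dist in D.items():
            for x in (-1, +1):
                cand = dist + (y[t - 1] - x - prev) ** 2   # branch metric (y(t)-x(t)-x(t-1))^2
                if cand < newD.get(x, INF):
                    newD[x], newB[x] = cand, B[prev] + [x]
        D, B = newD, newB
    # Final section t = 6: x(6) = 0 is forced, contributing (y(6) - x(5))^2.
    best = min(D, key=lambda s: D[s] + (y[5] - s) ** 2)
    return B[best]

# x_hat = viterbi_isi([0.3, 0.2, 0.1, -1.1, 2.5, 0.5])
```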