Modulation & Coding for the Gaussian Channel
Trivandrum School on Communication, Coding & Networking, January 27–30, 2017
Lakshmi Prasad Natarajan, Dept. of Electrical Engineering, Indian Institute of Technology Hyderabad
lakshminatarajan@iith.ac.in
1 / 54
Digital Communication
Convey a message from transmitter to receiver in a finite amount of time, where the message can assume only finitely many values.
• Time can be replaced with any other resource: space available on a compact disc, number of cells in a flash memory.
Picture courtesy brigetteheffernan.wordpress.com
2 / 54
The Additive Noise Channel
• Message m takes finitely many, say M, distinct values.
  Usually, but not always, M = 2^k for some integer k.
  We assume m is uniformly distributed over {1, ..., M}.
• Time duration T: the transmit signal s(t) is restricted to 0 ≤ t ≤ T.
• Number of message bits k = log₂ M (not always an integer).
3 / 54
Modulation Scheme
• The transmitter & receiver agree upon a set of waveforms {s₁(t), ..., s_M(t)} of duration T.
• The transmitter uses the waveform s_i(t) for the message m = i.
• The receiver must guess the value of m given r(t).
• We say that a decoding error occurs if the guess m̂ ≠ m.
Definition: An M-ary modulation scheme is simply a set of M waveforms {s₁(t), ..., s_M(t)}, each of duration T.
Terminology
• Binary: M = 2, modulation scheme {s₁(t), s₂(t)}
• Antipodal: M = 2 and s₂(t) = −s₁(t)
• Ternary: M = 3; Quaternary: M = 4
4 / 54
Parameters of Interest
• Bit rate R = (log₂ M)/T bits/sec
• Energy of the i-th waveform E_i = ‖s_i(t)‖² = ∫₀ᵀ s_i²(t) dt
• Average energy E = Σ_{i=1}^{M} P(m = i) E_i = (1/M) Σ_{i=1}^{M} ‖s_i(t)‖²
• Energy per message bit E_b = E / log₂ M
• Probability of error P_e = P(m̂ ≠ m)
Note: P_e depends on the modulation scheme, the noise statistics and the demodulator.
5 / 54
Example: On-Off Keying, M = 2 6 / 54
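To make these definitions concrete, here is a minimal numerical sketch in Python/NumPy that computes E_i, E, E_b and R for an on-off keying scheme. The rectangular pulses and the values A = 1, T = 1 ms are assumptions chosen only for illustration.

```python
import numpy as np

# Assumed OOK scheme: s1(t) = 0 and s2(t) = A on [0, T].
A, T, M = 1.0, 1e-3, 2
t, dt = np.linspace(0.0, T, 1000, retstep=True)
s1, s2 = np.zeros_like(t), A * np.ones_like(t)

E1 = np.sum(s1**2) * dt    # energy of s1(t): Riemann sum for the integral
E2 = np.sum(s2**2) * dt    # energy of s2(t), ~ A^2 T
E = (E1 + E2) / M          # average energy (equally likely messages)
Eb = E / np.log2(M)        # energy per message bit
R = np.log2(M) / T         # bit rate in bits/sec

print(E1, E2, E, Eb, R)    # 0.0, ~1e-3, ~5e-4, ~5e-4, 1000.0
```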
Objectives
1. Characterize and analyze a modulation scheme in terms of energy, rate and error probability. What is the best/optimal performance that one can expect?
2. Design a good modulation scheme that performs close to the theoretical optimum.
Key tool: Signal Space Representation
• Represent waveforms as vectors: brings out the geometry of the problem
• Simplifies performance analysis and modulation design
• Leads to efficient modulation/demodulation implementations
7 / 54
1. Signal Space Representation
2. Vector Gaussian Channel
3. Vector Gaussian Channel (contd.)
4. Optimum Detection
5. Probability of Error
8 / 54
References
• J. M. Wozencraft and I. M. Jacobs, Principles of Communication Engineering, Wiley, 1965.
• G. D. Forney and G. Ungerboeck, "Modulation and coding for linear Gaussian channels," IEEE Transactions on Information Theory, vol. 44, no. 6, pp. 2384–2415, Oct. 1998.
• D. Slepian and H. O. Pollak, "Prolate spheroidal wave functions, Fourier analysis and uncertainty - I," The Bell System Technical Journal, vol. 40, no. 1, pp. 43–63, Jan. 1961.
• H. J. Landau and H. O. Pollak, "Prolate spheroidal wave functions, Fourier analysis and uncertainty - III: The dimension of the space of essentially time- and band-limited signals," The Bell System Technical Journal, vol. 41, no. 4, pp. 1295–1336, July 1962.
8 / 54
1. Signal Space Representation
2. Vector Gaussian Channel
3. Vector Gaussian Channel (contd.)
4. Optimum Detection
5. Probability of Error
9 / 54
Goal: Map the waveforms s₁(t), ..., s_M(t) to M vectors in a Euclidean space R^N, so that the map preserves the mathematical structure of the waveforms.
9 / 54
Quick Review of R^N: the N-Dimensional Euclidean Space
R^N = { (x₁, x₂, ..., x_N) : x₁, ..., x_N ∈ R }
Notation: x = (x₁, x₂, ..., x_N) and 0 = (0, 0, ..., 0)
Addition Properties:
• x + y = (x₁ + y₁, ..., x_N + y_N)
• x − y = (x₁ − y₁, ..., x_N − y_N)
• x + 0 = x for every x ∈ R^N
Multiplication Properties:
• a x = a (x₁, ..., x_N) = (a x₁, ..., a x_N), where a ∈ R
• a(x + y) = ax + ay
• (a + b)x = ax + bx
• ax = 0 if and only if a = 0 or x = 0
10 / 54
Quick Review of R^N: Inner Product and Norm
Inner Product
• ⟨x, y⟩ = ⟨y, x⟩ = x₁y₁ + x₂y₂ + ... + x_N y_N
• ⟨x, y + z⟩ = ⟨x, y⟩ + ⟨x, z⟩ (distributive law)
• ⟨ax, y⟩ = a⟨x, y⟩
• If ⟨x, y⟩ = 0 we say that x and y are orthogonal
Norm
• ‖x‖ = √(x₁² + ... + x_N²) = √⟨x, x⟩ denotes the length of x
• ‖x‖² = ⟨x, x⟩ denotes the energy of the vector x
• ‖x‖² = 0 if and only if x = 0
• If ‖x‖ = 1 we say that x is of unit norm
• ‖x − y‖ is the distance between the two vectors
Cauchy-Schwarz Inequality
• |⟨x, y⟩| ≤ ‖x‖ ‖y‖
• Or equivalently, −1 ≤ ⟨x, y⟩ / (‖x‖ ‖y‖) ≤ 1
11 / 54
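A quick NumPy check of these properties; the two vectors are arbitrary choices, only the identities matter:

```python
import numpy as np

x = np.array([1.0, 2.0, -1.0])
y = np.array([0.5, -1.0, 3.0])

ip = np.dot(x, y)                 # <x, y> = x1 y1 + ... + xN yN
norm_x = np.linalg.norm(x)        # ||x|| = sqrt(<x, x>)
dist = np.linalg.norm(x - y)      # distance between x and y

# Cauchy-Schwarz: |<x, y>| <= ||x|| ||y||
print(abs(ip) <= norm_x * np.linalg.norm(y))   # True
```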
Waveforms as Vectors
The set of all finite-energy waveforms of duration T and the Euclidean space R^N share many structural properties.
Addition Properties
• We can add and subtract two waveforms: x(t) + y(t), x(t) − y(t)
• The all-zero waveform 0(t) = 0 for 0 ≤ t ≤ T is the additive identity: x(t) + 0(t) = x(t) for any waveform x(t)
Multiplication Properties
• We can scale x(t) by a real number a and obtain a x(t)
• a(x(t) + y(t)) = ax(t) + ay(t)
• (a + b)x(t) = ax(t) + bx(t)
• ax(t) = 0(t) if and only if a = 0 or x(t) = 0(t)
12 / 54
Inner Product and Norm of Waveforms
Inner Product
• ⟨x(t), y(t)⟩ = ⟨y(t), x(t)⟩ = ∫₀ᵀ x(t) y(t) dt
• ⟨x(t), y(t) + z(t)⟩ = ⟨x(t), y(t)⟩ + ⟨x(t), z(t)⟩ (distributive law)
• ⟨ax(t), y(t)⟩ = a⟨x(t), y(t)⟩
• If ⟨x(t), y(t)⟩ = 0 we say that x(t) and y(t) are orthogonal
Norm
• ‖x(t)‖ = √⟨x(t), x(t)⟩ = √(∫₀ᵀ x²(t) dt) is the norm of x(t)
• ‖x(t)‖² = ∫₀ᵀ x²(t) dt denotes the energy of x(t)
• If ‖x(t)‖ = 1 we say that x(t) is of unit norm
• ‖x(t) − y(t)‖ is the distance between two waveforms
Cauchy-Schwarz Inequality
• |⟨x(t), y(t)⟩| ≤ ‖x(t)‖ ‖y(t)‖ for any two waveforms x(t), y(t)
We want to map s₁(t), ..., s_M(t) to vectors s₁, ..., s_M ∈ R^N so that the addition, multiplication, inner product and norm properties are preserved.
13 / 54
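Numerically, waveform inner products are just sampled integrals. A minimal sketch, using a sine/cosine pair over one full period because they are exactly orthogonal there:

```python
import numpy as np

T = 1.0
t, dt = np.linspace(0.0, T, 10_000, retstep=True)

x = np.sin(2 * np.pi * t)     # one full period on [0, T]
y = np.cos(2 * np.pi * t)

def wf_inner(u, v):
    """Riemann-sum approximation of <u(t), v(t)> = integral of u(t)v(t)dt."""
    return np.sum(u * v) * dt

print(wf_inner(x, y))         # ~0: x(t) and y(t) are orthogonal
print(wf_inner(x, x))         # ~T/2 = 0.5: energy of sin over one period
```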
Orthonormal Waveforms
Definition: A set of N waveforms {φ₁(t), ..., φ_N(t)} is said to be orthonormal if
1. ‖φ₁(t)‖ = ‖φ₂(t)‖ = ... = ‖φ_N(t)‖ = 1 (unit norm)
2. ⟨φ_i(t), φ_j(t)⟩ = 0 for all i ≠ j (orthogonality)
The role of orthonormal waveforms is similar to that of the standard basis e₁ = (1, 0, 0, ..., 0), e₂ = (0, 1, 0, ..., 0), ..., e_N = (0, 0, ..., 0, 1).
Remark: Say x(t) = x₁φ₁(t) + ... + x_Nφ_N(t) and y(t) = y₁φ₁(t) + ... + y_Nφ_N(t). Then
⟨x(t), y(t)⟩ = Σ_{i=1}^{N} Σ_{j=1}^{N} x_i y_j ⟨φ_i(t), φ_j(t)⟩ = Σ_{i=1}^{N} x_i y_i = ⟨x, y⟩
14 / 54
Example 15 / 54
Orthonormal Basis
Definition: An orthonormal basis for {s₁(t), ..., s_M(t)} is an orthonormal set {φ₁(t), ..., φ_N(t)} such that
s_i(t) = s_{i,1} φ₁(t) + s_{i,2} φ₂(t) + ... + s_{i,N} φ_N(t)
for some choice of s_{i,1}, s_{i,2}, ..., s_{i,N} ∈ R.
• We associate s_i(t) ↔ s_i = (s_{i,1}, s_{i,2}, ..., s_{i,N})
• A given modulation scheme can have many orthonormal bases.
• The map s₁(t) ↔ s₁, s₂(t) ↔ s₂, ..., s_M(t) ↔ s_M depends on the choice of orthonormal basis.
16 / 54
Modulation Scheme Example: M-ary Phase Shift Keying
• s_i(t) = A cos(2πf_c t + 2πi/M), i = 1, ..., M
• Expanding s_i(t) using cos(C + D) = cos C cos D − sin C sin D:
s_i(t) = A cos(2πi/M) cos(2πf_c t) − A sin(2πi/M) sin(2πf_c t)
Orthonormal Basis
• Use φ₁(t) = √(2/T) cos(2πf_c t) and φ₂(t) = −√(2/T) sin(2πf_c t), so that
s_i(t) = A√(T/2) cos(2πi/M) φ₁(t) + A√(T/2) sin(2πi/M) φ₂(t)
• Dimension N = 2
Waveform to Vector
s_i(t) ↔ ( A√(T/2) cos(2πi/M), A√(T/2) sin(2πi/M) )
17 / 54
8-ary Phase Shift Keying 18 / 54
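The mapping above turns M-PSK into M equally spaced points on a circle of radius A√(T/2) in R². A small sketch generating the 8-PSK signal set (A = T = 1 are assumed values for illustration):

```python
import numpy as np

A, T, M = 1.0, 1.0, 8
i = np.arange(1, M + 1)
radius = A * np.sqrt(T / 2)

# s_i = (radius cos(2 pi i / M), radius sin(2 pi i / M))
points = np.stack([radius * np.cos(2 * np.pi * i / M),
                   radius * np.sin(2 * np.pi * i / M)], axis=1)
print(points.round(3))
print(np.linalg.norm(points, axis=1))  # all norms equal: equal-energy scheme
```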
How to Find an Orthonormal Basis: the Gram-Schmidt Procedure
Given a modulation scheme {s₁(t), ..., s_M(t)}, the Gram-Schmidt procedure constructs an orthonormal basis φ₁(t), ..., φ_N(t) for the scheme.
It is similar to the QR factorization of matrices:
A = [a₁ a₂ ⋯ a_M] = [q₁ q₂ ⋯ q_N] R = QR, where R = [r_{j,i}] is the N × M matrix of coefficients.
Analogously,
[s₁(t) ⋯ s_M(t)] = [φ₁(t) ⋯ φ_N(t)] S
where column i of the N × M coefficient matrix S is (s_{i,1}, s_{i,2}, ..., s_{i,N})ᵀ, the vector representation of s_i(t).
19 / 54
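A compact sketch of the Gram-Schmidt procedure on sampled waveforms (one waveform per column of S). The example waveforms are arbitrary; the third is a linear combination of the first two, so the basis has N = 2 < M = 3 elements:

```python
import numpy as np

def gram_schmidt(S, tol=1e-10):
    """Return a matrix whose columns are an orthonormal basis for the
    columns of S. Each new column has its projections onto the existing
    basis removed; near-zero residuals (dependent waveforms) are dropped."""
    basis = []
    for s in S.T:
        residual = s.astype(float).copy()
        for phi in basis:
            residual -= np.dot(residual, phi) * phi
        norm = np.linalg.norm(residual)
        if norm > tol:
            basis.append(residual / norm)
    return np.array(basis).T

S = np.array([[1.0, 0.0, 1.0],      # columns: s1, s2, s3 = s1 + s2
              [1.0, 1.0, 2.0],
              [0.0, 1.0, 1.0]])
Phi = gram_schmidt(S)
print(Phi.shape)                    # (3, 2): N = 2 < M = 3
print(np.round(Phi.T @ Phi, 10))    # identity: orthonormal columns
```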
Waveforms to Vectors, and Back
Say {φ₁(t), ..., φ_N(t)} is an orthonormal basis for {s₁(t), ..., s_M(t)}. Then s_i(t) = Σ_{j=1}^{N} s_{i,j} φ_j(t) for some choice of {s_{i,j}}.
Waveform to Vector
⟨s_i(t), φ_j(t)⟩ = ⟨Σ_k s_{i,k} φ_k(t), φ_j(t)⟩ = Σ_k s_{i,k} ⟨φ_k(t), φ_j(t)⟩ = s_{i,j}
s_i(t) → (s_{i,1}, s_{i,2}, ..., s_{i,N}) = s_i
where s_{i,1} = ⟨s_i(t), φ₁(t)⟩, s_{i,2} = ⟨s_i(t), φ₂(t)⟩, ..., s_{i,N} = ⟨s_i(t), φ_N(t)⟩.
Vector to Waveform
s_i = (s_{i,1}, ..., s_{i,N}) → s_{i,1} φ₁(t) + s_{i,2} φ₂(t) + ... + s_{i,N} φ_N(t)
• Every point in R^N corresponds to a unique waveform.
• Going back and forth between vectors and waveforms is easy.
20 / 54
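The round trip in code, using an assumed cosine/sine basis sampled on [0, T]; inner products are approximated by dot products times dt:

```python
import numpy as np

T = 1.0
t, dt = np.linspace(0.0, T, 10_000, retstep=True)

# Assumed orthonormal basis: sqrt(2/T) cos and sin, 4 cycles per T.
phi = np.stack([np.sqrt(2 / T) * np.cos(2 * np.pi * 4 * t / T),
                np.sqrt(2 / T) * np.sin(2 * np.pi * 4 * t / T)])

s = 3.0 * phi[0] - 1.5 * phi[1]   # waveform with known coefficients

coeffs = phi @ s * dt             # waveform -> vector: s_j = <s(t), phi_j(t)>
print(np.round(coeffs, 3))        # ~[ 3.0, -1.5]

s_hat = coeffs @ phi              # vector -> waveform
print(np.max(np.abs(s_hat - s)))  # ~0: s(t) is recovered
```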
Waveforms to Vectors, and Back: Caveat
v(t) → [waveform to vector] → v → [vector to waveform] → v̂(t)
v̂(t) = v(t) if and only if v(t) is some linear combination of φ₁(t), ..., φ_N(t), or equivalently, v(t) is some linear combination of s₁(t), ..., s_M(t).
21 / 54
Equivalence Between Waveform and Vector Representations
Say v(t) = v₁φ₁(t) + ... + v_Nφ_N(t) and u(t) = u₁φ₁(t) + ... + u_Nφ_N(t).
Addition: v(t) + u(t) ↔ v + u
Scalar multiplication: a v(t) ↔ a v
Energy: ‖v(t)‖² = ‖v‖²
Inner product: ⟨v(t), u(t)⟩ = ⟨v, u⟩
Distance: ‖v(t) − u(t)‖ = ‖v − u‖
Basis: φ_i(t) ↔ e_i (standard basis)
22 / 54
1. Signal Space Representation
2. Vector Gaussian Channel
3. Vector Gaussian Channel (contd.)
4. Optimum Detection
5. Probability of Error
23 / 54
Vector Gaussian Channel
Definition: An M-ary modulation scheme of dimension N is a set of M vectors {s₁, ..., s_M} in R^N.
• Average energy E = (1/M)(‖s₁‖² + ... + ‖s_M‖²)
23 / 54
Vector Gaussian Channel
Relation between the received vector r and the transmit vector s_i:
The j-th component of the received vector r = (r₁, ..., r_N) is
r_j = ⟨r(t), φ_j(t)⟩ = ⟨s_i(t) + n(t), φ_j(t)⟩ = ⟨s_i(t), φ_j(t)⟩ + ⟨n(t), φ_j(t)⟩ = s_{i,j} + n_j
Denoting n = (n₁, ..., n_N) we obtain r = s_i + n.
If n(t) is a Gaussian random process, the noise vector n follows a Gaussian distribution.
Note: The effective noise at the receiver is n̂(t) = n₁φ₁(t) + ... + n_Nφ_N(t). In general, n(t) is not a linear combination of the basis waveforms, and n̂(t) ≠ n(t).
24 / 54
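One use of the vector channel in code. The QPSK-like signal set and the noise level are assumptions for illustration; the per-dimension variance N_o/2 is justified a few slides later:

```python
import numpy as np

rng = np.random.default_rng(0)

S = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
No = 0.2                               # assumed noise level; var = No/2

i = rng.integers(len(S))               # uniformly distributed message m = i
n = rng.normal(0.0, np.sqrt(No / 2), size=2)
r = S[i] + n                           # vector Gaussian channel: r = s_i + n
print(i, np.round(r, 3))
```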
Designing a Modulation Scheme
1. Choose an orthonormal basis φ₁(t), ..., φ_N(t).
   This determines the bandwidth of the transmit signals and the signalling duration T.
2. Construct a (vector) modulation scheme s₁, ..., s_M ∈ R^N.
   This determines the signal energy and the probability of error.
An N-dimensional modulation scheme exploits N uses of a scalar Gaussian channel: r_j = s_{i,j} + n_j, where j = 1, ..., N.
With limits on bandwidth and signal duration, how large can N be?
25 / 54
Dimension of Time/Band-limited Signals
Say the transmit signals s(t) must be time- and band-limited:
1. s(t) = 0 if t < 0 or t > T (time-limited)
2. S(f) = 0 if f < f_c − W/2 or f > f_c + W/2 (band-limited)
Uncertainty Principle: No non-zero signal is both time- and band-limited. No signal transmission is possible!
We relax the constraint to approximately band-limited:
1. s(t) = 0 if t < 0 or t > T (time-limited)
2. ∫_{f_c−W/2}^{f_c+W/2} |S(f)|² df ≥ (1 − δ) ∫_0^{+∞} |S(f)|² df (approx. band-limited)
Here δ > 0 is the fraction of out-of-band signal energy.
What is the largest dimension N of time-limited/approximately band-limited signals?
26 / 54
Dimension of Time/Band-limited Signals
Let T > 0 and W > 0 be given, and consider any δ, ε > 0.
Theorem (Landau, Pollak & Slepian, 1961-62)
If TW is sufficiently large, there exist N = 2TW(1 − ε) orthonormal waveforms φ₁(t), ..., φ_N(t) such that
1. φ_i(t) = 0 if t < 0 or t > T (time-limited)
2. ∫_{f_c−W/2}^{f_c+W/2} |Φ_i(f)|² df ≥ (1 − δ) ∫_0^{+∞} |Φ_i(f)|² df (approx. band-limited)
In summary:
• We can pack N ≈ 2TW dimensions if the time-bandwidth product TW is large enough.
• Number of dimensions/channel uses, normalized to 1 sec of transmit duration and 1 Hz of bandwidth: N/(TW) ≈ 2 dim/sec/Hz.
27 / 54
Relation between Waveform & Vector Channels
Assume N = 2TW.
Signal energy: E_i = ‖s_i(t)‖² = ‖s_i‖²
Avg. energy: E = (1/M) Σ_i ‖s_i(t)‖² = (1/M) Σ_i ‖s_i‖²
Transmit power: S = E/T = 2W E/N
Rate: R = (log₂ M)/T = 2W (log₂ M)/N
Parameters for the Vector Gaussian Channel
• Spectral efficiency η = 2 log₂ M / N (unit: bits/sec/Hz)
  Allows comparison between schemes with different bandwidths. Related to rate as η = R/W.
• Power P = E/N (unit: Watt/Hz)
  Related to actual transmit power as S = 2WP.
28 / 54
1. Signal Space Representation
2. Vector Gaussian Channel
3. Vector Gaussian Channel (contd.)
4. Optimum Detection
5. Probability of Error
29 / 54
Detection in the Gaussian Channel
Definition: Detection/decoding/demodulation is the process of estimating the message m given the received waveform r(t) and the modulation scheme {s₁(t), ..., s_M(t)}.
Objective: Design the decoder to minimize P_e = P(m̂ ≠ m).
29 / 54
The Gaussian Random Variable
Let X be Gaussian with mean 0 and variance 1, i.e., X ~ N(0, 1).
• P(X < −a) = P(X > a) = Q(a)
• Q(·) is a decreasing function
• Y = σX is Gaussian with mean 0 and variance σ², i.e., Y ~ N(0, σ²)
• P(Y > b) = P(σX > b) = P(X > b/σ) = Q(b/σ)
30 / 54
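Q(·) has no elementary closed form, but it can be evaluated through the complementary error function via the identity Q(a) = erfc(a/√2)/2:

```python
from math import erfc, sqrt

def Q(a):
    """Gaussian tail probability Q(a) = P(X > a) for X ~ N(0, 1)."""
    return 0.5 * erfc(a / sqrt(2))

print(Q(0.0))        # 0.5
print(Q(1.0))        # ~0.1587
print(Q(3.0))        # ~0.00135: Q decreases rapidly
```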
White Gaussian Noise Process n(t)
The noise waveform n(t) is modelled as a white Gaussian random process, i.e., as a collection of random variables {n(τ) : −∞ < τ < +∞} such that
• Stationary random process: the statistics of the processes n(t) and n(t − constant) are identical.
• Gaussian random process: any linear combination of finitely many samples of n(t) is Gaussian distributed: a₁n(t₁) + a₂n(t₂) + ... + a_l n(t_l) is Gaussian.
• White random process: the power spectrum N(f) of the noise process is flat: N(f) = N_o/2 W/Hz, for −∞ < f < +∞.
31 / 54
32 / 54
Noise Process Through the Waveform-to-Vector Converter
Properties of the noise vector n = (n₁, ..., n_N):
• n₁, n₂, ..., n_N are independent N(0, N_o/2) random variables:
f(n_i) = (1/√(πN_o)) exp(−n_i²/N_o)
• The noise vector n describes only a part of n(t):
n̂(t) = n₁φ₁(t) + ... + n_Nφ_N(t) ≠ n(t)
• The noise component not captured by the waveform-to-vector converter: ñ(t) = n(t) − n̂(t) ≠ 0
33 / 54
White Gaussian Noise Vector n = (n₁, ..., n_N)
• Probability density of n = (n₁, ..., n_N) in R^N:
f_noise(n) = f(n₁, ..., n_N) = Π_{i=1}^{N} f(n_i) = (1/(√(πN_o))^N) exp(−‖n‖²/N_o)
  The density depends only on ‖n‖²: it is spherically symmetric (isotropic).
  The density is highest near 0 and decreasing in ‖n‖²: a noise vector of larger norm is less likely than a vector with smaller norm.
• For any a ∈ R^N, ⟨n, a⟩ ~ N(0, ‖a‖² N_o/2)
• If a₁, ..., a_K are orthonormal, then ⟨n, a₁⟩, ..., ⟨n, a_K⟩ are independent N(0, N_o/2)
34 / 54
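An empirical check of the projection properties; the dimension, N_o and the test vectors are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
No, N, trials = 2.0, 4, 200_000

n = rng.normal(0.0, np.sqrt(No / 2), size=(trials, N))  # i.i.d. N(0, No/2)

a = np.array([3.0, 0.0, -4.0, 0.0])        # ||a|| = 5
print(np.var(n @ a))                       # ~ ||a||^2 No/2 = 25.0

# Projections onto orthonormal directions: uncorrelated (here, independent).
a1 = np.array([1.0, 0.0, 0.0, 0.0])
a2 = np.array([0.0, 1.0, 0.0, 0.0])
print(np.corrcoef(n @ a1, n @ a2)[0, 1])   # ~0
```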
ñ(t) Carries Irrelevant Information
• r = s_i + n does not carry all the information in r(t):
r̂(t) = r₁φ₁(t) + ... + r_Nφ_N(t) ≠ r(t)
• The information about r(t) not contained in r:
r(t) − Σ_j r_j φ_j(t) = s_i(t) + n(t) − Σ_j s_{i,j} φ_j(t) − Σ_j n_j φ_j(t) = ñ(t)
Theorem: The vector r contains all the information in r(t) that is relevant to the transmitted message.
• ñ(t) is irrelevant for the optimum detection of the transmit message.
35 / 54
The (Effective) Vector Gaussian Channel
• A modulation scheme/code is a set {s₁, ..., s_M} of M vectors in R^N
• Power P = (1/N) · (‖s₁‖² + ... + ‖s_M‖²)/M
• Noise variance σ² = N_o/2 (per dimension)
• Signal-to-noise ratio SNR = P/σ² = 2P/N_o
• Spectral efficiency η = 2 log₂ M / N bits/sec/Hz (assuming N = 2TW)
36 / 54
1. Signal Space Representation
2. Vector Gaussian Channel
3. Vector Gaussian Channel (contd.)
4. Optimum Detection
5. Probability of Error
37 / 54
Optimum Detection Rule
Objective: Given {s₁, ..., s_M} and r, provide an estimate m̂ of the transmit message m so that P_e = P(m̂ ≠ m) is as small as possible.
Optimal Detection: Maximum a posteriori (MAP) detector
Given the received vector r, choose the vector s_k that has the highest probability of having been transmitted:
m̂ = arg max_{k ∈ {1,...,M}} P(s_k transmitted | r received)
In other words, choose m̂ = k if
P(s_k transmitted | r received) > P(s_j transmitted | r received) for every j ≠ k
• In case of a tie, one of the indices can be chosen arbitrarily. This does not increase P_e.
37 / 54
Optimum Detection Rule
Use Bayes' rule P(A | B) = P(A) P(B | A) / P(B):
m̂ = arg max_k P(s_k | r) = arg max_k P(s_k) f(r | s_k) / f(r)
where
P(s_k) = probability of transmitting s_k = 1/M (equally likely messages)
f(r | s_k) = probability density of r when s_k is transmitted
f(r) = probability density of r averaged over all possible transmissions
m̂ = arg max_k (1/M) f(r | s_k) / f(r) = arg max_k f(r | s_k)
f(r | s_k) is the likelihood function, and m̂ = arg max_k f(r | s_k) is the maximum likelihood rule.
If all M messages are equally likely: MAP detection = maximum likelihood (ML) detection.
38 / 54
Maximum Likelihood Detection in the Vector Gaussian Channel
Use the model r = s_i + n and the assumption that n is independent of s_i:
m̂ = arg max_k f(r | s_k) = arg max_k f_noise(r − s_k | s_k)
   = arg max_k f_noise(r − s_k)
   = arg max_k (1/(√(πN_o))^N) exp(−‖r − s_k‖²/N_o)
   = arg min_k ‖r − s_k‖²
ML Detection Rule for the Vector Gaussian Channel
Choose m̂ = k if ‖r − s_k‖ < ‖r − s_j‖ for every j ≠ k.
• Also called minimum distance/nearest neighbor decoding
• In case of a tie, choose one of the contenders arbitrarily.
39 / 54
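The rule is simply a nearest-neighbor search. A minimal sketch (constellation and noise level are assumed values):

```python
import numpy as np

rng = np.random.default_rng(1)

S = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]])
No = 0.5

def ml_detect(r, S):
    """Minimum-distance (ML) decoding: index of the s_k closest to r."""
    return int(np.argmin(np.sum((S - r) ** 2, axis=1)))

i = rng.integers(len(S))                       # transmitted message
r = S[i] + rng.normal(0.0, np.sqrt(No / 2), size=2)
print(i, ml_detect(r, S))                      # equal unless the noise was large
```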
Example: M = 6 Vectors in R²
The k-th decision region D_k is the set of all points closer to s_k than to any other s_j:
D_k = { r ∈ R^N : ‖r − s_k‖ < ‖r − s_j‖ for all j ≠ k }
The ML detector outputs m̂ = k if r ∈ D_k.
40 / 54
Examples in R²
41 / 54
1. Signal Space Representation
2. Vector Gaussian Channel
3. Vector Gaussian Channel (contd.)
4. Optimum Detection
5. Probability of Error
42 / 54
Error Probability when M = 2
Scenario: Let {s₁, s₂} ⊂ R^N be a binary modulation scheme with
• P(s₁) = P(s₂) = 1/2, and
• detection using the nearest neighbor decoder.
• An error E occurs if (s₁ transmitted, m̂ = 2) or (s₂ transmitted, m̂ = 1).
• Conditional error probability: P(E | s₁) = P(m̂ = 2 | s₁) = P(‖r − s₂‖ < ‖r − s₁‖ | s₁)
• Note that P(E) = P(s₁)P(E | s₁) + P(s₂)P(E | s₂) = (P(E | s₁) + P(E | s₂))/2
• P(E | s_i) can be easy to analyse.
42 / 54
Conditional Error Probability when M = 2
(E | s₁): s₁ is transmitted, so r = s₁ + n, and an error occurs iff ‖r − s₁‖² > ‖r − s₂‖²:
‖s₁ + n − s₁‖² > ‖s₁ + n − s₂‖²
⇔ ‖n‖² > ⟨(s₁ − s₂) + n, (s₁ − s₂) + n⟩
⇔ ‖n‖² > ⟨s₁ − s₂, s₁ − s₂⟩ + ⟨s₁ − s₂, n⟩ + ⟨n, s₁ − s₂⟩ + ⟨n, n⟩
⇔ ‖n‖² > ‖s₁ − s₂‖² + 2⟨n, s₁ − s₂⟩ + ‖n‖²
⇔ ⟨n, s₁ − s₂⟩ < −‖s₁ − s₂‖²/2
⇔ √(2/N_o) ⟨n, (s₁ − s₂)/‖s₁ − s₂‖⟩ < −‖s₁ − s₂‖/√(2N_o)
43 / 54
Error Probability when M = 2
• Z = √(2/N_o) ⟨n, (s₁ − s₂)/‖s₁ − s₂‖⟩ is Gaussian with zero mean and unit variance
• P(E | s₁) = P(Z < −‖s₁ − s₂‖/√(2N_o)) = Q(‖s₁ − s₂‖/√(2N_o))
• P(E | s₂) = Q(‖s₁ − s₂‖/√(2N_o))
P(E) = (P(E | s₁) + P(E | s₂))/2 = Q(‖s₁ − s₂‖/√(2N_o))
• The error probability is a decreasing function of the distance ‖s₁ − s₂‖.
44 / 54
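A Monte Carlo check of the formula. The antipodal pair and N_o are assumed values; by symmetry it is enough to condition on s₁:

```python
import numpy as np
from math import erfc, sqrt

rng = np.random.default_rng(2)

def Q(a):
    return 0.5 * erfc(a / sqrt(2))

s1, s2 = np.array([1.0, 0.0]), np.array([-1.0, 0.0])
No, trials = 1.0, 500_000
d = np.linalg.norm(s1 - s2)

n = rng.normal(0.0, np.sqrt(No / 2), size=(trials, 2))
r = s1 + n                                 # always transmit s1
errors = (np.sum((r - s2) ** 2, axis=1) <  # decoded as s2: error
          np.sum((r - s1) ** 2, axis=1))

print(errors.mean())            # simulated P(E)
print(Q(d / sqrt(2 * No)))      # theory: Q(||s1 - s2||/sqrt(2 No)) ~ 0.0786
```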
Bound on Error Probability when M > 2
Scenario: Let C = {s₁, ..., s_M} ⊂ R^N be a modulation/coding scheme with
• P(s₁) = ... = P(s_M) = 1/M, and
• detection using the nearest neighbor decoder.
• Minimum distance d_min = smallest Euclidean distance between any pair of vectors in C:
d_min = min_{i ≠ j} ‖s_i − s_j‖
• Observe that ‖s_i − s_j‖ ≥ d_min for any i ≠ j.
• Since Q(·) is a decreasing function,
Q(‖s_i − s_j‖/√(2N_o)) ≤ Q(d_min/√(2N_o)) for any i ≠ j
• A bound based only on d_min: simple calculations, not tight, but intuitive.
45 / 54
46 / 54
Union Bound on the Conditional Error Probability
Assume that s₁ is transmitted, i.e., r = s₁ + n. We know that
P(‖r − s_j‖ < ‖r − s₁‖ | s₁) = Q(‖s₁ − s_j‖/√(2N_o))
A decoding error occurs if r is closer to some s_j, j = 2, 3, ..., M, than to s₁:
P(E | s₁) = P(r ∉ D₁ | s₁) = P(∪_{j=2}^{M} {‖r − s_j‖ < ‖r − s₁‖} | s₁)
From the union bound P(A₂ ∪ ... ∪ A_M) ≤ P(A₂) + ... + P(A_M):
P(E | s₁) ≤ Σ_{j=2}^{M} P(‖r − s_j‖ < ‖r − s₁‖ | s₁) = Σ_{j=2}^{M} Q(‖s₁ − s_j‖/√(2N_o))
47 / 54
Union Bound on the Error Probability
Since Q(·) is a decreasing function and ‖s₁ − s_j‖ ≥ d_min:
P(E | s₁) ≤ Σ_{j=2}^{M} Q(‖s₁ − s_j‖/√(2N_o)) ≤ Σ_{j=2}^{M} Q(d_min/√(2N_o))
P(E | s₁) ≤ (M − 1) Q(d_min/√(2N_o))
Upper bound on the average error probability P(E) = Σ_{i=1}^{M} P(s_i) P(E | s_i):
P(E) ≤ (M − 1) Q(d_min/√(2N_o))
Note
• Exact P_e (or good approximations better than the union bound) can be derived for several constellations, for example PAM, QAM and PSK.
• The Chernoff bound can be useful: Q(a) ≤ (1/2) exp(−a²/2) for a ≥ 0
• The union bound, in general, is loose.
48 / 54
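Evaluating the bound for an assumed unit-energy 8-PSK constellation (for PSK on the unit circle, d_min = 2 sin(π/M)):

```python
import numpy as np
from math import erfc, sqrt

def Q(a):
    return 0.5 * erfc(a / sqrt(2))

M, No = 8, 0.05
i = np.arange(M)
S = np.stack([np.cos(2 * np.pi * i / M), np.sin(2 * np.pi * i / M)], axis=1)

dists = np.linalg.norm(S[:, None, :] - S[None, :, :], axis=2)
d_min = dists[dists > 1e-9].min()        # smallest pairwise distance

print(d_min)                             # 2 sin(pi/8) ~ 0.765
print((M - 1) * Q(d_min / sqrt(2 * No))) # union bound on P(E)
```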
• The abscissa is [SNR]_dB = 10 log₁₀ SNR
• The union bound is a reasonable approximation at large values of SNR
49 / 54
Performance of QAM and FSK
(Spectral efficiency η = 2 log₂ M / N)
50 / 54
Performance of QAM and FSK at Probability of Error P_e = 10⁻⁵
Modulation/Code   Spectral Efficiency η (bits/sec/Hz)   Signal-to-Noise Ratio SNR (dB)
16-QAM            4                                     20
4-QAM             2                                     13
2-FSK             1                                     12.6
8-FSK             3/4                                   7.5
16-FSK            1/2                                   4.6
How good are these modulation schemes? What is the best trade-off between SNR and η?
51 / 54
Capacity of the (Vector) Gaussian Channel
Let the maximum allowable power be P and the noise variance be N_o/2, so that SNR = P/(N_o/2) = 2P/N_o.
What is the highest η achievable while ensuring that P_e is small?
Theorem: Given an ε > 0 and any constant η such that η < log₂(1 + SNR), there exists a coding scheme with P_e ≤ ε and spectral efficiency at least η. Conversely, for any coding scheme with η > log₂(1 + SNR) and M sufficiently large, P_e is close to 1.
C(SNR) = log₂(1 + SNR) is the capacity of the Gaussian channel.
52 / 54
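The theorem gives a benchmark that is easy to compute; the next slide tabulates it, and the sketch below reproduces those numbers and the resulting gap to capacity (scheme list taken from the earlier table):

```python
from math import log10

def min_snr_db(eta):
    """Least SNR (in dB) at which eta < log2(1 + SNR) is possible,
    i.e., SNR*(eta) = 2^eta - 1 expressed in dB."""
    return 10 * log10(2 ** eta - 1)

# (scheme, eta, SNR in dB needed for P_e = 1e-5, from the earlier table)
for name, eta, snr_db in [("16-QAM", 4.0, 20.0), ("4-QAM", 2.0, 13.0),
                          ("2-FSK", 1.0, 12.6), ("8-FSK", 0.75, 7.5),
                          ("16-FSK", 0.5, 4.6)]:
    gap = snr_db - min_snr_db(eta)
    print(f"{name}: SNR*(eta) = {min_snr_db(eta):5.1f} dB, gap = {gap:4.1f} dB")
```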
How Good/Bad are QAM and FSK?
The least SNR required to communicate reliably with spectral efficiency η is SNR*(η) = 2^η − 1.
At Probability of Error P_e = 10⁻⁵:
Modulation/Code   η     SNR (dB)   SNR*(η) (dB)
16-QAM            4     20         11.7
4-QAM             2     13         4.7
2-FSK             1     12.6       0
8-FSK             3/4   7.5        −1.7
16-FSK            1/2   4.6        −3.8
53 / 54
How to Perform Close to Capacity?
• We need P_e to be small at a fixed finite SNR: d_min must be large to ensure that P_e is small.
• It is necessary to use coding schemes in high dimensions N ≫ 1. This can ensure that d_min grows in proportion to √N.
• If N is large it is possible to pack vectors {s_i} in R^N such that
  the average power is at most P,
  d_min is large,
  η is close to log₂(1 + SNR), and
  P_e is small.
• A large N implies that M = 2^{ηN/2} is also large. We must ensure that such a large code can be encoded/decoded with practical complexity.
Several known coding techniques:
• η > 1: trellis coded modulation, multilevel codes, lattice codes, bit-interleaved coded modulation, etc.
• η < 1: low-density parity-check codes, turbo codes, polar codes, etc.
Thank You!
54 / 54