Lecture 27. Capacity of additive Gaussian noise channel and the sphere packing bound

Lecture 7

Agenda for the lecture:
- Gaussian channel with average power constraints
- Capacity of the additive Gaussian noise channel and the sphere packing bound

7.1 Additive Gaussian noise channel

Up to this point, we have been considering only channels with discrete input and output alphabets. This does not include the most important channel for communication engineers, namely the Gaussian channel. This is the channel one encounters when modeling communication over a band-limited channel with additive white Gaussian noise. The full treatment of the physical continuous-time channel is beyond the scope of this course. Instead, we present the treatment for an important sub-block of the practical channel: the discrete-time, memoryless additive Gaussian noise (AGN) channel. Specifically, for $\mathcal{X} = \mathcal{Y} = \mathbb{R}$, for each input $x \in \mathcal{X}$ the AGN channel $(\mathcal{X}, W, \mathcal{Y})$ produces a random output $Y$ distributed as $N(x, \sigma^2)$, i.e., for input $x$ the channel output is given by $Y = x + Z$ where $Z \sim N(0, \sigma^2)$. The channel is stationary and memoryless, i.e., for each channel use an independent zero-mean Gaussian noise with the same variance $\sigma^2$ is added to the input.

© Himanshu Tyagi. Feel free to use with acknowledgement.
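
As a quick illustration (not part of the original notes), the following sketch simulates this channel model with numpy; the noise level and the input vector are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def agn_channel(x, sigma, rng):
    """One use of the AGN channel per coordinate: Y = x + Z with Z ~ N(0, sigma^2) i.i.d."""
    return x + rng.normal(0.0, sigma, size=len(x))

# Illustrative values: sigma and the input vector are arbitrary choices.
sigma = 2.0
x = np.ones(10_000)
y = agn_channel(x, sigma, rng)

# The noise y - x is approximately zero-mean with variance sigma^2.
print(np.mean(y - x), np.var(y - x))
```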

One can naively start by asking the question: how many messages can be sent reliably over this channel? With a little thought, we can convince ourselves that this question is not interesting. For $A > 0$, by choosing $A \cdot m$ as the $m$th codeword, for $m = 1, 2, \dots$, any error requirement can be satisfied by choosing $A$ to be large enough. Thus, any number of messages can be sent reliably, even with one channel use. But of course the scheme above is not practical. In fact, the theoretical model of the AGN channel is used to model transmission over a band-limited additive white Gaussian noise (AWGN) channel. The amplitude of the input of the AGN channel is directly related to the amplitude of the input signal, and the average power $\frac{1}{n}\sum_{i=1}^n x_i^2$ of a codeword over $n$ channel uses is related to the average power of the input signal. Since our circuits can only work with a fixed maximum amplitude and our batteries can provide only finite power, we must impose these two constraints on our codewords. In this course, we will only consider the average power constraint, which is easier to handle than the peak amplitude constraint. Specifically, we consider the following coding problem:

Definition 7.1 (Channel codes with average power constraints). For an AGN channel $(\mathbb{R}, W, \mathbb{R})$, an $(n, M, P)$ code consists of codewords $x_m \in \mathbb{R}^n$ and associated decoding sets $D_m \subset \mathbb{R}^n$, $1 \le m \le M$, such that each codeword satisfies the average power constraint
$$\frac{1}{n}\sum_{i=1}^n x_{mi}^2 \le P, \qquad 1 \le m \le M.$$
The notions of average and maximum probability of error are defined as before (see the earlier lectures). A rate $R > 0$ is $\epsilon$-achievable for $W$ with average power constraint $P$ if there exist $(n, 2^{nR}, P)$ codes with maximum probability of error less than $\epsilon$, for all $n$ sufficiently large. The supremum over all $\epsilon$-achievable rates with average power constraint $P$ is called the $\epsilon$-capacity at power $P$, denoted $C_{\epsilon,P}(W)$. The capacity of the channel at power $P$, denoted $C_P(W)$, is given by $C_P(W) = \lim_{\epsilon \to 0} C_{\epsilon,P}(W)$.

As before, we will work under the maximum probability of error criterion; we can proceed as in an earlier lecture to show that the results remain the same under the average probability of error criterion.
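
To make Definition 7.1 concrete, here is a small sketch (not from the lecture): an arbitrary random codebook whose rows are rescaled so that every codeword meets the average power constraint with equality. The codebook is purely illustrative, not a capacity-achieving construction.

```python
import numpy as np

rng = np.random.default_rng(1)
n, M, P = 128, 16, 5.0

# Hypothetical codebook: i.i.d. Gaussian rows rescaled so that (1/n) * sum_i x_{m,i}^2 = P.
codebook = rng.normal(size=(M, n))
codebook *= np.sqrt(n * P) / np.linalg.norm(codebook, axis=1, keepdims=True)

avg_power = np.mean(codebook**2, axis=1)
print(avg_power.max() <= P + 1e-9)   # True: every codeword meets the average power constraint
```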

7.2 Capacity of AGN and the sphere packing bound

We shall establish the following result, which characterizes $C_P(W)$.

Theorem 7.2 (Capacity of AGN). Given an AGN channel with noise variance $\sigma^2$, $P > 0$, and $0 < \epsilon < 1$, we have
$$C_{\epsilon,P}(W) = C_P(W) = \frac{1}{2}\log\left(1 + \frac{P}{\sigma^2}\right).$$

The ratio $P/\sigma^2$ is often referred to as the allowed signal-to-noise ratio (SNR); it can be formally shown to be equivalent to the SNR for sending a signal over a band-limited AWGN channel. Thus, the formula for capacity can be restated as
$$C_P(W) = \frac{1}{2}\log\left(1 + \mathrm{SNR}\right),$$
perhaps the most famous formula of information theory. In fact, the result above claims more: it shows that the maximum possible rate will not increase even if a nonvanishing probability of error $\epsilon \in (0,1)$ is allowed.

We begin by proving the strong converse, i.e., for every $\epsilon \in (0,1)$, $C_{\epsilon,P}(W) \le \frac{1}{2}\log(1 + \mathrm{SNR})$. Conceptually, the AGN channel is very similar to the BSC, so let us recall our proof of the sphere packing bound for the BSC. However, the details here are somewhat technically involved; nevertheless, I have included the proof to show how the ideas go through. There were two parts to that proof. First, we showed that any set $D_m$ such that $W^n(D_m \mid x_m)$ is large must have cardinality greater than roughly $2^{nh(\delta)}$, where $\delta$ is the crossover probability. Second, there are no more than $2^n$ sequences in all. Thus, the number of codewords possible is no more than $2^{n(1 - h(\delta))}$.

When trying to follow the same recipe for the AGN channel, we have no trouble in extending the first step, with the understanding that cardinalities need to be replaced by volumes. Note that the proofs for the BSC essentially worked with bounds on the pmf; for the AGN channel, we need to work with bounds on probability densities.
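
Before turning to the proof, it may help to see Theorem 7.2's formula evaluated numerically. The sketch below (not part of the notes) takes the logarithm to base 2, so the answer is in bits per channel use; the SNR values are arbitrary examples.

```python
import numpy as np

def capacity_agn(P, sigma2):
    """C_P(W) = (1/2) * log2(1 + P / sigma2), in bits per channel use."""
    return 0.5 * np.log2(1.0 + P / sigma2)

for snr_db in [0, 10, 20, 30]:
    snr = 10 ** (snr_db / 10)   # SNR = P / sigma^2
    print(f"{snr_db:>2} dB SNR -> {capacity_agn(snr, 1.0):.3f} bits per channel use")
```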

Specifically, consider an $(n, M, P)$ code with maximum probability of error less than $\epsilon$. What is the minimum possible volume of a decoding set of this code? The following lemma answers this question.

Lemma 7.3. Let $D \subset \mathbb{R}^n$, $x \in \mathbb{R}^n$, and $0 < \epsilon < 1$ be such that, for an AGN channel $W$ with noise variance $\sigma^2$,
$$W^n(D \mid x) \ge 1 - \epsilon.$$
Then, for every $\eta > 0$ and $0 < \delta < 1 - \epsilon$, we have for all $n$ sufficiently large that
$$\mathrm{vol}(D) \ge \left(2\pi e\,\sigma^2\right)^{n/2} e^{-n\eta/2}\,(1 - \epsilon - \delta).$$

Remark 7.4. The proof of the lemma is simple, but the notation makes it look difficult. Note that for $n$ sufficiently large, a ball of radius $\rho$ in Euclidean space $\mathbb{R}^n$ has volume less than (see the Wikipedia article on volumes of $n$-balls and the references therein)
$$\left(\frac{2\pi e}{n}\right)^{n/2} \rho^n. \tag{1}$$
Thus, the result above says that any large-probability set has volume at least roughly that of a ball of radius $\sqrt{n}\,\sigma$. Also, note that by Chebyshev's inequality, when we send an input sequence $x$ over the AGN channel, with large probability we receive a $Y$ in a ball of radius roughly $\sqrt{n}\,\sigma$ around it. The result above says that any large-probability set must have at least as much volume as this ball.

Proof. Denote by $f_W^n(y \mid x)$ the density of the output, evaluated at $y$, when $x$ is sent. Note that for the AGN channel
$$f_W^n(y \mid x) = \left(2\pi\sigma^2\right)^{-n/2} e^{-\|y - x\|^2/(2\sigma^2)},$$
where $\|y - x\|$ denotes the Euclidean distance between $x$ and $y$. Therefore, if $\|y - x\| \ge \rho$, we have
$$f_W^n(y \mid x) \le \left(2\pi\sigma^2\right)^{-n/2} e^{-\rho^2/(2\sigma^2)}.$$

For $\eta > 0$, $\rho^2 = n\sigma^2(1 - \eta)$, and $n$ sufficiently large, we show that with large probability the output of the AGN channel lies outside the Euclidean ball of radius $\rho$ centered at $x$ when the input to the channel is $x$. Indeed, denoting by $B_\rho(x)$ the ball of radius $\rho$ around $x$, consider a random vector $Y = (Y_1, \dots, Y_n)$ with independent entries, the $i$th coordinate generated by $N(x_i, \sigma^2)$. Then, by Chebyshev's inequality,
$$P\left(\sum_{i=1}^n (Y_i - x_i)^2 \le n\sigma^2(1 - \eta)\right) \le \frac{\mathrm{Var}\left[\sum_{i=1}^n (Y_i - x_i)^2\right]}{\left(n\sigma^2\eta\right)^2} = \frac{\mathrm{Var}\left[(Y_1 - x_1)^2\right]}{n\sigma^4\eta^2},$$
where the right side goes to zero as $n$ goes to infinity. Equivalently,
$$\lim_{n\to\infty} P\left(Y \notin B_\rho(x)\right) = 1.$$
Therefore, given a set $D$ such that $W^n(D \mid x) \ge 1 - \epsilon$, we have
$$W^n\left(D \cap B_\rho(x)^c \mid x\right) \ge 1 - \epsilon - \delta,$$
for all $n$ sufficiently large. Thus,
$$1 - \epsilon - \delta \le W^n\left(D \cap B_\rho(x)^c \mid x\right) \le \mathrm{vol}\left(D \cap B_\rho(x)^c\right)\left(2\pi\sigma^2\right)^{-n/2} e^{-n(1-\eta)/2},$$
i.e.,
$$\mathrm{vol}(D) \ge \left(2\pi e\,\sigma^2\right)^{n/2} e^{-n\eta/2}\,(1 - \epsilon - \delta),$$
which completes the proof.
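
The sketch below (not part of the notes) checks the volume bound (1) from Remark 7.4 numerically, comparing the exact log-volume of an $n$-ball, $\pi^{n/2}\rho^n/\Gamma(n/2+1)$, against the bound $(2\pi e/n)^{n/2}\rho^n$, for a ball of radius $\sqrt{n}\,\sigma$; the value of $\sigma$ is an arbitrary choice.

```python
import math

def log_vol_ball(n, rho):
    """Exact log-volume of the n-dimensional ball of radius rho: pi^(n/2) rho^n / Gamma(n/2 + 1)."""
    return 0.5 * n * math.log(math.pi) + n * math.log(rho) - math.lgamma(0.5 * n + 1)

def log_vol_bound(n, rho):
    """Log of the upper bound (1): (2*pi*e/n)^(n/2) * rho^n."""
    return 0.5 * n * math.log(2 * math.pi * math.e / n) + n * math.log(rho)

sigma = 1.5   # arbitrary noise standard deviation
for n in [10, 100, 1000]:
    rho = math.sqrt(n) * sigma
    gap = log_vol_bound(n, rho) - log_vol_ball(n, rho)
    print(n, gap > 0, round(gap, 2))   # the bound holds; the gap grows only like (1/2)*log(pi*n)
```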

This brings us to the second question: where do the received vectors lie with large probability?

This question is difficult to answer with good precision; in fact, the final answer we will give is somewhat surprising. Naively, we know that each codeword $x_m$ satisfies $\|x_m\|^2 \le nP$. As remarked above, when we send a codeword $x_m$, with large probability we see an output within a radius of roughly $\sqrt{n}\,\sigma$ of this codeword. Thus, for any codeword $x_m$, the received vectors $y$ satisfy
$$\|y\| \le \|x_m\| + \|y - x_m\| \le \sqrt{n}\left(\sqrt{P} + \sigma\right),$$
with large probability. Therefore, with $\rho = \sqrt{n}\left(\sqrt{P} + \sigma\right)$, for any codeword $x_m$ the received vectors lie in $B_\rho(0)$ with large probability, i.e.,
$$W^n\left(B_\rho(0) \mid x_m\right) \ge 1 - \delta, \tag{2}$$
for all $n$ sufficiently large.

To obtain a converse bound, we combine this observation with Lemma 7.3 as follows. Consider $D'_m = D_m \cap B_\rho(0)$. Since $W^n(D_m \mid x_m) \ge 1 - \epsilon$, (2) implies that $W^n(D'_m \mid x_m) \ge 1 - \epsilon - \delta$. But then, by Lemma 7.3, each $D'_m$ satisfies
$$\mathrm{vol}(D'_m) \ge \left(2\pi e\,\sigma^2\right)^{n/2} e^{-n\eta/2}\,(1 - \epsilon - \delta).$$
Furthermore, all the $D'_m$ are disjoint and lie within $B_\rho(0)$. Thus, by (1) the sum of their volumes is less than
$$\left(\frac{2\pi e}{n}\right)^{n/2} \rho^n = \left(2\pi e\right)^{n/2}\left(\sqrt{P} + \sigma\right)^n.$$
Therefore, dividing the volume of $B_\rho(0)$ by the minimum possible volume of each $D'_m$, the maximum number $M$ of such disjoint sets we can have satisfies, up to terms that vanish in the exponent as $\eta \to 0$ and $n \to \infty$,
$$\frac{1}{n}\log M \le \log\left(1 + \frac{\sqrt{P}}{\sigma}\right).$$

But unfortunately $\log(1 + \sqrt{x})$ is in general more than $\frac{1}{2}\log(1 + x)$! Therefore, this bound does not yield a converse for our capacity result. What did we miss?

What we missed is an interesting fact about Gaussian random variables (in fact, about the so-called noncentral chi-squared random variables): we can show (2) with $\rho^2 \approx n\left(P + \sigma^2\right)$. Indeed, we can use Chebyshev's inequality to see this. When $x$ is sent, the received vector $Y = (Y_1, \dots, Y_n)$ has independent entries with $Y_i \sim N(x_i, \sigma^2)$. Thus,
$$E\left[\sum_{i=1}^n Y_i^2\right] = \sum_{i=1}^n \left(x_i^2 + \sigma^2\right) = \|x\|^2 + n\sigma^2.$$
But what about its variance? A straightforward calculation bounds each term as
$$\mathrm{Var}\left[Y_i^2\right] \le E\left[Y_i^4\right] = x_i^4 + 6\sigma^2 x_i^2 + 3\sigma^4.$$
But now we are in trouble: we only know that $\|x\|^2 \le nP$, and have no handle over $\sum_i x_i^4$. In fact, it is not easy to bound the variance in terms of just $\|x\|^2$. It is not easy, but it is indeed possible! The so-called Gaussian Poincaré inequality gives
$$\mathrm{Var}\left[\sum_{i=1}^n Y_i^2\right] \le 4\sigma^2\, E\left[\sum_{i=1}^n Y_i^2\right] = 4\sigma^2\left(\|x\|^2 + n\sigma^2\right).$$
Therefore, by Chebyshev's inequality, for every $\eta > 0$,
$$P\left(\sum_{i=1}^n Y_i^2 > \|x\|^2 + n\sigma^2 + n\eta\right) \le \frac{4\sigma^2}{n\eta^2}\left(\frac{\|x\|^2}{n} + \sigma^2\right).$$
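
A quick Monte Carlo check of this fact (a sketch with arbitrary values of $n$, $P$, $\sigma$, and an arbitrary input of power exactly $P$, none of which come from the notes): the empirical mean of $\|Y\|^2/n$ is close to $P + \sigma^2$, not $(\sqrt{P} + \sigma)^2$, and the empirical variance of $\|Y\|^2$ stays below the Gaussian Poincaré bound $4\sigma^2(\|x\|^2 + n\sigma^2)$.

```python
import numpy as np

rng = np.random.default_rng(2)
n, P, sigma, trials = 2000, 4.0, 1.0, 5000

# Hypothetical input with ||x||^2 = n * P (any such x gives the same picture).
x = np.sqrt(P) * np.ones(n)

Y = x + rng.normal(0.0, sigma, size=(trials, n))
norms_sq = np.sum(Y**2, axis=1)

print("mean of ||Y||^2 / n     :", norms_sq.mean() / n)   # ~ P + sigma^2 = 5.0, not (sqrt(P)+sigma)^2 = 9.0
print("empirical Var[||Y||^2]  :", norms_sq.var())
print("Gaussian Poincare bound :", 4 * sigma**2 * (n * P + n * sigma**2))
```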

In particular, if $\|x\|^2 \le nP$, the bound above gives
$$P\left(\sum_{i=1}^n Y_i^2 > n\left(P + \sigma^2 + \eta\right)\right) \le \frac{4\sigma^2\left(P + \sigma^2\right)}{n\eta^2}.$$
Thus, for all $n$ sufficiently large and $\rho$ given by
$$\rho^2 = n\left(P + \sigma^2 + \eta\right),$$
for every $x$ such that $\|x\|^2 \le nP$ we have
$$P\left(Y \in B_\rho(0)\right) = W^n\left(B_\rho(0) \mid x\right) \ge 1 - \delta. \tag{3}$$
We can now complete our proof with (3) in place of (2). Let $D'_m = D_m \cap B_\rho(0)$. Then, by (3) and the assumption that $W^n(D_m \mid x_m) \ge 1 - \epsilon$, we have that $W^n(D'_m \mid x_m) \ge 1 - \epsilon - \delta$. Therefore, by Lemma 7.3, each $D'_m$ satisfies
$$\mathrm{vol}(D'_m) \ge \left(2\pi e\,\sigma^2\right)^{n/2} e^{-n\eta/2}\,(1 - \epsilon - \delta), \tag{4}$$
for all $n$ sufficiently large. On the other hand, since all the $D'_m$ are subsets of $B_\rho(0)$ and disjoint,
$$\sum_{m=1}^M \mathrm{vol}(D'_m) \le \mathrm{vol}\left(B_\rho(0)\right) \le \left(\frac{2\pi e}{n}\right)^{n/2}\rho^n = \left(2\pi e\right)^{n/2}\left(P + \sigma^2 + \eta\right)^{n/2}, \tag{5}$$
where the second inequality uses (1) and holds for all $n$ sufficiently large.

Thus, by combining (4) and (5), we get
$$\frac{1}{n}\log M \le \frac{1}{2}\log\frac{P + \sigma^2 + \eta}{\sigma^2} + \frac{\eta}{2} + \frac{1}{n}\log\frac{1}{1 - \epsilon - \delta}.$$
Thus, for every $\eta > 0$, every $0 < \delta < (1 - \epsilon)/2$, and $n$ sufficiently large,
$$\frac{1}{n}\log M \le \frac{1}{2}\log\left(1 + \frac{P + \eta}{\sigma^2}\right) + \frac{\eta}{2} + \frac{1}{n}\log\frac{2}{1 - \epsilon}.$$
By letting $n \to \infty$ and then taking $\eta \to 0$, we get
$$C_{\epsilon,P}(W) \le \frac{1}{2}\log\left(1 + \frac{P}{\sigma^2}\right),$$
which completes the proof of the strong converse.
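
Finally, the sphere-packing picture behind this converse can be summarized by a volume-ratio heuristic (a sketch, not part of the notes, with arbitrary $P$ and $\sigma^2$): the number of disjoint "noise balls" of radius $\sqrt{n\sigma^2}$ that fit, by volume, inside the output ball of radius $\sqrt{n(P + \sigma^2)}$ has exponent exactly $\frac{1}{2}\log\left(1 + P/\sigma^2\right)$ per channel use, since the Gamma factors in the two volumes cancel.

```python
import math

def log_vol_ball(n, rho):
    """Log-volume of the n-ball of radius rho: pi^(n/2) rho^n / Gamma(n/2 + 1)."""
    return 0.5 * n * math.log(math.pi) + n * math.log(rho) - math.lgamma(0.5 * n + 1)

P, sigma2 = 4.0, 1.0   # arbitrary illustrative values
for n in [100, 1000, 10000]:
    outer = log_vol_ball(n, math.sqrt(n * (P + sigma2)))   # where the received vectors live
    inner = log_vol_ball(n, math.sqrt(n * sigma2))         # roughly the minimum decoding-set volume
    # Per-dimension packing exponent in nats; identical for every n because the Gamma terms cancel.
    print(n, round((outer - inner) / n, 4), "vs", round(0.5 * math.log(1 + P / sigma2), 4))
```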