
CSCE 790S Background Results

Stephen A. Fenner

September 8, 2011

Abstract

These results are background to the course CSCE 790S/CSCE 790B, Quantum Computation and Information (Spring 2007 and Fall 2011). Each result, or group of related results, is roughly one page long.

Contents

1 The Cauchy-Schwarz Inequality
2 The Schur Triangular Form and the Spectral Theorem
3 The Polar and Singular Value Decompositions
4 Stirling's Approximation
5 Inequalities of Markov and Chebyshev
6 Relative Entropy
7 A Standard Tail Inequality

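1 The Cauchy-Schwarz Inequality

This is one of the most versatile inequalities in all of mathematics.

Theorem 1.1 (Cauchy-Schwarz) For any real numbers $a_1, \ldots, a_n$ and $b_1, \ldots, b_n$,
\[ |a_1 b_1 + \cdots + a_n b_n| \le \sqrt{(a_1^2 + \cdots + a_n^2)(b_1^2 + \cdots + b_n^2)}, \tag{1} \]
with equality holding iff the two vectors $(a_1, \ldots, a_n)$ and $(b_1, \ldots, b_n)$ are linearly dependent.

Proof. There are many, many ways of proving this. Here is a direct calculation. We have
\[
\begin{aligned}
0 &\le \sum_{1 \le i < j \le n} (a_i b_j - a_j b_i)^2 \\
  &= \sum_{i<j} \bigl[ a_i b_j (a_i b_j - a_j b_i) - a_j b_i (a_i b_j - a_j b_i) \bigr] \\
  &= \sum_{i<j} \bigl[ a_i b_j (a_i b_j - a_j b_i) + a_j b_i (a_j b_i - a_i b_j) \bigr] \\
  &= \sum_{i<j} a_i b_j (a_i b_j - a_j b_i) + \sum_{i<j} a_j b_i (a_j b_i - a_i b_j) \\
  &= \sum_{i<j} a_i b_j (a_i b_j - a_j b_i) + \sum_{j<i} a_i b_j (a_i b_j - a_j b_i) \\
  &= \sum_{i \ne j} a_i b_j (a_i b_j - a_j b_i) = \sum_{i,j} a_i b_j (a_i b_j - a_j b_i) \\
  &= \sum_{i,j} a_i^2 b_j^2 - \sum_{i,j} a_i b_i a_j b_j
   = \Bigl( \sum_{i=1}^n a_i^2 \Bigr) \Bigl( \sum_{j=1}^n b_j^2 \Bigr) - \Bigl( \sum_{i=1}^n a_i b_i \Bigr)^2 .
\end{aligned}
\]
Adding $\bigl( \sum_i a_i b_i \bigr)^2$ to both sides then taking the square root of both sides (noting that the square root function is strictly monotone increasing) yields the inequality (1).

Clearly, equality holds above iff $a_i b_j - a_j b_i = 0$ for all $i < j$, or equivalently, $a_i b_j = a_j b_i$ for all $i < j$. It is not hard to check that this condition is equivalent to $(a_1, \ldots, a_n)$ and $(b_1, \ldots, b_n)$ being linearly dependent.

Note that (1) still holds if we remove the absolute value delimiters from the left-hand side. In that case, equality holds iff there exists a $\lambda \ge 0$ such that either $(a_1, \ldots, a_n) = \lambda (b_1, \ldots, b_n)$ or $(b_1, \ldots, b_n) = \lambda (a_1, \ldots, a_n)$.

Corollary 1.2 (Triangle Inequality for Complex Numbers) For any $z, w \in \mathbb{C}$, $|z + w| \le |z| + |w|$.

Proof. Writing $z = a_1 + a_2 i$ and $w = b_1 + b_2 i$ for real $a_1, a_2, b_1, b_2$, we have
\[
\begin{aligned}
|z + w|^2 &= (a_1 + b_1)^2 + (a_2 + b_2)^2 \\
  &= a_1^2 + a_2^2 + b_1^2 + b_2^2 + 2(a_1 b_1 + a_2 b_2) \\
  &\le a_1^2 + a_2^2 + b_1^2 + b_2^2 + 2\sqrt{(a_1^2 + a_2^2)(b_1^2 + b_2^2)} \\
  &= \Bigl( \sqrt{a_1^2 + a_2^2} + \sqrt{b_1^2 + b_2^2} \Bigr)^2 = (|z| + |w|)^2 .
\end{aligned}
\]
Taking the square root of both sides yields the corollary.

Corollary 1.3 For any complex numbers $z_1, \ldots, z_n$ and $w_1, \ldots, w_n$,
\[ |z_1^* w_1 + \cdots + z_n^* w_n| \le \sqrt{(|z_1|^2 + \cdots + |z_n|^2)(|w_1|^2 + \cdots + |w_n|^2)}. \tag{2} \]

Proof. We have
\[
\begin{aligned}
|z_1^* w_1 + \cdots + z_n^* w_n| &\le |z_1^* w_1| + \cdots + |z_n^* w_n| && \text{(by Corollary 1.2)} \\
  &= |z_1| |w_1| + \cdots + |z_n| |w_n| \\
  &\le \sqrt{(|z_1|^2 + \cdots + |z_n|^2)(|w_1|^2 + \cdots + |w_n|^2)} && \text{(by Theorem 1.1)}.
\end{aligned}
\]

Corollary 1.4 For any column vectors $u, v \in \mathbb{C}^n$, $|u^* v| \le \|u\| \, \|v\|$.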
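The identity at the heart of the proof of Theorem 1.1 can be checked numerically. The following is a minimal sketch, not part of the original notes; the function names and the sample vectors are chosen here purely for illustration. It compares the sum of squares over pairs $i < j$ with the difference of the two sides of (1) after squaring.

```python
from itertools import combinations
from math import sqrt


def lagrange_gap(a, b):
    """Sum over i < j of (a_i b_j - a_j b_i)^2, the first line of the proof."""
    return sum((a[i] * b[j] - a[j] * b[i]) ** 2
               for i, j in combinations(range(len(a)), 2))


def cauchy_schwarz_sides(a, b):
    """Return (left-hand side, right-hand side) of inequality (1)."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = sum(x * x for x in a) * sum(y * y for y in b)
    return abs(dot), sqrt(norms)


# Illustrative vectors (an assumption of this sketch, not from the notes).
a = [1, 2, 3]
b = [4, -5, 6]

dot = sum(x * y for x, y in zip(a, b))
gap = lagrange_gap(a, b)
# Last line of the proof: (sum a_i^2)(sum b_j^2) - (sum a_i b_i)^2.
identity_rhs = sum(x * x for x in a) * sum(y * y for y in b) - dot ** 2

lhs, rhs = cauchy_schwarz_sides(a, b)
print(gap, identity_rhs, lhs <= rhs)
```

For linearly dependent inputs such as `[1, 2, 3]` and `[2, 4, 6]`, `lagrange_gap` returns 0, matching the equality condition of the theorem.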

2 The Schur Triangular Form and the Spectral Theorem

Theorem 2.1 (Schur Triangular Form) For every $n \times n$ matrix $M$, there exists a unitary $U$ and an upper triangular $T$ (both $n \times n$ matrices) such that $M = U T U^*$.

Proof. We prove this by induction on $n$. The $n = 1$ case is trivial. Now supposing the theorem holds for $n \ge 1$, we prove it holds for $n + 1$. Let $M$ be any $(n+1) \times (n+1)$ matrix. We let $A$ be the linear operator on $\mathbb{C}^{n+1}$ whose matrix is $M$ with respect to some orthonormal basis. $A$ has some eigenvalue $\lambda$ with corresponding unit eigenvector $v$. Using the Gram-Schmidt procedure, we can find an orthonormal basis $\{y_1, \ldots, y_{n+1}\}$ for $\mathbb{C}^{n+1}$ such that $y_1 = v$. With respect to this basis, the matrix for $A$ looks like
\[ N = \begin{bmatrix} \lambda & w^* \\ 0 & N' \end{bmatrix}, \]
where $w$ is some vector in $\mathbb{C}^n$ and $N'$ is an $n \times n$ matrix. Since $M$ and $N$ represent the same operator with respect to different orthonormal bases, they must be unitarily conjugate, i.e., there is a unitary $V$ such that $M = V N V^*$. $N'$ is an $n \times n$ matrix, so we apply the inductive hypothesis to get a unitary $W'$ and an upper triangular $T'$ (both $n \times n$ matrices) such that $N' = W' T' W'^*$. Now we can factor $N$:
\[
N = \begin{bmatrix} \lambda & w^* \\ 0 & W' T' W'^* \end{bmatrix}
  = \begin{bmatrix} 1 & 0 \\ 0 & W' \end{bmatrix}
    \begin{bmatrix} \lambda & w^* W' \\ 0 & T' \end{bmatrix}
    \begin{bmatrix} 1 & 0 \\ 0 & W'^* \end{bmatrix}
  = W T W^*,
\]
where
\[ W = \begin{bmatrix} 1 & 0 \\ 0 & W' \end{bmatrix} \quad \text{and} \quad T = \begin{bmatrix} \lambda & w^* W' \\ 0 & T' \end{bmatrix}. \]
$T$ is clearly upper triangular, and it is easily checked that $W W^* = I$, using the fact that $W'$ is unitary. Thus $W$ is unitary, and we get $M = V N V^* = V W T W^* V^* = U T U^*$, where $U = V W$ is unitary.

A Schur basis for an operator $A$ is an orthonormal basis that gives an upper triangular matrix for $A$.

Theorem 2.2 If an $n \times n$ matrix $A$ is both upper triangular and normal, then $A$ is diagonal.

Proof. Suppose that $A$ is upper triangular and normal, but not diagonal. Then there is some $i < j$ such that $[A]_{ij} \ne 0$. Let $j$ be least such that there exists $i < j$ such that $[A]_{ij} \ne 0$. For this $i$ and $j$, we get
\[
[A A^*]_{ii} = \sum_k [A]_{ik} [A^*]_{ki} = \sum_k [A]_{ik} [A]_{ik}^* = \sum_{k=1}^n |[A]_{ik}|^2 = \sum_{k=i}^n |[A]_{ik}|^2 \ge |[A]_{ii}|^2 + |[A]_{ij}|^2 > |[A]_{ii}|^2 . \tag{3}
\]
The last inequality follows from the fact that $[A]_{ij} \ne 0$. Similarly,
\[
[A^* A]_{ii} = \sum_k [A]_{ki}^* [A]_{ki} = \sum_{k=1}^n |[A]_{ki}|^2 = \sum_{k=1}^i |[A]_{ki}|^2 = |[A]_{ii}|^2 . \tag{4}
\]
The next to last equation holds because $A$ is upper triangular, and the last equation holds because of our minimum choice of $j$ and the fact that $i < j$. From (3) and (4), we have $[A A^*]_{ii} > [A^* A]_{ii}$. But $A$ is normal, so these two quantities must be equal. From this contradiction we get that $A$ must be diagonal.

Corollary 2.3 (Spectral Theorem for Normal Operators) Every normal matrix is unitarily conjugate to a diagonal matrix. Equivalently, every normal operator has an orthonormal eigenbasis.
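The comparison of diagonal entries in equations (3) and (4) can be tried on a concrete matrix. Below is a minimal sketch in plain Python; the helper names and the two $2 \times 2$ matrices are assumptions made for illustration and do not appear in the notes. It computes $[AA^*]_{ii} - [A^*A]_{ii}$ for each $i$, which must vanish for a normal matrix.

```python
def conj_transpose(A):
    """Return the adjoint A* of a square matrix given as a list of rows."""
    n = len(A)
    return [[complex(A[j][i]).conjugate() for j in range(n)] for i in range(n)]


def matmul(A, B):
    """Product of two square matrices of the same size."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def diagonal_gap(A):
    """Return the list of [A A*]_ii - [A* A]_ii, which is all zeros if A is normal."""
    A_star = conj_transpose(A)
    left = matmul(A, A_star)
    right = matmul(A_star, A)
    return [(left[i][i] - right[i][i]).real for i in range(len(A))]


# Upper triangular but not diagonal: the gap at i = 0 is |[A]_01|^2 = 5 > 0,
# so this matrix cannot be normal.
T = [[1, 2 + 1j], [0, 3]]

# Diagonal (hence normal): every gap is zero.
D = [[1, 0], [0, 3j]]

print(diagonal_gap(T), diagonal_gap(D))
```

The strictly positive gap in the first row of `T` is exactly the quantity $|[A]_{ij}|^2$ that produces the contradiction in the proof of Theorem 2.2.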

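3 The Polar and Singular Value Decompositions

Theorem 3.1 (Polar Decomposition) For every $n \times n$ matrix $A$ there is an $n \times n$ unitary matrix $U$ and a unique $n \times n$ matrix $H$ such that $H \ge 0$ and $A = UH$. In fact, $H = |A|$. (Here $|A|$ denotes $\sqrt{A^* A}$.)

Proof. First uniqueness. If $A = UH$ with $U$ unitary and $H \ge 0$, then
\[ |A| = \sqrt{A^* A} = \sqrt{H^* U^* U H} = \sqrt{H^* H} = \sqrt{H^2} = H. \]

Now existence. Let $\{e_1, \ldots, e_n\}$ be the standard orthonormal basis for $\mathbb{C}^n$. We first prove the special case where $|A|$ is the diagonal matrix $\mathrm{diag}(s_1, s_2, \ldots, s_n)$ for some real values $s_1 \ge s_2 \ge \cdots \ge s_n \ge 0$. Let $0 \le k \le n$ be largest such that $s_k > 0$ ($k = 0$ if $A = 0$). Thus we have
\[ |A| = \begin{bmatrix} D & 0 \\ 0 & 0 \end{bmatrix}, \]
where $D$ is the $k \times k$ nonsingular matrix $\mathrm{diag}(s_1, \ldots, s_k)$. If $j > k$, then $|A| e_j = 0$, and thus $0 = |A| e_j = |A|^2 e_j = A^* A e_j$, whence
\[ \|A e_j\|^2 = (A e_j)^* (A e_j) = e_j^* A^* A e_j = e_j^* 0 = 0, \]
and so $A e_j = 0$. This means that $A = \begin{bmatrix} B & 0 \end{bmatrix}$, where $B$ is some $n \times k$ matrix, and the last $n - k$ columns of $A$ are 0. We have
\[
\begin{bmatrix} B^* B & 0 \\ 0 & 0 \end{bmatrix}
= \begin{bmatrix} B^* \\ 0 \end{bmatrix} \begin{bmatrix} B & 0 \end{bmatrix}
= A^* A = |A|^2
= \begin{bmatrix} D^2 & 0 \\ 0 & 0 \end{bmatrix},
\]
and so $B^* B = D^2$. Let $W$ be an $n \times (n - k)$ matrix whose columns are unit vectors orthogonal to all the columns of $B$ and to each other. (There are many possibilities for $W$ if $k < n$; the columns of $W$ can be any orthonormal set in the orthogonal complement of the space spanned by the columns of $B$.) By our choice of $W$, we have $B^* W = 0$, $W^* B = 0$, and $W^* W = I$. Finally, define
\[ U := \begin{bmatrix} B D^{-1} & W \end{bmatrix}. \]
We claim that $U$ is unitary and that $A = U |A|$. Noting that $D^{-1}$ is Hermitean, we have
\[
U^* U = \begin{bmatrix} D^{-1} B^* \\ W^* \end{bmatrix} \begin{bmatrix} B D^{-1} & W \end{bmatrix}
= \begin{bmatrix} D^{-1} B^* B D^{-1} & D^{-1} B^* W \\ W^* B D^{-1} & W^* W \end{bmatrix}
= \begin{bmatrix} I & 0 \\ 0 & I \end{bmatrix} = I,
\]
and therefore $U$ is unitary. We also have
\[ U |A| = \begin{bmatrix} B D^{-1} & W \end{bmatrix} \begin{bmatrix} D & 0 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} B & 0 \end{bmatrix} = A. \]

Now for the general case. Since $|A| \ge 0$ (and hence normal), there is a unitary $V$ such that $V^* |A| V = \mathrm{diag}(s_1, \ldots, s_n)$ for some real values $s_1 \ge \cdots \ge s_n \ge 0$. Since
\[ V^* |A| V = V^* \sqrt{A^* A}\, V = \sqrt{V^* A^* A V} = \sqrt{(V^* A V)^* (V^* A V)} = |V^* A V|, \]
we see that $V^* A V$ satisfies the special case, above, and so there is a unitary $U$ such that $V^* A V = U |V^* A V|$. It follows that
\[ A = V V^* A V V^* = V U |V^* A V| V^* = V U V^* |A| V V^* = V U V^* |A|, \]
which proves the theorem because $V U V^*$ is unitary.

Theorem 3.2 (Singular Value Decomposition) For any $n \times n$ matrix $A$ there exist $n \times n$ unitary matrices $V, W$ and unique real values $s_1 \ge s_2 \ge \cdots \ge s_n \ge 0$ such that $A = V D W$, where $D = \mathrm{diag}(s_1, \ldots, s_n)$. Furthermore, $s_1, \ldots, s_n$ are the eigenvalues of $|A|$.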

The $s_1, \ldots, s_n$ are known as the singular values of $A$.

Proof. For uniqueness, if $A = V D W$ as above, then
\[ |A| = \sqrt{A^* A} = \sqrt{W^* D V^* V D W} = \sqrt{W^* D^2 W} = W^* \sqrt{D^2}\, W = W^* D W, \]
and so the diagonal entries of $D$ must be the eigenvalues of $|A|$.

For existence, the Polar Decomposition gives a unitary $U$ such that $A = U |A|$. Since $|A| \ge 0$ (and hence is normal), there exists a unitary $Y$ such that $|A| = Y D Y^*$, where $D = \mathrm{diag}(s_1, \ldots, s_n)$ for some $s_1 \ge \cdots \ge s_n \ge 0$. Then $A = U |A| = U Y D Y^*$. Setting $V := UY$ and $W := Y^*$ proves the theorem.
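As a small worked instance of the polar and singular value decompositions, consider the singular $2 \times 2$ matrix below. The matrix and the particular unitaries are chosen here for illustration and are not taken from the notes; since this $A$ is singular, the unitary $U$ in its polar decomposition is not unique, and the one shown is just one valid choice.

```latex
% A sample matrix and its modulus |A| = sqrt(A^* A):
A = \begin{bmatrix} 0 & 2 \\ 0 & 0 \end{bmatrix}, \qquad
A^* A = \begin{bmatrix} 0 & 0 \\ 0 & 4 \end{bmatrix}, \qquad
|A| = \sqrt{A^* A} = \begin{bmatrix} 0 & 0 \\ 0 & 2 \end{bmatrix}.

% Polar decomposition A = U |A| with U unitary:
U = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \qquad
U |A| = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}
        \begin{bmatrix} 0 & 0 \\ 0 & 2 \end{bmatrix}
      = \begin{bmatrix} 0 & 2 \\ 0 & 0 \end{bmatrix} = A.

% Diagonalize |A| = Y D Y^* with D = diag(2, 0):
Y = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \qquad
D = \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix}, \qquad
Y D Y^* = \begin{bmatrix} 0 & 0 \\ 0 & 2 \end{bmatrix} = |A|.

% Singular value decomposition with V = U Y = I and W = Y^*:
A = V D W = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
            \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix}
            \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}
          = \begin{bmatrix} 0 & 2 \\ 0 & 0 \end{bmatrix}.
```

The singular values of this $A$ are therefore 2 and 0, which are the eigenvalues of $|A|$.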

4 Stirling's Approximation

Theorem 4.1 (Stirling's Approximation)
\[ n! \sim \sqrt{2 \pi n}\, (n/e)^n . \]
Here, $f(n) \sim g(n)$ means that $\lim_{n \to \infty} f(n)/g(n) = 1$.

We'll prove a slightly weaker version of Theorem 4.1 that nevertheless suffices for all our purposes, namely,

Theorem 4.2 (Weak Stirling) For all positive integers $n$,
\[ e \sqrt{\frac{n}{2}} \left( \frac{n}{e} \right)^n \le n! \le e \sqrt{n} \left( \frac{n}{e} \right)^n . \]

Proof. We start with an integral approximation. The theorem clearly holds for $n = 1$, so assume $n \ge 2$. Since the log function is concave downward, we claim that for all $i$ such that $2 \le i \le n$,
\[ \frac{\log i + \log(i-1)}{2} \le \int_{i-1}^{i} \log x \, dx \le \log i - \frac{1}{2i} . \tag{5} \]
The left-hand side is the area of the trapezoid $T_1$ formed by the points $(i-1, 0)$, $(i, 0)$, $(i, \log i)$, $(i-1, \log(i-1))$, and the right-hand side is the area of the trapezoid $T_2$ formed by the points $(i-1, 0)$, $(i, 0)$, $(i, \log i)$, $(i-1, \log i - 1/i)$. Note that $T_2$'s upper edge is the tangent line to the curve $y = \log x$ at the point $(i, \log i)$. By concavity of log, the region under the curve $y = \log x$ in the interval $[i-1, i]$ contains $T_1$ and is contained in $T_2$, hence the inequalities (5).

Now note that $\log(n!) = \sum_{i=1}^n \log i = \sum_{i=2}^n \log i$. Summing (5) from $i = 2$ to $n$ and simplifying, we get
\[ \log(n!) - \frac{\log n}{2} \le \int_1^n \log x \, dx = n \log n - n + 1 \le \log(n!) - \frac{1}{2} \sum_{i=2}^n \frac{1}{i}, \tag{6} \]
using the closed form $\int \log x \, dx = x \log x - x + C$. The sum on the right-hand side of (6) is the Harmonic series, which satisfies another integral approximation:
\[ \sum_{i=2}^n \frac{1}{i} \ge \int_2^n \frac{dx}{x} = \log n - \log 2 . \tag{7} \]
Equations (6) and (7) yield
\[ \log(n!) - \frac{\log n}{2} \le n \log n - n + 1 \le \log(n!) - \frac{\log n}{2} + \frac{\log 2}{2}, \]
and so
\[ n \log n - n + 1 + \frac{\log n - \log 2}{2} \le \log n! \le n \log n - n + 1 + \frac{\log n}{2} . \tag{8} \]
Taking $e$ to the power of all three quantities in (8) and simplifying, we have
\[ e \sqrt{\frac{n}{2}} \left( \frac{n}{e} \right)^n \le n! \le e \sqrt{n} \left( \frac{n}{e} \right)^n \]
as desired.
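The two bounds of the weak form can be compared against exact factorials for small $n$. The following is a minimal sketch using only the Python standard library; the function names and the small relative tolerance (needed because the upper bound is an equality at $n = 1$ and floating-point rounding could tip it either way) are choices made here, not part of the notes.

```python
from math import e, factorial, pi, sqrt


def weak_stirling_bounds(n):
    """Return (lower, upper) = (e*sqrt(n/2)*(n/e)^n, e*sqrt(n)*(n/e)^n)."""
    core = (n / e) ** n
    return e * sqrt(n / 2) * core, e * sqrt(n) * core


def stirling(n):
    """The full approximation sqrt(2*pi*n) * (n/e)^n from Theorem 4.1."""
    return sqrt(2 * pi * n) * (n / e) ** n


def bounds_hold(n, rel_tol=1e-12):
    """Check lower <= n! <= upper, allowing for floating-point rounding."""
    lower, upper = weak_stirling_bounds(n)
    f = factorial(n)
    return lower <= f * (1 + rel_tol) and f <= upper * (1 + rel_tol)


all_hold = all(bounds_hold(n) for n in range(1, 31))
ratio_at_30 = factorial(30) / stirling(30)  # tends to 1 as n grows
print(all_hold, ratio_at_30)
```

The ratio of the upper to the lower bound is always $\sqrt{2}$, so the weak form pins down $n!$ to within a constant factor, which is all that later applications need.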

5 Inequalities of Markov and Chebyshev

We only consider random variables that are real-valued and over discrete sample spaces. If $X$ is such a random variable, then we let $E[X]$ and $\mathrm{var}[X]$ respectively denote the expected value (mean) of $X$ and the variance of $X$.

Theorem 5.1 (Markov's Inequality) Let $X$ be a random variable with finite mean, and suppose $X \ge 0$. For every real $c > 0$,
\[ \Pr[X \ge c] \le \frac{E[X]}{c} . \]

Proof. Let $\Omega$ be the sample space for $X$. We have
\[
\begin{aligned}
E[X] &= \sum_{a \in \Omega} X(a) \Pr[a] = \sum_{a : X(a) \ge c} X(a) \Pr[a] + \sum_{a : X(a) < c} X(a) \Pr[a] \\
     &\ge \sum_{a : X(a) \ge c} X(a) \Pr[a] \ge c \sum_{a : X(a) \ge c} \Pr[a] = c \Pr[X \ge c] .
\end{aligned}
\]
Dividing both sides by $c$ proves the theorem.

Theorem 5.2 (Chebyshev's Inequality) Let $X$ be a random variable with finite mean and variance, and let $a > 0$ be real. Then
\[ \Pr[\, |X - E[X]| \ge a \,] \le \frac{\mathrm{var}[X]}{a^2} . \]

Proof. We invoke Markov's Inequality with the random variable $Y = (X - E[X])^2$, letting $c = a^2$. Note that $Y \ge 0$, $E[Y] = \mathrm{var}[X]$, and $\Pr[\, |X - E[X]| \ge a \,] = \Pr[Y \ge a^2]$.
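Both inequalities can be checked exactly on a small discrete distribution. The sketch below uses a fair six-sided die as the random variable; the die, the thresholds, and the helper names are illustrative assumptions, not taken from the notes. Exact rational arithmetic avoids any rounding questions.

```python
from fractions import Fraction


def expectation(dist):
    """E[X] for a distribution given as {value: probability}."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """var[X] = E[(X - E[X])^2]."""
    mu = expectation(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())


def prob(dist, event):
    """Probability that the predicate `event` holds for X."""
    return sum(p for x, p in dist.items() if event(x))


# A fair die: values 1..6, each with probability 1/6.
die = {x: Fraction(1, 6) for x in range(1, 7)}
mean = expectation(die)
var = variance(die)

# Markov with c = 5: Pr[X >= 5] <= E[X] / 5.
markov_lhs = prob(die, lambda x: x >= 5)
markov_rhs = mean / 5

# Chebyshev with a = 2: Pr[|X - E[X]| >= 2] <= var[X] / 4.
cheb_lhs = prob(die, lambda x: abs(x - mean) >= 2)
cheb_rhs = var / 2 ** 2

print(markov_lhs, markov_rhs, cheb_lhs, cheb_rhs)
```

In this example both bounds are far from tight, which is typical: the inequalities use only the mean and variance and know nothing else about the distribution.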

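6 Relative Entropy

Let $p = (p_1, p_2, \ldots)$ and $q = (q_1, q_2, \ldots)$ be two probability distributions over some (finite or infinite) discrete sample space $\{1, 2, \ldots\}$. The relative entropy of $q$ with respect to $p$ is defined as
\[ H(q; p) = -\sum_i p_i \lg \frac{q_i}{p_i}, \tag{9} \]
where the sum is taken over all $i$ such that $p_i > 0$. If $q_i = 0$ and $p_i > 0$ for some $i$, then $H(q; p) = \infty$. Otherwise, the sum in (9) may or may not converge, but we always have the following regardless:

Theorem 6.1 $H(q; p) \ge 0$, with equality holding if and only if $p = q$.

Proof. We use the fact that $\log x \le x - 1$ for all $x > 0$, with equality holding iff $x = 1$. We have
\[
\begin{aligned}
H(q; p) &= -\sum_i p_i \lg \frac{q_i}{p_i} = -\frac{1}{\log 2} \sum_i p_i \log \frac{q_i}{p_i} \\
        &\ge -\frac{1}{\log 2} \sum_i p_i \left( \frac{q_i}{p_i} - 1 \right) = \frac{1}{\log 2} \sum_i (p_i - q_i) \\
        &= \frac{1}{\log 2} \Bigl( 1 - \sum_i q_i \Bigr) \ge 0 .
\end{aligned}
\]
The last inequality holds because the sums range only over those $i$ with $p_i > 0$, so $\sum_i q_i \le 1$. It is easy to see that equality holds above if and only if $p = q$.

An important special case is when $q = (q_1, \ldots, q_n) = (1/n, \ldots, 1/n)$ is the uniform distribution on a sample space of size $n$ (and $p = (p_1, \ldots, p_n)$ is arbitrary). In this case, we have
\[ H(q; p) = \lg n - H(p_1, \ldots, p_n), \tag{10} \]
where $H(p_1, \ldots, p_n) = -\sum_i p_i \lg p_i$ is the Shannon entropy of $p$.

If $(p, 1-p)$ and $(q, 1-q)$ are binary distributions, then we abbreviate $H((q, 1-q); (p, 1-p))$ by $h(q; p)$. Note that by (10), $h(1/2; p) = 1 - h(p)$, where $h(p) = H(p, 1-p)$ is the binary entropy.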
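The definition of relative entropy and the uniform special case can be exercised numerically. The following is a minimal sketch that follows the argument order used in this section, with the sum weighted by $p$ and restricted to indices where $p_i > 0$; the function names and the sample distribution are assumptions made here for illustration.

```python
from math import inf, isclose, log2


def relative_entropy(q, p):
    """H(q; p) = -sum_i p_i lg(q_i / p_i), summed over i with p_i > 0."""
    total = 0.0
    for qi, pi in zip(q, p):
        if pi > 0:
            if qi == 0:
                return inf  # q_i = 0 while p_i > 0
            total -= pi * log2(qi / pi)
    return total


def entropy(p):
    """Shannon entropy H(p_1, ..., p_n) = -sum_i p_i lg p_i."""
    return -sum(pi * log2(pi) for pi in p if pi > 0)


# Illustrative distribution (an assumption of this sketch).
p = [0.5, 0.25, 0.25]
uniform = [1 / 3] * 3

h_uniform = relative_entropy(uniform, p)   # should equal lg 3 - H(p)
h_self = relative_entropy(p, p)            # should be 0
h_binary = relative_entropy([0.5, 0.5], [0.2, 0.8])  # should equal 1 - h(0.2)

print(h_uniform, h_self, h_binary)
print(isclose(h_uniform, log2(3) - entropy(p)))
```

Note that the roles of the two arguments are not symmetric: swapping `q` and `p` generally changes the value.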

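7 A Standard Tail Inequality

It might be necessary to read Section 6 before this one.

Let $0 < p < 1$ and let $n > 0$ be an integer. In this section, we give an upper bound for the sum
\[ \sum_{i=0}^{t} \binom{n}{i} p^i (1-p)^{n-i}, \]
where $t \le pn$. [For example, this sum is the probability of getting at most $t$ heads among $n$ flips of a $p$-biased coin (i.e., $n$ identical Bernoulli trials with bias $p$). The expected number of heads among $n$ flips is $pn$, and we want to show that the probability of getting significantly fewer than $pn$ heads diminishes exponentially with $n$.]

Theorem 7.1 Let $n$ be a positive integer. Let $0 < p < 1$ be arbitrary, and set $q = 1 - p$. If $t$ is an integer such that $0 \le t \le pn$, then
\[ \sum_{i=0}^{t} \binom{n}{i} p^i q^{n-i} \le 2^{-n h(p;\, t/n)}, \tag{11} \]
where $h(\,\cdot\,; \cdot\,)$ is the binary relative entropy defined in Section 6.

Proof. If $t = 0$, then $h(p; t/n) = h(p; 0) = -\lg q$, and so both sides of (11) equal $q^n$ and the inequality is satisfied.

Now suppose $0 < t \le pn$. Set $\lambda = t/n$, and let $\mu = 1 - \lambda$. Note that $0 < \lambda \le p < 1$ and $0 < q \le \mu < 1$. Define
\[ C = \frac{p^t q^{n-t}}{\lambda^t \mu^{n-t}} . \]
For any $0 \le i \le t$, we have
\[ p^i q^{n-i} = C \left( \frac{q}{p} \right)^{t-i} \lambda^t \mu^{n-t} \le C \left( \frac{\mu}{\lambda} \right)^{t-i} \lambda^t \mu^{n-t} = C \lambda^i \mu^{n-i} . \]
Therefore, starting with the left-hand side of (11), we get
\[ \sum_{i=0}^{t} \binom{n}{i} p^i q^{n-i} \le C \sum_{i=0}^{t} \binom{n}{i} \lambda^i \mu^{n-i} \le C \sum_{i=0}^{n} \binom{n}{i} \lambda^i \mu^{n-i} = C (\lambda + \mu)^n = C . \]
For the right-hand side of (11), we get
\[ 2^{-n h(p;\, t/n)} = 2^{-n h(p;\, \lambda)} = 2^{n [\lambda \lg(p/\lambda) + \mu \lg(q/\mu)]} = \left( \frac{p}{\lambda} \right)^{n\lambda} \left( \frac{q}{\mu} \right)^{n\mu} = \left( \frac{p}{\lambda} \right)^{t} \left( \frac{q}{\mu} \right)^{n-t} = C, \]
which proves the theorem.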
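The bound (11) can be compared with the exact binomial tail for concrete parameters. Below is a minimal sketch using only the standard library; the function names, the parameter choices, and the small relative tolerance (the bound is an equality at $t = 0$, so rounding could otherwise tip the comparison) are assumptions of this example rather than part of the notes.

```python
from math import comb, log2


def binomial_lower_tail(n, p, t):
    """Left-hand side of (11): probability of at most t heads in n flips."""
    q = 1 - p
    return sum(comb(n, i) * p ** i * q ** (n - i) for i in range(t + 1))


def binary_relative_entropy(p, lam):
    """h(p; lam) = -[lam lg(p/lam) + mu lg(q/mu)], skipping zero-weight terms."""
    q, mu = 1 - p, 1 - lam
    total = 0.0
    if lam > 0:
        total -= lam * log2(p / lam)
    if mu > 0:
        total -= mu * log2(q / mu)
    return total


def tail_bound(n, p, t):
    """Right-hand side of (11): 2^(-n h(p; t/n))."""
    return 2.0 ** (-n * binary_relative_entropy(p, t / n))


def bound_holds(n, p, t, rel_tol=1e-9):
    """Check (11) for one triple (n, p, t), allowing for rounding."""
    return binomial_lower_tail(n, p, t) <= tail_bound(n, p, t) * (1 + rel_tol)


# Illustrative parameters: every integer t with 0 <= t <= p*n.
fair_ok = all(bound_holds(20, 0.5, t) for t in range(0, 11))
biased_ok = all(bound_holds(50, 0.3, t) for t in range(0, 16))
print(fair_ok, biased_ok, binomial_lower_tail(50, 0.3, 5), tail_bound(50, 0.3, 5))
```

For fixed $p$ and fixed ratio $t/n < p$, the exponent $n\,h(p; t/n)$ grows linearly in $n$, which is the exponential decay promised at the start of the section.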