CSCE 790S Background Results


Stephen A. Fenner
September 8, 2011

Abstract

These results are background to the course CSCE 790S/CSCE 790B, Quantum Computation and Information (Spring 2007 and Fall 2011). Each result, or group of related results, is roughly one page long.

Contents

1 The Cauchy-Schwarz Inequality
2 The Schur Triangular Form and the Spectral Theorem
3 The Polar and Singular Value Decompositions
4 Stirling's Approximation
5 Inequalities of Markov and Chebyshev
6 Relative Entropy
7 A Standard Tail Inequality

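1 The Cauchy-Schwarz Inequality

This is one of the most versatile inequalities in all of mathematics.

Theorem 1.1 (Cauchy-Schwarz) For any real numbers $a_1, \ldots, a_n$ and $b_1, \ldots, b_n$,
\[ |a_1 b_1 + \cdots + a_n b_n| \le \sqrt{(a_1^2 + \cdots + a_n^2)(b_1^2 + \cdots + b_n^2)}, \tag{1} \]
with equality holding iff the two vectors $(a_1, \ldots, a_n)$ and $(b_1, \ldots, b_n)$ are linearly dependent.

Proof. There are many, many ways of proving this. Here is a direct calculation. We have
\[
\begin{aligned}
0 &\le \sum_{1 \le i < j \le n} (a_i b_j - a_j b_i)^2 \\
&= \sum_{i<j} \left[ a_i b_j (a_i b_j - a_j b_i) - a_j b_i (a_i b_j - a_j b_i) \right] \\
&= \sum_{i<j} \left[ a_i b_j (a_i b_j - a_j b_i) + a_j b_i (a_j b_i - a_i b_j) \right] \\
&= \sum_{i<j} a_i b_j (a_i b_j - a_j b_i) + \sum_{i<j} a_j b_i (a_j b_i - a_i b_j) \\
&= \sum_{i<j} a_i b_j (a_i b_j - a_j b_i) + \sum_{j<i} a_i b_j (a_i b_j - a_j b_i) \\
&= \sum_{i \ne j} a_i b_j (a_i b_j - a_j b_i) = \sum_{i,j} a_i b_j (a_i b_j - a_j b_i) \\
&= \sum_{i,j} a_i^2 b_j^2 - \sum_{i,j} a_i b_i a_j b_j
= \left( \sum_{i=1}^n a_i^2 \right) \left( \sum_{j=1}^n b_j^2 \right) - \left( \sum_{i=1}^n a_i b_i \right)^2 .
\end{aligned}
\]
Adding $\left( \sum_i a_i b_i \right)^2$ to both sides, then taking the square root of both sides (noting that the square root function is strictly monotone increasing), yields the inequality (1).

Clearly, equality holds above iff $a_i b_j - a_j b_i = 0$ for all $i < j$, or equivalently, $a_i b_j = a_j b_i$ for all $i < j$. It is not hard to check that this condition is equivalent to $(a_1, \ldots, a_n)$ and $(b_1, \ldots, b_n)$ being linearly dependent.

Note that (1) still holds if we remove the absolute value delimiters from the left-hand side. In that case, equality holds iff there exists a $\lambda \ge 0$ such that either $(a_1, \ldots, a_n) = \lambda (b_1, \ldots, b_n)$ or $(b_1, \ldots, b_n) = \lambda (a_1, \ldots, a_n)$.

Corollary 1.2 (Triangle Inequality for Complex Numbers) For any $z, w \in \mathbb{C}$, $|z + w| \le |z| + |w|$.

Proof. Writing $z = a_1 + a_2 i$ and $w = b_1 + b_2 i$ for real $a_1, a_2, b_1, b_2$, we have
\[
\begin{aligned}
|z + w|^2 &= (a_1 + b_1)^2 + (a_2 + b_2)^2 \\
&= a_1^2 + a_2^2 + b_1^2 + b_2^2 + 2(a_1 b_1 + a_2 b_2) \\
&\le a_1^2 + a_2^2 + b_1^2 + b_2^2 + 2\sqrt{(a_1^2 + a_2^2)(b_1^2 + b_2^2)} \\
&= \left( \sqrt{a_1^2 + a_2^2} + \sqrt{b_1^2 + b_2^2} \right)^2 = (|z| + |w|)^2 .
\end{aligned}
\]
Taking the square root of both sides yields the corollary.

Corollary 1.3 For any complex numbers $z_1, \ldots, z_n$ and $w_1, \ldots, w_n$,
\[ |z_1^* w_1 + \cdots + z_n^* w_n| \le \sqrt{(|z_1|^2 + \cdots + |z_n|^2)(|w_1|^2 + \cdots + |w_n|^2)}. \tag{2} \]

Proof. We have
\[
\begin{aligned}
|z_1^* w_1 + \cdots + z_n^* w_n| &\le |z_1^* w_1| + \cdots + |z_n^* w_n| && \text{(by Corollary 1.2)} \\
&= |z_1| |w_1| + \cdots + |z_n| |w_n| \\
&\le \sqrt{(|z_1|^2 + \cdots + |z_n|^2)(|w_1|^2 + \cdots + |w_n|^2)} && \text{(by Theorem 1.1)}.
\end{aligned}
\]

Corollary 1.4 For any column vectors $u, v \in \mathbb{C}^n$, $|u^* v| \le \|u\| \, \|v\|$.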
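The identity at the heart of the proof of Theorem 1.1 can be checked numerically. The following is a minimal sketch, not part of the original notes; the function names `cauchy_schwarz_gap` and `cross_term_sum` are hypothetical helpers chosen for illustration. It compares the gap between the two sides of the squared inequality (1) with the sum of squares over pairs $i < j$ that the proof starts from.

```python
from itertools import combinations

def cauchy_schwarz_gap(a, b):
    """Return (sum a_i^2)(sum b_i^2) - (sum a_i b_i)^2, nonnegative by (1)."""
    dot = sum(x * y for x, y in zip(a, b))
    return sum(x * x for x in a) * sum(y * y for y in b) - dot * dot

def cross_term_sum(a, b):
    """Return the sum over i < j of (a_i b_j - a_j b_i)^2 used in the proof."""
    return sum((a[i] * b[j] - a[j] * b[i]) ** 2
               for i, j in combinations(range(len(a)), 2))

# Integer inputs keep the arithmetic exact, so the two quantities agree exactly.
a = [1, 2, 3]
b = [4, -5, 6]
print(cauchy_schwarz_gap(a, b), cross_term_sum(a, b))

# Linearly dependent vectors give a gap of zero (the equality case).
print(cauchy_schwarz_gap([1, 2, 3], [2, 4, 6]))
```

Using integers avoids floating-point noise, which makes the equality case visible as an exact zero.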

2 The Schur Triangular Form and the Spectral Theorem

Theorem 2.1 (Schur Triangular Form) For every $n \times n$ matrix $M$, there exists a unitary $U$ and an upper triangular $T$ (both $n \times n$ matrices) such that $M = U T U^*$.

Proof. We prove this by induction on $n$. The $n = 1$ case is trivial. Now supposing the theorem holds for $n \ge 1$, we prove it holds for $n + 1$. Let $M$ be any $(n+1) \times (n+1)$ matrix. We let $A$ be the linear operator on $\mathbb{C}^{n+1}$ whose matrix is $M$ with respect to some orthonormal basis. $A$ has some eigenvalue $\lambda$ with corresponding unit eigenvector $v$. Using the Gram-Schmidt procedure, we can find an orthonormal basis $\{y_1, \ldots, y_{n+1}\}$ for $\mathbb{C}^{n+1}$ such that $y_1 = v$. With respect to this basis, the matrix for $A$ looks like
\[ N = \begin{bmatrix} \lambda & w^* \\ 0 & N' \end{bmatrix}, \]
where $w$ is some vector in $\mathbb{C}^n$ and $N'$ is an $n \times n$ matrix. Since $M$ and $N$ represent the same operator with respect to different orthonormal bases, they must be unitarily conjugate, i.e., there is a unitary $V$ such that $M = V N V^*$. $N'$ is an $n \times n$ matrix, so we apply the inductive hypothesis to get a unitary $W'$ and an upper triangular $T'$ (both $n \times n$ matrices) such that $N' = W' T' W'^*$. Now we can factor $N$:
\[
N = \begin{bmatrix} \lambda & w^* \\ 0 & W' T' W'^* \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 0 & W' \end{bmatrix}
\begin{bmatrix} \lambda & w^* W' \\ 0 & T' \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 0 & W'^* \end{bmatrix}
= W T W^*,
\]
where
\[ W = \begin{bmatrix} 1 & 0 \\ 0 & W' \end{bmatrix} \quad \text{and} \quad T = \begin{bmatrix} \lambda & w^* W' \\ 0 & T' \end{bmatrix}. \]
$T$ is clearly upper triangular, and it is easily checked that $W W^* = I$, using the fact that $W'$ is unitary. Thus $W$ is unitary, and we get $M = V N V^* = V W T W^* V^* = U T U^*$, where $U = V W$ is unitary.

A Schur basis for an operator $A$ is an orthonormal basis that gives an upper triangular matrix for $A$.

Theorem 2.2 If an $n \times n$ matrix $A$ is both upper triangular and normal, then $A$ is diagonal.

Proof. Suppose that $A$ is upper triangular and normal, but not diagonal. Then there is some $i < j$ such that $[A]_{ij} \ne 0$. Let $j$ be least such that there exists $i < j$ such that $[A]_{ij} \ne 0$. For this $i$ and $j$, we get
\[
[A A^*]_{ii} = \sum_{k=1}^n [A]_{ik} [A^*]_{ki} = \sum_{k=1}^n [A]_{ik} [A]_{ik}^* = \sum_{k=1}^n |[A]_{ik}|^2 \ge |[A]_{ii}|^2 + |[A]_{ij}|^2 > |[A]_{ii}|^2 . \tag{3}
\]
The last inequality follows from the fact that $[A]_{ij} \ne 0$. Similarly,
\[
[A^* A]_{ii} = \sum_{k=1}^n [A]_{ki}^* [A]_{ki} = \sum_{k=1}^n |[A]_{ki}|^2 = \sum_{k=1}^{i} |[A]_{ki}|^2 = |[A]_{ii}|^2 . \tag{4}
\]
The next to last equation holds because $A$ is upper triangular, and the last equation holds because of our minimum choice of $j$ and the fact that $i < j$. From (3) and (4), we have $[A A^*]_{ii} > [A^* A]_{ii}$. But $A$ is normal, so these two quantities must be equal. From this contradiction we get that $A$ must be diagonal.

Corollary 2.3 (Spectral Theorem for Normal Operators) Every normal matrix is unitarily conjugate to a diagonal matrix. Equivalently, every normal operator has an orthonormal eigenbasis.

(Indeed, by Theorem 2.1 a normal matrix $M$ can be written as $M = U T U^*$ with $T$ upper triangular; $T = U^* M U$ is normal because $M$ is, so $T$ is diagonal by Theorem 2.2.)
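The comparison of diagonal entries in (3) and (4) is easy to see on a small example. The sketch below is an illustration only, not part of the original notes; `conj_transpose`, `matmul`, and `diagonal_gap` are hypothetical helper names, and plain nested lists stand in for matrices. It computes $[AA^*]_{ii} - [A^*A]_{ii}$ for each $i$: an upper triangular matrix with a nonzero off-diagonal entry gives a nonzero gap (so it cannot be normal), while a diagonal matrix gives zero.

```python
def conj_transpose(m):
    """Return the conjugate transpose (adjoint) of a square matrix."""
    n = len(m)
    return [[complex(m[j][i]).conjugate() for j in range(n)] for i in range(n)]

def matmul(x, y):
    """Multiply two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def diagonal_gap(a):
    """Return [AA*]_ii - [A*A]_ii for each i; all zeros when A is normal."""
    a_star = conj_transpose(a)
    left = matmul(a, a_star)
    right = matmul(a_star, a)
    return [(left[i][i] - right[i][i]).real for i in range(len(a))]

upper = [[1, 2], [0, 3]]       # upper triangular, not diagonal
diagonal = [[1, 0], [0, 3j]]   # diagonal, hence normal

print(diagonal_gap(upper))     # nonzero first entry, as in (3) versus (4)
print(diagonal_gap(diagonal))  # all zeros
```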

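3 The Polar and Singular Value Decompositions

Throughout this section, $|A|$ denotes $\sqrt{A^* A}$.

Theorem 3.1 (Polar Decomposition) For every $n \times n$ matrix $A$ there is an $n \times n$ unitary matrix $U$ and a unique $n \times n$ matrix $H$ such that $H \ge 0$ and $A = U H$. In fact, $H = |A|$.

Proof. First uniqueness. If $A = U H$ with $U$ unitary and $H \ge 0$, then
\[ |A| = \sqrt{A^* A} = \sqrt{H^* U^* U H} = \sqrt{H^* H} = \sqrt{H^2} = H . \]

Now existence. Let $\{e_1, \ldots, e_n\}$ be the standard orthonormal basis for $\mathbb{C}^n$. We first prove the special case where $|A|$ is the diagonal matrix $\mathrm{diag}(s_1, s_2, \ldots, s_n)$ for some real values $s_1 \ge s_2 \ge \cdots \ge s_n \ge 0$. Let $0 \le k \le n$ be largest such that $s_k > 0$ ($k = 0$ if $A = 0$). Thus we have
\[ |A| = \begin{bmatrix} D & 0 \\ 0 & 0 \end{bmatrix}, \]
where $D$ is the $k \times k$ nonsingular matrix $\mathrm{diag}(s_1, \ldots, s_k)$. If $j > k$, then $|A| e_j = 0$, and thus $0 = |A| e_j = |A|^2 e_j = A^* A e_j$, whence
\[ \|A e_j\|^2 = (A e_j)^* (A e_j) = e_j^* A^* A e_j = e_j^* 0 = 0, \]
and so $A e_j = 0$. This means that $A = \begin{bmatrix} B & 0 \end{bmatrix}$, where $B$ is some $n \times k$ matrix, and the last $n - k$ columns of $A$ are $0$. We have
\[
\begin{bmatrix} B^* B & 0 \\ 0 & 0 \end{bmatrix}
= \begin{bmatrix} B^* \\ 0 \end{bmatrix} \begin{bmatrix} B & 0 \end{bmatrix}
= A^* A = |A|^2
= \begin{bmatrix} D^2 & 0 \\ 0 & 0 \end{bmatrix},
\]
and so $B^* B = D^2$. Let $W$ be an $n \times (n - k)$ matrix whose columns are unit vectors orthogonal to all the columns of $B$ and to each other. (There are many possibilities for $W$ if $k < n$; the columns of $W$ can be any orthonormal set in the orthogonal complement of the space spanned by the columns of $B$.) By our choice of $W$, we have $B^* W = 0$, $W^* B = 0$, and $W^* W = I$. Finally, define
\[ U := \begin{bmatrix} B D^{-1} & W \end{bmatrix}. \]
We claim that $U$ is unitary and that $A = U |A|$. Noting that $D^{-1}$ is Hermitian, we have
\[
U^* U = \begin{bmatrix} D^{-1} B^* \\ W^* \end{bmatrix} \begin{bmatrix} B D^{-1} & W \end{bmatrix}
= \begin{bmatrix} D^{-1} B^* B D^{-1} & D^{-1} B^* W \\ W^* B D^{-1} & W^* W \end{bmatrix}
= \begin{bmatrix} I & 0 \\ 0 & I \end{bmatrix} = I,
\]
and therefore $U$ is unitary. We also have
\[ U |A| = \begin{bmatrix} B D^{-1} & W \end{bmatrix} \begin{bmatrix} D & 0 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} B & 0 \end{bmatrix} = A . \]

Now for the general case. Since $|A| \ge 0$ (and hence is normal), there is a unitary $V$ such that $V^* |A| V = \mathrm{diag}(s_1, \ldots, s_n)$ for some real values $s_1 \ge \cdots \ge s_n \ge 0$. Since
\[ V^* |A| V = V^* \sqrt{A^* A}\, V = \sqrt{V^* A^* A V} = \sqrt{(V^* A V)^* (V^* A V)} = |V^* A V|, \]
we see that $V^* A V$ satisfies the special case, above, and so there is a unitary $U$ such that $V^* A V = U |V^* A V|$. It follows that
\[ A = V V^* A V V^* = V U |V^* A V| V^* = V U V^* |A| V V^* = V U V^* |A|, \]
which proves the theorem because $V U V^*$ is unitary.

Theorem 3.2 (Singular Value Decomposition) For any $n \times n$ matrix $A$ there exist $n \times n$ unitary matrices $V, W$ and unique real values $s_1 \ge s_2 \ge \cdots \ge s_n \ge 0$ such that $A = V D W$, where $D = \mathrm{diag}(s_1, \ldots, s_n)$. Furthermore, $s_1, \ldots, s_n$ are the eigenvalues of $|A|$.

The $s_1, \ldots, s_n$ are known as the singular values of $A$.

Proof. For uniqueness, if $A = V D W$ as above, then
\[ |A| = \sqrt{A^* A} = \sqrt{W^* D V^* V D W} = \sqrt{W^* D^2 W} = W^* \sqrt{D^2}\, W = W^* D W, \]
and so the diagonal entries of $D$ must be the eigenvalues of $|A|$. For existence, the Polar Decomposition gives a unitary $U$ such that $A = U |A|$. Since $|A| \ge 0$ (and hence is normal), there exists a unitary $Y$ such that $|A| = Y D Y^*$, where $D = \mathrm{diag}(s_1, \ldots, s_n)$ for some $s_1 \ge \cdots \ge s_n \ge 0$. Then $A = U |A| = U Y D Y^*$. Setting $V := U Y$ and $W := Y^*$ proves the theorem.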
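To make the polar decomposition concrete, here is a minimal sketch for the smallest interesting case, an invertible real $2 \times 2$ matrix. It is not the construction used in the proof above. It relies on an assumption that is not from the notes: the closed form $\sqrt{M} = (M + \sqrt{\det M}\, I) / \sqrt{\operatorname{tr} M + 2\sqrt{\det M}}$, which holds for $2 \times 2$ positive definite $M$ by the Cayley-Hamilton theorem. The name `polar_2x2` is hypothetical.

```python
import math

def polar_2x2(a):
    """Return (U, H) with A = U H, U orthogonal, H = |A| = sqrt(A^T A).

    Assumes A is a real 2x2 matrix with det(A) != 0. The square root uses a
    closed form specific to 2x2 positive definite matrices (an assumption of
    this sketch, not something taken from the notes).
    """
    (p, q), (r, s) = a
    # M = A^T A is symmetric positive definite when A is invertible.
    m11 = p * p + r * r
    m12 = p * q + r * s
    m22 = q * q + s * s
    root_det = abs(p * s - q * r)  # sqrt(det M) = |det A|
    scale = math.sqrt(m11 + m22 + 2 * root_det)
    h = [[(m11 + root_det) / scale, m12 / scale],
         [m12 / scale, (m22 + root_det) / scale]]
    # U = A H^{-1}, using the explicit inverse of the 2x2 matrix H.
    det_h = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    h_inv = [[h[1][1] / det_h, -h[0][1] / det_h],
             [-h[1][0] / det_h, h[0][0] / det_h]]
    u = [[sum(a[i][k] * h_inv[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    return u, h

u, h = polar_2x2([[1.0, 2.0], [3.0, 4.0]])
print(u)
print(h)
```

The eigenvalues of the returned `h` are the singular values of the input, in the sense of Theorem 3.2.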

4 Stirling's Approximation

Theorem 4.1 (Stirling's Approximation) $n! \sim \sqrt{2 \pi n}\, (n/e)^n$.

Here, $f(n) \sim g(n)$ means that $\lim_{n \to \infty} f(n)/g(n) = 1$. We'll prove a slightly weaker version of Theorem 4.1 that nevertheless suffices for all our purposes, namely,

Theorem 4.2 (Weak Stirling) For all positive integers $n$,
\[ e \sqrt{\frac{n}{2}} \left( \frac{n}{e} \right)^n \le n! \le e \sqrt{n} \left( \frac{n}{e} \right)^n . \]

Proof. We start with an integral approximation. The theorem clearly holds for $n = 1$, so assume $n \ge 2$. Since the log function is concave downward, we claim that for all $i$ such that $2 \le i \le n$,
\[ \frac{\log i + \log(i-1)}{2} \le \int_{i-1}^{i} \log x \, dx \le \log i - \frac{1}{2i} . \tag{5} \]
The left-hand side is the area of the trapezoid $T_1$ formed by the points $(i-1, 0)$, $(i, 0)$, $(i, \log i)$, $(i-1, \log(i-1))$, and the right-hand side is the area of the trapezoid $T_2$ formed by the points $(i-1, 0)$, $(i, 0)$, $(i, \log i)$, $(i-1, \log i - 1/i)$. Note that $T_2$'s upper edge is the tangent line to the curve $y = \log x$ at the point $(i, \log i)$. By concavity of log, the region under the curve $y = \log x$ in the interval $[i-1, i]$ contains $T_1$ and is contained in $T_2$, hence the inequalities (5).

Now note that $\log(n!) = \sum_{i=1}^n \log i = \sum_{i=2}^n \log i$. Summing (5) from $i = 2$ to $n$ and simplifying, we get
\[ \log(n!) - \frac{\log n}{2} \le \int_1^n \log x \, dx = n \log n - n + 1 \le \log(n!) - \frac{1}{2} \sum_{i=2}^n \frac{1}{i}, \tag{6} \]
using the closed form $\int \log x \, dx = x \log x - x + C$. The sum on the right-hand side of (6) is the Harmonic series, which satisfies another integral approximation:
\[ \sum_{i=2}^n \frac{1}{i} \ge \int_2^n \frac{dx}{x} = \log n - \log 2 . \tag{7} \]
Equations (6) and (7) yield
\[ \log(n!) - \frac{\log n}{2} \le n \log n - n + 1 \le \log(n!) - \frac{\log n}{2} + \frac{\log 2}{2}, \]
and so
\[ n \log n - n + 1 + \frac{\log n}{2} - \frac{\log 2}{2} \le \log(n!) \le n \log n - n + 1 + \frac{\log n}{2} . \tag{8} \]
Taking $e$ to the power of all three quantities in (8) and simplifying, we have
\[ e \sqrt{\frac{n}{2}} \left( \frac{n}{e} \right)^n \le n! \le e \sqrt{n} \left( \frac{n}{e} \right)^n \]
as desired.
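The bounds of Theorem 4.2 can be compared against exact factorials for small $n$. The following is a minimal sketch under the reading of the theorem given above (lower bound $e\sqrt{n/2}\,(n/e)^n$, upper bound $e\sqrt{n}\,(n/e)^n$); the function names are hypothetical, and floating-point rounding means the comparison is only approximate at the boundary case $n = 1$.

```python
import math

def weak_stirling_bounds(n):
    """Return (lower, upper) bounds on n! from Theorem 4.2, for n >= 1."""
    base = math.e * (n / math.e) ** n
    return base * math.sqrt(n / 2), base * math.sqrt(n)

def stirling_estimate(n):
    """Return sqrt(2 pi n) (n/e)^n, the estimate from Theorem 4.1."""
    return math.sqrt(2 * math.pi * n) * (n / math.e) ** n

for n in (1, 2, 5, 10, 20):
    lower, upper = weak_stirling_bounds(n)
    exact = math.factorial(n)
    # The last column is the ratio estimate / n!, which tends to 1 as n grows.
    print(n, lower, exact, upper, stirling_estimate(n) / exact)
```

Since $\sqrt{2\pi} \approx 2.507$ lies between $e/\sqrt{2} \approx 1.922$ and $e \approx 2.718$, the sharp constant of Theorem 4.1 sits inside the window given by the weak version.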

5 Inequalities of Markov and Chebyshev

We only consider random variables that are real-valued and over discrete sample spaces. If $X$ is such a random variable, then we let $E[X]$ and $\mathrm{var}[X]$ respectively denote the expected value (mean) of $X$ and the variance of $X$.

Theorem 5.1 (Markov's Inequality) Let $X$ be a random variable with finite mean, and suppose $X \ge 0$. For every real $c > 0$,
\[ \Pr[X \ge c] \le \frac{E[X]}{c} . \]

Proof. Let $\Omega$ be the sample space for $X$. We have
\[
\begin{aligned}
E[X] &= \sum_{a \in \Omega} X(a) \Pr[a] = \sum_{a : X(a) \ge c} X(a) \Pr[a] + \sum_{a : X(a) < c} X(a) \Pr[a] \\
&\ge \sum_{a : X(a) \ge c} X(a) \Pr[a] \ge c \sum_{a : X(a) \ge c} \Pr[a] = c \cdot \Pr[X \ge c] .
\end{aligned}
\]
Dividing both sides by $c$ proves the theorem.

Theorem 5.2 (Chebyshev's Inequality) Let $X$ be a random variable with finite mean and variance, and let $a > 0$ be real. Then
\[ \Pr[\, |X - E[X]| \ge a \,] \le \frac{\mathrm{var}[X]}{a^2} . \]

Proof. We invoke Markov's Inequality with the random variable $Y = (X - E[X])^2$, letting $c = a^2$. Note that $Y \ge 0$, $E[Y] = \mathrm{var}[X]$, and $\Pr[\, |X - E[X]| \ge a \,] = \Pr[Y \ge a^2]$.
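Both inequalities can be checked exactly on a small discrete distribution. The sketch below is an illustration, not part of the original notes: it uses a fair six-sided die as an assumed example, represents the distribution as a dictionary from outcomes to probabilities, and uses exact rational arithmetic. The helper names `expectation`, `variance`, and `prob` are hypothetical.

```python
from fractions import Fraction

def expectation(dist):
    """E[X] for a distribution given as {value: probability}."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """var[X] = E[(X - E[X])^2]."""
    mean = expectation(dist)
    return sum((x - mean) ** 2 * p for x, p in dist.items())

def prob(dist, event):
    """Probability that the outcome satisfies the predicate `event`."""
    return sum(p for x, p in dist.items() if event(x))

die = {face: Fraction(1, 6) for face in range(1, 7)}
mean = expectation(die)
var = variance(die)

# Markov with c = 5: Pr[X >= 5] <= E[X] / 5.
markov_lhs = prob(die, lambda x: x >= 5)
markov_rhs = mean / 5

# Chebyshev with a = 2: Pr[|X - E[X]| >= 2] <= var[X] / 4.
chebyshev_lhs = prob(die, lambda x: abs(x - mean) >= 2)
chebyshev_rhs = var / 4

print(markov_lhs, markov_rhs)
print(chebyshev_lhs, chebyshev_rhs)
```

On this example neither bound is tight, which is typical: both inequalities trade sharpness for requiring almost nothing about the distribution.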

6 Relative Entropy

Let $p = (p_1, p_2, \ldots)$ and $q = (q_1, q_2, \ldots)$ be two probability distributions over some (finite or infinite) discrete sample space $\{1, 2, \ldots\}$. The relative entropy of $q$ with respect to $p$ is defined as
\[ H(q; p) = -\sum_i p_i \lg \frac{q_i}{p_i}, \tag{9} \]
where the sum is taken over all $i$ such that $p_i > 0$. If $q_i = 0$ and $p_i > 0$ for some $i$, then $H(q; p) = \infty$. Otherwise, the sum in (9) may or may not converge, but we always have the following regardless:

Theorem 6.1 $H(q; p) \ge 0$, with equality holding if and only if $p = q$.

Proof. We use the fact that $\log x \le x - 1$ for all $x > 0$, with equality holding iff $x = 1$. We have
\[
\begin{aligned}
H(q; p) &= -\sum_i p_i \lg \frac{q_i}{p_i} = -\frac{1}{\log 2} \sum_i p_i \log \frac{q_i}{p_i}
\ge -\frac{1}{\log 2} \sum_i p_i \left( \frac{q_i}{p_i} - 1 \right) \\
&= \frac{1}{\log 2} \sum_i (p_i - q_i) = \frac{1}{\log 2} \left( 1 - \sum_i q_i \right) \ge 0,
\end{aligned}
\]
where every sum is over those $i$ with $p_i > 0$. It is easy to see that equality holds above if and only if $p = q$.

An important special case is when $q = (q_1, \ldots, q_n) = (1/n, \ldots, 1/n)$ is the uniform distribution on a sample space of size $n$ (and $p = (p_1, \ldots, p_n)$ is arbitrary). In this case, we have
\[ H(q; p) = \lg n - H(p_1, \ldots, p_n) . \tag{10} \]
If $(p, 1-p)$ and $(q, 1-q)$ are binary distributions, then we abbreviate $H((q, 1-q); (p, 1-p))$ by $h(q; p)$. Note that by (10), $h(1/2; p) = 1 - h(p)$.
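Definition (9) and the special case (10) translate directly into a few lines of code. The sketch below is illustrative only. It assumes that $H(p_1, \ldots, p_n)$ in (10) is the usual Shannon entropy $-\sum_i p_i \lg p_i$, which the notes use without defining; the function names `relative_entropy` and `entropy` are hypothetical.

```python
import math

def relative_entropy(q, p):
    """H(q; p) = -sum_i p_i lg(q_i / p_i) over i with p_i > 0, as in (9)."""
    total = 0.0
    for q_i, p_i in zip(q, p):
        if p_i > 0:
            if q_i == 0:
                return math.inf  # q_i = 0 with p_i > 0 gives infinity
            total -= p_i * math.log2(q_i / p_i)
    return total

def entropy(p):
    """Shannon entropy in bits (assumed meaning of H(p_1, ..., p_n) in (10))."""
    return -sum(p_i * math.log2(p_i) for p_i in p if p_i > 0)

p = [0.5, 0.25, 0.25]
uniform = [1 / 3, 1 / 3, 1 / 3]

# Equation (10): with q uniform on n points, H(q; p) = lg n - H(p).
print(relative_entropy(uniform, p), math.log2(3) - entropy(p))

# Theorem 6.1: the relative entropy is zero when the distributions coincide.
print(relative_entropy(p, p))
```

Note the argument order: following (9), the weights in the sum come from the second argument `p`, not from `q`.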

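7 A Standard Tail Inequality

It might be necessary to read Section 6 before this one. Let $0 < p < 1$ and let $n > 0$ be an integer. In this section, we give an upper bound for the sum $\sum_{i=0}^{t} \binom{n}{i} p^i (1-p)^{n-i}$, where $t \le pn$. [For example, this sum is the probability of getting at most $t$ heads among $n$ flips of a $p$-biased coin (i.e., $n$ identical Bernoulli trials with bias $p$). The expected number of heads among $n$ flips is $pn$, and we want to show that the probability of getting significantly fewer than $pn$ heads diminishes exponentially with $n$.]

Theorem 7.1 Let $n$ be a positive integer. Let $0 < p < 1$ be arbitrary, and set $q = 1 - p$. If $t$ is an integer such that $0 \le t \le pn$, then
\[ \sum_{i=0}^{t} \binom{n}{i} p^i q^{n-i} \le 2^{-n h(p;\, t/n)}, \tag{11} \]
where $h(\cdot\,; \cdot)$ is the binary relative entropy defined in Section 6.

Proof. If $t = 0$, then $h(p; t/n) = h(p; 0) = -\lg q$, and so both sides of (11) equal $q^n$ and the inequality is satisfied. Now suppose $0 < t \le pn$. Set $\lambda = t/n$, and let $\mu = 1 - \lambda$. Note that $0 < \lambda \le p < 1$ and $0 < q \le \mu < 1$. Define
\[ C = \frac{p^t q^{n-t}}{\lambda^t \mu^{n-t}} . \]
For any $0 \le i \le t$, we have
\[ p^i q^{n-i} = C \left( \frac{q}{p} \right)^{t-i} \lambda^t \mu^{n-t} \le C \left( \frac{\mu}{\lambda} \right)^{t-i} \lambda^t \mu^{n-t} = C \lambda^i \mu^{n-i} . \]
Therefore, starting with the left-hand side of (11), we get
\[ \sum_{i=0}^{t} \binom{n}{i} p^i q^{n-i} \le C \sum_{i=0}^{t} \binom{n}{i} \lambda^i \mu^{n-i} \le C \sum_{i=0}^{n} \binom{n}{i} \lambda^i \mu^{n-i} = C (\lambda + \mu)^n = C . \]
For the right-hand side of (11), we get
\[ 2^{-n h(p;\, t/n)} = 2^{-n h(p;\, \lambda)} = 2^{n [\lambda \lg(p/\lambda) + \mu \lg(q/\mu)]} = \left( \frac{p}{\lambda} \right)^{n\lambda} \left( \frac{q}{\mu} \right)^{n\mu} = \left( \frac{p}{\lambda} \right)^{t} \left( \frac{q}{\mu} \right)^{n-t} = C, \]
which proves the theorem.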
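The bound (11) can be compared with the exact binomial tail for concrete parameters. The following is a minimal sketch, not part of the original notes. It assumes the Section 6 convention $h(q; p) = -[\,p \lg(q/p) + (1-p) \lg((1-q)/(1-p))\,]$, with terms of zero weight omitted, and requires $0 < q < 1$. The function names are hypothetical, and `math.comb` needs Python 3.8 or later.

```python
import math

def binary_relative_entropy(q, p):
    """h(q; p) = H((q, 1-q); (p, 1-p)) in the Section 6 convention.

    Assumes 0 < q < 1 so that no logarithm of zero occurs; terms whose
    weight p_i is zero are skipped, as in definition (9).
    """
    total = 0.0
    for q_i, p_i in ((q, p), (1.0 - q, 1.0 - p)):
        if p_i > 0:
            total -= p_i * math.log2(q_i / p_i)
    return total

def lower_tail(n, p, t):
    """Exact probability of at most t heads in n flips of a p-biased coin."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(t + 1))

def tail_bound(n, p, t):
    """Right-hand side of (11): 2^(-n h(p; t/n))."""
    return 2.0 ** (-n * binary_relative_entropy(p, t / n))

n, p = 20, 0.5
for t in range(0, 11):  # all integers t with 0 <= t <= p * n
    print(t, lower_tail(n, p, t), tail_bound(n, p, t))
```

At $t = 0$ the two columns coincide, matching the first case of the proof; as $t$ approaches $pn$ the bound relaxes toward $1$.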