Lecture 9 Sept 29, 2017


Sketching Algorithms for Big Data — Fall 2017, Prof. Jelani Nelson
Lecture 9 — Sept 29, 2017. Scribe: Mitali Bafna

1 Fast JL transform

Typically we have some high-dimensional computational geometry problem, and we use JL to speed up our algorithm in two steps: (1) apply a JL map Π to reduce the problem to a low dimension m, then (2) solve the lower-dimensional problem. As m is made smaller, (2) typically becomes faster. However, ideally we would also like step (1) to be as fast as possible. In this section, we investigate two approaches to speed up the computation of Πx. One of the analyses will make use of the following Chernoff bound.

Theorem 1 (Chernoff bound). Let X₁, …, Xₙ be independent random variables in [0, τ], and write µ := E Σᵢ Xᵢ. Then ∀ε > 0,

  P(|Σᵢ Xᵢ − µ| > εµ) < 2·(e^ε / (1 + ε)^{1+ε})^{µ/τ}.

The approach we cover here was investigated by Ailon and Chazelle [AC09], and gives a running time of roughly O(n log n) to compute Πx. They called their transformation the Fast Johnson–Lindenstrauss Transform (FJLT). A construction similar to theirs, which we will analyze here, is the m × n matrix Π defined as

  Π = √(n/m) · SHD,    (1)

where S is an m × n sampling matrix with replacement (each row has a 1 in a uniformly random location and zeroes elsewhere, and the rows are independent), H is a bounded orthonormal system, and D = diag(α) for a vector α of n independent Rademachers. A bounded orthonormal system is a matrix H ∈ ℂ^{n×n} such that H*H = I and max_{i,j} |H_{i,j}| ≤ 1/√n. For example, H can be the normalized Fourier matrix or Hadamard matrix.

The motivation for the construction (1) is speed: D can be applied in O(n) time, H in O(n log n) time (e.g. using the Fast Fourier Transform, or divide and conquer in the case of the Hadamard matrix), and S in O(m) time. Thus, overall, applying Π to any fixed vector x takes O(n log n) time. Compare this with using a dense matrix of Rademachers, which takes O(mn) time to apply.

We will now give some intuition behind why such a Π works. Consider the sampling matrix S, which samples a random coordinate of x. If the norm of x is spread out among its coordinates, then in expectation the (rescaled) norm of Sx is the norm of x.
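As a concrete illustration of the speed claim, here is a minimal Python sketch (our own, not code from the notes; the helper names `fwht` and `fjlt_apply` are ours, and it assumes n is a power of two so that H can be the normalized Walsh–Hadamard matrix) that applies Π = √(n/m)·SHD in O(n log n) time:

```python
import numpy as np

def fwht(y):
    """In-place unnormalized fast Walsh-Hadamard transform; len(y) must be a power of 2."""
    n = len(y)
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = y[j], y[j + h]
                y[j], y[j + h] = a + b, a - b
        h *= 2
    return y

def fjlt_apply(x, m, rng):
    """Apply Pi = sqrt(n/m) * S H D to x: random signs, Hadamard, then sample m rows."""
    n = len(x)
    alpha = rng.choice([-1.0, 1.0], size=n)   # D = diag(alpha), Rademacher signs: O(n)
    y = fwht(alpha * x) / np.sqrt(n)          # y = H D x with H normalized: O(n log n)
    rows = rng.integers(0, n, size=m)         # S: m uniform coordinates, with replacement: O(m)
    return np.sqrt(n / m) * y[rows]

rng = np.random.default_rng(0)
n, m = 1024, 256
x = rng.standard_normal(n)
x /= np.linalg.norm(x)                        # unit-norm input
Px = fjlt_apply(x, m, rng)
print(np.linalg.norm(Px) ** 2)                # close to 1 with high probability
```

On a unit-norm x, ‖Πx‖₂² concentrates around 1 (this is Theorem 2 below); the pure-Python loops in `fwht` are for clarity, not performance.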
But what do we do in the case where x has its mass on only a few coordinates? It is known that a Fourier matrix spreads out the mass of vectors with highly concentrated mass, and vice versa. So we multiply S with H; and, to handle the case where H concentrates the mass of vectors whose mass is spread out, we finally multiply x in the beginning by D_α.
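This spreading phenomenon can be checked directly on a small example (our own demo, not from the notes; the `hadamard` helper builds H by Sylvester's recursion): the normalized Hadamard matrix maps the fully concentrated vector e₁ to a perfectly flat vector, and maps a perfectly flat vector back onto a single coordinate, which is exactly why the random signs D are applied first.

```python
import numpy as np

def hadamard(n):
    """Unnormalized Hadamard matrix via Sylvester's recursion; n must be a power of 2."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

n = 64
H = hadamard(n) / np.sqrt(n)     # normalized: H @ H.T = I, all entries +-1/sqrt(n)
e1 = np.zeros(n)
e1[0] = 1.0                      # fully concentrated: all mass on one coordinate
flat = np.ones(n) / np.sqrt(n)   # fully spread unit vector

print(np.abs(H @ e1).max())      # 1/sqrt(n): H spreads the concentrated vector flat
print(np.abs(H @ flat).max())    # 1.0: H concentrates the flat vector onto e1
```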

1.1 Analysis of [AC09]

We will show that for m ≳ ε⁻²·log(1/δ)·log(n/δ), the random Π described in (1) provides DJL. We will consider the case where H is the normalized Hadamard matrix, so that every entry of H is in {−1/√n, +1/√n}.

Theorem 2. Let x ∈ ℝⁿ be an arbitrary unit norm vector, and suppose 0 < ε, δ < 1/2. Also let Π = √(n/m)·SHD as described above, with a number of rows equal to m ≳ ε⁻²·log(1/δ)·log(n/δ). Then

  P_Π(|‖Πx‖₂² − 1| > ε) < δ.

Proof. Define y = HDx. The goal is to first show that ‖HDx‖_∞ = O(√(log(n/δ)/n)) with probability 1 − δ/2, then, conditioned on this event, that (1 − ε) ≤ (n/m)·‖Sy‖₂² ≤ (1 + ε) with probability 1 − δ/2.

For the first event, note

  yᵢ = (HDx)ᵢ = Σ_{j=1}^n αⱼ·((1/√n)·γ_{i,j}·xⱼ) = ⟨α, zᵢ⟩,

where |γ_{i,j}| = 1 (namely γ_{i,j} = √n·H_{i,j}), α is the Rademacher vector defining D, and zᵢ is the vector with (zᵢ)ⱼ = (1/√n)·γ_{i,j}·xⱼ, so that ‖zᵢ‖₂ = ‖x‖₂/√n = 1/√n. Thus by Khintchine's inequality, for each i,

  P(|yᵢ| > √(2·log(4n/δ)/n)) < 2·e^{−log(4n/δ)} = δ/(2n).

Thus by a union bound,

  P(‖y‖_∞ > √(2·log(4n/δ)/n)) = P(∃i : |yᵢ| > √(2·log(4n/δ)/n)) < δ/2.

Now, let us condition on this event, i.e. ‖y‖_∞² ≤ 2·log(4n/δ)/n =: τ/n. For i ∈ [m], let S(i) denote the coordinate sampled by the ith row of S, and define Xᵢ = n·y_{S(i)}². Then each Xᵢ lies in [0, τ], we have E Σ_{i=1}^m Xᵢ = m·‖y‖₂² = m, and ‖Πx‖₂² = (1/m)·Σ_{i=1}^m Xᵢ. By the Chernoff bound above,

  P(|Σ_{i=1}^m Xᵢ − m| > εm) < 2·(e^ε / (1 + ε)^{1+ε})^{m/τ},

which is at most δ/2 for m ≳ ε⁻²·log(1/δ)·log(n/δ).

Remark 1. Note that the FJLT as analyzed above provides suboptimal m. If one desires optimal m, one can instead use the embedding matrix Π′Π, where Π is the FJLT and Π′ is, say, a dense matrix with Rademacher entries having the optimal m′ = O(ε⁻²·log(1/δ)) rows. The downside is that the runtime to apply our embedding worsens by an additive m·m′ term. [AC09] slightly improved this additive term (by an ε² multiplicative factor) by replacing the matrix S with a random sparse matrix P. Can a better analysis be given? Unfortunately not by much: the quadratic dependence on log(1/δ) needs to be there, by an example of Eric Price. The bad case is when x has 1/n^{1/4} in each of its first √n coordinates, and imagine δ = 2^{−√n}.
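Theorem 2 is easy to sanity-check numerically. The experiment below is our own (not part of the notes; the parameter choices are arbitrary): it redraws D and S many times for a fixed unit vector x and counts how often ‖Πx‖₂² leaves [1 − ε, 1 + ε].

```python
import numpy as np

def hadamard(n):
    """Unnormalized Hadamard matrix via Sylvester's recursion; n must be a power of 2."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

rng = np.random.default_rng(1)
n, m, eps, trials = 256, 128, 0.5, 300
H = hadamard(n) / np.sqrt(n)                 # normalized Hadamard: entries +-1/sqrt(n)
x = rng.standard_normal(n)
x /= np.linalg.norm(x)                       # arbitrary fixed unit-norm input

fails = 0
for _ in range(trials):
    alpha = rng.choice([-1.0, 1.0], size=n)  # fresh signs: D = diag(alpha)
    y = H @ (alpha * x)                      # y = HDx, still unit norm
    rows = rng.integers(0, n, size=m)        # fresh S: m uniform samples with replacement
    est = (n / m) * np.sum(y[rows] ** 2)     # ||Pi x||_2^2
    fails += abs(est - 1.0) > eps
print(fails / trials)                        # empirical failure probability; should be small
```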

1.2 Analysis based on RIP

Here we give a different analysis, based on combining the main results of [KW11] and [RV08], which use the method of chaining, as seen in the last lecture. First we have to give a definition.

Definition 1. We say a matrix Π ∈ ℝ^{m×n} satisfies the (ε, k)-restricted isometry property (or "RIP" for short) if for all k-sparse vectors x of unit Euclidean norm,

  1 − ε ≤ ‖Πx‖₂² ≤ 1 + ε.

Using the fact that the operator norm of a symmetric matrix M equals sup_{‖x‖₂=1} |xᵀMx|, it follows that being (ε, k)-RIP is equivalent to

  sup_{T ⊆ [n], |T| = k} ‖(Π^{(T)})*Π^{(T)} − I_k‖ < ε,

where Π^{(T)} is the m × |T| matrix obtained by restricting Π to the columns in T.

As we will see later in the course, this notion of RIP is useful for compressed sensing, which is closely related to the heavy hitters problem. For now, we will just use it to obtain fast JL by combining it with the following theorem of [KW11].

Theorem 3. There exists a universal constant C > 0 such that the following holds. Suppose A satisfies the (ε/C, k)-RIP for k ≥ C·log(1/δ), and let α ∈ {−1, 1}ⁿ be chosen uniformly at random. Then for any x ∈ ℝⁿ of unit norm,

  P_α(|‖AD_α x‖₂² − 1| > ε) < δ.

In other words, the probability distribution Π = AD_α over matrices, induced by α, satisfies the distributional JL property.

We will not prove Theorem 3 here, but we will show that the matrix √(n/m)·SH satisfies RIP with positive probability for fairly small m. That is, there exists some choice of a few rows of a bounded orthonormal system that gives RIP (though unfortunately we do not know an explicit such set; though see [BDF+11]). A number of bounds on the best m to achieve RIP when sampling Fourier/Hadamard rows have been given, starting with the work of Candès and Tao [CT06]; subsequent works gave better bounds [RV08, Bou14, HR16]. An analysis was also given for a related construction in [NPW14]. We will give the analysis of [RV08], since it is most similar to what we saw in the last lecture.

Recall that for T ⊆ ℝⁿ,

  r(T) := E_σ sup_{x∈T} |⟨σ, x⟩|.

Last lecture we did not include the absolute values, but it does not make much of a difference (the Khintchine tail bound only differs by a factor of two).
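For toy sizes, the operator-norm characterization of Definition 1 can be verified exhaustively. The brute-force checker below is our own illustration (the name `rip_constant` is ours); it enumerates all (n choose k) supports, so it is only usable for small parameters:

```python
import numpy as np
from itertools import combinations

def rip_constant(Pi, k):
    """Exact sup over supports |T| = k of ||(Pi_T)^T Pi_T - I_k||_2.
    Brute force: exponential in k, so toy sizes only."""
    n = Pi.shape[1]
    worst = 0.0
    for T in combinations(range(n), k):
        G = Pi[:, T].T @ Pi[:, T]                      # k x k Gram matrix of chosen columns
        worst = max(worst, np.linalg.norm(G - np.eye(k), 2))
    return worst

rng = np.random.default_rng(2)
m, n, k = 40, 16, 2
Pi = rng.standard_normal((m, n)) / np.sqrt(m)          # dense Gaussian, columns ~ unit norm
print(rip_constant(Pi, k))                             # Pi is (eps, k)-RIP for this eps
```

Dense Gaussian matrices are known to satisfy (ε, k)-RIP with high probability for m ≳ ε⁻²·k·log(n/k), which is what the demo call illustrates at toy scale.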
Also recall that we showed

  r(T) ≲ g₂(T, ‖·‖₂),

where for T a set of vectors of at most unit norm,

  g₂(T, d) := inf_{{T_r}_{r=1}^∞} sup_{x∈T} Σ_{r=1}^∞ 2^{r/2}·d(x, T_r) ≲ Σ_{k=1}^∞ (1/2^k)·log^{1/2} N(T, d, 1/2^k) ≲ ∫₀^∞ log^{1/2} N(T, d, u) du.

This was the Dudley bound. Let us now show that for RIP, m ≳ ε⁻²·k·log⁴ n suffices.

We will analyze a slightly different construction, just for ease of notation: instead of sampling m rows from H with replacement, we will simply keep each row with probability m/n, independently. Let ηᵢ be an indicator for whether we keep row i, and let zᵢ be √n times the ith row of H (taking H to be the normalized Hadamard matrix), so that zᵢ ∈ {−1, 1}ⁿ and Σ_{i=1}^n zᵢzᵢᵀ = n·I. We let

  β := E_η sup_{|T|=k} ‖(1/m)·Σ_{i=1}^n ηᵢ·zᵢ^{(T)}(zᵢ^{(T)})ᵀ − I_k‖,

and we will now get an upper bound for β in terms of β itself. Since E ηᵢ = m/n, we have E_{η′} (1/m)·Σᵢ η′ᵢ·zᵢ^{(T)}(zᵢ^{(T)})ᵀ = I_k for an independent copy η′ of η, and so

  E_η sup_T ‖(1/m)·Σᵢ ηᵢ·zᵢ^{(T)}(zᵢ^{(T)})ᵀ − I_k‖
   = E_η sup_T ‖E_{η′} [(1/m)·Σᵢ (ηᵢ − η′ᵢ)·zᵢ^{(T)}(zᵢ^{(T)})ᵀ]‖
   ≤ E_{η,η′} sup_T ‖(1/m)·Σᵢ (ηᵢ − η′ᵢ)·zᵢ^{(T)}(zᵢ^{(T)})ᵀ‖          (Jensen's inequality)
   = (1/m)·E_{η,η′,σ} sup_T ‖Σᵢ σᵢ·(ηᵢ − η′ᵢ)·zᵢ^{(T)}(zᵢ^{(T)})ᵀ‖     (symmetrization over σ)
   ≤ (2/m)·E_η E_σ sup_T ‖Σᵢ σᵢ·ηᵢ·zᵢ^{(T)}(zᵢ^{(T)})ᵀ‖                (triangle inequality)
   = (2/m)·E_η E_σ sup_T sup_{‖x‖₂=1} |Σᵢ σᵢ·ηᵢ·⟨x, zᵢ^{(T)}⟩²|        (definition of the operator norm)
   = (2/m)·E_η E_σ sup_{x ∈ D₂^{n,k}} |Σᵢ σᵢ·ηᵢ·⟨x, zᵢ⟩²|,

where D₂^{n,k} is the set of all k-sparse unit vectors in ℝⁿ. We let

  T_η := {(η₁·⟨x, z₁⟩², …, η_n·⟨x, zₙ⟩²) : x ∈ D₂^{n,k}}

and r(T_η) := E_σ sup_{u∈T_η} |⟨σ, u⟩|, so that β ≤ (2/m)·E_η r(T_η). Dudley's inequality gives us that r(T_η) ≲ ∫₀^∞ log^{1/2} N(T_η, ℓ₂, u) du. Let g(x) := (η₁·⟨x, z₁⟩², …, η_n·⟨x, zₙ⟩²), with g(y) defined similarly. We have

  ‖g(x) − g(y)‖₂ ≤ max_{1≤j≤n} |⟨z_j, x − y⟩| · (Σᵢ ηᵢ·(⟨zᵢ, x⟩ + ⟨zᵢ, y⟩)²)^{1/2} ≤ 2·max_{1≤j≤n} |⟨z_j, x − y⟩| · √m·(β + 1)^{1/2},

so a cover of D₂^{n,k} under the norm ‖x‖_X := max_{1≤j≤n} |⟨z_j, x⟩| yields a cover of T_η under ℓ₂, with all scales multiplied by 2√m·(β + 1)^{1/2}. So we get that

  β ≲ √(β + 1) · g₂(D₂^{n,k}, ‖·‖_X) / √m,

which implies that β² − CRβ − CR ≤ 0, where R := g₂(D₂^{n,k}, ‖·‖_X)²/m and C is a universal constant (square the previous display). Solving this quadratic inequality gives β ≲ CR + √(CR), so β ≤ ε once R ≲ ε²; an entropy computation in [RV08] bounds g₂(D₂^{n,k}, ‖·‖_X) ≲ √k·log² n, which yields the claimed m ≳ ε⁻²·k·log⁴ n.

References

[AC09] Nir Ailon and Bernard Chazelle. The fast Johnson–Lindenstrauss transform and approximate nearest neighbors. SIAM J. Comput., 39(1):302–322, 2009.

[BDF+11] Jean Bourgain, Stephen Dilworth, Kevin Ford, Sergei Konyagin, and Denka Kutzarova. Explicit constructions of RIP matrices and related problems. Duke Mathematical Journal, 159(1):145–185, 2011.

[Bou14] Jean Bourgain. An improved estimate in the restricted isometry problem. Geometric Aspects of Functional Analysis, 2116:65–70, 2014.

[CT06] Emmanuel J. Candès and Terence Tao. Near-optimal signal recovery from random projections: universal encoding strategies? IEEE Trans. Inform. Theory, 52(12):5406–5425, 2006.

[HR16] Ishay Haviv and Oded Regev. The restricted isometry property of subsampled Fourier matrices. In Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 288–297, 2016.

[KW11] Felix Krahmer and Rachel Ward. New and improved Johnson–Lindenstrauss embeddings via the Restricted Isometry Property. SIAM J. Math. Anal., 43(3):1269–1281, 2011.

[NPW14] Jelani Nelson, Eric Price, and Mary Wootters. New constructions of RIP matrices with fast multiplication and fewer rows. In Proceedings of the 25th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1515–1528, January 2014.

[RV08] Mark Rudelson and Roman Vershynin. On sparse reconstruction from Fourier and Gaussian measurements. Comm. Pure Appl. Math., 61(8):1025–1045, 2008.