Sparser Johnson-Lindenstrauss Transforms

1 Sparser Johnson-Lindenstrauss Transforms Jelani Nelson (Princeton) February 16, 2012 joint work with Daniel Kane (Stanford)

2 Random Projections
x ∈ R^d, d huge
store y = Sx, where S is a k × d matrix (compression)

3 Random Projections
x ∈ R^d, d huge
store y = Sx, where S is a k × d matrix (compression)
- compressed sensing (recover x from y when x is (near-)sparse)
- group testing (as above, but Sx is Boolean multiplication)
- recover properties of x (entropy, heavy hitters, ...)
- approximate norm preservation (want ‖y‖ ≈ ‖x‖)
- motif discovery (slightly different: randomly project discrete x onto a subset of its coordinates) [Buhler-Tompa]

4 Random Projections
x ∈ R^d, d huge
store y = Sx, where S is a k × d matrix (compression)
- compressed sensing (recover x from y when x is (near-)sparse)
- group testing (as above, but Sx is Boolean multiplication)
- recover properties of x (entropy, heavy hitters, ...)
- approximate norm preservation (want ‖y‖ ≈ ‖x‖)
- motif discovery (slightly different: randomly project discrete x onto a subset of its coordinates) [Buhler-Tompa]
In many of these applications, a random S is either required or obtains better parameters than deterministic constructions.

6 Metric Johnson-Lindenstrauss lemma
Metric JL (MJL) Lemma, 1984: Every set of n points in Euclidean space can be embedded into O(ε⁻² log n)-dimensional Euclidean space so that all pairwise distances are preserved up to a 1 ± ε factor.

7 Metric Johnson-Lindenstrauss lemma
Metric JL (MJL) Lemma, 1984: Every set of n points in Euclidean space can be embedded into O(ε⁻² log n)-dimensional Euclidean space so that all pairwise distances are preserved up to a 1 ± ε factor.
Uses:
- Speed up geometric algorithms by first reducing the dimension of the input [Indyk-Motwani, 1998], [Indyk, 2001]
- Low-memory streaming algorithms for linear algebra problems [Sarlós, 2006], [LWMRT, 2007], [Clarkson-Woodruff, 2009]
- Essentially equivalent to RIP matrices from compressed sensing [Baraniuk et al., 2008], [Krahmer-Ward, 2011] (used for recovery of sparse signals)

8 How to prove the JL lemma
Distributional JL (DJL) Lemma: For any 0 < ε, δ < 1/2 there exists a distribution D_{ε,δ} on k × d matrices with k = O(ε⁻² log(1/δ)) so that for any x of unit norm
Pr_{S ~ D_{ε,δ}}[ | ‖Sx‖₂² − 1 | > ε ] < δ.

9 How to prove the JL lemma
Distributional JL (DJL) Lemma: For any 0 < ε, δ < 1/2 there exists a distribution D_{ε,δ} on k × d matrices with k = O(ε⁻² log(1/δ)) so that for any x of unit norm
Pr_{S ~ D_{ε,δ}}[ | ‖Sx‖₂² − 1 | > ε ] < δ.
Proof of MJL: Set δ = 1/n² in DJL, taking x to be the (normalized) difference vector of a pair of points. Union bound over the (n choose 2) pairs, as written out below.
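
Written out, the union-bound step is just the following one-line calculation (a restatement of the proof sketch above for a point set P of size n):

```latex
\Pr_{S}\Bigl[\exists\, u \neq v \in P : \bigl|\,\|S(u-v)\|_2^2 - \|u-v\|_2^2\,\bigr| > \varepsilon\,\|u-v\|_2^2\Bigr]
  \;\le\; \binom{n}{2}\cdot\delta
  \;=\; \binom{n}{2}\cdot\frac{1}{n^{2}}
  \;<\; \frac{1}{2},
```

so a single draw of S from D_{ε,1/n²} preserves all pairwise distances simultaneously with probability greater than 1/2.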

10 How to prove the JL lemma
Distributional JL (DJL) Lemma: For any 0 < ε, δ < 1/2 there exists a distribution D_{ε,δ} on k × d matrices with k = O(ε⁻² log(1/δ)) so that for any x of unit norm
Pr_{S ~ D_{ε,δ}}[ | ‖Sx‖₂² − 1 | > ε ] < δ.
Proof of MJL: Set δ = 1/n² in DJL, taking x to be the (normalized) difference vector of a pair of points. Union bound over the (n choose 2) pairs.
Theorem (Alon, 2003): For every n, there exists a set of n points requiring target dimension k = Ω((ε⁻² / log(1/ε)) log n).
Theorem (Jayram-Woodruff, 2011; Kane-Meka-N., 2011): For DJL, k = Θ(ε⁻² log(1/δ)) is optimal.

11 Proving the JL lemma
Older proofs:
- [Johnson-Lindenstrauss, 1984], [Frankl-Maehara, 1988]: Random rotation, then projection onto the first k coordinates.
- [Indyk-Motwani, 1998], [Dasgupta-Gupta, 2003]: Random matrix with independent Gaussian entries.
- [Achlioptas, 2001]: Independent ±1 entries.
- [Clarkson-Woodruff, 2009]: O(log(1/δ))-wise independent ±1 entries.
- [Arriaga-Vempala, 1999], [Matousek, 2008]: Independent entries having mean 0, variance 1/k, and subgaussian tails.

12 Proving the JL lemma
Older proofs:
- [Johnson-Lindenstrauss, 1984], [Frankl-Maehara, 1988]: Random rotation, then projection onto the first k coordinates.
- [Indyk-Motwani, 1998], [Dasgupta-Gupta, 2003]: Random matrix with independent Gaussian entries.
- [Achlioptas, 2001]: Independent ±1 entries.
- [Clarkson-Woodruff, 2009]: O(log(1/δ))-wise independent ±1 entries.
- [Arriaga-Vempala, 1999], [Matousek, 2008]: Independent entries having mean 0, variance 1/k, and subgaussian tails.
Downside: Performing the embedding is a dense matrix-vector multiplication, taking O(k · ‖x‖₀) time, where ‖x‖₀ is the number of nonzero entries of x (see the sketch below).
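
For concreteness, here is a minimal numpy sketch of the dense ±1 construction described above (illustrative only; the function name, parameters, and normalization 1/√k are mine, chosen to match the mean-0, variance-1/k recipe):

```python
import numpy as np

rng = np.random.default_rng(0)

def dense_jl(k, d):
    # Dense JL matrix: i.i.d. +-1 entries scaled by 1/sqrt(k), so that
    # E||Sx||_2^2 = ||x||_2^2 for every fixed x.
    return rng.choice([-1.0, 1.0], size=(k, d)) / np.sqrt(k)

d, k = 10_000, 512
x = rng.standard_normal(d)
S = dense_jl(k, d)
y = S @ x                                      # dense mat-vec
print(np.linalg.norm(y) / np.linalg.norm(x))   # typically close to 1
```

Even when x has few nonzeros, this multiplication touches a full k × ‖x‖₀ block of S, which is the O(k · ‖x‖₀) cost quoted above.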

13 Fast JL Transforms
- [Ailon-Chazelle, 2006]: x ↦ PHDx, O(d log d + k³) time. P is a random sparse matrix, H is Hadamard, D has random ±1 entries on the diagonal.
- [Ailon-Liberty, 2008]: O(d log k + k²) time, also based on the fast Hadamard transform.
- [Ailon-Liberty, 2011] and [Krahmer-Ward, 2011]: O(d log d) time for MJL, but with suboptimal k = O(ε⁻² log n log⁴ d).

14 Fast JL Transforms
- [Ailon-Chazelle, 2006]: x ↦ PHDx, O(d log d + k³) time. P is a random sparse matrix, H is Hadamard, D has random ±1 entries on the diagonal.
- [Ailon-Liberty, 2008]: O(d log k + k²) time, also based on the fast Hadamard transform.
- [Ailon-Liberty, 2011] and [Krahmer-Ward, 2011]: O(d log d) time for MJL, but with suboptimal k = O(ε⁻² log n log⁴ d).
Downside: Slow to embed sparse vectors: running time is Ω(min{k · ‖x‖₀, d log d}).

15 Where Do Sparse Vectors Show Up?
- Document as a bag of words: x_i = number of occurrences of word i. Compare documents using cosine similarity. d = lexicon size; most documents aren't dictionaries.
- Network traffic: x_{i,j} = #bytes sent from i to j. d = 2⁶⁴ (2²⁵⁶ in IPv6); most servers don't talk to each other.
- User ratings: x_i is a user's score for movie i on Netflix. d = #movies; most people haven't rated all movies.
- Streaming: x receives a stream of updates of the form "add v to x_i". Maintaining Sx requires calculating v · Se_i.
- ...

16 Sparse JL transforms One way to embed sparse vectors faster: use sparse matrices.

17 Sparse JL transforms
One way to embed sparse vectors faster: use sparse matrices.
s = #non-zero entries per column in the embedding matrix (so embedding time is s · ‖x‖₀)

reference | value of s | type
[JL84], [FM88], [IM98], ... | k ≈ 4ε⁻² log(1/δ) | dense
[Achlioptas01] | k/3 | sparse Bernoulli
[WDALS09] | (no proof) | hashing
[DKS10] | Õ(ε⁻¹ log³(1/δ)) | hashing
[KN10a], [BOR10] | Õ(ε⁻¹ log²(1/δ)) | hashing
[KN12] | O(ε⁻¹ log(1/δ)) | hashing (random codes)

18 Other related work
CountSketch of [Charikar-Chen-Farach-Colton] gives s = O(log(1/δ)) (see [Thorup-Zhang])

19 Other related work
CountSketch of [Charikar-Chen-Farach-Colton] gives s = O(log(1/δ)) (see [Thorup-Zhang])
- Can recover (1 ± ε)‖x‖₂ from Sx, but not as ‖Sx‖₂ (not an embedding into ℓ₂)
- Not applicable in certain situations, e.g. in some nearest-neighbor data structures, and when learning classifiers over projected vectors via stochastic gradient descent

20 Sparse JL Constructions
[DKS, 2010]: s = Θ(ε⁻¹ log²(1/δ))

21 Sparse JL Constructions
[DKS, 2010]: s = Θ(ε⁻¹ log²(1/δ))
[this work]: s = Θ(ε⁻¹ log(1/δ))

22 Sparse JL Constructions
[DKS, 2010]: s = Θ(ε⁻¹ log²(1/δ))
[this work]: s = Θ(ε⁻¹ log(1/δ))
[this work]: s = Θ(ε⁻¹ log(1/δ)) (figure: block construction, blocks of k/s rows)

23 Sparse JL Constructions (in matrix form)
(figure: two k × d matrices — the graph construction, with s nonzero cells scattered in each column, and the block construction, with each column split into s blocks of k/s rows and one nonzero cell per block)
Each black cell is ±1/√s at random.

24 Sparse JL Constructions (nicknames)
(left figure) Graph construction. (right figure, with blocks of k/s rows) Block construction.

25 Sparse JL notation (block construction)
Let h(j, r), σ(j, r) be the random hash location and random sign for the copy of x_j in the r-th block.

26 Sparse JL notation (block construction)
Let h(j, r), σ(j, r) be the random hash location and random sign for the copy of x_j in the r-th block.
(Sx)_i = (1/√s) · Σ_{h(j,r)=i} x_j σ(j, r)

27 Sparse JL notation (block construction)
Let h(j, r), σ(j, r) be the random hash location and random sign for the copy of x_j in the r-th block.
(Sx)_i = (1/√s) · Σ_{h(j,r)=i} x_j σ(j, r)
‖Sx‖₂² = ‖x‖₂² + (1/s) Σ_{r=1}^{s} Σ_{i≠j} x_i x_j σ(i, r) σ(j, r) 1_{h(i,r)=h(j,r)}
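
A minimal numpy sketch of this block construction in the above notation (illustrative, not the authors' code; h and σ are drawn fully at random here, and the function name and parameter choices are mine):

```python
import numpy as np

rng = np.random.default_rng(0)

def sparse_block_embed(x, k, s):
    # Block construction: the k rows are split into s blocks of k/s rows.
    # In block r, coordinate j is hashed to row h[j, r] of that block with
    # sign sigma[j, r]; the result is scaled by 1/sqrt(s).
    d = len(x)
    assert k % s == 0
    b = k // s                                    # rows per block (= k/s)
    h = rng.integers(0, b, size=(d, s))           # h(j, r) in {0, ..., k/s - 1}
    sigma = rng.choice([-1.0, 1.0], size=(d, s))  # sigma(j, r)
    y = np.zeros(k)
    for r in range(s):
        # each column of S has exactly one nonzero per block
        np.add.at(y, r * b + h[:, r], sigma[:, r] * x)
    return y / np.sqrt(s)

x = rng.standard_normal(10_000)
y = sparse_block_embed(x, k=512, s=16)
print(np.linalg.norm(y) / np.linalg.norm(x))      # close to 1
```

Note that under a streaming update "add v to x_i", only the s rows h(i, 1), ..., h(i, s) of Sx change, so an update costs O(s) rather than O(k).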

28 Sparse JL via Codes
(figure: the graph and block constructions, as before)
Graph construction: constant-weight binary code of weight s.
Block construction: code over a q-ary alphabet, q = k/s.

29 Sparse JL via Codes
(figure: the graph and block constructions, as before)
Graph construction: constant-weight binary code of weight s.
Block construction: code over a q-ary alphabet, q = k/s.
Thm: Just need distance s − O(s²/k) for the block construction (2s − O(s²/k) for the graph construction).

30 Analysis (block construction)
η_{i,j,r} indicates whether i and j collide in the r-th block.
‖Sx‖₂² = ‖x‖₂² + Z
Z = (1/s) Σ_r Z_r
Z_r = Σ_{i≠j} x_i x_j σ(i, r) σ(j, r) η_{i,j,r}

31 Analysis (block construction)
η_{i,j,r} indicates whether i and j collide in the r-th block.
‖Sx‖₂² = ‖x‖₂² + Z
Z = (1/s) Σ_r Z_r
Z_r = Σ_{i≠j} x_i x_j σ(i, r) σ(j, r) η_{i,j,r}
Z is a quadratic form in σ, so apply known moment bounds for quadratic forms.

32 Analysis
Theorem (Hanson-Wright, 1971): σ_1, ..., σ_n independent ±1, B ∈ R^{n×n} symmetric. For λ > 0,
Pr[ |σ^T B σ − trace(B)| > λ ] < e^{−C · min{ λ²/‖B‖_F² , λ/‖B‖₂ }}
Reminder: ‖B‖_F² = Σ_{i,j} B_{i,j}²; ‖B‖₂ is the largest magnitude of an eigenvalue of B.

33 Analysis
Z = (1/s) Σ_{r=1}^{s} Σ_{i≠j} x_i x_j σ(i, r) σ(j, r) η_{i,j,r}

34 Analysis
Z = (1/s) Σ_{r=1}^{s} Σ_{i≠j} x_i x_j σ(i, r) σ(j, r) η_{i,j,r} = σ^T B σ
B = (1/s) · blockdiag(B_1, ..., B_s), with (B_r)_{i,j} = x_i x_j η_{i,j,r} for i ≠ j (zero diagonal)

35 Frobenius norm bound
B = (1/s) · blockdiag(B_1, ..., B_s), with (B_r)_{i,j} = x_i x_j η_{i,j,r} for i ≠ j

36 Frobenius norm bound
B = (1/s) · blockdiag(B_1, ..., B_s), with (B_r)_{i,j} = x_i x_j η_{i,j,r} for i ≠ j
‖B‖_F² = (1/s²) Σ_{i≠j} x_i² x_j² · (#times i, j collide)

37 Frobenius norm bound
B = (1/s) · blockdiag(B_1, ..., B_s), with (B_r)_{i,j} = x_i x_j η_{i,j,r} for i ≠ j
‖B‖_F² = (1/s²) Σ_{i≠j} x_i² x_j² · (#times i, j collide) < O(1/k) · ‖x‖₂⁴ = O(1/k) (good code!)

38 Operator norm bound
B_r = (1/s) × (the d × d matrix with (i, j) entry x_i x_j η_{i,j,r} for i ≠ j and 0 on the diagonal)

39 Operator norm bound
Write B_r = (1/s)(S_r − D), where (S_r)_{i,j} = x_i x_j η_{i,j,r} (with η_{i,i,r} = 1) and D = diag(x_1², ..., x_d²).

40 Operator norm bound
B_r = (1/s)(S_r − D); ‖D‖₂ = ‖x‖_∞² ≤ 1.

41 Operator norm bound
S_r = Σ_{i=1}^{k/s} u_{r,i} u_{r,i}^T, where u_{r,i} is the projection of x onto the coordinates hashing to bucket i in the r-th block; these u_{r,i} give the eigenvectors of S_r, so ‖S_r‖₂ = max_i ‖u_{r,i}‖₂² ≤ ‖x‖₂² = 1.

42 Operator norm bound
Therefore ‖B_r‖₂ ≤ (1/s) · max{‖S_r‖₂, ‖D‖₂} ≤ 1/s.

43 Wrapping up analysis
‖B‖_F² = O(1/k), ‖B‖₂ ≤ 1/s. Apply Hanson-Wright:

44 Wrapping up analysis
‖B‖_F² = O(1/k), ‖B‖₂ ≤ 1/s. Apply Hanson-Wright:
Pr[ |Z| > ε ] < e^{−C · min{ε²k, εs}}
k = Ω(log(1/δ)/ε²), s = Ω(log(1/δ)/ε). QED
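
Spelling out the substitution into Hanson-Wright (here trace(B) = 0 since B has zero diagonal, so σ^T Bσ = Z):

```latex
\Pr\bigl[\,|Z| > \varepsilon\,\bigr]
  \;<\; \exp\!\Bigl(-C\,\min\Bigl\{\tfrac{\varepsilon^{2}}{\|B\|_F^{2}},\ \tfrac{\varepsilon}{\|B\|_2}\Bigr\}\Bigr)
  \;\le\; e^{-C'\min\{\varepsilon^{2}k,\ \varepsilon s\}}
  \;\le\; \delta
  \quad\text{once } k = \Omega(\varepsilon^{-2}\log(1/\delta)) \text{ and } s = \Omega(\varepsilon^{-1}\log(1/\delta)).
```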

45 Code-based Construction: Caveat Need a sufficiently good code.

46 Code-based Construction: Caveat
Need a sufficiently good code. Each pair of codewords should agree on O(s²/k) coordinates. Can get this with a random code by Chernoff + union bound over pairs, but then need
s²/k ≳ log(d/δ), i.e. s ≳ √(k log(d/δ)) = Ω(ε⁻¹ √(log(d/δ) log(1/δ))).

47 Code-based Construction: Caveat
Need a sufficiently good code. Each pair of codewords should agree on O(s²/k) coordinates. Can get this with a random code by Chernoff + union bound over pairs, but then need
s²/k ≳ log(d/δ), i.e. s ≳ √(k log(d/δ)) = Ω(ε⁻¹ √(log(d/δ) log(1/δ))).
Slightly better: can assume d = O(1/(ε²δ)) by first embedding into this dimension with s = 1 (analysis: Chebyshev's inequality). Then can get away with s = O(ε⁻¹ √(log(1/(εδ)) log(1/δ))).
Can we avoid the loss incurred by this union bound?

49 Improving the Construction
Pick h at random.
Analysis: directly bound the ℓ = log(1/δ)-th moment of the error term Z, then apply Markov's inequality: Pr[|Z| > ε] < ε⁻ℓ · E[Z^ℓ].
(Conditioning on the event "h gives a code" is too demanding.)

50 Improving the Construction
Pick h at random.
Analysis: directly bound the ℓ = log(1/δ)-th moment of the error term Z, then apply Markov's inequality: Pr[|Z| > ε] < ε⁻ℓ · E[Z^ℓ].
(Conditioning on the event "h gives a code" is too demanding.)
Z = (1/s) Σ_{r=1}^{s} Σ_{i≠j} x_i x_j σ(i, r) σ(j, r) η_{i,j,r}

51 Improving the Construction
Pick h at random.
Analysis: directly bound the ℓ = log(1/δ)-th moment of the error term Z, then apply Markov's inequality: Pr[|Z| > ε] < ε⁻ℓ · E[Z^ℓ].
(Conditioning on the event "h gives a code" is too demanding.)
Z = (1/s) Σ_{r=1}^{s} Σ_{i≠j} x_i x_j σ(i, r) σ(j, r) η_{i,j,r}
(Z = (1/s) Σ_{r=1}^{s} Z_r)

52 Improving the Construction
Pick h at random.
Analysis: directly bound the ℓ = log(1/δ)-th moment of the error term Z, then apply Markov's inequality: Pr[|Z| > ε] < ε⁻ℓ · E[Z^ℓ].
(Conditioning on the event "h gives a code" is too demanding.)
Z = (1/s) Σ_{r=1}^{s} Σ_{i≠j} x_i x_j σ(i, r) σ(j, r) η_{i,j,r}
(Z = (1/s) Σ_{r=1}^{s} Z_r)
E_{h,σ}[Z^ℓ] = (1/s^ℓ) Σ_{r_1 < ... < r_n; t_1, ..., t_n > 1; Σ_i t_i = ℓ} (ℓ choose t_1, ..., t_n) Π_{i=1}^{n} E_{h,σ}[Z_{r_i}^{t_i}]
Bound the t-th moment of any Z_r, then get the ℓ-th moment bound for Z by plugging into the above.
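
Where this expansion comes from: the Z_r are independent across blocks and each has mean zero, so in the multinomial expansion of (Σ_r Z_r)^ℓ every term containing a lone factor E[Z_r] vanishes, which is why only exponents t_i > 1 survive:

```latex
\mathbb{E}[Z^{\ell}]
  = \frac{1}{s^{\ell}}\,\mathbb{E}\Bigl[\Bigl(\sum_{r=1}^{s} Z_r\Bigr)^{\!\ell}\Bigr]
  = \frac{1}{s^{\ell}} \sum_{\substack{t_1+\cdots+t_s=\ell \\ t_r \neq 1\ \text{for all } r}}
      \binom{\ell}{t_1,\ldots,t_s} \prod_{r=1}^{s} \mathbb{E}\bigl[Z_r^{t_r}\bigr].
```

Collecting the nonzero exponents as t_1, ..., t_n attached to the blocks r_1 < ... < r_n that actually appear recovers the formula on the slide.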

53 Bounding E[Z_r^t]
Z_r = Σ_{i≠j} x_i x_j σ(i, r) σ(j, r) η_{i,j,r}

54 Bounding E[Z_r^t]
Z_r = Σ_{i≠j} x_i x_j σ(i, r) σ(j, r) η_{i,j,r}
Monomials appearing in the expansion of Z_r^t are in correspondence with directed multigraphs.
Example: (x_1 x_2)(x_3 x_4)(x_3 x_8)(x_4 x_8)(x_2 x_1) ↔ (figure: directed multigraph with an edge i → j for each factor x_i x_j)

55 Bounding E[Z_r^t]
Z_r = Σ_{i≠j} x_i x_j σ(i, r) σ(j, r) η_{i,j,r}
Monomials appearing in the expansion of Z_r^t are in correspondence with directed multigraphs.
Example: (x_1 x_2)(x_3 x_4)(x_3 x_8)(x_4 x_8)(x_2 x_1) ↔ (figure: directed multigraph with an edge i → j for each factor x_i x_j)
A monomial contributes to the expectation iff all degrees are even.
Analysis: group the monomials appearing in Z_r^t according to graph isomorphism class, then do some combinatorics. Our analysis is tight up to a constant factor.

56 Bounding E[Z_r^t]
m = #connected components, v = #vertices, d_u = degree of vertex u.
E_{h,σ}[Z_r^t] = Σ_{G ∈ G_t} Σ_{i_1≠j_1, ..., i_t≠j_t : f((i_u,j_u)_{u=1}^t) = G} E[ Π_{u=1}^{t} η_{i_u,j_u,r} ] · ( Π_{u=1}^{t} x_{i_u} x_{j_u} )

57 Bounding E[Z_r^t]
... = Σ_{G ∈ G_t} (s/k)^{v−m} Σ_{i_1≠j_1, ..., i_t≠j_t : f((i_u,j_u)_{u=1}^t) = G} Π_{u=1}^{t} x_{i_u} x_{j_u}

58 Bounding E[Z_r^t]
... ≤ Σ_{G ∈ G_t} (s/k)^{v−m} · (1/v!) · (t choose d_1/2, ..., d_v/2)

59 Bounding E[Z_r^t]
... ≤ 2^{O(t)} Σ_{v,m} (s/k)^{v−m} · t^t / (v^v · Π_u d_u^{d_u})
61 Bounding E[Z_r^t]
In the end, can show
E[Z_r^t] ≤ 2^{O(t)} · ( s/k if t < log(k/s);  (t / log(k/s))^t otherwise )
Plug this into the expression for E[Z^ℓ], QED

62 Open Problems

63 Open Problems
OPEN: Devise a distribution which can be sampled using few random bits. Current record: O(log d + log(1/ε) log(1/δ) + log(1/δ) log log(1/δ)) [Kane-Meka-N., 2011]. Existential: O(log d + log(1/δ)).
OPEN: Prove a tight lower bound on achievable sparsity in a JL distribution.
OPEN: Can we have a JL matrix such that we can multiply by any k × k submatrix in k · polylog(d) time? (ultimate goal)
OPEN: Embed any vector in Õ(d) time into the optimal k.
