Optimal Bounds for Johnson-Lindenstrauss Transformations


Journal of Machine Learning Research (2018). Submitted 5/18; Revised 9/18; Published 10/18.

Optimal Bounds for Johnson-Lindenstrauss Transformations

Michael Burr (burr@clemson.edu) and Shuhong Gao (sgao@clemson.edu), Department of Mathematical Sciences, Clemson University, Clemson, SC 29634, USA

Fiona Knoll (fknoll@g.clemson.edu), Department of Mathematical Sciences, University of Cincinnati, Cincinnati, OH, USA

Editor: Kenji Fukumizu

(c) 2018 Michael Burr, Shuhong Gao, and Fiona Knoll. License: CC-BY 4.0.

Abstract

In 1984, Johnson and Lindenstrauss proved that any finite set of data in a high-dimensional space can be projected to a lower-dimensional space while preserving the pairwise Euclidean distances between points up to a bounded relative error. If the desired dimension of the image is too small, however, Kane, Meka, and Nelson (2011) and Jayram and Woodruff (2013) proved that such a projection does not exist. In this paper, we provide a precise asymptotic threshold for the dimension of the image: above the threshold, there exists a projection preserving the Euclidean distances, but below it, no such projection exists.

Keywords: Johnson-Lindenstrauss transformation, dimension reduction, phase transition, asymptotic threshold

1. Introduction

In 1984, Johnson and Lindenstrauss (1984), in establishing a bound on the Lipschitz constant for the Lipschitz extension problem, proved that any finite set of data in a high-dimensional space can be projected into a lower-dimensional space while preserving the pairwise Euclidean distances within any desired relative error. In particular, for any finite set of vectors $x_1,\dots,x_N\in\mathbb{R}^d$ and for any error factor $0<\varepsilon<1$, there exists an absolute constant $c$ such that for all $k\geq c\varepsilon^{-2}\log N$, there exists a linear map $A:\mathbb{R}^d\to\mathbb{R}^k$ such that for all pairs $1\leq i,j\leq N$,

$$(1-\varepsilon)\|x_i-x_j\|^2 \;\leq\; \|Ax_i-Ax_j\|^2 \;\leq\; (1+\varepsilon)\|x_i-x_j\|^2, \qquad (1)$$

where $\|\cdot\|$ denotes the Euclidean norm. These inequalities are implied by the following theorem by setting $\delta=1/N^2$ and using the union bound:

Theorem 1 (Johnson and Lindenstrauss, 1984) For any real numbers $0<\varepsilon,\delta<1$, there exists an absolute constant $c>0$ such that for any integer $k\geq c\varepsilon^{-2}\log(1/\delta)$, there exists a probability distribution $\mathcal{D}$ on $k\times d$ real matrices such that for any fixed $x\in\mathbb{R}^d$,

$$\mathrm{Prob}_{A\sim\mathcal{D}}\big[(1-\varepsilon)\|x\|^2 \leq \|Ax\|^2 \leq (1+\varepsilon)\|x\|^2\big] \;\geq\; 1-\delta, \qquad (2)$$

where $A\sim\mathcal{D}$ indicates that the matrix $A$ is a random matrix with distribution $\mathcal{D}$.
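The guarantee of Theorem 1 is easy to check numerically. The following sketch is an added illustration (not part of the original paper): it assumes a Gaussian JL distribution, one of many valid choices, and uses the fact that for a matrix $A$ with i.i.d. $N(0,1/k)$ entries and any fixed unit vector $x$, the image $Ax$ has i.i.d. $N(0,1/k)$ coordinates, so $Ax$ can be sampled directly.

```python
import numpy as np

def gaussian_jl_failure_rate(k=200, eps=0.2, trials=5000, seed=0):
    """Estimate Prob[ | ||Ax||^2 - ||x||^2 | > eps ||x||^2 ] for A with i.i.d.
    N(0, 1/k) entries and a fixed unit vector x.  For such A, the vector Ax has
    i.i.d. N(0, 1/k) coordinates, so we sample Ax directly instead of A."""
    rng = np.random.default_rng(seed)
    failures = 0
    for _ in range(trials):
        y = rng.normal(0.0, 1.0 / np.sqrt(k), size=k)  # distribution of Ax
        if abs(y @ y - 1.0) > eps:
            failures += 1
    return failures / trials

if __name__ == "__main__":
    # The failure rate drops rapidly once k is well above eps^-2 log(1/delta).
    print(gaussian_jl_failure_rate())
```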

Note that, in order to project a large number of vectors, $\delta$ must be sufficiently small. For instance, suppose we wish to project a set of $N$ vectors to a lower-dimensional space. To apply the union bound to Inequality (2), we choose any $\delta<2/(N(N-1))$. In this case, Inequality (2) implies that the probability of preserving all pairwise distances among the $N$ points up to a relative error of $\varepsilon$ is at least $1-\delta N(N-1)/2>0$. Since the probability is nonzero, such a projection exists.

A probability distribution $\mathcal{D}$ satisfying Inequality (2) is called an $(\varepsilon,\delta)$-JL distribution, or simply a JL distribution. Since these transformations are linear, without loss of generality, we assume for the rest of the paper that $\|x\|=1$. When a JL distribution is specified via an explicit construction, we call a random projection $x\mapsto Ax$ generated in this way a JL transformation.

Since the introduction of JL distributions, there has been considerable work on explicit constructions of JL distributions; see, e.g., Johnson and Lindenstrauss (1984); Frankl and Maehara (1988); Indyk and Motwani (1998); Achlioptas (2003); Ailon and Chazelle (2006); Matousek (2008); Dasgupta et al. (2010); Kane and Nelson (2014); and the references therein. A simple and easily described JL distribution is that of Achlioptas (2003). In this construction, the entries of $A$ are distributed as follows:

$$a_{ij} \;=\; \begin{cases} +k^{-1/2} & \text{with probability } 1/2,\\ -k^{-1/2} & \text{with probability } 1/2.\end{cases}$$

The recent constructions in Ailon and Chazelle (2006); Matousek (2008); Dasgupta et al. (2010); Kane and Nelson (2014) have focused on the complexity of computing a projection for the purpose of applications. We note that the ability to project a vector to a lower-dimensional space, independent of the original dimension, while preserving the Euclidean norm up to a prescribed relative error, is highly desirable. In particular, dimension reduction has applications in many fields, including machine learning (Arriaga and Vempala, 2006; Weinberger et al., 2009), low-rank approximation (Clarkson and Woodruff, 2013; Nguyen et al., 2009; Ubaru et al., 2017), approximate nearest neighbors (Ailon and Chazelle, 2006; Indyk and Motwani, 1998), data storage (Candès, 2008; Cormode and Indyk, 2016), and document similarity (Bingham and Mannila, 2001; Lin and Gunopulos, 2003).

For both practical and theoretical purposes, it is important to know the smallest possible dimension $k$ of a potential image space for any given $\varepsilon$ and $\delta$. Note that, for any $d'<d$, each $(\varepsilon,\delta)$-JL distribution $\mathcal{D}$ on $\mathbb{R}^{k\times d}$ induces an $(\varepsilon,\delta)$-JL distribution $\mathcal{D}'$ on $\mathbb{R}^{k\times d'}$ in a natural way: the matrices of $\mathcal{D}'$ are obtained from those of $\mathcal{D}$ by deleting the last $d-d'$ columns of a random matrix, together with the induced probability distribution. This construction is a JL distribution since $\mathbb{R}^{d'}$ can be naturally embedded into $\mathbb{R}^{d}$ by extending a vector in $\mathbb{R}^{d'}$ by $d-d'$ zeros. Hence, if there exists an $(\varepsilon,\delta)$-JL distribution on $\mathbb{R}^{k\times d}$, then there is an $(\varepsilon,\delta)$-JL distribution on $\mathbb{R}^{k\times d'}$ for all $d'\leq d$. Similarly, if an $(\varepsilon,\delta)$-JL distribution does not exist on $\mathbb{R}^{k\times d}$, then, for any $k'\leq k$, there cannot be an $(\varepsilon,\delta)$-JL distribution on $\mathbb{R}^{k'\times d}$. In particular, since $\mathbb{R}^{k'}$ can be naturally embedded into $\mathbb{R}^{k}$ by extending a vector in $\mathbb{R}^{k'}$ by $k-k'$ zeros, if an $(\varepsilon,\delta)$-JL distribution existed for $\mathbb{R}^{k'\times d}$, it could be extended to an $(\varepsilon,\delta)$-JL distribution for $\mathbb{R}^{k\times d}$.
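As a concrete illustration of the Achlioptas construction recalled above, the following sketch (added here for illustration; it is not part of the original paper and assumes the sign entries are drawn i.i.d.) draws such a matrix and estimates how often the squared norm of a projected unit vector deviates by more than $\varepsilon$.

```python
import numpy as np

def achlioptas_matrix(k, d, rng):
    """k x d matrix with i.i.d. entries +/- 1/sqrt(k), each with probability 1/2."""
    signs = rng.choice([-1.0, 1.0], size=(k, d))
    return signs / np.sqrt(k)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    d, k, eps, trials = 500, 200, 0.2, 200
    x = rng.normal(size=d)
    x /= np.linalg.norm(x)              # work with a unit vector, as in the paper
    errs = [abs(np.linalg.norm(achlioptas_matrix(k, d, rng) @ x) ** 2 - 1.0)
            for _ in range(trials)]
    print("fraction of trials with relative error > eps:",
          np.mean(np.array(errs) > eps))
```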

Figure 1: For fixed $\varepsilon$ and $\delta$, a JL distribution exists for every $k\geq 4\varepsilon^{-2}\log(1/\delta)\,[1+o(1)]$, while no JL distribution exists for $k\leq c\varepsilon^{-2}\log(1/\delta)$ for some absolute constant $c>0$; the interval between these two bounds is the gap closed in this paper.

In this paper, we close this gap in the limit. For any $\varepsilon$ and $\delta$, we define

$$k_0(\varepsilon,\delta) \;=\; \min\{k : \text{there exists an } (\varepsilon,\delta)\text{-JL distribution on } \mathbb{R}^{k\times d} \text{ for every } d\geq 1\}.$$

By our definition, $k_0=k_0(\varepsilon,\delta)$ is independent of $d$, and, by Theorem 1, we have that $k_0\leq c\varepsilon^{-2}\log(1/\delta)$ for some absolute constant $c>0$. Frankl and Maehara (1988) showed that one may take $c$ to be approximately $9$. Achlioptas (2003) further improved this bound by providing a JL distribution for every

$$k \;\geq\; \frac{2\log(1/\delta)}{\varepsilon^2/2-\varepsilon^3/3},$$

resulting in the following upper bound:

$$k_0 \;\leq\; \frac{2\log(1/\delta)}{\varepsilon^2/2-\varepsilon^3/3} \;=\; \frac{4\log(1/\delta)}{\varepsilon^2}\,[1+o(1)], \qquad (3)$$

where $o(1)$ approaches zero as both $\varepsilon$ and $\delta$ approach zero. A lower bound on $k_0$ was not given until 2003, when Alon (2003) proved that

$$k_0 \;\geq\; c\,\frac{\varepsilon^{-2}\log(1/\delta)}{\log(1/\varepsilon)}$$

for some absolute constant $c>0$. Improving Alon's work, Jayram and Woodruff (2013) and Kane et al. (2011) showed, through different methods, that, for some absolute constant $c>0$, there is no $(\varepsilon,\delta)$-JL distribution for $k\leq c\varepsilon^{-2}\log(1/\delta)$. Hence, there is a lower bound of the form $k_0\geq c\varepsilon^{-2}\log(1/\delta)$. This situation is summarized in Figure 1.

The goal of the current paper is to close the gap between the upper and lower bounds in the limit. In particular, we prove an optimal lower bound that asymptotically matches the known upper bound as $\varepsilon$ and $\delta$ approach $0$, see Theorem 2. This means that $4\varepsilon^{-2}\log(1/\delta)$ is an asymptotic threshold for $k_0$ where a phase-change phenomenon occurs. The main result of this paper is captured in the following theorem:

Theorem 2 For $\varepsilon$ and $\delta$ sufficiently small, $k_0\approx 4\varepsilon^{-2}\log(1/\delta)$. More precisely,

$$\lim_{\varepsilon,\delta\to 0}\;\frac{k_0(\varepsilon,\delta)}{4\varepsilon^{-2}\log(1/\delta)} \;=\; 1.$$
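To give a sense of the scale of the threshold in Theorem 2, the following lines (an illustration added here, not part of the paper) evaluate $4\varepsilon^{-2}\log(1/\delta)$ together with the upper bound as reconstructed in (3) for a few sample values of $\varepsilon$ and $\delta$; the two quantities approach each other as $\varepsilon$ shrinks.

```python
import math

def asymptotic_threshold(eps, delta):
    """The threshold 4 * eps^-2 * log(1/delta) from Theorem 2."""
    return 4.0 * math.log(1.0 / delta) / eps ** 2

def upper_bound(eps, delta):
    """The upper bound 2 log(1/delta) / (eps^2/2 - eps^3/3), as in (3)."""
    return 2.0 * math.log(1.0 / delta) / (eps ** 2 / 2.0 - eps ** 3 / 3.0)

for eps, delta in [(0.5, 1e-2), (0.2, 1e-3), (0.1, 1e-4), (0.05, 1e-6)]:
    print(eps, delta,
          round(asymptotic_threshold(eps, delta)),
          round(upper_bound(eps, delta)))
```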

The rest of the paper is organized as follows: To prove Theorem 2, we follow the approach of Kane et al. (2011). To make their constant $c$ explicit, however, we must use a more careful argument. In Section 2, we provide explicit conditions under which we prove the main result, Theorem 2. We delay the proofs of these conditions until Section 3 in order to make the main result more accessible, since only the statements of these results are needed and not their more technical proofs. In Section 3, we prove probabilistic bounds on $s=x_1^2+\dots+x_k^2$, where $x=(x_1,\dots,x_k,\dots,x_d)$ is a random variable uniformly distributed on $S^{d-1}$. These bounds explicitly determine the hidden constants in the bounds on $\mathrm{Prob}[s<s_0(1-\varepsilon)]$ and $\mathrm{Prob}[s>s_0(1+\varepsilon)]$ from Kane et al. (2011, Theorem 10). These bounds can be viewed as explicit versions of concentration theorems or laws of large numbers from probability theory. In the Appendix, we provide an alternate proof of a result in Kane et al. (2011), showing that if $d\omega_{d-1}$ is the surface area measure for the $(d-1)$-dimensional sphere $S^{d-1}$, then for any $k<d$,

$$d\omega_{d-1} \;=\; \tfrac{1}{2}\,f(s)\,ds\;d\omega_{k-1}\,d\omega_{d-k-1},$$

where $s\in[0,1]$ and $f(s)=s^{\frac{k}{2}-1}(1-s)^{\frac{d-k}{2}-1}$.

2. Asymptotic Threshold Bound

In this section, we prove the asymptotic threshold bound for JL transformations. In particular, we provide specific conditions that result in the asymptotic threshold bound of $4\varepsilon^{-2}\log(1/\delta)$. In Section 3, we prove that these conditions hold, but the details of these proofs are more technical and are unnecessary to understand the main results of this paper.

2.1 The Uniform Distribution on $S^{d-1}$

There is a unique probability distribution, called the uniform distribution, on the $(d-1)$-dimensional sphere $S^{d-1}$ that is invariant under the orthogonal group. We express the uniform distribution on $S^{d-1}$ in terms of the surface area differential form $d\omega_{d-1}$, which means that, for any measurable subset $V\subseteq S^{d-1}$, the $(d-1)$-dimensional surface area of $V$ equals the integral with respect to $d\omega_{d-1}$, i.e., $\mathrm{Vol}_{d-1}(V)=\int_V d\omega_{d-1}$. Thus, the uniform distribution on $S^{d-1}$ corresponds to the measure $d\omega_{d-1}/\mathrm{Vol}_{d-1}(S^{d-1})$.

As we are interested in reducing a $d$-dimensional vector to a $k$-dimensional vector for $k<d$, we derive a relationship between the uniform distribution on $S^{d-1}$ and the uniform distributions on $S^{k-1}$ and $S^{d-k-1}$. Following the approach of Kane et al. (2011), for $k<d$, we define an injective map $\Psi:S^{d-1}\to[0,1]\times S^{k-1}\times S^{d-k-1}$ as follows: For any $x=(x_1,x_2,\dots,x_d)^t\in S^{d-1}$, we define $s$ in $\Psi(x)=(s,u,v)$ as $s=x_1^2+\dots+x_k^2$. In the case where $0<s<1$, we define $u=(x_1,\dots,x_k)^t/\sqrt{s}$ and $v=(x_{k+1},\dots,x_d)^t/\sqrt{1-s}$. (In this paper, we suppress the pullback maps on equalities for differential forms since there is a unique almost bijective map under consideration in each case. We leave the details to the interested reader.)

When $s=0$, i.e., $x_1=\dots=x_k=0$, we define $u=(1,0,\dots,0)^t$ (or any point in $S^{k-1}$) and $v=(x_{k+1},\dots,x_d)^t$. Similarly, for $s=1$, we define $u=(x_1,\dots,x_k)^t$ and $v=(1,0,\dots,0)^t$ (or any point in $S^{d-k-1}$). It is straightforward to check that $\Psi$ is injective. In addition, the complement of the image of $\Psi$ is a subset of $\{0,1\}\times S^{k-1}\times S^{d-k-1}$, which has, in turn, $(d-1)$-dimensional surface area equal to $0$. Therefore, when convenient, we assume that $s\in(0,1)$.

For $s\in[0,1]$, we define

$$f(s) \;=\; s^{\frac{k}{2}-1}(1-s)^{\frac{d-k}{2}-1}.$$

In the Appendix, we provide an alternative proof of the computation in Kane et al. (2011) which shows that, via the map $\Psi$,

$$d\omega_{d-1} \;=\; \tfrac{1}{2}\,f(s)\,ds\;d\omega_{k-1}\,d\omega_{d-k-1}.$$

Equivalently, in terms of probability distributions,

$$\frac{d\omega_{d-1}}{\mathrm{Vol}_{d-1}(S^{d-1})} \;=\; Bf(s)\,ds\;\frac{d\omega_{k-1}}{\mathrm{Vol}_{k-1}(S^{k-1})}\;\frac{d\omega_{d-k-1}}{\mathrm{Vol}_{d-k-1}(S^{d-k-1})},$$

where

$$B \;=\; \frac{\mathrm{Vol}_{k-1}(S^{k-1})\,\mathrm{Vol}_{d-k-1}(S^{d-k-1})}{2\,\mathrm{Vol}_{d-1}(S^{d-1})} \;=\; \frac{\Gamma(d/2)}{\Gamma(k/2)\,\Gamma((d-k)/2)}$$

is an appropriate scaling constant, depending on the gamma function; for more details, see the Appendix. Moreover, in this situation, we observe that $Bf(s)$ is a probability density on $[0,1]$. This implies that the uniform distribution on $S^{d-1}$ is a direct product of the distributions on the factors. In other words, a uniformly distributed random variable $X_{d-1}$ on $S^{d-1}$ can be decomposed into three random variables $\Psi(X_{d-1})=(S,X_{k-1},X_{d-k-1})$ with the following properties:

(i) $S$ is a random variable on $[0,1]$ with density function $Bf(s)$,
(ii) $X_{k-1}$ and $X_{d-k-1}$ are uniformly distributed on $S^{k-1}$ and $S^{d-k-1}$, and
(iii) the random variables $S$, $X_{k-1}$, and $X_{d-k-1}$ are independent.

The independence of these three random variables is a key property in our proof as it allows us to study the three spaces separately.
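The decomposition above can be checked numerically: up to normalization, $Bf(s)$ is the density of a $\mathrm{Beta}(k/2,(d-k)/2)$ random variable, so $s=x_1^2+\dots+x_k^2$ for uniform $x$ on $S^{d-1}$ should have the corresponding mean and variance. The following sketch is an added illustration (not part of the paper) comparing empirical moments of $s$ with those of the Beta distribution.

```python
import numpy as np

def sample_s(d, k, n, rng):
    """s = x_1^2 + ... + x_k^2 for x uniform on the unit sphere S^{d-1}."""
    x = rng.normal(size=(n, d))
    x /= np.linalg.norm(x, axis=1, keepdims=True)   # uniform points on S^{d-1}
    return np.sum(x[:, :k] ** 2, axis=1)

if __name__ == "__main__":
    d, k, n = 200, 20, 50_000
    rng = np.random.default_rng(2)
    s = sample_s(d, k, n, rng)
    a, b = k / 2.0, (d - k) / 2.0                   # Beta(k/2, (d-k)/2) parameters
    beta_mean = a / (a + b)                         # equals s_0 = k/d
    beta_var = a * b / ((a + b) ** 2 * (a + b + 1.0))
    print("empirical mean/var:", s.mean(), s.var())
    print("Beta mean/var:     ", beta_mean, beta_var)
```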

2.2 Upper Bound: Explicit JL Distribution

We recall that Achlioptas (2003) proved that

$$k_0(\varepsilon,\delta) \;\leq\; \frac{2\log(1/\delta)}{\varepsilon^2/2-\varepsilon^3/3} \;=\; \frac{4\log(1/\delta)}{\varepsilon^2}\,[1+o(1)].$$

In this section, we give an alternate proof of this result using the approach and bounds from this paper. We recall the following construction by Dasgupta and Gupta (2003): A distribution $\mathcal{D}$ on $k\times d$ matrices is formed by picking a $d\times d$ orthogonal matrix $V=(v_1,\dots,v_d)^t$ uniformly at random with respect to the Haar measure on orthogonal matrices and then letting $A=s_0^{-1/2}(v_1,\dots,v_k)^t$, where $s_0=k/d$.

The following proposition shows that $k_0(\varepsilon,\delta)\leq 4\varepsilon^{-2}\log(1/\delta)\,[1+o(1)]$, which, in turn, implies that the limit appearing in Theorem 2 (if it exists) is at most $1$:

Proposition 3 Let $0<\varepsilon,\delta<1/2$ and $s_0=k/d$. Suppose that there is some constant $C$ so that

$$\max\big\{\mathrm{Prob}_{x\in S^{d-1}}[s<s_0(1-\varepsilon)],\;\mathrm{Prob}_{x\in S^{d-1}}[s>s_0(1+\varepsilon)]\big\} \;\leq\; C\,e^{-\frac{k}{4}(\varepsilon^2-\varepsilon^3)},$$

where $s=x_1^2+\dots+x_k^2$ is defined as in $\Psi(x)=(s,u,v)$. Then, there exists an $o(1)$ function, which approaches zero as both $\varepsilon$ and $\delta$ approach zero, so that if $k>4\varepsilon^{-2}\log(1/\delta)\,[1+o(1)]$, then the distribution on $k\times d$ random matrices defined as above is an $(\varepsilon,\delta)$-JL distribution, that is, for any $w\in S^{d-1}$,

$$\mathrm{Prob}_{A\sim\mathcal{D}}\big[\,\big|\|Aw\|^2-1\big|>\varepsilon\,\big] \;\leq\; \delta.$$

Proof Let $V$ be the random orthogonal matrix as defined above, and let $x=(x_1,\dots,x_d)^t=Vw$. Then $Aw=s_0^{-1/2}(x_1,\dots,x_k)^t$, and $\|Aw\|^2=s_0^{-1}(x_1^2+\dots+x_k^2)$. Since $V$ is orthogonal and $\|w\|=1$, we have that $\|x\|=1$, and, hence, $x\in S^{d-1}$. We observe that since $V$ is a random orthogonal matrix, for fixed $w\in S^{d-1}$, $x=Vw$ is a random variable, uniformly distributed on $S^{d-1}$. Hence,

$$\mathrm{Prob}_{A\sim\mathcal{D}}\Big[\Big|\|Aw\|^2-1\Big|>\varepsilon\Big] \;=\; \mathrm{Prob}_{x\in S^{d-1}}\Big[\Big|\frac{1}{s_0}\sum_{i=1}^{k}x_i^2-1\Big|>\varepsilon\Big],$$

where $x\in S^{d-1}$ means that $x$ is a random variable uniformly distributed on $S^{d-1}$. Let $s=\sum_{i=1}^{k}x_i^2$. Then $s\in[0,1]$ and the probability above becomes

$$\mathrm{Prob}_{x\in S^{d-1}}[s<s_0(1-\varepsilon)]+\mathrm{Prob}_{x\in S^{d-1}}[s>s_0(1+\varepsilon)] \;\leq\; 2C\,e^{-\frac{k}{4}(\varepsilon^2-\varepsilon^3)}, \qquad (3)$$

by assumption. We observe that when

$$k \;>\; \frac{4}{\varepsilon^2-\varepsilon^3}\Big[\log\frac{1}{\delta}+\log(2C)\Big] \;=\; \frac{4\log(1/\delta)}{\varepsilon^2}\,[1+o(1)], \qquad (4)$$

the right-hand side of Inequality (3) is less than $\delta$. In this case, the $o(1)$ term needed in the theorem statement appears in Inequality (4). Therefore, when $k>4\varepsilon^{-2}\log(1/\delta)\,[1+o(1)]$, the distribution $\mathcal{D}$ is an $(\varepsilon,\delta)$-JL distribution.

(We observe that this result also follows from work in Dasgupta and Gupta (2003). In particular, the main lemma of Dasgupta and Gupta (2003) can be used to derive bounds similar to the assumed bounds in this statement.)
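The construction recalled in Section 2.2 is simple to simulate. The sketch below is an added illustration (not from the paper): it draws a Haar-random orthogonal matrix via the QR decomposition of a Gaussian matrix (a standard recipe, assumed here), forms $A=\sqrt{d/k}\,(v_1,\dots,v_k)^t$, and estimates the failure probability for a fixed unit vector $w$.

```python
import numpy as np

def haar_projection(d, k, rng):
    """A = sqrt(d/k) * (first k rows of a Haar-random d x d orthogonal matrix),
    i.e. the construction of Section 2.2 with s_0 = k/d."""
    g = rng.normal(size=(d, d))
    q, r = np.linalg.qr(g)
    q *= np.sign(np.diag(r))          # sign fix so that q is Haar-distributed
    return np.sqrt(d / k) * q[:k, :]

if __name__ == "__main__":
    d, k, eps, trials = 200, 50, 0.25, 500
    rng = np.random.default_rng(3)
    w = rng.normal(size=d)
    w /= np.linalg.norm(w)
    bad = sum(abs(np.linalg.norm(haar_projection(d, k, rng) @ w) ** 2 - 1.0) > eps
              for _ in range(trials))
    print("empirical failure probability:", bad / trials)
```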

2.3 Lower Bound for Arbitrary Distributions

In this section, we prove an optimal lower bound on the limit in Theorem 2 that matches the upper bound from the previous section. The proof of this lower bound is the main challenge in this paper. We begin with the following key lemma:

Lemma 4 Let $x=(x_1,\dots,x_d)^t$ be a random variable, uniformly distributed on $S^{d-1}$, $\Psi(x)=(s,u,v)$, and $s_0=k/d$. Suppose that for a fixed $\varepsilon>0$,

$$\min\big\{\mathrm{Prob}[s>s_0(1+\varepsilon)],\;\mathrm{Prob}[s<s_0(1-\varepsilon)]\big\} \;\geq\; L,$$

where $s$ is a random variable with probability distribution $Bf(s)$ on $[0,1]$. For any function $c(u,v)>0$ depending only on $u\in S^{k-1}$ and $v\in S^{d-k-1}$ (i.e., independent of $s$), we have

$$\mathrm{Prob}_{x\in S^{d-1}}\big[\,|sc-1|>\varepsilon\,\big] \;\geq\; L.$$

Proof By the equality of differential forms from Section 2.1,

$$\mathrm{Prob}\big[\,|sc-1|>\varepsilon\,\big] \;=\; \int_{S^{k-1}}\int_{S^{d-k-1}}\left(\int_{|sc-1|>\varepsilon}Bf(s)\,ds\right)\frac{d\omega_{k-1}}{\mathrm{Vol}_{k-1}(S^{k-1})}\,\frac{d\omega_{d-k-1}}{\mathrm{Vol}_{d-k-1}(S^{d-k-1})}.$$

Our goal is to find a lower bound on the inner integral $\int_{|sc-1|>\varepsilon}Bf(s)\,ds$. Due to the independence of $u$, $v$, and $s$, $c(u,v)$ is a fixed positive constant within this integral. We observe that the region $\{|sc-1|>\varepsilon\}$ consists of two intervals, $s<(1-\varepsilon)/c$ and $s>(1+\varepsilon)/c$, and we consider two cases depending on the value of $c$. We begin by recalling that

$$\mathrm{Prob}[s>s_0(1+\varepsilon)]=\int_{s>s_0(1+\varepsilon)}Bf(s)\,ds \quad\text{and}\quad \mathrm{Prob}[s<s_0(1-\varepsilon)]=\int_{s<s_0(1-\varepsilon)}Bf(s)\,ds.$$

If $c\geq 1/s_0$, then $(1+\varepsilon)/c\leq s_0(1+\varepsilon)$, and, hence,

$$\int_{|sc-1|>\varepsilon}Bf(s)\,ds \;\geq\; \int_{s>(1+\varepsilon)/c}Bf(s)\,ds \;\geq\; \int_{s>s_0(1+\varepsilon)}Bf(s)\,ds \;\geq\; L.$$

On the other hand, if $c<1/s_0$, then $s_0(1-\varepsilon)<(1-\varepsilon)/c$, and, hence,

$$\int_{|sc-1|>\varepsilon}Bf(s)\,ds \;\geq\; \int_{s<(1-\varepsilon)/c}Bf(s)\,ds \;\geq\; \int_{s<s_0(1-\varepsilon)}Bf(s)\,ds \;\geq\; L.$$

Therefore, the integral $\int_{|sc-1|>\varepsilon}Bf(s)\,ds$ is bounded from below by $L$, and

$$\mathrm{Prob}\big[\,|sc-1|>\varepsilon\,\big] \;\geq\; \int_{S^{k-1}}\int_{S^{d-k-1}}L\;\frac{d\omega_{k-1}}{\mathrm{Vol}_{k-1}(S^{k-1})}\,\frac{d\omega_{d-k-1}}{\mathrm{Vol}_{d-k-1}(S^{d-k-1})} \;=\; L.$$

We now show that when $k\leq\eta\varepsilon^{-2}\log(1/\delta)$ with $\eta<4$, and $\varepsilon$ and $\delta$ are sufficiently small, there does not exist an $(\varepsilon,\delta)$-JL distribution on $\mathbb{R}^{k\times d}$. This fact, combined with the results in Section 2.2, shows that the limit appearing in Theorem 2 exists and equals $1$. In order to show this, we consider the following related problem: By definition, for a probability distribution $\mathcal{D}$ on $\mathbb{R}^{k\times d}$ to be an $(\varepsilon,\delta)$-JL distribution, the following inequality must hold for every $w\in S^{d-1}$:

$$\mathrm{Prob}_{A\sim\mathcal{D}}\big[\,\big|\|Aw\|^2-1\big|>\varepsilon\,\big] \;<\; \delta.$$

Hence,

$$\mathrm{Prob}_{A\sim\mathcal{D},\,w\in S^{d-1}}\big[\,\big|\|Aw\|^2-1\big|>\varepsilon\,\big] \;<\; \delta, \qquad (5)$$

where $w\in S^{d-1}$ is a random variable distributed uniformly on $S^{d-1}$. Following the approach of Kane et al. (2011), our goal is to prove that, for every $A\in\mathbb{R}^{k\times d}$,

$$\mathrm{Prob}_{w\in S^{d-1}}\big[\,\big|\|Aw\|^2-1\big|>\varepsilon\,\big] \;>\; \delta. \qquad (6)$$

When Inequality (6) holds for all $A$, then Inequality (5) cannot hold for any distribution $\mathcal{D}$ on $\mathbb{R}^{k\times d}$. Therefore, an $(\varepsilon,\delta)$-JL distribution does not exist. We make this precise in the following theorem:

Theorem 5 Suppose that $\eta<4$ and let $k(\varepsilon,\delta)=\eta\varepsilon^{-2}\log(1/\delta)$. Let $s_0=k/d$, and suppose that, for every $\varepsilon$, $\delta$, and $s_0$ sufficiently small (to make $s_0$ sufficiently small, $d$ must be sufficiently large),

$$\min\big\{\mathrm{Prob}[s>s_0(1+\varepsilon)],\;\mathrm{Prob}[s<s_0(1-\varepsilon)]\big\} \;\geq\; C\,\delta^{\frac{\eta}{4}\gamma},$$

where $C>0$ is an absolute constant, and $\gamma$ approaches $1$ as $\varepsilon$, $\delta$, and $s_0$ approach $0$. Then, by decreasing $\varepsilon$, $\delta$, and $s_0$ as needed, for every matrix $A\in\mathbb{R}^{k(\varepsilon,\delta)\times d}$,

$$\mathrm{Prob}_{w\in S^{d-1}}\big[\,\big|\|Aw\|^2-1\big|>\varepsilon\,\big] \;>\; \delta.$$

Proof We assume that $A$ has rank $k=k(\varepsilon,\delta)$ since, if not, we may reduce $k$ (and decrease $\eta$ correspondingly) to the rank of $A$. Let $A=U\Sigma V^t$ be the singular value decomposition of $A$, where $U$ is a $k\times k$ orthogonal matrix, $V=(v_1,\dots,v_d)$ is a $d\times d$ orthogonal matrix, and $\Sigma$ is a $k\times d$ diagonal matrix with $\lambda_i>0$ its entry at $\Sigma_{i,i}$ for $1\leq i\leq k$. Let $x=(x_1,\dots,x_d)^t=V^tw$. Since $V$ is orthogonal, we have $x\in S^{d-1}$. We observe that since $w$ is a uniformly distributed random variable on $S^{d-1}$, $V^tw$ is also a uniformly distributed random variable on $S^{d-1}$. Therefore, since $U$ is orthogonal, we have

$$\|Aw\|^2 \;=\; \|U\Sigma x\|^2 \;=\; \|\Sigma x\|^2 \;=\; \sum_{i=1}^{k}\lambda_i^2x_i^2.$$

We recall the definition of the function $\Psi(x)=(s,u,v)$ where $s=x_1^2+\dots+x_k^2$. Moreover, we restrict our attention to the case where $s\in(0,1)$, since the complement has zero measure. Let

$$c \;=\; \sum_{i=1}^{k}\lambda_i^2x_i^2\Big/s \;=\; \|\Sigma_{k\times k}u\|^2,$$

where $\Sigma_{k\times k}$ denotes the $k\times k$ principal submatrix of $\Sigma$. Then

$$\mathrm{Prob}_{w\in S^{d-1}}\big[\,\big|\|Aw\|^2-1\big|>\varepsilon\,\big] \;=\; \mathrm{Prob}_{x\in S^{d-1}}\big[\,|sc-1|>\varepsilon\,\big].$$

Due to the independence of $u$, $v$, and $s$, it follows that $c$ depends only on $u$. Therefore, by Lemma 4, it follows that

$$\mathrm{Prob}_{w\in S^{d-1}}\big[\,\big|\|Aw\|^2-1\big|>\varepsilon\,\big] \;\geq\; C\,\delta^{\frac{\eta}{4}\gamma}.$$

It follows that for $\varepsilon$, $\delta$, and $s_0$ sufficiently small, $C\,\delta^{\frac{\eta}{4}\gamma}>\delta$.

Therefore, by selecting $\varepsilon$, $\delta$, and $s_0$ sufficiently small, it follows from Theorem 5 that there is no $(\varepsilon,\delta)$-JL distribution when $k\leq\eta\varepsilon^{-2}\log(1/\delta)$ for $\eta<4$. Therefore, $k_0(\varepsilon,\delta)\geq\eta\varepsilon^{-2}\log(1/\delta)$. We collect the results of Proposition 3 and Theorem 5 in the following corollary:

Corollary 6 Assume the hypotheses on $\mathrm{Prob}[s>s_0(1+\varepsilon)]$ and $\mathrm{Prob}[s<s_0(1-\varepsilon)]$ from Proposition 3 and Theorem 5 hold.

(a) There exists an $o(1)$ function that approaches $0$ as $\varepsilon$ and $\delta$ approach zero such that if $k>4\varepsilon^{-2}\log(1/\delta)\,[1+o(1)]$, then there exists a JL distribution.

(b) If $k(\varepsilon,\delta)=\eta\varepsilon^{-2}\log(1/\delta)$ with $\eta<4$, then, by decreasing $\varepsilon$ and $\delta$, and increasing $d$, there is no $(\varepsilon,\delta)$-JL distribution for any $k\leq k(\varepsilon,\delta)$.

This proves the main result of the paper. In the following section, we provide the more technical results that verify the assumptions in Proposition 3 and Theorem 5.
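The phase transition described by Corollary 6 can be observed numerically. The sketch below is an added illustration (not part of the paper): it takes the partial isometry $A=\sqrt{d/k}\,[I_k\mid 0]$, for which $\|Aw\|^2=s/s_0$, and uses the fact from Section 2.1 that $s$ follows a $\mathrm{Beta}(k/2,(d-k)/2)$ distribution to estimate the failure probability over a uniformly random $w$; the estimate crosses $\delta$ near the threshold $4\varepsilon^{-2}\log(1/\delta)$.

```python
import numpy as np

def failure_probability(d, k, eps, trials=1_000_000, seed=4):
    """Estimate Prob_{w uniform on S^{d-1}}[ | ||Aw||^2 - 1 | > eps ] for the
    partial isometry A = sqrt(d/k) [ I_k | 0 ].  For this A, ||Aw||^2 = s/s_0
    with s = w_1^2 + ... + w_k^2, and by Section 2.1, s ~ Beta(k/2, (d-k)/2)."""
    rng = np.random.default_rng(seed)
    s = rng.beta(k / 2.0, (d - k) / 2.0, size=trials)
    return np.mean(np.abs(s * (d / k) - 1.0) > eps)

if __name__ == "__main__":
    d, eps, delta = 4000, 0.25, 1e-3
    print("threshold 4 eps^-2 log(1/delta):", 4 * np.log(1 / delta) / eps ** 2)
    for k in [200, 450, 700]:
        print(k, failure_probability(d, k, eps))
```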

3. Explicit Concentration Bounds

Throughout this section, we assume that $s$ is a random variable with probability density $Bf(s)$. We define $s_0=k/d$, and we further assume that $0<\varepsilon\leq 1/2$, $0<\delta\leq 1/2$, $k\geq 4(1+\varepsilon)^2$, and $s_0<0.4$. We derive lower and upper bounds for the probabilities

$$\mathrm{Prob}[s>s_0(1+\varepsilon)] \quad\text{and}\quad \mathrm{Prob}[s<s_0(1-\varepsilon)],$$

using the probability density $Bf(s)$ for $s$ with $f(s)=s^{\frac{k}{2}-1}(1-s)^{\frac{d-k}{2}-1}$. These probabilities appear in Kane et al. (2011, Theorem 10) without the explicit constants that are derived in this paper. These bounds are instances of explicit concentration theorems, or explicit laws of large numbers, from probability theory. Our goal is to formulate these bounds as precisely as possible so that the lower and upper bounds are asymptotically the same when $\varepsilon$ and $\delta$ approach $0$. (Throughout this section, it is possible to derive tighter bounds on the constants in the following inequalities, but the ones appearing here are sufficient for our proofs. We leave the details of the tighter bounds to the interested reader.)

3.1 Bounds for B

We recall that $\Gamma(1/2)=\sqrt{\pi}$, $\Gamma(1)=1$, and $\Gamma(1+z)=z\Gamma(z)$. Hence $\Gamma(d/2)=\left(\frac{d}{2}-1\right)!$ if $d$ is even, and $\Gamma(d/2)=\frac{1}{2}\cdot\frac{3}{2}\cdots\frac{d-2}{2}\,\sqrt{\pi}$ if $d$ is odd.

In this section, we derive lower and upper bounds for $B$ (see Equation (38) in the Appendix) by using the following form of Stirling's approximation of $n!$ due to Robbins (1955):

$$\sqrt{2\pi}\,n^{n+1/2}e^{-n}e^{\frac{1}{12n+1}} \;<\; \Gamma(n+1)=n! \;<\; \sqrt{2\pi}\,n^{n+1/2}e^{-n}e^{\frac{1}{12n}}.$$

Since we are interested in the asymptotic behavior, we focus on the case where $d$ is even. This choice does not affect the asymptotic results of our paper, but the calculations are more straightforward in this case. We leave the details for the case where $d$ is odd to the interested reader.

Lemma 7 Suppose $k$ and $d$ are both even. Then we have the following inequality:

$$\frac{e^{-1/6}}{2\sqrt{\pi}}\,\sqrt{\frac{k(d-k)}{d}}\;\frac{d^{d/2}}{k^{k/2}(d-k)^{(d-k)/2}} \;\leq\; B \;\leq\; \frac{e^{1/6}}{2\sqrt{\pi}}\,\sqrt{\frac{k(d-k)}{d}}\;\frac{d^{d/2}}{k^{k/2}(d-k)^{(d-k)/2}}.$$

Proof Using the bounds on $n!$ from Robbins (1955) applied to $(d/2)!$, $(k/2)!$, and $((d-k)/2)!$, together with $\Gamma(m/2)=(m/2)!\,/(m/2)$ for even $m$, we obtain

$$C_0\,\sqrt{\frac{k(d-k)}{d}}\;\frac{d^{d/2}}{k^{k/2}(d-k)^{(d-k)/2}} \;\leq\; B \;\leq\; C_1\,\sqrt{\frac{k(d-k)}{d}}\;\frac{d^{d/2}}{k^{k/2}(d-k)^{(d-k)/2}},$$

where

$$C_0=\frac{1}{2\sqrt{\pi}}\,e^{\frac{1}{6d+1}-\frac{1}{6k}-\frac{1}{6(d-k)}}\geq\frac{e^{-1/6}}{2\sqrt{\pi}} \quad\text{and}\quad C_1=\frac{1}{2\sqrt{\pi}}\,e^{\frac{1}{6d}-\frac{1}{6k+1}-\frac{1}{6(d-k)+1}}\leq\frac{e^{1/6}}{2\sqrt{\pi}},$$

since $k\geq 4$ and $d-k\geq 6$.

Corollary 8 With $s_0=k/d$, we have

$$\frac{e^{-1/6}}{2\sqrt{\pi}}\,\sqrt{k} \;\leq\; B\,s_0f(s_0) \;\leq\; \frac{9\,e^{1/6}}{2\sqrt{\pi}}\,\sqrt{k}.$$

Proof Evaluating $f$ at $s_0$ and combining with the bounds on $B$ from Lemma 7 gives

$$\frac{e^{-1/6}}{2\sqrt{\pi}}\,\sqrt{\frac{k}{1-s_0}} \;\leq\; B\,s_0f(s_0) \;\leq\; \frac{e^{1/6}}{2\sqrt{\pi}}\,\sqrt{\frac{k}{1-s_0}}.$$

Since $0<s_0<0.4$, we have $1\leq(1-s_0)^{-1/2}\leq\sqrt{5/3}<9$, which yields the stated bounds.
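Robbins' refinement of Stirling's formula used above is easy to sanity-check numerically; the following snippet (an illustration added here, not part of the paper) verifies the two-sided bound for several values of $n$.

```python
import math

def robbins_bounds(n):
    """Lower and upper bounds on n! from Robbins (1955)."""
    base = math.sqrt(2 * math.pi) * n ** (n + 0.5) * math.exp(-n)
    return base * math.exp(1.0 / (12 * n + 1)), base * math.exp(1.0 / (12 * n))

for n in [2, 4, 10, 50, 100]:
    lo, hi = robbins_bounds(n)
    assert lo < math.factorial(n) < hi
    print(n, lo / math.factorial(n), hi / math.factorial(n))
```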

3.2 Bounds on Prob[s > s₀(1+ε)]

We begin by mentioning the following inequalities, which are used in our arguments below:

$$\log(1+x) \;\geq\; x-\frac{x^2}{2} \quad\text{for } 0<x<1,\qquad(7)$$
$$\log(1-x) \;\geq\; -x-x^2 \quad\text{for } 0<x\leq 0.68,\qquad(8)$$
$$\log(1+x) \;\leq\; x \quad\text{for } x>-1,\;\text{and}\qquad(9)$$
$$\log(1+x) \;\leq\; x-\frac{x^2}{2}+\frac{x^3}{3} \quad\text{for } x>0.\qquad(10)$$

These bounds can be verified by employing basic calculus techniques (e.g., derivatives and Taylor expansions) as well as sufficiently accurate approximations. Using these inequalities, we derive the following bounds:

Lemma 9

$$\mathrm{Prob}[s>s_0(1+\varepsilon)] \;\geq\; \frac{e^{-1/6}}{4\sqrt{\pi}}\,e^{-\frac{k}{4}\left(\varepsilon+k^{-1/2}\right)^2\frac{1+s_0}{1-s_0}}.$$

Moreover, when $k\leq\eta\varepsilon^{-2}\log(1/\delta)$,

$$\mathrm{Prob}[s>s_0(1+\varepsilon)] \;\geq\; \frac{e^{-1/6}}{4\sqrt{\pi}}\,\delta^{\frac{\eta}{4}\gamma_1}, \qquad\text{where } \gamma_1=\left(1+\frac{1}{\sqrt{\eta\log(1/\delta)}}\right)^2\frac{1+s_0}{1-s_0}.$$

Additionally, $\gamma_1$ approaches $1$ as $\varepsilon$, $\delta$, and $s_0$ approach $0$.

Proof Note that

$$\mathrm{Prob}[s>s_0(1+\varepsilon)] \;=\; B\int_{s>s_0(1+\varepsilon)}f(s)\,ds \;=\; Bs_0\int_{x>\varepsilon}f(s_0(1+x))\,dx, \qquad(11)$$

via the substitution $s=s_0(1+x)$. Let $g(s)=s^{k/2}(1-s)^{(d-k)/2}$; then $f(s_0(1+x))$ can be expressed in terms of $g(s)$, namely,

$$f(s_0(1+x)) \;=\; \frac{g(s_0(1+x))}{s_0(1+x)\,\big(1-s_0(1+x)\big)}. \qquad(12)$$

To find a lower bound on $\mathrm{Prob}[s>s_0(1+\varepsilon)]$, we bound $g(s_0(1+x))$ from below. Taking the logarithm of $g(s_0(1+x))$, we find

$$\log g(s_0(1+x)) \;=\; \log g(s_0)+\frac{d}{2}\Big[s_0\log(1+x)+(1-s_0)\log\Big(1-\frac{s_0x}{1-s_0}\Big)\Big]. \qquad(13)$$

Restricting $x$ to the interval $0\leq x<1$, it then follows that $0<\frac{s_0x}{1-s_0}<0.68$ from the assumption that $s_0<0.4$. We now bound the second term in Equation (13) using Inequalities (7) and (8), as follows:

$$s_0\log(1+x)+(1-s_0)\log\Big(1-\frac{s_0x}{1-s_0}\Big) \;\geq\; -\frac{s_0(1+s_0)}{2(1-s_0)}\,x^2. \qquad(14)$$

Hence, by substituting Inequality (14) into Equation (13) and exponentiating, we obtain the following lower bound for $g(s_0(1+x))$:

$$g(s_0(1+x)) \;\geq\; g(s_0)\,e^{-\frac{k}{4}x^2\frac{1+s_0}{1-s_0}}. \qquad(15)$$

We also observe that since $s_0<0.4$ and $0\leq x<1$, the denominator of Equation (12) is bounded from above as follows:

$$s_0(1+x)\big(1-s_0(1+x)\big) \;\leq\; 2\,s_0(1-s_0). \qquad(16)$$

Therefore, by substituting Inequalities (15) and (16) into Equation (12), when $0\leq x<1$, we have

$$f(s_0(1+x)) \;\geq\; \frac{g(s_0)}{2s_0(1-s_0)}\,e^{-\frac{k}{4}x^2\frac{1+s_0}{1-s_0}} \;=\; \frac{f(s_0)}{2}\,e^{-\frac{k}{4}x^2\frac{1+s_0}{1-s_0}}. \qquad(17)$$

Since $s_0<0.4$, we observe that $\frac{1-s_0}{s_0}=\frac{d-k}{k}>\frac{3}{2}$. Since we assumed that $\varepsilon\leq 1/2$ and $k\geq 4(1+\varepsilon)^2>4$, it follows that $\varepsilon+k^{-1/2}<1$, and so $\varepsilon+k^{-1/2}<1<\frac{1-s_0}{s_0}$. Therefore, we further restrict $x$ to the interval $(\varepsilon,\varepsilon+k^{-1/2})$ and observe that Inequality (17) applies in this range. Therefore, since $f$ is a positive function,

$$Bs_0\int_{x>\varepsilon}f(s_0(1+x))\,dx \;\geq\; Bs_0\int_{\varepsilon}^{\varepsilon+k^{-1/2}}f(s_0(1+x))\,dx \;\geq\; \frac{Bs_0f(s_0)}{2}\int_{\varepsilon}^{\varepsilon+k^{-1/2}}e^{-\frac{k}{4}x^2\frac{1+s_0}{1-s_0}}\,dx. \qquad(18)$$

Replacing $Bs_0f(s_0)$ with its lower bound given in Corollary 8 and observing that the integrand is decreasing over an interval of width $k^{-1/2}$, Inequality (18) is bounded from below by

$$\frac{Bs_0f(s_0)}{2}\,k^{-1/2}\,e^{-\frac{k}{4}\left(\varepsilon+k^{-1/2}\right)^2\frac{1+s_0}{1-s_0}} \;\geq\; \frac{e^{-1/6}}{4\sqrt{\pi}}\,e^{-\frac{k}{4}\left(\varepsilon+k^{-1/2}\right)^2\frac{1+s_0}{1-s_0}},$$

which completes the first inequality. The second inequality follows by replacing $k$ by the given upper bound and simplifying.

Lemma 10

$$\mathrm{Prob}[s>s_0(1+\varepsilon)] \;\leq\; \frac{18\,e^{2/3}}{\sqrt{\pi}\,\varepsilon\sqrt{k}}\,e^{-\frac{k}{4}\left(\varepsilon^2-\frac{2\varepsilon^3}{3}\right)} \;\leq\; \frac{18\,e^{2/3}}{\sqrt{\pi}}\,e^{-\frac{k}{4}(\varepsilon^2-\varepsilon^3)},$$

where the second inequality holds whenever $\varepsilon\sqrt{k}\geq 1$.

Proof To derive an upper bound, we start with the expression for $\mathrm{Prob}[s>s_0(1+\varepsilon)]$ from Equation (11). We first find an upper bound on $f(s_0(1+x))$. Using the inequality $1-y\leq e^{-y}$, which holds for all $y$, we have

$$f(s_0(1+x)) \;=\; f(s_0)\,(1+x)^{\frac{k}{2}-1}\Big(1-\frac{s_0x}{1-s_0}\Big)^{\frac{d-k}{2}-1} \;\leq\; f(s_0)\,(1+x)^{\frac{k}{2}-1}e^{-\frac{d-k-2}{2}\cdot\frac{s_0x}{1-s_0}}. \qquad(19)$$

Moreover, since $s_0<0.4$,

$$\frac{(d-k-2)\,s_0}{1-s_0} \;=\; k-\frac{2s_0}{1-s_0} \;\geq\; k-2. \qquad(20)$$

By applying Inequality (20) to Inequality (19), we derive the upper bound

$$f(s_0(1+x)) \;\leq\; f(s_0)\,(1+x)^{\frac{k}{2}-1}e^{-\frac{k-2}{2}x}. \qquad(21)$$

Therefore, by extending the region of integration in Equation (11), we find the following upper bound on the probability:

$$Bs_0\int_{x>\varepsilon}f(s_0(1+x))\,dx \;\leq\; Bs_0f(s_0)\int_{\varepsilon}^{\infty}(1+x)^{\frac{k}{2}-1}e^{-\frac{k-2}{2}x}\,dx.$$

By integrating by parts, we observe that for any $l$ and $m$ with $0\leq l\leq m$,

$$\int_{\varepsilon}^{\infty}(1+x)^{l}e^{-mx}\,dx \;=\; \frac{(1+\varepsilon)^{l}}{m}\,e^{-m\varepsilon}+\frac{l}{m}\int_{\varepsilon}^{\infty}(1+x)^{l-1}e^{-mx}\,dx. \qquad(22)$$

Applying Inequality (22) repeatedly with $l=\frac{k}{2}-1$ and $m=\frac{k-2}{2}$, and bounding the resulting geometric series from above, gives

$$\int_{\varepsilon}^{\infty}(1+x)^{\frac{k}{2}-1}e^{-\frac{k-2}{2}x}\,dx \;\leq\; \frac{2\,(1+\varepsilon)^{k/2}}{\varepsilon\,(k-2)}\,e^{-\frac{(k-2)\varepsilon}{2}}. \qquad(23)$$

By applying Inequality (10) to $(1+\varepsilon)^{k/2}=e^{\frac{k}{2}\log(1+\varepsilon)}$, we obtain the upper bound

$$(1+\varepsilon)^{k/2}\,e^{-\frac{(k-2)\varepsilon}{2}} \;\leq\; e^{\varepsilon}\,e^{-\frac{k}{4}\left(\varepsilon^2-\frac{2\varepsilon^3}{3}\right)} \;\leq\; e^{1/2}\,e^{-\frac{k}{4}\left(\varepsilon^2-\frac{2\varepsilon^3}{3}\right)}.$$

Since $k\geq 4(1+\varepsilon)^2$, it follows that $k-2\geq k/2$. Combining the last three displays with the upper bound on $Bs_0f(s_0)$ from Corollary 8, we find

$$\mathrm{Prob}[s>s_0(1+\varepsilon)] \;\leq\; \frac{9\,e^{1/6}}{2\sqrt{\pi}}\,\sqrt{k}\cdot\frac{4\,e^{1/2}}{\varepsilon\,k}\,e^{-\frac{k}{4}\left(\varepsilon^2-\frac{2\varepsilon^3}{3}\right)} \;=\; \frac{18\,e^{2/3}}{\sqrt{\pi}\,\varepsilon\sqrt{k}}\,e^{-\frac{k}{4}\left(\varepsilon^2-\frac{2\varepsilon^3}{3}\right)},$$

which completes the first inequality. The second inequality follows since, when $\varepsilon\sqrt{k}\geq 1$, we have $\frac{1}{\varepsilon\sqrt{k}}\leq 1\leq e^{\frac{k\varepsilon^3}{12}}$.

3.3 Bounds on Prob[s < s₀(1-ε)]

We begin this section by including the following two additional inequalities on $\log(1-x)$:

$$\log(1-x) \;\geq\; -x-\frac{x^2}{2}-\frac{3x^3}{2} \quad\text{for } 0<x\leq 0.85,\qquad(24)$$
$$\log(1-x) \;\leq\; -x-\frac{x^2}{2} \quad\text{for } 0\leq x<1.\qquad(25)$$

These bounds can be justified using a similar approach as for Inequalities (7)-(10). Using these inequalities, we derive the following bounds:

Lemma 11

$$\mathrm{Prob}[s<s_0(1-\varepsilon)] \;\geq\; \frac{e^{-1/6}}{2\sqrt{\pi}}\,e^{-\frac{k}{4}\left(\varepsilon+k^{-1/2}\right)^2\left(\frac{1}{1-s_0}+3\left(\varepsilon+k^{-1/2}\right)\right)}.$$

Moreover, when $k\leq\eta\varepsilon^{-2}\log(1/\delta)$,

$$\mathrm{Prob}[s<s_0(1-\varepsilon)] \;\geq\; \frac{e^{-1/6}}{2\sqrt{\pi}}\,\delta^{\frac{\eta}{4}\gamma_2}, \qquad\text{where } \gamma_2=\left(1+\frac{1}{\sqrt{\eta\log(1/\delta)}}\right)^2\left(\frac{1}{1-s_0}+3\big(\varepsilon+k^{-1/2}\big)\right).$$

Additionally, $\gamma_2$ approaches $1$ as $\varepsilon$, $\delta$, and $s_0$ approach $0$.

Proof The proof of this lemma is very similar to the proof of Lemma 9, so we focus on the new details. The probability can be rewritten, using the substitution $s=s_0(1-x)$, as

$$\mathrm{Prob}[s<s_0(1-\varepsilon)] \;=\; B\int_{s<s_0(1-\varepsilon)}f(s)\,ds \;=\; Bs_0\int_{\varepsilon}^{1}f(s_0(1-x))\,dx. \qquad(26)$$

Using $g(s)$ as in Lemma 9, it follows that

$$f(s_0(1-x)) \;=\; \frac{g(s_0(1-x))}{s_0(1-x)\,\big(1-s_0(1-x)\big)} \qquad(27)$$

and

$$\log g(s_0(1-x)) \;=\; \log g(s_0)+\frac{d}{2}\Big[s_0\log(1-x)+(1-s_0)\log\Big(1+\frac{s_0x}{1-s_0}\Big)\Big]. \qquad(28)$$

For $0<x\leq\varepsilon+k^{-1/2}$ (the range used below), we have $x<0.85$ and $0<\frac{s_0x}{1-s_0}<0.68$ from the assumptions that $\varepsilon\leq 1/2$, $k\geq 4(1+\varepsilon)^2$, and $s_0<0.4$. Therefore, we can bound the second term in Equation (28) using Inequalities (7) and (24), as follows:

$$s_0\log(1-x)+(1-s_0)\log\Big(1+\frac{s_0x}{1-s_0}\Big) \;\geq\; -\frac{s_0x^2}{2}\Big(\frac{1}{1-s_0}+3x\Big). \qquad(29)$$

Substituting Inequality (29) into Equation (28) and exponentiating, we get

$$g(s_0(1-x)) \;\geq\; g(s_0)\,e^{-\frac{k}{4}x^2\left(\frac{1}{1-s_0}+3x\right)}. \qquad(30)$$

Since $s_0<0.4$, it follows that $\frac{s_0}{1-s_0}<1$, and, hence, that $(1-x)\big(1+\frac{s_0x}{1-s_0}\big)\leq 1$. Therefore, the denominator in Equation (27) can be bounded from above by

$$s_0(1-x)\big(1-s_0(1-x)\big) \;=\; s_0(1-s_0)(1-x)\Big(1+\frac{s_0x}{1-s_0}\Big) \;\leq\; s_0(1-s_0). \qquad(31)$$

Therefore, by substituting Inequalities (30) and (31) into Equation (27), when $0<x\leq\varepsilon+k^{-1/2}$, we have

$$f(s_0(1-x)) \;\geq\; f(s_0)\,e^{-\frac{k}{4}x^2\left(\frac{1}{1-s_0}+3x\right)}. \qquad(32)$$

Since $\varepsilon\leq 1/2$ and $k\geq 4(1+\varepsilon)^2>4$, it follows that $\varepsilon+k^{-1/2}\leq 5/6<1$. Therefore, we restrict $x$ to the interval $(\varepsilon,\varepsilon+k^{-1/2})$ and observe that Inequality (32) applies in this range. Therefore,

$$Bs_0\int_{\varepsilon}^{1}f(s_0(1-x))\,dx \;\geq\; Bs_0\int_{\varepsilon}^{\varepsilon+k^{-1/2}}f(s_0(1-x))\,dx \;\geq\; Bs_0f(s_0)\int_{\varepsilon}^{\varepsilon+k^{-1/2}}e^{-\frac{k}{4}x^2\left(\frac{1}{1-s_0}+3x\right)}dx. \qquad(33)$$

Replacing $Bs_0f(s_0)$ with its lower bound given in Corollary 8 and observing that the integrand is decreasing on an interval of width $k^{-1/2}$, Inequality (33) is bounded from below by

$$Bs_0f(s_0)\,k^{-1/2}\,e^{-\frac{k}{4}\left(\varepsilon+k^{-1/2}\right)^2\left(\frac{1}{1-s_0}+3\left(\varepsilon+k^{-1/2}\right)\right)} \;\geq\; \frac{e^{-1/6}}{2\sqrt{\pi}}\,e^{-\frac{k}{4}\left(\varepsilon+k^{-1/2}\right)^2\left(\frac{1}{1-s_0}+3\left(\varepsilon+k^{-1/2}\right)\right)},$$

which completes the first inequality. The second inequality follows by replacing $k$ by the given upper bound and simplifying.

Lemma 12

$$\mathrm{Prob}[s<s_0(1-\varepsilon)] \;\leq\; \frac{9\,e^{5/3}}{\sqrt{\pi}\,\varepsilon\sqrt{k}}\,e^{-\frac{k\varepsilon^2}{4}} \;\leq\; \frac{9\,e^{5/3}}{\sqrt{\pi}}\,e^{-\frac{k}{4}(\varepsilon^2-\varepsilon^3)},$$

where the second inequality holds whenever $\varepsilon\sqrt{k}\geq 1$.

Proof The proof of this lemma is very similar to the proof of Lemma 10, so we focus on the new details. To prove an upper bound, we start with the expression for $\mathrm{Prob}[s<s_0(1-\varepsilon)]$ from Equation (26). We first observe that

$$f(s_0(1-x)) \;=\; f(s_0)\,(1-x)^{\frac{k}{2}-1}\Big(1+\frac{s_0x}{1-s_0}\Big)^{\frac{d-k}{2}-1}. \qquad(34)$$

We now bound the logarithm of the second and third factors in Equation (34) using Inequalities (9) and (25), as follows:

$$\Big(\frac{k}{2}-1\Big)\log(1-x)+\Big(\frac{d-k}{2}-1\Big)\log\Big(1+\frac{s_0x}{1-s_0}\Big) \;\leq\; \Big(\frac{k}{2}-1\Big)\Big(-x-\frac{x^2}{2}\Big)+\Big(\frac{d-k}{2}-1\Big)\frac{s_0x}{1-s_0}. \qquad(35)$$

Since $\frac{(d-k)\,s_0}{1-s_0}=k$ and $0<x<1$, Inequality (35) further simplifies to

$$\Big(\frac{k}{2}-1\Big)\log(1-x)+\Big(\frac{d-k}{2}-1\Big)\log\Big(1+\frac{s_0x}{1-s_0}\Big) \;\leq\; -\frac{k}{4}x^2+x+\frac{x^2}{2} \;\leq\; -\frac{k}{4}x^2+\frac{3}{2}.$$

Hence, for $0<x<1$, we have

$$f(s_0(1-x)) \;\leq\; f(s_0)\,e^{3/2}\,e^{-\frac{k}{4}x^2}. \qquad(36)$$

Substituting Inequality (36) into the integral of Equation (26), we find

$$Bs_0\int_{\varepsilon}^{1}f(s_0(1-x))\,dx \;\leq\; Bs_0f(s_0)\,e^{3/2}\int_{\varepsilon}^{\infty}e^{-\frac{k}{4}x^2}\,dx \;\leq\; Bs_0f(s_0)\,\frac{2\,e^{3/2}}{\varepsilon\,k}\,e^{-\frac{k\varepsilon^2}{4}}, \qquad(37)$$

where the last step uses the standard Gaussian tail bound $\int_{\varepsilon}^{\infty}e^{-ax^2}\,dx\leq\frac{1}{2a\varepsilon}e^{-a\varepsilon^2}$ with $a=k/4$. Applying the upper bound on $Bs_0f(s_0)$ from Corollary 8 completes the first inequality of the lemma. The final inequality follows since, when $\varepsilon\sqrt{k}\geq 1$, we have $\frac{1}{\varepsilon\sqrt{k}}\leq 1\leq e^{\frac{k\varepsilon^3}{4}}$.

This completes the proof of all of the conditions in Section 2, and, therefore, completes the proof of our main theorem, Theorem 2.

Acknowledgments

This work was supported by a grant from the Simons Foundation (#8399 to Michael Burr) and National Science Foundation grants CCF-5793, CCF-40763, and DMS-40306.

Appendix A. Uniform Distributions on Unit Spheres in High Dimensions

In this section, we study uniform distributions and surface areas (or hypervolumes) on high-dimensional spheres. More precisely, for any $d\geq 2$, let $S^{d-1}$ denote the unit sphere of dimension $d-1$ and $d\omega_{d-1}$ denote the surface area measure for $S^{d-1}$. We show that, for any $k<d$,

$$d\omega_{d-1} \;=\; \tfrac{1}{2}\,f(s)\,ds\;d\omega_{k-1}\,d\omega_{d-k-1},$$

where $s\in[0,1]$ and $f(s)=s^{\frac{k}{2}-1}(1-s)^{\frac{d-k}{2}-1}$. This is a more precise version of a result in Kane et al. (2011), replacing an unspecified constant by $1/2$. We include the proof of this result here in order to make it more accessible to a larger population of researchers. This formula is of independent interest since it shows that the uniform distribution on $S^{d-1}$ is a product of uniform distributions on $S^{k-1}$ and $S^{d-k-1}$ with a distribution on $[0,1]$.

Theorem 13 Under the almost bijective map $\Psi:S^{d-1}\to[0,1]\times S^{k-1}\times S^{d-k-1}$, we have equality of the surface area differential forms on $S^{d-1}$, $S^{k-1}$, and $S^{d-k-1}$, i.e.,

$$d\omega_{d-1} \;=\; \tfrac{1}{2}\,f(s)\,ds\;d\omega_{k-1}\,d\omega_{d-k-1},$$

where $f(s)=s^{\frac{k}{2}-1}(1-s)^{\frac{d-k}{2}-1}$. Equivalently, in terms of probability distribution measures,

$$\frac{d\omega_{d-1}}{\mathrm{Vol}_{d-1}(S^{d-1})} \;=\; Bf(s)\,ds\;\frac{d\omega_{k-1}}{\mathrm{Vol}_{k-1}(S^{k-1})}\;\frac{d\omega_{d-k-1}}{\mathrm{Vol}_{d-k-1}(S^{d-k-1})}.$$

Hence, the uniform distribution on $S^{d-1}$ can be identified with the product distribution on $[0,1]\times S^{k-1}\times S^{d-k-1}$, where the distribution of $s$ on $[0,1]$ has density function $Bf(s)$ and

$$B \;=\; \frac{\mathrm{Vol}_{k-1}(S^{k-1})\,\mathrm{Vol}_{d-k-1}(S^{d-k-1})}{2\,\mathrm{Vol}_{d-1}(S^{d-1})} \;=\; \frac{\Gamma(d/2)}{\Gamma(k/2)\,\Gamma((d-k)/2)}. \qquad(38)$$

This theorem is based on the following lemma, which is well known to experts, but is included here for completeness.

Lemma 14 Let $x=(x_1,\dots,x_d)$ with $x_d>0$ be a point on the upper hemisphere of $S^{d-1}$. Then the surface area measure of the unit sphere $S^{d-1}$ at $x$ is

$$d\omega_{d-1} \;=\; \frac{1}{x_d}\,dx_1\cdots dx_{d-1}.$$

Before we begin the proof, we recall the approach for $S^2$ in $3$-dimensional space. We consider the upper hemisphere of $S^2$ as the graph of a function over $D^2$, where $D^{d-1}$ denotes the $(d-1)$-dimensional disk, namely $D^{d-1}=\{\hat{x}\in\mathbb{R}^{d-1}:\sum_{i=1}^{d-1}\hat{x}_i^2\leq 1\}$. We then integrate over the disk $D^2$ to calculate the surface area of $S^2$. In particular, the integrand is the limit of the ratios of the area of a square in $D^2$ to the area of the corresponding parallelogram above the square in the tangent space of $S^2$, as the square shrinks to a point. In the case of the sphere $S^2$, the parallelogram's area is calculated using the cross product, but we must replace the use of the cross product in higher dimensions.

Proof In $d$-dimensional space, we consider the upper hemisphere of $S^{d-1}$ as the graph of a function over the $(d-1)$-dimensional disk $D^{d-1}$. We construct a pair of $(d-1)$-dimensional parallelepipeds as follows: $P_{d-1}$ is in the tangent space of $D^{d-1}$ and $Q_{d-1}$ is in the tangent space of $S^{d-1}$. Then, we take the limit of their $(d-1)$-dimensional volumes as $P_{d-1}$ approaches a point. Due to complications in taking the $(d-1)$-dimensional volume in $d$-dimensional space, we extend both $P_{d-1}$ and $Q_{d-1}$ to associated, full-dimensional parallelepipeds.

Let $(\hat{x}_1,\dots,\hat{x}_{d-1})\in D^{d-1}$ and define $\varphi:D^{d-1}\to\mathbb{R}_{\geq 0}$ as

$$\varphi(\hat{x}_1,\dots,\hat{x}_{d-1}) \;=\; \sqrt{1-\sum_{i=1}^{d-1}\hat{x}_i^2}.$$

We observe that the graph of this function is the upper hemisphere of $S^{d-1}$. We now extend this map to $D^{d-1}\times\mathbb{R}$ as $\Phi_d:D^{d-1}\times\mathbb{R}\to\mathbb{R}^d$ defined by

$$(\hat{x}_1,\dots,\hat{x}_{d-1},\hat{x}_d) \;\mapsto\; \big((1+\hat{x}_d)\hat{x}_1,\,(1+\hat{x}_d)\hat{x}_2,\,\dots,\,(1+\hat{x}_d)\hat{x}_{d-1},\,(1+\hat{x}_d)\varphi(\hat{x}_1,\dots,\hat{x}_{d-1})\big).$$

We observe that $\Phi_d|_{D^{d-1}\times\{0\}}$ maps the disk $D^{d-1}\times\{0\}$ surjectively onto the graph of $\varphi$, i.e., the upper hemisphere of $S^{d-1}$, see Figure 2.

Figure 2: The map $\Phi_d$ sends the disk $D^{d-1}\times\{0\}$ surjectively onto the upper hemisphere of the $(d-1)$-dimensional unit sphere $S^{d-1}$; the projection $\pi$ onto the first $d-1$ coordinates inverts it.

We observe that, on the upper hemisphere, the inverse of $\Phi_d|_{D^{d-1}\times\{0\}}$ is the projection map $\pi$ onto the first $d-1$ coordinates. We recall that, at $(\hat{x}_1,\dots,\hat{x}_{d-1})\in D^{d-1}$, the tangent space is $\mathbb{R}^{d-1}$, and we define the parallelepiped $P_{d-1}$ in the tangent space by the vectors $\Delta\hat{x}_i\,e_i$ of length $\Delta\hat{x}_i$ in the direction of the $i$-th standard basis vector $e_i$ of $\mathbb{R}^{d-1}$. Similarly, the tangent space at $(\hat{x}_1,\dots,\hat{x}_{d-1},0)\in D^{d-1}\times\mathbb{R}$ is $\mathbb{R}^d$, and we define the parallelepiped $P_d$ in this tangent space by the vectors $\Delta\hat{x}_i\,e_i$ for $1\leq i\leq d-1$ and $h\,e_d$ for the final direction. Then, as the vector $h\,e_d$ is perpendicular to the tangent vectors of the disk $D^{d-1}$, the $d$-dimensional volume of $P_d$ can be computed in terms of the $(d-1)$-dimensional volume of $P_{d-1}$ and the height $h$, i.e., $\mathrm{Vol}_d(P_d)=h\,\mathrm{Vol}_{d-1}(P_{d-1})$.

Next, we let $Q_{d-1}$ and $Q_d$ be the images of $P_{d-1}$ and $P_d$ under the Jacobian of $\Phi_d$, i.e., $\mathrm{Jac}\,\Phi_d$, respectively. Note that the Jacobian of $\Phi_d$, when restricted to $D^{d-1}\times\{0\}$, is

$$\mathrm{Jac}\,\Phi_d\big|_{D^{d-1}\times\{0\}} \;=\; \begin{pmatrix} I_{d-1} & \hat{x} \\[2pt] -\dfrac{\hat{x}^{\,t}}{\varphi(\hat{x})} & \varphi(\hat{x}) \end{pmatrix}, \qquad \hat{x}=(\hat{x}_1,\dots,\hat{x}_{d-1})^t.$$

Since the Jacobian acts on tangent vectors, $Q_{d-1}$ is defined by the vectors

$$\Delta\hat{x}_i\Big(f_i-\frac{\hat{x}_i}{\varphi(\hat{x}_1,\dots,\hat{x}_{d-1})}\,f_d\Big), \qquad 1\leq i\leq d-1,$$

where $f_i$ is the $i$-th standard basis vector of $\mathbb{R}^d$. Moreover, $Q_d$ is defined by these vectors as well as the image of $h\,e_d$, i.e.,

$$h\,\big(\hat{x}_1,\dots,\hat{x}_{d-1},\varphi(\hat{x}_1,\dots,\hat{x}_{d-1})\big).$$

We observe that the vectors $\Delta\hat{x}_i\big(f_i-\frac{\hat{x}_i}{\varphi(\hat{x})}f_d\big)$ for $1\leq i\leq d-1$ are tangent vectors to the sphere $S^{d-1}$, and that $h\,(\hat{x}_1,\dots,\hat{x}_{d-1},\varphi(\hat{x}))$ is the outward-pointing surface normal with length $h$. Since the tangent vectors are perpendicular to the outward-pointing normal, the volumes of $Q_{d-1}$ and $Q_d$ have a similar relationship as the volumes of $P_{d-1}$ and $P_d$, i.e., $\mathrm{Vol}_d(Q_d)=h\,\mathrm{Vol}_{d-1}(Q_{d-1})$. Therefore, the ratio between the $d$-dimensional volumes of $Q_d$ and $P_d$ is the same as the ratio of the $(d-1)$-dimensional volumes of $Q_{d-1}$ and $P_{d-1}$. Since $Q_d$ is the image of $P_d$ under the linear map $\mathrm{Jac}\,\Phi_d|_{D^{d-1}\times\{0\}}$, it follows that

$$\mathrm{Vol}_d(Q_d) \;=\; \big|\det\mathrm{Jac}\,\Phi_d\big|_{(\hat{x}_1,\dots,\hat{x}_{d-1},0)}\big|\;\mathrm{Vol}_d(P_d).$$

Therefore, the ratio of the volumes of $Q_{d-1}$ and $P_{d-1}$ is $\big|\det\mathrm{Jac}\,\Phi_d\big|_{(\hat{x}_1,\dots,\hat{x}_{d-1},0)}\big|$. It is straightforward to compute the determinant of $\mathrm{Jac}\,\Phi_d$ at the point $(\hat{x}_1,\dots,\hat{x}_{d-1},0)$ via a few row reductions to eliminate the first $d-1$ entries in the last row and turn the matrix into an upper triangular matrix whose lower right corner is $\frac{1}{\varphi(\hat{x}_1,\dots,\hat{x}_{d-1})}$. Hence, the determinant of $\mathrm{Jac}\,\Phi_d$ is $\frac{1}{\varphi(\hat{x}_1,\dots,\hat{x}_{d-1})}$, which is the desired scaling factor. Therefore, $\frac{1}{\varphi(\hat{x}_1,\dots,\hat{x}_{d-1})}$ is the local factor in the stretching of the surface area in the map from $D^{d-1}$ to $S^{d-1}$. We recall that the coordinates $x_1,\dots,x_d$ are the coordinates on the upper hemisphere and $\hat{x}_1,\dots,\hat{x}_{d-1}$ are the coordinates on $D^{d-1}$. Since, under the map $\Phi_d|_{D^{d-1}\times\{0\}}$, $x_i=\hat{x}_i$ for $1\leq i\leq d-1$, it follows that $d\hat{x}_1\cdots d\hat{x}_{d-1}=dx_1\cdots dx_{d-1}$ and $\varphi(\hat{x}_1,\dots,\hat{x}_{d-1})=x_d$. From here, the result follows directly.

Proof of Theorem 13 Let $w_1,\dots,w_d$, $x_1,\dots,x_k$, and $y_1,\dots,y_{d-k}$ be coordinates of points on the $(d-1)$-dimensional unit sphere, the $(k-1)$-dimensional unit sphere, and the $(d-k-1)$-dimensional unit sphere, respectively. Let $\hat{w}_1,\dots,\hat{w}_{d-1}$, $\hat{x}_1,\dots,\hat{x}_{k-1}$, and $\hat{y}_1,\dots,\hat{y}_{d-k-1}$ be the coordinates of points on the disks $D^{d-1}$, $D^{k-1}$, and $D^{d-k-1}$, respectively. Let $\phi:[0,1]\times D^{k-1}\times D^{d-k-1}\to D^{d-1}$ be defined by

$$\big(s,(\hat{x}_1,\dots,\hat{x}_{k-1}),(\hat{y}_1,\dots,\hat{y}_{d-k-1})\big) \;\mapsto\; \Big(\sqrt{s}\,\hat{x}_1,\dots,\sqrt{s}\,\hat{x}_{k-1},\;\sqrt{s}\,\sqrt{1-\sum_{i=1}^{k-1}\hat{x}_i^2},\;\sqrt{1-s}\,\hat{y}_1,\dots,\sqrt{1-s}\,\hat{y}_{d-k-1}\Big).$$

We observe that $\phi$ maps $[0,1]\times D^{k-1}\times D^{d-k-1}$ onto the half of the disk $D^{d-1}$ whose $k$-th coordinate is nonnegative. As the measure of the image is the measure of the preimage scaled by the determinant of the Jacobian of $\phi$, the surface area measure of the disk $D^{d-1}$ is

$$d\hat{w}_1\cdots d\hat{w}_{d-1} \;=\; \big|\det\mathrm{Jac}(\phi)\big|\;ds\,d\hat{x}_1\cdots d\hat{x}_{k-1}\,d\hat{y}_1\cdots d\hat{y}_{d-k-1}. \qquad(39)$$

The Jacobian of $\phi$ for $s\in(0,1)$ can be written in block form as

$$\mathrm{Jac}\,\phi \;=\; \begin{pmatrix} \dfrac{\hat{x}}{2\sqrt{s}} & \sqrt{s}\,I_{k-1} & 0 \\[4pt] \dfrac{\hat{x}_k}{2\sqrt{s}} & -\dfrac{\sqrt{s}}{\hat{x}_k}\,\hat{x}^{\,t} & 0 \\[4pt] -\dfrac{\hat{y}}{2\sqrt{1-s}} & 0 & \sqrt{1-s}\,I_{d-k-1} \end{pmatrix}, \qquad \hat{x}_k:=\sqrt{1-\sum_{i=1}^{k-1}\hat{x}_i^2},$$

where the first column contains the partial derivatives with respect to $s$, $\hat{x}=(\hat{x}_1,\dots,\hat{x}_{k-1})^t$, and $\hat{y}=(\hat{y}_1,\dots,\hat{y}_{d-k-1})^t$. Eliminating all but the $k$-th entry of the first column by adding multiples of the other columns to the first column, we obtain

$$\big|\det\mathrm{Jac}(\phi)\big| \;=\; \frac{1}{2\,\hat{x}_k}\,s^{\frac{k}{2}-1}(1-s)^{\frac{d-k-1}{2}}.$$

Substituting this value into Equation (39), we have the surface area measure of $D^{d-1}$ in terms of the disks $D^{k-1}$ and $D^{d-k-1}$. That is,

$$d\hat{w}_1\cdots d\hat{w}_{d-1} \;=\; \frac{1}{2\,\hat{x}_k}\,s^{\frac{k}{2}-1}(1-s)^{\frac{d-k-1}{2}}\,ds\,d\hat{x}_1\cdots d\hat{x}_{k-1}\,d\hat{y}_1\cdots d\hat{y}_{d-k-1}. \qquad(40)$$

We observe that the coordinates of the disk $D^{t-1}$ correspond to the first $t-1$ entries of the coordinates of the unit sphere $S^{t-1}$. Therefore, we may relate $\phi$ to the map $\Psi$, as defined above, via $\Psi^{-1}=\Phi_d\circ\phi\circ\big(\mathrm{id}_s\times\Phi_k^{-1}\times\Phi_{d-k}^{-1}\big)$, where each $\Phi$ is restricted to the corresponding disk. Employing the results of Lemma 14 in the various dimensions, we rewrite the surface measure of each unit sphere in terms of the surface measure of the corresponding disk:

$$d\hat{x}_1\cdots d\hat{x}_{k-1} = dx_1\cdots dx_{k-1} = x_k\,d\omega_{k-1},$$
$$d\hat{y}_1\cdots d\hat{y}_{d-k-1} = dy_1\cdots dy_{d-k-1} = y_{d-k}\,d\omega_{d-k-1},$$
$$d\hat{w}_1\cdots d\hat{w}_{d-1} = dw_1\cdots dw_{d-1} = w_d\,d\omega_{d-1}.$$

By applying the map $\Psi$, we can substitute these three equalities into Equation (40) to obtain

$$d\omega_{d-1} \;=\; \frac{1}{w_d}\,dw_1\cdots dw_{d-1} \;=\; \frac{x_k\,y_{d-k}}{2\,\hat{x}_k\,w_d}\,s^{\frac{k}{2}-1}(1-s)^{\frac{d-k-1}{2}}\,ds\,d\omega_{k-1}\,d\omega_{d-k-1} \;=\; \frac{1}{2}\,s^{\frac{k}{2}-1}(1-s)^{\frac{d-k}{2}-1}\,ds\,d\omega_{k-1}\,d\omega_{d-k-1} \;=\; \frac{1}{2}\,f(s)\,ds\,d\omega_{k-1}\,d\omega_{d-k-1},$$

where the third equality follows from the facts that $x_k=\hat{x}_k$ and $w_d=\sqrt{1-s}\,y_{d-k}$ under the map $\Psi$. Since the cases where $s=0$ or $s=1$ have measure $0$, the result follows.
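The normalization in Theorem 13 is straightforward to verify numerically. The snippet below is an added illustration (not part of the paper): it computes $B$ both from the gamma-function formula (38) and from the sphere surface areas, and checks that $\int_0^1 Bf(s)\,ds\approx 1$ by a simple midpoint rule.

```python
import math

def sphere_surface(n):
    """Surface area Vol_{n-1}(S^{n-1}) of the unit sphere in R^n."""
    return 2 * math.pi ** (n / 2) / math.gamma(n / 2)

def B_gamma(d, k):
    """B from the gamma-function formula in Equation (38)."""
    return math.gamma(d / 2) / (math.gamma(k / 2) * math.gamma((d - k) / 2))

def B_volumes(d, k):
    """B from the sphere-volume formula in Equation (38)."""
    return sphere_surface(k) * sphere_surface(d - k) / (2 * sphere_surface(d))

def integral_Bf(d, k, n=200_000):
    """Midpoint-rule approximation of the integral of B f(s) over [0, 1]."""
    B, h, total = B_gamma(d, k), 1.0 / n, 0.0
    for i in range(n):
        s = (i + 0.5) * h
        total += B * s ** (k / 2 - 1) * (1 - s) ** ((d - k) / 2 - 1) * h
    return total

for d, k in [(10, 4), (50, 10), (200, 20)]:
    print(d, k, B_gamma(d, k), B_volumes(d, k), integral_Bf(d, k))
```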

References

Dimitris Achlioptas. Database-friendly random projections: Johnson-Lindenstrauss with binary coins. Journal of Computer and System Sciences, 66(4):671-687, 2003.

Nir Ailon and Bernard Chazelle. Approximate nearest neighbors and the fast Johnson-Lindenstrauss transform. In Proceedings of the 38th ACM Symposium on Theory of Computing (STOC), 2006.

Noga Alon. Problems and results in extremal combinatorics. Discrete Mathematics, 273:31-53, 2003.

Rosa I. Arriaga and Santosh Vempala. An algorithmic theory of learning: Robust concepts and random projection. Machine Learning, 63(2):161-182, 2006.

Ella Bingham and Heikki Mannila. Random projection in dimensionality reduction: applications to image and text data. In Knowledge Discovery and Data Mining (KDD), pages 245-250, 2001.

Emmanuel J. Candès. The restricted isometry property and its implications for compressed sensing. Comptes Rendus Mathématique, Académie des Sciences, Paris, 346:589-592, 2008.

Kenneth L. Clarkson and David P. Woodruff. Low rank approximation and regression in input sparsity time. In Proceedings of the 45th ACM Symposium on Theory of Computing (STOC), pages 81-90, 2013.

Graham Cormode and Piotr Indyk. Stable distributions in streaming computation. In M. Garofalakis, J. Gehrke, and R. Rastogi, editors, Data Stream Management. Springer, Berlin, Heidelberg, 2016.

Anirban Dasgupta, Ravi Kumar, and Tamás Sarlós. A sparse Johnson-Lindenstrauss transform. In Proceedings of the 42nd ACM Symposium on Theory of Computing (STOC), 2010.

Sanjoy Dasgupta and Anupam Gupta. An elementary proof of a theorem of Johnson and Lindenstrauss. Random Structures and Algorithms, 22(1):60-65, 2003.

Peter Frankl and Hiroshi Maehara. The Johnson-Lindenstrauss lemma and the sphericity of some graphs. Journal of Combinatorial Theory, Series B, 44(3):355-362, 1988.

Piotr Indyk and Rajeev Motwani. Approximate nearest neighbors: Towards removing the curse of dimensionality. In Proceedings of the 30th ACM Symposium on Theory of Computing (STOC), 1998.

T. S. Jayram and David P. Woodruff. Optimal bounds for Johnson-Lindenstrauss transforms and streaming problems with subconstant error. ACM Transactions on Algorithms, 9(3), 2013.

William B. Johnson and Joram Lindenstrauss. Extensions of Lipschitz mappings into a Hilbert space. Contemporary Mathematics, 26:189-206, 1984.

Daniel M. Kane and Jelani Nelson. Sparser Johnson-Lindenstrauss transforms. Journal of the ACM, 61(1), 2014.

Daniel M. Kane, Raghu Meka, and Jelani Nelson. Almost optimal explicit Johnson-Lindenstrauss families. In Proceedings of the 15th International Workshop on Randomization and Computation (RANDOM), 2011.

Jessica Lin and Dimitrios Gunopulos. Dimensionality reduction by random projection and latent semantic indexing. In Proceedings of the Text Mining Workshop, at the 3rd SIAM International Conference on Data Mining, May 2003.

Jiri Matousek. On variants of the Johnson-Lindenstrauss lemma. Random Structures and Algorithms, 33(2):142-156, 2008.

Nam H. Nguyen, Thong T. Do, and Trac D. Tran. A fast and efficient algorithm for low-rank approximation of a matrix. In Proceedings of the 41st ACM Symposium on Theory of Computing (STOC), 2009.

Herbert Robbins. A remark on Stirling's formula. American Mathematical Monthly, 62:26-29, 1955.

Shashanka Ubaru, Arya Mazumdar, and Yousef Saad. Low rank approximation and decomposition of large matrices using error correcting codes. IEEE Transactions on Information Theory, 63(9), 2017.

Kilian Q. Weinberger, Anirban Dasgupta, John Langford, Alexander J. Smola, and Josh Attenberg. Feature hashing for large scale multitask learning. In Proceedings of the 26th Annual International Conference on Machine Learning (ICML), 2009.


More information

Sharp Generalization Error Bounds for Randomly-projected Classifiers

Sharp Generalization Error Bounds for Randomly-projected Classifiers Sharp Generalization Error Bounds for Randomly-projected Classifiers R.J. Durrant and A. Kabán School of Computer Science The University of Birmingham Birmingham B15 2TT, UK http://www.cs.bham.ac.uk/ axk

More information

QUALITATIVE CONTROLLABILITY AND UNCONTROLLABILITY BY A SINGLE ENTRY

QUALITATIVE CONTROLLABILITY AND UNCONTROLLABILITY BY A SINGLE ENTRY QUALITATIVE CONTROLLABILITY AND UNCONTROLLABILITY BY A SINGLE ENTRY D.D. Olesky 1 Department of Computer Science University of Victoria Victoria, B.C. V8W 3P6 Michael Tsatsomeros Department of Mathematics

More information

Fast Dimension Reduction Using Rademacher Series on Dual BCH Codes

Fast Dimension Reduction Using Rademacher Series on Dual BCH Codes Fast Dimension Reduction Using Rademacher Series on Dual BCH Codes Nir Ailon Edo Liberty Abstract The Fast Johnson-Lindenstrauss Transform (FJLT) was recently discovered by Ailon and Chazelle as a novel

More information

A fast randomized algorithm for approximating an SVD of a matrix

A fast randomized algorithm for approximating an SVD of a matrix A fast randomized algorithm for approximating an SVD of a matrix Joint work with Franco Woolfe, Edo Liberty, and Vladimir Rokhlin Mark Tygert Program in Applied Mathematics Yale University Place July 17,

More information

Random Methods for Linear Algebra

Random Methods for Linear Algebra Gittens gittens@acm.caltech.edu Applied and Computational Mathematics California Institue of Technology October 2, 2009 Outline The Johnson-Lindenstrauss Transform 1 The Johnson-Lindenstrauss Transform

More information

MATH 167: APPLIED LINEAR ALGEBRA Least-Squares

MATH 167: APPLIED LINEAR ALGEBRA Least-Squares MATH 167: APPLIED LINEAR ALGEBRA Least-Squares October 30, 2014 Least Squares We do a series of experiments, collecting data. We wish to see patterns!! We expect the output b to be a linear function of

More information

1 9/5 Matrices, vectors, and their applications

1 9/5 Matrices, vectors, and their applications 1 9/5 Matrices, vectors, and their applications Algebra: study of objects and operations on them. Linear algebra: object: matrices and vectors. operations: addition, multiplication etc. Algorithms/Geometric

More information

Notes taken by Costis Georgiou revised by Hamed Hatami

Notes taken by Costis Georgiou revised by Hamed Hatami CSC414 - Metric Embeddings Lecture 6: Reductions that preserve volumes and distance to affine spaces & Lower bound techniques for distortion when embedding into l Notes taken by Costis Georgiou revised

More information

Lecture Notes 1: Vector spaces

Lecture Notes 1: Vector spaces Optimization-based data analysis Fall 2017 Lecture Notes 1: Vector spaces In this chapter we review certain basic concepts of linear algebra, highlighting their application to signal processing. 1 Vector

More information

Math 240 Calculus III

Math 240 Calculus III The Calculus III Summer 2015, Session II Wednesday, July 8, 2015 Agenda 1. of the determinant 2. determinants 3. of determinants What is the determinant? Yesterday: Ax = b has a unique solution when A

More information

25.2 Last Time: Matrix Multiplication in Streaming Model

25.2 Last Time: Matrix Multiplication in Streaming Model EE 381V: Large Scale Learning Fall 01 Lecture 5 April 18 Lecturer: Caramanis & Sanghavi Scribe: Kai-Yang Chiang 5.1 Review of Streaming Model Streaming model is a new model for presenting massive data.

More information

arxiv: v1 [cs.ds] 30 May 2010

arxiv: v1 [cs.ds] 30 May 2010 ALMOST OPTIMAL UNRESTRICTED FAST JOHNSON-LINDENSTRAUSS TRANSFORM arxiv:005.553v [cs.ds] 30 May 200 NIR AILON AND EDO LIBERTY Abstract. The problems of random projections and sparse reconstruction have

More information

(x, y) = d(x, y) = x y.

(x, y) = d(x, y) = x y. 1 Euclidean geometry 1.1 Euclidean space Our story begins with a geometry which will be familiar to all readers, namely the geometry of Euclidean space. In this first chapter we study the Euclidean distance

More information

Optimality of the Johnson-Lindenstrauss lemma

Optimality of the Johnson-Lindenstrauss lemma 58th Annual IEEE Symposium on Foundations of Computer Science Optimality of the Johnson-Lindenstrauss lemma Kasper Green Larsen Computer Science Department Aarhus University Aarhus, Denmark larsen@cs.au.dk

More information

Functional Analysis Review

Functional Analysis Review Outline 9.520: Statistical Learning Theory and Applications February 8, 2010 Outline 1 2 3 4 Vector Space Outline A vector space is a set V with binary operations +: V V V and : R V V such that for all

More information

On Riesz-Fischer sequences and lower frame bounds

On Riesz-Fischer sequences and lower frame bounds On Riesz-Fischer sequences and lower frame bounds P. Casazza, O. Christensen, S. Li, A. Lindner Abstract We investigate the consequences of the lower frame condition and the lower Riesz basis condition

More information

Conceptual Questions for Review

Conceptual Questions for Review Conceptual Questions for Review Chapter 1 1.1 Which vectors are linear combinations of v = (3, 1) and w = (4, 3)? 1.2 Compare the dot product of v = (3, 1) and w = (4, 3) to the product of their lengths.

More information

(1) for all (2) for all and all

(1) for all (2) for all and all 8. Linear mappings and matrices A mapping f from IR n to IR m is called linear if it fulfills the following two properties: (1) for all (2) for all and all Mappings of this sort appear frequently in the

More information

18.06 Professor Johnson Quiz 1 October 3, 2007

18.06 Professor Johnson Quiz 1 October 3, 2007 18.6 Professor Johnson Quiz 1 October 3, 7 SOLUTIONS 1 3 pts.) A given circuit network directed graph) which has an m n incidence matrix A rows = edges, columns = nodes) and a conductance matrix C [diagonal

More information

225A DIFFERENTIAL TOPOLOGY FINAL

225A DIFFERENTIAL TOPOLOGY FINAL 225A DIFFERENTIAL TOPOLOGY FINAL KIM, SUNGJIN Problem 1. From hitney s Embedding Theorem, we can assume that N is an embedded submanifold of R K for some K > 0. Then it is possible to define distance function.

More information

Compressed Sensing and Robust Recovery of Low Rank Matrices

Compressed Sensing and Robust Recovery of Low Rank Matrices Compressed Sensing and Robust Recovery of Low Rank Matrices M. Fazel, E. Candès, B. Recht, P. Parrilo Electrical Engineering, University of Washington Applied and Computational Mathematics Dept., Caltech

More information

APPENDIX A. Background Mathematics. A.1 Linear Algebra. Vector algebra. Let x denote the n-dimensional column vector with components x 1 x 2.

APPENDIX A. Background Mathematics. A.1 Linear Algebra. Vector algebra. Let x denote the n-dimensional column vector with components x 1 x 2. APPENDIX A Background Mathematics A. Linear Algebra A.. Vector algebra Let x denote the n-dimensional column vector with components 0 x x 2 B C @. A x n Definition 6 (scalar product). The scalar product

More information

Math 102, Winter Final Exam Review. Chapter 1. Matrices and Gaussian Elimination

Math 102, Winter Final Exam Review. Chapter 1. Matrices and Gaussian Elimination Math 0, Winter 07 Final Exam Review Chapter. Matrices and Gaussian Elimination { x + x =,. Different forms of a system of linear equations. Example: The x + 4x = 4. [ ] [ ] [ ] vector form (or the column

More information

Upper triangular matrices and Billiard Arrays

Upper triangular matrices and Billiard Arrays Linear Algebra and its Applications 493 (2016) 508 536 Contents lists available at ScienceDirect Linear Algebra and its Applications www.elsevier.com/locate/laa Upper triangular matrices and Billiard Arrays

More information

Randomized Algorithms

Randomized Algorithms Randomized Algorithms 南京大学 尹一通 Martingales Definition: A sequence of random variables X 0, X 1,... is a martingale if for all i > 0, E[X i X 0,...,X i1 ] = X i1 x 0, x 1,...,x i1, E[X i X 0 = x 0, X 1

More information

Algebraic Methods in Combinatorics

Algebraic Methods in Combinatorics Algebraic Methods in Combinatorics Po-Shen Loh 27 June 2008 1 Warm-up 1. (A result of Bourbaki on finite geometries, from Răzvan) Let X be a finite set, and let F be a family of distinct proper subsets

More information

A Randomized Algorithm for the Approximation of Matrices

A Randomized Algorithm for the Approximation of Matrices A Randomized Algorithm for the Approximation of Matrices Per-Gunnar Martinsson, Vladimir Rokhlin, and Mark Tygert Technical Report YALEU/DCS/TR-36 June 29, 2006 Abstract Given an m n matrix A and a positive

More information

LINEAR ALGEBRA SUMMARY SHEET.

LINEAR ALGEBRA SUMMARY SHEET. LINEAR ALGEBRA SUMMARY SHEET RADON ROSBOROUGH https://intuitiveexplanationscom/linear-algebra-summary-sheet/ This document is a concise collection of many of the important theorems of linear algebra, organized

More information

Packing triangles in regular tournaments

Packing triangles in regular tournaments Packing triangles in regular tournaments Raphael Yuster Abstract We prove that a regular tournament with n vertices has more than n2 11.5 (1 o(1)) pairwise arc-disjoint directed triangles. On the other

More information

Approximating sparse binary matrices in the cut-norm

Approximating sparse binary matrices in the cut-norm Approximating sparse binary matrices in the cut-norm Noga Alon Abstract The cut-norm A C of a real matrix A = (a ij ) i R,j S is the maximum, over all I R, J S of the quantity i I,j J a ij. We show that

More information

Introduction to Mobile Robotics Compact Course on Linear Algebra. Wolfram Burgard, Cyrill Stachniss, Kai Arras, Maren Bennewitz

Introduction to Mobile Robotics Compact Course on Linear Algebra. Wolfram Burgard, Cyrill Stachniss, Kai Arras, Maren Bennewitz Introduction to Mobile Robotics Compact Course on Linear Algebra Wolfram Burgard, Cyrill Stachniss, Kai Arras, Maren Bennewitz Vectors Arrays of numbers Vectors represent a point in a n dimensional space

More information

Department of Mathematics, University of California, Berkeley. GRADUATE PRELIMINARY EXAMINATION, Part A Fall Semester 2016

Department of Mathematics, University of California, Berkeley. GRADUATE PRELIMINARY EXAMINATION, Part A Fall Semester 2016 Department of Mathematics, University of California, Berkeley YOUR 1 OR 2 DIGIT EXAM NUMBER GRADUATE PRELIMINARY EXAMINATION, Part A Fall Semester 2016 1. Please write your 1- or 2-digit exam number on

More information

A BRIEF INTRODUCTION TO HILBERT SPACE FRAME THEORY AND ITS APPLICATIONS AMS SHORT COURSE: JOINT MATHEMATICS MEETINGS SAN ANTONIO, 2015 PETER G. CASAZZA Abstract. This is a short introduction to Hilbert

More information

Math 200 University of Connecticut

Math 200 University of Connecticut RELATIVISTIC ADDITION AND REAL ADDITION KEITH CONRAD Math 200 University of Connecticut Date: Aug. 31, 2005. RELATIVISTIC ADDITION AND REAL ADDITION 1 1. Introduction For three particles P, Q, R travelling

More information

Multivariate Statistics Random Projections and Johnson-Lindenstrauss Lemma

Multivariate Statistics Random Projections and Johnson-Lindenstrauss Lemma Multivariate Statistics Random Projections and Johnson-Lindenstrauss Lemma Suppose again we have n sample points x,..., x n R p. The data-point x i R p can be thought of as the i-th row X i of an n p-dimensional

More information

Robust Principal Component Analysis

Robust Principal Component Analysis ELE 538B: Mathematics of High-Dimensional Data Robust Principal Component Analysis Yuxin Chen Princeton University, Fall 2018 Disentangling sparse and low-rank matrices Suppose we are given a matrix M

More information

Strengthened Sobolev inequalities for a random subspace of functions

Strengthened Sobolev inequalities for a random subspace of functions Strengthened Sobolev inequalities for a random subspace of functions Rachel Ward University of Texas at Austin April 2013 2 Discrete Sobolev inequalities Proposition (Sobolev inequality for discrete images)

More information

ADVANCED CALCULUS - MTH433 LECTURE 4 - FINITE AND INFINITE SETS

ADVANCED CALCULUS - MTH433 LECTURE 4 - FINITE AND INFINITE SETS ADVANCED CALCULUS - MTH433 LECTURE 4 - FINITE AND INFINITE SETS 1. Cardinal number of a set The cardinal number (or simply cardinal) of a set is a generalization of the concept of the number of elements

More information

Dimensionality Reduced l 0 -Sparse Subspace Clustering

Dimensionality Reduced l 0 -Sparse Subspace Clustering Yingzhen Yang Independent Researcher Abstract Subspace clustering partitions the data that lie on a union of subspaces. l 0 -Sparse Subspace Clustering (l 0 -SSC), which belongs to the subspace clustering

More information

MATH 22A: LINEAR ALGEBRA Chapter 4

MATH 22A: LINEAR ALGEBRA Chapter 4 MATH 22A: LINEAR ALGEBRA Chapter 4 Jesús De Loera, UC Davis November 30, 2012 Orthogonality and Least Squares Approximation QUESTION: Suppose Ax = b has no solution!! Then what to do? Can we find an Approximate

More information