arxiv: v1 [math.pr] 30 Mar 2015


DICTIONARY LEARNING WITH FEW SAMPLES AND MATRIX CONCENTRATION

KYLE LUH
Department of Mathematics, Yale University

VAN VU
Department of Mathematics, Yale University

Abstract. Let $A$ be an $n \times n$ matrix, $X$ be an $n \times p$ matrix and $Y = AX$. A challenging and important problem in data analysis, motivated by dictionary learning and other practical problems, is to recover both $A$ and $X$, given $Y$. Under normal circumstances, it is clear that this problem is underdetermined. However, in the case when $X$ is sparse and random, Spielman, Wang and Wright showed that one can recover both $A$ and $X$ efficiently from $Y$ with high probability, given that $p$ (the number of samples) is sufficiently large. Their method works for $p \ge Cn^2\log^2 n$ and they conjectured that $p \ge Cn\log n$ suffices. The bound $n\log n$ is sharp for an obvious information theoretical reason. In this paper, we show that $p \ge Cn\log^4 n$ suffices, matching the conjectural bound up to a polylogarithmic factor. The core of our proof is a theorem concerning $\ell_1$ concentration of random matrices, which is of independent interest. Our proof of the concentration result is based on two ideas. The first is an economical way to apply the union bound. The second is a refined version of Bernstein's concentration inequality for the sum of independent variables. Both have nothing to do with random matrices and are applicable in general settings.

E-mail addresses: kyle.luh@yale.edu, van.vu@yale.edu.
Key words and phrases. Dictionary learning, matrix concentration.

1. Introduction

Let $A$ be an $n \times n$ invertible matrix and $X$ be an $n \times p$ matrix; set $Y := AX$. The aim of this paper is to study the following recovery problem: Given $Y$, reconstruct $A$ and $X$.

It is clear that in the equation
$$Y = AX, \tag{1.1}$$
we have $n^2 + np$ unknowns (the entries of $A$ and $X$), and only $np$ equations (given by the entries of $Y$). Thus, the problem is underdetermined and one cannot hope for a unique solution. However, in practice, $X$ is frequently a sparse matrix. If $X$ is sparse, the number of unknowns decreases dramatically, as the majority of entries of $X$ are zero. The name of the game here is to find the minimum value of $p$, the number of observations, which guarantees a unique recovery (e.g. [2] and [6]).

One real-life application that motivates the study of this problem is dictionary learning. The matrix $A$ can be seen as a hidden dictionary, with its columns being the words. $X$ is a sparse sample matrix. This means that in the columns of $Y$ we observe linear combinations of a few columns of $A$. From these observations, we would like to recover the dictionary. An archetypal example is facial recognition [18] [10]. A database of observed faces is used to generate the dictionary and once the dictionary is found, the problem of storing and transmitting facial images can be done very efficiently, as all one needs is to store and transmit a few coefficients. In fact, such dictionary-learning techniques can be utilized to recognize faces that are partially occluded or corrupted with noise [17]. For more discussion and real-life examples, we refer to [9], [12] and the references therein. Another practical situation in which the recovery problem appears essential is blind source separation and we refer the reader to [20] for more details.

There have been many approaches to efficient recovery beginning with the work of [12].
Let us mention, among others, online dictionary learning by [11], SIV [7], the relative Newton method for source separation by [19], the Method of Optimal Directions by [4], K-SVD in [1], and scalable variants in [11].

While various approaches have been considered, there have not been many rigorous results concerning performance. The first such result was obtained by Spielman, Wang and Wright [15] concerning recovery with random samples; in other words, $X$ is a random sparse matrix. Before stating their result, we need to discuss the meaning of "unique" and the random model.

First, notice that if $Y = AX$, then $Y = (AV)(V^{-1}X)$ for any diagonal matrix $V$ with non-zero diagonal entries. Furthermore, one can freely permute the columns of $A$ and the rows of $X$ accordingly while keeping $Y$ the same. In the rest of the paper, unique recovery will be understood modulo these two operations.

To model $X$, one considers random Bernoulli-subgaussian matrices, defined as follows: $X$ is a matrix of size $n \times p$ with iid entries $x_{ij}$, where
$$x_{ij} := \chi_{ij}\xi_{ij}, \tag{1.2}$$
where $\chi_{ij}$ are iid indicator random variables with $P(\chi_{ij} = 1) = \theta$ and $\xi_{ij}$ are iid random variables with mean 0, variance bounded by 1, $E|\xi_{ij}| \in [1/10, 1]$, and $P(|\xi_{ij}| \ge t) \le 2\exp(-t^2/2)$. This model includes many important distributions such as the standard Gaussians and Rademachers. The $1/10$ is introduced for convenience of analysis and is not critical to the argument. Spielman et al. proved

Theorem 1.1. There are constants $C > 0$, $C' > 0$ such that the following holds. Let $A$ be an invertible $n \times n$ matrix and $X$ a sparse random $n \times p$ matrix with $2/n \le \theta \le C'/\sqrt{n}$ and $\xi_{ij}$ having a symmetric distribution. Then for $p \ge Cn^2\log^2 n$, one can efficiently find a solution with probability $1 - o(1)$.

Here and later, efficient means polynomial time. The algorithm designed for this purpose is called ER-SpUD, whose main subroutine is $\ell_1$ optimization. We are going to present and discuss this algorithm in Section 4.

In the dictionary learning problem, $p$ is the number of measurements, and it is important to optimize its value. From below, it is easy to see that we must have $p \ge cn\log n$ for some constant $c > 0$. Indeed, if $\theta = 2/n$ (or $c'/n$ for any constant $c'$) and $p < cn\log n$ for a sufficiently small constant $c$, then the coupon collector argument shows that with probability $1 - o(1)$, $X$ has an all-zero row. In this case, changing the corresponding column of $A$ will not affect $Y$, and a unique recovery is hopeless. Spielman et al. conjecture

Conjecture 1.2. There are constants $C > 0$, $\alpha > 0$ such that the following holds. Let $A$ be an invertible $n \times n$ matrix and $X$ a sparse random $n \times p$ matrix with $2/n \le \theta \le \alpha/\sqrt{n}$. Then for $p \ge Cn\log n$, one can efficiently find a solution with probability $1 - o(1)$.

As a matter of fact, they believe that ER-SpUD should perform well as long as $p \ge Cn\log n$, for some large constant $C$. They also proved that if one does not care about the running time of the algorithm, then $p \ge Cn\log n$ suffices.

The analysis in [15] boils down to the concentration problem. For a vector $v \in \mathbb{R}^n$, let $\mu_v := E\|X^Tv\|_1$. Let $c$ be a small positive constant ($c = .1$ suffices) and let $\mathrm{Bad}(v)$ be the event that $\big|\|X^Tv\|_1 - \mu_v\big| \ge c\mu_v$. We want to have
$$P\Big(\bigcup_{v \in \mathbb{R}^n} \mathrm{Bad}(v)\Big) = o(1). \tag{1.3}$$
In other words, with high probability, $\|X^Tv\|_1$ does not deviate significantly from its mean, simultaneously for all $v \in \mathbb{R}^n$. One needs to find the smallest value of $p$ which guarantees (1.3). Notice that $\|X^Tv\|_1$ is the sum of $p$ iid random variables $|X_i^Tv|$, where $X_i$ are the columns of $X$.
Thus, intuitively the larger $p$ is, the more $\|X^Tv\|_1$ concentrates. From below, we observe that (1.3) fails if $p \le n - 1$, since in this case for any matrix $X$ one can find a $v$ such that $X^Tv = 0$ and $\mu_v \ge 1$ (we can take $v$ arbitrarily long). Spielman, Wang, and Wright [15] showed that $p \ge Cn^2\log^2 n$ suffices. We will prove

Theorem 1.3. For any constant $c > 0$ there is a constant $C > 0$ such that (1.3) holds for any $p \ge Cn\log^4 n$.

Beyond the current application, Theorem 1.3 may be of independent interest for several reasons. While concentration inequalities for random matrices are abundant, most of them concern the spectral or $\ell_2$ norm. We have not seen one which addresses the $\ell_1$ norm as in this theorem. As sparsity plays a crucial role in data analysis, techniques involving the $\ell_1$ norm (such as $\ell_1$ optimization) become more and more important. Furthermore, in the proof we introduce two general ideas, which seem to be applicable in many settings. The first is an economical way to apply the union bound and the second is a refined version of Bernstein's concentration inequality for sums of independent variables.

Using Theorem 1.3, we are able to give an improved analysis of ER-SpUD, which yields

Theorem 1.4. There are constants $C > 0$, $C' > 0$ such that the following holds. Let $A$ be an invertible $n \times n$ matrix and $X$ a sparse random $n \times p$ matrix with $2/n \le \theta \le C'/\sqrt{n}$. Then for $p \ge Cn\log^4 n$, one can efficiently find a solution with probability $1 - o(1)$.

Our $p$ is within a $\log^3 n$ factor of the bound in Conjecture 1.2. Furthermore, we can drop the assumption from Theorem 1.1 that the $\xi_{ij}$ are symmetric.
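
The concentration phenomenon behind (1.3) can be probed with a small simulation. The sketch below is only an illustration under stated assumptions: it uses Rademacher $\xi_{ij}$, fixes two test vectors instead of taking a supremum over all $v$, and measures the spread around the empirical mean rather than around $\mu_v$. The function names and the parameter values are hypothetical and not taken from the paper.

```python
import random

def sample_l1_norm(v, p, theta, rng):
    """Draw the p columns of a Bernoulli-Rademacher matrix X one at a time
    and return ||X^T v||_1 = sum_i |X_i^T v|."""
    total = 0.0
    for _ in range(p):
        dot = 0.0
        for vj in v:
            # Entry x = chi * xi: nonzero with probability theta, sign +-1.
            if rng.random() < theta:
                dot += vj if rng.random() < 0.5 else -vj
        total += abs(dot)
    return total

def relative_deviation(v, p, theta, trials=10, seed=0):
    """Largest relative deviation from the empirical mean over several draws."""
    rng = random.Random(seed)
    samples = [sample_l1_norm(v, p, theta, rng) for _ in range(trials)]
    mean = sum(samples) / trials
    return max(abs(s - mean) for s in samples) / mean

n, theta = 20, 0.2                       # illustrative values only
e1 = [1.0] + [0.0] * (n - 1)             # attains mu_max = p * theta
flat = [1.0 / n] * n                     # the vector (1/n) * all-ones
for p in (200, 2000):
    print(p, relative_deviation(e1, p, theta), relative_deviation(flat, p, theta))
```

For a fixed vector the printed deviations should shrink as $p$ grows; the content of Theorem 1.3 is that this holds uniformly over all $v$, which a simulation over two vectors does not demonstrate.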

Next, we will be able to refine Theorem 1.3 in two ways. First, combining the proof of Theorem 1.4 with a result from random matrix theory, we obtain the following more general result, which handles the case when $A$ is rectangular.

Theorem 1.5. There are constants $C, \alpha > 0$ such that the following holds. Let $n > m$ and $A$ be an $n \times m$ matrix of rank $m$ and $X$ a sparse random $m \times p$ matrix with $2/n \le \theta \le \alpha/\sqrt{n}$. Then for $p \ge Cn\log^4 n$, one can efficiently find a solution with probability $1 - o(1)$.

Second, in the sparsest case $\theta := \Theta(1/n)$, we develop a new algorithm that obtains the optimal bound $p = Cn\log n$, proving Conjecture 1.2 in this regime.

Theorem 1.6. For any $c > 0$ there is a constant $C > 0$ such that the following holds. Let $A$ be an invertible $n \times n$ matrix and $X$ a sparse random $n \times p$ matrix with $\theta = c/n$. Then for $p \ge Cn\log n$, one can efficiently find a solution with probability $1 - o(1)$.

Finally, let us mention the issue of theoretical recovery, regardless of the running time. Without the complexity issue, Spielman et al. showed that $p \ge Cn\log n$ suffices, given that the random variable $\xi_{ij}$ in the definition of $X$ has a symmetric distribution. We can strengthen this theorem by removing this assumption.

Theorem 1.7. There are constants $C > 0$, $C' > 0$ such that the following holds. Let $A$ be an invertible $n \times n$ matrix and $X$ a sparse random $n \times p$ matrix with $2/n \le \theta \le C'/\sqrt{n}$. Then for $p \ge Cn\log n$, one can find a solution with probability $1 - o(1)$.

The rest of the paper is organized as follows. In Section 2, we present the main ideas behind the proof of Theorem 1.3. The details follow in Section 3. Section 4 contains the accompanying algorithms and an improved analysis of ER-SpUD, following [15]. Section 5 addresses a generalization to rectangular dictionaries. Section 6 introduces a new algorithm that achieves the optimal bound in the sparse regime. In Section 7, we prove Theorem 1.7. We conclude with Section 8, in which we present some numerical experiments of the various algorithms.

Acknowledgement. We would like to thank D.
Spielman for bringing the problem to our attention.

2. The main ideas and lemmas

2.1. The standard $\epsilon$-net argument. Let us recall our task. For a vector $v \in \mathbb{R}^n$, let $\mu_v := E\|X^Tv\|_1$. Let $c$ be a small positive constant ($c = .1$ suffices) and let $\mathrm{Bad}(v)$ be the event that $\big|\|X^Tv\|_1 - \mu_v\big| \ge c\mu_v$. We want to show that if $p$ is sufficiently large, then
$$P\Big(\bigcup_{v \in \mathbb{R}^n} \mathrm{Bad}(v)\Big) = o(1). \tag{2.1}$$
For the sake of presentation, let us assume that the random variables $\xi_{ij}$ are Rademacher (taking values $\pm 1$ with probability $1/2$); the entries $x_{ij}$ of $X$ have the form $x_{ij} = \chi_{ij}\xi_{ij}$, where $\chi_{ij}$ are iid indicator variables with mean $\theta$.

We start with a quick proof of the bound $p \ge Cn^2\log^2 n$ obtained in [15]. Notice that the union in (1.3) contains infinitely many terms. The standard way to handle this is to use an $\epsilon$-net argument.

Definition 2.1. A set $N \subset \mathbb{R}^n$ is an $\epsilon$-net of a set $D \subset \mathbb{R}^n$ in $\ell_q$ norm, for some $0 < q \le \infty$, if for any $x \in D$ there is $y \in N$ so that $\|x - y\|_q \le \epsilon$.

The unit sphere in $\ell_q$ norm consists of vectors $v$ where $\|v\|_q = 1$. $B$ denotes the unit sphere in $\ell_1$ norm. Considering the vectors in $B$ is sufficient to prove the result. It is easy to show that for any $v \in B$
$$\mu_{\min} := p\sqrt{\theta/n} \le \mu_v \le p\theta =: \mu_{\max},$$

where the lower bound is attained at $v = \frac{1}{n}\mathbf{1}$ ($\mathbf{1}$ is the all-one vector) and the upper bound at $v = (1, 0, \dots, 0)$. Let $N_0$ be the set of all vectors in $B$ whose coordinates are integer multiples of $n^{-3}$. Any vector in $B$ is at distance at most $n^{-2}$ in $\ell_1$ norm from some vector in $N_0$ (thus $N_0$ is an $n^{-2}$-net of $B$). A short consideration shows that if $u, v \in B$ are within $n^{-2}$ of each other, then
$$|\mu_v - \mu_u| = o(\mu_{\min}).$$
Thus, to prove (1.3), it suffices to show that
$$P\Big(\bigcup_{v \in N_0} \mathrm{Bad}(v)\Big) = o(1). \tag{2.2}$$
In order to bound $P(\cup_{v \in N_0}\mathrm{Bad}(v))$, let us first bound $P(\mathrm{Bad}(v))$ for any $v \in B$. Notice that
$$\|X^Tv\|_1 = \sum_{i=1}^p |X_i^Tv|,$$
where $X_i$ are the columns of $X$. The random variables $|X_i^Tv|$ are iid, and one is poised to apply another standard tool, Bernstein's inequality for the sum of independent random variables.

Lemma 2.2. Let $Z_1, \dots, Z_n$ be independent random variables such that $|Z_i| \le \tau$ with probability 1. Let $S := \sum_{i=1}^n Z_i$. Then for any $T > 0$
$$\max\{P(S - ES \ge T),\, P(S - ES \le -T)\} \le \exp\Big(-\frac{T^2}{2(\mathrm{Var}\,S + \tau T)}\Big) \le \exp\Big(-\min\Big\{\frac{T^2}{4\,\mathrm{Var}\,S}, \frac{T}{4\tau}\Big\}\Big).$$

In our case $Z_i = |X_i^Tv| = \big|\sum_{j=1}^n x_{ij}v_j\big|$. As $|x_{ij}| = |\chi_{ij}\xi_{ij}| \le 1$ with probability 1 (we assume that $\xi_{ij}$ are Rademacher),
$$|Z_i| \le \sum_{j=1}^n |v_j| = \|v\|_1 = 1$$
with probability 1. This means we can set $\tau = 1$. Furthermore,
$$\mathrm{Var}\sum_{i=1}^p Z_i = p\,\mathrm{Var}\,Z_i \le p\,E|X_i^Tv|^2 = p\sum_{j=1}^n \theta v_j^2 \le p\theta\sum_{j=1}^n |v_j| = p\theta.$$
Finally, one can set $T = c\mu_{\min} = cp\sqrt{\theta/n}$. Lemma 2.2 implies that
$$P(\mathrm{Bad}(v)) \le 2\exp\Big(-\min\Big\{\frac{c^2p^2\theta/n}{4p\theta}, \frac{cp\sqrt{\theta/n}}{4}\Big\}\Big) = 2\exp\Big(-\frac{c^2p}{4n}\Big),$$
since $\sqrt{\theta/n} \ge 1/n$ as $\theta \ge 1/n$. Using the union bound
$$P\Big(\bigcup_{v \in N_0}\mathrm{Bad}(v)\Big) \le \sum_{v \in N_0} P(\mathrm{Bad}(v)) \tag{2.3}$$
we obtain
$$P\Big(\bigcup_{v \in N_0}\mathrm{Bad}(v)\Big) \le |N_0| \cdot 2\exp\Big(-\frac{c^2p}{4n}\Big).$$
It is easy to check that $|N_0| = \exp(\Omega(n\log n))$. So, in order to make the RHS $o(1)$, we need $p \ge Cn^2\log n$ for a sufficiently large constant $C$. For the case when $\xi_{ij}$ are not Bernoulli (but still subgaussian) the calculation in [15] requires an extra logarithm term, which results in the bound $p \ge Cn^2\log^2 n$.

2.2. New ingredients. Our first idea is to find a more efficient variant of the union bound
$$P\Big(\bigcup_{v \in N_0}\mathrm{Bad}(v)\Big) \le \sum_{v \in N_0} P(\mathrm{Bad}(v)).$$
Motivated by the inclusion-exclusion formula, we try to capture some gain when $P(\mathrm{Bad}(u) \cap \mathrm{Bad}(v))$ is large for many pairs $u, v$. The idea is to group the elements $v$ of the net into clusters so that within each cluster, the events $\mathrm{Bad}(v)$ (seen as subsets of the underlying probability space) are close to each other. Let $p_0$ denote a common upper bound on $P(\mathrm{Bad}(v))$. Assume, for a moment, that one can split the net $N_0$ into $m$ disjoint clusters $C_i$, $1 \le i \le m$, so that if $u$ and $v$ belong to the same cluster, then $P(\mathrm{Bad}(u) \setminus \mathrm{Bad}(v)) \le p_1$, where $p_1$ is much smaller than $p_0$. Then
$$P\Big(\bigcup_{v \in C_i}\mathrm{Bad}(v)\Big) \le P(\mathrm{Bad}(v_{[i]})) + |C_i|\,p_1,$$
where $v_{[i]}$ is a representative point in $C_i$. Summing over $i$, one obtains
$$P\Big(\bigcup_{v \in N_0}\mathrm{Bad}(v)\Big) \le \sum_{i=1}^m P\Big(\bigcup_{v \in C_i}\mathrm{Bad}(v)\Big) \le \sum_{i=1}^m P(\mathrm{Bad}(v_{[i]})) + |N_0|\,p_1 \le mp_0 + |N_0|\,p_1. \tag{2.4}$$
We gain significantly if $p_1$ is much smaller than $p_0$ and $m$ is much smaller than $|N_0|$. Next, viewing the set of representatives $v_{[i]}$ as a new net $N_1$, we can iterate the argument, obtaining the following lemma.

Lemma 2.3. Let $\mathcal{P}$ be a probability space. Let $N = N_0$ be a finite set, where to each element $v \in N_0$ we associate a set $\mathrm{Bad}_0(v) \subset \mathcal{P}$. Assume that we can construct a sequence of sets $N_L, N_{L-1}, \dots, N_0$, and for each $u \in N_\ell$, $1 \le \ell \le L$, an event $\mathrm{Bad}_\ell(u)$ such that the following holds. For each $v \in N_{\ell-1}$, there is $u \in N_\ell$ such that $P(\mathrm{Bad}_{\ell-1}(v) \setminus \mathrm{Bad}_\ell(u)) \le p_\ell$, and for each $u \in N_L$, $P(\mathrm{Bad}_L(u)) \le p_0$. Then
$$P\Big(\bigcup_{v \in N_0}\mathrm{Bad}_0(v)\Big) \le |N_L|\,p_0 + \sum_{\ell=1}^L |N_{\ell-1}|\,p_\ell. \tag{2.5}$$

The construction of the sets $N_\ell$ is of critical importance, and we are going to construct them using the $\ell_\infty$ distance, rather than the obvious choice of $\ell_1$. This is the key point of our method.

The next main technical ingredient is a more efficient way of using Bernstein's inequality, Lemma 2.2. Recall the bound
$$P(|S - ES| \ge T) \le 2\exp\Big(-\frac{T^2}{2(\mathrm{Var}\,S + \tau T)}\Big) \le 2\exp\Big(-\min\Big\{\frac{T^2}{4\,\mathrm{Var}\,S}, \frac{T}{4\tau}\Big\}\Big). \tag{2.6}$$
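
The two expressions in Bernstein's bound (2.6) can be compared numerically. The sketch below evaluates both (without the leading factor 2) and also reports the two competing terms inside the minimum. The function names and the parameter values are illustrative assumptions; only the formulas come from Lemma 2.2 and the setting of Section 2.1 ($\tau = 1$, $\mathrm{Var}\,S \le p\theta$, $T = cp\sqrt{\theta/n}$).

```python
import math

def bernstein_bound(T, var, tau):
    """exp(-T^2 / (2 (Var S + tau T))), the middle expression in (2.6)."""
    return math.exp(-T * T / (2.0 * (var + tau * T)))

def split_bound(T, var, tau):
    """exp(-min{T^2 / (4 Var S), T / (4 tau)}), the right-most expression."""
    return math.exp(-min(T * T / (4.0 * var), T / (4.0 * tau)))

def exponent_terms(T, var, tau):
    """The two terms competing inside the minimum."""
    return T * T / (4.0 * var), T / (4.0 * tau)

# Hypothetical parameter values, chosen only for illustration.
n, theta, c = 1000, 0.01, 0.1
p = 50 * n
T = c * p * math.sqrt(theta / n)
gaussian_term, linear_term = exponent_terms(T, p * theta, 1.0)
print(gaussian_term, linear_term)
print(bernstein_bound(T, p * theta, 1.0), split_bound(T, p * theta, 1.0))
```

With these values the first term equals $c^2p/(4n)$, as in the computation preceding (2.3), and it is the smaller of the two.
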
The first term $\frac{T^2}{4\,\mathrm{Var}\,S}$ on the right-hand side of (2.6) is usually optimal. However, we need to improve the second term. The idea is to replace $\tau$ with a smaller quantity $\tau'$ such that the probability that $|Z_i| \le \tau'$ is close to 1. Let us illustrate this idea with the upper tail. Setting $\mu := ES$, we consider
$$P(S \ge \mu + T).$$
Write
$$Z_i := Z_iJ_i + Z_iI_i$$

where $J_i$ is the indicator of the event $|Z_i| \le \tau'$ and $I_i = 1 - J_i$. Thus
$$S = \sum_i Z_iJ_i + \sum_i Z_iI_i = Q + S(1).$$
Let $\mu_1$ and $\mu_2$ be the expectations of $Q$ and $S(1)$, respectively. Then
$$P(S \ge \mu + T) \le P(Q \ge \mu_1 + T/2) + P(S(1) \ge \mu_2 + T/2).$$
We can use Lemma 2.2 to bound $P(Q \ge \mu_1 + T/2)$, which provides a bound better than (2.6) as now $\tau' < \tau$. On the other hand, if the probability that $|Z_i| \ge \tau'$ is small, then we can bound $P(S(1) \ge \mu_2 + T/2)$ in a different way, exploiting the fact that there will be very few non-zero summands in $S(1)$. We can (and have to) further refine this idea by considering a sequence of thresholds $\tau'$, breaking $S$ into the sum of $Q$ and $S(k)$, $1 \le k \le M$, for a properly chosen $M$. This will be our leading idea to bound the difference probability $p_\ell$ in the next section.

On the abstract level, our method bears a similarity to the chaining argument from the theory of Banach spaces. We are going to discuss this point in the concluding remarks at the end of Section 3.

3. Proof of Theorem 1.3

For the sake of presentation, we assume that $x_{ij} = \chi_{ij}\xi_{ij}$ where $\chi_{ij}$ are iid Bernoulli random variables with mean $\theta$ and $\xi_{ij}$ are iid Rademacher random variables. In fact, $p \ge Cn\log^3 n$ is sufficient for the Rademacher case. The proof can be easily modified for $\xi_{ij}$ being general subgaussian at the cost of a $\log n$ factor in the bound for $p$ (see Section 3.6).

We recall the notation $\mu_{\min} = p\sqrt{\theta/n}$, $\mu_{\max} = p\theta$; $\mu_v := E\|X^Tv\|_1$. $B$ is the set of all vectors of unit $\ell_1$ norm. We set $p = Cn\log^3 n$, for a sufficiently large constant $C$. Let $T := \frac{c_0\mu_{\min}}{\log n}$ for a small constant $c_0 > 0$ and $K := \frac{6\mu_{\max}}{T}$.

3.1. $\alpha$-nets in $\ell_\infty$ norm.

Lemma 3.1. For any $1 \ge \alpha \ge 2/n$, $B$ admits an $\alpha$-net in $\ell_\infty$ norm of size at most $\exp(2\alpha^{-1}\log n)$.

Proof. Let $N$ be the collection of all vectors $v$ in the unit $\ell_1$ ball whose coordinates are integer multiples of $\alpha$. Obviously, $N$ is an $\alpha$-net of $B$ in $\ell_\infty$ norm. Furthermore, any $v \in N$ satisfies $\|v\|_1 \le 1$, so it has at most $k := \alpha^{-1}$ non-zero coordinates. If a coordinate is non-zero, it can take at most $2\alpha^{-1} \le 3k$ values. Therefore,
$$|N| \le \sum_{i=0}^k \binom{n}{i}(3k)^k.$$
As $\alpha \ge 2/n$, the RHS is at most
$$n\binom{n}{k}(3k)^k \le n\Big(\frac{en}{k}\cdot 3k\Big)^k = n(3en)^k \le \exp(2\alpha^{-1}\log n).$$
The key here is that we consider an $\alpha$-net in $\ell_\infty$ norm, rather than in $\ell_1$ norm, which appears to be a natural choice.
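
The net used in the proof of Lemma 3.1 is concrete enough to be enumerated for tiny parameters. The sketch below is illustrative only: it assumes $\alpha = 1/k$ for an integer $k$, uses exact rational arithmetic, rounds each coordinate toward zero to a multiple of $\alpha$ (which changes it by less than $\alpha$ and cannot increase the $\ell_1$ norm), and counts the net points by brute force. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import floor

def round_to_net(v, alpha):
    """Round every coordinate toward zero to an integer multiple of alpha."""
    out = []
    for x in v:
        steps = floor(abs(x) / alpha)      # whole steps of size alpha
        out.append(steps * alpha if x >= 0 else -steps * alpha)
    return out

def net_size(n, k):
    """Number of vectors in R^n with coordinates in (1/k)Z and l1 norm <= 1,
    counted as integer vectors m with sum |m_j| <= k."""
    return sum(1 for m in product(range(-k, k + 1), repeat=n)
               if sum(abs(c) for c in m) <= k)

n, k = 4, 2                                # alpha = 1/2 = 2/n
alpha = Fraction(1, k)
v = [Fraction(3, 10), Fraction(-7, 10), Fraction(0), Fraction(0)]
u = round_to_net(v, alpha)
print(u, max(abs(a - b) for a, b in zip(u, v)))
print(net_size(n, k), n ** (2 * k))        # net size vs. exp(2 alpha^{-1} log n)
```

Brute-force enumeration is exponential in $n$ and is meant only as a sanity check of the counting argument on toy sizes.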

3.2. Building a nested sequence. Recall that $N_0$ is the set of vectors $v$ in $B$ whose coordinates are integer multiples of $n^{-3}$. We have
$$|N_0| \le (2n^3 + 1)^n \le \exp(4n\log n). \tag{3.1}$$
Consider the sequence $\alpha_0 = 2/n$; $\alpha_\ell = 2\alpha_{\ell-1}$ for $\ell = 1, \dots, L$, where $L \le \log_2 n$ is the first index such that $\alpha_L > 1/2$. Let $N'_\ell$ be an $\alpha_\ell$-net of $B$ in the $\ell_\infty$ norm. By Lemma 3.1, we can choose $N'_\ell$ such that
$$|N'_\ell| \le \exp(2\alpha_\ell^{-1}\log n). \tag{3.2}$$
We now build a nested sequence $N_L \subset N_{L-1} \subset \dots \subset N_1 \subset N_0$ as follows. Assume that $N_{\ell-1}$ has been built. Use the points in $N'_\ell$ as centers to construct a Voronoi partition of the points of $N_{\ell-1}$ with respect to the $\ell_\infty$ norm (ties are broken arbitrarily). For each point $u \in N'_\ell$, let $C_u$ be the subset of $N_{\ell-1}$ corresponding to $u$. By definition, $\|u - v\|_\infty \le \alpha_\ell$ for any $v \in C_u$.

Partition the interval $[\mu_{\min}, \mu_{\max}] = [p\sqrt{\theta/n}, p\theta]$ into $K$ intervals $I_1, \dots, I_K$ of equal lengths. We partition $C_u$ further into $K$ subsets $C_{u,j}$, $1 \le j \le K$, where $v \in C_{u,j}$ if $E\|X^Tv\|_1 \in I_j$. By this construction, if $v, w$ belong to the same $C_{u,j}$, then by the definition of $K$, we have the key relations
$$\|v - w\|_\infty \le 2\alpha_\ell \quad\text{and}\quad \big|E\|X^Tv\|_1 - E\|X^Tw\|_1\big| \le p\theta/K \le T/6. \tag{3.3}$$
From each set $C_{u,j}$ choose an arbitrary element $v$. Thus, each $u \in N'_\ell$ gives rise to a set $R_u$ of $K$ elements ($R$ stands for representative). Define
$$N_\ell := \bigcup_{u \in N'_\ell} R_u.$$
It is clear that $N_\ell \subset N_{\ell-1}$ and
$$|N_\ell| \le K|N'_\ell| \le K\exp(2\alpha_\ell^{-1}\log n). \tag{3.4}$$

3.3. Bounding the differences. Consider the construction of $N_\ell$, $1 \le \ell \le L$, from Section 3.2. Let $v \in N_\ell$. Thus, $v \in C_{u,j}$ for some $u \in N'_\ell$ and $1 \le j \le K$. Consider another point $w \in C_{u,j}$. Our main task is to show

Lemma 3.2. For all pairs $v, w$ as above
$$\rho(v, w) := P\big(\big|\|X^Tv\|_1 - \|X^Tw\|_1\big| \ge T\big) \le \exp(-5\alpha_\ell^{-1}\log n). \tag{3.5}$$

The rest of this section is devoted to the proof of this lemma. By (3.3), we have
$$\|v - w\|_\infty \le 2\alpha_\ell \quad\text{and}\quad \big|E\|X^Tv\|_1 - E\|X^Tw\|_1\big| \le p\theta/K \le T/6. \tag{3.6}$$
Define $Z_i = |X_i^Tv| - |X_i^Tw|$, where $X_i^T$ is the $i$th row of $X^T$; we have
$$\|X^Tv\|_1 - \|X^Tw\|_1 = \sum_{i=1}^p \big(|X_i^Tv| - |X_i^Tw|\big) = \sum_{i=1}^p Z_i.$$
Set $S := \sum_{i=1}^p Z_i$; by symmetry, it suffices to bound
$$P(Z_1 + \dots + Z_p \ge T) =: P(S \ge T).$$
Notice that by the triangle inequality
$$|Z_i| = \big||X_i^Tv| - |X_i^Tw|\big| \le |X_i^T(v - w)|.$$
Therefore,

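$$\mathrm{Var}\,Z_i \le EZ_i^2 \le E|X_i^T(v - w)|^2 = \theta\sum_{j=1}^n (v_j - w_j)^2.$$
Recall that $\|v\|_1, \|w\|_1 \le 1$ and $\|v - w\|_\infty \le \alpha_\ell$. Therefore
$$\sum_{j=1}^n (v_j - w_j)^2 \le \alpha_\ell\sum_{j=1}^n \big(|v_j| + |w_j|\big) = 2\alpha_\ell.$$
This implies
$$\mathrm{Var}\,Z_i \le EZ_i^2 \le 2\alpha_\ell\theta. \tag{3.7}$$
We denote by $I_{i,k}$ the event that $\tau_k < |Z_i| \le \tau_{k-1}$ for $k = 1, \dots, M$ and by $J_i$ the event that $|Z_i| \le \tau_M$, for a sequence $\tau_k$, $k = 0, \dots, M$, where $\tau_0 = 2$, $\tau_i = 2^{-i}\tau_0$, and $M$ is the largest index so that
$$\min\Big\{\frac{\tau_M^2}{8\alpha_\ell\theta}, \frac{\tau_M}{4\alpha_\ell}\Big\} \ge 8\log n. \tag{3.8}$$
Note that if $\alpha_\ell \le \frac{1}{32}\log^{-1} n$ then such an index $M \ge 1$ exists. We will proceed with this assumption and cover the remaining cases at the end of the proof. Using the same symbols for the indicators of these events, we have
$$Z_i \le \sum_k |Z_i|I_{i,k} + Z_iJ_i.$$
Set $S(k) = \sum_{i=1}^p |Z_i|I_{i,k}$ for $k = 1, \dots, M$ and $Q = \sum_{i=1}^p Z_iJ_i$. We have
$$P(S \ge T) \le P(Q \ge T/2) + \sum_{k=1}^M P\Big(S(k) \ge \frac{T}{2M}\Big).$$
To bound $P(Q \ge T/2)$, we notice that (see (3.11)) the choice of $\tau_M$ guarantees that $P(J_i) \ge 1 - 2n^{-8}$ for all $i = 1, \dots, p$. As $|Z_i| \le 2$ with probability 1, it follows that
$$|EZ_iJ_i - EZ_i| \le 4n^{-8}$$
and so
$$|EQ - ES| \le 4pn^{-8} = o(n^{-6}),$$
as $p = o(n^2)$. On the other hand, by (3.6), $T \ge 5(ES + n^{-6})$. Thus
$$P(Q \ge T/2) \le P(Q \ge EQ + T/4).$$
By definition, $Q$ is a sum of $p$ iid random variables, each bounded by $\tau_M$ in absolute value with probability 1. Furthermore, by (3.7)
$$\mathrm{Var}\,Q = p\,\mathrm{Var}\,Z_1J_1 \le p\,EZ_1^2 \le 2\alpha_\ell\theta p.$$
By Lemma 2.2, we have
$$P(Q \ge EQ + T/4) \le 2\exp\Big(-\min\Big\{\frac{(T/4)^2}{8\alpha_\ell\theta p}, \frac{T/4}{4\tau_M}\Big\}\Big) = 2\exp\Big(-\min\Big\{\frac{T^2}{128\alpha_\ell\theta p}, \frac{T}{16\tau_M}\Big\}\Big). \tag{3.9}$$
Now we bound $P(S(k) \ge \frac{T}{2M})$, for $k = 1, \dots, M$. Recall that $S(k) := \sum_{i=1}^p |Z_i|I_{i,k}$ is a sum of iid non-negative random variables, each of which is either 0 or in $(\tau_k, \tau_{k-1}]$. Thus, if $S(k) \ge T/2M$ there must be at least $p_k := \frac{T}{2M\tau_{k-1}}$ indices $i$ such that $|Z_i| > \tau_k$. Let $\rho_k$ be the probability that $|Z_1| > \tau_k$. Then by the union bound and the fact that $p = o(n^2)$,
$$P\Big(S(k) \ge \frac{T}{2M}\Big) \le \binom{p}{p_k}\rho_k^{p_k} \le \Big(\frac{ep}{p_k}\rho_k\Big)^{p_k} \le \Big(\frac{n^2}{2}\rho_k\Big)^{p_k}. \tag{3.10}$$
To complete the analysis, we need to estimate $\rho_k$. By definition
$$\rho_k := P\big(\big||X_1^Tv| - |X_1^Tw|\big| > \tau_k\big) \le P\big(|X_1^T(v - w)| \ge \tau_k\big).$$
The random variable $Z'_1 := X_1^T(v - w) = \sum_{j=1}^n x_{1j}(v_j - w_j)$ has mean 0. Furthermore, by (3.7), $\mathrm{Var}\,Z'_1 \le E{Z'_1}^2 \le 2\alpha_\ell\theta$. Finally, each term $x_{1j}(v_j - w_j)$ is at most $\alpha_\ell$ in absolute value. Thus Lemma 2.2 implies
$$\rho_k \le P(|Z'_1| \ge \tau_k) \le 2\exp\Big(-\min\Big\{\frac{\tau_k^2}{8\alpha_\ell\theta}, \frac{\tau_k}{4\alpha_\ell}\Big\}\Big). \tag{3.11}$$
This and (3.10) yield
$$P\Big(S(k) \ge \frac{T}{2M}\Big) \le 2\exp\Big(\Big(-\min\Big\{\frac{\tau_k^2}{8\alpha_\ell\theta}, \frac{\tau_k}{4\alpha_\ell}\Big\} + 2\log n\Big)p_k\Big). \tag{3.12}$$
By (3.8),
$$\min\Big\{\frac{\tau_k^2}{8\alpha_\ell\theta}, \frac{\tau_k}{4\alpha_\ell}\Big\} \ge 8\log n,$$
so
$$\Big(-\min\Big\{\frac{\tau_k^2}{8\alpha_\ell\theta}, \frac{\tau_k}{4\alpha_\ell}\Big\} + 2\log n\Big)p_k \le -\frac{1}{2}\min\Big\{\frac{\tau_k^2}{8\alpha_\ell\theta}p_k, \frac{\tau_k}{4\alpha_\ell}p_k\Big\}.$$
By definition $p_k = \frac{T}{2M\tau_{k-1}} = \frac{T}{4M\tau_k}$, as $\tau_{k-1} = 2\tau_k$. Therefore,
$$\frac{1}{2}\cdot\frac{\tau_k^2}{8\alpha_\ell\theta}p_k = \frac{T\tau_k}{64M\alpha_\ell\theta} \quad\text{and}\quad \frac{1}{2}\cdot\frac{\tau_k}{4\alpha_\ell}p_k = \frac{T}{32M\alpha_\ell}.$$
By (3.9) and (3.12), we conclude that
$$P(S \ge T) \le 2\exp\Big(-\min\Big\{\frac{T^2}{128\alpha_\ell\theta p}, \frac{T}{16\tau_M}\Big\}\Big) + 2\sum_{k=1}^M \exp\Big(-\min\Big\{\frac{T\tau_k}{64M\alpha_\ell\theta}, \frac{T}{32M\alpha_\ell}\Big\}\Big). \tag{3.13}$$
A routine verification (see Section 3.5) shows that once $p \ge Cn\log^3 n$ for a sufficiently large constant $C$, the RHS in (3.13) is at most $\exp(-5\alpha_\ell^{-1}\log n)$, completing the proof for the case $\alpha_\ell \le \frac{1}{32}\log^{-1} n$.

To complete the proof, we now treat the remaining case when $\alpha_\ell \ge \frac{1}{32}\log^{-1} n$. In this case, we do not need to split $Z_i$. Recall $S = Z_1 + \dots + Z_p$ where $|Z_i| \le 2$ with probability 1, $ES \le T/6$ and $\mathrm{Var}\,S \le 2p\theta\alpha_\ell$. By Lemma 2.2, we have
$$P(S \ge T) \le P(S \ge ES + T/2) \le \exp\Big(-\min\Big\{\frac{T^2}{8p\theta\alpha_\ell}, \frac{T}{8}\Big\}\Big).$$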

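By the analysis of (3.13), we already know that
$$\frac{T^2}{8p\theta\alpha_\ell} \ge 5\alpha_\ell^{-1}\log n.$$
On the other hand, as $\alpha_\ell \ge \frac{1}{32}\log^{-1} n$,
$$\frac{T}{8} = \frac{c_0p\sqrt{\theta/n}}{8\log n} = \frac{c_0C}{8}\sqrt{\theta n}\,\log^2 n \ge 5\alpha_\ell^{-1}\log n,$$
given that $c_0C$ is sufficiently large. This completes the proof.

3.4. Proof of the Concentration lemma. For $v \in N_\ell$, $0 \le \ell \le L$, let $\mathrm{Bad}_\ell(v)$ be the event that $\big|\|X^Tv\|_1 - \mu_v\big| \ge 2(L + 1 - \ell)T$. For $\ell = 0$,
$$2(L + 1 - \ell)T = 2(L + 1)T \le \frac{2c_0(\log_2 n + 1)\mu_{\min}}{\log n} \le 4c_0\mu_{\min}.$$
Thus,
$$P\Big(\bigcup_{v \in N_0}\big\{\big|\|X^Tv\|_1 - \mu_v\big| \ge 4c_0\mu_{\min}\big\}\Big) \le P\Big(\bigcup_{v \in N_0}\mathrm{Bad}_0(v)\Big).$$
Assume that there is a number $p_0$ such that $P(\mathrm{Bad}_L(v)) \le p_0$ for all $v \in N_L$. Assume furthermore that for any $1 \le \ell \le L$, there is a number $p_\ell$ such that
$$P(\mathrm{Bad}_{\ell-1}(w) \setminus \mathrm{Bad}_\ell(v)) \le p_\ell$$
for $v \in N_\ell$ and $w \in N_{\ell-1}$, where $v$ is the representative of the set $C_{u,k}$ that contains $w$ (see the construction in Section 3.2). Then by Lemma 2.3
$$P\Big(\bigcup_{v \in N_0}\mathrm{Bad}_0(v)\Big) \le |N_L|\,p_0 + \sum_{\ell=1}^L |N_{\ell-1}|\,p_\ell.$$
To find $p_\ell$, notice that if $\mathrm{Bad}_{\ell-1}(w)$ holds and $\mathrm{Bad}_\ell(v)$ does not, then
$$\big|\|X^Tw\|_1 - \mu_w\big| \ge 2(L + 2 - \ell)T \quad\text{and}\quad \big|\|X^Tv\|_1 - \mu_v\big| \le 2(L + 1 - \ell)T.$$
By (3.3), $|\mu_v - \mu_w| \le T$. It thus follows that
$$\big|\|X^Tw\|_1 - \|X^Tv\|_1\big| \ge T.$$
By the main lemma of Section 3.3, we know that the probability of this event is at most
$$p_\ell := \exp(-5\alpha_\ell^{-1}\log n),$$
for all $\ell$. Recalling from Section 3.2 that
$$|N_{\ell-1}| \le K\exp(2\alpha_{\ell-1}^{-1}\log n) = K\exp(4\alpha_\ell^{-1}\log n),$$
we have
$$\sum_{\ell=1}^L |N_{\ell-1}|\,p_\ell \le K\sum_{\ell=1}^L \exp(4\alpha_\ell^{-1}\log n)\exp(-5\alpha_\ell^{-1}\log n) = K\sum_{\ell=1}^L \exp(-\alpha_\ell^{-1}\log n).$$
Since $K = O(n^{1/2})$ and $\alpha_\ell^{-1} \ge 1$, the RHS is at most
$$\sum_{\ell=1}^L \exp(-.5\alpha_\ell^{-1}\log n) = o(1).$$
To conclude, notice that by Lemma 2.2, we can set
$$p_0 := 2\exp\Big(-\min\Big\{\frac{T^2}{8p\theta}, \frac{T}{8}\Big\}\Big).$$
As $|N_L| \le K\exp(2\alpha_L^{-1}\log n) \le K\exp(4\log n)$ since $\alpha_L \ge 1/2$, we have
$$p_0|N_L| = o(1),$$
as long as $\min\{\frac{T^2}{8p\theta}, \frac{T}{8}\} \ge 5\log n$. This condition holds if $p \ge Cn\log^3 n$ for a sufficiently large constant $C$. This implies that
$$P\Big(\bigcup_{v \in N_0}\big\{\big|\|X^Tv\|_1 - \mu_v\big| \ge 4c_0\mu_{\min}\big\}\Big) = o(1),$$
and we are done by (2.2).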
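
The scale sequence $\alpha_\ell$ and the final sum $\sum_{\ell=1}^L \exp(-.5\alpha_\ell^{-1}\log n)$ from the argument above are easy to evaluate for a concrete $n$. The following sketch does so; the function names are hypothetical and the choice $n = 1024$ is purely illustrative.

```python
import math

def alpha_sequence(n):
    """alpha_0 = 2/n, alpha_l = 2 * alpha_{l-1}, stopping at the first
    index L with alpha_L > 1/2."""
    alphas = [2.0 / n]
    while alphas[-1] <= 0.5:
        alphas.append(2.0 * alphas[-1])
    return alphas

def chained_tail(n):
    """Sum over l = 1..L of exp(-0.5 * log(n) / alpha_l)."""
    alphas = alpha_sequence(n)
    return sum(math.exp(-0.5 * math.log(n) / a) for a in alphas[1:])

n = 1024
alphas = alpha_sequence(n)
L = len(alphas) - 1
print(L, math.log2(n), alphas[-1])
print(chained_tail(n))
```

The sum is dominated by its last term, which is $n^{-1/2}$ when $\alpha_L = 1$; the coarsest scale is therefore the bottleneck in this estimate.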

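3.5. The magnitude of $p$. We present the routine verification concerning the exponents in (3.13). This is the only place where the magnitude of $p$ matters. Recall that
$$T = \frac{c_0\mu_{\min}}{\log n} = \frac{c_0p\sqrt{\theta/n}}{\log n}, \qquad p = Cn\log^3 n$$
(since for the sake of exposition we are only considering the Rademacher case). We have
$$\frac{T^2}{128\alpha_\ell\theta p} = \frac{c_0^2p^2\theta/n}{128\theta p\log^2 n}\alpha_\ell^{-1} = \frac{c_0^2p}{128n\log^2 n}\alpha_\ell^{-1} = \frac{c_0^2C}{128}\alpha_\ell^{-1}\log n \ge 4.1\alpha_\ell^{-1}\log n,$$
provided that $c_0^2C/128 \ge 4.1$. By the definition of $M$ in (3.8), we have
$$32\log n \ge \min\Big\{\frac{\tau_M^2}{8\alpha_\ell\theta}, \frac{\tau_M}{4\alpha_\ell}\Big\} \ge 8\log n.$$
This implies that
$$\tau_M \le \max\big\{16\sqrt{\alpha_\ell\theta\log n},\, 128\alpha_\ell\log n\big\}.$$
It follows that
$$\frac{T}{16\tau_M} \ge \min\Big\{\frac{T}{256\sqrt{\alpha_\ell\theta\log n}}, \frac{T}{2048\alpha_\ell\log n}\Big\}.$$
By the definition of $T$ and $p$,
$$\frac{T}{256\sqrt{\alpha_\ell\theta\log n}} = \frac{c_0p}{256\sqrt{\alpha_\ell n}\log^{3/2} n} = \alpha_\ell^{-1}\frac{c_0C}{256}\sqrt{n\alpha_\ell}\,\log^{3/2} n \ge 4.1\alpha_\ell^{-1}\log n,$$
since $c_0C/256 \ge 4.1$ and $n\alpha_\ell \ge n\alpha_0 = n\cdot\frac{2}{n} > 1$. Furthermore,
$$\frac{T}{2048\alpha_\ell\log n} = \frac{c_0Cn\log^3 n\sqrt{\theta/n}}{2048\alpha_\ell\log^2 n} = \alpha_\ell^{-1}\frac{c_0C}{2048}\sqrt{\theta n}\,\log n \ge 4.1\alpha_\ell^{-1}\log n,$$
provided that $c_0C/2048 \ge 4.1$, since $\theta n \ge 1$.

Next, we bound the exponent $\frac{T}{32M\alpha_\ell}$. As $M \le \log n$, we have
$$\frac{T}{32M\alpha_\ell} \ge \frac{c_0Cn\log^2 n\sqrt{\theta/n}}{32\alpha_\ell\log n} = \alpha_\ell^{-1}\frac{c_0C}{32}\sqrt{\theta n}\,\log n \ge 4.1\alpha_\ell^{-1}\log n,$$
provided that $c_0C/32 \ge 4.1$, since $\theta n \ge 1$.

Finally, we bound the exponent $\frac{T\tau_k}{64M\alpha_\ell\theta}$. By definition $\frac{\tau_k^2}{8\alpha_\ell\theta} \ge 8\log n$, that is, $\tau_k \ge 8\sqrt{\alpha_\ell\theta\log n}$, and $M \le \log n$; thus
$$\frac{T\tau_k}{64M\alpha_\ell\theta} \ge \frac{T\cdot 8\sqrt{\alpha_\ell\theta\log n}}{64\alpha_\ell\theta\log n} = \alpha_\ell^{-1}\frac{c_0C}{8}\sqrt{n\alpha_\ell}\,\log^{3/2} n = \omega(\alpha_\ell^{-1}\log n),$$
concluding the proof.

3.6. Extension from Rademacher to general sub-gaussian variables. We introduce the truncation operator $T_\tau : \mathbb{R}^{n \times p} \to \mathbb{R}^{n \times p}$ as
$$(T_\tau[M])_{ij} = \begin{cases} M_{ij} & \text{if } |M_{ij}| \le \tau, \\ 0 & \text{else.} \end{cases}$$
Let $\tau = C\sqrt{\log n}$ and let $X' = T_\tau[X]$. For $C$ sufficiently large, the probability that $X' = X$ is $1 - o(1)$. This allows us to work with a random matrix whose entries are bounded by $\tau$ (instead of 1 as in the Rademacher case). The same proof will go through if we increase $p$ by a factor of $C_1\tau$, for a sufficiently large constant $C_1$. This means $p = O(n\log^{3.5} n)$ suffices. We round 3.5 up to 4 for cosmetic reasons.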
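
The truncation operator of Section 3.6 is a one-line entrywise map. The sketch below applies it to a small Bernoulli-Gaussian matrix stored as nested lists and counts how many entries it changes. The function name `truncate`, the constant $C = 3$, and the matrix dimensions are illustrative assumptions, not values from the paper.

```python
import math
import random

def truncate(M, tau):
    """Entrywise truncation: keep M_ij when |M_ij| <= tau, otherwise set it to 0."""
    return [[x if abs(x) <= tau else 0.0 for x in row] for row in M]

rng = random.Random(0)
n, p, theta, C = 50, 200, 0.1, 3.0         # illustrative values only
tau = C * math.sqrt(math.log(n))
# Bernoulli-Gaussian entries: nonzero with probability theta, then N(0, 1).
X = [[rng.gauss(0.0, 1.0) if rng.random() < theta else 0.0 for _ in range(p)]
     for _ in range(n)]
X_trunc = truncate(X, tau)
changed = sum(a != b for ra, rb in zip(X, X_trunc) for a, b in zip(ra, rb))
print(tau, changed)
```

A count of zero changed entries corresponds to the event $X' = X$ used in the text.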

3.7. Concluding remarks. There is a connection between the method of our proof and Fernique's chaining argument [5] (see [16] for a survey). The goal of the chaining method is to bound the supremum
$$\sup_{t \in B} X_t$$
where $B$ is a domain in a metric space and $X_t$ is a Gaussian process. In this case, the bad event $\mathrm{Bad}(v)$ can roughly be defined as $X_v \ge M_v$, for some candidate value $M_v$. One then considers a chain of sets in order to bound $P(\cup_{v \in B}\mathrm{Bad}(v))$. This, in spirit, is similar to the purpose of Lemma 2.3. After this, the arguments become different in all aspects. First, in our setting, the bad event $\mathrm{Bad}(v)$ can have any nature. Next, in the chaining argument, the sets $N_j$ are defined using the metric of $B$, while in our case, it is crucial to use a different metric. We construct $N_j$ using the $\ell_\infty$ norm, rather than the natural $\ell_1$ norm used to define the domain $B$. Finally, in the chaining case it is easy to bound $P(\mathrm{Bad}(u) \setminus \mathrm{Bad}(v))$, using the fact that
$$P(|X_u - X_v| \ge t) \le 2\exp\Big(-\frac{t^2}{\mathrm{dist}(u, v)^2}\Big),$$
which is the basic property of a Gaussian process. In our case, bounding $P(\mathrm{Bad}(u) \setminus \mathrm{Bad}(v))$ is an essential step (Lemma 3.2), which requires the development of the refined Bernstein inequality.

4. The algorithm and concentration of random matrices

As the algorithm and analysis are discussed extensively in [15], we will be brief and the reader can consult [15] for more details. [15] introduces the dictionary learning algorithm ER-SpUD. The key insight in the design of ER-SpUD is that the rows of $X$ are likely to be the sparsest vectors in the row space of $Y$. (This observation also appeared in [20] and [11].) [15] proposed to find these vectors by considering the following optimization problem:
$$\text{minimize } \|w^TY\|_1 \quad\text{subject to}\quad r^Tw = 1,$$
where $r$ is the sum of two columns of $Y$. Using $\ell_1$ optimization for finding sparse vectors is a natural idea, and the authors of [15] pointed out that such an approach was already proposed in [13] and [8]. The difference is the new constraint $r^Tw = 1$.
(Earlier works used different constraints.) By a change of variables $z = A^Tw$, $b = A^{-1}r$, we can consider the equivalent problem
$$\text{minimize } \|z^TX\|_1 \quad\text{subject to}\quad b^Tz = 1. \tag{4.1}$$
The algorithm presented in [15] is outlined below (for those familiar with [15], note that we are presenting the two-column version of ER-SpUD):

Algorithm 1 ER-SpUD
1: Randomly pair the columns of $Y$ into $p/2$ groups $g_j = \{Ye_{j_1}, Ye_{j_2}\}$.
2: For $j = 1, \dots, p/2$:
     Let $r_j = Ye_{j_1} + Ye_{j_2}$, where $g_j = \{Ye_{j_1}, Ye_{j_2}\}$.
     Solve $\min_w \|w^TY\|_1$ subject to $r_j^Tw = 1$, and set $s_j = w^TY$.
3: Use the Greedy algorithm to reconstruct $X$ and $A$.

Algorithm 2 Greedy
1: Require: $S = \{s_1, \dots, s_T\} \subset \mathbb{R}^p$.
2: For $i = 1, \dots, n$:
     REPEAT
       $\ell \leftarrow \arg\min_{s_\ell \in S} \|s_\ell\|_0$, breaking ties arbitrarily
       $x_i = s_\ell$
       $S = S \setminus \{s_\ell\}$
     UNTIL $\mathrm{rank}([x_1, \dots, x_i]) = i$
3: Set $X = [x_1, \dots, x_n]^T$, and $A = YY^T(XY^T)^{-1}$.
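
The selection loop of the Greedy algorithm can be sketched in a few lines. The code below is a simplified illustration, not the implementation of [15]: it sorts the candidates by sparsity once and keeps those that increase the rank, which visits candidates in the same order as the REPEAT/UNTIL loop. The rank is computed by floating-point Gaussian elimination with a hypothetical tolerance `tol`, and the final step $A = YY^T(XY^T)^{-1}$ is omitted.

```python
def rank(rows, tol=1e-9):
    """Numerical rank of a list of row vectors via Gaussian elimination."""
    m = [[float(x) for x in r] for r in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        if r == len(m):
            break
        # Partial pivoting: largest entry of column c among remaining rows.
        piv = max(range(r, len(m)), key=lambda i: abs(m[i][c]))
        if abs(m[piv][c]) <= tol:
            continue
        m[r], m[piv] = m[piv], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            for j in range(c, ncols):
                m[i][j] -= f * m[r][j]
        r += 1
    return r

def greedy(S, n, tol=1e-9):
    """Pick the sparsest candidates that keep the selection linearly independent."""
    by_sparsity = sorted(S, key=lambda s: sum(1 for x in s if abs(x) > tol))
    chosen = []
    for s in by_sparsity:
        if rank(chosen + [s], tol) == len(chosen) + 1:
            chosen.append(s)
            if len(chosen) == n:
                break
    return chosen

candidates = [[1, 0, 0, 0], [2, 0, 0, 0], [0, 1, 1, 0], [1, 1, 1, 1], [0, 0, 3, 0]]
print(greedy(candidates, 3))
```

Recomputing the rank from scratch for every candidate is wasteful; an incremental elimination would be preferable for large inputs, at the cost of a longer example.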

A key technical step in analyzing ER-SpUD is the following lemma, which asserts that if $p$ is sufficiently large, then with high probability $\|X^Tv\|_1$ is close to its mean, simultaneously for all unit vectors $v \in \mathbb{R}^n$.

Lemma 4.1. For every constant $1 \ge \delta > 0$ there is a constant $C_0 > 0$ such that the following holds. If $\theta \ge \frac{1}{n}$ and $p \ge C_0n^2\log^2 n$, then with probability $1 - o(1)$, for all $v \in \mathbb{R}^n$
$$\big|\|X^Tv\|_1 - E\|X^Tv\|_1\big| \le \delta E\|X^Tv\|_1. \tag{4.2}$$

This lemma appears implicitly in [15]. Dan Spielman pointed out to us that this would imply the critical [15, Lemma 17]. The bound $p \ge Cn^2\log^2 n$ is of importance in the proof of this lemma. Our Theorem 1.3, which pushes $p$ to $Cn\log^4 n$, is an improved version of Lemma 4.1.

With Theorem 1.3 in hand, let us now sketch the proof of Theorem 1.4, following the analysis in [15]. Notice that if the solution $z^*$ of the $\ell_1$ optimization problem is 1-sparse, then the algorithm will recover a row of $X$. The proof of the theorem relies on showing that $z^*$ is supported on the non-zero indices of $b$ and that with high probability, $z^*$ is in fact 1-sparse. The first goal allows us to focus our attention on a submatrix of $X$, which will be convenient for technical reasons. To address this first issue, we prove the following.

Lemma 4.2. Suppose that $X$ satisfies the Bernoulli-Subgaussian model. There exists a numerical constant $C > 0$ such that if $\theta n \ge 2$ and $p > Cn\log^4 n$ then the random matrix $X$ has the following property with probability at least $1 - o(1)$.
(P1) For every $b$ satisfying $\|b\|_0 \le 1/8\theta$, any solution $z^*$ to the optimization problem (4.1) has $\mathrm{supp}(z^*) \subset \mathrm{supp}(b)$.

Sketch of the Proof of Lemma 4.2. We let $J$ be the indices of the $s$ non-zero entries of $b$. Let $S$ be the indices of the nonzero columns in $X_J$, and let $z_0 = P_Jz^*$ (the restriction to those coordinates indexed by $J$). Define $z_1 = z^* - z_0$. We demonstrate that $z_0$ has at least as low an objective value as $z^*$, so $z_1$ must be zero. One can show using the triangle inequality that
$$\|z^{*T}X\|_1 - \|z_0^TX\|_1 \ge -2\|z_1^TX^S\|_1 + \|z_1^TX\|_1.$$
Thus, if $\|z_1^TX\|_1 - 2\|z_1^TX^S\|_1 > 0$, then $z_0$ has a lower objective value. We need this inequality to hold for all $z$ with high probability. Notice that
$$E\big[\|z^TX\|_1 - 2\|z^TX^S\|_1\big] = (p - 2|S|)\,E|z^TX_1|,$$
where $X_1$ denotes a column of $X$. It is easy to show that $|S| < p/4$ with high probability, so $(p - 2|S|) > 0$ with high probability. Therefore, if we can show that $\|z^TX\|_1 - 2\|z^TX^S\|_1$ is concentrated near its positive expectation we are done. We see that it suffices to show the result for the worst case $|S| = p/4$. Now we make critical use of Theorem 1.3, which asserts that with high probability both $\|z^TX\|_1$ and $\|z^TX^S\|_1$ are within a small constant factor of their expectations, $p\,E|z^TX_1|$ and $\frac{p}{4}E|z^TX_1|$ respectively, and so
$$\|z^TX\|_1 - 2\|z^TX^S\|_1 > 0.$$

Having proved Lemma 4.2, the rest of the proof is relatively simple and follows [15] exactly. The success of the algorithm now depends on the existence of a sufficient gap between the largest and second largest entry in $b$. The intuition is that if $X$ preserved the $\ell_1$ norm exactly, i.e. $\|z^TX\|_1 = c\|z\|_1$, then the minimization procedure would output the vector $z^*$ of smallest $\ell_1$ norm such that $b^Tz = 1$, which is just $e_j/b_j$, where $j$ is the index of the element of $b$ with the largest magnitude. However, $X$ only preserves the $\ell_1$ norm in an approximate sense. Yet, the algorithm will still extract a row of $X$ if there is a significant gap between the largest element of $b$ and the second largest.

5. Rectangular dictionaries and Theorem 1.5

We now present a generalization of ER-SpUD, which enables us to deal with rectangular dictionaries. Consider a full rank matrix $A$ of size $n \times m$ with $n > m$, and the equation $AX = Y$. To deal with this setting, we first augment $A$ to be a square, $n \times n$, invertible matrix. Of course, the issue is that one does not know $A$, and one also needs to figure out how the augmentation changes the product $Y$. We can solve this issue using a random augmentation. For instance, we can use an $n \times (n - m)$ gaussian matrix $B$ to augment $A$ to a square matrix $A'$ (the entries in $B$ are iid standard gaussian). It is trivial that the augmented matrix has full rank with probability 1, since the probability that a gaussian vector belongs to any fixed hyperplane is zero. We can also augment $X$ from an $m \times p$ matrix to an $n \times p$ matrix $X'$ by an $(n - m) \times p$ random matrix $Z$ with entries iid to those of $X$. This augmentation process yields a matrix equation $Y' = A'X'$ where $Y' = Y + E$ and $E = BZ$ (Figure 1).

In practice, we can first generate $B$, $Z$, then compute $E := BZ$ and construct $Y' := Y + E$. We then apply the ER-SpUD algorithm to the equation $Y' = A'X'$ to recover $A'$ and $X'$ with high probability. From these two matrices, we can then deduce $A$ and $X$.

Using a gaussian (or any continuous) augmentation is convenient, as the resulting matrix is obviously full rank. However, it is, in some way, a cheat.
Apparently, a Gaussian number does not have any finite representation, thus it takes forever to read the input, let alone process it. A common practice is to truncate (as a matter of fact, the computer only generates a finite approximation of the Gaussian numbers anyway), and hope that the truncation is fine for our purpose. But then we face a non-trivial theoretical question: how to analyze this approximation. How many decimal places are enough? Even if we can prove a guarantee here, using it in practice would require computing with a matrix with many long entries, which significantly increases the running time. We can avoid this problem by using random matrices with discrete distributions, such as $\pm 1$. The technical issue now is to prove the full rank property. This is a highly non-trivial problem, but luckily it was taken care of in the following result of Bourgain, Vu, and Wood [3].

Theorem 5.1. For every $\epsilon > 0$ there exists $\delta > 0$ such that the following holds. Let $N_{f,n}$ be an $n$ by $n$ complex matrix in which $f$ rows contain fixed, non-random entries and where the other rows contain entries that are independent discrete random variables. If the fixed rows have co-rank $k$ and if for every random entry $\alpha$ we have $\max_x P(\alpha = x) \le 1 - \epsilon$, then for all sufficiently large $n$,

$$P(N_{f,n} \text{ has co-rank} > k) \le (1 - \delta)^{n-f}.$$

Letting $k = 0$ and $f = m$, the result shows that if we augment $A$ by an $n \times (n-m)$ random Bernoulli matrix, the new matrix $A'$ will be nonsingular with high probability, given that $n - m = \omega(1)$. We summarize our reasoning in the following algorithm.

[Figure 1. Rectangular $A$ with $n > m$.]

Algorithm 3 Rectangular Algorithm
1: Generate an $(n-m) \times p$ matrix $Z$ with iid random variables that agree with the model for $X$.
2: Generate an $n \times (n-m)$ matrix $B$ with iid entries (either Gaussian or Rademacher).
3: Run ER-SpUD on $Y' = Y + BZ$.
4: From the output of ER-SpUD, remove the columns of $A'$ that come from $B$ and the rows of $X'$ that come from $Z$.

6. Optimal bound for very sparse random matrices

In this section, we discuss Theorem 1.6. We present a simple algorithm (see below) and use this algorithm to prove Theorem 1.6, obtaining the optimal bound $p = Cn \log n$.

Algorithm 4 Very-sparse Algorithm
1: Partition the columns of $Y$ into a minimum number of groups $G_i$ whose members are multiples of each other.
2: Choose representatives of those $G_i$ with more than two members to be the columns of $A$, up to scaling.

Proof of Theorem 1.6. Since $A$ is nonsingular, any two columns of $Y$ that are multiples of each other must be linear combinations of the same columns of $A$. For a group $G_i$ to have more than two members would require that there be more than two columns in $X$ with their non-zero entries in the same rows.

Definition 6.1. We say that a set of columns are aligned if they each have more than one non-zero entry and their non-zero entries occur in the same positions.

Lemma 6.2. The probability that $X$ has more than two aligned columns is $o(1)$.

Thus, the algorithm is likely to yield only columns of $A$. We now need to show that all the columns of $A$ will be output with high probability.

Definition 6.3. We say the column $a$ of $A$ is $k$-represented if some group $G_i$ consists of multiples of $a$ and $|G_i| = k$. In particular, if no multiple of the $j$th column, $a_j$, shows up in the columns of $Y$, then $a_j$ is 0-represented. A column is well represented if it is $k$-represented for some $k > 2$.

Notice that the algorithm will output a multiple of every column that is well represented. The following lemma finishes the proof of Theorem 1.6.

Lemma 6.4. The probability that every column $a_i$ is well represented is $1 - o(1)$.
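The grouping step of the very-sparse algorithm can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not the authors' implementation: the function names `direction` and `very_sparse_recover` are hypothetical, exact rational arithmetic (`fractions.Fraction`) is used so that "multiple of each other" can be tested by equality, and each column is normalized so that its first non-zero entry equals 1.

```python
from collections import defaultdict
from fractions import Fraction

def direction(col):
    """Scale a non-zero column so that its first non-zero entry is 1."""
    pivot = next(v for v in col if v != 0)
    return tuple(Fraction(v) / Fraction(pivot) for v in col)

def very_sparse_recover(Y):
    """Group the non-zero columns of Y (a list of rows) by direction and
    return one normalized representative per group with more than two members."""
    groups = defaultdict(list)
    for j in range(len(Y[0])):
        col = [row[j] for row in Y]
        if any(v != 0 for v in col):
            groups[direction(col)].append(j)
    return [d for d, members in groups.items() if len(members) > 2]

# Toy instance: every column of A is hit by three 1-sparse columns of X,
# and one extra 2-sparse column of X produces a singleton group.
A = [[2, 1, 0],
     [0, 1, 1],
     [1, 0, 3]]
X_cols = [(1, 0, 0), (-1, 0, 0), (1, 0, 0),
          (0, 1, 0), (0, -1, 0), (0, 1, 0),
          (0, 0, 1), (0, 0, 1), (0, 0, -1),
          (1, 1, 0)]
Y = [[sum(A[i][k] * x[k] for k in range(3)) for x in X_cols] for i in range(3)]
recovered = very_sparse_recover(Y)
```

On this toy input the recovered directions are the columns of `A` rescaled to have leading entry 1, which is the "up to scaling" ambiguity of step 2.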

6.1. Proofs of Sparse Algorithm.

Proof of Lemma 6.2. Given the choice of $\theta$, we know that the number of non-zero entries in any column of $X$ will converge to the Poisson distribution. We ignore the $o(1/n)$ error terms from this approximation in later calculations to alleviate clutter. To calculate the probability, we condition on the number of non-zero entries, then we bound the probability that three specific columns have the required property, and finally we use the union bound. This yields an upper bound of

$$\binom{n}{3} \sum_{k \ge 2} \frac{e^{-3c}}{(k!)^3} \cdot \frac{1}{\binom{n}{k}^2} = o(1).$$

Proof of Lemma 6.4. By the union bound,

$$P(\exists\, i \text{ such that } a_i \text{ is not well represented}) \le n\,P(a_1 \text{ is not well represented}).$$

Partitioning into disjoint events yields

$$P(a_1 \text{ is not well represented}) = \sum_{j=0}^{2} P(a_1 \text{ is } j\text{-represented}).$$

Notice that a multiple of $a_1$, say $a\,a_1$, appears as a column of $Y$ if and only if $a\,e_1 = (a, 0, 0, \ldots, 0)$, with $a \ne 0$, is $X_j$, the $j$th column of $X$, for some $j$. Now, using the Poisson approximation, we can bound each term in the sum. For example, for the probability of being 0-represented, we can divide into the case that $X_i$ does not have exactly one non-zero element and the case that $X_i$ has exactly one non-zero term but not in the first row. We use $C$ to indicate an absolute constant which may change with each appearance.

$$P(a_1 \text{ is 0-represented}) \le \left((1 - ce^{-c}) + e^{-c}\,\frac{n-1}{n}\right)^{p} \le C \exp(-Cp/n).$$

Similarly,

$$P(a_1 \text{ is 1-represented}) \le \left(\frac{ce^{-c}}{n}\right) \left((1 - ce^{-c}) + e^{-c}\,\frac{n-1}{n}\right)^{p-1} \le C \exp(-Cp/n)$$

and

$$P(a_1 \text{ is 2-represented}) \le \binom{n}{2} \left(\frac{ce^{-c}}{n}\right)^{2} \left((1 - ce^{-c}) + e^{-c}\,\frac{n-1}{n}\right)^{p-2} \le C \exp(-Cp/n).$$

Thus, $n\,P(a_1 \text{ is not well represented}) \le C \exp(\log n - Cp/n) = o(1)$ for $p = Cn \log n$ with a large enough $C$.

7. Proof of Theorem 1.7

7.1. Lemmas Independent of Symmetry. We first state the necessary lemmas from [15] whose proofs do not use the symmetry of the random variables.

Lemma 7.1. If $\mathrm{rank}(X) = n$, $A$ is nonsingular, and $Y$ can be decomposed into $Y = A'X'$, then the row spaces of $X'$, $X$, and $Y$ are the same.
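The row-space statement of Lemma 7.1 can be sanity-checked numerically on a small example. The sketch below is an illustration under assumptions rather than part of the proof: the helpers `rank` and `same_row_space` are hypothetical names, exact rational Gaussian elimination is used to avoid floating-point rank decisions, and the check relies on the standard fact that two matrices have the same row space exactly when each has the same rank as the matrix obtained by stacking them.

```python
from fractions import Fraction

def rank(M):
    """Rank of a matrix (list of rows) via exact Gaussian elimination."""
    rows = [[Fraction(v) for v in r] for r in M]
    rk = 0
    for c in range(len(rows[0])):
        if rk == len(rows):
            break
        # Find a pivot in column c among the rows not yet used.
        piv = next((i for i in range(rk, len(rows)) if rows[i][c] != 0), None)
        if piv is None:
            continue
        rows[rk], rows[piv] = rows[piv], rows[rk]
        # Eliminate column c from the rows below the pivot row.
        for i in range(rk + 1, len(rows)):
            f = rows[i][c] / rows[rk][c]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk

def same_row_space(P, Q):
    """True when P and Q (with equally many columns) span the same row space."""
    return rank(P) == rank(Q) == rank(P + Q)

# X has full row rank and A is nonsingular, so Y = A X shares the row space of X.
A = [[2, 1, 0],
     [0, 1, 1],
     [1, 0, 3]]
X = [[1, 0, 0, 2, 0, 0],
     [0, -1, 0, 0, 1, 0],
     [0, 0, 1, 0, 0, -1]]
Y = [[sum(A[i][k] * X[k][j] for k in range(3)) for j in range(6)]
     for i in range(3)]
print(same_row_space(X, Y))
```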
The general idea is to show that the sparsest vectors in the row-span of $Y$ are the rows of $X$. Since all of the rows of $X'$ lie in the row-span of $Y$, intuitively, they can be sparse only when they are multiples of the rows of $X$. Naively, this is because rows of $X$ are likely to have nearly disjoint supports. Thus, any linear combination of them will probably increase the number of non-zero entries.
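The heuristic that combining sparse rows increases the number of non-zero entries is easy to see on random data. The following Python sketch draws two Bernoulli-Rademacher rows and counts the non-zeros of a linear combination. The names `nnz` and `bernoulli_rademacher_row`, the parameters `p = 200` and `theta = 0.1`, and the coefficients 2 and -3 are assumptions chosen only for illustration; with entries in {-1, 0, 1} those coefficients can never cancel, so the support of the combination is exactly the union of the two supports.

```python
import random

def nnz(v):
    """Number of non-zero entries of a vector."""
    return sum(1 for x in v if x != 0)

def bernoulli_rademacher_row(p, theta, rng):
    """Row of length p: each entry is +1/-1 with probability theta, else 0."""
    return [rng.choice((-1, 1)) if rng.random() < theta else 0 for _ in range(p)]

rng = random.Random(0)
p, theta = 200, 0.1
r1 = bernoulli_rademacher_row(p, theta, rng)
r2 = bernoulli_rademacher_row(p, theta, rng)

# 2*a - 3*b vanishes only when a = b = 0 for a, b in {-1, 0, 1}.
combo = [2 * a - 3 * b for a, b in zip(r1, r2)]
union = sum(1 for a, b in zip(r1, r2) if a != 0 or b != 0)
print(nnz(r1), nnz(r2), nnz(combo), union)
```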

Lemma 7.2. Let $\Omega$ be an $n \times p$ Bernoulli($\theta$) matrix with $1/n < \theta < 1/4$. For each set $S \subseteq [n]$, let $T_S \subseteq [p]$ be the indices of the columns of $\Omega$ that have at least one non-zero entry in some row indexed by $S$.
(a) For every set $S$ of size 2,
$$P\left(|T_S| \le (4/3)\theta p\right) \le \exp\left(-\frac{\theta p}{108}\right).$$
(b) For every set $S$ of size $\sigma$ with $3 \le \sigma \le 1/\theta$,
$$P\left(|T_S| \le (3\sigma/8)\theta p\right) \le \exp\left(-\frac{\sigma\theta p}{64}\right).$$
(c) For every set $S$ of size $\sigma$ with $1/\theta \le \sigma$,
$$P\left(|T_S| \le (1 - 1/e)p/2\right) \le \exp\left(-\frac{(1 - 1/e)p}{8}\right).$$

7.2. Generalized Lemmas. We will use a result of [14].

Lemma 7.3. Let $\xi_1, \ldots, \xi_n$ be independent centered random variables with variances at least 1 and fourth moments bounded by $B$. Then there exists $\nu \in (0, 1)$ depending only on $B$, such that for every coefficient vector $a = (a_1, \ldots, a_n) \in S^{n-1}$ the random sum $S = \sum_{k=1}^{n} a_k \xi_k$ satisfies
$$P\left(|S| < \tfrac{1}{2}\right) \le \nu.$$

Definition 7.4. We call a vector $\alpha \in \mathbb{R}^n$ fully dense if $\alpha_i \ne 0$ for all $i \in [n]$.

Lemma 7.5. For $b > s$, let $H \in \mathbb{R}^{s \times b}$ be a matrix with at least one non-zero in each column. Let $R$ be an $s$-by-$b$ matrix whose entries are independent centered random variables with variances at least 1 and bounded fourth moments. Define $U = H \odot R$. Then the probability that the left nullspace of $U$ contains a fully dense vector is at most $2^{-b + s\log(e^2 b/s)}$.

Proof of Lemma 7.5. Let $U = [u_1 \ldots u_b]$ denote the columns of $U$ and, for each $j \in [b]$, let $N_j$ be the left nullspace of $[u_1 \ldots u_j]$. We show that with high probability $N_b$ cannot contain a fully dense vector. This can be done by showing that if $N_{j-1}$ contains a fully dense vector, then with probability at least 1/2 the dimension of $N_j$ is less than the dimension of $N_{j-1}$. Formally, consider a fully dense vector $\alpha \in N_{j-1}$. If $u_j$ contains only one non-zero entry, then $\alpha^T u_j \ne 0$, reducing the dimension of $N_j$. If $u_j$ contains more than one non-zero entry, then Lemma 7.3 implies that the probability, over the choice of entries of $R_j$, that $\alpha^T u_j = 0$ is less than 1/2. Note that the dimension cannot decrease more than $s$ times. For $N_b$ to contain a fully dense vector, there must be at least $b - s$ columns for which the dimension of the nullspace does not decrease.
Let $F \subseteq [b]$ have size $b - s$. The probability that for every $j \in F$, $N_{j-1}$ contains a fully dense vector and the dimension of $N_j$ equals the dimension of $N_{j-1}$ is at most $2^{-b+s}$. By the union bound, the probability that $N_b$ contains a fully dense vector is at most

$$\binom{b}{b-s}\, 2^{-b+s} \le \left(\frac{eb}{s}\right)^{s} 2^{-b+s} \le 2^{-b + s\log(e^2 b/s)}.$$

The proofs of the following lemmas are identical to those in [15] except that they now use our more general Lemma 7.5 along with the lemmas in the previous section.

Lemma 7.6. For $t > 200s$, let $\Omega \in \{0, 1\}^{s \times t}$ be any binary matrix with at least one non-zero in each column. Let $R \in \mathbb{R}^{s \times t}$ be a random matrix whose entries are iid random variables with $P(R_{ij} = 0) = 0$, and let $U = \Omega \odot R$. Then the probability that there exists a fully dense vector $\alpha$ for which $\|\alpha^T U\|_0 \le t/5$ is at most $2^{-t/25}$.

Lemma 7.7. If $X = \Omega \odot R$ follows the Bernoulli-Subgaussian model with $P(R_{ij} = 0) = 0$, $1/n < \theta < 1/C$ and $p > Cn \log n$, then the probability that there is a vector $\alpha$ with support of size larger than 1 for which $\|\alpha^T X\|_0 \le (11/9)\theta p$ is at most $\exp(-c\theta p)$, where $C, c$ are numerical constants.

7.3. Proof of Theorem 1.7. Say $Y$ can be decomposed as $A'X'$. From Lemma 7.7, we know that with probability at least $1 - \exp(-c\theta p)$, any linear combination of two or more rows of $X$ has at least $(11/9)\theta p$ non-zeros. By a simple Chernoff bound, the probability that any row of $X$ has more than $(10/9)\theta p$ non-zero entries is bounded by $n \exp(-\theta p/243)$. Thus, the rows of $X$ are likely the sparsest in $\mathrm{row}(X)$. On the previous event of probability at least $1 - \exp(-c\theta p)$, $X$ does not have any left null vectors with more than one non-zero entry. Therefore, if the rows of $X$ are non-zero, $X$ will have no non-zero vectors in its left nullspace. The probability that all of the rows of $X$ are non-zero is at least $1 - n(1 - \theta)^p \ge 1 - n\exp(-cp)$. From this, by Lemma 7.1, we get $\mathrm{row}(X) = \mathrm{row}(Y) = \mathrm{row}(X')$. Hence, we can conclude that every row in $X'$ is a scalar multiple of a row of $X$.

8. Numerical Simulations

We demonstrate that the efficiency of the ER-SpUD algorithm is not improved with larger $p$ values beyond the threshold conjectured. In Figure 2, we have chosen $A$ to be an $n \times n$ matrix of independent $N(0, 1)$ random variables. The $n \times p$ matrix $X$ has $k$ randomly chosen non-zero entries, which are Rademacher. The graph on the left of Figure 2 is generated with $p = 5n \log n$ and the one on the right with $p = 5n^2 \log^2 n$. For both graphs, $n$ varies from 10 to 60 and $k$ from 1 to 10. Accuracy is measured in terms of relative error:

$$\mathrm{re}(\hat{A}, A) = \min_{\Pi, \Lambda} \|\hat{A}\Lambda\Pi - A\|_F / \|A\|_F.$$

The average relative error over ten trials is reported.

[Figure 2. Mean relative errors of ER-SpUD with $p = 5n \log n$ versus $p = 5n^2 \log^2 n$. Panels: ER-SpUD Small Sample, ER-SpUD Large Sample. Axes: sparsity $k$ versus dictionary size $n$; color scale: relative error.]

We then ran our Algorithm 6 in a sparse regime to compare its performance with that of ER-SpUD (see Figure 3). $A$ was as before, but since our algorithm relies on the appearance of 1-sparse columns in $X$, we cannot fix the sparsity as in our first experiments. Rather, we vary the Bernoulli parameter $\theta$ from 0.02 to 0.18, and the $\chi_{ij}$ are Rademacher. One can see the expected phase transition, at which point the matrix $X$ is no longer sparse enough for our algorithm. In the regime for which the algorithm was designed, the relative error of our output is of the same order as that of ER-SpUD. Furthermore, our algorithm runs much quicker and has no trouble with inputs of size up to $n = 500$. (The numerical experiments were completed on a Macbook Pro.)

Finally, we compare the outcome of our optimal $p$ value with that of a much larger sample size ($p = O(n^2 \log^2 n)$). We let $n$ range from 10 to 200 and vary $\theta$ starting from 0.01. Figure 4 shows that the efficacy of the algorithm is not much improved despite the dramatic increase in $p$. The threshold for failure is identical.

[Figure 3. Mean relative errors with varying sparsity $\theta$. Here, $p = 5n \log n$. Panels: Sparse Algorithm, ER-SpUD. Axes: sparsity $\theta$ versus dictionary size $n$; color scale: relative error.]

[Figure 4. Mean relative errors of Algorithm 6 with $p = 5n \log n$ versus $p = 5n^2 \log^2 n$. Panels: Sparse Alg Small Sample, Sparse Alg Large Sample. Axes: sparsity $\theta$ versus dictionary size $n$; color scale: relative error.]
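For small dictionaries, the relative error used in these experiments, the minimum over permutations and diagonal scalings of the Frobenius distance between the rescaled estimate and $A$, divided by the Frobenius norm of $A$, can be computed by brute force. The Python sketch below is an illustration under assumptions and is not the evaluation code behind the figures: the function names are hypothetical, all permutations are enumerated (feasible only for small $n$), and for each pairing of columns the optimal scaling is the least-squares projection coefficient, so only the projection residual is accumulated.

```python
import itertools
import math

def columns(M):
    """Columns of a matrix stored as a list of rows."""
    return [[row[j] for row in M] for j in range(len(M[0]))]

def residual(u, v):
    """Squared distance from v to the line spanned by u (best scalar multiple of u)."""
    uu = sum(x * x for x in u)
    vv = sum(x * x for x in v)
    if uu == 0:
        return float(vv)
    uv = sum(x * y for x, y in zip(u, v))
    return max(vv - uv * uv / uu, 0.0)

def relative_error(A_hat, A):
    """Brute-force minimum over column permutations and scalings of
    ||A_hat * Lambda * Pi - A||_F / ||A||_F."""
    hat, true = columns(A_hat), columns(A)
    cost = [[residual(h, t) for t in true] for h in hat]
    best = min(sum(cost[perm[j]][j] for j in range(len(true)))
               for perm in itertools.permutations(range(len(hat))))
    norm = math.sqrt(sum(x * x for row in A for x in row))
    return math.sqrt(best) / norm

A = [[2, 1, 0],
     [0, 1, 1],
     [1, 0, 3]]
# Columns of A, permuted and rescaled: (-3 * a3, 0.5 * a1, 2 * a2).
A_hat = [[0, 1.0, 2],
         [-3, 0.0, 2],
         [-9, 0.5, 0]]
print(relative_error(A_hat, A))
```

Because the estimate in this toy example differs from `A` only by a permutation and rescaling of columns, the error is zero up to rounding. For larger $n$ one would replace the enumeration of permutations by an assignment solver, which is not shown here.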

References

[1] Michal Aharon, Michael Elad, and Alfred Bruckstein. The K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation. Signal Processing, IEEE Transactions on, 54(11).
[2] Michal Aharon, Michael Elad, and Alfred M Bruckstein. On the uniqueness of overcomplete dictionaries, and a practical way to retrieve them. Linear Algebra and its Applications, 416(1):48-67.
[3] Jean Bourgain, Van H Vu, and Philip Matchett Wood. On the singularity probability of discrete random matrices. Journal of Functional Analysis, 258(2).
[4] Kjersti Engan, Sven Ole Aase, and J Hakon Husoy. Method of optimal directions for frame design. In Acoustics, Speech, and Signal Processing, Proceedings., 1999 IEEE International Conference on, volume 5. IEEE.
[5] X Fernique. Regularite de processus gaussien. In Invent Math.
[6] Pando Georgiev, Fabian Theis, and Andrzej Cichocki. Blind source separation and sparse component analysis of overcomplete mixtures. In Acoustics, Speech, and Signal Processing, Proceedings (ICASSP 04). IEEE International Conference on, volume 5, pages V-493. IEEE.
[7] Lee-Ad Gottlieb and Tyler Neylon. Matrix sparsification and the sparse null space problem. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques. Springer.
[8] Florent Jaillet, Rémi Gribonval, Mark D Plumbley, and Hadi Zayyani. An $\ell_1$ criterion for dictionary learning by subspace identification. In Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on. IEEE.
[9] Kenneth Kreutz-Delgado, Joseph F Murray, Bhaskar D Rao, Kjersti Engan, Te-Won Lee, and Terrence J Sejnowski. Dictionary learning algorithms for sparse representation. Neural Computation, 15(2).
[10] Liangyue Li, Sheng Li, and Yun Fu. Discriminative dictionary learning with low-rank regularization for face recognition. In Automatic Face and Gesture Recognition (FG), IEEE International Conference and Workshops on, pages 1-6. IEEE.
[11] Julien Mairal, Francis Bach, Jean Ponce, and Guillermo Sapiro. Online dictionary learning for sparse coding. In Proceedings of the 26th Annual International Conference on Machine Learning. ACM.
[12] Bruno A Olshausen et al. Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 381(6583).
[13] Mark D Plumbley. Dictionary learning for $\ell_1$-exact sparse coding. In Independent Component Analysis and Signal Separation. Springer.
[14] Mark Rudelson and Roman Vershynin. The Littlewood-Offord problem and invertibility of random matrices. Advances in Mathematics, 218(2).
[15] Daniel A Spielman, Huan Wang, and John Wright. Exact recovery of sparsely-used dictionaries. In Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence. AAAI Press.
[16] Michel Talagrand. Majorizing measures: the generic chaining. The Annals of Probability.
[17] John Wright, Allen Y Yang, Arvind Ganesh, Shankar S Sastry, and Yi Ma. Robust face recognition via sparse representation. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 31(2).
[18] Qiang Zhang and Baoxin Li. Discriminative K-SVD for dictionary learning in face recognition. In Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on. IEEE.
[19] Michael Zibulevsky. Blind source separation with relative Newton method. In Proc. ICA, volume 2003.
[20] Michael Zibulevsky and Barak A Pearlmutter. Blind source separation by sparse decomposition. In AeroSense 2000. International Society for Optics and Photonics, 2000.


More information

VALIDATED CONTINUATION FOR EQUILIBRIA OF PDES

VALIDATED CONTINUATION FOR EQUILIBRIA OF PDES SIAM J. NUMER. ANAL. Vo. 0, No. 0, pp. 000 000 c 200X Society for Industria and Appied Mathematics VALIDATED CONTINUATION FOR EQUILIBRIA OF PDES SARAH DAY, JEAN-PHILIPPE LESSARD, AND KONSTANTIN MISCHAIKOW

More information

Smoothers for ecient multigrid methods in IGA

Smoothers for ecient multigrid methods in IGA Smoothers for ecient mutigrid methods in IGA Cemens Hofreither, Stefan Takacs, Water Zuehner DD23, Juy 2015 supported by The work was funded by the Austrian Science Fund (FWF): NFN S117 (rst and third

More information

Partial permutation decoding for MacDonald codes

Partial permutation decoding for MacDonald codes Partia permutation decoding for MacDonad codes J.D. Key Department of Mathematics and Appied Mathematics University of the Western Cape 7535 Bevie, South Africa P. Seneviratne Department of Mathematics

More information

Bayesian Learning. You hear a which which could equally be Thanks or Tanks, which would you go with?

Bayesian Learning. You hear a which which could equally be Thanks or Tanks, which would you go with? Bayesian Learning A powerfu and growing approach in machine earning We use it in our own decision making a the time You hear a which which coud equay be Thanks or Tanks, which woud you go with? Combine

More information

4 Separation of Variables

4 Separation of Variables 4 Separation of Variabes In this chapter we describe a cassica technique for constructing forma soutions to inear boundary vaue probems. The soution of three cassica (paraboic, hyperboic and eiptic) PDE

More information

Stat 155 Game theory, Yuval Peres Fall Lectures 4,5,6

Stat 155 Game theory, Yuval Peres Fall Lectures 4,5,6 Stat 155 Game theory, Yuva Peres Fa 2004 Lectures 4,5,6 In the ast ecture, we defined N and P positions for a combinatoria game. We wi now show more formay that each starting position in a combinatoria

More information

MATH 172: MOTIVATION FOR FOURIER SERIES: SEPARATION OF VARIABLES

MATH 172: MOTIVATION FOR FOURIER SERIES: SEPARATION OF VARIABLES MATH 172: MOTIVATION FOR FOURIER SERIES: SEPARATION OF VARIABLES Separation of variabes is a method to sove certain PDEs which have a warped product structure. First, on R n, a inear PDE of order m is

More information

T.C. Banwell, S. Galli. {bct, Telcordia Technologies, Inc., 445 South Street, Morristown, NJ 07960, USA

T.C. Banwell, S. Galli. {bct, Telcordia Technologies, Inc., 445 South Street, Morristown, NJ 07960, USA ON THE SYMMETRY OF THE POWER INE CHANNE T.C. Banwe, S. Gai {bct, sgai}@research.tecordia.com Tecordia Technoogies, Inc., 445 South Street, Morristown, NJ 07960, USA Abstract The indoor power ine network

More information

More Scattering: the Partial Wave Expansion

More Scattering: the Partial Wave Expansion More Scattering: the Partia Wave Expansion Michae Fower /7/8 Pane Waves and Partia Waves We are considering the soution to Schrödinger s equation for scattering of an incoming pane wave in the z-direction

More information

Theory and implementation behind: Universal surface creation - smallest unitcell

Theory and implementation behind: Universal surface creation - smallest unitcell Teory and impementation beind: Universa surface creation - smaest unitce Bjare Brin Buus, Jaob Howat & Tomas Bigaard September 15, 218 1 Construction of surface sabs Te aim for tis part of te project is

More information

Analysis of Emerson s Multiple Model Interpolation Estimation Algorithms: The MIMO Case

Analysis of Emerson s Multiple Model Interpolation Estimation Algorithms: The MIMO Case Technica Report PC-04-00 Anaysis of Emerson s Mutipe Mode Interpoation Estimation Agorithms: The MIMO Case João P. Hespanha Dae E. Seborg University of Caifornia, Santa Barbara February 0, 004 Anaysis

More information

Formulas for Angular-Momentum Barrier Factors Version II

Formulas for Angular-Momentum Barrier Factors Version II BNL PREPRINT BNL-QGS-06-101 brfactor1.tex Formuas for Anguar-Momentum Barrier Factors Version II S. U. Chung Physics Department, Brookhaven Nationa Laboratory, Upton, NY 11973 March 19, 2015 abstract A

More information

Section 6: Magnetostatics

Section 6: Magnetostatics agnetic fieds in matter Section 6: agnetostatics In the previous sections we assumed that the current density J is a known function of coordinates. In the presence of matter this is not aways true. The

More information

How many random edges make a dense hypergraph non-2-colorable?

How many random edges make a dense hypergraph non-2-colorable? How many random edges make a dense hypergraph non--coorabe? Benny Sudakov Jan Vondrák Abstract We study a mode of random uniform hypergraphs, where a random instance is obtained by adding random edges

More information

Minimizing Total Weighted Completion Time on Uniform Machines with Unbounded Batch

Minimizing Total Weighted Completion Time on Uniform Machines with Unbounded Batch The Eighth Internationa Symposium on Operations Research and Its Appications (ISORA 09) Zhangiaie, China, September 20 22, 2009 Copyright 2009 ORSC & APORC, pp. 402 408 Minimizing Tota Weighted Competion

More information

Asynchronous Control for Coupled Markov Decision Systems

Asynchronous Control for Coupled Markov Decision Systems INFORMATION THEORY WORKSHOP (ITW) 22 Asynchronous Contro for Couped Marov Decision Systems Michae J. Neey University of Southern Caifornia Abstract This paper considers optima contro for a coection of

More information

Statistics for Applications. Chapter 7: Regression 1/43

Statistics for Applications. Chapter 7: Regression 1/43 Statistics for Appications Chapter 7: Regression 1/43 Heuristics of the inear regression (1) Consider a coud of i.i.d. random points (X i,y i ),i =1,...,n : 2/43 Heuristics of the inear regression (2)

More information

MONOCHROMATIC LOOSE PATHS IN MULTICOLORED k-uniform CLIQUES

MONOCHROMATIC LOOSE PATHS IN MULTICOLORED k-uniform CLIQUES MONOCHROMATIC LOOSE PATHS IN MULTICOLORED k-uniform CLIQUES ANDRZEJ DUDEK AND ANDRZEJ RUCIŃSKI Abstract. For positive integers k and, a k-uniform hypergraph is caed a oose path of ength, and denoted by

More information

Melodic contour estimation with B-spline models using a MDL criterion

Melodic contour estimation with B-spline models using a MDL criterion Meodic contour estimation with B-spine modes using a MDL criterion Damien Loive, Ney Barbot, Oivier Boeffard IRISA / University of Rennes 1 - ENSSAT 6 rue de Kerampont, B.P. 80518, F-305 Lannion Cedex

More information

Generalized Bell polynomials and the combinatorics of Poisson central moments

Generalized Bell polynomials and the combinatorics of Poisson central moments Generaized Be poynomias and the combinatorics of Poisson centra moments Nicoas Privaut Division of Mathematica Sciences Schoo of Physica and Mathematica Sciences Nanyang Technoogica University SPMS-MAS-05-43,

More information

Appendix of the Paper The Role of No-Arbitrage on Forecasting: Lessons from a Parametric Term Structure Model

Appendix of the Paper The Role of No-Arbitrage on Forecasting: Lessons from a Parametric Term Structure Model Appendix of the Paper The Roe of No-Arbitrage on Forecasting: Lessons from a Parametric Term Structure Mode Caio Ameida cameida@fgv.br José Vicente jose.vaentim@bcb.gov.br June 008 1 Introduction In this

More information

Optimality of Inference in Hierarchical Coding for Distributed Object-Based Representations

Optimality of Inference in Hierarchical Coding for Distributed Object-Based Representations Optimaity of Inference in Hierarchica Coding for Distributed Object-Based Representations Simon Brodeur, Jean Rouat NECOTIS, Département génie éectrique et génie informatique, Université de Sherbrooke,

More information

New Efficiency Results for Makespan Cost Sharing

New Efficiency Results for Makespan Cost Sharing New Efficiency Resuts for Makespan Cost Sharing Yvonne Beischwitz a, Forian Schoppmann a, a University of Paderborn, Department of Computer Science Fürstenaee, 3302 Paderborn, Germany Abstract In the context

More information

The Group Structure on a Smooth Tropical Cubic

The Group Structure on a Smooth Tropical Cubic The Group Structure on a Smooth Tropica Cubic Ethan Lake Apri 20, 2015 Abstract Just as in in cassica agebraic geometry, it is possibe to define a group aw on a smooth tropica cubic curve. In this note,

More information

Limits on Support Recovery with Probabilistic Models: An Information-Theoretic Framework

Limits on Support Recovery with Probabilistic Models: An Information-Theoretic Framework Limits on Support Recovery with Probabiistic Modes: An Information-Theoretic Framewor Jonathan Scarett and Voan Cevher arxiv:5.744v3 cs.it 3 Aug 6 Abstract The support recovery probem consists of determining

More information

Target Location Estimation in Wireless Sensor Networks Using Binary Data

Target Location Estimation in Wireless Sensor Networks Using Binary Data Target Location stimation in Wireess Sensor Networks Using Binary Data Ruixin Niu and Pramod K. Varshney Department of ectrica ngineering and Computer Science Link Ha Syracuse University Syracuse, NY 344

More information

u(x) s.t. px w x 0 Denote the solution to this problem by ˆx(p, x). In order to obtain ˆx we may simply solve the standard problem max x 0

u(x) s.t. px w x 0 Denote the solution to this problem by ˆx(p, x). In order to obtain ˆx we may simply solve the standard problem max x 0 Bocconi University PhD in Economics - Microeconomics I Prof M Messner Probem Set 4 - Soution Probem : If an individua has an endowment instead of a monetary income his weath depends on price eves In particuar,

More information

Honors Thesis Bounded Query Functions With Limited Output Bits II

Honors Thesis Bounded Query Functions With Limited Output Bits II Honors Thesis Bounded Query Functions With Limited Output Bits II Daibor Zeený University of Maryand, Batimore County May 29, 2007 Abstract We sove some open questions in the area of bounded query function

More information

General Certificate of Education Advanced Level Examination June 2010

General Certificate of Education Advanced Level Examination June 2010 Genera Certificate of Education Advanced Leve Examination June 2010 Human Bioogy HBI6T/P10/task Unit 6T A2 Investigative Skis Assignment Task Sheet The effect of temperature on the rate of photosynthesis

More information

8 Digifl'.11 Cth:uits and devices

8 Digifl'.11 Cth:uits and devices 8 Digif'. Cth:uits and devices 8. Introduction In anaog eectronics, votage is a continuous variabe. This is usefu because most physica quantities we encounter are continuous: sound eves, ight intensity,

More information

Distributed average consensus: Beyond the realm of linearity

Distributed average consensus: Beyond the realm of linearity Distributed average consensus: Beyond the ream of inearity Usman A. Khan, Soummya Kar, and José M. F. Moura Department of Eectrica and Computer Engineering Carnegie Meon University 5 Forbes Ave, Pittsburgh,

More information

A Statistical Framework for Real-time Event Detection in Power Systems

A Statistical Framework for Real-time Event Detection in Power Systems 1 A Statistica Framework for Rea-time Event Detection in Power Systems Noan Uhrich, Tim Christman, Phiip Swisher, and Xichen Jiang Abstract A quickest change detection (QCD) agorithm is appied to the probem

More information

CONSISTENT LABELING OF ROTATING MAPS

CONSISTENT LABELING OF ROTATING MAPS CONSISTENT LABELING OF ROTATING MAPS Andreas Gemsa, Martin Nöenburg, Ignaz Rutter Abstract. Dynamic maps that aow continuous map rotations, for exampe, on mobie devices, encounter new geometric abeing

More information

A Simple and Efficient Algorithm of 3-D Single-Source Localization with Uniform Cross Array Bing Xue 1 2 a) * Guangyou Fang 1 2 b and Yicai Ji 1 2 c)

A Simple and Efficient Algorithm of 3-D Single-Source Localization with Uniform Cross Array Bing Xue 1 2 a) * Guangyou Fang 1 2 b and Yicai Ji 1 2 c) A Simpe Efficient Agorithm of 3-D Singe-Source Locaization with Uniform Cross Array Bing Xue a * Guangyou Fang b Yicai Ji c Key Laboratory of Eectromagnetic Radiation Sensing Technoogy, Institute of Eectronics,

More information

Sequential Decoding of Polar Codes with Arbitrary Binary Kernel

Sequential Decoding of Polar Codes with Arbitrary Binary Kernel Sequentia Decoding of Poar Codes with Arbitrary Binary Kerne Vera Miosavskaya, Peter Trifonov Saint-Petersburg State Poytechnic University Emai: veram,petert}@dcn.icc.spbstu.ru Abstract The probem of efficient

More information

FOURIER SERIES ON ANY INTERVAL

FOURIER SERIES ON ANY INTERVAL FOURIER SERIES ON ANY INTERVAL Overview We have spent considerabe time earning how to compute Fourier series for functions that have a period of 2p on the interva (-p,p). We have aso seen how Fourier series

More information

Investigation on spectrum of the adjacency matrix and Laplacian matrix of graph G l

Investigation on spectrum of the adjacency matrix and Laplacian matrix of graph G l Investigation on spectrum of the adjacency matrix and Lapacian matrix of graph G SHUHUA YIN Computer Science and Information Technoogy Coege Zhejiang Wani University Ningbo 3500 PEOPLE S REPUBLIC OF CHINA

More information

Absolute Value Preconditioning for Symmetric Indefinite Linear Systems

Absolute Value Preconditioning for Symmetric Indefinite Linear Systems MITSUBISHI ELECTRIC RESEARCH LABORATORIES http://www.mer.com Absoute Vaue Preconditioning for Symmetric Indefinite Linear Systems Vecharynski, E.; Knyazev, A.V. TR2013-016 March 2013 Abstract We introduce

More information

On colorings of the Boolean lattice avoiding a rainbow copy of a poset arxiv: v1 [math.co] 21 Dec 2018

On colorings of the Boolean lattice avoiding a rainbow copy of a poset arxiv: v1 [math.co] 21 Dec 2018 On coorings of the Booean attice avoiding a rainbow copy of a poset arxiv:1812.09058v1 [math.co] 21 Dec 2018 Baázs Patkós Afréd Rényi Institute of Mathematics, Hungarian Academy of Scinces H-1053, Budapest,

More information

4 1-D Boundary Value Problems Heat Equation

4 1-D Boundary Value Problems Heat Equation 4 -D Boundary Vaue Probems Heat Equation The main purpose of this chapter is to study boundary vaue probems for the heat equation on a finite rod a x b. u t (x, t = ku xx (x, t, a < x < b, t > u(x, = ϕ(x

More information

ORTHOGONAL MULTI-WAVELETS FROM MATRIX FACTORIZATION

ORTHOGONAL MULTI-WAVELETS FROM MATRIX FACTORIZATION J. Korean Math. Soc. 46 2009, No. 2, pp. 281 294 ORHOGONAL MLI-WAVELES FROM MARIX FACORIZAION Hongying Xiao Abstract. Accuracy of the scaing function is very crucia in waveet theory, or correspondingy,

More information

$, (2.1) n="# #. (2.2)

$, (2.1) n=# #. (2.2) Chapter. Eectrostatic II Notes: Most of the materia presented in this chapter is taken from Jackson, Chap.,, and 4, and Di Bartoo, Chap... Mathematica Considerations.. The Fourier series and the Fourier

More information

ASummaryofGaussianProcesses Coryn A.L. Bailer-Jones

ASummaryofGaussianProcesses Coryn A.L. Bailer-Jones ASummaryofGaussianProcesses Coryn A.L. Baier-Jones Cavendish Laboratory University of Cambridge caj@mrao.cam.ac.uk Introduction A genera prediction probem can be posed as foows. We consider that the variabe

More information

(f) is called a nearly holomorphic modular form of weight k + 2r as in [5].

(f) is called a nearly holomorphic modular form of weight k + 2r as in [5]. PRODUCTS OF NEARLY HOLOMORPHIC EIGENFORMS JEFFREY BEYERL, KEVIN JAMES, CATHERINE TRENTACOSTE, AND HUI XUE Abstract. We prove that the product of two neary hoomorphic Hece eigenforms is again a Hece eigenform

More information

STA 216 Project: Spline Approach to Discrete Survival Analysis

STA 216 Project: Spline Approach to Discrete Survival Analysis : Spine Approach to Discrete Surviva Anaysis November 4, 005 1 Introduction Athough continuous surviva anaysis differs much from the discrete surviva anaysis, there is certain ink between the two modeing

More information

PHYS 110B - HW #1 Fall 2005, Solutions by David Pace Equations referenced as Eq. # are from Griffiths Problem statements are paraphrased

PHYS 110B - HW #1 Fall 2005, Solutions by David Pace Equations referenced as Eq. # are from Griffiths Problem statements are paraphrased PHYS 110B - HW #1 Fa 2005, Soutions by David Pace Equations referenced as Eq. # are from Griffiths Probem statements are paraphrased [1.] Probem 6.8 from Griffiths A ong cyinder has radius R and a magnetization

More information

Gauss Law. 2. Gauss s Law: connects charge and field 3. Applications of Gauss s Law

Gauss Law. 2. Gauss s Law: connects charge and field 3. Applications of Gauss s Law Gauss Law 1. Review on 1) Couomb s Law (charge and force) 2) Eectric Fied (fied and force) 2. Gauss s Law: connects charge and fied 3. Appications of Gauss s Law Couomb s Law and Eectric Fied Couomb s

More information

Reliability: Theory & Applications No.3, September 2006

Reliability: Theory & Applications No.3, September 2006 REDUNDANCY AND RENEWAL OF SERVERS IN OPENED QUEUING NETWORKS G. Sh. Tsitsiashvii M.A. Osipova Vadivosto, Russia 1 An opened queuing networ with a redundancy and a renewa of servers is considered. To cacuate

More information