Exact tensor completion with sum-of-squares


Proceedings of Machine Learning Research vol 65:1–54, 2017. 30th Annual Conference on Learning Theory.

Exact tensor completion with sum-of-squares

Aaron Potechin, Institute for Advanced Study, Princeton
David Steurer, Cornell University and Institute for Advanced Study, Princeton

Editors: Satyen Kale, Ohad Shamir and Kamalika Chaudhuri

Abstract
We obtain the first polynomial-time algorithm for exact tensor completion that improves over the bound implied by reduction to matrix completion. The algorithm recovers an unknown 3-tensor with r incoherent, orthogonal components in ℝⁿ from r·Õ(n^{1.5}) randomly observed entries of the tensor. This bound improves over the previous best one of r·Õ(n²) by reduction to exact matrix completion. Our bound also matches the best known results for the easier problem of approximate tensor completion (Barak and Moitra, 2016). Our algorithm and analysis extends seminal results for exact matrix completion (Candès and Recht, 2009) to the tensor setting via the sum-of-squares method. The main technical challenge is to show that a small number of randomly chosen monomials are enough to construct a degree-3 polynomial with precisely planted orthogonal global optima over the sphere, and that this fact can be certified within the sum-of-squares proof system.
Keywords: tensor completion, sum-of-squares method, semidefinite programming, exact recovery, matrix polynomials, matrix norm bounds

1. Introduction

A basic task in machine learning and signal processing is to infer missing data from a small number of observations about the data. An important example is matrix completion, which asks to recover an unknown low-rank matrix from a small number of observed entries. This problem has many interesting applications; one of the prominent original motivations was the Netflix Prize that sought improved algorithms for predicting user ratings for movies from a small number of user-provided ratings.
After an extensive research effort (Candès and Recht, 2009; Candès and Tao, 2010; Keshavan et al., 2009; Srebro and Shraibman, 2005), efficient algorithms with almost optimal, provable recovery guarantees have been obtained: in order to efficiently recover an unknown incoherent n-by-n matrix of rank r it is enough to observe r·Õ(n) random entries of the matrix (Gross, 2011; Recht, 2011). One of the remaining challenges is to obtain algorithms for the more general and much less understood tensor completion problem, where the observations do not just consist of pairwise correlations but also higher-order ones.

Algorithms and analyses for matrix and tensor completion come in three flavors:

1. algorithms analyzed by statistical learning tools like Rademacher complexity (Srebro and Shraibman, 2005; Barak and Moitra, 2016),
2. iterative algorithms like alternating minimization (Jain et al., 2013; Hardt, 2014; Hardt and Wootters, 2014),

3. algorithms analyzed by constructing dual certificates for convex programming relaxations (Candès and Recht, 2009; Gross, 2011; Recht, 2011).

While each of these flavors has different benefits, typically only algorithms of the third flavor achieve exact recovery. (The only exceptions to this rule we are aware of are a recent fast algorithm for matrix completion (Jain and Netrapalli, 2015) and a recent analysis (Ge et al., 2016) showing that the commonly used non-convex objective function for positive semidefinite matrix completion has no spurious local minima, and thus stochastic gradient descent and other popular optimization procedures can solve positive semidefinite matrix completion with arbitrary initialization.) For all other algorithms, the analysis exhibits a trade-off between reconstruction error and the required number of observations (even when there is no noise in the input).¹

In this work, we obtain the first algorithm for exact tensor completion that improves over the bounds implied by reduction to exact matrix completion. The algorithm recovers an unknown 3-tensor with r incoherent, orthogonal components in ℝⁿ from r·Õ(n^{1.5}) randomly observed entries of the tensor. The previous best bound for exact recovery is r·Õ(n²), which is implied by reduction to exact matrix completion. (The reduction views a 3-tensor on ℝⁿ as an n-by-n² matrix. We can recover rank-r matrices of this shape from r·Õ(n²) samples, which is best possible.) Our bound also matches the best known results for the easier problem of approximate tensor completion (Jain and Oh, 2014; Bhojanapalli and Sanghavi, 2015; Barak and Moitra, 2016) (the results of the last work also apply to a wider range of tensors and do not require orthogonality).

A problem similar to matrix and tensor completion is matrix and tensor sensing. The goal is to recover an unknown low-rank matrix or tensor from a small number of linear measurements.
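The flattening reduction mentioned above can be sketched in a few lines; a minimal illustration, assuming numpy (the dimensions and random components here are made up for the example):

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 8, 3

# Orthonormal components {u_i}, {v_i}, {w_i} (columns of U, V, W).
U, V, W = (np.linalg.qr(rng.standard_normal((n, r)))[0] for _ in range(3))

# X = sum_i u_i (x) v_i (x) w_i, an orthogonal 3-tensor of rank r.
X = np.einsum('ai,bi,ci->abc', U, V, W)

# The reduction flattens X into an n-by-n^2 matrix of rank at most r;
# matrix completion for this shape needs ~ r * n^2 observations, versus
# r * n^1.5 (up to log factors) for the tensor algorithm.
M = np.reshape(X, (n, n * n))
print(np.linalg.matrix_rank(M))  # -> 3
```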
An interesting phenomenon is that for carefully designed measurements (which actually happen to be rank 1) it is possible to efficiently recover a 3-tensor of rank r with just O(r²n) measurements (Forbes and Shpilka, 2012), which is better than the best bounds for tensor completion when r ≤ n^{0.5}. We conjecture that for tensor completion from random entries the bound we obtain is, up to logarithmic factors, best possible among polynomial-time algorithms.

Sum-of-squares method. Our algorithm is based on sum-of-squares (Shor, 1987; Parrilo, 2000; Lasserre, 2000/01), a very general and powerful meta-algorithm studied extensively in many scientific communities (see for example the survey Barak and Steurer, 2014). In theoretical computer science, the main research focus has been on the capabilities of sum-of-squares for approximation problems (Barak et al., 2012), especially in the context of Khot's Unique Games Conjecture (Khot, 2002). More recently, sum-of-squares emerged as a general approach to inference problems that arise in machine learning and have defied other algorithmic techniques. This approach has led to improved algorithms for tensor decomposition (Barak et al., 2015; Ge and Ma, 2015; Hopkins et al., 2016; Ma et al., 2016), dictionary learning (Barak et al., 2015; Hazan and Ma, 2016), tensor principal component analysis (Hopkins et al., 2015; Raghavendra et al., 2016; Bhattiprolu et al., 2016), and planted sparse vectors (Barak et al., 2014; Hopkins et al., 2016). An exciting direction is also to understand limitations of sum-of-squares for inference problems on concrete input distributions (Ma and Wigderson, 2015; Hopkins et al., 2015; Barak et al., 2016).

1. We remark that this trade-off is a property of the analysis and not necessarily the algorithm. For example, some algorithms of the first flavor are based on the same convex programming relaxations as exact recovery algorithms.
Also for iterative algorithms, the trade-off between reconstruction error and number of samples comes from the requirement of the analysis that each iteration uses fresh samples. For these iterative algorithms, the number of samples depends only logarithmically on the desired accuracy, which means that these analyses imply exact recovery if the bit complexity of the entries is small.

An appealing feature of the sum-of-squares method is that its capabilities and limitations can be understood through the lens of a simple but surprisingly powerful and intuitive restricted proof system called the sum-of-squares or Positivstellensatz system (Grigoriev and Vorobjov, 2001; Grigoriev, 2001a,b). A conceptual contribution of this work is to show that seminal results for inference problems like compressed sensing and matrix completion have natural interpretations as identifiability proofs in this system. Furthermore, we show that this interpretation is helpful in order to analyze more challenging inference problems like tensor completion. A promising future direction is to find more examples of inference problems where this lens on inference algorithms and identifiability proofs yields stronger provable guarantees.

A technical contribution of our work is that we develop techniques in order to show that sum-of-squares achieves exact recovery. Most previous works only showed that sum-of-squares gives approximate solutions, which in some cases can be turned into exact solutions by invoking algorithms with local convergence guarantees (Ge and Ma, 2015; Barak et al., 2014) or solving successive sum-of-squares relaxations (Ma et al., 2016).

1.1. Results

We say that a vector v ∈ ℝⁿ is μ-incoherent with respect to the coordinate basis e₁, …, eₙ if for every index i ∈ [n],

⟨e_i, v⟩² ≤ (μ/n)·‖v‖².  (1.1)

We say that a 3-tensor X ∈ ℝ^{n×n×n} is orthogonal of rank r if there are orthogonal vectors {u_i}_{i∈[r]}, {v_i}_{i∈[r]}, {w_i}_{i∈[r]} ⊆ ℝⁿ such that X = Σ_{i=1}^r u_i ⊗ v_i ⊗ w_i. We say that such a 3-tensor X is μ-incoherent if all of the vectors u_i, v_i, w_i are μ-incoherent.

Theorem 1 (Main) There exists a polynomial-time algorithm that, given at least μ^{O(1)}·r·Õ(n^{1.5}) random entries of an unknown orthogonal μ-incoherent 3-tensor X ∈ ℝ^{n×n×n} of rank r, outputs all entries of X with probability at least 1 − n^{−ω(1)}.
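The incoherence parameter in Eq. (1.1) is easy to compute for a given vector; a small numerical sketch, assuming numpy (the two example vectors are illustrative):

```python
import numpy as np

def incoherence(v):
    """Smallest mu such that <e_i, v>^2 <= (mu/n) * ||v||^2 for all i."""
    n = len(v)
    return n * np.max(v**2) / np.dot(v, v)

n = 100
flat = np.ones(n) / np.sqrt(n)       # maximally spread out: mu = 1
spike = np.zeros(n); spike[0] = 1.0  # a coordinate vector: mu = n
print(incoherence(flat), incoherence(spike))  # -> 1.0 100.0
```

Components with small μ are those whose mass is spread over many coordinates, so that randomly observed entries of the tensor actually carry information about them.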
We note that the analysis also shows that the algorithm is robust to an inverse polynomial amount of noise in the input (resulting in an inverse polynomial amount of error in the output). We remark that the running time of the algorithm depends polynomially on the bit complexity of X.

2. Techniques

Let {u_i}_{i∈[r]}, {v_i}_{i∈[r]}, {w_i}_{i∈[r]} be three orthonormal sets in ℝⁿ. Consider a 3-tensor X ∈ ℝ^{n×n×n} of the form X = Σ_{i=1}^r λ_i·u_i ⊗ v_i ⊗ w_i with λ₁, …, λ_r ≥ 0. Let Ω ⊆ [n]³ be a subset of the entries of X. Our goal is to efficiently reconstruct the unknown tensor X from its restriction X_Ω to the entries in Ω.

Ignoring computational efficiency, we first ask if this task is information-theoretically possible. More concretely, for a given set of observations X_Ω, how can we rule out that there exists another rank-r orthogonal 3-tensor X̂ ≠ X that would give rise to the same observations X̂_Ω = X_Ω?²

2. We emphasize that we ask here about the uniqueness of X for a fixed set of entries Ω. This question differs from asking about the uniqueness for a random set of entries, which could be answered by suitably counting the number of low-rank 3-tensors.

A priori it is not clear how an answer to this information-theoretic question could be related to the goal of obtaining an efficient algorithm. However, it turns out that the sum-of-squares framework allows us to systematically translate a uniqueness proof to an algorithm that efficiently finds the solution. (In addition, this solution also comes with a short certificate for uniqueness.³)

Uniqueness proof. Let Ω ⊆ [n]³ be a set of entries and let X = Σ_{i=1}^r λ_i u_i ⊗ v_i ⊗ w_i be a 3-tensor with λ₁, …, λ_r ≥ 0. It turns out that the following two conditions are enough to imply that X_Ω uniquely determines X. The first condition is that the vectors {(u_i ⊗ v_i ⊗ w_i)_Ω} are linearly independent. The second condition is that there exists a 3-linear form T on ℝⁿ with the following properties:

1. in the monomial basis, T is supported on Ω, so that T(x, y, z) = Σ_{(i,j,k)∈Ω} T_{ijk} x_i y_j z_k,

2. evaluated over unit vectors, the 3-form T is exactly maximized at the points (u_i, v_i, w_i), so that T(u₁, v₁, w₁) = ⋯ = T(u_r, v_r, w_r) = 1 and T(x, y, z) < 1 for all unit vectors (x, y, z) ∉ {(u_i, v_i, w_i) | i ∈ [r]}.

We show that the two deterministic conditions above are satisfied with high probability if the vectors {u_i}, {v_i}, {w_i} are incoherent and Ω is a random set of entries of size at least r·Õ(n^{1.5}).

Let us sketch the proof that such a 3-linear form T indeed implies uniqueness. Concretely, we claim that if we let X̂ be a 3-tensor of the form Σ_{i=1}^r λ̂_i û_i ⊗ v̂_i ⊗ ŵ_i for λ̂₁, …, λ̂_r ≥ 0 and unit vectors {û_i}, {v̂_i}, {ŵ_i} with X̂_Ω = X_Ω that minimizes Σ_i λ̂_i, then X̂ = X must hold. We identify T with an element of ℝ^{n⊗n⊗n} (the coefficient tensor of T in the monomial basis). Let X̂ be as before. We are to show that X̂ = X. On the one hand, using that T(x, y, z) ≤ 1 for all unit vectors x, y, z,

⟨T, X̂⟩ = Σ_{i=1}^r λ̂_i·T(û_i, v̂_i, ŵ_i) ≤ Σ_{i=1}^r λ̂_i.

At the same time, using that T is supported on Ω and the fact that X̂_Ω = X_Ω,

⟨T, X̂⟩ = ⟨T, X⟩ = Σ_{i=1}^r λ_i·T(u_i, v_i, w_i) = Σ_{i=1}^r λ_i.
Since X̂ minimizes Σ_i λ̂_i, equality has to hold in the previous inequality. It follows that every point (û_i, v̂_i, ŵ_i) is equal to one of the points (u_j, v_j, w_j), because T is uniquely maximized at the points {(u_i, v_i, w_i) | i ∈ [r]}. Since we assumed that {(u_i ⊗ v_i ⊗ w_i)_Ω} are linearly independent, we can conclude that X̂ = X.

When we show that such a 3-linear form T exists, we will actually show something stronger, namely that the second property is not only true but also has a short certificate in the form of a degree-4 sum-of-squares proof, which we describe next. This certificate also enables us to efficiently recover the missing tensor entries.

3. This certificate is closely related to certificates in the form of dual solutions for convex programming relaxations that are used in the compressed sensing and matrix completion literature.
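The step ⟨T, X̂⟩ = ⟨T, X⟩ in the argument above uses only that T vanishes outside Ω and that the two tensors agree on Ω. This can be checked numerically; a toy sketch with random data, assuming numpy (no attempt is made here to construct an actual maximizing form T):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
X = rng.standard_normal((n, n, n))          # stand-in for the unknown tensor
Xhat = X + rng.standard_normal((n, n, n))   # a competing tensor ...
mask = rng.random((n, n, n)) < 0.3          # ... observed entries Omega
Xhat[mask] = X[mask]                        # ... that agrees with X on Omega

T = rng.standard_normal((n, n, n)) * mask   # any 3-form supported on Omega

# T vanishes off Omega and Xhat matches X on Omega, so <T, Xhat> = <T, X>.
print(np.allclose((T * Xhat).sum(), (T * X).sum()))  # -> True
```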

Uniqueness proof in the sum-of-squares system. A degree-4 sos certificate for the second property of T is an (n + n²)-by-(n + n²) positive-semidefinite matrix M (acting as a linear operator on ℝⁿ ⊕ (ℝⁿ ⊗ ℝⁿ)) that represents the polynomial ‖x‖² + ‖y‖²‖z‖² − 2T(x, y, z), i.e.,

⟨(x, y ⊗ z), M(x, y ⊗ z)⟩ = ‖x‖² + ‖y‖²‖z‖² − 2T(x, y, z).  (2.1)

Furthermore, we require that the kernel of M is precisely the span of the vectors {(u_i, v_i ⊗ w_i) | i ∈ [r]}. Let us see that this matrix M certifies that T has the property that over unit vectors it is exactly maximized at the desired points (u_i, v_i, w_i). Let u, v, w be unit vectors such that (u, v, w) is not a multiple of one of the vectors (u_i, v_i, w_i). Then by orthogonality, both (u, v ⊗ w) and (−u, v ⊗ w) have non-zero projection on the orthogonal complement of the kernel of M. Therefore, the bounds

0 < ⟨(u, v ⊗ w), M(u, v ⊗ w)⟩ = 2 − 2T(u, v, w) and 0 < ⟨(−u, v ⊗ w), M(−u, v ⊗ w)⟩ = 2 + 2T(u, v, w)

together give the desired conclusion that |T(u, v, w)| < 1.

Reconstruction algorithm based on the sum-of-squares system. The existence of a positive semidefinite matrix M as above not only means that reconstruction of X from X_Ω is possible information-theoretically but also efficiently. The sum-of-squares algorithm allows us to efficiently search over low-degree moments of objects called pseudo-distributions that generalize probability distributions over real vector spaces. Every pseudo-distribution μ defines pseudo-expectation values Ẽ_μ f for all low-degree polynomial functions f(x, y, z), which behave in many ways like expectation values under an actual probability distribution. In order to reconstruct X from the observations X_Ω, we use the sum-of-squares algorithm to efficiently find a pseudo-distribution μ that satisfies⁴

Ẽ_{(x,y,z)} ‖x‖² + ‖y‖²‖z‖² ≤ 2,  (2.2)

( Ẽ_{(x,y,z)} x ⊗ y ⊗ z )_Ω = X_Ω.  (2.3)

Note that the distribution over the vectors (u_i, v_i, w_i) with probabilities λ_i satisfies the above conditions.
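For the simplest case r = 1 with u₁ = v₁ = w₁ = e₁ and T(x, y, z) = x₁y₁z₁, a certificate matrix M as in Eq. (2.1) can be written down explicitly and its claimed properties verified numerically. A sketch assuming numpy; the construction below is only for this toy instance:

```python
import numpy as np

n = 4
e1 = np.zeros(n); e1[0] = 1.0
C = np.outer(e1, np.kron(e1, e1))  # n x n^2 block pairing x_1 with (y(x)z)_11

# M acts on (x, y(x)z) and represents ||x||^2 + ||y||^2 ||z||^2 - 2 x_1 y_1 z_1.
M = np.block([[np.eye(n), -C], [-C.T, np.eye(n * n)]])

rng = np.random.default_rng(2)
x, y, z = rng.standard_normal((3, n))
v = np.concatenate([x, np.kron(y, z)])
lhs = v @ M @ v
rhs = x @ x + (y @ y) * (z @ z) - 2 * x[0] * y[0] * z[0]
print(np.isclose(lhs, rhs))        # -> True   (identity (2.1))

eigs = np.linalg.eigvalsh(M)
print(eigs.min() >= -1e-9)         # -> True   (M is positive semidefinite)
print(int(np.sum(eigs < 1e-9)))    # -> 1      (kernel = span{(e1, e1(x)e1)})
```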
Our previous discussion about uniqueness shows that the existence of a positive semidefinite matrix M as above implies that no other distribution satisfies the above conditions. It turns out that the matrix M implies that this uniqueness holds even among pseudo-distributions, in the sense that any pseudo-distribution that satisfies Eqs. (2.2) and (2.3) must satisfy Ẽ_{(x,y,z)} x ⊗ y ⊗ z = X, which means that the reconstruction is successful.⁵

When do such uniqueness certificates exist? The above discussion shows that in order to achieve reconstruction it is enough to show that uniqueness certificates of the form above exist. We show that these certificates exist with high probability if we choose Ω to be a large enough random subset of entries (under suitable assumptions on X). Our existence proof is based on a randomized procedure to construct such a certificate, heavily inspired by similar constructions for matrix completion (Gross, 2011; Recht, 2011). (We note that this construction uses the unknown tensor X and is therefore not constructive in the context of the recovery problem.)

4. The viewpoint in terms of pseudo-distributions is useful to see how the previous uniqueness proof relates to the algorithm. We can also describe the solutions to the constraints Eqs. (2.2) and (2.3) in terms of linearly constrained positive semidefinite matrices. See the alternative description of Algorithm 2.
5. The matrix M can also be viewed as a solution to the dual of the convex optimization problem of finding a pseudo-distribution that satisfies conditions Eqs. (2.2) and (2.3).

Before describing the construction, we make the requirements on the 3-linear form T more concrete. We identify T with the linear operator from ℝⁿ ⊗ ℝⁿ to ℝⁿ such that T(x, y, z) = ⟨x, T(y ⊗ z)⟩. Furthermore, let T_a be the linear operators on ℝⁿ such that T(x, y, z) = Σ_{a=1}^n x_a·⟨y, T_a z⟩. Then, the following conditions on T imply the existence of a uniqueness certificate M (which also means that recovery succeeds):

1. every unknown entry (i, j, k) ∉ Ω satisfies ⟨e_i, T(e_j ⊗ e_k)⟩ = 0,
2. every index i ∈ [r] satisfies u_i = T(v_i ⊗ w_i),
3. the n²-by-n² matrix Σ_{a=1}^n (vec T_a)(vec T_a)ᵀ − Σ_{i=1}^r (v_i ⊗ w_i)(v_i ⊗ w_i)ᵀ has spectral norm at most 0.01, where vec T_a ∈ ℝ^{n²} denotes the flattening of the slice T_a.

We note that the uniqueness certificates for matrix completion (Gross, 2011; Recht, 2011) have similar requirements. The key difference is that we need to control the spectral norm of an operator that depends quadratically on the constructed object T (as opposed to a linear dependence in the matrix completion case). Combined with the fact that the construction of T is iterative (about log n steps), the spectral norm bound unfortunately requires significant technical work. In particular, we cannot apply general matrix concentration inequalities and instead apply the trace moment method. (See Section 5.) We also note that the fact that the above requirements allow us to construct the certificate M is not immediate and requires some new ideas about matrix representations of polynomials, which might be useful elsewhere. (See Appendix A.) Finally, we note that the transformation applied to T in order to obtain the matrix for the third condition above appears in many works about 3-tensors (Hopkins et al., 2015; Barak and Moitra, 2016), with the earliest appearance in a work on refutation algorithms for random 3-SAT instances (see Feige and Ofek, 2007).

The iterative construction of the linear operator T exactly follows the recipe from matrix completion (Gross, 2011; Recht, 2011). Let R_Ω be the projection operator onto the linear space of operators T that satisfy the first requirement.
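Before continuing with the construction, the identity behind the third condition can be sanity-checked for the fully observed planted tensor, where the difference is exactly zero; a sketch assuming numpy and row-major flattening of the slices:

```python
import numpy as np

rng = np.random.default_rng(3)
n, r = 6, 2
U, V, W = (np.linalg.qr(rng.standard_normal((n, r)))[0] for _ in range(3))
T = np.einsum('ai,bi,ci->abc', U, V, W)  # planted tensor, no entries removed

flat = T.reshape(n, n * n)               # row a is the flattened slice T_a
lhs = flat.T @ flat                      # sum_a (vec T_a)(vec T_a)^T

VW = np.stack([np.kron(V[:, i], W[:, i]) for i in range(r)])
rhs = VW.T @ VW                          # sum_i (v_i (x) w_i)(v_i (x) w_i)^T

# Exact agreement here relies on the orthonormality of {u_i}; the hard part
# of the analysis is keeping the spectral norm of this difference small once
# T is supported only on a random subset Omega of the entries.
print(np.linalg.norm(lhs - rhs, 2))      # ~ 0 (up to floating-point error)
```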
Let P_T be the (affine) projection operator onto the affine linear space of operators T that satisfy the second requirement. We start with T⁽⁰⁾ = X. (At this point we satisfy the second condition. Also, the matrix in the third condition is 0.) In order to enforce the first condition, we apply the operator R_Ω. After this projection, the second condition is most likely no longer satisfied. To enforce the second condition, we apply the affine linear operator P_T and obtain T⁽¹⁾ = P_T R_Ω (X). The idea is to iterate this construction and show that after a logarithmic number of iterations both the first and second condition are satisfied up to an inverse polynomially small error (which we can correct in a direct way). The main challenge is to show that the iterates obtained in this way satisfy the desired spectral norm bound. (We note that for technical reasons the construction uses fresh randomness Ω for each iteration, like in the matrix completion case (Recht, 2011; Gross, 2011). Since the number of iterations is logarithmic, the total number of required observations remains the same up to a logarithmic factor.)

3. Preliminaries

Unless explicitly stated otherwise, O(·)-notation hides absolute multiplicative constants. Concretely, every occurrence of O(x) is a placeholder for some function f(x) that satisfies ∀x: |f(x)| ≤ C·x for some absolute constant C > 0. Similarly, Ω(x) is a placeholder for a function g(x) that satisfies ∀x: g(x) ≥ x/C for some absolute constant C > 0.

Our algorithm is based on a generalization of probability distributions over ℝⁿ. To define this generalization, we use the following notation for the formal expectation of a function f on ℝⁿ with respect to a finitely-supported function μ: ℝⁿ → ℝ,

Ẽ_μ f = Σ_{x ∈ supp(μ)} μ(x)·f(x).

A degree-d pseudo-distribution over ℝⁿ is a finitely-supported function μ: ℝⁿ → ℝ such that Ẽ_μ 1 = 1 and Ẽ_μ f² ≥ 0 for every polynomial f of degree at most d/2. A key algorithmic property of pseudo-distributions is that their low-degree moments have an efficient separation oracle. Concretely, the set of degree-d moments Ẽ_μ (1, x)^{⊗d} such that μ is a degree-d pseudo-distribution over ℝⁿ has an n^{O(d)}-time separation oracle. Therefore, standard convex optimization methods allow us to efficiently optimize linear functions over low-degree moments of pseudo-distributions (even subject to additional convex constraints that have efficient separation oracles) up to arbitrary numerical accuracy.

4. Tensor completion algorithm

In this section, we show that the following algorithm for tensor completion succeeds in recovering the unknown tensor from partial observations, assuming the existence of a particular linear operator T. We will state conditions on the unknown tensor that imply that such a linear operator exists with high probability if the observed entries are chosen at random. We use essentially the same convex relaxation as in (Barak and Moitra, 2016), but our analysis differs significantly.

Algorithm 2 (Exact tensor completion based on degree-4 sum-of-squares)
Input: locations Ω ⊆ [n]³ and partial observations X_Ω of an unknown 3-tensor X ∈ ℝ^{n×n×n}.
Operation: Find a degree-4 pseudo-distribution μ on ℝⁿ × ℝⁿ × ℝⁿ such that the third moment matches the observations, ( Ẽ_{(x,y,z)} x ⊗ y ⊗ z )_Ω = X_Ω, so as to minimize Ẽ_{(x,y,z)} ‖x‖² + ‖y‖²‖z‖². Output the 3-tensor Ẽ_{(x,y,z)} x ⊗ y ⊗ z ∈ ℝ^{n×n×n}.
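The condition Ẽ_μ f² ≥ 0 for all low-degree f is a positive-semidefiniteness condition on the moment matrix, which is what makes pseudo-distributions amenable to convex optimization. A minimal numerical sketch for degree 2, assuming numpy (here the moments come from an actual distribution, which is a special case of a pseudo-distribution):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 3
pts = rng.standard_normal((50, n))    # support of an actual distribution
weights = np.full(50, 1 / 50)

# Monomial vector (1, x); the matrix E (1,x)(1,x)^T collects all moments
# of degree at most 2.
mono = np.hstack([np.ones((50, 1)), pts])
moment = (mono * weights[:, None]).T @ mono

# E f^2 >= 0 for every affine f = <c, (1,x)> is exactly PSD-ness of moment.
c = rng.standard_normal(n + 1)
print(c @ moment @ c >= 0)                         # -> True
print(np.linalg.eigvalsh(moment).min() >= -1e-9)   # -> True
```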
Alternative description: Output a minimum-trace, positive semidefinite matrix Y acting on ℝⁿ ⊕ (ℝⁿ ⊗ ℝⁿ) with blocks Y_{1,1}, Y_{1,2} and Y_{2,2} such that (Y_{1,2})_Ω = X_Ω matches the observations, and Y_{2,2} satisfies the additional symmetry constraints that each entry ⟨e_j ⊗ e_k, Y_{2,2}(e_{j′} ⊗ e_{k′})⟩ only depends on the index sets {j, j′}, {k, k′}.

Let {u_i}, {v_i}, {w_i} be three orthonormal sets in ℝⁿ, each of cardinality r. We reason about the recovery guarantees of the algorithm in terms of the following notion of certificate.

Definition 3 We say that a linear operator T from ℝⁿ ⊗ ℝⁿ to ℝⁿ is a degree-4 certificate for Ω and orthonormal sets {u_i}, {v_i}, {w_i} ⊆ ℝⁿ if the following conditions are satisfied:

1. the vectors {(u_i ⊗ v_j ⊗ w_k)_Ω | (i, j, k) ∈ S} are linearly independent, where S ⊆ [n]³ is the set of triples with at least two identical indices from [r],

2. every entry (a, b, c) ∉ Ω satisfies ⟨e_a, T(e_b ⊗ e_c)⟩ = 0,

3. if we view T as a 3-tensor in (ℝⁿ)^{⊗3} whose (a, b, c) entry is ⟨e_a, T(e_b ⊗ e_c)⟩, every index i ∈ [r] satisfies (u_i ⊗ v_i ⊗ Id)T = w_i, (u_i ⊗ Id ⊗ w_i)T = v_i, and (Id ⊗ v_i ⊗ w_i)T = u_i,

4. the following matrix has spectral norm at most 0.01:

Σ_{a=1}^n (vec T_a)(vec T_a)ᵀ − Σ_{i=1}^r (v_i ⊗ w_i)(v_i ⊗ w_i)ᵀ,

where {T_a} are the matrices such that T(x, y, z) = Σ_{a=1}^n x_a·⟨y, T_a z⟩ and vec T_a ∈ ℝ^{n²} denotes the flattening of T_a.

In Section 4.4, we prove that the existence of such certificates implies that the above algorithm successfully recovers the unknown tensor, as formalized by the following theorem.

Theorem 4 Let X ∈ ℝ^{n×n×n} be any 3-tensor of the form Σ_{i=1}^r λ_i u_i ⊗ v_i ⊗ w_i for λ₁, …, λ_r ∈ ℝ₊. Let Ω ⊆ [n]³ be a subset of indices. Suppose there exists a degree-4 certificate in the sense of Definition 3. Then, given the observations X_Ω, the above algorithm recovers the unknown tensor X exactly.

In Section 4.5, we show that degree-4 certificates are likely to exist when Ω is a random set of appropriate size.

Theorem 5 Let {u_i}, {v_i}, {w_i} be three orthonormal sets of μ-incoherent vectors in ℝⁿ, each of cardinality r. Let Ω ⊆ [n]³ be a random set of tensor entries of cardinality m = r·n^{1.5}·(μ·log n)^C for an absolute constant C ≥ 1. Then, with probability 1 − n^{−ω(1)}, there exists a linear operator T that satisfies the requirements of Definition 3.

Taken together, the two theorems above imply our main result, Theorem 1.

4.1. Simpler proofs via higher-degree sum-of-squares

Unfortunately, the proof of Theorem 5 requires extremely technical spectral norm bounds for random matrices. It turns out that less technical norm bounds suffice if we use degree-6 sum-of-squares relaxations. For this more powerful algorithm, weaker certificates are enough to ensure exact recovery, and the proof that these weaker certificates exist with high probability is considerably easier than the proof that degree-4 certificates exist with high probability. In the following we describe this weaker notion of certificate and state its properties.
In the subsequent sections we prove that the properties of these certificates are enough to imply our main result, Theorem 1.

Algorithm 6 (Exact tensor completion based on higher-degree sum-of-squares)
Input: locations Ω ⊆ [n]³ and partial observations X_Ω of an unknown 3-tensor X ∈ ℝ^{n×n×n}.
Operation: Find a degree-6 pseudo-distribution μ on ℝⁿ × ℝⁿ × ℝⁿ so as to minimize Ẽ_{(x,y,z)} ½(‖x‖² + ‖z‖²) subject to the following constraints:

( Ẽ_{(x,y,z)} x ⊗ y ⊗ z )_Ω = X_Ω,  (4.1)

Ẽ_{(x,y,z)} (‖y‖² − 1)·p(x, y, z) = 0 for all p ∈ ℝ[x, y, z] of degree at most 4.  (4.2)

Output the 3-tensor Ẽ_{(x,y,z)} x ⊗ y ⊗ z ∈ ℝ^{n×n×n}.

Let {u_i}, {v_i}, {w_i} be three orthonormal sets in ℝⁿ, each of cardinality r. We reason about the recovery guarantees of the above algorithm in terms of the following notion of certificate. The main difference to degree-4 certificates (Definition 3) is that the spectral norm condition is replaced by a condition in terms of sum-of-squares representations.

Definition 7 We say that a 3-tensor T ∈ (ℝⁿ)^{⊗3} is a higher-degree certificate for Ω and orthonormal sets {u_i}, {v_i}, {w_i} ⊆ ℝⁿ if the following conditions are satisfied:

1. the vectors {(u_i ⊗ v_i ⊗ w_i)_Ω}_{i∈[r]} are linearly independent,
2. every entry (a, b, c) ∉ Ω satisfies ⟨T, e_a ⊗ e_b ⊗ e_c⟩ = 0,
3. every index i ∈ [r] satisfies (u_i ⊗ v_i ⊗ Id)T = w_i, (u_i ⊗ Id ⊗ w_i)T = v_i, and (Id ⊗ v_i ⊗ w_i)T = u_i,
4. the following degree-4 polynomials in ℝ[x, y, z] are sums of squares:

‖x‖² + ‖y‖²‖z‖² − (1/ε)·⟨T′, x ⊗ y ⊗ z⟩,  (4.3)
‖y‖² + ‖x‖²‖z‖² − (1/ε)·⟨T′, x ⊗ y ⊗ z⟩,  (4.4)
‖z‖² + ‖x‖²‖y‖² − (1/ε)·⟨T′, x ⊗ y ⊗ z⟩,  (4.5)

where T′ = T − Σ_{i=1}^r u_i ⊗ v_i ⊗ w_i and ε > 0 is an absolute constant (say ε = 10⁻⁶).

In the following sections, we prove that higher-degree certificates imply that Algorithm 6 successfully recovers the desired tensor, and that they exist with high probability for random Ω of appropriate size.

4.2. Higher-degree certificates imply exact recovery

Let {u_i}, {v_i}, {w_i} be orthonormal bases in ℝⁿ. We say that a degree-ℓ pseudo-distribution μ(x, y, z) satisfies the constraint {‖y‖² = 1}, denoted μ ⊨ {‖y‖² = 1}, if Ẽ_{(x,y,z)} p(x, y, z)·(1 − ‖y‖²) = 0 for all polynomials p ∈ ℝ[x, y, z] of degree at most ℓ − 2.

We are to show that a higher-degree certificate in the sense of Definition 7 implies that Algorithm 6 reconstructs the partially observed tensor exactly. A key step of this proof is the following lemma about expectation values of higher-degree pseudo-distributions.
Lemma 8 Let T ∈ (ℝⁿ)^{⊗3} be a higher-degree certificate as in Definition 7 for the set Ω ⊆ [n]³ and the vectors {u_i}_{i∈[r]}, {v_i}_{i∈[r]}, {w_i}_{i∈[r]}. Then, every degree-6 pseudo-distribution μ(x, y, z) with μ ⊨ {‖y‖² = 1} satisfies

Ẽ_{(x,y,z)} T(x, y, z) ≤ Ẽ_{(x,y,z)} ½(‖x‖² + ‖z‖²)
  − (1/100)·Ẽ_{(x,y,z)} [ Σ_{i=r+1}^n (⟨u_i, x⟩² + ⟨w_i, z⟩²) + Σ_{i=1}^n Σ_{j∈[n]∖{i}} ⟨v_i, y⟩²·(⟨u_j, x⟩² + ⟨w_j, z⟩²) ].  (4.6)

To prove this lemma it will be useful to introduce the sum-of-squares proof system. Before doing that, let us observe that the lemma indeed allows us to prove that Algorithm 6 works.

Theorem 9 (Higher-degree certificates imply exact recovery) Suppose there exists a higher-degree certificate T in the sense of Definition 7 for the set Ω ⊆ [n]³ and the vectors {u_i}_{i∈[r]}, {v_i}_{i∈[r]}, {w_i}_{i∈[r]}. Then, Algorithm 6 recovers the partially observed tensor exactly. In other words, if X = Σ_{i=1}^r λ_i u_i ⊗ v_i ⊗ w_i with λ₁, …, λ_r ≥ 0 and μ(x, y, z) is a degree-6 pseudo-distribution with μ ⊨ {‖y‖² = 1} that minimizes Ẽ_{(x,y,z)} ½(‖x‖² + ‖z‖²) subject to ( Ẽ_{(x,y,z)} x ⊗ y ⊗ z )_Ω = X_Ω, then Ẽ_{(x,y,z)} x ⊗ y ⊗ z = X.

Proof Consider the distribution over vectors (x, y, z) such that ((nλ_i)^{1/2}·u_i, v_i, (nλ_i)^{1/2}·w_i) has probability 1/n for each i ∈ [n] (where we set λ_i = 0 for i > r and extend {u_i}, {v_i}, {w_i} to orthonormal bases). By construction, Ẽ x ⊗ y ⊗ z = X. We have

Ẽ_{(x,y,z)} T(x, y, z) = Σ_{i=1}^r λ_i·T(u_i, v_i, w_i) = Σ_{i=1}^r λ_i = Ẽ_{(x,y,z)} ½(‖x‖² + ‖z‖²).

By Lemma 8 and the optimality of μ, it follows that

(1/100)·Ẽ_{(x,y,z)} [ Σ_{i=r+1}^n (⟨u_i, x⟩² + ⟨w_i, z⟩²) + Σ_{i=1}^n Σ_{j∈[n]∖{i}} ⟨v_i, y⟩²·(⟨u_j, x⟩² + ⟨w_j, z⟩²) ] = 0.

Since the summands on the left-hand side are squares, it follows that each summand has pseudo-expectation 0. It follows that Ẽ ⟨u_i, x⟩² = Ẽ ⟨w_i, z⟩² = 0 for all i > r, and Ẽ ⟨v_i, y⟩²⟨u_j, x⟩² = Ẽ ⟨v_i, y⟩²⟨w_j, z⟩² = 0 for all i ≠ j. By the Cauchy–Schwarz inequality for pseudo-expectations, it follows that Ẽ_{(x,y,z)} ⟨x ⊗ y ⊗ z, u_i ⊗ v_j ⊗ w_k⟩ = 0 unless i = j = k ∈ [r]. Consequently, Ẽ_{(x,y,z)} x ⊗ y ⊗ z is a linear combination of the vectors {u_i ⊗ v_i ⊗ w_i | i ∈ [r]}. Finally, the linear independence of the vectors {(u_i ⊗ v_i ⊗ w_i)_Ω | i ∈ [r]} implies that Ẽ x ⊗ y ⊗ z = X, as desired.

It remains to prove Lemma 8. Here it is convenient to use formal notation for sum-of-squares proofs. We will work with polynomials in ℝ[x, y, z] and the polynomial equation A = {‖y‖² = 1}.
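The feasibility claims about the planted distribution in the proof above are mechanical to verify; a numerical sketch, assuming numpy (the λ values are made up):

```python
import numpy as np

rng = np.random.default_rng(5)
n, r = 5, 2
# Full orthonormal bases; only the first r directions carry weight.
U, V, W = (np.linalg.qr(rng.standard_normal((n, n)))[0] for _ in range(3))
lam = np.zeros(n); lam[:2] = [0.7, 0.3]
X = np.einsum('i,ai,bi,ci->abc', lam, U, V, W)

# Atom i is ((n*lam_i)^(1/2) u_i, v_i, (n*lam_i)^(1/2) w_i), probability 1/n.
s = np.sqrt(n * lam)
third = sum((1 / n) * np.einsum('a,b,c->abc', s[i] * U[:, i], V[:, i], s[i] * W[:, i])
            for i in range(n))
obj = sum((1 / n) * 0.5 * (s[i]**2 * U[:, i] @ U[:, i] + s[i]**2 * W[:, i] @ W[:, i])
          for i in range(n))

print(np.allclose(third, X))       # -> True: third moments match X
print(np.isclose(obj, lam.sum()))  # -> True: objective equals sum_i lam_i
```

Each atom also has ‖y‖² = 1, so the distribution satisfies the constraint {‖y‖² = 1} exactly.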
For p ∈ ℝ[x, y, z], we say that there exists a degree-ℓ SOS proof that A implies p ≥ 0, denoted A ⊢_ℓ p ≥ 0, if there exists a polynomial q ∈ ℝ[x, y, z] of degree at most ℓ − 2 such that p + q·(1 − ‖y‖²) is a sum of squares of polynomials. This notion of proof allows us to reason about pseudo-distributions. In particular, if A ⊢_ℓ p ≥ 0, then every degree-ℓ pseudo-distribution μ with μ ⊨ A satisfies Ẽ_μ p ≥ 0.

We will change coordinates such that u_i = v_i = w_i = e_i is the i-th coordinate vector for every i ∈ [n]. Then, the conditions on T in Definition 7 imply that

⟨T, x ⊗ y ⊗ z⟩ = Σ_{i=1}^r x_i y_i z_i + T′(x, y, z),  (4.7)

where T′ is a 3-linear form with the property that T′(x, x, x) does not contain squares (i.e., it is multilinear). Furthermore, the conditions imply the following SOS proofs for T′:

1. ⊢₄ T′(x, y, z) ≤ ε·(‖x‖² + ‖y‖²‖z‖²),
2. ⊢₄ T′(x, y, z) ≤ ε·(‖y‖² + ‖x‖²‖z‖²),
3. ⊢₄ T′(x, y, z) ≤ ε·(‖z‖² + ‖x‖²‖y‖²).

The following lemma gives an upper bound on one of the parts in Eq. (4.7).

Lemma 10 For A = {‖y‖² = 1}, the following inequality has a degree-6 sum-of-squares proof:

A ⊢₆ Σ_{i=1}^r x_i y_i z_i ≤ ½(‖x‖² + ‖z‖²) − ¼·Σ_{i=r+1}^n (x_i² + z_i²) − ⅛·Σ_{i≠j} y_i²(x_j² + z_j²) − (1/16)·Σ_{i≠j} y_i² y_j² (‖x‖² + ‖z‖²).  (4.8)

Proof We bound the left-hand side in the lemma as follows,

A ⊢₆ Σ_{i=1}^r x_i y_i z_i ≤ ½·Σ_{i=1}^r (x_i² + y_i² z_i²)  (4.9)
  ≤ ½·( ‖x‖² − Σ_{i>r} x_i² + Σ_{i=1}^n y_i² z_i² ).  (4.10)

We can further bound Σ_i y_i² z_i² as follows,

A ⊢₆ Σ_{i=1}^n y_i² z_i² = ( Σ_{i=1}^n y_i² )·( Σ_{i=1}^n z_i² ) − Σ_{i≠j} y_i² z_j²  (4.11)
  = Σ_{i=1}^n z_i² − Σ_{i≠j} y_i² z_j².  (4.12)

We can prove a different bound on Σ_i y_i² z_i² as follows,

A ⊢₆ Σ_{i=1}^n y_i² z_i² ≤ ½·( ‖z‖² + Σ_{i=1}^n y_i⁴ z_i² )  (4.13)
  ≤ ½·( ‖z‖² + ( Σ_{i=1}^n y_i⁴ )·‖z‖² )  (4.14)
  = ½·( ‖z‖² + ( ( Σ_{i=1}^n y_i² )² − Σ_{i≠j} y_i² y_j² )·‖z‖² )  (4.15)
  = ‖z‖² − ½·Σ_{i≠j} y_i² y_j² ‖z‖².  (4.16)

By combining these three bounds, using (4.10) together with the average of (4.12) and (4.16), we obtain the inequality

A ⊢₆ Σ_{i=1}^r x_i y_i z_i ≤ ½‖x‖² + ½‖z‖² − ½·Σ_{i>r} x_i² − ¼·Σ_{i≠j} y_i² z_j² − ⅛·Σ_{i≠j} y_i² y_j² ‖z‖².

By symmetry between z and x, the same inequality holds with x and z exchanged. Combining these symmetric inequalities, we obtain the desired inequality

A ⊢₆ Σ_{i=1}^r x_i y_i z_i ≤ ½(‖x‖² + ‖z‖²) − ¼·Σ_{i=r+1}^n (x_i² + z_i²)
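Two elementary facts carry the proof above: the termwise AM–GM step in (4.9) and the exact rearrangement (4.11)–(4.12) under ‖y‖² = 1. Both can be checked numerically; a sketch assuming numpy:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 7
y = rng.standard_normal(n); y /= np.linalg.norm(y)  # enforce ||y||^2 = 1
z = rng.standard_normal(n)

# (4.11)-(4.12): sum_i y_i^2 z_i^2 = ||z||^2 - sum_{i != j} y_i^2 z_j^2.
diag = np.sum(y**2 * z**2)
off = np.sum(np.outer(y**2, z**2)) - diag           # sum over i != j
print(np.isclose(diag, z @ z - off))                # -> True

# (4.9), termwise AM-GM: x_i y_i z_i <= (x_i^2 + y_i^2 z_i^2) / 2.
x = rng.standard_normal(n)
print(np.all(x * y * z <= (x**2 + y**2 * z**2) / 2 + 1e-12))  # -> True
```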

− ⅛·Σ_{i≠j} y_i²(x_j² + z_j²) − (1/16)·Σ_{i≠j} y_i² y_j² (‖x‖² + ‖z‖²).  (4.17)

It remains to bound the second part in Eq. (4.7), which the following lemma achieves.

Lemma 11 A ⊢₆ T′(x, y, z) ≤ (3ε/2)·Σ_{i=1}^n Σ_{j≠i} ( y_i²(x_j² + z_j²) + y_i² y_j² (‖x‖² + ‖z‖²) ).

Proof It is enough to show the following inequality for all i ∈ [n],

A ⊢₆ y_i²·T′(x, y, z) ≤ (3ε/2)·Σ_{j≠i} ( y_i²(x_j² + z_j²) + y_i² y_j² (‖x‖² + ‖z‖²) );

summing these inequalities over i ∈ [n] and using ‖y‖² = 1 then yields the lemma. By symmetry it suffices to consider the case i = 1. Let x′ = x − x₁e₁, y′ = y − y₁e₁, and z′ = z − z₁e₁. We observe that

T′(x, y, z) = T′(x₁e₁ + x′, y₁e₁ + y′, z₁e₁ + z′) = T′(x₁e₁, y′, z′) + T′(x′, y₁e₁, z′) + T′(x′, y′, z₁e₁) + T′(x′, y′, z′),

where all terms with e₁ in two or more arguments vanish because T′ is multilinear. We now apply the following inequalities:

1. A ⊢₄ T′(x₁e₁, y′, z′) ≤ (ε/2)·( x₁²‖y′‖² + ‖z′‖² ) ≤ (ε/2)·Σ_{j≠1} ( y_j²‖x‖² + z_j² ),
2. A ⊢₄ T′(x′, y₁e₁, z′) ≤ (ε/2)·( ‖x′‖² y₁² + ‖z′‖² ) ≤ (ε/2)·Σ_{j≠1} ( x_j² + z_j² ),
3. A ⊢₄ T′(x′, y′, z₁e₁) ≤ (ε/2)·( z₁²‖y′‖² + ‖x′‖² ) ≤ (ε/2)·Σ_{j≠1} ( y_j²‖z‖² + x_j² ),
4. A ⊢₄ T′(x′, y′, z′) ≤ (ε/2)·( ‖x′‖²‖y′‖² + ‖z′‖² ) ≤ (ε/2)·Σ_{j≠1} ( x_j² + z_j² ).

Multiplying each bound by y₁² and adding them yields the claimed inequality for i = 1.

We can now prove Lemma 8.

Proof [Proof of Lemma 8] Taken together, Lemmas 10 and 11 imply

A ⊢₆ ⟨T, x ⊗ y ⊗ z⟩ ≤ ½(‖x‖² + ‖z‖²) − (¼ − O(ε))·Σ_{i=r+1}^n (x_i² + z_i²) − (⅛ − O(ε))·Σ_{i≠j} y_i²(x_j² + z_j²) − (1/16 − O(ε))·Σ_{i≠j} y_i² y_j² (‖x‖² + ‖z‖²),  (4.18)

where the absolute constant hidden by the O(·) notation is at most 10. Therefore, since ε ≤ 10⁻⁶ as assumed in Definition 7, we get an SOS proof of the inequality

A ⊢₆ ⟨T, x ⊗ y ⊗ z⟩ ≤ ½(‖x‖² + ‖z‖²) − (1/8)·Σ_{i=r+1}^n (x_i² + z_i²)

$$-\ \tfrac{1}{16}\sum_{i\neq j}\Big(y_i^2(x_j^2+z_j^2) + y_i^2 y_j^2(\|x\|^2+\|z\|^2)\Big). \tag{4.19}$$
This SOS proof implies that every degree-6 pseudo-distribution $\mu(x,y,z)$ with $\mu \models \mathcal{A}$ satisfies the desired inequality,
$$\tilde{\mathbb{E}}_{\mu(x,y,z)}\langle T,\, x\otimes y\otimes z\rangle \le 1 - \tfrac{1}{2}\tilde{\mathbb{E}}_{\mu}\|x-z\|^2 - \tfrac{1}{8}\tilde{\mathbb{E}}_{\mu}\sum_{i=r+1}^n (x_i^2+z_i^2) - \tfrac{1}{16}\tilde{\mathbb{E}}_{\mu}\sum_{i\neq j}\Big(y_i^2(x_j^2+z_j^2) + y_i^2 y_j^2(\|x\|^2+\|z\|^2)\Big). \tag{4.20}$$

4.3. Constructing the certificate T

In this section we give a procedure for constructing the certificate T. This construction is directly inspired by the construction of the dual certificate in (Gross, 2011; Recht, 2011), sometimes called quantum golfing. We will then prove that T satisfies all of the conditions for a higher-degree certificate of Ω. In Section 4.5 we will show that T also satisfies the conditions for a degree-4 certificate for Ω.

Let $\{u_i\}, \{v_i\}, \{w_i\} \subseteq \mathbb{R}^n$ be three orthonormal bases, with all vectors $\mu$-incoherent. Let $X = \sum_{i=1}^r u_i \otimes v_i \otimes w_i$. Let $\Omega \subseteq [n]^3$ be chosen at random such that each element is included independently with probability $m/n^3$ (so that $|\Omega|$ is tightly concentrated around $m$). Let $P$ be the projector onto the span of the vectors $u_i \otimes v_j \otimes w_k$ such that an index in $[r]$ appears at least twice in $(i,j,k)$ (i.e., at least one of the conditions $i = j \in [r]$, $i = k \in [r]$, $j = k \in [r]$ is satisfied). Let $R_\Omega$ be the linear operator on $\mathbb{R}^{n\times n\times n}$ that sets all entries outside of Ω to 0 (so that $(R_\Omega[T])_\Omega = R_\Omega[T]$) and is scaled such that $\mathbb{E}\,R_\Omega = \mathrm{Id}$. Let $\bar R_\Omega$ be $R_\Omega - \mathrm{Id}$.

Our goal is to construct $T \in \mathbb{R}^{n\times n\times n}$ such that $P[T] = X$, $(T)_\Omega = T$, and the spectral norm condition in Definition 3 is satisfied. The idea for constructing T is to start with $T = X$. Then, move to the closest point $T'$ that satisfies $(T')_\Omega = T'$. Then, move to the closest point $T''$ that satisfies $P[T''] = X$, and repeat. To implement this strategy, we define
$$T^{(k)} = \sum_{j=0}^{k-1} (-1)^j\, R_{\Omega_{j+1}} (P \bar R_{\Omega_j}) \cdots (P \bar R_{\Omega_1})[X], \tag{4.21}$$
where $\Omega_1, \ldots, \Omega_k$ are i.i.d. samples from the same distribution as Ω. By induction, we can show the following lemma about linear constraints that the constructed tensors $T^{(k)}$ satisfy.
Lemma 12 For every $k \ge 1$, the tensor $T^{(k)}$ satisfies $(T^{(k)})_\Omega = T^{(k)}$ and
$$P[T^{(k)}] + (-1)^k\, P (P \bar R_{\Omega_k}) \cdots (P \bar R_{\Omega_1})[X] = X.$$
Here, $P(P \bar R_{\Omega_k}) \cdots (P \bar R_{\Omega_1})[X]$ is an error term that decreases geometrically. In the parameter regime of Theorem 5, the norm of this term is $n^{-\omega(1)}$ for some $k = (\log n)^{O(1)}$.
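The shape of the recursion in Eq. (4.21) and the exact linear identity of Lemma 12 can be illustrated numerically. The following sketch is a toy of our own devising, run in the simpler matrix-completion setting: the projector $P$ is replaced by the tangent-space projector of a planted rank-$r$ matrix, and all parameter choices are arbitrary. The invariant $P[T] + N_k = X$ holds exactly at every round, while the norm of the uncorrected residual $N_k$ decays geometrically.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r, k, p = 40, 2, 8, 0.3  # size, rank, golfing rounds, sampling rate (toy choices)

# Planted low-rank matrix standing in for X (a 2-d analogue of the 3-tensor).
U = np.linalg.qr(rng.normal(size=(n, r)))[0]
V = np.linalg.qr(rng.normal(size=(n, r)))[0]
X = U @ V.T

def P(M):
    """Analogue of the projector P: here, the tangent space of X."""
    PU, PV = U @ U.T, V @ V.T
    return PU @ M + M @ PV - PU @ M @ PV

def R(M):
    """Sampling operator on a fresh random support, rescaled so E[R] = Id."""
    mask = rng.random(M.shape) < p
    return (mask / p) * M

# Quantum golfing: T accumulates sampled corrections, N is the residual error.
T = np.zeros((n, n))
N = X.copy()            # N_0 = X lies in the range of P
errors = []
for _ in range(k):
    S = R(N)            # correction supported on the fresh sample
    T += S
    N = P(N - S)        # N_{j+1} = P (Id - R_{Omega_{j+1}}) [N_j]
    errors.append(np.linalg.norm(N))

# Exact linear identity (the analogue of Lemma 12): P[T] + N_k = X.
identity_gap = np.linalg.norm(P(T) + N - X)
```

Note that `T` is supported on the union of the sampled entries by construction, which is the matrix analogue of the constraint $(T^{(k)})_\Omega = T^{(k)}$.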

The following lemma shows that it is possible to correct such small errors. (This lemma also implies that the linear independence condition in Definition 3 is satisfied with high probability. Therefore, we can ignore this condition in the following.)

Lemma 13 Suppose $m \ge rn \cdot (\log n)^{O(1)}$. Then, with probability $1 - n^{-\omega(1)}$ over the choice of Ω, the following holds: For every $E \in \mathbb{R}^{n\times n\times n}$ with $P[E] = E$, there exists $Y$ with $(Y)_\Omega = Y$ such that $P[Y] = E$ and $\|Y\|_F \le O(1)\cdot\|E\|_F$.

Proof Let $S \subseteq [n]^3$ be such that $P$ is the projector onto the span of the vectors $u_i \otimes v_j \otimes w_k$ with $(i,j,k) \in S$. By construction of $P$ we have $|S| \le 3rn$. In order to show the conclusion of the lemma it is enough to show that the vectors $(u_i \otimes v_j \otimes w_k)_\Omega$ with $(i,j,k) \in S$ are well-conditioned in the sense that the ratio of the largest and smallest singular value is $O(1)$. This fact follows from standard matrix concentration inequalities. See Lemma 15.

The main technical challenge is to show that the construction satisfies the condition that the following degree-4 polynomials are sums of squares (where $\hat T = T - X$):
$$\|x\|^2 + \|y\|^2\|z\|^2 - \tfrac{1}{\varepsilon}\langle \hat T,\, x\otimes y\otimes z\rangle, \tag{4.22}$$
$$\|y\|^2 + \|x\|^2\|z\|^2 - \tfrac{1}{\varepsilon}\langle \hat T,\, x\otimes y\otimes z\rangle, \tag{4.23}$$
$$\|z\|^2 + \|x\|^2\|y\|^2 - \tfrac{1}{\varepsilon}\langle \hat T,\, x\otimes y\otimes z\rangle. \tag{4.24}$$
We show how to prove the first statement; the other statements can be proved with symmetrical arguments. To prove the first statement, we decompose $\hat T$ into pieces of the form $(\bar R_{\Omega_l} P) \cdots (\bar R_{\Omega_1} P)\bar R_{\Omega_0}[X]$, $P'(\bar R_{\Omega_l} P) \cdots (\bar R_{\Omega_1} P)\bar R_{\Omega_0}[X]$ (where $P'$ is a part of $P$), or $E$. For each piece $A$, we prove a norm bound $\big\|\sum_a A_a A_a^\top\big\| \le B$. Since $\sum_a A_a A_a^\top$ represents the same polynomial as $\mathcal{A}^\top\mathcal{A}$ (writing $\mathcal{A}$ for the linear map with $\langle A,\, x\otimes y\otimes z\rangle = x^\top \mathcal{A}(y\otimes z)$), this proves that $B\|y\|^2\|z\|^2 - (y\otimes z)^\top \mathcal{A}^\top\mathcal{A}\,(y\otimes z)$ is a degree-4 sum of squares. Now note that
$$(y\otimes z)^\top \mathcal{A}^\top\mathcal{A}\,(y\otimes z) - \sqrt{B}\, x^\top \mathcal{A}(y\otimes z) - \sqrt{B}\,(y\otimes z)^\top \mathcal{A}^\top x + B\|x\|^2$$
is also a sum of squares. Combining these equations and scaling, we have that $\|x\|^2 + \|y\|^2\|z\|^2 - \tfrac{2}{\sqrt{B}}\, x^\top \mathcal{A}(y\otimes z)$ is a degree-4 sum of squares. Thus, it is sufficient to prove norm bounds on $\sum_a A_a A_a^\top$. We have an appropriate bound in the case when $A = E$ because $E$ has very small Frobenius norm.
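The scaling step above can be sanity-checked numerically. In this illustrative sketch (random data of our choosing, not a piece arising from the construction), $B$ is the norm of the Gram representation of a random tensor $A$, and the resulting inequality $\|x\|^2 + \|y\|^2\|z\|^2 \ge \tfrac{2}{\sqrt{B}}\, x^\top \mathcal{A}(y\otimes z)$ is verified at random points — a spot check of the polynomial inequality, not an SOS proof:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
A = rng.normal(size=(n, n, n))      # a stand-in tensor piece
Aflat = A.reshape(n, n * n)         # x^T Aflat (y ⊗ z) = <A, x ⊗ y ⊗ z>

# B bounds the Gram representation: sum_a vec(A_a) vec(A_a)^T = Aflat^T Aflat,
# whose spectral norm is the squared spectral norm of the flattening.
B = np.linalg.norm(Aflat, ord=2) ** 2

# Check ||x||^2 + ||y||^2 ||z||^2 - (2/sqrt(B)) x^T Aflat (y ⊗ z) >= 0.
worst = np.inf
for _ in range(200):
    x, y, z = rng.normal(size=(3, n))
    val = (x @ x + (y @ y) * (z @ z)
           - (2 / np.sqrt(B)) * (x @ Aflat @ np.kron(y, z)))
    worst = min(worst, val)
```

The nonnegativity is guaranteed by Cauchy–Schwarz, since $x^\top \mathcal{A}(y\otimes z) \le \sqrt{B}\,\|x\|\,\|y\|\,\|z\|$.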
For the cases when $A = (\bar R_{\Omega_l} P) \cdots (\bar R_{\Omega_1} P)\bar R_{\Omega_0}[X]$ or $A = P'(\bar R_{\Omega_l} P) \cdots (\bar R_{\Omega_1} P)\bar R_{\Omega_0}[X]$, we use the following theorem.

Theorem 14 Let $A = (\bar R_{\Omega_l} P) \cdots (\bar R_{\Omega_1} P)\bar R_{\Omega_0}[X]$ or $P'(\bar R_{\Omega_l} P) \cdots (\bar R_{\Omega_1} P)\bar R_{\Omega_0}[X]$, where $P'$ is a part of $P$. There is an absolute constant $C$ such that for any $\alpha > 1$ and $\beta > 0$,
$$\Pr\Big[\Big\|\sum_a A_a A_a^\top\Big\| > \alpha^{-(l+1)}\Big] < n^{-\beta}$$
as long as $m > C\alpha\beta^3 \sqrt{r}\, n^{1.5}\log n$ and $m > C\alpha\beta^2\, r n \log n$.

Proof This theorem follows directly from combining Proposition 29, Theorem 33, Theorem 42, and Theorem …

Final correction of error terms

In this section, we prove a spectral norm bound that allows us to correct error terms that are left at the end of the construction. The proof uses the by now standard matrix Bernstein concentration inequality. Similar proofs appear in the matrix completion literature (Gross, 2011; Recht, 2011).

Let $\{u_i\}, \{v_i\}, \{w_i\} \subseteq \mathbb{R}^n$ be three orthonormal bases, with all vectors $\mu$-incoherent. Let $\Omega \subseteq [n]^3$ be $m$ entries sampled uniformly at random with replacement. (This sampling model is different from what is used in the rest of the proof. However, it is well known that the models are equivalent in terms of the final recovery problem.)

Lemma 15 Let $S \subseteq [n]^3$. Suppose $m = |S|\,(\log n)^C$ for an absolute constant $C \ge 1$. Then with probability $1 - n^{-\omega(1)}$ over the choice of Ω, the vectors $(u_i \otimes v_j \otimes w_k)_\Omega$ for $(i,j,k) \in S$ are well-conditioned in the sense that the ratio between the largest and smallest singular value is at most 1.1.

Proof For $s = (i,j,k) \in S$, let $y_s = u_i \otimes v_j \otimes w_k$. Let $\Omega = \{\omega_1, \ldots, \omega_m\}$, where $\omega_1, \ldots, \omega_m \in [n]^3$ are sampled uniformly at random with replacement. Let $A$ be the $S$-by-$S$ Gram matrix of the vectors $(y_s)_\Omega$. Then, $A$ is the sum of identically distributed rank-1 matrices $A_i$, $A = \sum_{i=1}^m A_i$ with $(A_i)_{s,s'} = (y_s)_{\omega_i}(y_{s'})_{\omega_i}$. Each $A_i$ has expectation $\mathbb{E}\,A_i = n^{-3}\,\mathrm{Id}$ and spectral norm at most $\mu^6 |S|/n^3$. Standard matrix concentration inequalities (Tropp, 2012) show that $m \ge O(\mu^6 |S| \log n)$ is enough to ensure that the sum is spectrally close to its expectation $(m/n^3)\,\mathrm{Id}$ in the sense that $0.99\,A \preceq (m/n^3)\,\mathrm{Id} \preceq 1.1\,A$.

4.4. Degree-4 certificates imply exact recovery

In this section we prove Theorem 4. We need the following technical lemma, which we prove in Appendix A.

Lemma 16 Let $R$ be a self-adjoint linear operator on $\mathbb{R}^n \otimes \mathbb{R}^n$. Suppose
$$\langle v_j \otimes w_k,\, R(v_i \otimes w_i)\rangle = 0$$
for all indices $i, j, k \in [r]$ such that $i \in \{j, k\}$.
Then, there exists a self-adjoint linear operator $R'$ on $\mathbb{R}^n \otimes \mathbb{R}^n$ such that $R'(v_i \otimes w_i) = 0$ for all $i \in [r]$, the spectral norm of $R'$ satisfies $\|R'\| \le 10\|R\|$, and $R'$ represents the same polynomial in $\mathbb{R}[y, z]$,
$$\langle y \otimes z,\, R'(y \otimes z)\rangle = \langle y \otimes z,\, R(y \otimes z)\rangle.$$

We can now prove that certificates in the sense of Definition 3 imply that our algorithm successfully recovers the unknown tensor.

Proof [Proof of Theorem 4] Let $T$ be a certificate in the sense of Definition 3. Our goal is to construct a positive semidefinite matrix $M$ on $\mathbb{R}^n \oplus (\mathbb{R}^n \otimes \mathbb{R}^n)$ that represents the following polynomial,
$$\langle (x,\, y\otimes z),\, M (x,\, y\otimes z)\rangle = \|x\|^2 + \|y\|^2\|z\|^2 - 2\langle x,\, T(y\otimes z)\rangle.$$

Let $T_a$ be matrices such that $\langle x,\, T(y\otimes z)\rangle = \sum_a x_a\, T_a(y, z)$. Since
$$\|x\|^2 + \sum_{a=1}^n T_a(y,z)^2 - 2\langle x,\, T(y\otimes z)\rangle = \sum_{a=1}^n \big(x_a - T_a(y,z)\big)^2$$
is a sum of squares of polynomials, it will be enough to find a positive semidefinite matrix that represents the polynomial $\|y\|^2\|z\|^2 - \sum_{a=1}^n T_a(y,z)^2$. (This step is a polynomial version of the Schur complement condition for positive semidefiniteness.)

Let $R$ be the following linear operator (viewing each $T_a$ as a vector in $\mathbb{R}^n \otimes \mathbb{R}^n$),
$$R = \sum_{a=1}^n T_a T_a^\top - \sum_{i=1}^r (v_i \otimes w_i)(v_i \otimes w_i)^\top.$$

Lemma 17 $R$ satisfies the requirement of Lemma 16.

Proof Consider $\langle v_j \otimes w_k,\, R(v_j \otimes w_j)\rangle$. Since $v_j$ is repeated, the value of this expression will be the same if we replace $R$ by an $R_2$ which represents the same polynomial. Thus, we can replace $R$ by
$$R_2 = \sum_{a=1}^n T_a T_a^\top - \sum_{i=1}^r (v_i \otimes w_i)(v_i \otimes w_i)^\top = T^\top T - \sum_{i=1}^r (v_i \otimes w_i)(v_i \otimes w_i)^\top.$$
We now observe that
$$\langle v_j \otimes w_k,\, R_2(v_j \otimes w_j)\rangle = \langle v_j \otimes w_k,\, T^\top(u_j) - v_j \otimes w_j\rangle = 0.$$
By a symmetrical proof, $\langle v_j \otimes w_k,\, R(v_k \otimes w_k)\rangle = 0$ as well.

By Lemma 16, there exists a self-adjoint linear operator $R'$ that represents the same polynomial as $R$, has spectral norm $\|R'\| \le 10\|R\| \le 0.1$, and sends all vectors $v_i \otimes w_i$ to 0. Since $R'$ sends all vectors $v_i \otimes w_i$ to 0 and $\|R'\| \le 0.1$, the following matrix
$$R'' = \sum_{i=1}^r (v_i \otimes w_i)(v_i \otimes w_i)^\top + R'$$
has $r$ eigenvalues of value 1 (corresponding to the space spanned by the $v_i \otimes w_i$) and all other eigenvalues are at most 0.1 (because the non-zero eigenvalues of $R'$ have eigenvectors orthogonal to all $v_i \otimes w_i$). At the same time, $R''$ represents the following polynomial,
$$\langle y \otimes z,\, R''(y \otimes z)\rangle = \sum_{a=1}^n T_a(y,z)^2.$$
Let $P$ be a positive semidefinite matrix that represents the polynomial $\|x\|^2 + \sum_{a=1}^n T_a(y,z)^2 - 2\langle x,\, T(y\otimes z)\rangle$ (such a matrix exists because the polynomial is a sum of squares). We choose $M$ as follows,
$$M = \begin{pmatrix} \mathrm{Id} & -T \\ -T^\top & T^\top T \end{pmatrix} + \begin{pmatrix} 0 & 0 \\ 0 & \mathrm{Id} - R'' \end{pmatrix}.$$
Since $R'' \preceq \mathrm{Id}$, this matrix is positive semidefinite. Also, $M$ represents $\|x\|^2 + \|y\|^2\|z\|^2 - 2\langle x,\, T(y\otimes z)\rangle$. Since $u_i = T(v_i \otimes w_i)$ for all $i \in [r]$ and the kernel of $\mathrm{Id} - R''$ only contains the span of the $v_i \otimes w_i$, the kernel of $M$ is exactly the span of the vectors $(u_i,\, v_i \otimes w_i)$.
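In the idealized case where the certificate acts exactly as $T(v_i \otimes w_i) = u_i$ and the correction $R'$ vanishes, the construction of $M$ can be checked numerically. The following toy sketch (a random orthonormal instance of our choosing, not a certificate produced by the actual construction) verifies that $M$ is positive semidefinite and that its kernel is exactly the $r$-dimensional span of the vectors $(u_i,\, v_i \otimes w_i)$:

```python
import numpy as np

rng = np.random.default_rng(2)
n, r = 8, 3

# Random orthonormal families u_i, v_i, w_i.
U = np.linalg.qr(rng.normal(size=(n, r)))[0]
V = np.linalg.qr(rng.normal(size=(n, r)))[0]
W = np.linalg.qr(rng.normal(size=(n, r)))[0]

# Columns are the vectors v_i ⊗ w_i (orthonormal, since V and W are).
VW = np.stack([np.kron(V[:, i], W[:, i]) for i in range(r)], axis=1)
T = U @ VW.T                    # ideal certificate: T (v_i ⊗ w_i) = u_i

Rpp = VW @ VW.T                 # R'' with R' = 0: projector onto span(v_i ⊗ w_i)
Id_n, Id_nn = np.eye(n), np.eye(n * n)

# M = [[Id, -T], [-T^T, T^T T]] + [[0, 0], [0, Id - R'']]
M = np.block([[Id_n, -T],
              [-T.T, T.T @ T + Id_nn - Rpp]])

eigvals = np.linalg.eigvalsh(M)
kernel_dim = int(np.sum(eigvals < 1e-8))

# The claimed kernel vectors (u_i, v_i ⊗ w_i) are indeed annihilated.
K = np.vstack([U, VW])          # columns are (u_i, v_i ⊗ w_i)
kernel_residual = np.linalg.norm(M @ K)
```

The check that `kernel_dim == r` reflects the argument above: $M$ vanishes exactly on the span of the $(u_i, v_i \otimes w_i)$ and is bounded away from zero on its orthogonal complement.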
Next, we show that the above matrix $M$ implies that Algorithm 2 recovers the unknown tensor $X$. Recall that the algorithm, on input $X_\Omega$, finds a pseudo-distribution $\mu(x,y,z)$ so as to minimize $\tilde{\mathbb{E}}_\mu\big[\|x\|^2 + \|y\|^2\|z\|^2\big]$ such that $\tilde{\mathbb{E}}_\mu[x \otimes y \otimes z]_\Omega = X_\Omega$. Since everything is scale invariant, we may assume that $X = \sum_{i=1}^r \lambda_i\, u_i \otimes v_i \otimes w_i$ for $\lambda_1, \ldots, \lambda_r \ge 0$ and $\sum_i \lambda_i = 1$. Then, a valid pseudo-distribution would be the probability distribution over $(u_1, v_1, w_1), \ldots, (u_r, v_r, w_r)$ with probabilities

$\lambda_1, \ldots, \lambda_r$. Let $\mu$ be the pseudo-distribution computed by the algorithm. By optimality of $\mu$, we know that the objective value satisfies
$$\tilde{\mathbb{E}}_\mu\big[\|x\|^2 + \|y\|^2\|z\|^2\big] \le \sum_i \lambda_i\big(\|u_i\|^2 + \|v_i\|^2\|w_i\|^2\big) = 2.$$
Then, if we let $Y = \tilde{\mathbb{E}}_\mu\big[(x,\, y\otimes z)(x,\, y\otimes z)^\top\big]$,
$$0 \le \langle M, Y\rangle = \tilde{\mathbb{E}}_\mu\big[\|x\|^2 + \|y\|^2\|z\|^2 - 2\langle x,\, T(y\otimes z)\rangle\big] \le 2 - 2\,\tilde{\mathbb{E}}_\mu\langle x,\, T(y\otimes z)\rangle = 2 - 2\sum_i \lambda_i \langle u_i,\, T(v_i \otimes w_i)\rangle = 0.$$
The first step uses that $M$ and $Y$ are psd. The second step uses that $M$ represents the polynomial $\|x\|^2 + \|y\|^2\|z\|^2 - 2\langle x,\, T(y\otimes z)\rangle$. The third step uses that $\mu$ minimizes the objective function. The fourth step uses that the entries of $T$ are 0 outside of Ω and that $\mu$ matches the observations, $\tilde{\mathbb{E}}_\mu[x \otimes y \otimes z]_\Omega = X_\Omega$. The last step uses that $u_i = T(v_i \otimes w_i)$ for all $i \in [r]$.

We conclude that $\langle M, Y\rangle = 0$, which means that the range of $Y$ is contained in the kernel of $M$. Therefore, $Y = \sum_{i,j=1}^r \gamma_{i,j}\,(u_i,\, v_i \otimes w_i)(u_j,\, v_j \otimes w_j)^\top$ for scalars $\{\gamma_{i,j}\}$. We claim that the multipliers must satisfy $\gamma_{i,i} = \lambda_i$ and $\gamma_{i,j} = 0$ for all $i \neq j \in [r]$. Indeed, since $\mu$ matches the observations in Ω,
$$0 = \sum_{i,j=1}^r \big(\lambda_i \delta_{ij} - \gamma_{i,j}\big)\,\big(u_i \otimes v_j \otimes w_j\big)_\Omega.$$
Since the vectors $(u_i \otimes v_j \otimes w_j)_\Omega$ are linearly independent, we conclude that $\gamma_{i,j} = \lambda_i \delta_{ij}$ as desired. (This linear independence was one of the requirements of the certificate in Definition 3.)

4.5. Degree-4 certificates exist with high probability

In this section we show that our certificate $T$ in fact satisfies the conditions for a degree-4 certificate, proving Theorem 5. We use the same construction as in Section 4.3. The main remaining technical challenge for Theorem 5 is to show that the construction satisfies the spectral norm condition of Definition 3. This spectral norm bound follows from the following theorem, which we give a proof sketch for in Appendix B.

Theorem 18 Let $A = (\bar R_{\Omega_l} P) \cdots (\bar R_{\Omega_1} P)\bar R_{\Omega_0}[X]$ or $P'(\bar R_{\Omega_l} P) \cdots (\bar R_{\Omega_1} P)\bar R_{\Omega_0}[X]$, and let $B = (\bar R_{\Omega_{l'}} P) \cdots (\bar R_{\Omega_1} P)\bar R_{\Omega_0}[X]$ or $P'(\bar R_{\Omega_{l'}} P) \cdots (\bar R_{\Omega_1} P)\bar R_{\Omega_0}[X]$. There is an absolute constant $C$ such that for any $\alpha > 1$ and $\beta > 0$,
$$\Pr\Big[\Big\|\sum_a A_a B_a^\top\Big\| > \alpha^{-(l+l'+2)}\Big] < n^{-\beta}$$
as long as $m > C\alpha\beta^3 \sqrt{r}\, n^{1.5}\log n$ and $m > C\alpha\beta^2\, r n \log n$.
Remark 19 If it were true in general that
$$\Big\|\sum_a A_a B_a^\top\Big\| \le \Big\|\sum_a A_a A_a^\top\Big\| \cdot \Big\|\sum_a B_a B_a^\top\Big\|,$$
then it would be sufficient to use Theorem 14 and we would not need to prove Theorem 18. Unfortunately, this is not true in general.
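A minimal concrete counterexample (our illustration, not from the paper): a single small slice already violates the hoped-for inequality, because the left-hand side scales linearly under rescaling of the slices while the right-hand side scales quadratically.

```python
import numpy as np

# One slice pair suffices: A_1 = B_1 = 0.01 * Id.
A1 = B1 = 0.01 * np.eye(2)

lhs = np.linalg.norm(A1 @ B1.T, ord=2)        # ||sum_a A_a B_a^T|| ≈ 1e-4
rhs = (np.linalg.norm(A1 @ A1.T, ord=2)
       * np.linalg.norm(B1 @ B1.T, ord=2))    # ≈ 1e-4 * 1e-4 = 1e-8

# The hoped-for inequality lhs <= rhs fails badly here.
```

By contrast, the square-root version $\|\sum_a A_a B_a^\top\| \le \|\sum_a A_a A_a^\top\|^{1/2}\,\|\sum_a B_a B_a^\top\|^{1/2}$ is scale-consistent; the difficulty addressed by Theorem 18 is precisely that the product (without square roots) is the bound the geometric decay argument needs.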

That said, it may be possible to show that even if we do not know directly that $\|\sum_a A_a B_a^\top\|$ is small, since $\|\sum_a A_a A_a^\top\|$ and $\|\sum_a B_a B_a^\top\|$ are both small there must be some alternative matrix representation of $\sum_a A_a B_a^\top$ which has small norm, and this is sufficient. We leave it as an open problem whether this can be done.

We now have all ingredients to prove Theorem 5.

Proof [Proof of Theorem 5] Let $k = (\log n)^C$ for some absolute constant $C \ge 1$. Let $E = (-1)^k P(P \bar R_{\Omega_k}) \cdots (P \bar R_{\Omega_1})[X]$. By Lemma 13 there exists $Y$ with $(Y)_\Omega = Y$ and $P[Y] = E$ such that $\|Y\|_F \le O(1)\cdot\|E\|_F$. We let $T = T^{(k)} + Y$. This tensor satisfies the desired linear constraints $(T)_\Omega = T$ and $P[T] = X$. Since $E$ has the form of the matrices in Theorem 18, the bound in Theorem 18 implies $\|E\|_F \le 2^{-k}\, n^{10} \le n^{-C+10}$. (Here, we use that the norm in the conclusion of Theorem 18 is within a factor of $n^{10}$ of the Frobenius norm.)

We are to prove that the following matrix has spectral norm bounded by 0.01,
$$\sum_{a=1}^n (T)_a (T)_a^\top - \sum_{a=1}^n X_a X_a^\top.$$
We expand the sum according to the definition of $T^{(k)}$ in Eq. (4.21). Then, most terms that appear in the expansion have the form as in Theorem 18. Since those terms decrease geometrically, we can bound their total contribution by a small constant with probability $1 - n^{-\omega(1)}$. The terms that involve the error correction $Y$ are smaller, because $Y$ has polynomially small norm $\|Y\|_F \le n^{-C+10}$. The only remaining terms are cross terms between $X$ and a tensor of the form as in Theorem 18. We can bound the total contribution of these terms by a small constant as well, using Theorem 84.

5. Matrix norm bound techniques

In this section, we describe the techniques that we will use to prove probabilistic norm bounds on matrices of the form $Y = \sum_a (\bar R_\Omega A)_a (\bar R_\Omega A)_a^\top$. We will prove these norm bounds using the trace moment method, which obtains probabilistic bounds on the norm of a matrix $Y$ from bounds on the expected value of $\operatorname{tr}\big((YY^\top)^q\big)$ for sufficiently large $q$.
This will require analyzing $\operatorname{tr}\big((YY^\top)^q\big)$, which will take the form of a sum of products, where the terms in the product are either entries of $A$ or terms of the form $\bar R_\Omega(a,b,c)$, where $\bar R_\Omega(a,b,c) = \frac{n^3}{m} - 1$ if $(a,b,c) \in \Omega$ and $-1$ otherwise. To analyze $\operatorname{tr}\big((YY^\top)^q\big)$, we will group products together which have the same expected behavior on the $\bar R_\Omega(a,b,c)$ terms, forming smaller sums of products. For each of these sums, we can then use the same bound on the expected behavior of the $\bar R_\Omega(a,b,c)$ terms for each product in the sum. This allows us to move this bound outside of the sum, leaving us with a sum of products of entries of $A$. We will then bound the value of these sums by carefully choosing the order in which we sum over the indices.

In the remainder of this section and in the next two sections, we allow for our tensors to have asymmetric dimensions. We account for this with the following definitions.

Definition 20 We define $n_1$ to be the dimension of the $u$ vectors, $n_2$ to be the dimension of the $v$ vectors, and $n_3$ to be the dimension of the $w$ vectors. We define $n_{\max} = \max\{n_1, n_2, n_3\}$.

5.1. The trace moment method

We use the trace moment method through the following proposition and corollary.

Proposition 21 For any random matrix $Y$, for any integer $q \ge 1$ and any $\varepsilon > 0$,
$$\Pr\bigg[\|Y\| > \Big(\tfrac{1}{\varepsilon}\,\mathbb{E}\operatorname{tr}\big((YY^\top)^q\big)\Big)^{1/2q}\bigg] < \varepsilon.$$

Proof By Markov's inequality, for all integers $q \ge 1$ and all $\varepsilon > 0$,
$$\Pr\Big[\operatorname{tr}\big((YY^\top)^q\big) > \tfrac{1}{\varepsilon}\,\mathbb{E}\operatorname{tr}\big((YY^\top)^q\big)\Big] < \varepsilon.$$
The result now follows from the observation that if $\|Y\| > \big(\tfrac{1}{\varepsilon}\mathbb{E}[\operatorname{tr}((YY^\top)^q)]\big)^{1/2q}$ then $\operatorname{tr}\big((YY^\top)^q\big) \ge \|Y\|^{2q} > \tfrac{1}{\varepsilon}\mathbb{E}\big[\operatorname{tr}((YY^\top)^q)\big]$.

Corollary 22 For a given $p \ge 1$, $r \ge 0$, $n > 0$, and $B > 0$, for a random matrix $Y$, if $\mathbb{E}\operatorname{tr}\big((YY^\top)^q\big) \le (q^p B)^{2q}\, n^r$ for all integers $q > 1$, then for all $\beta > 0$,
$$\Pr\bigg[\|Y\| > B e^p \Big(\frac{(r+\beta)}{2p}\ln n + 1\Big)^p\bigg] < n^{-\beta}.$$

Proof We take $\varepsilon = n^{-\beta}$ and we choose $q$ to minimize
$$\Big(\tfrac{1}{\varepsilon}\,\mathbb{E}\operatorname{tr}\big((YY^\top)^q\big)\Big)^{1/2q} \le q^p B\, n^{\frac{r+\beta}{2q}}.$$
Setting the derivative of this expression to 0, we obtain that $\big(\tfrac{p}{q} - \tfrac{(r+\beta)\ln n}{2q^2}\big)\, q^p B\, n^{\frac{r+\beta}{2q}} = 0$, so we want $q = \tfrac{r+\beta}{2p}\ln n$. However, $q$ must be an integer, so we instead take $q = \big\lceil \tfrac{r+\beta}{2p}\ln n \big\rceil$. With this $q$, we have that
$$q^p B\, n^{\frac{r+\beta}{2q}} \le B\Big(\frac{r+\beta}{2p}\ln n + 1\Big)^p e^p.$$
Applying Proposition 21 with this $q$, we obtain that
$$\Pr\bigg[\|Y\| > B e^p\Big(\frac{(r+\beta)}{2p}\ln n + 1\Big)^p\bigg] \le \Pr\bigg[\|Y\| > \Big(\tfrac{1}{\varepsilon}\,\mathbb{E}\operatorname{tr}\big((YY^\top)^q\big)\Big)^{1/2q}\bigg] < n^{-\beta}.$$

5.2. Partitioning by intersection pattern

As discussed at the beginning of the section, $\mathbb{E}\operatorname{tr}\big((YY^\top)^q\big)$ will be a sum of products, where part of these products will be of the form $\prod_{i=1}^{2q'} \bar R_\Omega(a_i, b_i, c_i)$. Here, $q'$ may or may not be equal to $q$; in fact we will often have $q' = 2q$ because each $Y$ will contribute two terms of the form $\bar R_\Omega(a,b,c)$ to the product. To handle this part of the product, we partition the terms of our sum based on the intersection pattern of which triples $(a_i, b_i, c_i)$ are equal to each other. Fixing an intersection pattern determines the expected value of $\prod_{i=1}^{2q'} \bar R_\Omega(a_i, b_i, c_i)$.
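The deterministic relationship behind Proposition 21 — the quantity $\operatorname{tr}\big((YY^\top)^q\big)^{1/2q}$ always upper-bounds $\|Y\|$, and exceeds it by at most the dimension factor $n^{1/2q}$ — can be checked directly. The following sketch (an illustration with an arbitrary random matrix, evaluating the trace power via the eigenvalues of $YY^\top$) confirms both sides:

```python
import numpy as np

rng = np.random.default_rng(3)
n, q = 50, 6
Y = rng.normal(size=(n, n))

spec_norm = np.linalg.norm(Y, ord=2)

# tr((Y Y^T)^q) = sum_i lambda_i^q over the eigenvalues lambda_i of Y Y^T.
lam = np.linalg.eigvalsh(Y @ Y.T)
trace_moment = np.sum(lam ** q) ** (1.0 / (2 * q))

# Since ||Y||^{2q} <= tr((YY^T)^q) <= n ||Y||^{2q}, we have
#   ||Y|| <= trace_moment <= n^{1/2q} ||Y||.
```

As $q$ grows, the factor $n^{1/2q}$ tends to 1, which is why high trace moments pin down the spectral norm; Corollary 22 simply balances this gain against the growth of the moment bound $(q^p B)^{2q} n^r$.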

Definition 23 We define an intersection pattern to be a set of equalities and inequalities satisfying the following conditions:
1. All of the equalities and inequalities are of the form $(a_{i_1}, b_{i_1}, c_{i_1}) = (a_{i_2}, b_{i_2}, c_{i_2})$ or $(a_{i_1}, b_{i_1}, c_{i_1}) \neq (a_{i_2}, b_{i_2}, c_{i_2})$, respectively.
2. For every $i_1, i_2$, either $(a_{i_1}, b_{i_1}, c_{i_1}) = (a_{i_2}, b_{i_2}, c_{i_2})$ is in the intersection pattern or $(a_{i_1}, b_{i_1}, c_{i_1}) \neq (a_{i_2}, b_{i_2}, c_{i_2})$ is in the intersection pattern.
3. All of the equalities and inequalities are consistent with each other, i.e. there exist values of $(a_1, b_1, c_1), \ldots, (a_{2q'}, b_{2q'}, c_{2q'})$ satisfying all of the equalities and inequalities in the intersection pattern.

Proposition 24 For a given $(a,b,c)$,
1. $\mathbb{E}\,\bar R_\Omega(a,b,c) = 0$.
2. For all $k > 1$, $\mathbb{E}\big[\bar R_\Omega(a,b,c)^k\big] \le \big(\tfrac{n_1 n_2 n_3}{m}\big)^{k-1}$.

Corollary 25 For a given intersection pattern, if there is any triple $(a,b,c)$ which appears exactly once, then $\mathbb{E}\big[\prod_{i=1}^{2q'} \bar R_\Omega(a_i, b_i, c_i)\big] = 0$. Otherwise, letting $z$ be the number of distinct triples,
$$\mathbb{E}\Big[\prod_{i=1}^{2q'} \bar R_\Omega(a_i, b_i, c_i)\Big] \le \Big(\frac{n_1 n_2 n_3}{m}\Big)^{2q' - z}.$$

Proof For a given intersection pattern, let $(a_{i_1}, b_{i_1}, c_{i_1}), \ldots, (a_{i_z}, b_{i_z}, c_{i_z})$ be the distinct triples and let $c_j$ be the number of times the triple $(a_{i_j}, b_{i_j}, c_{i_j})$ appears. We have that
$$\mathbb{E}\prod_{i=1}^{2q'} \bar R_\Omega(a_i, b_i, c_i) = \prod_{j=1}^z \mathbb{E}\big[\bar R_\Omega(a_{i_j}, b_{i_j}, c_{i_j})^{c_j}\big].$$
If $c_j = 1$ for any $j$ then this expression is 0. Otherwise,
$$\prod_{j=1}^z \mathbb{E}\big[\bar R_\Omega(a_{i_j}, b_{i_j}, c_{i_j})^{c_j}\big] \le \prod_{j=1}^z \Big(\frac{n_1 n_2 n_3}{m}\Big)^{c_j - 1} = \Big(\frac{n_1 n_2 n_3}{m}\Big)^{\sum_{j=1}^z c_j - z} = \Big(\frac{n_1 n_2 n_3}{m}\Big)^{2q' - z}.$$

5.3. Bounding sums of products of tensor entries

In this subsection, we describe how to bound the sum of products of tensor entries we obtain for a given intersection pattern after moving our bound on the expected value of the $\bar R_\Omega(a,b,c)$ terms outside the sum. We represent such a product with a hypergraph as follows.
Definition 26 Given a set of distinct indices and a set of tensor entries on those indices, let $H$ be the hypergraph with one vertex for each distinct index and one hyperedge for each tensor entry, where the hyperedge consists of all indices contained in the tensor entry. If the tensor entry appears to the $p$th power, we take this hyperedge with multiplicity $p$.
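The moment bounds in Proposition 24 are elementary to verify exactly, since $\bar R_\Omega(a,b,c)$ takes only two values: $\tfrac{N}{m} - 1$ with probability $\tfrac{m}{N}$ and $-1$ otherwise, where $N = n_1 n_2 n_3$. The following sketch (with toy parameter values of our choosing) computes the moments over this two-point distribution and checks them against the stated bounds:

```python
n1 = n2 = n3 = 10        # toy dimensions (here equal)
N = n1 * n2 * n3         # total number of tensor entries
m = 50                   # expected number of observations

p = m / N                # inclusion probability of a single entry

def moment(k):
    """E[Rbar^k] over the two-point distribution of Rbar_Omega(a, b, c)."""
    return p * (N / m - 1) ** k + (1 - p) * (-1) ** k

# Proposition 24: mean zero, and k-th moments bounded by (N/m)^(k-1).
mean = moment(1)
bounds_ok = all(moment(k) <= (N / m) ** (k - 1) for k in range(2, 8))
```

The mean-zero property is what makes single-occurrence triples vanish in Corollary 25, and the $(N/m)^{k-1}$ moment growth is what each repeated triple contributes after the factorization over distinct triples.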


More information

Intelligent Systems: Reasoning and Recognition. Artificial Neural Networks

Intelligent Systems: Reasoning and Recognition. Artificial Neural Networks Intelligent Systes: Reasoning and Recognition Jaes L. Crowley MOSIG M1 Winter Seester 2018 Lesson 7 1 March 2018 Outline Artificial Neural Networks Notation...2 Introduction...3 Key Equations... 3 Artificial

More information

Estimating Entropy and Entropy Norm on Data Streams

Estimating Entropy and Entropy Norm on Data Streams Estiating Entropy and Entropy Nor on Data Streas Ait Chakrabarti 1, Khanh Do Ba 1, and S. Muthukrishnan 2 1 Departent of Coputer Science, Dartouth College, Hanover, NH 03755, USA 2 Departent of Coputer

More information

Consistent Multiclass Algorithms for Complex Performance Measures. Supplementary Material

Consistent Multiclass Algorithms for Complex Performance Measures. Supplementary Material Consistent Multiclass Algoriths for Coplex Perforance Measures Suppleentary Material Notations. Let λ be the base easure over n given by the unifor rando variable (say U over n. Hence, for all easurable

More information

The Transactional Nature of Quantum Information

The Transactional Nature of Quantum Information The Transactional Nature of Quantu Inforation Subhash Kak Departent of Coputer Science Oklahoa State University Stillwater, OK 7478 ABSTRACT Inforation, in its counications sense, is a transactional property.

More information

Distributed Subgradient Methods for Multi-agent Optimization

Distributed Subgradient Methods for Multi-agent Optimization 1 Distributed Subgradient Methods for Multi-agent Optiization Angelia Nedić and Asuan Ozdaglar October 29, 2007 Abstract We study a distributed coputation odel for optiizing a su of convex objective functions

More information

Testing Properties of Collections of Distributions

Testing Properties of Collections of Distributions Testing Properties of Collections of Distributions Reut Levi Dana Ron Ronitt Rubinfeld April 9, 0 Abstract We propose a fraework for studying property testing of collections of distributions, where the

More information

Support Vector Machines. Goals for the lecture

Support Vector Machines. Goals for the lecture Support Vector Machines Mark Craven and David Page Coputer Sciences 760 Spring 2018 www.biostat.wisc.edu/~craven/cs760/ Soe of the slides in these lectures have been adapted/borrowed fro aterials developed

More information

Reed-Muller codes for random erasures and errors

Reed-Muller codes for random erasures and errors Reed-Muller codes for rando erasures and errors Eanuel Abbe Air Shpilka Avi Wigderson Abstract This paper studies the paraeters for which Reed-Muller (RM) codes over GF (2) can correct rando erasures and

More information

ADVANCES ON THE BESSIS- MOUSSA-VILLANI TRACE CONJECTURE

ADVANCES ON THE BESSIS- MOUSSA-VILLANI TRACE CONJECTURE ADVANCES ON THE BESSIS- MOUSSA-VILLANI TRACE CONJECTURE CHRISTOPHER J. HILLAR Abstract. A long-standing conjecture asserts that the polynoial p(t = Tr(A + tb ] has nonnegative coefficients whenever is

More information

1 Generalization bounds based on Rademacher complexity

1 Generalization bounds based on Rademacher complexity COS 5: Theoretical Machine Learning Lecturer: Rob Schapire Lecture #0 Scribe: Suqi Liu March 07, 08 Last tie we started proving this very general result about how quickly the epirical average converges

More information

On Conditions for Linearity of Optimal Estimation

On Conditions for Linearity of Optimal Estimation On Conditions for Linearity of Optial Estiation Erah Akyol, Kuar Viswanatha and Kenneth Rose {eakyol, kuar, rose}@ece.ucsb.edu Departent of Electrical and Coputer Engineering University of California at

More information

arxiv: v1 [cs.ds] 3 Feb 2014

arxiv: v1 [cs.ds] 3 Feb 2014 arxiv:40.043v [cs.ds] 3 Feb 04 A Bound on the Expected Optiality of Rando Feasible Solutions to Cobinatorial Optiization Probles Evan A. Sultani The Johns Hopins University APL evan@sultani.co http://www.sultani.co/

More information

Compressive Distilled Sensing: Sparse Recovery Using Adaptivity in Compressive Measurements

Compressive Distilled Sensing: Sparse Recovery Using Adaptivity in Compressive Measurements 1 Copressive Distilled Sensing: Sparse Recovery Using Adaptivity in Copressive Measureents Jarvis D. Haupt 1 Richard G. Baraniuk 1 Rui M. Castro 2 and Robert D. Nowak 3 1 Dept. of Electrical and Coputer

More information

The Hilbert Schmidt version of the commutator theorem for zero trace matrices

The Hilbert Schmidt version of the commutator theorem for zero trace matrices The Hilbert Schidt version of the coutator theore for zero trace atrices Oer Angel Gideon Schechtan March 205 Abstract Let A be a coplex atrix with zero trace. Then there are atrices B and C such that

More information

M ath. Res. Lett. 15 (2008), no. 2, c International Press 2008 SUM-PRODUCT ESTIMATES VIA DIRECTED EXPANDERS. Van H. Vu. 1.

M ath. Res. Lett. 15 (2008), no. 2, c International Press 2008 SUM-PRODUCT ESTIMATES VIA DIRECTED EXPANDERS. Van H. Vu. 1. M ath. Res. Lett. 15 (2008), no. 2, 375 388 c International Press 2008 SUM-PRODUCT ESTIMATES VIA DIRECTED EXPANDERS Van H. Vu Abstract. Let F q be a finite field of order q and P be a polynoial in F q[x

More information

Randomized Recovery for Boolean Compressed Sensing

Randomized Recovery for Boolean Compressed Sensing Randoized Recovery for Boolean Copressed Sensing Mitra Fatei and Martin Vetterli Laboratory of Audiovisual Counication École Polytechnique Fédéral de Lausanne (EPFL) Eail: {itra.fatei, artin.vetterli}@epfl.ch

More information

G G G G G. Spec k G. G Spec k G G. G G m G. G Spec k. Spec k

G G G G G. Spec k G. G Spec k G G. G G m G. G Spec k. Spec k 12 VICTORIA HOSKINS 3. Algebraic group actions and quotients In this section we consider group actions on algebraic varieties and also describe what type of quotients we would like to have for such group

More information

Learnability and Stability in the General Learning Setting

Learnability and Stability in the General Learning Setting Learnability and Stability in the General Learning Setting Shai Shalev-Shwartz TTI-Chicago shai@tti-c.org Ohad Shair The Hebrew University ohadsh@cs.huji.ac.il Nathan Srebro TTI-Chicago nati@uchicago.edu

More information

An l 1 Regularized Method for Numerical Differentiation Using Empirical Eigenfunctions

An l 1 Regularized Method for Numerical Differentiation Using Empirical Eigenfunctions Journal of Matheatical Research with Applications Jul., 207, Vol. 37, No. 4, pp. 496 504 DOI:0.3770/j.issn:2095-265.207.04.0 Http://jre.dlut.edu.cn An l Regularized Method for Nuerical Differentiation

More information

A note on the multiplication of sparse matrices

A note on the multiplication of sparse matrices Cent. Eur. J. Cop. Sci. 41) 2014 1-11 DOI: 10.2478/s13537-014-0201-x Central European Journal of Coputer Science A note on the ultiplication of sparse atrices Research Article Keivan Borna 12, Sohrab Aboozarkhani

More information

Non-Parametric Non-Line-of-Sight Identification 1

Non-Parametric Non-Line-of-Sight Identification 1 Non-Paraetric Non-Line-of-Sight Identification Sinan Gezici, Hisashi Kobayashi and H. Vincent Poor Departent of Electrical Engineering School of Engineering and Applied Science Princeton University, Princeton,

More information

Tail estimates for norms of sums of log-concave random vectors

Tail estimates for norms of sums of log-concave random vectors Tail estiates for nors of sus of log-concave rando vectors Rados law Adaczak Rafa l Lata la Alexander E. Litvak Alain Pajor Nicole Toczak-Jaegerann Abstract We establish new tail estiates for order statistics

More information

Asynchronous Gossip Algorithms for Stochastic Optimization

Asynchronous Gossip Algorithms for Stochastic Optimization Asynchronous Gossip Algoriths for Stochastic Optiization S. Sundhar Ra ECE Dept. University of Illinois Urbana, IL 680 ssrini@illinois.edu A. Nedić IESE Dept. University of Illinois Urbana, IL 680 angelia@illinois.edu

More information

Model Fitting. CURM Background Material, Fall 2014 Dr. Doreen De Leon

Model Fitting. CURM Background Material, Fall 2014 Dr. Doreen De Leon Model Fitting CURM Background Material, Fall 014 Dr. Doreen De Leon 1 Introduction Given a set of data points, we often want to fit a selected odel or type to the data (e.g., we suspect an exponential

More information

a a a a a a a m a b a b

a a a a a a a m a b a b Algebra / Trig Final Exa Study Guide (Fall Seester) Moncada/Dunphy Inforation About the Final Exa The final exa is cuulative, covering Appendix A (A.1-A.5) and Chapter 1. All probles will be ultiple choice

More information

A Theoretical Analysis of a Warm Start Technique

A Theoretical Analysis of a Warm Start Technique A Theoretical Analysis of a War Start Technique Martin A. Zinkevich Yahoo! Labs 701 First Avenue Sunnyvale, CA Abstract Batch gradient descent looks at every data point for every step, which is wasteful

More information

Optimal Jamming Over Additive Noise: Vector Source-Channel Case

Optimal Jamming Over Additive Noise: Vector Source-Channel Case Fifty-first Annual Allerton Conference Allerton House, UIUC, Illinois, USA October 2-3, 2013 Optial Jaing Over Additive Noise: Vector Source-Channel Case Erah Akyol and Kenneth Rose Abstract This paper

More information

This model assumes that the probability of a gap has size i is proportional to 1/i. i.e., i log m e. j=1. E[gap size] = i P r(i) = N f t.

This model assumes that the probability of a gap has size i is proportional to 1/i. i.e., i log m e. j=1. E[gap size] = i P r(i) = N f t. CS 493: Algoriths for Massive Data Sets Feb 2, 2002 Local Models, Bloo Filter Scribe: Qin Lv Local Models In global odels, every inverted file entry is copressed with the sae odel. This work wells when

More information

1 Rademacher Complexity Bounds

1 Rademacher Complexity Bounds COS 511: Theoretical Machine Learning Lecturer: Rob Schapire Lecture #10 Scribe: Max Goer March 07, 2013 1 Radeacher Coplexity Bounds Recall the following theore fro last lecture: Theore 1. With probability

More information

Computational and Statistical Learning Theory

Computational and Statistical Learning Theory Coputational and Statistical Learning Theory Proble sets 5 and 6 Due: Noveber th Please send your solutions to learning-subissions@ttic.edu Notations/Definitions Recall the definition of saple based Radeacher

More information

arxiv: v1 [math.na] 10 Oct 2016

arxiv: v1 [math.na] 10 Oct 2016 GREEDY GAUSS-NEWTON ALGORITHM FOR FINDING SPARSE SOLUTIONS TO NONLINEAR UNDERDETERMINED SYSTEMS OF EQUATIONS MÅRTEN GULLIKSSON AND ANNA OLEYNIK arxiv:6.395v [ath.na] Oct 26 Abstract. We consider the proble

More information

Ch 12: Variations on Backpropagation

Ch 12: Variations on Backpropagation Ch 2: Variations on Backpropagation The basic backpropagation algorith is too slow for ost practical applications. It ay take days or weeks of coputer tie. We deonstrate why the backpropagation algorith

More information

Robust Principal Component Analysis: Exact Recovery of Corrupted Low-Rank Matrices via Convex Optimization

Robust Principal Component Analysis: Exact Recovery of Corrupted Low-Rank Matrices via Convex Optimization Robust Principal Coponent Analysis: Exact Recovery of Corrupted Low-Rank Matrices via Convex Optiization John Wright, Arvind Ganesh, Shankar Rao, and Yi Ma Departent of Electrical Engineering University

More information

A1. Find all ordered pairs (a, b) of positive integers for which 1 a + 1 b = 3

A1. Find all ordered pairs (a, b) of positive integers for which 1 a + 1 b = 3 A. Find all ordered pairs a, b) of positive integers for which a + b = 3 08. Answer. The six ordered pairs are 009, 08), 08, 009), 009 337, 674) = 35043, 674), 009 346, 673) = 3584, 673), 674, 009 337)

More information

Tight Information-Theoretic Lower Bounds for Welfare Maximization in Combinatorial Auctions

Tight Information-Theoretic Lower Bounds for Welfare Maximization in Combinatorial Auctions Tight Inforation-Theoretic Lower Bounds for Welfare Maxiization in Cobinatorial Auctions Vahab Mirrokni Jan Vondrák Theory Group, Microsoft Dept of Matheatics Research Princeton University Redond, WA 9805

More information

Boosting with log-loss

Boosting with log-loss Boosting with log-loss Marco Cusuano-Towner Septeber 2, 202 The proble Suppose we have data exaples {x i, y i ) i =... } for a two-class proble with y i {, }. Let F x) be the predictor function with the

More information

A new type of lower bound for the largest eigenvalue of a symmetric matrix

A new type of lower bound for the largest eigenvalue of a symmetric matrix Linear Algebra and its Applications 47 7 9 9 www.elsevier.co/locate/laa A new type of lower bound for the largest eigenvalue of a syetric atrix Piet Van Mieghe Delft University of Technology, P.O. Box

More information

A Low-Complexity Congestion Control and Scheduling Algorithm for Multihop Wireless Networks with Order-Optimal Per-Flow Delay

A Low-Complexity Congestion Control and Scheduling Algorithm for Multihop Wireless Networks with Order-Optimal Per-Flow Delay A Low-Coplexity Congestion Control and Scheduling Algorith for Multihop Wireless Networks with Order-Optial Per-Flow Delay Po-Kai Huang, Xiaojun Lin, and Chih-Chun Wang School of Electrical and Coputer

More information

Weighted- 1 minimization with multiple weighting sets

Weighted- 1 minimization with multiple weighting sets Weighted- 1 iniization with ultiple weighting sets Hassan Mansour a,b and Özgür Yılaza a Matheatics Departent, University of British Colubia, Vancouver - BC, Canada; b Coputer Science Departent, University

More information

Reed-Muller Codes. m r inductive definition. Later, we shall explain how to construct Reed-Muller codes using the Kronecker product.

Reed-Muller Codes. m r inductive definition. Later, we shall explain how to construct Reed-Muller codes using the Kronecker product. Coding Theory Massoud Malek Reed-Muller Codes An iportant class of linear block codes rich in algebraic and geoetric structure is the class of Reed-Muller codes, which includes the Extended Haing code.

More information

In this chapter, we consider several graph-theoretic and probabilistic models

In this chapter, we consider several graph-theoretic and probabilistic models THREE ONE GRAPH-THEORETIC AND STATISTICAL MODELS 3.1 INTRODUCTION In this chapter, we consider several graph-theoretic and probabilistic odels for a social network, which we do under different assuptions

More information

Nyström Method vs Random Fourier Features: A Theoretical and Empirical Comparison

Nyström Method vs Random Fourier Features: A Theoretical and Empirical Comparison yströ Method vs : A Theoretical and Epirical Coparison Tianbao Yang, Yu-Feng Li, Mehrdad Mahdavi, Rong Jin, Zhi-Hua Zhou Machine Learning Lab, GE Global Research, San Raon, CA 94583 Michigan State University,

More information

Introduction to Machine Learning. Recitation 11

Introduction to Machine Learning. Recitation 11 Introduction to Machine Learning Lecturer: Regev Schweiger Recitation Fall Seester Scribe: Regev Schweiger. Kernel Ridge Regression We now take on the task of kernel-izing ridge regression. Let x,...,

More information

time time δ jobs jobs

time time δ jobs jobs Approxiating Total Flow Tie on Parallel Machines Stefano Leonardi Danny Raz y Abstract We consider the proble of optiizing the total ow tie of a strea of jobs that are released over tie in a ultiprocessor

More information

A Bernstein-Markov Theorem for Normed Spaces

A Bernstein-Markov Theorem for Normed Spaces A Bernstein-Markov Theore for Nored Spaces Lawrence A. Harris Departent of Matheatics, University of Kentucky Lexington, Kentucky 40506-0027 Abstract Let X and Y be real nored linear spaces and let φ :

More information

Soft Computing Techniques Help Assign Weights to Different Factors in Vulnerability Analysis

Soft Computing Techniques Help Assign Weights to Different Factors in Vulnerability Analysis Soft Coputing Techniques Help Assign Weights to Different Factors in Vulnerability Analysis Beverly Rivera 1,2, Irbis Gallegos 1, and Vladik Kreinovich 2 1 Regional Cyber and Energy Security Center RCES

More information

On the Communication Complexity of Lipschitzian Optimization for the Coordinated Model of Computation

On the Communication Complexity of Lipschitzian Optimization for the Coordinated Model of Computation journal of coplexity 6, 459473 (2000) doi:0.006jco.2000.0544, available online at http:www.idealibrary.co on On the Counication Coplexity of Lipschitzian Optiization for the Coordinated Model of Coputation

More information

Fairness via priority scheduling

Fairness via priority scheduling Fairness via priority scheduling Veeraruna Kavitha, N Heachandra and Debayan Das IEOR, IIT Bobay, Mubai, 400076, India vavitha,nh,debayan}@iitbacin Abstract In the context of ulti-agent resource allocation

More information

Combining Classifiers

Combining Classifiers Cobining Classifiers Generic ethods of generating and cobining ultiple classifiers Bagging Boosting References: Duda, Hart & Stork, pg 475-480. Hastie, Tibsharini, Friedan, pg 246-256 and Chapter 10. http://www.boosting.org/

More information

Structured signal recovery from quadratic measurements: Breaking sample complexity barriers via nonconvex optimization

Structured signal recovery from quadratic measurements: Breaking sample complexity barriers via nonconvex optimization Structured signal recovery fro quadratic easureents: Breaking saple coplexity barriers via nonconvex optiization Mahdi Soltanolkotabi Ming Hsieh Departent of Electrical Engineering University of Southern

More information

Probability Distributions

Probability Distributions Probability Distributions In Chapter, we ephasized the central role played by probability theory in the solution of pattern recognition probles. We turn now to an exploration of soe particular exaples

More information

COS 424: Interacting with Data. Written Exercises

COS 424: Interacting with Data. Written Exercises COS 424: Interacting with Data Hoework #4 Spring 2007 Regression Due: Wednesday, April 18 Written Exercises See the course website for iportant inforation about collaboration and late policies, as well

More information

Interactive Markov Models of Evolutionary Algorithms

Interactive Markov Models of Evolutionary Algorithms Cleveland State University EngagedScholarship@CSU Electrical Engineering & Coputer Science Faculty Publications Electrical Engineering & Coputer Science Departent 2015 Interactive Markov Models of Evolutionary

More information

On Poset Merging. 1 Introduction. Peter Chen Guoli Ding Steve Seiden. Keywords: Merging, Partial Order, Lower Bounds. AMS Classification: 68W40

On Poset Merging. 1 Introduction. Peter Chen Guoli Ding Steve Seiden. Keywords: Merging, Partial Order, Lower Bounds. AMS Classification: 68W40 On Poset Merging Peter Chen Guoli Ding Steve Seiden Abstract We consider the follow poset erging proble: Let X and Y be two subsets of a partially ordered set S. Given coplete inforation about the ordering

More information