BINARY STABLE EMBEDDING VIA PAIRED COMPARISONS. Andrew K. Massimino and Mark A. Davenport


2016 IEEE Statistical Signal Processing Workshop (SSP)

BINARY STABLE EMBEDDING VIA PAIRED COMPARISONS

Andrew K. Massimino and Mark A. Davenport
Georgia Institute of Technology, School of Electrical and Computer Engineering

ABSTRACT

Suppose that we wish to estimate a vector x from a set of binary paired comparisons of the form "x is closer to p than to q" for various choices of vectors p and q. The problem of estimating x from this type of observation arises in a variety of contexts, including nonmetric multidimensional scaling, unfolding, and ranking problems, often because it provides a powerful and flexible model of preference. The main contribution of this paper is to show that under a randomized model for p and q, a suitable number of binary paired comparisons yield a stable embedding of the space of target vectors.

Index Terms— paired comparisons, binary stable embeddings, recommendation systems, 1-bit compressive sensing

1. INTRODUCTION

The central problem we consider in this paper is the estimation of a vector x ∈ ℝ^n where, rather than directly observing x, we assume that we are restricted to m observations of the form "x is closer to p_i than to q_i," where p_i, q_i ∈ ℝ^n correspond to points whose locations are known. Each of these comparisons essentially divides ℝ^n in half and tells us on which side of a hyperplane the point x lies. Observations of this form arise in a variety of contexts, but a particularly important class of applications involves recommendation systems, targeted advertisement, and psychological studies, where x represents an "ideal point" that models a particular user's preferences and the p_i and q_i represent items that the user compares [1]. Items which are close to x are those most preferred by the user. Paired comparisons arise naturally in this context since precise numerical scores quantifying a user's preference are generally much more difficult to assign than comparative judgements [2, 3]. Moreover, data consisting of paired comparisons is often generated implicitly in contexts where the user has the option to act on two or more alternatives; for instance, they may choose to watch a particular movie, or click on a particular advertisement, out of those displayed to them [4]. In such contexts, the true distances in the ideal point model's preference space are generally inaccessible in any direct way, but it is nevertheless still possible to obtain a good estimate of a user's ideal point.

The fundamental question which interests us in this paper is how many comparisons suffice in order to guarantee that the number of differing paired comparisons generated by x and y is roughly proportional to the Euclidean distance between x and y. This allows us to understand how well x can potentially be estimated from such highly quantized information and gives an idea of the stability in the presence of labeling errors. We consider the case where we are given an existing embedding of the items, as in a mature recommender system, and focus on the on-line problem of locating a single new user from their feedback, which consists of binary data generated from paired comparisons. The item embedding could be generated using a variety of methods, such as multidimensional scaling applied to a set of item features, or even using the results of previous paired comparisons via an approach like that in [5]. We wish to understand how many comparisons are then required to accurately localize a user's ideal point.

This work was supported by grants from NRL, AFOSR, and NSF.
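To make the measurement model concrete, the following minimal Python sketch (not from the paper; the function and variable names are ours) implements a single ideal-point paired comparison: one bit indicating whether the unknown point x is closer to item p or to item q.

```python
import numpy as np

def paired_comparison(x, p, q):
    """Return +1 if x is closer to p than to q, and -1 otherwise (ideal point model)."""
    return 1.0 if np.linalg.norm(x - p) < np.linalg.norm(x - q) else -1.0

# Toy example in n = 2 dimensions: an unknown ideal point and one pair of items.
rng = np.random.default_rng(0)
x = np.array([0.3, -0.1])                        # user's ideal point (unknown in practice)
p, q = rng.standard_normal(2), rng.standard_normal(2)
print(paired_comparison(x, p, q))                # a single binary observation about x
```

Each such bit reveals only which side of a hyperplane x lies on; the analysis below quantifies how many of these bits are needed.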
Any precise answer to this question would depend on the underlying geometry of the item embedding, since some sets of hyperplanes will yield better tessellations of the preference space than others. Thus, to gain some intuition on this problem without reference to the geometry of a particular embedding, we will instead consider a probabilistic model where the items are generated at random from a particular distribution.

It is important to note that the ideal point model, while similar, is distinct from the low-rank factor (or attribute) model used in the matrix completion approaches which have recently gained much attention as applied to recommendation systems, e.g., [6, 7]. Although both models suppose user choices are guided by a number of attributes, the ideal point model leads to preferences that are non-monotonic functions of those attributes. There is also empirical evidence that the ideal point model captures user behavior more accurately than factorization-based approaches do [8, 9].

There is a large body of work that studies the problem of learning to rank items from various sources of data, including paired comparisons of the sort we consider in this paper. See, for example, [10, 11, 12] and references therein. We first note that in most work on rankings, the central focus is on learning a correct rank-ordered list for a particular user, without providing any guarantees on recovering a correct parameterization for the user's preferences as we do here.

Perhaps most closely related to our work is that of [10, 11], which examines the problem of learning a rank ordering using the same ideal point model considered in this paper. The message in this work is broadly consistent with ours, in that the number of comparisons required should scale with the dimension of the preference space, not with the total number of items. Also closely related is the work in [13, 14, 15], which considers paired comparisons and more general ordinal measurements in the similar but, as discussed above, subtly different context of low-rank factorizations. Finally, while seemingly unrelated, we note that our work builds on the growing body of literature on 1-bit compressive sensing. In particular, our results are largely inspired by those in [16, 17], and we borrow techniques from [18] in the proofs of some of our main results.

2. THE RANDOM OBSERVATION MODEL

For the moment we will consider the noise-free setting where a user always prefers the item closest to the user's ideal point x. In this case we can represent the observed comparisons mathematically by letting A_i(x) denote the i-th observation, which consists of a comparison between p_i and q_i, and setting

    A_i(x) := \mathrm{sign}\left( \|x - q_i\|^2 - \|x - p_i\|^2 \right)
            = \begin{cases} +1 & \text{if } x \text{ is closer to } p_i \\ -1 & \text{if } x \text{ is closer to } q_i. \end{cases}    (1)

We will also use A(x) = [A_1(x), ..., A_m(x)]^T to denote the vector of all observations resulting from m comparisons. Note that if we set a_i = p_i − q_i and τ_i = (‖p_i‖² − ‖q_i‖²)/2, then we can re-write our observation model as

    A_i(x) = \mathrm{sign}\left( 2 (a_i^T x - \tau_i) \right) = \mathrm{sign}\left( a_i^T x - \tau_i \right).    (2)

This is reminiscent of the one-bit compressive sensing setting with dithers [16, 17], with the important differences that: (i) we have not yet made any kind of sparsity or other structural assumption on x, and (ii) the dithers τ_i, at least in this formulation, are dependent on the a_i, which makes it difficult to apply existing results from this theory to our setting. However, many of the techniques from this literature will nevertheless be helpful in analyzing this problem.

To see this, we consider a randomized observation model where the pairs p_i, q_i are chosen independently with i.i.d. entries drawn according to a normal distribution, i.e., p_i, q_i ~ N(0, σ²I). In this case, the entries of our sensing vectors are i.i.d. with a_ij ~ N(0, 2σ²). Moreover, if we define b_i = p_i + q_i, then we also have that b_i ~ N(0, 2σ²I), and we can write τ_i = a_i^T b_i / 2. Note that while τ_i is clearly dependent on a_i, we do have that a_i and b_i are independent. To further simplify things, let us re-normalize by dividing by ‖a_i‖, i.e., setting ã_i = a_i / ‖a_i‖ and τ̃_i = τ_i / ‖a_i‖, so that A_i(x) = sign(ã_i^T x − τ̃_i). It is easy to see that ã_i is distributed uniformly on the sphere S^{n−1} = {a ∈ ℝ^n : ‖a‖ = 1}. Note also that we can write τ̃_i = ã_i^T b_i / 2. Since a_i and b_i are independent, ã_i and b_i are also independent. Moreover, for any unit vector ã_i, if b_i ~ N(0, 2σ²I) then ã_i^T b_i ~ N(0, 2σ²). Thus, we must have τ̃_i ~ N(0, σ²/2), independent of ã_i, which is the key insight that enables the analysis below.
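The algebraic reduction above is easy to verify numerically. The following sketch (our own illustration, with σ² = 2/n as used in Theorem 3.1 below) checks that sign(a_i^T x − τ_i) reproduces the distance comparisons and that the normalized dither τ̃_i is empirically distributed as N(0, σ²/2).

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 20, 50000
sigma = np.sqrt(2.0 / n)                       # variance parameter used in Theorem 3.1
x = rng.standard_normal(n)
x *= 0.5 / np.linalg.norm(x)                   # a signal well inside the ball of interest

P = sigma * rng.standard_normal((m, n))        # items p_i ~ N(0, sigma^2 I)
Q = sigma * rng.standard_normal((m, n))        # items q_i ~ N(0, sigma^2 I)

# Comparisons computed directly from distances ...
A_dist = np.sign(np.linalg.norm(x - Q, axis=1) - np.linalg.norm(x - P, axis=1))
# ... and via the linear form sign(a_i^T x - tau_i), a_i = p_i - q_i,
# tau_i = (||p_i||^2 - ||q_i||^2) / 2.
tau = 0.5 * ((P ** 2).sum(axis=1) - (Q ** 2).sum(axis=1))
A_lin = np.sign((P - Q) @ x - tau)
print("agreement:", np.mean(A_dist == A_lin))  # should be 1.0

# After normalizing by ||a_i||, the dither is ~ N(0, sigma^2 / 2), independent of a_i.
tau_tilde = tau / np.linalg.norm(P - Q, axis=1)
print("std of normalized dither:", tau_tilde.std(), "vs", sigma / np.sqrt(2))
```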
3. MAIN RESULT

Under the model described above, we would like to understand how much the sign patterns of two different signals can differ as a function of their Euclidean distance. Our main result is that, given enough comparisons, there is an approximate embedding of the preference space into {−1, +1}^m via our measurement model. We assume any signal of interest has norm at most R, i.e., we say signals x, y ∈ B_R^n, the n-ball of radius R.

As an example application, x might be considered the true user's ideal point and y some other possible estimate. Theorem 3.1 states that if x and y are sufficiently nearby, the respective sign-measurement patterns A(x) and A(y) closely match. We denote by d_H the normalized Hamming distance, i.e., d_H gives the fraction of the {−1, +1} sign measurements on which the two patterns differ:

    d_H(A(x), A(y)) := \frac{1}{m} \sum_{i=1}^{m} \mathbb{1}\left[ A_i(x) \neq A_i(y) \right].    (3)

Note that the following result applies for all x and y, and follows from Lemma 3.2, which in contrast applies only for fixed x and y.

Theorem 3.1. Let η > 0 and ζ > 0. Suppose m total measurements of the form (2) may be taken, i.e., each ã_i is drawn uniformly on the unit sphere and τ̃ ∈ ℝ^m is a vector with entries τ̃_i ~ N(0, σ²/2) independent of the ã_i, where σ² = 2/n. If

    m \geq \frac{2}{\zeta^2} \left( n \log\frac{3R}{\zeta} + \log\frac{2}{\eta} \right),

then there are constants c_1, c_2, c_3 such that with probability at least 1 − η, for all points x, y ∈ B_R^n,

    c_1 \|x - y\| - \zeta \left( c_1 + \frac{c_2}{2} + 1 \right) \;\leq\; d_H(A(x), A(y)) \;\leq\; c_3 \|x - y\| + \zeta \left( c_3 + c_3 \sqrt{\frac{\pi n}{2}} + 1 \right).
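Before turning to the proof, the embedding behavior described by Theorem 3.1 can be illustrated with a quick Monte Carlo experiment (our own sketch, not part of the paper): for pairs of points at increasing Euclidean distance, the fraction of comparisons on which their sign patterns differ grows roughly in proportion to that distance.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 25, 200000
sigma = np.sqrt(2.0 / n)

# Comparison hyperplanes drawn directly in the normalized form of Section 2:
# directions uniform on the sphere, dithers tau_i ~ N(0, sigma^2 / 2).
A_dirs = rng.standard_normal((m, n))
A_dirs /= np.linalg.norm(A_dirs, axis=1, keepdims=True)
taus = (sigma / np.sqrt(2)) * rng.standard_normal(m)

def sign_pattern(v):
    """A(v) = [sign(a_i^T v - tau_i)] for i = 1, ..., m."""
    return np.sign(A_dirs @ v - taus)

for dist in [0.05, 0.1, 0.2, 0.4]:
    x = rng.standard_normal(n)
    x *= 0.4 / np.linalg.norm(x)
    d = rng.standard_normal(n)
    y = x + dist * d / np.linalg.norm(d)
    frac = np.mean(sign_pattern(x) != sign_pattern(y))   # normalized Hamming distance d_H
    print(f"||x - y|| = {dist:.2f}   d_H = {frac:.4f}")
```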

Proof. By Lemma 3.2, for any fixed pair w, z ∈ B_R^n we have bounds on the Hamming distance that hold with probability at least 1 − 2 exp(−2ζ²m) for all u ∈ B_δ(w) and v ∈ B_δ(z). Since the radius-R ball can be covered with a set U of radius-δ balls with |U| ≤ (3R/δ)^n, by a union bound we have, with probability at least 1 − 2 (3R/δ)^{2n} exp(−2ζ²m), for all w, z ∈ U and for all u ∈ B_δ(w) and v ∈ B_δ(z),

    c_1 \|w - z\| - c_2 \delta - \zeta \;\leq\; d_H(A(u), A(v)) \;\leq\; c_3 \left( \|w - z\| + \delta \sqrt{2 \pi n} \right) + \zeta.

Since ‖x − y‖ − 2δ ≤ ‖w − z‖ ≤ ‖x − y‖ + 2δ, this yields

    c_1 \|x - y\| - (2 c_1 + c_2) \delta - \zeta \;\leq\; d_H(A(x), A(y)) \;\leq\; c_3 \left( \|x - y\| + 2\delta + \delta \sqrt{2 \pi n} \right) + \zeta.

Letting δ = ζ/2 this becomes

    c_1 \|x - y\| - \zeta \left( c_1 + \frac{c_2}{2} + 1 \right) \;\leq\; d_H(A(x), A(y)) \;\leq\; c_3 \|x - y\| + \zeta \left( c_3 + c_3 \sqrt{\frac{\pi n}{2}} + 1 \right).

Lower bounding the probability of success by 1 − η requires

    2 \left( \frac{3R}{\delta} \right)^{2n} \exp(-2 \zeta^2 m) \leq \eta.

Rearranging, we have the desired result.

We comment that Theorem 3.1 concerns a particular choice of the variance parameter σ². A natural question is what would happen with a different choice of σ². In fact, this assumption is critical: intuitively, if σ² is too small, then nearly all of the hyperplanes induced by the comparisons will pass very close to the origin, so that accurate estimation of even ‖x‖ becomes impossible. On the other hand, if σ² is too large, then an increasing number of these hyperplanes will not even intersect the ball of radius R in which x is presumed to lie, thus yielding no new information.

Lemma 3.2. Let w, z ∈ B_R^n be fixed. Let m measurements of the form (2) be taken in which each ã_i is drawn uniformly on the unit sphere, and let τ̃ ∈ ℝ^m be a vector with entries τ̃_i ~ N(0, σ²/2) independent of the ã_i. Fix ζ > 0 and δ > 0, and define B_δ(w) := {u ∈ B_R^n : ‖u − w‖ ≤ δ}. Then there are constants c_1, c_2, and c_3 such that with probability at least 1 − 2 exp(−2ζ²m), for all u ∈ B_δ(w) and v ∈ B_δ(z),

    c_1 \|w - z\| - c_2 \delta - \zeta \;\leq\; d_H(A(u), A(v)) \;\leq\; c_3 \left( \|w - z\| + \delta \sqrt{2 \pi n} \right) + \zeta,

where d_H is the normalized Hamming distance.

Proof. Fix δ > 0 and let u ∈ B_δ(w) and v ∈ B_δ(z). Recall that the Hamming distance d_H is an average of independent, identically distributed Bernoulli random variables, so we may bound it using Hoeffding's inequality. Since our probabilistic upper and lower bounds must hold for all u, v as described above, we introduce quantities L_0 and L_1 which represent two extreme cases of the Bernoulli variables:

    L_0 := \sup_{u \in B_\delta(w),\, v \in B_\delta(z)} \frac{1}{m} \sum_{i=1}^{m} \mathbb{1}\left[ A_i(u) \neq A_i(v) \right],
    L_1 := \inf_{u \in B_\delta(w),\, v \in B_\delta(z)} \frac{1}{m} \sum_{i=1}^{m} \mathbb{1}\left[ A_i(u) \neq A_i(v) \right].

Then we have

    L_1 \;\leq\; d_H(A(u), A(v)) \;\leq\; L_0.

Denote P_0 = 1 − E[L_0] and P_1 = E[L_1], i.e.,

    P_0 = P\{ \forall\, u \in B_\delta(w),\, v \in B_\delta(z) : A_i(u) = A_i(v) \},
    P_1 = P\{ \forall\, u \in B_\delta(w),\, v \in B_\delta(z) : A_i(u) \neq A_i(v) \}.

We give a lower bound for P_0 in Lemma 3.3 and for P_1 in Lemma 3.4. Now, by Hoeffding's inequality,

    P\{ L_0 > (1 - P_0) + \zeta \} \leq \exp(-2 m \zeta^2),
    P\{ L_1 < P_1 - \zeta \} \leq \exp(-2 m \zeta^2).

Hence, with probability at least 1 − 2 exp(−2mζ²),

    P_1 - \zeta \;\leq\; d_H(A(u), A(v)) \;\leq\; (1 - P_0) + \zeta.

Since P_1 ≥ (‖w − z‖ − 4δ) / (16√(πe)) by Lemma 3.4 (taking ε_0 = ‖w − z‖), and since by Lemma 3.3, recalling that σ² = 2/n,

    1 - P_0 \;\leq\; \frac{\Gamma(\frac{n}{2})}{\sigma \pi\, \Gamma(\frac{n+1}{2})} \|w - z\| + \frac{2\delta}{\sigma \sqrt{\pi}} \;\leq\; \frac{\sqrt{2}}{\pi} \left( \|w - z\| + \delta \sqrt{2 \pi n} \right),

the lemma is satisfied with appropriate c_1, c_2, and c_3.
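The quantities P_0 and P_1 that drive this proof can also be estimated by simulation. The sketch below (our own illustration; for simplicity it ignores the intersection of the δ-balls with B_R^n) draws random hyperplanes and checks whether both δ-balls fall on one side of the hyperplane (the event behind P_0) or on opposite sides (the event behind P_1).

```python
import numpy as np

rng = np.random.default_rng(3)
n, trials, delta = 25, 200000, 0.01
sigma = np.sqrt(2.0 / n)

w = rng.standard_normal(n)
w *= 0.5 / np.linalg.norm(w)
z = w.copy()
z[0] += 0.3                                    # a second point at distance 0.3 from w

a = rng.standard_normal((trials, n))
a /= np.linalg.norm(a, axis=1, keepdims=True)  # hyperplane directions, uniform on the sphere
tau = (sigma / np.sqrt(2)) * rng.standard_normal(trials)

s_w, s_z = a @ w - tau, a @ z - tau
clear = (np.abs(s_w) >= delta) & (np.abs(s_z) >= delta)    # hyperplane misses both delta-balls
P0_hat = np.mean(clear & (np.sign(s_w) == np.sign(s_z)))   # both balls on the same side
P1_hat = np.mean(clear & (np.sign(s_w) != np.sign(s_z)))   # hyperplane separates the balls

print("||w - z|| =", np.linalg.norm(w - z))
print("P0 estimate:", P0_hat, "  P1 estimate:", P1_hat)
print("1 - P0 (prob. some pair of points in the two balls disagrees):", 1 - P0_hat)
```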

3.1. Probability estimates

Lemma 3.3. Let w, z ∈ B_R^n be distinct and fix δ > 0. Denote by P_0 the probability that all points u and v within δ of w and z, respectively, do not differ in the random measurement denoted by A_i (i.e., the probability that the two δ-balls are not split by hyperplane i). The direction and threshold of hyperplane i are denoted by a and τ, respectively. That is,

    P_0 = P\{ \forall\, u \in B_\delta(w),\, v \in B_\delta(z) : A_i(u) = A_i(v) \}.

Then

    P_0 \;\geq\; 1 - \frac{\Gamma(\frac{n}{2})}{\sigma \pi\, \Gamma(\frac{n+1}{2})} \|w - z\| - \frac{2\delta}{\sigma \sqrt{\pi}}.

Proof. This result will require the following integral:

    \int_{S^{n-1}} \left| a^T (w - z) \right| \nu(da) = \frac{\Gamma(\frac{n}{2})}{\sqrt{\pi}\, \Gamma(\frac{n+1}{2})} \|w - z\|,

where ν is the uniform probability measure on the unit sphere. We need a lower bound on P_0, which is equivalent to an upper bound on

    1 - P_0 = P\{ A_i(u) \neq A_i(v) \text{ for some } u \in B_\delta(w),\, v \in B_\delta(z) \}.

Assume a^T w > a^T z (otherwise exchange the roles of w and z). Then this probability is just

    P\left\{ a^T v < \tau < a^T u \text{ for some } u \in B_\delta(w),\, v \in B_\delta(z) \right\}
        = P\left\{ \min_{v \in B_\delta(z)} a^T v < \tau < \max_{u \in B_\delta(w)} a^T u \right\}.

But

    \min_{v \in B_\delta(z)} a^T v \geq a^T z - \delta  \qquad \text{and} \qquad  \max_{u \in B_\delta(w)} a^T u \leq a^T w + \delta,

so we have

    1 - P_0 \;\leq\; P\{ a^T z - \delta < \tau < a^T w + \delta \}
            = \int_{S^{n-1}} \left[ \Phi\!\left( \frac{a^T w + \delta}{\sigma/\sqrt{2}} \right) - \Phi\!\left( \frac{a^T z - \delta}{\sigma/\sqrt{2}} \right) \right] \nu(da)
            \;\leq\; \int_{S^{n-1}} \frac{\left| a^T (w - z) \right| + 2\delta}{\sigma \sqrt{\pi}}\, \nu(da)
            = \frac{\Gamma(\frac{n}{2})}{\sigma \pi\, \Gamma(\frac{n+1}{2})} \|w - z\| + \frac{2\delta}{\sigma \sqrt{\pi}}.

Lemma 3.4 (Probability of separation with a single random measurement). Let w, z ∈ B_R^n be distinct and fix δ > 0. Denote by P_1 the probability that all points u and v within δ of w and z, respectively, differ in the random measurement denoted by A_i (i.e., the probability that the two δ-balls are separated by hyperplane i). The direction and threshold of hyperplane i are denoted by a and τ, respectively. That is,

    P_1 = P\{ \forall\, u \in B_\delta(w),\, v \in B_\delta(z) : A_i(u) \neq A_i(v) \}.

Set ε_0 ≤ ‖w − z‖. Then

    P_1 \;\geq\; \frac{\varepsilon_0 - 4\delta}{16 \sqrt{\pi e}}.

Proof. Assume ‖w‖ ≥ ‖z‖; otherwise swap w and z in the argument that follows. Denote Δ = w − z and ε = ‖Δ‖. If B_δ(w) ∩ B_δ(z) ≠ ∅, then the chance that a hyperplane with orientation a and threshold τ splits the two balls is zero and the claimed bound holds trivially, so we may restrict attention to the portion of the sphere where a^T (w − z) ≥ 2δ. Further, since the distribution of τ is symmetric, we can restrict to the hyper-spherical cap C_α = {a : a^T (w − z) ≥ αε} for some α with αε ≥ 2δ and double the integral:

    P_1 = P\{ a^T z + \delta \leq \tau \leq a^T w - \delta \} + P\{ a^T w + \delta \leq \tau \leq a^T z - \delta \}
        \;\geq\; 2 \int_{C_\alpha} \left[ \Phi\!\left( \frac{a^T w - \delta}{\sigma/\sqrt{2}} \right) - \Phi\!\left( \frac{a^T z + \delta}{\sigma/\sqrt{2}} \right) \right] \nu(da).

To obtain a lower bound, we consider the area of a subset C'_α ⊆ C_α and multiply it by the minimum value of the integrand over that set. Let W = {a : |a^T w| ≥ ξ − δ} and Z = {a : |a^T z| ≥ ξ − δ}, and note that for any a ∈ Z^c we have |a^T z + δ| ≤ ξ, while for any a ∈ W^c we have |a^T w − δ| ≤ ξ. The sets W and Z are each a union of antipodal spherical caps and have equal measure. The set we consider is C'_α = C_α ∩ W^c ∩ Z^c. One can show, by properties of the normal distribution, that

    \min_{a \in C'_\alpha} \left[ \Phi\!\left( \frac{a^T w - \delta}{\sigma/\sqrt{2}} \right) - \Phi\!\left( \frac{a^T z + \delta}{\sigma/\sqrt{2}} \right) \right] \;\geq\; \frac{\alpha \varepsilon - 2\delta}{\sigma/\sqrt{2}}\, \phi\!\left( \frac{\sqrt{2}\, \xi}{\sigma} \right),

where φ denotes the standard normal density. C'_α has maximal area when W = Z, i.e., when z is a positive multiple of w; however, the resulting estimate still provides a lower bound for arbitrary w and z. Setting α and ξ effects a trade-off between the normalized area of the spherical section C'_α and the size of the Gaussian density factor. Recalling that by assumption σ² = 2/n, the thresholds αε and ξ can be taken proportional to 1/√n, so that the density factor φ(√2ξ/σ) remains bounded below by a constant while, by integrating, the normalized area ν(C'_α) is also bounded below by a constant. Combining these two bounds and optimizing over the choice of α and ξ yields the stated result.
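The closed-form spherical integral used at the start of the proof of Lemma 3.3 is easy to check numerically. The following sketch (our own check, not from the paper) compares a Monte Carlo estimate of the integral of |a^T(w − z)| over the unit sphere with the Gamma-function expression.

```python
import numpy as np
from math import gamma, sqrt, pi

rng = np.random.default_rng(4)
n, samples = 25, 200000
u = rng.standard_normal(n)                     # plays the role of w - z

a = rng.standard_normal((samples, n))
a /= np.linalg.norm(a, axis=1, keepdims=True)  # uniform samples on the unit sphere S^{n-1}
empirical = np.mean(np.abs(a @ u))
closed_form = gamma(n / 2) / (sqrt(pi) * gamma((n + 1) / 2)) * np.linalg.norm(u)
print(empirical, closed_form)                  # the two values agree up to Monte Carlo error
```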

4. REFERENCES

[1] C. Coombs, "Psychological scaling without a unit of measurement," Psych. Rev., vol. 57, no. 3, 1950.

[2] G. Miller, "The magical number seven, plus or minus two: Some limits on our capacity for processing information," Psychological Review, vol. 63, no. 2, pp. 81–97, 1956.

[3] H. David, The Method of Paired Comparisons, London, UK: Charles Griffin & Company Limited, 1963.

[4] F. Radlinski and T. Joachims, "Active exploration for learning rankings from clickthrough data," in Proc. ACM SIGKDD Int. Conf. on Knowledge, Discovery, and Data Mining (KDD), San Jose, CA, Aug. 2007.

[5] S. Agarwal, J. Wills, L. Cayton, G. Lanckriet, D. Kriegman, and S. Belongie, "Generalized non-metric multidimensional scaling," in Proc. Int. Conf. Artificial Intelligence and Statistics (AISTATS), San Juan, Puerto Rico, Mar. 2007.

[6] J. Rennie and N. Srebro, "Fast maximum margin matrix factorization for collaborative prediction," in Proc. Int. Conf. Machine Learning (ICML), Bonn, Germany, Aug. 2005.

[7] E. Candès and B. Recht, "Exact matrix completion via convex optimization," Found. Comput. Math., vol. 9, no. 6, 2009.

[8] B. Dubois, "Ideal point versus attribute models of brand preference: A comparison of predictive validity," Adv. Consumer Research, 1975.

[9] A. Maydeu-Olivares and U. Böckenholt, "Modeling preference data," in The SAGE Handbook of Quantitative Methods in Psychology, 2009.

[10] K. Jamieson and R. Nowak, "Active ranking using pairwise comparisons," in Proc. Adv. in Neural Information Processing Systems (NIPS), Granada, Spain, Dec. 2011.

[11] K. Jamieson and R. Nowak, "Low-dimensional embedding using adaptively selected ordinal data," in Proc. Allerton Conf. on Communication, Control, and Computing, Monticello, IL, 2011.

[12] F. Wauthier, M. Jordan, and N. Jojic, "Efficient ranking from pairwise comparisons," in Proc. Int. Conf. Machine Learning (ICML), Atlanta, GA, June 2013.

[13] Y. Lu and S. Negahban, "Individualized rank aggregation using nuclear norm regularization," arXiv preprint, 2014.

[14] D. Park, J. Neeman, J. Zhang, S. Sanghavi, and I. Dhillon, "Preference completion: Large-scale collaborative ranking from pairwise comparisons," in Proc. Int. Conf. Machine Learning (ICML), Lille, France, July 2015.

[15] S. Oh, K. Thekumparampil, and J. Xu, "Collaboratively learning preferences from ordinal data," in Proc. Adv. in Neural Information Processing Systems (NIPS), Montréal, Québec, Dec. 2014.

[16] K. Knudson, R. Saab, and R. Ward, "One-bit compressive sensing with norm estimation," IEEE Trans. Inform. Theory, vol. 62, no. 5, 2016.

[17] R. Baraniuk, S. Foucart, D. Needell, Y. Plan, and M. Wootters, "Exponential decay of reconstruction error from binary measurements of sparse signals," arXiv preprint, 2014.

[18] L. Jacques, J. Laska, P. Boufounos, and R. Baraniuk, "Robust 1-bit compressive sensing via binary stable embeddings of sparse vectors," IEEE Trans. Inform. Theory, vol. 59, no. 4, 2013.
