OPTIMAL DETECTION OF SPARSE PRINCIPAL COMPONENTS. Philippe Rigollet (joint with Quentin Berthet)
|
|
- Avis Hensley
- 5 years ago
- Views:
Transcription
1 OPTIMAL DETECTION OF SPARSE PRINCIPAL COMPONENTS Philippe Rigollet (joint with Quentin Berthet)
2 High dimensional data Cloud of point in R p
3 High dimensional data Cloud of point in R p
4 High dimensional data Cloud of n points in R p
5 Principal component Principal component = direction of largest variance
6 Principal component analysis (PCA) Tool for dimension reduction Spectrum of covariance matrix Main tool for exploratory data analysis. We study only the first principal component This talk: high-dimensional p n, finite sample framework.
7 Testing for sphericity under rank-one alternative H 0 : =I p H 1 : =I p + vv > v 2 =1 Isotropic Principal component
8 The model Observations: i.i.d. X 1,...,X n N p (0, ) Estimator: empirical covariance matrix nx ˆ = 1 n i=1 X i X > i If n p it is a consistent estimator. If n ' cp it is inconsistent (Nadler, Paul, Onatski,...) eigenvectors are orthogonal
9 Empirical spectrum under the null H 0 : =I p Marcenko-Pastur distribution Spectrum of ˆ
10 Empirical spectrum under the alternative H 1 : =I p + vv > v 2 =1 The BBP (Baik, Ben Arous, Péché) transition p n! >0 apple p > p Indistinguishable from the null detection possible if > r p n very strong signal!
11 Testing for sparse principal component H 0 : =I p H 1 : =I p + vv >, v 2 =1, v 0 apple k Isotropic Sparse principal direction
12 Testing for sparse principal component H 0 : =I p H 1 : =I p + vv >, v 2 =1, v 0 apple k minimum detection level? Goal: find a statistic ' : S + p 7! R such that P H0 ('(ˆ ) < 0 ) 1 small under H 0 P H1 ('(ˆ ) > 1 ) 1 large under H
13 P H0 ('(ˆ ) < 0 ) 1 small under P H1 ('(ˆ ) > 1 ) 1 large under H 0 H apple apple 1 Take the test: (ˆ ) = 1{'(ˆ ) > }. It satisfies: P H0 ( = 1) _ max v 2 =1 v 0 applek P H1 ( = 0) apple
14 k-sparse eigenvalue: Sparse eigenvalue '(ˆ ) = k max(ˆ ) = max x 2 =1 x 0 apple k x > ˆ x = max S =k max (ˆ S ) Note that: k max(i p ) = 1 and k max(i p + vv > )=1+ Smaller fluctuations than the largest eigenvalue max(ˆ )
15 Upper bounds w.p. 1 Under the null hypothesis: k max(ˆ ) apple 1+8 Under the alternative hypothesis: k max(ˆ ) 1+ 2(1 + ) Can detect as soon as C r k log(9ep/k) + log(1/ ) 0 < 1 r k log(p/k) n n r log(1/ ) n, which yields =: 1 =: 0
16 Minimax lower bound Fix > 0 (small). Then there exists a constant C > 0 such that if < := r k log (C p/k 2 + 1) n ^ 1 p 2 Then inf n P n 0 ( = 1) _ max v 2 =1 v 0 applek o Pv n ( = 0) 1 2 See also Arias-Castro, Bubeck and Lugosi (12)
17 Computational issues k To compute, need to compute eigenvalues max(ˆ ) p k Can be used to find cliques in graphs: NP-complete pb. Need an approximation...
18 Semidefinite relaxation 101 k SDP max(a) k = max. subject to Tr( x AZ > A x ) Tr( x Z > x ) =1 Z x 0 apple k Z = xx > rank(z) =1 Z 0 Cauchy-Schwarz xx > 1 apple k Semidefinite program program (SDP) introduced by d Aspremont, El Gahoui, Jordan and Lanckriet (2004). Testing procedure: 1{SDP k (ˆ ) > } Defined even if solution of SDP has rank > 1
19 Performance of SDP For the alternative: relaxation of k max(ˆ ) so SDP k (ˆ ) k max(ˆ ) For the null: use dual (Bach et al. 2010) SDP k (A) = min U2S + p { max (A + U)+k U 1 } For any U 2 S + p this gives an upper bound on SDP k (ˆ ) Enough to look only at minimum dual perturbation MDP k (ˆ ) = min z 0 n o max(st z (ˆ )) + kz
20 Upper bounds w.p. 1 Under the null hypothesis: DP k (ˆ ) apple Under the alternative hypothesis: Can detect as soon as 0 < 1 r k2 log(ep/ ) DP k (ˆ ) 1+ 2(1 + ) n, which yields DP 2 SDP, MDP =: 0 r log(1/ ) n =: 1 C r k2 log(p/k) n
21 1.08 SDP k MDP k λ max ( ) 1.06 H1/H Ratio of 5% quantile under H 1 over 95% quantile under H 0, versus signal strength. When this ratio is larger than one, both type I and type II errors are below 5%. θ A. d Aspremont Soutenance HDR, ENS Cachan, Nov /33
22 Summary No detection detection with k max detection with DP k r k p n log k r k 2 n log p k Can we tighten the gap?
23 Numerical evidence Fix type I error at 1%, plot type II error of MDPk p={50, 100, 200, 500}, k= p Q 0.5 P f k p n log k minimax optimal scaling e k 2 n log p k proved scaling
24 Random graphs A random (Erdos-Renyi) graph on N vertices is obtained by drawing edges at random with probability 1/2 N=50 largest clique is of size 2 log N =7.8 asymp. almost surely
25 Hidden clique We can hide a clique (here of size 10) in this graph Choose points arbitrarily and draw a clique
26 Hidden clique embed in the original random graph
27 Hidden clique Question: is there a hidden clique in this graph?
28 Hidden clique problem It is believed that it is hard to find/test the presence of a clique in a random graph (Alon, Arora, Feige, Hazan, Krauthgamer,... Cryptosystems are based on this fact!) Conjecture: It is hard to find cliques of size between 2 log N and p Alon, Krivelevich, Sudakov 98 N Feige and Krauthgamer 00 Dekel et al. 10 Feige and Ron 10 Ames and Vavasis 11 Canonical example of average case complexity
29 Hidden clique problem It seems related to our problem but not trivially (the randomness structure is very fragile) Note that all our results extend to sub-gaussian r.v. Theorem. If we could prove that there exists such that under the null hypothesis it holds SDP k (ˆ ) apple 1+C r k log(ep/ ) for some 2 (1, 2), then it can be used to test the presence of a clique of size polylog(n)n 1 4 n C>0
30 Remarks Unlike usual hardness results, this one is for one (actually two) method only (not for all methods). In progress: we can remove this limitation using bicliques (need to carefully deal with independence)
31 Conclusion Optimal rates for sparse detection Computationally efficient methods with suboptimal rate First(?) link between sparse detection and average case complexity Opens the door to new statistical lower bounds: complexity theoretic lower bounds Evidence that heuristics cannot be optimal
Planted Cliques, Iterative Thresholding and Message Passing Algorithms
Planted Cliques, Iterative Thresholding and Message Passing Algorithms Yash Deshpande and Andrea Montanari Stanford University November 5, 2013 Deshpande, Montanari Planted Cliques November 5, 2013 1 /
More informationInformation-theoretically Optimal Sparse PCA
Information-theoretically Optimal Sparse PCA Yash Deshpande Department of Electrical Engineering Stanford, CA. Andrea Montanari Departments of Electrical Engineering and Statistics Stanford, CA. Abstract
More informationReconstruction in the Generalized Stochastic Block Model
Reconstruction in the Generalized Stochastic Block Model Marc Lelarge 1 Laurent Massoulié 2 Jiaming Xu 3 1 INRIA-ENS 2 INRIA-Microsoft Research Joint Centre 3 University of Illinois, Urbana-Champaign GDR
More informationNon white sample covariance matrices.
Non white sample covariance matrices. S. Péché, Université Grenoble 1, joint work with O. Ledoit, Uni. Zurich 17-21/05/2010, Université Marne la Vallée Workshop Probability and Geometry in High Dimensions
More informationOptimality and Sub-optimality of Principal Component Analysis for Spiked Random Matrices
Optimality and Sub-optimality of Principal Component Analysis for Spiked Random Matrices Alex Wein MIT Joint work with: Amelia Perry (MIT), Afonso Bandeira (Courant NYU), Ankur Moitra (MIT) 1 / 22 Random
More informationRandom Correlation Matrices, Top Eigenvalue with Heavy Tails and Financial Applications
Random Correlation Matrices, Top Eigenvalue with Heavy Tails and Financial Applications J.P Bouchaud with: M. Potters, G. Biroli, L. Laloux, M. A. Miceli http://www.cfm.fr Portfolio theory: Basics Portfolio
More informationApplications of random matrix theory to principal component analysis(pca)
Applications of random matrix theory to principal component analysis(pca) Jun Yin IAS, UW-Madison IAS, April-2014 Joint work with A. Knowles and H. T Yau. 1 Basic picture: Let H be a Wigner (symmetric)
More informationOn the distinguishability of random quantum states
1 1 Department of Computer Science University of Bristol Bristol, UK quant-ph/0607011 Distinguishing quantum states Distinguishing quantum states This talk Question Consider a known ensemble E of n quantum
More informationStatistics for Applications. Chapter 9: Principal Component Analysis (PCA) 1/16
Statistics for Applications Chapter 9: Principal Component Analysis (PCA) 1/16 Multivariate statistics and review of linear algebra (1) Let X be a d-dimensional random vector and X 1,..., X n be n independent
More informationOPTIMALITY AND SUB-OPTIMALITY OF PCA I: SPIKED RANDOM MATRIX MODELS
Submitted to the Annals of Statistics OPTIMALITY AND SUB-OPTIMALITY OF PCA I: SPIKED RANDOM MATRIX MODELS By Amelia Perry and Alexander S. Wein and Afonso S. Bandeira and Ankur Moitra Massachusetts Institute
More informationSparse PCA in High Dimensions
Sparse PCA in High Dimensions Jing Lei, Department of Statistics, Carnegie Mellon Workshop on Big Data and Differential Privacy Simons Institute, Dec, 2013 (Based on joint work with V. Q. Vu, J. Cho, and
More informationAssessing the dependence of high-dimensional time series via sample autocovariances and correlations
Assessing the dependence of high-dimensional time series via sample autocovariances and correlations Johannes Heiny University of Aarhus Joint work with Thomas Mikosch (Copenhagen), Richard Davis (Columbia),
More informationA direct formulation for sparse PCA using semidefinite programming
A direct formulation for sparse PCA using semidefinite programming A. d Aspremont, L. El Ghaoui, M. Jordan, G. Lanckriet ORFE, Princeton University & EECS, U.C. Berkeley A. d Aspremont, INFORMS, Denver,
More informationapproximation algorithms I
SUM-OF-SQUARES method and approximation algorithms I David Steurer Cornell Cargese Workshop, 201 meta-task encoded as low-degree polynomial in R x example: f(x) = i,j n w ij x i x j 2 given: functions
More informationCS281 Section 4: Factor Analysis and PCA
CS81 Section 4: Factor Analysis and PCA Scott Linderman At this point we have seen a variety of machine learning models, with a particular emphasis on models for supervised learning. In particular, we
More informationSparse Covariance Selection using Semidefinite Programming
Sparse Covariance Selection using Semidefinite Programming A. d Aspremont ORFE, Princeton University Joint work with O. Banerjee, L. El Ghaoui & G. Natsoulis, U.C. Berkeley & Iconix Pharmaceuticals Support
More informationStatistical-Computational Phase Transitions in Planted Models: The High-Dimensional Setting
: The High-Dimensional Setting Yudong Chen YUDONGCHEN@EECSBERELEYEDU Department of EECS, University of California, Berkeley, Berkeley, CA 94704, USA Jiaming Xu JXU18@ILLINOISEDU Department of ECE, University
More informationA direct formulation for sparse PCA using semidefinite programming
A direct formulation for sparse PCA using semidefinite programming A. d Aspremont, L. El Ghaoui, M. Jordan, G. Lanckriet ORFE, Princeton University & EECS, U.C. Berkeley Available online at www.princeton.edu/~aspremon
More informationComposite Loss Functions and Multivariate Regression; Sparse PCA
Composite Loss Functions and Multivariate Regression; Sparse PCA G. Obozinski, B. Taskar, and M. I. Jordan (2009). Joint covariate selection and joint subspace selection for multiple classification problems.
More informationRelaxations and Randomized Methods for Nonconvex QCQPs
Relaxations and Randomized Methods for Nonconvex QCQPs Alexandre d Aspremont, Stephen Boyd EE392o, Stanford University Autumn, 2003 Introduction While some special classes of nonconvex problems can be
More informationConvex Optimization M2
Convex Optimization M2 Lecture 8 A. d Aspremont. Convex Optimization M2. 1/57 Applications A. d Aspremont. Convex Optimization M2. 2/57 Outline Geometrical problems Approximation problems Combinatorial
More informationTractable Upper Bounds on the Restricted Isometry Constant
Tractable Upper Bounds on the Restricted Isometry Constant Alex d Aspremont, Francis Bach, Laurent El Ghaoui Princeton University, École Normale Supérieure, U.C. Berkeley. Support from NSF, DHS and Google.
More informationTighten after Relax: Minimax-Optimal Sparse PCA in Polynomial Time
Tighten after Relax: Minimax-Optimal Sparse PCA in Polynomial Time Zhaoran Wang Huanran Lu Han Liu Department of Operations Research and Financial Engineering Princeton University Princeton, NJ 08540 {zhaoran,huanranl,hanliu}@princeton.edu
More informationSparse PCA with applications in finance
Sparse PCA with applications in finance A. d Aspremont, L. El Ghaoui, M. Jordan, G. Lanckriet ORFE, Princeton University & EECS, U.C. Berkeley Available online at www.princeton.edu/~aspremon 1 Introduction
More informationExtreme eigenvalues of Erdős-Rényi random graphs
Extreme eigenvalues of Erdős-Rényi random graphs Florent Benaych-Georges j.w.w. Charles Bordenave and Antti Knowles MAP5, Université Paris Descartes May 18, 2018 IPAM UCLA Inhomogeneous Erdős-Rényi random
More informationPCA with random noise. Van Ha Vu. Department of Mathematics Yale University
PCA with random noise Van Ha Vu Department of Mathematics Yale University An important problem that appears in various areas of applied mathematics (in particular statistics, computer science and numerical
More informationProbabilistic Graphical Models
Probabilistic Graphical Models Brown University CSCI 2950-P, Spring 2013 Prof. Erik Sudderth Lecture 13: Learning in Gaussian Graphical Models, Non-Gaussian Inference, Monte Carlo Methods Some figures
More informationShort Course Robust Optimization and Machine Learning. Lecture 4: Optimization in Unsupervised Learning
Short Course Robust Optimization and Machine Machine Lecture 4: Optimization in Unsupervised Laurent El Ghaoui EECS and IEOR Departments UC Berkeley Spring seminar TRANSP-OR, Zinal, Jan. 16-19, 2012 s
More informationCS369N: Beyond Worst-Case Analysis Lecture #4: Probabilistic and Semirandom Models for Clustering and Graph Partitioning
CS369N: Beyond Worst-Case Analysis Lecture #4: Probabilistic and Semirandom Models for Clustering and Graph Partitioning Tim Roughgarden April 25, 2010 1 Learning Mixtures of Gaussians This lecture continues
More informationarxiv: v5 [math.na] 16 Nov 2017
RANDOM PERTURBATION OF LOW RANK MATRICES: IMPROVING CLASSICAL BOUNDS arxiv:3.657v5 [math.na] 6 Nov 07 SEAN O ROURKE, VAN VU, AND KE WANG Abstract. Matrix perturbation inequalities, such as Weyl s theorem
More informationStatistical-Computational Tradeoffs in Planted Problems and Submatrix Localization with a Growing Number of Clusters and Submatrices
Journal of Machine Learning Research 17 (2016) 1-57 Submitted 8/14; Revised 5/15; Published 4/16 Statistical-Computational Tradeoffs in Planted Problems and Submatrix Localization with a Growing Number
More informationRobust Principal Component Analysis
ELE 538B: Mathematics of High-Dimensional Data Robust Principal Component Analysis Yuxin Chen Princeton University, Fall 2018 Disentangling sparse and low-rank matrices Suppose we are given a matrix M
More informationOn the principal components of sample covariance matrices
On the principal components of sample covariance matrices Alex Bloemendal Antti Knowles Horng-Tzer Yau Jun Yin February 4, 205 We introduce a class of M M sample covariance matrices Q which subsumes and
More informationPainlevé Representations for Distribution Functions for Next-Largest, Next-Next-Largest, etc., Eigenvalues of GOE, GUE and GSE
Painlevé Representations for Distribution Functions for Next-Largest, Next-Next-Largest, etc., Eigenvalues of GOE, GUE and GSE Craig A. Tracy UC Davis RHPIA 2005 SISSA, Trieste 1 Figure 1: Paul Painlevé,
More information(k, q)-trace norm for sparse matrix factorization
(k, q)-trace norm for sparse matrix factorization Emile Richard Department of Electrical Engineering Stanford University emileric@stanford.edu Guillaume Obozinski Imagine Ecole des Ponts ParisTech guillaume.obozinski@imagine.enpc.fr
More informationEstimation of large dimensional sparse covariance matrices
Estimation of large dimensional sparse covariance matrices Department of Statistics UC, Berkeley May 5, 2009 Sample covariance matrix and its eigenvalues Data: n p matrix X n (independent identically distributed)
More informationInformation-theoretic bounds and phase transitions in clustering, sparse PCA, and submatrix localization
Information-theoretic bounds and phase transitions in clustering, sparse PCA, and submatrix localization Jess Banks Cristopher Moore Roman Vershynin Nicolas Verzelen Jiaming Xu Abstract We study the problem
More informationCommunity detection in stochastic block models via spectral methods
Community detection in stochastic block models via spectral methods Laurent Massoulié (MSR-Inria Joint Centre, Inria) based on joint works with: Dan Tomozei (EPFL), Marc Lelarge (Inria), Jiaming Xu (UIUC),
More informationNorms of Random Matrices & Low-Rank via Sampling
CS369M: Algorithms for Modern Massive Data Set Analysis Lecture 4-10/05/2009 Norms of Random Matrices & Low-Rank via Sampling Lecturer: Michael Mahoney Scribes: Jacob Bien and Noah Youngs *Unedited Notes
More informationPrincipal Component Analysis
Principal Component Analysis Laurenz Wiskott Institute for Theoretical Biology Humboldt-University Berlin Invalidenstraße 43 D-10115 Berlin, Germany 11 March 2004 1 Intuition Problem Statement Experimental
More informationReview. DS GA 1002 Statistical and Mathematical Models. Carlos Fernandez-Granda
Review DS GA 1002 Statistical and Mathematical Models http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall16 Carlos Fernandez-Granda Probability and statistics Probability: Framework for dealing with
More informationInformation Recovery from Pairwise Measurements
Information Recovery from Pairwise Measurements A Shannon-Theoretic Approach Yuxin Chen, Changho Suh, Andrea Goldsmith Stanford University KAIST Page 1 Recovering data from correlation measurements A large
More informationLarge sample covariance matrices and the T 2 statistic
Large sample covariance matrices and the T 2 statistic EURANDOM, the Netherlands Joint work with W. Zhou Outline 1 2 Basic setting Let {X ij }, i, j =, be i.i.d. r.v. Write n s j = (X 1j,, X pj ) T and
More informationAverage-case hardness of RIP certification
Average-case hardness of RIP certification Tengyao Wang Centre for Mathematical Sciences Cambridge, CB3 0WB, United Kingdom t.wang@statslab.cam.ac.uk Quentin Berthet Centre for Mathematical Sciences Cambridge,
More informationRandom Matrices and Multivariate Statistical Analysis
Random Matrices and Multivariate Statistical Analysis Iain Johnstone, Statistics, Stanford imj@stanford.edu SEA 06@MIT p.1 Agenda Classical multivariate techniques Principal Component Analysis Canonical
More informationSolving Corrupted Quadratic Equations, Provably
Solving Corrupted Quadratic Equations, Provably Yuejie Chi London Workshop on Sparse Signal Processing September 206 Acknowledgement Joint work with Yuanxin Li (OSU), Huishuai Zhuang (Syracuse) and Yingbin
More informationEIGENVECTOR OVERLAPS (AN RMT TALK) J.-Ph. Bouchaud (CFM/Imperial/ENS) (Joint work with Romain Allez, Joel Bun & Marc Potters)
EIGENVECTOR OVERLAPS (AN RMT TALK) J.-Ph. Bouchaud (CFM/Imperial/ENS) (Joint work with Romain Allez, Joel Bun & Marc Potters) Randomly Perturbed Matrices Questions in this talk: How similar are the eigenvectors
More informationEXTENDED GLRT DETECTORS OF CORRELATION AND SPHERICITY: THE UNDERSAMPLED REGIME. Xavier Mestre 1, Pascal Vallet 2
EXTENDED GLRT DETECTORS OF CORRELATION AND SPHERICITY: THE UNDERSAMPLED REGIME Xavier Mestre, Pascal Vallet 2 Centre Tecnològic de Telecomunicacions de Catalunya, Castelldefels, Barcelona (Spain) 2 Institut
More informationIntroduction to Graphical Models
Introduction to Graphical Models The 15 th Winter School of Statistical Physics POSCO International Center & POSTECH, Pohang 2018. 1. 9 (Tue.) Yung-Kyun Noh GENERALIZATION FOR PREDICTION 2 Probabilistic
More informationProbabilistic Graphical Models
Probabilistic Graphical Models Brown University CSCI 295-P, Spring 213 Prof. Erik Sudderth Lecture 11: Inference & Learning Overview, Gaussian Graphical Models Some figures courtesy Michael Jordan s draft
More informationRandom Matrix Theory Lecture 1 Introduction, Ensembles and Basic Laws. Symeon Chatzinotas February 11, 2013 Luxembourg
Random Matrix Theory Lecture 1 Introduction, Ensembles and Basic Laws Symeon Chatzinotas February 11, 2013 Luxembourg Outline 1. Random Matrix Theory 1. Definition 2. Applications 3. Asymptotics 2. Ensembles
More informationCanonical SDP Relaxation for CSPs
Lecture 14 Canonical SDP Relaxation for CSPs 14.1 Recalling the canonical LP relaxation Last time, we talked about the canonical LP relaxation for a CSP. A CSP(Γ) is comprised of Γ, a collection of predicates
More informationNumerical Solutions to the General Marcenko Pastur Equation
Numerical Solutions to the General Marcenko Pastur Equation Miriam Huntley May 2, 23 Motivation: Data Analysis A frequent goal in real world data analysis is estimation of the covariance matrix of some
More informationStatistical and Computational Phase Transitions in Planted Models
Statistical and Computational Phase Transitions in Planted Models Jiaming Xu Joint work with Yudong Chen (UC Berkeley) Acknowledgement: Prof. Bruce Hajek November 4, 203 Cluster/Community structure in
More informationSum-of-Squares Method, Tensor Decomposition, Dictionary Learning
Sum-of-Squares Method, Tensor Decomposition, Dictionary Learning David Steurer Cornell Approximation Algorithms and Hardness, Banff, August 2014 for many problems (e.g., all UG-hard ones): better guarantees
More informationRegularized Estimation of High Dimensional Covariance Matrices. Peter Bickel. January, 2008
Regularized Estimation of High Dimensional Covariance Matrices Peter Bickel Cambridge January, 2008 With Thanks to E. Levina (Joint collaboration, slides) I. M. Johnstone (Slides) Choongsoon Bae (Slides)
More informationQuadratic reformulation techniques for 0-1 quadratic programs
OSE SEMINAR 2014 Quadratic reformulation techniques for 0-1 quadratic programs Ray Pörn CENTER OF EXCELLENCE IN OPTIMIZATION AND SYSTEMS ENGINEERING ÅBO AKADEMI UNIVERSITY ÅBO NOVEMBER 14th 2014 2 Structure
More informationReview (Probability & Linear Algebra)
Review (Probability & Linear Algebra) CE-725 : Statistical Pattern Recognition Sharif University of Technology Spring 2013 M. Soleymani Outline Axioms of probability theory Conditional probability, Joint
More informationPreface to the Second Edition...vii Preface to the First Edition... ix
Contents Preface to the Second Edition...vii Preface to the First Edition........... ix 1 Introduction.............................................. 1 1.1 Large Dimensional Data Analysis.........................
More informationEE 227A: Convex Optimization and Applications October 14, 2008
EE 227A: Convex Optimization and Applications October 14, 2008 Lecture 13: SDP Duality Lecturer: Laurent El Ghaoui Reading assignment: Chapter 5 of BV. 13.1 Direct approach 13.1.1 Primal problem Consider
More informationSparse and Low Rank Recovery via Null Space Properties
Sparse and Low Rank Recovery via Null Space Properties Holger Rauhut Lehrstuhl C für Mathematik (Analysis), RWTH Aachen Convexity, probability and discrete structures, a geometric viewpoint Marne-la-Vallée,
More informationFast and Robust Phase Retrieval
Fast and Robust Phase Retrieval Aditya Viswanathan aditya@math.msu.edu CCAM Lunch Seminar Purdue University April 18 2014 0 / 27 Joint work with Yang Wang Mark Iwen Research supported in part by National
More informationPolyhedral Approaches to Online Bipartite Matching
Polyhedral Approaches to Online Bipartite Matching Alejandro Toriello joint with Alfredo Torrico, Shabbir Ahmed Stewart School of Industrial and Systems Engineering Georgia Institute of Technology Industrial
More informationInvertibility of random matrices
University of Michigan February 2011, Princeton University Origins of Random Matrix Theory Statistics (Wishart matrices) PCA of a multivariate Gaussian distribution. [Gaël Varoquaux s blog gael-varoquaux.info]
More informationMaximum variance formulation
12.1. Principal Component Analysis 561 Figure 12.2 Principal component analysis seeks a space of lower dimensionality, known as the principal subspace and denoted by the magenta line, such that the orthogonal
More information1 Tridiagonal matrices
Lecture Notes: β-ensembles Bálint Virág Notes with Diane Holcomb 1 Tridiagonal matrices Definition 1. Suppose you have a symmetric matrix A, we can define its spectral measure (at the first coordinate
More informationLecture 8: Statistical vs Computational Complexity Tradeoffs via SoS
CS369H: Hierarchies of Integer Programming Relaxations Spring 2016-2017 Lecture 8: Statistical vs Computational Complexity Tradeoffs via SoS Pravesh Kothari Scribes: Carrie and Vaggos Overview We show
More informationDeflation Methods for Sparse PCA
Deflation Methods for Sparse PCA Lester Mackey Computer Science Division University of California, Berkeley Berkeley, CA 94703 Abstract In analogy to the PCA setting, the sparse PCA problem is often solved
More informationA reverse Sidorenko inequality Independent sets, colorings, and graph homomorphisms
A reverse Sidorenko inequality Independent sets, colorings, and graph homomorphisms Yufei Zhao (MIT) Joint work with Ashwin Sah (MIT) Mehtaab Sawhney (MIT) David Stoner (Harvard) Question Fix d. Which
More informationApproximate Kernel PCA with Random Features
Approximate Kernel PCA with Random Features (Computational vs. Statistical Tradeoff) Bharath K. Sriperumbudur Department of Statistics, Pennsylvania State University Journées de Statistique Paris May 28,
More informationA direct formulation for sparse PCA using semidefinite programming
A direct formulation for sparse PCA using semidefinite programming Alexandre d Aspremont alexandre.daspremont@m4x.org Department of Electrical Engineering and Computer Science Laurent El Ghaoui elghaoui@eecs.berkeley.edu
More informationPrincipal Component Analysis
Principal Component Analysis Yuanzhen Shao MA 26500 Yuanzhen Shao PCA 1 / 13 Data as points in R n Assume that we have a collection of data in R n. x 11 x 21 x 12 S = {X 1 =., X x 22 2 =.,, X x m2 m =.
More informationComputational Lower Bounds for Community Detection on Random Graphs
Computational Lower Bounds for Community Detection on Random Graphs Bruce Hajek, Yihong Wu, Jiaming Xu Department of Electrical and Computer Engineering Coordinated Science Laboratory University of Illinois
More informationSDP Relaxation of Quadratic Optimization with Few Homogeneous Quadratic Constraints
SDP Relaxation of Quadratic Optimization with Few Homogeneous Quadratic Constraints Paul Tseng Mathematics, University of Washington Seattle LIDS, MIT March 22, 2006 (Joint work with Z.-Q. Luo, N. D. Sidiropoulos,
More informationarxiv: v3 [math-ph] 21 Jun 2012
LOCAL MARCHKO-PASTUR LAW AT TH HARD DG OF SAMPL COVARIAC MATRICS CLAUDIO CACCIAPUOTI, AA MALTSV, AD BJAMI SCHLI arxiv:206.730v3 [math-ph] 2 Jun 202 Abstract. Let X be a matrix whose entries are i.i.d.
More informationCS264: Beyond Worst-Case Analysis Lectures #9 and 10: Spectral Algorithms for Planted Bisection and Planted Clique
CS64: Beyond Worst-Case Analysis Lectures #9 and 10: Spectral Algorithms for Planted Bisection and Planted Clique Tim Roughgarden February 7 and 9, 017 1 Preamble Lectures #6 8 studied deterministic conditions
More informationComputational functional genomics
Computational functional genomics (Spring 2005: Lecture 8) David K. Gifford (Adapted from a lecture by Tommi S. Jaakkola) MIT CSAIL Basic clustering methods hierarchical k means mixture models Multi variate
More information16.1 L.P. Duality Applied to the Minimax Theorem
CS787: Advanced Algorithms Scribe: David Malec and Xiaoyong Chai Lecturer: Shuchi Chawla Topic: Minimax Theorem and Semi-Definite Programming Date: October 22 2007 In this lecture, we first conclude our
More informationSTAT 501 Assignment 1 Name Spring Written Assignment: Due Monday, January 22, in class. Please write your answers on this assignment
STAT 5 Assignment Name Spring Reading Assignment: Johnson and Wichern, Chapter, Sections.5 and.6, Chapter, and Chapter. Review matrix operations in Chapter and Supplement A. Examine the matrix properties
More informationPrincipal Component Analysis (PCA)
Principal Component Analysis (PCA) Additional reading can be found from non-assessed exercises (week 8) in this course unit teaching page. Textbooks: Sect. 6.3 in [1] and Ch. 12 in [2] Outline Introduction
More informationDecentralized Control of Stochastic Systems
Decentralized Control of Stochastic Systems Sanjay Lall Stanford University CDC-ECC Workshop, December 11, 2005 2 S. Lall, Stanford 2005.12.11.02 Decentralized Control G 1 G 2 G 3 G 4 G 5 y 1 u 1 y 2 u
More informationU.C. Berkeley CS294: Beyond Worst-Case Analysis Handout 8 Luca Trevisan September 19, 2017
U.C. Berkeley CS294: Beyond Worst-Case Analysis Handout 8 Luca Trevisan September 19, 2017 Scribed by Luowen Qian Lecture 8 In which we use spectral techniques to find certificates of unsatisfiability
More informationRobust Statistics, Revisited
Robust Statistics, Revisited Ankur Moitra (MIT) joint work with Ilias Diakonikolas, Jerry Li, Gautam Kamath, Daniel Kane and Alistair Stewart CLASSIC PARAMETER ESTIMATION Given samples from an unknown
More informationGaussian random variables inr n
Gaussian vectors Lecture 5 Gaussian random variables inr n One-dimensional case One-dimensional Gaussian density with mean and standard deviation (called N, ): fx x exp. Proposition If X N,, then ax b
More informationDescriptive Statistics
Descriptive Statistics DS GA 1002 Probability and Statistics for Data Science http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall17 Carlos Fernandez-Granda Descriptive statistics Techniques to visualize
More informationHigh-dimensional statistics: Some progress and challenges ahead
High-dimensional statistics: Some progress and challenges ahead Martin Wainwright UC Berkeley Departments of Statistics, and EECS University College, London Master Class: Lecture Joint work with: Alekh
More informationSparse principal component analysis via random projections
Sparse principal component analysis via random projections Milana Gataric, Tengyao Wang and Richard J. Samworth Statistical Laboratory, University of Cambridge {m.gataric,t.wang,r.samworth}@statslab.cam.ac.uk
More informationSlide a window along the input arc sequence S. Least-squares estimate. σ 2. σ Estimate 1. Statistically test the difference between θ 1 and θ 2
Corner Detection 2D Image Features Corners are important two dimensional features. Two dimensional image features are interesting local structures. They include junctions of dierent types Slide 3 They
More informationConvex Optimization in Classification Problems
New Trends in Optimization and Computational Algorithms December 9 13, 2001 Convex Optimization in Classification Problems Laurent El Ghaoui Department of EECS, UC Berkeley elghaoui@eecs.berkeley.edu 1
More informationSemidefinite and Second Order Cone Programming Seminar Fall 2001 Lecture 5
Semidefinite and Second Order Cone Programming Seminar Fall 2001 Lecture 5 Instructor: Farid Alizadeh Scribe: Anton Riabov 10/08/2001 1 Overview We continue studying the maximum eigenvalue SDP, and generalize
More informationConvergence of Eigenspaces in Kernel Principal Component Analysis
Convergence of Eigenspaces in Kernel Principal Component Analysis Shixin Wang Advanced machine learning April 19, 2016 Shixin Wang Convergence of Eigenspaces April 19, 2016 1 / 18 Outline 1 Motivation
More informationSample Geometry. Edps/Soc 584, Psych 594. Carolyn J. Anderson
Sample Geometry Edps/Soc 584, Psych 594 Carolyn J. Anderson Department of Educational Psychology I L L I N O I S university of illinois at urbana-champaign c Board of Trustees, University of Illinois Spring
More informationAsymptotic Distribution of the Largest Eigenvalue via Geometric Representations of High-Dimension, Low-Sample-Size Data
Sri Lankan Journal of Applied Statistics (Special Issue) Modern Statistical Methodologies in the Cutting Edge of Science Asymptotic Distribution of the Largest Eigenvalue via Geometric Representations
More informationGraph Partitioning Using Random Walks
Graph Partitioning Using Random Walks A Convex Optimization Perspective Lorenzo Orecchia Computer Science Why Spectral Algorithms for Graph Problems in practice? Simple to implement Can exploit very efficient
More informationSparse Principal Component Analysis Formulations And Algorithms
Sparse Principal Component Analysis Formulations And Algorithms SLIDE 1 Outline 1 Background What Is Principal Component Analysis (PCA)? What Is Sparse Principal Component Analysis (spca)? 2 The Sparse
More informationInsights into Large Complex Systems via Random Matrix Theory
Insights into Large Complex Systems via Random Matrix Theory Lu Wei SEAS, Harvard 06/23/2016 @ UTC Institute for Advanced Systems Engineering University of Connecticut Lu Wei Random Matrix Theory and Large
More informationRobust Sparse Principal Component Regression under the High Dimensional Elliptical Model
Robust Sparse Principal Component Regression under the High Dimensional Elliptical Model Fang Han Department of Biostatistics Johns Hopkins University Baltimore, MD 21210 fhan@jhsph.edu Han Liu Department
More informationSpectral Graph Theory and its Applications. Daniel A. Spielman Dept. of Computer Science Program in Applied Mathematics Yale Unviersity
Spectral Graph Theory and its Applications Daniel A. Spielman Dept. of Computer Science Program in Applied Mathematics Yale Unviersity Outline Adjacency matrix and Laplacian Intuition, spectral graph drawing
More information1 Principal Components Analysis
Lecture 3 and 4 Sept. 18 and Sept.20-2006 Data Visualization STAT 442 / 890, CM 462 Lecture: Ali Ghodsi 1 Principal Components Analysis Principal components analysis (PCA) is a very popular technique for
More informationApplied Linear Algebra in Geoscience Using MATLAB
Applied Linear Algebra in Geoscience Using MATLAB Contents Getting Started Creating Arrays Mathematical Operations with Arrays Using Script Files and Managing Data Two-Dimensional Plots Programming in
More information