Sum-of-Squares and Spectral Algorithms
1 Sum-of-Squares and Spectral Algorithms. Tselil Schramm. June 23, 2017, STOC 2017 Workshop.
2 Spectral algorithms as a tool for analyzing SoS. (Diagram: SoS → Semidefinite Programs → Spectral Algorithms.)
3 SoS suggests a new family of spectral algorithms! (Diagram: SoS → Semidefinite Programs → Spectral Algorithms → Average-Case & Structured Instances.)
4 Average-Case SoS/Spectral Algorithms. Tensor Decomposition/Dictionary Learning [Barak-Kelner-Steurer 14, Ge-Ma 15, Ma-Shi-Steurer 16]; Planted Sparse Vector [Barak-Brandão-Harrow-Kelner-Steurer-Zhou 12, Barak-Kelner-Steurer 14]; Tensor Completion [Barak-Moitra 16, Potechin-Steurer 17]; Refuting Random CSPs [Allen-O'Donnell-Witmer 15, Raghavendra-Rao-S 17]; Tensor Principal Component Analysis [Hopkins-Shi-Steurer 15, Bhattiprolu-Guruswami-Lee 16, Raghavendra-Rao-S 17].
5 (Repeat of slide 4.)
6 Tensor Principal Component Analysis (TPCA). Given T ∈ R^{n×n×n}, we want the ``max tensor singular value/vector'': σ = max_{x ∈ S^{n−1}} ⟨T, x^{⊗3}⟩ and x* = argmax_{x ∈ S^{n−1}} ⟨T, x^{⊗3}⟩. NP-hard in the worst case.
7 This notation. Definition (Kronecker/tensor product): for A ∈ R^{n×m} and B ∈ R^{k×l}, A ⊗ B is the nk × ml block matrix whose (i,j)-th block is A_{i,j}·B (first block row A_{1,1}B, …, A_{1,m}B; last block row A_{n,1}B, …, A_{n,m}B). E.g. the tensor power of x ∈ R^n: x^{⊗3} ∈ R^{n³} has entries (x^{⊗3})_{ijk} = x_i x_j x_k.
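A quick numerical check of this notation (a minimal numpy sketch; the variable names are ours): np.kron implements the Kronecker product, and the flattened tensor power x^{⊗3} coincides with the iterated Kronecker product of x with itself.

```python
import numpy as np

n, m, k, l = 2, 3, 2, 4
A = np.random.randn(n, m)                 # A is n x m
B = np.random.randn(k, l)                 # B is k x l

AB = np.kron(A, B)                        # block matrix with (i,j) block A[i,j] * B
assert AB.shape == (n * k, m * l)         # Kronecker product is nk x ml

x = np.random.randn(4)
x3 = np.einsum('i,j,k->ijk', x, x, x)     # tensor power: (x^{⊗3})_{ijk} = x_i x_j x_k
assert np.isclose(x3[0, 1, 2], x[0] * x[1] * x[2])
assert np.allclose(x3.reshape(-1), np.kron(np.kron(x, x), x))
```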
8 TPCA, restated. T ∈ R^{n×n×n}; want σ = max_{x ∈ S^{n−1}} ⟨T, x^{⊗3}⟩ and x* = argmax_{x ∈ S^{n−1}} ⟨T, x^{⊗3}⟩; NP-hard in the worst case. (Picture: T contracted against x ⊗ x ⊗ x.)
9 Spiked tensor model for TPCA [Montanari-Richard 14]. Planted case: T = λ·v^{⊗3} + G, a signal v plus a noise tensor G ∈ R^{n×n×n} with i.i.d. N(0,1) entries. Random case: T = G, entries N(0,1). Three tasks: Search: find v in the planted case. Distinguishing: planted or random case? Refutation: certify an upper bound on max_x ⟨T, x^{⊗3}⟩ in the random case.
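The model takes a few lines of numpy to simulate; a minimal sketch (the helper name spiked_tensor is ours, and it is reused in a later example):

```python
import numpy as np

def spiked_tensor(n, lam, seed=0):
    """Return (T, v) with T = lam * v^{⊗3} + G, G having i.i.d. N(0,1) entries."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(n)
    v /= np.linalg.norm(v)                           # planted unit vector
    G = rng.standard_normal((n, n, n))               # noise tensor
    return lam * np.einsum('i,j,k->ijk', v, v, v) + G, v

T, v = spiked_tensor(n=50, lam=30.0)
print(np.einsum('ijk,i,j,k->', T, v, v, v))          # <T, v^{⊗3}> ≈ lam + N(0,1)
```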
10 The Plan. 1. SoS suggests a family of spectral algorithms. 2. A naïve spectral algorithm. 3. Improving it with SoS spectral algorithms (refutation: certify an upper bound on max_x ⟨T, x^{⊗3}⟩ in the random case T = G). 4. Using the SoS analysis to get fast algorithms (search: find v in the planted case T = λ·v^{⊗3} + G).
11 Degree-D SoS. Solve for a linear operator Ẽ (a ``pseudoexpectation'') mapping polynomials p(x) in n variables with deg p ≤ D to Ẽ[p(x)] ∈ R, subject to: Linearity: Ẽ[a·p(x) + b·q(x)] = a·Ẽ[p(x)] + b·Ẽ[q(x)]. Fixed scalars: Ẽ[1] = 1. Non-negative squares: Ẽ[q(x)²] ≥ 0 whenever deg q ≤ D/2. Plus problem-specific constraints, e.g. Ẽ[‖x‖²] = 1.
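For intuition, the lowest-degree case is a small semidefinite program over a moment matrix. Here is a minimal sketch (ours, not from the talk) of a degree-2 pseudoexpectation over the unit sphere, written with the cvxpy modeling library; for a quadratic objective the optimum matches the spectral bound λ_max exactly.

```python
import cvxpy as cp
import numpy as np

n = 5
A = np.random.randn(n, n)
A = (A + A.T) / 2                          # objective: maximize Ẽ[x' A x]

# Moment matrix of (1, x_1, ..., x_n): M[0,0] = Ẽ[1], M[1:,1:] = Ẽ[x x'].
M = cp.Variable((n + 1, n + 1), PSD=True)  # PSD <=> Ẽ[q(x)^2] >= 0 for deg q <= 1
constraints = [M[0, 0] == 1,               # fixed scalars: Ẽ[1] = 1
               cp.trace(M[1:, 1:]) == 1]   # problem-specific: Ẽ[||x||^2] = 1
prob = cp.Problem(cp.Maximize(cp.trace(A @ M[1:, 1:])), constraints)
prob.solve()
print(prob.value, np.linalg.eigvalsh(A).max())   # the two values agree
```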
12 SoS suggests spectral algorithms. If we want to bound f(x): associate some matrix with f, and then: rearrange entries along monomial symmetries; apply degree-d SoS polynomial inequalities (Cauchy-Schwarz, Jensen's inequality for squares, …); use problem-specific constraints (e.g. x_i² = 1).
13 SoS captures spectral algorithms. Theorem: Ẽ[f(x)] ≤ λ_max(f). Definition: λ_max(f) = min { λ_max(F) : F symmetric, f(x) = ⟨F, x^{⊗2d}⟩ }; such an F is a symmetric matrix representation of f(x).
14 SoS captures spectral algorithms: proof. Ẽ[f(x)] = Ẽ[⟨F, x^{⊗2d}⟩] for a symmetric matrix representation F of f(x). Let λ = λ_max(F). Then λ·Id − F ⪰ 0, so λ·Id − F = Σ_i σ_i u_i u_iᵀ with all σ_i ≥ 0, and ⟨λ·Id − F, x^{⊗2d}⟩ = Σ_i σ_i ⟨u_i, x^{⊗d}⟩², a sum of degree-d squares (usable if 2d ≤ D).
15 SoS captures spectral algorithms: proof, continued. Ẽ[f(x)] = Ẽ[⟨F, x^{⊗2d}⟩] ≤ Ẽ[⟨F, x^{⊗2d}⟩] + Ẽ[⟨λ·Id − F, x^{⊗2d}⟩] (the added term is a sum of degree-d squares, so its pseudoexpectation is non-negative when 2d ≤ D) = Ẽ[⟨λ·Id, x^{⊗2d}⟩] by linearity = λ·Ẽ[‖x‖^{2d}] (squares on the diagonal) = λ.
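A numerical sanity check of the theorem's conclusion (a sketch with a made-up representation F): for unit x, f(x) = ⟨F, x^{⊗4}⟩ = (x ⊗ x)ᵀ F (x ⊗ x) never exceeds λ_max(F), since ‖x ⊗ x‖ = ‖x‖² = 1.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
F = rng.standard_normal((n**2, n**2))
F = (F + F.T) / 2                          # an arbitrary symmetric representation
lam_max = np.linalg.eigvalsh(F).max()

for _ in range(1000):
    x = rng.standard_normal(n)
    x /= np.linalg.norm(x)                 # random unit vector
    xx = np.kron(x, x)                     # x^{⊗2}, also a unit vector
    assert xx @ F @ xx <= lam_max + 1e-9   # f(x) = <F, x^{⊗4}> <= λ_max(F)
```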
16 What kind of spectral algorithms? Choose the best matrix representation F by: rearranging entries along symmetries of x^{⊗d}; applying degree-d SoS polynomial inequalities (Cauchy-Schwarz, Jensen's inequality for squares, …); using problem-specific constraints (e.g. x_i² = 1).
17 (Repeat of slide 16.)
18 SoS suggests several spectral algorithms: the bound λ_max(F)·Ẽ[‖x‖^{2d}] depends on the choice of matrix representation F of f(x)!
19 SoS suggests several spectral algorithms. Claim: there exist f(x) with representations F₁, F₂ such that f(x) = ⟨F₁, x^{⊗2d}⟩ = ⟨F₂, x^{⊗2d}⟩ but λ_max(F₁) ≠ λ_max(F₂). Example: for unit vectors x, f(x) = E_{g∼N(0,Id)}[⟨x, g⟩⁴] = 3 (the fourth moment of N(0,1)).
20 (Claim, continued.) f(x) = E_{g∼N(0,Id)}[⟨x, g⟩⁴] = ⟨E[g^{⊗4}], x^{⊗4}⟩, where the Gaussian moments are E[g_i g_j g_k g_l] = 3 if i = j = k = l; 1 if {i, j, k, l} consists of two distinct pairs; 0 if any index appears with odd multiplicity.
21 (Claim, continued.) Viewing F₁ = E[g^{⊗4}] as an n² × n² matrix (rows and columns indexed by pairs), it has 3's at the (ii, ii) entries and 1's at the (ii, jj), (ij, ij), and (ij, ji) entries; its top eigenvalue is ≈ n ≫ 3 (witnessed by the direction Σ_i e_i ⊗ e_i).
22 Rearranging entries along symmetries of x^{⊗2}. Because x_i x_j · x_i x_j = x_i² x_j² = x_i x_i · x_j x_j, adding (x_i² x_j² − x_i² x_j²) = 0 to f lets us move mass between entries: e.g. −1 at entry (ii, jj) and +1 at entry (ij, ij) leaves the polynomial unchanged.
23–25 (Animation: step by step, the unit mass at the off-diagonal entries (ii, jj) and (ij, ji) is moved onto the diagonal entries (ij, ij), leaving the polynomial unchanged.)
26 (Claim, concluded.) After rearranging, the representation becomes F₂ = 3·Id, whose eigenvalues are all 3!
27 Indeed: f(x) = ⟨E[g^{⊗4}], x^{⊗4}⟩ = ⟨3·Id, x^{⊗4}⟩ = 3‖x‖⁴ = 3 for unit x.
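The whole example is easy to verify numerically (a sketch): F1 = E[g^{⊗4}] is built from the Gaussian moment formula E[g_i g_j g_k g_l] = δ_{ij}δ_{kl} + δ_{ik}δ_{jl} + δ_{il}δ_{jk}, F2 = 3·Id, and both evaluate to 3 on the sphere while their top eigenvalues differ (≈ n + 2 versus 3).

```python
import numpy as np

n = 8
I = np.eye(n)
# F1[(i,j),(k,l)] = E[g_i g_j g_k g_l] = d_ij d_kl + d_ik d_jl + d_il d_jk
F1 = (np.einsum('ij,kl->ijkl', I, I) + np.einsum('ik,jl->ijkl', I, I)
      + np.einsum('il,jk->ijkl', I, I)).reshape(n**2, n**2)
F2 = 3 * np.eye(n**2)                      # the rearranged representation

rng = np.random.default_rng(0)
x = rng.standard_normal(n)
x /= np.linalg.norm(x)
xx = np.kron(x, x)
print(xx @ F1 @ xx, xx @ F2 @ xx)          # both ≈ 3: same polynomial on the sphere
print(np.linalg.eigvalsh(F1).max())        # ≈ n + 2
print(np.linalg.eigvalsh(F2).max())        # exactly 3
```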
28 What kind of spectral algorithms? Choose the best matrix representation by: rearranging entries along symmetries of x^{⊗d}; applying degree-d SoS polynomial inequalities (Cauchy-Schwarz, Jensen's inequality for squares, …).
29 Tensor norm refutation (random case, noise only). Claim: max_{x ∈ S^{n−1}} ⟨G, x^{⊗3}⟩ ≤ O(√n) with high probability over G. The simple spectral algorithm can only certify O(n). Proof of the spectral bound: ⟨G, x^{⊗3}⟩ = xᵀ F_G (x ⊗ x) ≤ σ_max(F_G), where F_G ∈ R^{n×n²} is a flattening of G; by Gordon's theorem, σ_max(F_G) ≲ √n + √(n²) ≈ n. (All the matrix representations behave the same here, because G is symmetric with i.i.d. entries.)
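In code, the naive certificate is one reshape and one singular value (a sketch): for unit x, ⟨G, x^{⊗3}⟩ = xᵀ G_flat (x ⊗ x) ≤ σ_max(G_flat), and σ_max of an n × n² Gaussian matrix concentrates near √n + n ≈ n, far above the true maximum of order √n.

```python
import numpy as np

n = 60
rng = np.random.default_rng(0)
G = rng.standard_normal((n, n, n))
G_flat = G.reshape(n, n * n)                     # n x n^2 flattening of G

sigma = np.linalg.svd(G_flat, compute_uv=False)[0]
print(sigma, np.sqrt(n) + n)                     # certified bound: ≈ n

x = rng.standard_normal(n)
x /= np.linalg.norm(x)
print(x @ G_flat @ np.kron(x, x))                # <G, x^{⊗3}>: O(1) for a fixed unit x
```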
30 SoS Cauchy-Schwarz. Claim: Ẽ[p(x)·q(x)] ≤ Ẽ[p(x)²]^{1/2} · Ẽ[q(x)²]^{1/2} if deg(p·q) ≤ D. Proof: expand the degree-D square Ẽ[(√c·p(x) − q(x)/√c)²] ≥ 0 and optimize over c > 0 (worked out below).
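Worked out (our reconstruction of the elided proof), for any c > 0:

\[
0 \;\le\; \tilde{\mathbb{E}}\!\left[\left(\sqrt{c}\,p(x) - \tfrac{1}{\sqrt{c}}\,q(x)\right)^{2}\right]
  \;=\; c\,\tilde{\mathbb{E}}[p(x)^2] \;-\; 2\,\tilde{\mathbb{E}}[p(x)q(x)] \;+\; \tfrac{1}{c}\,\tilde{\mathbb{E}}[q(x)^2],
\]

so Ẽ[p(x)q(x)] ≤ (c/2)·Ẽ[p(x)²] + (1/2c)·Ẽ[q(x)²]; choosing c = (Ẽ[q(x)²]/Ẽ[p(x)²])^{1/2} gives the claim.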
31 Cauchy-Schwarz for Tensor PCA refutation. Theorem: for unit x and noise G with N(0,1) entries, Ẽ[⟨G, x^{⊗3}⟩] ≲ n^{3/4}. Proof: write ⟨G, x^{⊗3}⟩ = Σ_{i=1}^n x_i ⟨G_i, x xᵀ⟩, where G_i are the n × n slices of G, and apply SoS Cauchy-Schwarz.
32 (Proof, continued.) Ẽ[⟨G, x^{⊗3}⟩]² ≤ Ẽ[Σ_i x_i²] · Ẽ[Σ_i ⟨G_i, x xᵀ⟩²] by Cauchy-Schwarz; the second factor is Ẽ[⟨M, x^{⊗4}⟩] for the matrix M = Σ_i vec(G_i) vec(G_i)ᵀ, whose relevant eigenvalues (after rearranging along symmetries) are O(n^{3/2}) w.h.p. The SoS analysis is itself a spectral algorithm for refutation!
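As one display (our reconstruction, using Ẽ[Σ_i x_i²] = 1 and the eigenvalue bound stated on the slide):

\[
\tilde{\mathbb{E}}\!\left[\langle G, x^{\otimes 3}\rangle\right]^{2}
\;\le\; \tilde{\mathbb{E}}\!\Big[\sum_i x_i^2\Big]\cdot
        \tilde{\mathbb{E}}\!\Big[\sum_i \langle G_i, xx^{\top}\rangle^{2}\Big]
\;=\; \tilde{\mathbb{E}}\!\Big[\Big\langle \sum_i \mathrm{vec}(G_i)\,\mathrm{vec}(G_i)^{\top},\; x^{\otimes 4}\Big\rangle\Big]
\;\le\; O(n^{3/2}),
\]

so Ẽ[⟨G, x^{⊗3}⟩] ≤ O(n^{3/4}).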
33 What kind of spectral algorithms? Choose the best matrix representation by: rearranging entries along symmetries of x^{⊗d}; applying degree-d SoS polynomial inequalities (Cauchy-Schwarz, Jensen's inequality for squares, …).
34 Better approximation (in more time). Theorem: for unit x and noise N(0,1), Ẽ[⟨G, x^{⊗3}⟩] ≲ n^{3/4}. Theorem: for unit x and noise N(0,1), in time n^{O(D)}, Ẽ[⟨G, x^{⊗3}⟩] ≲ n^{3/4}/D^{1/4} (up to logarithmic factors). But actually max_{x ∈ S^{n−1}} ⟨G, x^{⊗3}⟩ ≤ O(√n): information-theoretically, one can certify O(√n) in time 2^{O(n)} (epsilon net).
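The information-theoretic claim is the standard net argument (our reconstruction): for each fixed unit x, ⟨G, x^{⊗3}⟩ ∼ N(0,1), so over a (1/3)-net N of S^{n−1} (of size 2^{O(n)}),

\[
\Pr\Big[\max_{x \in N}\,\langle G, x^{\otimes 3}\rangle > C\sqrt{n}\Big]
\;\le\; |N|\cdot e^{-C^{2} n/2} \;=\; o(1)
\]

for a large enough constant C, and the maximum over all of S^{n−1} exceeds the maximum over the net by at most a constant factor.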
35 (Plot: Tensor PCA refutation via SoS and spectral algorithms; x-axis log D / log n from 0 to 1, y-axis approximation factor. At degree D = n^ε the certified bound is ≈ n^{3/4 − ε/4}, interpolating from approximation factor n^{1/4} at constant degree down to O(1) at D ≈ n; the tradeoff is matched by SoS lower bounds [Hopkins-Kothari-Potechin-Raghavendra-S-Steurer 17].)
36 Jensen's inequality. For d a power of 2 and D ≥ d·deg(f): Ẽ[f(x)]^d ≤ Ẽ[f(x)^d]. Proof by induction on d, squaring and applying the inductive hypothesis at each step (worked out below).
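Worked out (base case plus induction step, using the SoS Cauchy-Schwarz from slide 30):

\[
\tilde{\mathbb{E}}[f(x)]^{2} \;\le\; \tilde{\mathbb{E}}[f(x)^{2}]
\qquad\text{and}\qquad
\tilde{\mathbb{E}}[f(x)]^{2d} \;=\; \left(\tilde{\mathbb{E}}[f(x)]^{d}\right)^{2}
\;\le\; \left(\tilde{\mathbb{E}}[f(x)^{d}]\right)^{2}
\;\le\; \tilde{\mathbb{E}}[f(x)^{2d}],
\]

where the middle inequality is the inductive hypothesis and the outer ones are Cauchy-Schwarz with q = 1 (valid as long as 2d·deg(f) ≤ D).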
37 Jensen's inequality (for d a power of 2, D ≥ d·deg(f)) lets us pass from f to f^d: we can take advantage of the increased symmetry of higher-degree polynomials (more matrix representations).
38 Better approximation. Theorem: for unit x and noise N(0,1), in time n^{O(D)}, Ẽ[⟨G, x^{⊗3}⟩] ≲ n^{3/4}/D^{1/4}. Proof: apply Jensen's inequality with d some power of 2, Ẽ[⟨G, x^{⊗3}⟩]^{2d} ≤ Ẽ[⟨G, x^{⊗3}⟩^{2d}], and bound the right-hand side with a matrix representation of ⟨G, x^{⊗3}⟩^{2d}.
39 (Repeat of slide 38.)
40 Symmetrize to improve the eigenvalue. Rows and columns are indexed by ordered multisets S of 2d variables, so the matrix is n^{2d} × n^{2d} and its entries are degree-2d polynomials in the G_{ijk} ∼ N(0,1). Since Π_{i∈S} x_i = Π_{j∈π(S)} x_j for any permutation π, taking the average of row S and row π(S) fixes the polynomial while changing the matrix.
41 Symmetrizing to improve the eigenvalue, e.g. for d = 2: average over permutations of the row (and column) indices, M^{(2)}_{abcd,ijkl} = (1/4!) · [ (Σ_i G_i ⊗ G_i)_{ab,ij} (Σ_i G_i ⊗ G_i)_{cd,kl} + (Σ_i G_i ⊗ G_i)_{ac,ij} (Σ_i G_i ⊗ G_i)_{bd,kl} + … ].
42 Heuristic spectral norm calculation. What is the spectral norm of the symmetrized matrix Π (Σ_i G_i ⊗ G_i)^{⊗d} Π? The unsymmetrized matrix has norm ≈ (n^{3/2})^d. After symmetrizing, each entry is an average of ~d! i.i.d. uniformly randomly signed variables of magnitude m, so a typical entry shrinks from m to ≈ m/√(d!).
43 Improving Tensor PCA. Theorem: for unit x and noise N(0,1): apply Jensen's inequality with d some power of 2 (valid if D ≥ 4d), then average over the symmetries of x^{⊗2d} to reduce the eigenvalues of the matrix representation; this gives Ẽ[⟨G, x^{⊗3}⟩] ≤ (Ẽ[⟨x^{⊗4d}, M^{(d)}⟩])^{1/2d} ≤ ‖M^{(d)}‖^{1/2d} w.h.p.
44 What kind of spectral algorithms? Choose the best matrix representation by: rearranging entries along symmetries of x^{⊗d}; applying degree-d SoS polynomial inequalities (Cauchy-Schwarz, Jensen's inequality for squares, …).
45 Other SoS (via spectral) algorithms. Tensor Decomposition: symmetry, Cauchy-Schwarz, constant-d Jensen's; Dictionary Learning: symmetry + tensor decomposition [Barak-Kelner-Steurer 14, Ge-Ma 15, Ma-Shi-Steurer 16]. Planted Sparse Vector: symmetry [Barak-Brandão-Harrow-Kelner-Steurer-Zhou 12, Barak-Kelner-Steurer 14]. Tensor Completion: symmetry, Cauchy-Schwarz [Barak-Moitra 16, Potechin-Steurer 17]. Refuting Random CSPs: symmetry, Cauchy-Schwarz, Jensen's, (x_i² = 1) constraints [Allen-O'Donnell-Witmer 15, Raghavendra-Rao-S 17]. Polynomial Maximization over S^{n−1}: symmetry, Cauchy-Schwarz, Jensen's, worst case [Bhattiprolu-Ghosh-Guruswami-E.Lee-Tulsiani 16].
46 Fast spectral algorithms from SoS analyses [Hopkins-S-Shi-Steurer 16]
47 SoS gives a spectral search algorithm. With T = G + λ·v^{⊗3}, the slices are T_i = G_i + λ·v_i·vvᵀ (with Σ_i v_i² = ‖v‖² = 1). Form the n² × n² matrix Σ_i T_i ⊗ T_i = Σ_i G_i ⊗ G_i + cross-terms + λ²·(vvᵀ) ⊗ (vvᵀ): the noise part has eigenvalues ≈ n^{3/2}, while the signal part has eigenvalue λ².
48 Running in… what time? For T = G + λ·v^{⊗3}: building Σ_i T_i ⊗ T_i means summing n matrices of size n² × n², which takes n⁵ time; computing the top eigenvector of the n² × n² result takes ≈ n⁴ log n time. Total ≈ n⁵ + n⁴ log n. A practical spectral algorithm? Theorem: we can compress the matrix to get an O(n³)-time algorithm.
49 Compressing the matrix. Theorem: there is an O(n³)-time algorithm. Recall Σ_i T_i ⊗ T_i = Σ_i G_i ⊗ G_i + cross-terms + λ²·(vvᵀ) ⊗ (vvᵀ), with noise eigenvalues ≈ n^{3/2} and signal eigenvalue λ². How do we reduce the dimension while preserving the signal-to-noise ratio? Partial trace: for n × n matrices A, B, Tr_par(A ⊗ B) = Tr(A)·B. Then Tr_par(λ²·(vvᵀ) ⊗ (vvᵀ)) = λ²·‖v‖²·vvᵀ = λ²·vvᵀ, and Tr_par(Σ_i G_i ⊗ G_i) = Σ_i Tr(G_i)·G_i, whose eigenvalues are still ≈ ±n^{3/2} (each Tr(G_i) ≈ ±n^{1/2}): the signal-to-noise ratio is preserved!
50 Compressing the matrix: runtime. Computing all the Tr(T_i): n² time. Each of the n² entries of Σ_i Tr(T_i)·T_i is a sum of n numbers: n³ time, linear in the input! Computing the top eigenvector/eigenvalue of the resulting n × n matrix: n² log n time.
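Slides 47–50 combine into a few lines of numpy (a sketch under the spiked model; spiked_tensor is the hypothetical helper from the slide-9 example):

```python
import numpy as np

def recover_spike(T):
    """Partial-trace spectral algorithm: top eigenvector of sum_i Tr(T_i) * T_i."""
    traces = np.einsum('ijj->i', T)        # all Tr(T_i): n^2 time
    M = np.einsum('i,ijk->jk', traces, T)  # sum_i Tr(T_i) * T_i: n^3 time
    M = (M + M.T) / 2                      # symmetrize before eigendecomposition
    _, vecs = np.linalg.eigh(M)
    return vecs[:, -1]                     # top eigenvector ≈ ±v

n = 100
lam = 4 * n ** 0.75                        # above the ≈ n^{3/4} recovery threshold
T, v = spiked_tensor(n, lam)               # hypothetical helper from the earlier sketch
v_hat = recover_spike(T)
print(abs(v_hat @ v))                      # correlation with the spike: close to 1
```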
51 Fast spectral algorithms via SoS. Secret sauce: apply a partial trace to the SoS matrix (in a way that enables fast power iteration). Tensor PCA [Hopkins-S-Shi-Steurer 16]; Tensor decomposition [Hopkins-S-Shi-Steurer 16, S-Steurer 17]; Planted Sparse Vector [Hopkins-S-Shi-Steurer 16]; Tensor Completion [Montanari-Sun 17].
52 The SoS perspective gives new spectral algorithms; spectral techniques let us analyze SoS. Worst-case problems? (Diagram: Sum-of-Squares Algorithms ↔ Spectral Algorithms → Average-Case & Structured Instances.)