Sum-of-Squares and Spectral Algorithms


1 Sum-of-Squares and Spectral Algorithms. Tselil Schramm. June 23, 2017, STOC 2017 Workshop.

2 Spectral algorithms as a tool for analyzing SoS. [Diagram: SoS → Semidefinite Programs → Spectral Algorithms.]

3 SoS suggests a new family of spectral algorithms! [Diagram: SoS → Semidefinite Programs → Spectral Algorithms → Average-Case & Structured Instances.]

4 Average Case SoS/Spectral Algorithms. Tensor Decomposition/Dictionary Learning [Barak-Kelner-Steurer 14, Ge-Ma 15, Ma-Shi-Steurer 16]. Planted Sparse Vector [Barak-Brandão-Harrow-Kelner-Steurer-Zhou 12, Barak-Kelner-Steurer 14]. Tensor Completion [Barak-Moitra 16, Potechin-Steurer 17]. Refuting Random CSPs [Allen-O'Donnell-Witmer 15, Raghavendra-Rao-S 17]. Tensor Principal Components Analysis [Hopkins-Shi-Steurer 15, Bhattiprolu-Guruswami-Lee 16, Raghavendra-Rao-S 17].

6 Tensor Principal Components Analysis (TPCA). Given a tensor T ∈ R^{n×n×n}, want the ``max tensor singular value/vector'': σ = max_{x ∈ S^{n-1}} ⟨T, x^{⊗3}⟩ and x* = argmax_{x ∈ S^{n-1}} ⟨T, x^{⊗3}⟩. NP-hard in the worst case.

7 Notation. Kronecker/tensor product: if A is n×m and B is k×l, then A ⊗ B is the nk×ml matrix whose (i,j) block is A_{i,j}·B. Tensor power of x ∈ R^n: x^{⊗3} ∈ R^{n^3} has entries (x^{⊗3})_{ijk} = x_i x_j x_k.
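For concreteness, a minimal numpy sketch of these two operations (variable names are mine, not from the talk):

```python
import numpy as np

n, m, k, l = 3, 2, 2, 2
A = np.random.randn(n, m)
B = np.random.randn(k, l)

# Kronecker product: (nk x ml) matrix whose (i,j) block is A[i,j] * B
AkronB = np.kron(A, B)
assert AkronB.shape == (n * k, m * l)

# Tensor power of a vector: x^{⊗3} has entries x_i * x_j * x_k
x = np.random.randn(n)
x3 = np.einsum('i,j,k->ijk', x, x, x)        # shape (n, n, n)
assert np.isclose(x3[0, 1, 2], x[0] * x[1] * x[2])
```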

8 Tensor Principal Components Analysis (TPCA). Given T ∈ R^{n×n×n}, want the ``max tensor singular value/vector'': σ = max_{x ∈ S^{n-1}} ⟨T, x^{⊗3}⟩ and x* = argmax_{x ∈ S^{n-1}} ⟨T, x^{⊗3}⟩. NP-hard in the worst case. [Picture: ⟨T, x ⊗ x ⊗ x⟩.]

9 Spiked tensor model for TPCA [Montanari-Richard 14]. Planted case: T = λ·v^{⊗3} (signal) + G (noise), where G ∈ R^{n×n×n} has i.i.d. N(0,1) entries. Random case: T = G, entries i.i.d. N(0,1). Three tasks: Search: find v in the planted case. Distinguishing: planted or random case? Refutation: certify an upper bound on max_x ⟨T, x^{⊗3}⟩ in the random case.
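A small numpy sketch of the two distributions in the spiked tensor model (the helper names and the value of λ are illustrative):

```python
import numpy as np

def random_instance(n, rng):
    """Random case: an n x n x n tensor of i.i.d. N(0,1) entries."""
    return rng.standard_normal((n, n, n))

def planted_instance(n, lam, rng):
    """Planted case: lam * v⊗v⊗v plus i.i.d. N(0,1) noise."""
    v = rng.standard_normal(n)
    v /= np.linalg.norm(v)
    signal = lam * np.einsum('i,j,k->ijk', v, v, v)
    return signal + random_instance(n, rng), v

rng = np.random.default_rng(0)
T, v = planted_instance(n=50, lam=20.0, rng=rng)
# <T, v^{⊗3}> should be close to lam in the planted case
print(np.einsum('ijk,i,j,k->', T, v, v, v))
```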

10 The Plan. Refutation (random case, T = G with i.i.d. N(0,1) entries): certify an upper bound on max_x ⟨T, x^{⊗3}⟩. 1. SoS suggests a family of spectral algorithms. 2. A naïve spectral algorithm. 3. Improving it with SoS spectral algorithms. Search (planted case, T = λ·v^{⊗3} + G): find v. 4. Use the SoS analysis to get fast algorithms.

11 Degree-D SoS. Solve for a pseudoexpectation Ẽ acting on polynomials p(x) of degree ≤ D in n variables, satisfying: Linearity: Ẽ[a·p(x) + b·q(x)] = a·Ẽ[p(x)] + b·Ẽ[q(x)]. Fixed scalars: Ẽ[1] = 1. Non-negative squares: Ẽ[q(x)^2] ≥ 0 for deg q ≤ D/2. Plus problem-specific constraints, e.g. Ẽ[‖x‖^2] = 1.
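For intuition, here is a hedged sketch of the degree-2 version of this program in cvxpy (my own illustrative formulation, not code from the talk): the moment matrix M = Ẽ[(1, x)(1, x)^T] is PSD (non-negative squares for linear q), M[0,0] = 1 (Ẽ[1] = 1), and the trace of its x-block is 1 (Ẽ[‖x‖²] = 1).

```python
import numpy as np
import cvxpy as cp

n = 10
A = np.random.randn(n, n)
A = (A + A.T) / 2                      # maximize the quadratic x^T A x over the sphere

# Moment matrix M = Ẽ[(1, x)(1, x)^T]; PSD encodes Ẽ[q(x)^2] >= 0 for linear q
M = cp.Variable((n + 1, n + 1), PSD=True)
constraints = [
    M[0, 0] == 1,                      # Ẽ[1] = 1
    cp.trace(M[1:, 1:]) == 1,          # Ẽ[||x||^2] = 1
]
objective = cp.Maximize(cp.trace(A @ M[1:, 1:]))   # Ẽ[x^T A x]
prob = cp.Problem(objective, constraints)
prob.solve()
print(prob.value, np.max(np.linalg.eigvalsh(A)))   # degree-2 SoS matches lambda_max here
```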

12 SoS suggests spectral algorithms. If we want to bound f(x): associate some matrix with f, and then rearrange entries along monomial symmetries; apply degree-d SoS polynomial inequalities (Cauchy-Schwarz, Jensen's inequality for squares, ...); use problem-specific constraints (e.g. x_i^2 = 1).

13 SoS captures spectral algorithms. Definition: a symmetric matrix F is a matrix representation of f(x) if f(x) = ⟨F, x^{⊗2d}⟩; define λ_max(f) = min{ λ_max(F) : F symmetric, f(x) = ⟨F, x^{⊗2d}⟩ }. Theorem: Ẽ[f(x)] ≤ λ_max(f) (for pseudoexpectations with Ẽ[‖x‖^{2d}] = 1, e.g. under the sphere constraint).

14 SoS captures spectral algorithms. Theorem: Ẽ[f(x)] ≤ λ_max(f). Proof: Ẽ[f(x)] = Ẽ[⟨F, x^{⊗2d}⟩] for a symmetric matrix representation F of f(x). Let λ = λ_max(F); then λ·Id − F ⪰ 0, so λ·Id − F = Σ_i σ_i u_i u_i^T with σ_i ≥ 0, and Ẽ[⟨λ·Id − F, x^{⊗2d}⟩] = Σ_i σ_i Ẽ[⟨u_i, x^{⊗d}⟩^2] ≥ 0 — a sum of degree-d squares, valid if 2d ≤ D.

15 SoS captures spectral algorithms. Theorem: Ẽ[f(x)] ≤ λ_max(f). Proof (continued): Ẽ[f(x)] = Ẽ[⟨F, x^{⊗2d}⟩] ≤ Ẽ[⟨F, x^{⊗2d}⟩] + Ẽ[⟨λ·Id − F, x^{⊗2d}⟩] = Ẽ[⟨λ·Id, x^{⊗2d}⟩] = λ·Ẽ[‖x‖^{2d}], by linearity and since ⟨Id, x^{⊗2d}⟩ = ‖x^{⊗d}‖^2 = ‖x‖^{2d} (squares on the diagonal).
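A quick numerical sanity check of this bound for d = 2 (illustrative code; F is just a random symmetric matrix representation):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8
F = rng.standard_normal((n * n, n * n))
F = (F + F.T) / 2                       # symmetric matrix representation of f(x) = <F, x^{⊗4}>
lam = np.max(np.linalg.eigvalsh(F))

for _ in range(1000):
    x = rng.standard_normal(n)
    x /= np.linalg.norm(x)              # unit vector
    x2 = np.kron(x, x)
    f = x2 @ F @ x2                     # f(x) = (x⊗x)^T F (x⊗x)
    assert f <= lam + 1e-9              # never exceeds the top eigenvalue
```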

16 What kind of spectral algorithms? Choose the best matrix representation F by: rearranging entries along symmetries of x^{⊗d}; applying degree-d SoS polynomial inequalities (Cauchy-Schwarz, Jensen's inequality for squares, ...); using problem-specific constraints (e.g. x_i^2 = 1).

18 SoS suggests several spectral algorithms. The bound Ẽ[f(x)] ≤ λ_max(F)·Ẽ[‖x‖^{2d}] holds for any matrix representation F of f(x), and the choice of F may affect λ_max(F)!

19 SoS suggests several spectral algorithms. Claim: there exist f(x) with representations F_1, F_2 such that f(x) = ⟨F_1, x^{⊗2d}⟩ = ⟨F_2, x^{⊗2d}⟩ but λ_max(F_1) ≠ λ_max(F_2). Example: for a unit vector x, f(x) = E_{g∼N(0,Id)}[⟨x, g⟩^4] = 3 (since ⟨x, g⟩ ∼ N(0,1)).

20 SoS suggests several spectral algorithms. Claim (as above). For a unit vector x, f(x) = E_{g∼N(0,Id)}[⟨x, g⟩^4] = ⟨E[g^{⊗4}], x^{⊗4}⟩. The entries of E[g^{⊗4}]: 3 when i = j = k = l; 1 when i, j, k, l form two distinct pairs; 0 when any index has odd multiplicity.

21 SoS suggests several spectral algorithms. Claim (as above). One representation of f(x) = ⟨E[g^{⊗4}], x^{⊗4}⟩ is the n²×n² matrix F_1 with entries (F_1)_{ij,kl} = E[g_i g_j g_k g_l]: it has 1's at the (ii, jj), (ij, ij), and (ij, ji) positions (and 3's at (ii, ii)), and the all-ones block on the (ii, jj) entries gives it a top eigenvalue of roughly n.

22 Rearranging entries along symmetries of x^{⊗2}. Claim (as above). f(x) = E_{g∼N(0,Id)}[⟨x, g⟩^4] = ⟨E[g^{⊗4}], x^{⊗4}⟩. Since x_i^2 x_j^2 = (x_i x_j)(x_i x_j), we may add (x_i^2 x_j^2 − x_i^2 x_j^2) = 0 to f: put −1 at entry (ii, jj) and +1 at entry (ij, ji), moving weight off the (ii, jj) block without changing the polynomial.

26 Rearranging entries along symmetries of x^{⊗2}. After moving all of the off-diagonal weight this way, the representation becomes F_2 = 3·Id, whose eigenvalues are all 3!

27 Rearranging entries along symmetries of x^{⊗2}. Conclusion: f(x) = ⟨E[g^{⊗4}], x^{⊗4}⟩ = ⟨3·Id, x^{⊗4}⟩ = 3‖x‖^4 = 3 for unit x, so the representation 3·Id certifies the true value 3, while the naïve representation only certifies ≈ n.
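A short numpy check of this example (illustrative code): both matrices represent the same degree-4 polynomial, but their top eigenvalues differ by roughly a factor of n/3.

```python
import numpy as np

n = 20
# Naive representation: F1[(i,j),(k,l)] = E[g_i g_j g_k g_l] for g ~ N(0, Id)
I = np.eye(n)
F1 = (np.einsum('ij,kl->ijkl', I, I)         # delta_ij delta_kl
      + np.einsum('ik,jl->ijkl', I, I)       # delta_ik delta_jl
      + np.einsum('il,jk->ijkl', I, I)       # delta_il delta_jk
      ).reshape(n * n, n * n)
F2 = 3 * np.eye(n * n)                        # rearranged representation

x = np.random.default_rng(2).standard_normal(n)
x /= np.linalg.norm(x)
x2 = np.kron(x, x)
print(x2 @ F1 @ x2, x2 @ F2 @ x2)             # same polynomial value: both equal 3
print(np.max(np.linalg.eigvalsh(F1)),         # ~ n + 2
      np.max(np.linalg.eigvalsh(F2)))         # = 3
```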

28 What kind of spectral algorithms? Choose the best matrix representation by: rearranging entries along symmetries of x^{⊗d}; applying degree-d SoS polynomial inequalities (Cauchy-Schwarz, Jensen's inequality for squares, ...).

29 Tensor Norm Refutation (random case, noise only). Claim: max_{x ∈ S^{n-1}} ⟨G, x^{⊗3}⟩ ≤ O(√n) with high probability over G. But the simple spectral algorithm can only certify O(n). Proof: ⟨G, x^{⊗3}⟩ = x^T G_{n×n²} (x ⊗ x) ≤ σ_max(G_{n×n²}), where G_{n×n²} is the flattening of G into an n × n² matrix, and by Gordon's theorem σ_max(G_{n×n²}) ≈ √(n²) + √n ≈ n. (Rearranging entries doesn't help here: the representations all look essentially the same because G has i.i.d. entries.)
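A quick numerical illustration (illustrative code): the spectral norm of the n × n² flattening concentrates around n, far above the true maximum of order √n.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100
G = rng.standard_normal((n, n, n))

flat = G.reshape(n, n * n)                 # naive flattening of the 3-tensor
cert = np.linalg.norm(flat, 2)             # sigma_max: the certified upper bound
print(cert, n, np.sqrt(n))                 # cert ~ n, true max is only ~ sqrt(n)
```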

30 SoS Cauchy-Schwarz. Claim: 2·Ẽ[p(x)q(x)] ≤ Ẽ[p(x)^2] + Ẽ[q(x)^2] whenever deg(p^2), deg(q^2) ≤ D (and hence Ẽ[pq] ≤ Ẽ[p^2]^{1/2}·Ẽ[q^2]^{1/2}). Proof: Ẽ[(p(x) − q(x))^2] ≥ 0, since (p − q)^2 is a degree-≤D square; expand by linearity.

31 Cauchy-Schwarz for Tensor PCA Refutation. Theorem: for a unit vector x and noise G with i.i.d. N(0,1) entries, Ẽ[⟨G, x^{⊗3}⟩] ≤ Õ(n^{3/4}). Proof: write ⟨G, x^{⊗3}⟩ = Σ_i x_i·⟨G_i, x^{⊗2}⟩, where G_i (i ∈ [n]) is the i-th n×n slice of G; by Cauchy-Schwarz this is ≤ (Σ_i x_i^2)^{1/2}·(Σ_i ⟨G_i, x^{⊗2}⟩^2)^{1/2}.

32 Cauchy-Schwarz for Tensor PCA Refutation. Proof (continued): Σ_i ⟨G_i, x^{⊗2}⟩^2 = ⟨Σ_i G_i ⊗ G_i, x^{⊗4}⟩, and the n²×n² matrix Σ_i G_i ⊗ G_i has eigenvalues Õ(n^{3/2}) w.h.p., so Ẽ[⟨G, x^{⊗3}⟩] ≤ Õ(n^{3/2})^{1/2} = Õ(n^{3/4}). The SoS analysis is itself a spectral algorithm for refutation!
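A numerical illustration of this certificate (illustrative code, small n): the top eigenvalue of Σ_i G_i ⊗ G_i scales like n^{3/2}, so its square root certifies roughly n^{3/4}, compared to roughly n for the naïve flattening.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 40
G = rng.standard_normal((n, n, n))

# M = sum_i G_i ⊗ G_i, an n^2 x n^2 matrix (G_i = i-th slice of G)
M = sum(np.kron(G[i], G[i]) for i in range(n))
top = np.max(np.linalg.eigvalsh((M + M.T) / 2))

print(top ** 0.5, n ** 0.75)                        # certified bound ~ n^{3/4}
print(np.linalg.norm(G.reshape(n, n * n), 2), n)    # naive certificate ~ n, for comparison
```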

33 What kind of spectral algorithms? Choose the best matrix representation by: rearranging entries along symmetries of x^{⊗d}; applying degree-d SoS polynomial inequalities (Cauchy-Schwarz, Jensen's inequality for squares, ...).

34 Better Approx (in more time). Theorem: for unit x and noise G with i.i.d. N(0,1) entries, Ẽ[⟨G, x^{⊗3}⟩] ≤ Õ(n^{3/4}). Theorem: at SoS degree D (in time n^{O(D)}), Ẽ[⟨G, x^{⊗3}⟩] ≤ Õ(n^{3/4}/D^{1/4}). But actually, max_{x ∈ S^{n-1}} ⟨G, x^{⊗3}⟩ ≤ O(√n); information-theoretically one can certify O(√n) in time 2^n (epsilon net).

35 Better Approx (in more time). [Plot for Tensor PCA refutation: approximation factor (from 1 up to n^{1/4}) versus log D / log n (from 0 to 1, i.e. D = n^ε); the SoS and spectral algorithms improve from factor n^{1/4} at constant degree (poly(n) time) down to 1 at D ≈ n, matching the SoS lower bounds of [Hopkins-Kothari-Potechin-Raghavendra-S-Steurer 17].]

36 Jensen's Inequality. For d a power of 2 and D ≥ d·deg(f): Ẽ[f(x)]^d ≤ Ẽ[f(x)^d]. Proof by induction on d: Ẽ[f]^{2d} = (Ẽ[f]^d)^2 ≤ (Ẽ[f^d])^2 ≤ Ẽ[f^{2d}], applying the inductive hypothesis and then Ẽ[p]^2 ≤ Ẽ[p^2] (i.e. Ẽ[(p − Ẽ[p])^2] ≥ 0).

37 Jensen's Inequality. For d a power of 2 and D ≥ d·deg(f): Ẽ[f(x)]^d ≤ Ẽ[f(x)^d]. We can take advantage of the increased symmetry in higher-degree polynomials (more matrix representations).

38 Better Approximation. Theorem: at SoS degree D (in time n^{O(D)}), for unit x and noise G with i.i.d. N(0,1) entries, Ẽ[⟨G, x^{⊗3}⟩] ≤ Õ(n^{3/4}/D^{1/4}). Proof: apply Jensen's inequality with d some power of 2: Ẽ[⟨G, x^{⊗3}⟩] ≤ (Ẽ[⟨G, x^{⊗3}⟩^{2d}])^{1/2d}.

40 Symmetrize to improve the eigenvalue. The relevant matrix is n^{2d} × n^{2d}, with rows and columns indexed by ordered multisets S of 2d variables; its entries are degree-2d polynomials in the entries G_ijk ∼ N(0,1). Since Π_{i∈S} x_i = Π_{j∈π(S)} x_j for any permutation π, taking the average of row S and row π(S) fixes the polynomial while changing the matrix.

41 Symmetrizing to improve the eigenvalue. Taking the average of row S and row π(S) fixes the polynomial. For d = 2 the symmetrized n^{2d} × n^{2d} matrix has entries M^{(2)}_{abcd,ijkl} = (1/4!)·[ (Σ_i G_i ⊗ G_i)_{ab,ij}·(Σ_i G_i ⊗ G_i)_{cd,kl} + (Σ_i G_i ⊗ G_i)_{ac,ij}·(Σ_i G_i ⊗ G_i)_{bd,kl} + ... ], averaging over the orderings of {a, b, c, d}; the entries are degree-2d polynomials in G_ijk ∼ N(0,1).

42 Heuristic spectral norm calculation. Before averaging, ‖(Σ_i G_i ⊗ G_i)^{⊗d}‖ = ‖Σ_i G_i ⊗ G_i‖^d ≈ (n^{3/2})^d. After averaging over row/column permutations π, each entry is an average of ~d! (roughly i.i.d., randomly signed) variables of magnitude m, so a typical entry shrinks to about m/√(d!), and heuristically the spectral norm shrinks accordingly.

43 Improving Tensor PCA. Theorem: for unit x and noise G with i.i.d. N(0,1) entries, at SoS degree D ≥ 4d, Ẽ[⟨G, x^{⊗3}⟩] ≤ Õ(n^{3/4}/d^{1/4}). Proof sketch: apply Jensen's inequality with d some power of 2 (valid if D ≥ 4d); average over the symmetries of x^{⊗2d} to reduce the eigenvalues of the matrix representation: Ẽ[⟨G, x^{⊗3}⟩] ≤ (Ẽ[⟨M^{(d)}, x^{⊗4d}⟩])^{1/2d} ≤ ‖M^{(d)}‖^{1/2d} w.h.p.

44 What kind of spectral algorithms? Choose the best matrix representation by: rearranging entries along symmetries of x^{⊗d}; applying degree-d SoS polynomial inequalities (Cauchy-Schwarz, Jensen's inequality for squares, ...).

45 Other SoS (via Spectral) Algorithms. Tensor Decomposition: symmetry, Cauchy-Schwarz, constant-d Jensen's [Barak-Kelner-Steurer 14, Ge-Ma 15, Ma-Shi-Steurer 16]. Dictionary Learning: symmetry + tensor decomposition. Planted Sparse Vector: symmetry [Barak-Brandão-Harrow-Kelner-Steurer-Zhou 12, Barak-Kelner-Steurer 14]. Tensor Completion: symmetry, Cauchy-Schwarz [Barak-Moitra 16, Potechin-Steurer 17]. Refuting Random CSPs: symmetry, Cauchy-Schwarz, Jensen's, (x_i^2 = 1) constraints [Allen-O'Donnell-Witmer 15, Raghavendra-Rao-S 17]. Polynomial Maximization over S^{n-1}: symmetry, Cauchy-Schwarz, Jensen's, worst case [Bhattiprolu-Ghosh-Guruswami-E.Lee-Tulsiani 16].

46 Fast spectral algorithms from SoS Analyses [Hopkins-S-Shi-Steurer 16]

47 SoS Gives a Spectral Search Algorithm. T = G + λ·v^{⊗3}. Consider the n²×n² matrix Σ_i T_i ⊗ T_i, where T_i is the i-th n×n slice of T. Since T_i = G_i + λ·v_i·vv^T and Σ_i v_i^2 = 1: Σ_i T_i ⊗ T_i = Σ_i G_i ⊗ G_i + cross-terms + λ^2·Σ_i v_i^2·(vv^T) ⊗ (vv^T) = Σ_i G_i ⊗ G_i + cross-terms + λ^2·(v ⊗ v)(v ⊗ v)^T. The noise term has eigenvalues ≲ n^{3/2}; the signal term has eigenvalue λ^2.
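A minimal numpy sketch of this search algorithm (illustrative code; the choice λ ≈ 8·n^{3/4} is mine, picked so the signal dominates the noise):

```python
import numpy as np

rng = np.random.default_rng(5)
n, lam = 40, 8 * 40 ** 0.75
v = rng.standard_normal(n); v /= np.linalg.norm(v)
T = lam * np.einsum('i,j,k->ijk', v, v, v) + rng.standard_normal((n, n, n))

# SoS-inspired spectral algorithm: top eigenvector of sum_i T_i ⊗ T_i is ~ v ⊗ v
M = sum(np.kron(T[i], T[i]) for i in range(n))
M = (M + M.T) / 2
_, vecs = np.linalg.eigh(M)
top = vecs[:, -1].reshape(n, n)                    # ≈ ± v v^T
w, wvecs = np.linalg.eigh((top + top.T) / 2)
v_hat = wvecs[:, np.argmax(np.abs(w))]             # handles the sign ambiguity
print(abs(v_hat @ v))                              # close to 1: v recovered
```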

48 Running time. T = G + λ·v^{⊗3}. Building Σ_i T_i ⊗ T_i means summing n matrices of size n² × n², so n^5 time; computing its top eigenvalue/eigenvector takes n^4·log n time. Total ≈ n^5 + n^4·log n: not a practical spectral algorithm? Theorem: we can compress to get an O(n^3)-time algorithm.

49 Compressing the matrix. Theorem: there is an O(n^3)-time algorithm. Recall Σ_i T_i ⊗ T_i = Σ_i G_i ⊗ G_i + cross-terms + λ^2·(vv^T) ⊗ (vv^T), with noise eigenvalues ≲ n^{3/2} and signal eigenvalue λ^2. How do we reduce the dimension but preserve the signal-to-noise ratio? Partial trace: for n×n matrices A, B, Tr_par(A ⊗ B) = Tr(A)·B. Then Tr_par(λ^2·(vv^T) ⊗ (vv^T)) = λ^2·‖v‖^2·vv^T = λ^2·vv^T, and Tr_par(Σ_i G_i ⊗ G_i) = Σ_i Tr(G_i)·G_i, which still has eigenvalues ≲ n^{3/2} (since Tr(G_i) ≈ ±n^{1/2}): the signal-to-noise ratio is preserved!

50 Compressing the matrix. Theorem: there is an O(n^3)-time algorithm. Apply the partial trace directly: Tr_par(Σ_i T_i ⊗ T_i) = Σ_i Tr(T_i)·T_i. Runtime: computing all the Tr(T_i) takes n^2 time; each of the n^2 entries of Σ_i Tr(T_i)·T_i is a sum of n numbers, so n^3 time (linear in the input!); computing the top eigenvector/eigenvalue of an n×n matrix takes n^2·log n time.
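A minimal numpy sketch of the compressed algorithm (illustrative code, same caveats on λ as above): the partial-traced matrix is only n × n, and building it is linear in the input size.

```python
import numpy as np

rng = np.random.default_rng(6)
n, lam = 200, 8 * 200 ** 0.75
v = rng.standard_normal(n); v /= np.linalg.norm(v)
T = lam * np.einsum('i,j,k->ijk', v, v, v) + rng.standard_normal((n, n, n))

# Partial-trace compression: sum_i Tr(T_i) * T_i is only n x n
traces = np.einsum('ijj->i', T)            # Tr(T_i) for every slice, n^2 time
M = np.einsum('i,ijk->jk', traces, T)      # sum_i Tr(T_i) * T_i, n^3 time (linear in input)
M = (M + M.T) / 2

w, vecs = np.linalg.eigh(M)
v_hat = vecs[:, np.argmax(np.abs(w))]      # eigenvector of largest |eigenvalue| ≈ ± v
print(abs(v_hat @ v))                      # close to 1: v recovered in ~n^3 time
```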

51 Fast Spectral Algorithms via SoS. Secret sauce: apply the partial trace to the SoS matrix (in a way that enables fast power iteration). Tensor PCA [Hopkins-S-Shi-Steurer 16]. Tensor decomposition [Hopkins-S-Shi-Steurer 16, S-Steurer 17]. Planted Sparse Vector [Hopkins-S-Shi-Steurer 16]. Tensor Completion [Montanari-Sun 17].

52 The SoS perspective gives new spectral algorithms; spectral techniques let us analyze SoS. Worst-case problems? [Diagram: Spectral Algorithms ↔ Sum-of-Squares Algorithms → Average-Case & Structured Instances.]
