U.C. Berkeley CS294: Beyond Worst-Case Analysis Handout 4 Luca Trevisan September 5, 2017

Summary of Lecture 4

In which we introduce semidefinite programming and apply it to Max Cut.

1 Semidefinite Programming

Recall that a matrix $M \in \mathbb{R}^{n \times n}$ is positive semidefinite (abbreviated PSD and written $M \succeq 0$) if it is symmetric and all its eigenvalues are non-negative. We will use without proof the following facts from linear algebra:

1. If $M \in \mathbb{R}^{n \times n}$ is a symmetric matrix, then all the eigenvalues of $M$ are real and, if we call $\lambda_1 \leq \lambda_2 \leq \cdots \leq \lambda_n$ the eigenvalues of $M$ with repetition, we have
$$ M = \sum_i \lambda_i \, v^{(i)} (v^{(i)})^T $$
where the $v^{(i)}$ are orthonormal eigenvectors of the $\lambda_i$.

2. The smallest eigenvalue of $M$ has the characterization
$$ \lambda_1 = \min_{y \neq 0} \frac{y^T M y}{\|y\|^2} $$
and the optimization problem on the right-hand side is solvable up to arbitrarily good accuracy.

From part (2) above we have that $M$ is PSD if and only if for every vector $y$ we have $y^T M y \geq 0$. We will also use the following alternative characterization of PSD matrices.

Lemma 1 A matrix $M \in \mathbb{R}^{n \times n}$ is PSD if and only if there is a collection of vectors $x^{(1)}, \ldots, x^{(n)}$ such that, for every $i,j$, we have $M_{i,j} = \langle x^{(i)}, x^{(j)} \rangle$.

Proof: Suppose that $M$ and $x^{(1)}, \ldots, x^{(n)}$ are such that $M_{i,j} = \langle x^{(i)}, x^{(j)} \rangle$ for all $i$ and $j$. Then $M$ is PSD because for every vector $y$ we have

$$ y^T M y = \sum_{i,j} y_i y_j M_{i,j} = \sum_{i,j} y_i y_j \langle x^{(i)}, x^{(j)} \rangle = \left\| \sum_i y_i x^{(i)} \right\|^2 \geq 0 . $$

Conversely, if $M$ is PSD and we write it as
$$ M = \sum_k \lambda_k \, v^{(k)} (v^{(k)})^T $$
we have
$$ M_{i,j} = \sum_k \lambda_k \, v^{(k)}_i v^{(k)}_j $$
and we see that we can define $n$ vectors $x^{(1)}, \ldots, x^{(n)}$ by setting
$$ x^{(i)}_k := \sqrt{\lambda_k} \cdot v^{(k)}_i $$
and we do have the property that
$$ M_{i,j} = \langle x^{(i)}, x^{(j)} \rangle . $$

With these characterizations in mind, we define a semidefinite program as an optimization program in which we have $n^2$ real variables $X_{i,j}$, with $1 \leq i,j \leq n$, and we want to maximize, or minimize, a linear function of the variables such that linear constraints over the variables are satisfied (so far this is the same as a linear program), subject to the additional constraint that the matrix $X$ is PSD. Thus, a typical semidefinite program (SDP) looks like

maximize $\sum_{i,j} C_{i,j} X_{i,j}$
subject to
  $\sum_{i,j} A^{(1)}_{i,j} X_{i,j} \leq b_1$
  $\cdots$
  $\sum_{i,j} A^{(m)}_{i,j} X_{i,j} \leq b_m$
  $X \succeq 0$

where the matrices $C, A^{(1)}, \ldots, A^{(m)}$ and the scalars $b_1, \ldots, b_m$ are given, and the entries of $X$ are the variables over which we are optimizing.

If $A$ and $B$ are two matrices such that $A \succeq 0$ and $B \succeq 0$, and if $a \geq 0$ is a scalar, then it is easy to see that $a \cdot A \succeq 0$ and $A + B \succeq 0$, by using the characterization that $M \succeq 0$ iff $y^T M y \geq 0$ for every $y$. This means that the set of PSD matrices is a convex subset of $\mathbb{R}^{n \times n}$, and that the above optimization problem is a convex problem.
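To make Lemma 1 concrete, here is a minimal numpy sketch (an illustration, not part of the original notes; the function name gram_vectors is mine) that recovers vectors $x^{(1)}, \ldots, x^{(n)}$ from a PSD matrix via its eigendecomposition, following the formula $x^{(i)}_k = \sqrt{\lambda_k} \cdot v^{(k)}_i$, and checks that their inner products reproduce the entries of $M$.

```python
import numpy as np

def gram_vectors(M):
    """Given a symmetric PSD matrix M, return a matrix X whose rows x^(i)
    satisfy <x^(i), x^(j)> = M[i, j], using x^(i)_k = sqrt(lambda_k) * v^(k)_i."""
    lam, V = np.linalg.eigh(M)       # columns of V are orthonormal eigenvectors
    lam = np.clip(lam, 0.0, None)    # clip tiny negative eigenvalues due to round-off
    X = V * np.sqrt(lam)             # scale column k by sqrt(lambda_k); row i of X is x^(i)
    assert np.allclose(X @ X.T, M, atol=1e-6)
    return X

# Sanity check on a random PSD matrix M = B B^T.
B = np.random.randn(4, 4)
M = B @ B.T
X = gram_vectors(M)
print(np.max(np.abs(X @ X.T - M)))   # should be ~0
```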

Using the ellipsoid algorithm, one can solve in polynomial time (up to arbitrarily good accuracy) any optimization problem in which one wants to optimize a linear function over a convex feasible region, provided that one has a separation oracle for the feasible region, that is, an algorithm that, given a point, checks whether it is feasible and, if not, constructs an inequality that is satisfied by all feasible points but not satisfied by the given point.

In order to construct a separation oracle for an SDP, it is enough to solve the following problem: given a matrix $M$, decide whether it is PSD and, if not, construct an inequality that is satisfied by the entries of all PSD matrices but that is not satisfied by $M$. In order to do so, recall that the smallest eigenvalue of $M$ is
$$ \min_{y \neq 0} \frac{y^T M y}{\|y\|^2} $$
and that the above minimization problem is solvable in polynomial time (up to arbitrarily good accuracy). If the above optimization problem has a non-negative optimum, then $M$ is PSD. If it has a negative optimum, achieved by a vector $y^*$, then the matrix is not PSD, and the inequality
$$ \sum_{i,j} X_{i,j} \, y^*_i y^*_j \geq 0 $$
is satisfied by all PSD matrices $X$ but fails for $X := M$. Thus we have a separation oracle and we can solve SDPs in polynomial time up to arbitrarily good accuracy.

In light of our characterization of PSD matrices, SDPs have the following equivalent formulation:

maximize $\sum_{i,j} C_{i,j} \langle x^{(i)}, x^{(j)} \rangle$
subject to
  $\sum_{i,j} A^{(1)}_{i,j} \langle x^{(i)}, x^{(j)} \rangle \leq b_1$
  $\cdots$
  $\sum_{i,j} A^{(m)}_{i,j} \langle x^{(i)}, x^{(j)} \rangle \leq b_m$

where our variables are vectors $x^{(1)}, \ldots, x^{(n)}$.

2 SDP Relaxation of Max Cut and Random Hyperplane Rounding

The Max Cut problem in a given graph $G = (V,E)$ has the following equivalent characterization as a quadratic optimization problem over real variables $x_1, \ldots, x_n$, where $V = \{1, \ldots, n\}$:
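The separation oracle described above is easy to sketch in numpy (again an illustration, not from the notes; the function name psd_separation_oracle is mine): compute the smallest eigenvalue of the candidate matrix and, if it is negative, return the corresponding eigenvector $y^*$, which witnesses the violated inequality $\sum_{i,j} X_{i,j} y^*_i y^*_j \geq 0$.

```python
import numpy as np

def psd_separation_oracle(M, tol=1e-9):
    """Return (True, None) if M is (numerically) PSD; otherwise return (False, y)
    where y satisfies y^T M y < 0, even though y^T X y >= 0 for every PSD matrix X."""
    lam, V = np.linalg.eigh(M)   # eigenvalues in ascending order
    if lam[0] >= -tol:
        return True, None
    return False, V[:, 0]        # eigenvector of the most negative eigenvalue

# Example: a symmetric matrix that is not PSD (eigenvalues -1 and 3).
M = np.array([[1.0, 2.0], [2.0, 1.0]])
feasible, y = psd_separation_oracle(M)
print(feasible, None if y is None else y @ M @ y)   # False, a negative value
```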

$$ \mathrm{MaxCut}(G) = \max \sum_{(i,j) \in E} \frac{1}{4} (x_i - x_j)^2 \quad \text{subject to } x_i^2 = 1 \ \ \forall i \in V $$

Any quadratic optimization problem has a natural relaxation to an SDP, in which we relax real variables to take vector values and we change multiplication to inner product:

$$ \mathrm{SDPMaxCut}(G) = \max \sum_{(i,j) \in E} \frac{1}{4} \| x^{(i)} - x^{(j)} \|^2 \quad \text{subject to } \| x^{(i)} \|^2 = 1 \ \ \forall i \in V $$

Solving the above SDP, which is doable in polynomial time up to arbitrarily good accuracy, gives us a unit vector $x^{(i)}$ for each vertex $i$. A simple way to convert this collection into a cut $(S, V-S)$ is to take a random hyperplane through the origin, and then define $S$ to be the set of vertices $i$ such that $x^{(i)}$ is above the hyperplane. Equivalently, we pick a random vector $g$ according to a rotation-invariant distribution, for example a Gaussian distribution, and let $S$ be the set of vertices $i$ such that $\langle g, x^{(i)} \rangle \geq 0$.

Let $(i,j)$ be an edge: one sees that if $\theta_{ij}$ is the angle between $x^{(i)}$ and $x^{(j)}$, then
$$ \mathbb{P}[\, (i,j) \text{ is cut} \,] = \frac{\theta_{ij}}{\pi} $$
and the contribution of $(i,j)$ to the cost function is
$$ \frac{1}{4} \| x^{(i)} - x^{(j)} \|^2 = \frac{1}{2} - \frac{1}{2} \langle x^{(i)}, x^{(j)} \rangle = \frac{1}{2} - \frac{1}{2} \cos \theta_{ij} . $$
Some calculus shows that for every $0 \leq \theta \leq \pi$ we have
$$ \frac{\theta}{\pi} \geq .878 \cdot \left( \frac{1}{2} - \frac{1}{2} \cos \theta \right) $$
and so
$$ \mathbb{E}[\, \text{number of edges cut by } (S, V-S) \,] \geq .878 \cdot \sum_{(i,j) \in E} \frac{1}{4} \| x^{(i)} - x^{(j)} \|^2 = .878 \cdot \mathrm{SDPMaxCut}(G) \geq .878 \cdot \mathrm{MaxCut}(G) , $$
so we have a polynomial-time approximation algorithm with worst-case approximation guarantee $.878$.

Next time, we will see how the SDP relaxation behaves on random graphs, but first let us see how it behaves on a large class of graphs.
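The random hyperplane rounding step is short enough to sketch in full (a minimal illustration, not code from the notes; it assumes the SDP has already been solved and its solution is given as the rows of a matrix, and the function name and the toy triangle instance are mine).

```python
import numpy as np

def random_hyperplane_rounding(X, edges, rng=None):
    """Round an SDP solution for Max Cut.
    X: (n, d) array whose row i is the unit vector x^(i).
    edges: list of pairs (i, j).
    Returns the side assignment and the number of edges cut."""
    rng = np.random.default_rng() if rng is None else rng
    g = rng.standard_normal(X.shape[1])   # random Gaussian direction g
    side = (X @ g >= 0)                   # S = { i : <g, x^(i)> >= 0 }
    cut = sum(1 for (i, j) in edges if side[i] != side[j])
    return side, cut

# Toy example: a triangle with the three SDP vectors at mutual angle 120 degrees
# (each edge is cut with probability 2/3, so the expected number of cut edges is 2).
theta = np.array([0.0, 2*np.pi/3, 4*np.pi/3])
X = np.stack([np.cos(theta), np.sin(theta)], axis=1)
edges = [(0, 1), (1, 2), (2, 0)]
print(random_hyperplane_rounding(X, edges, np.random.default_rng(0)))
```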

3 Max Cut in Bounded-Degree Triangle-Free Graphs

Theorem 2 If $G = (V,E)$ is a triangle-free graph in which every vertex has degree at most $d$, then
$$ \mathrm{MaxCut}(G) \geq \left( \frac{1}{2} + \Omega\left( \frac{1}{\sqrt{d}} \right) \right) \cdot |E| $$

Proof: Consider the following feasible solution for the SDP: we associate to each node $i$ an $n$-dimensional vector $x^{(i)}$ such that $x^{(i)}_i = \frac{1}{\sqrt{2}}$, $x^{(i)}_j = -\frac{1}{\sqrt{2\deg(i)}}$ if $(i,j) \in E$, and $x^{(i)}_j = 0$ otherwise. We immediately see that $\| x^{(i)} \|^2 = 1$ for every $i$, and so the solution is feasible.

Let us transform this SDP solution into a cut $(S, V-S)$ using a random hyperplane. Since $G$ is triangle-free, the endpoints of an edge have no common neighbor, so for every edge $(i,j)$ the supports of $x^{(i)}$ and $x^{(j)}$ overlap only in the coordinates $i$ and $j$, and we have
$$ \langle x^{(i)}, x^{(j)} \rangle = - \frac{1}{2\sqrt{\deg(i)}} - \frac{1}{2\sqrt{\deg(j)}} \leq - \frac{1}{\sqrt{d}} . $$
The probability that $(i,j)$ is cut by $(S, V-S)$ is
$$ \frac{1}{\pi} \arccos\left( \langle x^{(i)}, x^{(j)} \rangle \right) \geq \frac{1}{\pi} \arccos\left( - \frac{1}{\sqrt{d}} \right) $$
and
$$ \frac{1}{\pi} \arccos\left( - \frac{1}{\sqrt{d}} \right) = \frac{1}{2} + \frac{1}{\pi} \arcsin\left( \frac{1}{\sqrt{d}} \right) \geq \frac{1}{2} + \Omega\left( \frac{1}{\sqrt{d}} \right) , $$
so that the expected number of cut edges is at least $\left( \frac{1}{2} + \Omega\left( \frac{1}{\sqrt{d}} \right) \right) \cdot |E|$.
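To see the construction from the proof of Theorem 2 in action, here is a small numpy sketch (my own illustration, not from the notes; the function name triangle_free_sdp_solution is mine) that builds the vectors $x^{(i)}$ for a triangle-free graph, checks feasibility, and verifies that every edge has inner product at most $-1/\sqrt{d}$.

```python
import numpy as np

def triangle_free_sdp_solution(n, edges):
    """Build the explicit SDP solution from the proof of Theorem 2:
    x^(i)_i = 1/sqrt(2), x^(i)_j = -1/sqrt(2*deg(i)) for neighbors j, else 0."""
    deg = np.zeros(n, dtype=int)
    adj = [[] for _ in range(n)]
    for i, j in edges:
        adj[i].append(j)
        adj[j].append(i)
        deg[i] += 1
        deg[j] += 1
    X = np.zeros((n, n))
    for i in range(n):
        X[i, i] = 1.0 / np.sqrt(2.0)
        for j in adj[i]:
            X[i, j] = -1.0 / np.sqrt(2.0 * deg[i])
    return X, deg

# Example: the 5-cycle, a triangle-free graph with maximum degree d = 2.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
X, deg = triangle_free_sdp_solution(5, edges)
d = deg.max()
print(np.allclose((X**2).sum(axis=1), 1.0))                           # unit norms: feasible
print(all(X[i] @ X[j] <= -1.0/np.sqrt(d) + 1e-9 for i, j in edges))   # inner products <= -1/sqrt(d)
```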