Lecture 7: Solving LPs/SDPs using Multiplicative Weights

In the last lecture we saw the Multiplicative Weights (MW) algorithm and how it could be used to effectively solve the experts problem, in which we have many experts and wish to make predictions that are approximately as good as the predictions made by the best expert. In this lecture we will see how to apply the MW algorithm to efficiently approximate the optimal solution to LPs and SDPs.

7.1 Multiplicative Weights

Recall the following result from Lecture 6 about the Hedge algorithm:

Theorem 7.1. Suppose the cost vectors are $m^{(t)} \in [-1,1]^N$. Then for any $\epsilon \le 1$ and for any $T \ge 1$, the Hedge algorithm guarantees that for all $i \in [N]$,
\[ \sum_{t=1}^T p^{(t)} \cdot m^{(t)} \;\le\; \sum_{t=1}^T m_i^{(t)} + \epsilon T + \frac{\ln N}{\epsilon}. \]

So the total cost paid by the algorithm is no more than an additive factor of $\epsilon T + \frac{\ln N}{\epsilon}$ worse than the cost incurred by any individual component of the cost vector. Theorem 7.1 implies a similar result for the average cost incurred per round. (One can get a similar result for the MW algorithm, where instead of the update rule $w_i^{(t+1)} \leftarrow w_i^{(t)} \exp(-\epsilon\, m_i^{(t)})$, we use the rule $w_i^{(t+1)} \leftarrow w_i^{(t)} (1 - \epsilon\, m_i^{(t)})$.)

Corollary 7.2. Suppose the cost vectors are $m^{(t)} \in [-\rho, \rho]^N$. Then for any $\epsilon \le 1/2$ and for any $T \ge \frac{4 \rho^2 \ln N}{\epsilon^2}$, the Hedge algorithm guarantees that for all $i \in [N]$,
\[ \frac{1}{T}\sum_{t=1}^T p^{(t)} \cdot m^{(t)} \;\le\; \frac{1}{T}\sum_{t=1}^T m_i^{(t)} + \epsilon. \]

* Lecturer: Anupam Gupta. Scribe: Tim Wilson.
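To make the update rule concrete, here is a minimal sketch of Hedge in Python. The function name and the $T \times N$ array encoding of the cost sequence are my own choices, not from the lecture.

```python
import numpy as np

def hedge(costs, eps):
    """Hedge on a T x N array of cost vectors with entries in [-1, 1].

    Round t: play p^(t) proportional to the current weights, pay
    p^(t) . m^(t), then update w_i <- w_i * exp(-eps * m_i^(t)).
    Returns the algorithm's total cost, sum_t p^(t) . m^(t).
    """
    T, N = costs.shape
    w = np.ones(N)
    total = 0.0
    for m_t in costs:
        p = w / w.sum()
        total += float(p @ m_t)
        w = w * np.exp(-eps * m_t)
    return total
```

Theorem 7.1 then says that `hedge(costs, eps)` is at most $\min_i \sum_t$ `costs[t, i]` $+\ \epsilon T + \ln N / \epsilon$.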
Note: We did not cover this in lecture, but one can show that if the cost vectors are in $[0, \rho]^N$, then using the MW algorithm, the setting $T \ge \frac{4 \rho \ln N}{\epsilon^2}$ suffices to get the same guarantee:

Lemma 7.3. Suppose the cost vectors are $m^{(t)} \in [0, \rho]^N$. Then for any $\epsilon \le 1/2$ and for any $T \ge \frac{4 \rho \ln N}{\epsilon^2}$, the MW algorithm guarantees that for all $i \in [N]$,
\[ \frac{1}{T}\sum_{t=1}^T p^{(t)} \cdot m^{(t)} \;\le\; \frac{1}{T}\sum_{t=1}^T m_i^{(t)} + \epsilon. \]

A proof of this can be found in the Arora, Hazan, and Kale survey [AHK05].

7.2 Solving LPs with Multiplicative Weights

We will use the MW algorithm to help solve LPs with $m$ constraints, of the form
\[ \min\ c \cdot x \quad \text{s.t.} \quad Ax \ge b, \ \ x \ge 0. \]
Supposing that we know the optimal value $c \cdot x = \mathrm{OPT}$ (say, by binary search), we will aim to find an $\epsilon$-approximate solution $\hat{x}$ such that
\[ c \cdot \hat{x} = \mathrm{OPT}, \qquad A \hat{x} \ge b - \epsilon \mathbf{1}, \qquad \hat{x} \ge 0, \]
or output "infeasible" if no solution exists. The runtime for this will be $O\!\big(\frac{\rho^2 \log m}{\epsilon^2}\big)$, where $\rho$ is the width of the LP, which will be defined shortly.

7.2.1 Simplifying the Constraints

Instead of searching for solutions $x \in \mathbb{R}^n$, we will package together the easy constraints into the simple convex region
\[ K = \{x \in \mathbb{R}^n \mid x \ge 0, \ c \cdot x = \mathrm{OPT}\}. \]
Now we wish to solve $Ax \ge b$ such that $x \in K$. Note that this is particularly easy to solve if $Ax \ge b$ is only one constraint, i.e., we are trying to determine whether there is $x \in K$ such that $\alpha \cdot x \ge \beta$ for some $\alpha \in \mathbb{R}^n$, $\beta \in \mathbb{R}$. For example, if $c > 0$ entrywise: whenever $\mathrm{OPT} \cdot \max_i \frac{\alpha_i}{c_i} \ge \beta$, we can set $x = \frac{\mathrm{OPT}}{c_i}\, e_i$ for the maximizing index $i$, which satisfies our constraints; else we output "Infeasible". For general $c$ we are essentially reduced to solving an LP over two constraints, which, while not as trivial as this, is still simple. We will henceforth assume we have an oracle that, given $\alpha \in \mathbb{R}^n$, $\beta \in \mathbb{R}$, and $K \subseteq \mathbb{R}^n$, either returns $x \in K$ such that $\alpha \cdot x \ge \beta$, or correctly asserts that there is no such $x$.
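A minimal sketch of this single-constraint oracle in Python, for the special case $c > 0$ entrywise discussed above (the function name is mine):

```python
import numpy as np

def single_constraint_oracle(c, opt, alpha, beta):
    """Oracle for K = {x >= 0, c.x = opt} against one constraint alpha.x >= beta,
    assuming c > 0 entrywise.

    Over K, alpha.x is maximized by spending the whole budget on a coordinate
    i maximizing alpha_i / c_i, i.e. x = (opt / c_i) e_i; if even that point
    fails the constraint, no point of K can satisfy it.
    """
    i = int(np.argmax(alpha / c))
    if opt * alpha[i] / c[i] < beta:
        return None                    # no x in K satisfies the constraint
    x = np.zeros(len(c))
    x[i] = opt / c[i]                  # vertex of K achieving the maximum
    return x
```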
7.2.2 Using Multiplicative Weights

We will use this oracle, which allows us to satisfy a single constraint ($\alpha \cdot x \ge \beta$) over $x \in K$, along with the MW algorithm, to get an algorithm satisfying all of the constraints $Ax \ge b$ for $x \in K$. Each of the constraints $a_i^\top x \ge b_i$ will be viewed as an expert, for a total of $m$ experts. Each round we produce a vector $p^{(t)}$ that gives us a convex combination of the constraints, as follows:
\[ \underbrace{\Big(\sum_i p_i^{(t)} a_i\Big)}_{\alpha^{(t)}} \cdot\, x \;\ge\; \underbrace{\sum_i p_i^{(t)} b_i}_{\beta^{(t)}}. \]
Using our oracle, we can determine whether $\alpha^{(t)} \cdot x \ge \beta^{(t)}$ has some solution $x^{(t)} \in K$, or whether no such solution exists. Clearly if no solution exists, then $Ax \ge b$ is infeasible over $K$, so our LP is infeasible. (It is easy to see the contrapositive: if there were a solution to $Ax \ge b$, $x \in K$, then this vector $x$ would also satisfy $\alpha^{(t)} \cdot x \ge \beta^{(t)}$; here we use the fact that $p^{(t)} \ge 0$.) Moreover, the vector $p^{(t)}$ serves as a proof of this infeasibility. Otherwise, we set our cost vector to $m_i^{(t)} = a_i \cdot x^{(t)} - b_i$, update our weights, and proceed with the next round. If we have not determined the LP to be infeasible after $T$ rounds, we terminate and return the solution
\[ \bar{x} = \frac{1}{T} \sum_{t=1}^T x^{(t)}. \]

Why do we set our cost vectors this way? It almost seems like we should incur no cost when $a_i \cdot x^{(t)} - b_i \ge 0$ (i.e., when we satisfy this constraint), whereas here we incur a higher cost the more we satisfy it. Well, the idea is that whenever $a_i \cdot x^{(t)} - b_i$ is positive, we have oversatisfied the constraint. Giving a positive cost to this constraint causes us to reduce the weight of this constraint in the next round. This works analogously to the experts problem, where an expert who is wrong (has high cost) is given less credence (less weight) in future rounds. Similarly, for any constraint for which $a_i \cdot x^{(t)} - b_i$ is negative, we have failed the constraint. Giving a negative cost to this constraint causes us to increase the weight of this constraint in the next round. Initially we set all of our weights equal (to 1), to express our ignorance: all constraints are equally hard.
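Putting the pieces together, the round structure just described can be sketched as follows. This is a sketch with hypothetical names: `oracle` is assumed to solve the single aggregated constraint over $K$, and the Hedge update from Section 7.1 is used for the weights.

```python
import numpy as np

def mw_lp_feasibility(A, b, oracle, eps, T):
    """MW-based epsilon-feasibility for {Ax >= b, x in K} (Section 7.2.2 sketch).

    `oracle(alpha, beta)` must return some x in K with alpha @ x >= beta,
    or None if no such x exists.  Returns the average iterate x_bar
    (so that A @ x_bar >= b - eps once T is large enough, per the analysis),
    or None if infeasibility over K was detected.
    """
    m, _ = A.shape
    w = np.ones(m)                     # one weight per constraint ("expert")
    iterates = []
    for _ in range(T):
        p = w / w.sum()
        x = oracle(p @ A, p @ b)       # one aggregated constraint alpha.x >= beta
        if x is None:
            return None                # p is a certificate of infeasibility
        iterates.append(x)
        cost = A @ x - b               # oversatisfied constraints get positive cost
        w = w * np.exp(-eps * cost)    # Hedge update: shrink oversatisfied weights
    return np.mean(iterates, axis=0)
```

For instance, with $K$ the unit simplex, the oracle can just put all mass on the coordinate maximizing $\alpha_i$.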
Whenever we update our weights, we reduce the weights of constraints we oversatisfied, so we will cover them less in future rounds, and we increase the weights of constraints we did not satisfy, so we will cover them more in future rounds. Our hope is that over time this converges to a solution where we satisfy all constraints to a roughly equal extent.

7.2.3 Analyzing Multiplicative Weights

Supposing that we do not discover our LP to be infeasible, how many rounds should we run, and how good will our solution be? If we define
\[ \rho = \max\Big\{1,\ \max_{i,\,x \in K} |a_i \cdot x - b_i|\Big\} \]
to be the maximum magnitude of any cost assigned to a constraint, then we may immediately apply Corollary 7.2 to find that after $T \ge \frac{4 \rho^2 \ln m}{\epsilon^2}$ rounds,
\[ \frac{1}{T}\sum_{t=1}^T p^{(t)} \cdot m^{(t)} \;\le\; \frac{1}{T}\sum_{t=1}^T m_i^{(t)} + \epsilon \]
holds for all $i \in [m]$, where $\epsilon \le 1/2$, $m^{(t)} = Ax^{(t)} - b \in [-\rho, \rho]^m$, and each $x^{(t)} \in K$. Note that we do not actually need to know $\rho$ in advance; it suffices to keep track of $\rho_t = \max\{1, \max_{i,\,t' \le t} |a_i \cdot x^{(t')} - b_i|\}$, the maximum cost seen so far, and run until $T \ge \frac{4 \rho_T^2 \ln m}{\epsilon^2}$.

What guarantee do we get? On the left hand side of this inequality we have
\[ \frac{1}{T}\sum_{t=1}^T p^{(t)} \cdot m^{(t)} = \frac{1}{T}\sum_{t=1}^T p^{(t)} \cdot \big(Ax^{(t)} - b\big) = \frac{1}{T}\sum_{t=1}^T \Big( p^{(t)\top} A\, x^{(t)} - p^{(t)} \cdot b \Big) \;\ge\; 0, \]
where the final inequality holds due to our oracle's properties. Therefore the left hand side is at least 0. And on the right hand side we have
\[ \frac{1}{T}\sum_{t=1}^T m_i^{(t)} = \frac{1}{T}\sum_{t=1}^T \big(a_i \cdot x^{(t)} - b_i\big) = a_i \cdot \Big(\frac{1}{T}\sum_{t=1}^T x^{(t)}\Big) - b_i = a_i \cdot \bar{x} - b_i. \]
Combining this with our bound for the left hand side, we get
\[ 0 \;\le\; a_i \cdot \bar{x} - b_i + \epsilon \quad\Longrightarrow\quad a_i \cdot \bar{x} \;\ge\; b_i - \epsilon. \]
Therefore we can obtain an $\epsilon$-feasible solution to $Ax \ge b$, $x \in K$ in $O\!\big(\frac{\rho^2 \log m}{\epsilon^2}\big)$ rounds, where $\rho = \max\{1, \max_{i,\,x \in K} |a_i \cdot x - b_i|\}$ is the width of the LP.

7.2.4 Example: Minimum Set Cover

Recall the minimum fractional set cover problem, with $m$ sets $\mathcal{F} = \{S_1, S_2, \ldots, S_m\}$ and $n$ elements $U$. The goal is to pick fractions of sets in order to cover each element to an extent of 1, i.e., to solve the following LP:
\[ \min \sum_S x_S \quad \text{s.t.} \quad \sum_{S \ni e} x_S \ge 1 \ \ \forall e, \qquad x_S \ge 0. \]
Suppose we know $\mathrm{OPT} = L \in [1, m]$, so $K = \{x \mid \sum_S x_S = L, \ x_S \ge 0\}$. We want to find $x \in K$ such that $\sum_{S \ni e} x_S \ge 1$ for all elements $e$. Our oracle, given some $p$, must try to find $x \in K$ such that
\[ \sum_e p_e \sum_{S \ni e} x_S \;\ge\; \sum_e p_e = 1. \]
Rewriting the left hand side,
\[ \sum_e p_e \sum_{S \ni e} x_S \;=\; \sum_S x_S \sum_{e \in S} p_e \;=\; \sum_S x_S\, p(S), \]
where $p(S)$ is the total weight of the elements in $S$. This quantity is clearly maximized over $K$ by concentrating on a set with the maximum weight, setting
\[ x_S = \begin{cases} L & \text{for some } S \in \mathcal{F} \text{ maximizing } p(S), \\ 0 & \text{for all other } S. \end{cases} \]
Note that the width of this LP is at most $\max_{e,\, x \in K} \big|\sum_{S \ni e} x_S - 1\big| \le L \le m$.

How does the weight update step work? Initially we set $w^{(1)}_e = 1$ for all constraints. Whenever an element is overcovered, we reduce the weight of its constraint, so we do not try as hard to cover it in the next step. Whenever an element is undercovered, we increase the weight of its constraint, so we try harder to cover it in the next step. Now, after $T = 4L^2 \ln n / \epsilon^2$ steps we obtain an $\epsilon$-approximate solution $\bar{x}$ such that
\[ \sum_S \bar{x}_S = L, \qquad \sum_{S \ni e} \bar{x}_S \ge 1 - \epsilon \ \ \forall e, \qquad \bar{x} \ge 0. \]
Note that, in this case, the constraint matrix is completely nonnegative, and we can scale up our solution to get a feasible solution $\hat{x} = \bar{x}/(1 - \epsilon)$ so that
\[ \sum_S \hat{x}_S = \frac{L}{1 - \epsilon} \le L(1 + 2\epsilon), \qquad \sum_{S \ni e} \hat{x}_S \ge 1 \ \ \forall e, \qquad \hat{x} \ge 0, \]
using $\epsilon \le 1/2$ in the first inequality.
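The set-cover oracle described above can be sketched in a few lines. The function name and the encoding of `sets` (a list of collections of element indices) are my own choices.

```python
import numpy as np

def set_cover_oracle(sets, p, L):
    """Oracle for K = {sum_S x_S = L, x >= 0} from the set-cover example.

    Given element weights p (a probability vector over elements), place all
    L units on a set S maximizing p(S), the total weight of elements in S;
    declare infeasibility if even that gives aggregated coverage below 1.
    """
    pS = np.array([sum(p[e] for e in S) for S in sets])   # p(S) for each set
    best = int(np.argmax(pS))
    if L * pS[best] < 1.0:
        return None                    # max over K of sum_S x_S p(S) is < 1
    x = np.zeros(len(sets))
    x[best] = L                        # concentrate the whole budget on one set
    return x
```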
7.2.5 Comments

1. The scaling we used for minimum set cover to obtain a non-optimal, feasible solution can be applied to any LP where $b_i > \epsilon$ for all $i$: indeed, we can just multiply all the $x$ values by $\max_i b_i/(b_i - \epsilon)$. This is often useful, particularly when we are going to round this LP solution and incur further losses, and hence losing this factor may be insignificant.

2. If the constraint matrix $A$ is all positive, the problem is said to be a covering problem (we are just interested in putting enough weight on $x$ to cover every constraint). If the constraint matrix is all negative, or equivalently, if we have $Ax \le b$ with an all-positive matrix $A$, the problem is said to be a packing problem (we are packing as much weight into $x$ as possible without violating any constraint). In either case, we can use a similar scaling trick to get a non-optimal, feasible solution, and in this case we can reduce the run-time further. Assume we have a covering problem: $\min\{c \cdot x \mid Ax \ge b, x \ge 0\}$. By scaling, we can transform this into a problem of the form $\min\{c \cdot x \mid Ax \ge \mathbf{1}, x \ge 0\}$. The uniform values $b_i = 1$ allow us to set the cost vectors to $m_i^{(t)} = a_i \cdot x^{(t)}$ instead of $m_i^{(t)} = a_i \cdot x^{(t)} - 1$; this translation does not change the algorithm. But the nonnegative cost vectors allow us to use Lemma 7.3 to reduce the runtime from $O\!\big(\frac{\rho^2 \log m}{\epsilon^2}\big)$ to $O\!\big(\frac{\rho \log m}{\epsilon^2}\big)$.

3. In general, the width of our LPs may not turn out to be as nice. For example, in the weighted minimum set cover problem
\[ \min \sum_S c_S x_S \quad \text{s.t.} \quad \sum_{S \ni e} x_S \ge 1 \ \ \forall e, \qquad x_S \ge 0, \]
our optimum, and hence the width, can increase to as much as $m \cdot \frac{\max_S c_S}{\min_S c_S}$. An approach developed by Garg and Könemann [GK07] can be used to solve such problems without the width penalty.

4. The MW algorithm does not need a perfect oracle. It suffices to be able, given $\alpha \in \mathbb{R}^n$ and $\beta \in \mathbb{R}$, either to determine that there is no $x \in K$ with $\alpha^\top x \ge \beta$, or else to return an $x \in K$ such that $\alpha \cdot x \ge \beta - \epsilon'$. This gives us solutions $\bar{x} \in K$ such that $A\bar{x} \ge b - (\epsilon + \epsilon')\mathbf{1}$.

5. There was exactly one point where we used the fact that our constraints were linear: that was in concluding that
\[ \frac{1}{T}\sum_{t=1}^T \big(a_i \cdot x^{(t)} - b_i\big) = a_i \cdot \bar{x} - b_i. \]
However, we can make a similar claim for any set of convex constraints as well: suppose we want to find $x \in K$ such that $f_i(x) \le 0$ for all $i \in [m]$, with the $f_i$'s convex. Then as long as we can efficiently solve the oracle problem of finding $x \in K$ with $\sum_i p_i^{(t)} f_i(x) \le 0$, the rest of the argument goes through. In particular, in the step where we used linearity, we could instead use convexity (Jensen's inequality):
\[ \frac{1}{T}\sum_{t=1}^T f_i\big(x^{(t)}\big) \;\ge\; f_i\Big(\frac{1}{T}\sum_{t=1}^T x^{(t)}\Big) = f_i(\bar{x}). \]

7.3 Solving SDPs with Multiplicative Weights

Suppose we now move to solving SDPs of the form
\[ \min\ C \bullet X \quad \text{s.t.} \quad A_i \bullet X \ge b_i \ \ \forall i \in [m], \qquad X \succeq 0. \]
Note that the first constraints are linear constraints; it is only the psd-ness constraint that is non-linear, so we only need to modify our MW algorithm by absorbing the $X \succeq 0$ constraint into the oracle. It will also be convenient to require the constraint $\mathrm{tr}(X) = 1$: usually we can guess the trace of the solution $X$. (If the trace of the solution we seek is not $1$ but $R$, we can scale the problem by $R$ to get unit trace.) Then the oracle we must implement is this:

Let $K := \{X \mid X \succeq 0,\ \mathrm{tr}(X) = 1\}$. Given a symmetric matrix $A \in \mathbb{R}^{n \times n}$ and $\beta \in \mathbb{R}$, does there exist $X \in K$ such that $A \bullet X \ge \beta$?

(Again, $A$ and $\beta$ will be obtained in the algorithm by setting $A^{(t)} := \sum_i p_i^{(t)} A_i$ and $\beta^{(t)} := \sum_i p_i^{(t)} b_i$.) But we know from Lecture 2 that this is equivalent to asking whether the maximum eigenvalue of the symmetric matrix $A$ is at least $\beta$. Indeed, if this is so, and if $\lambda_{\max}$ is the maximum eigenvalue of $A$ with unit eigenvector $x$, then
\[ A \bullet (xx^\top) = \mathrm{tr}(A^\top xx^\top) = \mathrm{tr}(Axx^\top) = \mathrm{tr}(\lambda_{\max}\, xx^\top) = \lambda_{\max}, \]
so our oracle should return $X = xx^\top$; else it should return "Infeasible". Moreover, using Comment 4 of Section 7.2.5, it suffices to return $x$ such that $x^\top A x \ge \lambda_{\max} - \epsilon$. How fast this can be done depends on the particular structure of the matrix $A$; in the next section we see that for the max-cut problem, the matrix $A$ itself is psd, and hence we can find such an $x$ relatively quickly.
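The eigenvalue oracle can be sketched with plain power iteration. This is a sketch under my own simplifications: a fixed iteration count rather than an $\epsilon$-dependent bound, and the function name is hypothetical.

```python
import numpy as np

def eigenvalue_oracle(A, beta, iters=200, seed=0):
    """SDP oracle for K = {X psd, tr(X) = 1}: return X in K with A . X >= beta,
    or None if there is none; equivalent to testing lambda_max(A) >= beta.

    Uses the power method on the symmetric psd matrix A, starting from a
    random unit vector.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(A.shape[0])
    x /= np.linalg.norm(x)
    for _ in range(iters):             # power iteration toward top eigenvector
        y = A @ x
        norm_y = np.linalg.norm(y)
        if norm_y == 0.0:
            break                      # A x = 0, so the Rayleigh quotient is 0
        x = y / norm_y
    lam = float(x @ A @ x)             # Rayleigh quotient, approx lambda_max
    if lam >= beta:
        return np.outer(x, x)          # rank-one X = x x^T with trace 1
    return None
```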
7.3.1 Example: Max Cut

This part is loosely based on the paper of Klein and Lu [KL96]. Recall the Max Cut SDP we derived in Lecture 2:
\[ \max\ \tfrac14 L \bullet X \quad \text{s.t.} \quad (e_i e_i^\top) \bullet X = 1 \ \ \forall i, \qquad X \succeq 0. \]
As usual, we will think of the edge weights as summing to 1, normalized so that $\mathrm{tr}(L) = \sum_i L_{ii} = \sum_{i,j} |w_{ij}| = 1$ (summing over ordered pairs $i, j$). If we let $b = \mathrm{OPT}$ and scale $X$ by $1/n$, we are looking for feasibility of the constraints:
\[ \frac{n}{4b}\, L \bullet X \ge 1, \qquad n (e_i e_i^\top) \bullet X = 1 \ \ \forall i, \qquad X \succeq 0. \]
Finally, if we take $K = \{X \mid X \succeq 0, \mathrm{tr}(X) = 1\}$, the above SDP is equivalent to finding $X \in K$ such that
\[ \frac{n}{4b}\, L \bullet X \ge 1, \qquad n (e_i e_i^\top) \bullet X \ge 1 \ \ \forall i. \]
(This is because $\mathrm{tr}(X) = 1$ means $\sum_i X_{ii} = 1$. Since we have the constraints $n (e_i e_i^\top) \bullet X = n X_{ii} \ge 1$, this forces $X_{ii} = 1/n$ for all $i$.) By the discussion of the previous section, our oracle will need to check whether there exists $X \in K$ such that $D^{(t)} \bullet X \ge 1$, where
\[ D^{(t)} = p_0^{(t)}\, \frac{n}{4b}\, L + \sum_{i=1}^n p_i^{(t)}\, n (e_i e_i^\top). \]
And again, this is equivalent to checking whether $\lambda_{\max}(D^{(t)}) \ge 1$.

Implementing the oracle. It is useful to note that $D^{(t)}$ is positive semidefinite: indeed, it is a nonnegative combination of the Laplacian (which is psd) and a bunch of matrices $e_i e_i^\top$ (which are psd).

Note: In Homework #6, you will show that for any psd matrix $D$, the power method starting with a random unit vector can find $x$ such that $D \bullet (xx^\top) \in [\lambda_{\max}(D)/(1+\epsilon),\ \lambda_{\max}(D)]$. The algorithm succeeds with high probability, and runs in $O(\epsilon^{-1} m \log n)$ time, where $m$ is the number of edges in $G$ (and hence the number of non-zeroes in $L$).

So we can run this algorithm: if it answers with an $x$ such that $D^{(t)} \bullet (xx^\top)$ is smaller than $1/(1+\epsilon)$, we answer that $\lambda_{\max}(D^{(t)}) < 1$. Else we return the vector $x$: this has the property that $D^{(t)} \bullet (xx^\top) \ge 1/(1+\epsilon) \ge 1 - \epsilon$. Now, using Comment 4 of Section 7.2.5, we know this will suffice to get a solution that has $O(\epsilon)$ infeasibility.

Bounding the width. The width of our algorithm is the maximum possible magnitude of $D^{(t)} \bullet X$ for $X \in K$, i.e., the maximum possible eigenvalue of $D^{(t)}$. Since $D^{(t)}$ is positive
semidefinite, all of its eigenvalues are non-negative. Moreover, $\mathrm{tr}(L) = 1$, and also $\mathrm{tr}(e_i e_i^\top) = 1$. So
\[ \lambda_{\max}(D^{(t)}) \le \sum_i \lambda_i(D^{(t)}) = \mathrm{tr}(D^{(t)}) = \mathrm{tr}\Big( p_0^{(t)}\, \frac{n}{4b}\, L + \sum_{i=1}^n p_i^{(t)}\, n (e_i e_i^\top) \Big) = p_0^{(t)}\, \frac{n}{4b}\, \mathrm{tr}(L) + \sum_{i=1}^n p_i^{(t)}\, n\, \mathrm{tr}(e_i e_i^\top) \le n\Big(1 + \frac{1}{4b}\Big). \]
Finally, the max-cut values we are interested in lie between $1/2$ (since the max-cut is at least half the edge-weight) and $1$. So $b \in [1/2, 1]$, and the width is $O(n)$.

Running Time. Setting the width $\rho = O(n)$ gives us a runtime of
\[ O\Big( \frac{n^2 \log n}{\epsilon^2} \cdot T_{\text{oracle}} \Big), \]
which we can reduce to
\[ O\Big( \frac{n \log n}{\epsilon^2} \cdot T_{\text{oracle}} \Big) \]
using Lemma 7.3, since our cost vectors can be made all nonnegative. Finally, plugging in our oracle gives a final runtime of
\[ O\Big( \frac{m n \log^2 n}{\epsilon^3} \Big), \]
where $m$ is the number of edges in our graph.

Note: We can now scale the average matrix $\bar{X}$ by $n$ to get a matrix $\hat{X}$ satisfying:
\[ \tfrac14 L \bullet \hat{X} \ge b(1 - \epsilon), \qquad \hat{X}_{ii} \ge 1 - \epsilon, \qquad \mathrm{tr}(\hat{X}) = n, \qquad \hat{X} \succeq 0. \]
The attentive reader will observe that this is not as nice as we would like. We would really want each $\hat{X}_{ii} \in [1 - \epsilon, 1 + \epsilon]$; then we could transform this solution into one where $\hat{X}_{ii} = 1$ and $\tfrac14 L \bullet \hat{X} \ge b(1 - O(\epsilon))$. What we have only guarantees that $\hat{X}_{ii} \in [1 - \epsilon, 1 + n\epsilon]$, and so we would need to set $\epsilon \approx 1/n$ for any non-trivial guarantees. This would still give us a run-time of $O(\epsilon^{-3} m n\,\mathrm{poly}\log n) = O(m n^4\,\mathrm{poly}\log n)$: still polynomial (and useful to exemplify the technique), but it could be better. One can avoid this loss by defining $K$ differently, in fact in a way that is similar to Section 7.2.1; the details can be found in [KL96]. One can do even better using matrix multiplicative weights algorithms: see, e.g., [AK07, Ste10].
Bibliography

[AHK05] Sanjeev Arora, Elad Hazan, and Satyen Kale. The multiplicative weights update method: a meta-algorithm and applications. Technical report, Princeton University, 2005.

[AK07] Sanjeev Arora and Satyen Kale. A combinatorial, primal-dual approach to semidefinite programs. In STOC, pages 227-236, 2007.

[GK07] Naveen Garg and Jochen Könemann. Faster and simpler algorithms for multicommodity flow and other fractional packing problems. SIAM J. Comput., 37(2):630-652 (electronic), 2007.

[KL96] Philip Klein and Hsueh-I Lu. Efficient approximation algorithms for semidefinite programs arising from MAX CUT and COLORING. In Proceedings of the Twenty-eighth Annual ACM Symposium on the Theory of Computing (Philadelphia, PA, 1996), pages 338-347, New York, 1996. ACM.

[Ste10] David Steurer. Fast SDP algorithms for constraint satisfaction problems. In SODA, pages 684-697, 2010.