CS261: Problem Set #3

Due by 11:59 PM on Tuesday, February 23, 2016

Instructions:

(1) Form a group of 1-3 students. You should turn in only one write-up for your entire group.

(2) Submission instructions: We are using Gradescope for the homework submissions. Go to www.gradescope.com to either login or create a new account. Use the course code 9B3BEM to register for CS261. Only one group member needs to submit the assignment. When submitting, please remember to add all group member names in Gradescope.

(3) Please type your solutions if possible, and we encourage you to use the LaTeX template provided on the course home page.

(4) Write convincingly but not excessively.

(5) Some of these problems are difficult, so your group may not solve them all to completion. In this case, you can write up what you've got (see (3) above): partial proofs, lemmas, high-level ideas, counterexamples, and so on.

(6) Except where otherwise noted, you may refer to the course lecture notes only. You can also review any relevant materials from your undergraduate algorithms course.

(7) You can discuss the problems verbally at a high level with other groups. And of course, you are encouraged to contact the course staff (via Piazza or office hours) for additional help.

(8) If you discuss solution approaches with anyone outside of your group, you must list their names on the front page of your write-up.

(9) Refer to the course Web page for the late day policy.

Problem 13

This problem fills in some gaps in our proof sketch of strong linear programming duality.

(a) For this part, assume the version of Farkas's Lemma stated in Lecture #9: given $A \in \mathbb{R}^{m \times n}$ and $b \in \mathbb{R}^m$, exactly one of the following statements holds:

(i) there is an $x \in \mathbb{R}^n$ such that $Ax = b$ and $x \ge 0$;

(ii) there is a $y \in \mathbb{R}^m$ such that $y^T A \ge 0$ and $y^T b < 0$.

Deduce from this a second version of Farkas's Lemma, stating that for $A$ and $b$ as above, exactly one of the following statements holds:

(iii) there is an $x \in \mathbb{R}^n$ such that $Ax \le b$;

(iv) there is a $y \in \mathbb{R}^m$ such that $y \ge 0$, $y^T A = 0$, and $y^T b < 0$.

[Hint: note the similarity between (i) and (iv). Also note that if (iv) has a solution, then it has a solution with $y^T b = -1$.]

(b) Use the second version of Farkas's Lemma to prove the following version of strong LP duality: if the linear programs

\[ \max \; c^T x \quad \text{subject to} \quad Ax \le b, \]

with $x$ unrestricted, and

\[ \min \; b^T y \quad \text{subject to} \quad A^T y = c, \; y \ge 0, \]

are both feasible, then they have equal optimal objective function values.

[Hint: weak duality is easy to prove directly. For strong duality, let $\gamma$ denote the optimal objective function value of the dual linear program. Add the constraint $c^T x \ge \gamma$ to the primal linear program and use Farkas's Lemma to show that the feasible region is non-empty.]
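For intuition on the hint in (b): adding $c^T x \ge \gamma$ to the primal and invoking Farkas's Lemma means applying the (iii)/(iv) version to an augmented system. One natural way to package the bookkeeping (a sketch of the setup only, not the required proof) is

\[
  \widehat{A}\, x \le \widehat{b},
  \qquad \text{where} \qquad
  \widehat{A} = \begin{pmatrix} A \\ -c^T \end{pmatrix},
  \qquad
  \widehat{b} = \begin{pmatrix} b \\ -\gamma \end{pmatrix},
\]

since $c^T x \ge \gamma$ is equivalent to $-c^T x \le -\gamma$. Statement (iii) for $(\widehat{A}, \widehat{b})$ then says exactly that the augmented primal is feasible, and (iv) describes the alternative to be ruled out.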

Problem 14

Recall the multicommodity flow problem from Exercise 17. The input consists of a directed graph $G = (V, E)$, $k$ commodities or source-sink pairs $(s_1, t_1), \ldots, (s_k, t_k)$, and a positive capacity $u_e$ for each edge $e$. Consider also the multicut problem, where the input is the same as in the multicommodity flow problem, and feasible solutions are subsets $F \subseteq E$ of edges such that, for every commodity $(s_i, t_i)$, there is no $s_i$-$t_i$ path in $(V, E \setminus F)$. (Assume that $s_i$ and $t_i$ are distinct for each $i$.) The value of a multicut $F$ is just the total capacity $\sum_{e \in F} u_e$.

(a) Formulate the multicommodity flow problem as a linear program with one decision variable for each path $P$ that travels from a source $s_i$ to the corresponding sink $t_i$. Aside from nonnegativity constraints, there should only be $m$ constraints (one per edge). [Note: this is a different linear programming formulation than the one asked for in Exercise 21.]

(b) Take the dual of the linear program in (a). Prove that every optimal 0-1 solution of this dual (i.e., among all feasible solutions that assign each decision variable the value 0 or 1, one of minimum objective function value) is the characteristic vector of a minimum-value multicut.

(c) Show by example that the optimal solution to this dual linear program can have objective function value strictly smaller than that of every 0-1 feasible solution. In light of your example, explain a sense in which there is no max-flow/min-cut theorem for multicommodity flows and multicuts.
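Since part (a)'s formulation has one variable per path and one constraint per edge, it can be transcribed almost verbatim into an LP solver. Below is a minimal sketch, assuming Python with networkx and scipy; the toy instance and all names are illustrative, not part of the assignment:

# Sketch: build and solve the path-based LP from part (a) on a toy instance.
import networkx as nx
from scipy.optimize import linprog

G = nx.DiGraph()
for u, w, cap in [('s1', 'v', 1.0), ('s2', 'v', 1.0),
                  ('v', 't1', 1.0), ('v', 't2', 1.0)]:
    G.add_edge(u, w, cap=cap)
commodities = [('s1', 't1'), ('s2', 't2')]

# One decision variable f_P >= 0 per source-sink path P.
paths = [p for (s, t) in commodities for p in nx.all_simple_paths(G, s, t)]
path_edges = [set(zip(p, p[1:])) for p in paths]

# One capacity constraint per edge: sum of f_P over paths P using e is <= u_e.
E = list(G.edges)
A = [[1.0 if e in pe else 0.0 for pe in path_edges] for e in E]
b = [G.edges[e]['cap'] for e in E]

# Maximize the total flow (linprog minimizes, so negate the objective).
res = linprog([-1.0] * len(paths), A_ub=A, b_ub=b, bounds=(0, None))
print('maximum multicommodity flow value:', -res.fun)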

Problem 15

This problem gives a linear-time (!) randomized algorithm for solving linear programs that have a large number $m$ of constraints and a small number $n$ of decision variables. (The constant in the linear-time guarantee $O(m)$ will depend exponentially on $n$.)

Consider a linear program of the form

\[ \max \; c^T x \quad \text{subject to} \quad Ax \le b. \]

For simplicity, assume that the linear program is feasible with a bounded feasible region, and let $M$ be large enough that $|x_j| < M$ for every coordinate of every feasible solution. Assume also that the linear program is non-degenerate, in the sense that no feasible point satisfies more than $n$ constraints with equality. For example, in the plane (two decision variables), this just means that there do not exist three different constraints (i.e., halfplanes) whose boundaries meet at a common point. Finally, assume that the linear program has a unique optimal solution. (All of these simplifying assumptions can be removed without affecting the asymptotic running time; we leave the details to the interested reader.)

Let $C = \{1, 2, \ldots, m\}$ denote the set of constraints of the linear program. Let $B$ denote additional constraints asserting that $-M \le x_j \le M$ for every $j$. The high-level idea of the algorithm is: (i) drop a random constraint and recursively compute the optimal solution $x$ of the smaller linear program; (ii) if $x$ is feasible for the original linear program, return it; (iii) else, if $x$ violates the constraint $a_i^T x \le b_i$, then change this inequality to an equality and recursively solve the resulting linear program.

More precisely, consider the following recursive algorithm with two arguments. The first argument $C_1$ is a subset of inequality constraints that must be satisfied (initially equal to $C$). The second argument is a subset $C_2$ of constraints that must be satisfied with equality (initially $\emptyset$). The responsibility of a recursive call is to return a point maximizing $c^T x$ over all points that satisfy all the constraints of $C_1 \cup B$ (as inequalities) and also those of $C_2$ (as equations).

Linear-Time Linear Programming

Input: two disjoint subsets $C_1, C_2 \subseteq C$ of constraints

Base case #1: if $|C_2| = n$, return the unique point that satisfies every constraint of $C_2$ with equality.

Base case #2: if $|C_1| + |C_2| = n$, return the point that maximizes $c^T x$ subject to $a_i^T x \le b_i$ for every $i \in C_1$, $a_i^T x = b_i$ for every $i \in C_2$, and the constraints in $B$.

Recursive step:
  choose $i \in C_1$ uniformly at random
  recurse with the sets $C_1 \setminus \{i\}$ and $C_2$ to obtain a point $x$
  if $a_i^T x \le b_i$, then return $x$
  else, recurse with the sets $C_1 \setminus \{i\}$ and $C_2 \cup \{i\}$, and return the result

(a) Prove that this algorithm terminates with the optimal solution $x^*$ of the original linear program. [Hint: be sure to explain why, in the "else" case, it's OK to recurse with the $i$th constraint set to an equation.]

(b) Let $T(m, s)$ denote the expected number of recursive calls made by the algorithm to solve an instance with $|C_1| = m$ and $|C_2| = s$ (with the number $n$ of variables fixed). Prove that $T$ satisfies the following recurrence:

\[
T(m, s) =
\begin{cases}
1 & \text{if } s = n \text{ or } m + s = n \\
T(m-1, s) + \frac{n-s}{m}\, T(m-1, s+1) & \text{otherwise.}
\end{cases}
\]

[Hint: you should use the non-degeneracy assumption in this part.]

(c) Prove that $T(m, 0) \le n! \, m$. [Hint: it might be easiest to make the variable substitution $\delta = n - s$ and proceed by simultaneous induction on $m$ and $\delta$.]

(d) Conclude that, for every fixed constant $n$, the algorithm above can be implemented so that the expected running time is $O(m)$ (where the hidden constant can depend arbitrarily on $n$).
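The recursion is short enough to implement directly. Here is a sketch, assuming Python with numpy and scipy; for brevity the base cases are delegated to scipy's LP solver, and the feasibility tolerance and example instance are illustrative choices, not part of the problem:

import random
import numpy as np
from scipy.optimize import linprog

def solve_recursive(c, A, b, C1, C2, M=1e6):
    n = len(c)
    if len(C2) == n:
        # Base case #1: the unique point satisfying all of C2 with equality.
        idx = list(C2)
        return np.linalg.solve(A[idx], b[idx])
    if len(C1) + len(C2) == n:
        # Base case #2: constantly many constraints; solve directly
        # (delegated to scipy here for brevity), within the box B.
        i1, i2 = list(C1), list(C2)
        res = linprog(-c,
                      A_ub=A[i1] if i1 else None, b_ub=b[i1] if i1 else None,
                      A_eq=A[i2] if i2 else None, b_eq=b[i2] if i2 else None,
                      bounds=(-M, M))
        return res.x
    # Recursive step: drop a random inequality constraint.
    i = random.choice(sorted(C1))
    x = solve_recursive(c, A, b, C1 - {i}, C2, M)
    if A[i] @ x <= b[i] + 1e-9:      # dropped constraint still satisfied
        return x
    # Otherwise the constraint is tight at the optimum: pin it to equality.
    return solve_recursive(c, A, b, C1 - {i}, C2 | {i}, M)

# Illustrative instance: maximize x1 + x2 over the unit square, written Ax <= b.
A = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
b = np.array([1.0, 1.0, 0.0, 0.0])
c = np.array([1.0, 1.0])
print(solve_recursive(c, A, b, C1=set(range(4)), C2=set()))  # approx. [1, 1]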

Problem 16

This problem considers a variant of the online decision-making problem. There are $n$ experts, where $n$ is a power of 2.

Combining Expert Advice

At each time step $t = 1, 2, \ldots, T$:

- each expert offers a prediction of the realization of a binary event (e.g., whether a stock will go up or down)
- a decision-maker picks a probability distribution $p^t$ over the possible realizations 0 and 1 of the event
- the actual realization $r^t \in \{0, 1\}$ of the event is revealed
- a 0 or 1 is chosen according to the distribution $p^t$, and a mistake occurs whenever it is different from $r^t$

You are promised that there is at least one omniscient expert who makes a correct prediction at every time step.

(a) Prove that the minimum worst-case number of mistakes that a deterministic algorithm can make is precisely $\log_2 n$.

(b) Prove that the minimum worst-case expected number of mistakes that a randomized algorithm can make is precisely $\frac{1}{2} \log_2 n$.

Problem 17

In Lecture #11 we saw that the follow-the-leader (FTL) algorithm, and more generally every deterministic algorithm, can have regret that grows linearly with $T$. This problem outlines a randomized variant of FTL, the follow-the-perturbed-leader (FTPL) algorithm, with worst-case regret comparable to that of the multiplicative weights algorithm. In the description of FTPL, we define each probability distribution $p^t$ over actions implicitly through a randomized subroutine.

Follow-the-Perturbed-Leader (FTPL) Algorithm

for each action $a \in A$ do
  independently sample a geometric random variable with parameter $\eta$ (equivalently, when repeatedly flipping a coin that comes up heads with probability $\eta$, count the number of flips up to and including the first heads), denoted by $X_a$
for each time step $t = 1, 2, \ldots, T$ do
  choose the action $a$ that maximizes the perturbed cumulative reward $X_a + \sum_{u=1}^{t-1} r^u(a)$ so far

For convenience, assume that, at every time step $t$, there is no pair of actions whose (unperturbed) cumulative rewards-so-far differ by an integer.

(a) Prove that, at each time step $t = 1, 2, \ldots, T$, with probability at least $1 - \eta$, the largest perturbed cumulative reward of an action prior to $t$ is more than 1 larger than the second-largest such perturbed reward. [Hint: sample the $X_a$'s gradually by flipping coins only as needed, pausing once the action $a$ with the largest perturbed cumulative reward is identified. Resuming, only $X_a$ is not yet fully determined. What can you say if the next coin flip comes up tails?]

(b) As a thought experiment, consider the (unimplementable) algorithm that, at each time step $t$, picks the action that maximizes the perturbed cumulative reward $X_a + \sum_{u=1}^{t} r^u(a)$ over $a \in A$, taking into account the current reward vector. Prove that the regret of this algorithm is at most $\max_{a \in A} X_a$. [Hint: consider first the special case where $X_a = 0$ for all $a$. Iteratively transform the action sequence that always selects the best action in hindsight into the sequence chosen by the proposed algorithm. Work backward from time $T$, showing that the reward only increases with each step of the transformation.]

(c) Prove that $\mathbf{E}[\max_{a \in A} X_a] \le b \eta^{-1} \ln n$, where $n$ is the number of actions and $b > 0$ is a constant independent of $\eta$ and $n$. [Hint: use the definition of a geometric random variable and remind yourself about the union bound.]

(d) Prove that, for a suitable choice of $\eta$, the worst-case expected regret of the FTPL algorithm is at most $b \sqrt{T \ln n}$, where $b > 0$ is a constant independent of $n$ and $T$.
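To see the FTPL dynamics concretely, here is a small simulation sketch (assuming Python with numpy; the i.i.d. reward model, the random seed, and the parameter choice $\eta = \sqrt{\ln n / T}$, a natural scale given part (d), are all illustrative assumptions, not part of the problem):

import numpy as np

rng = np.random.default_rng(0)

def ftpl_total_reward(rewards, eta):
    """Run FTPL on a T x n array of rewards in [0, 1]; return the total reward."""
    T, n = rewards.shape
    X = rng.geometric(eta, size=n)           # one geometric perturbation per action
    cumulative = np.zeros(n)                 # unperturbed cumulative rewards so far
    total = 0.0
    for t in range(T):
        a = int(np.argmax(X + cumulative))   # follow the perturbed leader
        total += rewards[t, a]
        cumulative += rewards[t]
    return total

T, n = 1000, 8
rewards = rng.random((T, n))                 # illustrative i.i.d. reward model
eta = np.sqrt(np.log(n) / T)                 # scale suggested by part (d)
best_fixed = rewards.sum(axis=0).max()       # benchmark: best action in hindsight
print('FTPL:', ftpl_total_reward(rewards, eta), 'best fixed action:', best_fixed)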

Problem 18

In this problem we'll show that there is no online algorithm for the online bipartite matching problem with competitive ratio better than $1 - \frac{1}{e} \approx 63.2\%$.

Consider the following probability distribution over online bipartite matching instances. There are $n$ left-hand side vertices $L$, which are known up front. Let $\pi$ be an ordering of $L$, chosen uniformly at random. The $n$ vertices of the right-hand side $R$ arrive one by one, with the $i$th vertex of $R$ connected to the last $n - i + 1$ vertices of $L$ (according to the random ordering $\pi$).

(a) Explain why $OPT = n$ for every such instance.

(b) Consider an arbitrary deterministic online algorithm $A$. Prove that for every $i \in \{1, 2, \ldots, n\}$, the probability (over the choice of $\pi$) that $A$ matches the $i$th vertex of $L$ (according to $\pi$) is at most

\[
  \min\left\{ \sum_{j=1}^{i-1} \frac{1}{n-j+1},\; 1 \right\}.
\]

[Hint: for example, in the first iteration, assume that $A$ matches the first vertex of $R$ to the vertex $v \in L$. Note that $A$ must make this decision without knowing $\pi$. What can you say if $v$ does not happen to be the first vertex of $\pi$?]

(c) Prove that for every deterministic online algorithm $A$, the expected (over $\pi$) size of the matching produced by $A$ is at most

\[
  \sum_{i=1}^{n} \min\left\{ \sum_{j=1}^{i-1} \frac{1}{n-j+1},\; 1 \right\}, \tag{1}
\]

and prove that (1) approaches $n(1 - \frac{1}{e})$ as $n \to \infty$. [Hint: for the second part, recall that $\sum_{j=1}^{d} \frac{1}{j} \approx \ln d$ (up to an additive constant less than 1). For what value of $i$ is the inner sum roughly equal to 1?]

(d) Extend (c) to randomized online algorithms $A$, where the expectation is now over both $\pi$ and the internal coin flips of $A$. [Hint: use the fact that a randomized online algorithm is a probability distribution over deterministic online algorithms (as flipping all of $A$'s coins in advance yields a deterministic algorithm).]

(e) Prove that for every $\epsilon > 0$ and (possibly randomized) online bipartite matching algorithm $A$, there exists an input such that the expected (over $A$'s coin flips) size of $A$'s output is no more than $1 - \frac{1}{e} + \epsilon$ times that of an optimal solution.
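The hard distribution is easy to simulate. The following illustrative sketch (Python; the uniform-random greedy rule and trial count are arbitrary choices, not part of the problem) estimates the expected matching size of one natural online algorithm on this distribution; by parts (c) and (d), no online algorithm's expected ratio can asymptotically beat $1 - 1/e \approx 0.632$:

import random

def average_greedy_matching(n, trials=1000):
    """Estimate E[matching size] of uniform-random greedy on the hard instance."""
    total = 0
    for _ in range(trials):
        pi = list(range(n))
        random.shuffle(pi)                 # uniformly random ordering of L
        matched = set()
        for i in range(n):                 # i-th vertex of R arrives (0-indexed)
            neighbors = [v for v in pi[i:] if v not in matched]
            if neighbors:                  # greedy: match to a random neighbor
                matched.add(random.choice(neighbors))
                total += 1
    return total / trials

for n in (10, 50, 200):
    # By (c)-(d), no online algorithm's ratio asymptotically exceeds 1 - 1/e.
    print(n, round(average_greedy_matching(n) / n, 3))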