Asynchronous Algorithms for Conic Programs, including Optimal, Infeasible, and Unbounded Ones

Asynchronous Algorithms for Conic Programs, including Optimal, Infeasible, and Unbounded Ones. Wotao Yin, joint work with Fei Feng, Robert Hannah, Yanli Liu, and Ernest Ryu (UCLA Math). DIMACS Workshop: Distributed Optimization, Information Processing, and Learning, August 2017.

Overview. Conic programming problem (P): minimize c^T x subject to Ax = b, x ∈ K, where K is a closed convex cone. This talk: a first-order iteration; parallel, with linear speedup, asynchronous; still working if the problem is unsolvable.

Approach overview. Douglas-Rachford fixed-point iteration z^{k+1} = T z^k, where T depends on A, b, c (it is equivalent to standard ADMM, but the different form is important) and has nice properties: convergence guarantees and rates; coordinate friendly: break z into m blocks so that cost(T_i) ≈ (1/m) cost(T); diverges nicely: if (P) has no primal-dual solution pair, then z^k diverges, and z^k − z^{k+1} tells a whole lot.

Douglas-Rachford splitting (Lions-Mercier 79). Proximal mapping of a closed function h: prox_{γh}(x) = argmin_z { h(z) + (1/(2γ)) ‖z − x‖² }. The Douglas-Rachford splitting (DRS) method solves minimize f(x) + g(x) by iterating z^{k+1} = T z^k, defined as: x^{k+1/2} = prox_{γg}(z^k); x^{k+1} = prox_{γf}(2 x^{k+1/2} − z^k); z^{k+1} = z^k + (x^{k+1} − x^{k+1/2}).
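
For concreteness, here is a minimal Python sketch of the iteration above, assuming prox_f and prox_g are supplied as callables; the toy instance (an l1-regularized quadratic) and all names are illustrative, not from the talk.

```python
import numpy as np

def drs_step(z, prox_f, prox_g):
    """One Douglas-Rachford step z^{k+1} = T z^k for minimizing f(x) + g(x)."""
    x_half = prox_g(z)               # x^{k+1/2} = prox_{gamma g}(z^k)
    x_next = prox_f(2 * x_half - z)  # x^{k+1}   = prox_{gamma f}(2 x^{k+1/2} - z^k)
    return z + (x_next - x_half)     # z^{k+1}   = z^k + (x^{k+1} - x^{k+1/2})

# toy instance: minimize |x|_1 + (1/2)||x - a||^2 with gamma = 1
a = np.array([3.0, -0.2, 1.5])
gamma = 1.0
prox_g = lambda v: np.sign(v) * np.maximum(np.abs(v) - gamma, 0.0)  # prox of gamma*|x|_1
prox_f = lambda v: (v + gamma * a) / (1.0 + gamma)                  # prox of gamma*(1/2)||x-a||^2
z = np.zeros(3)
for _ in range(200):
    z = drs_step(z, prox_f, prox_g)
print(prox_g(z))  # approx. the soft-thresholded solution [2, 0, 0.5]
```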

Apply DRS to conic programming: minimize c^T x subject to Ax = b, x ∈ K. Split as minimize ( c^T x + δ_{Ax=b}(x) ) + δ_K(x), with f(x) := c^T x + δ_{Ax=b}(x) and g(x) := δ_K(x); the cone K is nonempty, closed, and convex. Each iteration: project onto K, then project onto {x : Ax = b}. Per-iteration cost: O(n²) if x ∈ R^n (by pre-factorizing AA^T). Prior work: ADMM for SDP (Wen-Goldfarb-Yin 09).
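
A minimal sketch of one way to realize this iteration, assuming the cone projection is available (the nonnegative orthant stands in for K below, which turns the toy problem into an LP) and AA^T is factorized once up front; the function names are illustrative, not from the talk.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def conic_drs(A, b, c, proj_K, gamma=1.0, iters=1000):
    """Sketch of DRS for: minimize c^T x subject to Ax = b, x in K."""
    AAt = cho_factor(A @ A.T)                 # pre-factorize AA^T once
    def proj_affine(v):                       # projection onto {x : Ax = b}
        return v - A.T @ cho_solve(AAt, A @ v - b)
    z = np.zeros(A.shape[1])
    for _ in range(iters):
        x_half = proj_K(z)                                # prox of gamma * delta_K
        x_next = proj_affine(2 * x_half - z - gamma * c)  # prox of gamma * (c^T x + delta_{Ax=b})
        z = z + (x_next - x_half)
    return proj_K(z)

# toy LP: minimize x1 + x2 subject to x1 + x2 = 1, x >= 0
A = np.array([[1.0, 1.0]]); b = np.array([1.0]); c = np.array([1.0, 1.0])
print(conic_drs(A, b, c, proj_K=lambda v: np.maximum(v, 0.0)))
```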

Other choices of splitting. Linearized ADMM and primal-dual splitting: avoid inverting the full A. Variations of Frank-Wolfe: avoid expensive projections onto the SDP cone. Subgradient and bundle methods, ...

Coordinate friendly (CF) [Peng-Wu-Xu-Yan-Yin, AMSA 16]. A (block) coordinate update is fast only if the subproblems are simple. Definition: T : H → H is CF if, for any z and i ∈ [m], with z⁺ := (z_1, ..., (T z)_i, ..., z_m), it holds that cost[{z, M(z)} → {z⁺, M(z⁺)}] = O((1/m) · cost[z → T z]), where M(z) is some quantity maintained in memory.
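
To illustrate the role of the maintained quantity M(z), here is a small sketch; the operator T z = z − η A^T(Az − b) is my own illustrative choice (a gradient step for (1/2)‖Az − b‖²), not one from the talk. Caching M(z) = Az lets a single-coordinate update, including its bookkeeping, cost O(#rows) instead of the full O(#rows · #cols) application of T.

```python
import numpy as np

rng = np.random.default_rng(0)
rows, cols = 200, 1000
A = rng.standard_normal((rows, cols))
b = rng.standard_normal(rows)
eta = 1.0 / np.linalg.norm(A, 2) ** 2   # conservative step size

z = np.zeros(cols)
Az = A @ z                              # maintained quantity M(z)

def coordinate_update(j):
    """Apply (T z)_j to coordinate j and refresh M(z) = Az cheaply."""
    global Az
    grad_j = A[:, j] @ (Az - b)         # O(rows): uses the cached Az
    delta = -eta * grad_j
    z[j] += delta
    Az += delta * A[:, j]               # O(rows) refresh of M(z)

for k in range(5000):
    coordinate_update(k % cols)
```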

Composed operators. Nine rules [Peng-Wu-Xu-Yan-Yin, AMSA 16] for CF compositions T_1 ∘ T_2 cover many examples. General principles: T_1 ∘ T_2 inherits the (weaker) separability property; if T_1 is CF and T_2 is either cheap, easy to maintain, or directly CF, then T_1 ∘ T_2 is CF; if T_1 is separable or cheap, T_1 ∘ T_2 is easier to make CF.

A list of CF compositions T_1 ∘ T_2: many convex image processing models; portfolio optimization; most sparse optimization problems; all LPs, all SOCPs, and SDPs without large cones; most ERM problems; ...

Example: DRS for SOCP. Second-order cone: Q^n = {x ∈ R^n : x_1 ≥ ‖(x_2, ..., x_n)‖_2}. The DRS operator has the form T = linear ∘ proj_{Q^{n_1} × ··· × Q^{n_p}}. CF is trivial if all cones are small. Now consider a big cone; property: proj_{Q^n}(x) = (α x_1, β x_2, ..., β x_n), where α, β depend on x_1 and γ := ‖(x_2, ..., x_n)‖_2. Given γ, when a single coordinate x_i is updated, refreshing γ costs O(1). So by maintaining γ, proj_{Q^n} becomes cheap, and T = linear ∘ cheap is CF.
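
A small sketch of the standard closed-form projection onto Q^n together with the O(1) refresh of the maintained γ (illustrative code, not from the talk):

```python
import numpy as np

def proj_soc(x):
    """Projection onto Q^n = {x : x[0] >= ||x[1:]||_2}."""
    t, v = x[0], x[1:]
    gamma = np.linalg.norm(v)
    if gamma <= t:                    # already in the cone
        return x.copy()
    if gamma <= -t:                   # in the polar cone: project to the origin
        return np.zeros_like(x)
    beta = (t + gamma) / (2 * gamma)  # boundary case
    return np.concatenate(([(t + gamma) / 2], beta * v))

def update_coordinate(x, gamma_sq, i, new_value):
    """Change x[i] (i >= 1) and refresh the cached gamma^2 in O(1)."""
    gamma_sq += new_value ** 2 - x[i] ** 2
    x[i] = new_value
    return gamma_sq

x = np.array([1.0, 3.0, -4.0])
gamma_sq = float(x[1:] @ x[1:])
gamma_sq = update_coordinate(x, gamma_sq, 2, 0.5)  # gamma stays known without an O(n) pass
print(proj_soc(x))
```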

Fixed-point iterations. Full update: z^{k+1} = T z^k. (Block) coordinate update (CU): choose i_k ∈ [m] and set z_i^{k+1} = z_i^k + η ((T z^k)_i − z_i^k) if i = i_k, and z_i^{k+1} = z_i^k otherwise. Parallel CU: p agents choose I_k ⊆ [m] and set z_i^{k+1} = z_i^k + η ((T z^k)_i − z_i^k) if i ∈ I_k, and z_i^{k+1} = z_i^k otherwise. The step size η depends on properties of T, i_k, and I_k.
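
A compact sketch of the three update rules for a generic operator T; the averaged operator used below is an illustrative choice, not one from the talk.

```python
import numpy as np

def full_update(z, T):
    return T(z)

def coordinate_update(z, T, i, eta=1.0):
    """Update only coordinate/block i of z toward (T z)_i."""
    z = z.copy()
    z[i] += eta * (T(z)[i] - z[i])  # a CF implementation would form only the i-th block
    return z

def parallel_update(z, T, I, eta=0.5):
    """Synchronous parallel CU: the index set I is updated simultaneously."""
    Tz = T(z)
    z = z.copy()
    for i in I:
        z[i] += eta * (Tz[i] - z[i])
    return z

# illustrative nonexpansive operator: T z = 0.5 * (z + Proj_{[0,1]^n}(z))
T = lambda v: 0.5 * (v + np.clip(v, 0.0, 1.0))
z = np.array([2.0, -1.0, 0.3])
z = coordinate_update(z, T, i=0)
z = parallel_update(z, T, I=[1, 2])
```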

Sync-parallel versus async-parallel. [Figure: timelines of three agents; synchronous (faster agents must wait) versus asynchronous (all agents are non-stop).]

ARock: async-parallel CU. p agents; every agent continuously does: pick i_k ∈ [m] and set z_i^{k+1} = z_i^k + η ((T ẑ^k)_i − ẑ_i^k) if i = i_k, and z_i^{k+1} = z_i^k otherwise. New notation: ẑ^k := z^{k−d_k} = (z_1^{k−d_{k,1}}, ..., z_m^{k−d_{k,m}}) may be stale; k increases after any agent completes an update; inconsistent atomic reads/writes are allowed.
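
A toy sketch of the ARock update rule, with Python threads standing in for agents; the operator, step size, and thread count are illustrative, and a real implementation would use C++/OpenMP as noted later in the talk.

```python
import threading
import numpy as np

# Each thread repeatedly picks a random coordinate i and applies
# z_i <- z_i + eta * ((T z_hat)_i - z_hat_i), where z_hat is an
# unsynchronized (possibly stale, inconsistent) read of the shared z.
n, eta, steps_per_agent, num_agents = 100, 0.5, 2000, 4
z = np.zeros(n)                                    # shared iterate
T = lambda v: 0.5 * (v + np.clip(v, 0.0, 1.0))     # illustrative nonexpansive operator

def agent(seed):
    rng = np.random.default_rng(seed)
    for _ in range(steps_per_agent):
        i = rng.integers(n)
        z_hat = z.copy()                           # lock-free read: may be stale
        z[i] += eta * (T(z_hat)[i] - z_hat[i])     # single-entry write

threads = [threading.Thread(target=agent, args=(s,)) for s in range(num_agents)]
for t in threads: t.start()
for t in threads: t.join()
```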

Various theories and meanings. 1980s: T is contractive in a weighted sup-norm ‖·‖_{w,∞}; partially/totally asynchronous. Recent in the ML community: async SGD and async BCD. Early works: random i_k, bounded delays, E f has sufficient descent, treat delays as noise, delays independent of i_k. State of the art: allow essentially cyclic i_k, unbounded delays (t^{-4} or faster tail decay), Lyapunov analysis, delays as overdue progress, delays can depend on i_k. ARock: T is nonexpansive in ‖·‖_2, unbounded delays (t^{-4} or faster tail decay), Lyapunov analysis, delays as overdue progress, delays independent of i_k, provable running-time ratio async : sync = 1 : log(p) in a Poisson system, the prox is async. Combettes-Eckstein: async projective splitting, parameter free. In distributed computing, "asynchronous" may also refer to random activations, possibly without delay.

ARock convergence [Peng-Xu-Yan-Yin, SISC 16]. Notation: m = number of blocks, τ = maximum async delay; uniform random block selection (non-uniform is okay). Theorem (known max delay): assume T is nonexpansive and has a fixed point, and the delays do not depend on i_k; use step sizes η_k ∈ [ε, 1/(2 m^{-1/2} τ + 1)). Then x^k → x* ∈ Fix T almost surely. Consequence: no penalty relative to sync at least until using O(√m) agents; sharp when τ ∼ √m.
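
As a concrete reading of the step-size bound (the numbers here are illustrative, not from the talk): with m = 10^4 blocks and a maximum delay of τ = 50, the theorem allows η_k < 1/(2 · 50/√(10^4) + 1) = 1/(2 · 0.5 + 1) = 1/2, so a delay on the order of √m/2 still permits step sizes only a factor of two smaller than in the synchronous case.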

Optimization and fixed-point examples. [Slide table omitted.]

Applications. [Slide figure omitted.]

More complicated applications: LP, QP, SOCP, some SDPs; image reconstruction; nonnegative matrix factorization; decentralized optimization (no global coordination anymore!).

QCQP test: ARock versus SCS [O'Donoghue, Chu, Parikh, Boyd]. [Results figure omitted.]

Practice. Coding: OpenMP, C++11, MPI; easier than you think. Performance: if done correctly, async speed ≥ sync speed, and much faster when systems get bigger and/or unbalanced.

An ideal solver: finds a solution if there is one; when there is no solution, reliably reports "no solution"; provides a certificate; suggests the cheapest fix. Status: achievable for LP, not yet for SOCPs.

Conic programming: p* = min { c^T x : Ax = b, x ∈ K }, where K is a closed convex cone and L := {x : Ax = b}. Every problem falls in exactly one of seven cases: 1) p* finite: 1a) has a primal-dual solution pair, 1b) has only a primal solution, 1c) has no primal solution; 2) p* = −∞: 2a) has an improving direction, 2b) has no improving direction; 3) p* = +∞ (infeasible): 3a) dist(L, K) > 0, there is a separating hyperplane; 3b) dist(L, K) = 0, there is no strictly separating hyperplane. The (nearly) pathological cases fail existing solvers.

Example 1: a 3-variable problem: minimize x_1 subject to x_2 = 1, 2 x_2 x_3 ≥ x_1². Since x_2, x_3 ≥ 0, the problem is equivalent to: minimize x_1 subject to x_2 = 1, (x_1, x_2, x_3) in the rotated second-order cone. Classification: (2b) feasible and unbounded (p* = −∞, by letting x_3 → ∞ and x_1 → −∞), yet without an improving direction: any improving direction u has the form (u_1, 0, u_3), but the cone constraint gives 2 u_2 u_3 = 0 ≥ u_1², so u_1 = 0, which implies c^T u = 0 (not improving). Solver results: SDPT3: Failed, p* not reported; SeDuMi: Inaccurate/Solved; Mosek: Inaccurate/Unbounded.
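
A quick numeric sanity check of the unboundedness claim (an illustrative script, not from the talk): the points x = (−t, 1, t²/2) are feasible for every t > 0 and drive the objective to −∞.

```python
# x = (-t, 1, t^2/2) satisfies x2 = 1 and 2*x2*x3 = t^2 >= x1^2,
# while the objective x1 = -t tends to -infinity.
for t in [1e1, 1e3, 1e6]:
    x1, x2, x3 = -t, 1.0, t ** 2 / 2
    assert x2 == 1.0 and 2 * x2 * x3 >= x1 ** 2
    print(x1)
```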

Example 2: a 3-variable problem: minimize 0 subject to x_1 = 1, x_2 = x_3 (i.e., x ∈ L) and x_3 ≥ ‖(x_1, x_2)‖_2 (i.e., x ∈ K). Classification: (3b) infeasible: x ∈ L forces x = [1, α, α]^T for some α ∈ R, which always violates the second-order cone constraint, so L ∩ K = ∅; yet dist(L, K) = 0, since dist(L, K) ≤ ‖[1, α, α] − [1, α, (α² + 1)^{1/2}]‖_2 → 0 as α → ∞; hence there is no strictly separating hyperplane. Solver results: SDPT3: Infeasible; SeDuMi: Solved, p* = 0; Mosek: Failed, p* not reported.
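
A quick numeric illustration of the bound dist(L, K) ≤ ‖[1, α, α] − [1, α, (α² + 1)^{1/2}]‖_2 → 0 (an illustrative script, not from the talk):

```python
import numpy as np

for alpha in [1e0, 1e2, 1e4]:
    xL = np.array([1.0, alpha, alpha])                  # point in L
    xK = np.array([1.0, alpha, np.hypot(alpha, 1.0)])   # nearby point in K
    print(np.linalg.norm(xL - xK))                      # shrinks toward 0
```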

Then, what happens to DRS? In the 1970s, Pazy and Rockafellar: assume T is firmly nonexpansive and run z^{k+1} = T(z^k); the iteration converges if a primal-dual solution pair exists; otherwise, ‖z^k‖ → ∞. In 1979, Baillon-Bruck-Reich pinned down the divergence: z^k − z^{k+1} → v := Proj_{cl ran(I − T)}(0).

Our analysis results (Liu-Ryu-Yin 17): rate of convergence ‖z^k − z^{k+1}‖ ≤ ‖v‖ + ε + O(1/(k+1)); we deciphered Proj_{cl ran(I − T)}(0); a workflow running three similar DRS instances that differ only in constants: 1) the original DRS, 2) a feasibility DRS with c = 0, 3) a boundedness DRS with b = 0. With it, most pathological cases are identified; an improving direction is computed if one exists; a separating hyperplane is computed if one exists; and, for all infeasible problems, the data is minimally changed to restore strong feasibility.
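
A schematic of the three-run workflow as sketched on this slide (illustrative Python pseudocode; run_drs and the simple threshold logic are my own assumptions, not the paper's actual decision flow):

```python
import numpy as np

def classify(A, b, c, run_drs, tol=1e-6):
    """Run three DRS instances that differ only in the constants and inspect
    v = lim (z^k - z^{k+1}); run_drs is assumed to return that limit,
    which is 0 exactly when a primal-dual solution pair exists."""
    v_main = run_drs(A, b, c)       # 1) DRS on the original problem
    v_feas = run_drs(A, b, 0 * c)   # 2) feasibility DRS with c = 0
    v_bnd  = run_drs(A, 0 * b, c)   # 3) boundedness DRS with b = 0

    if np.linalg.norm(v_main) <= tol:
        return "primal-dual solution pair found"
    if np.linalg.norm(v_feas) > tol:
        return "infeasible (sketch: a separating hyperplane can be read off)"
    if np.linalg.norm(v_bnd) > tol:
        return "unbounded (sketch: an improving direction can be read off)"
    return "pathological or nearly pathological case"
```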

Decision flow. [Flowchart omitted.]

Infeasible SDP test set (Liu-Pataki 17). [Table omitted: percentage of successful infeasibility detection on clean and messy examples with m = 10 and m = 20, comparing SeDuMi, SDPT3, Mosek, and SeDuMi with the preprocessing (PP) of Permenter-Parrilo 14.]

Identify weakly infeasible SDPs. [Table omitted: success percentages of the proposed method (stopping test on ‖z^k‖) on clean and messy examples with m = 10 and m = 20.] Our percentages are far better.

Identify strongly infeasible SDPs. [Table omitted: success percentages of the proposed method (stopping test on ‖z^k‖) on clean and messy examples with m = 10 and m = 20.] Our percentages are far better.

Thank you! References: UCLA CAM reports 15-37 (ARock), 16-13 (coordinate-friendly operators and SOCP applications), 17-30 (unbounded and realistic-delay async BCD), 17-31 (DRS for unsolvable conic programs); arXiv preprint: provably async-to-sync speedup.
