Operator Splitting for Parallel and Distributed Optimization
1 Operator Splitting for Parallel and Distributed Optimization
Wotao Yin (UCLA Math). ShanghaiTech, SSDS 15, June 23, 2015. URL: alturl.com/2z7tv
2 What is splitting?
- Sun Tzu (400 BC); Caesar: divide and conquer
- splitting in computing: break a problem into separate parts, solve the separate parts, and combine the sub-solutions in a controlled fashion
3 Monotone operator-splitting pipeline
1. recognize the simple parts in your problem
2. build an equivalent monotone-inclusion problem: $0 \in Ax$ (the simple parts are separately placed in $A$)
3. apply an operator-splitting scheme: $0 \in Ax \iff z = Tz$ (the simple parts become sub-operators of $T$)
4. run the iteration $z^{k+1} = z^k + \lambda (T z^k - z^k)$, $\lambda \in (0, 1]$ (known as the Krasnosel'skiĭ-Mann (KM) iteration)
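A minimal Python sketch of the KM iteration, assuming the nonexpansive operator is supplied as a callable; the names `km_iteration`, `T`, `z0`, `lam`, `n_iters`, `tol` are illustrative, not from the slides:

```python
import numpy as np

def km_iteration(T, z0, lam=0.5, n_iters=1000, tol=1e-8):
    """Krasnosel'skii-Mann iteration z^{k+1} = z^k + lam*(T(z^k) - z^k).

    T   : a nonexpansive operator, given as a function R^n -> R^n
    z0  : starting point (NumPy array)
    lam : relaxation parameter in (0, 1]
    """
    z = z0.copy()
    for _ in range(n_iters):
        z_next = z + lam * (T(z) - z)
        if np.linalg.norm(z_next - z) <= tol:  # fixed-point residual is small
            return z_next
        z = z_next
    return z
```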
4 Example: LASSO (basis pursuit denoising)
Tibshirani '96: two simple parts, smooth + simple:
$\text{minimize}\ \tfrac{1}{2}\|Ax - b\|^2 + \lambda \|x\|_1$
- smooth function: $f(x) = \tfrac{1}{2}\|Ax - b\|^2$
- simple nonsmooth function: $r(x) = \lambda \|x\|_1$
- equivalent condition: $0 \in \nabla f(x) + \partial r(x)$
- forward-backward splitting algorithm: $x^{k+1} = \operatorname{prox}_{\gamma r}(x^k - \gamma \nabla f(x^k)) = Tx^k$
also known as the Iterative Soft-Thresholding Algorithm (ISTA)
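A minimal ISTA sketch for this LASSO instance, assuming NumPy arrays `A`, `b` and a weight `lam` are given; the step size $1/\|A\|^2$ is one safe choice, since $\nabla f$ is $\|A\|^2$-Lipschitz:

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal map of t*||.||_1: componentwise soft-thresholding."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def ista(A, b, lam, n_iters=500):
    """ISTA for: minimize 0.5*||Ax - b||^2 + lam*||x||_1."""
    gamma = 1.0 / np.linalg.norm(A, 2) ** 2      # step 1/L with L = ||A||^2
    x = np.zeros(A.shape[1])
    for _ in range(n_iters):
        grad = A.T @ (A @ x - b)                 # forward (gradient) step
        x = soft_threshold(x - gamma * grad, gamma * lam)  # backward (prox) step
    return x
```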
5 Example: total variation (TV) deblurring
Rudin-Osher-Fatemi '92:
$\text{minimize}_u\ \tfrac{1}{2}\|Ku - b\|^2 + \lambda \|Du\|_1 \quad \text{subject to } 0 \le u \le 255$
simple parts: smooth + simple $\circ$ linear + simple
- smooth function: $f(u) = \tfrac{1}{2}\|Ku - b\|^2$
- linear operator: $D$
- simple nonsmooth function: $r_1 = \lambda \|\cdot\|_1$
- simple indicator function: $r_2 = \iota_{[0,255]}$
6 (We only show the results, skipping the details)
equivalent condition:
$0 \in \begin{bmatrix} \partial r_2 & D^T \\ -D & \partial r_1^* \end{bmatrix} \begin{bmatrix} u \\ w \end{bmatrix} + \begin{bmatrix} \nabla f(u) \\ 0 \end{bmatrix}$
(where $w$ is the auxiliary (dual) variable)
forward-backward splitting algorithm under a special metric:
$u^{k+1} = \operatorname{proj}_{[0,255]^n}(u^k - \gamma D^T w^k - \gamma \nabla f(u^k))$
$w^{k+1} = \operatorname{prox}_{\gamma r_1^*}(w^k + \gamma D(2u^{k+1} - u^k))$
every step is simple to implement
7 Simple parts
8 Simple parts
- linear maps (e.g., matrices, finite differences, orthogonal transforms)
- differentiable functions
- (non-differentiable) functions that have simple proximal maps
- sets that are easy to project onto
Abstraction: we look for monotone maps that have certain simple operators.
9 Monotone map
$A : H \to H$ is monotone if $\langle Ax - Ay, x - y \rangle \ge 0$ for all $x, y \in H$.
extension to set-valued $A : H \to 2^H$: $\langle p - q, x - y \rangle \ge 0$ for all $x, y \in H$, $p \in Ax$, $q \in Ay$.
examples:
- a positive semi-definite linear map
- a skew-symmetric linear map: $\langle Ax, x \rangle = 0$ for all $x \in H$
- $\partial f$: the subdifferential of a proper closed convex function $f$
10 Steps to build an operator-splitting algorithm
Problem → Monotone Maps → Standard Operators → Algorithm
11 Forward operator
require: $A$ is monotone and either Lipschitz or cocoercive¹
definition: $\operatorname{fwd}_{\gamma A} := I - \gamma A$
examples:
- forward Euler for $\dot{y} + g(t, y) = 0$: $y_{t+1} = y_t - h\, g(t, y_t) = (I - h\, g(t, \cdot))\, y_t$
- gradient descent for $\min f(x)$: $x^{k+1} = x^k - \gamma \nabla f(x^k) = (I - \gamma \nabla f)\, x^k$
equivalent conditions: $0 = Ax \iff x = (I - \gamma A)x$
¹ $A$ is $\beta$-cocoercive if $\langle Ax - Ay, x - y \rangle \ge \beta \|Ax - Ay\|^2$ for all $x, y \in H$. If $A$ is $\beta$-cocoercive, then $A$ is $(1/\beta)$-Lipschitz; the reverse is generally untrue.
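As a sketch of the abstraction, the forward operator can be built as a higher-order function; `forward_op` and the quadratic example below are illustrative assumptions, not from the slides:

```python
import numpy as np

def forward_op(A, gamma):
    """Return the forward operator fwd_{gamma*A} = I - gamma*A as a callable."""
    return lambda x: x - gamma * A(x)

# Example: gradient descent on f(x) = 0.5*||x - c||^2, so A = grad f = x - c.
c = np.array([1.0, -2.0, 3.0])
step = forward_op(lambda x: x - c, gamma=0.5)
x = np.zeros(3)
for _ in range(50):
    x = step(x)   # x^{k+1} = (I - gamma*grad f) x^k; converges to c
```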
12 Backward operator
require: $A$ is monotone
definition: $J_{\gamma A} := (I + \gamma A)^{-1}$
equivalent conditions: $0 \in Ax \iff x \in x + \gamma Ax \iff x = J_{\gamma A} x$
even if $A$ is set-valued, $J_{\gamma A}$ is single-valued
examples: regularized matrix inversion, backward Euler, the proximal map (including projection as a special case)
13 Proximal map
require: a proper closed convex function $f$
definition: $\operatorname{prox}_{\gamma f}(y) = \arg\min_x f(x) + \tfrac{1}{2\gamma}\|x - y\|^2$
the minimizer $x^*$ must satisfy: $0 \in \gamma \partial f(x^*) + (x^* - y) \iff x^* = (I + \gamma \partial f)^{-1}(y) = J_{\gamma \partial f}(y)$
therefore, $\operatorname{prox}_{\gamma f} \equiv J_{\gamma \partial f} \equiv (I + \gamma \partial f)^{-1}$
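Two proximal maps with closed forms, used throughout these slides, sketched in Python (soft-thresholding for the $\ell_1$ norm; clipping, i.e., projection, for the box indicator):

```python
import numpy as np

def prox_l1(y, gamma):
    """prox of gamma*||.||_1: soft-thresholding, the backward step behind ISTA."""
    return np.sign(y) * np.maximum(np.abs(y) - gamma, 0.0)

def proj_box(y, lo=0.0, hi=255.0):
    """prox of the indicator of [lo, hi]^n is the projection: componentwise clipping."""
    return np.clip(y, lo, hi)
```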
14 Reflective backward operator
require: $A$ is monotone
definition: $R_{\gamma A} := 2 J_{\gamma A} - I$; it reflects $x$ through $J_{\gamma A} x$ by adding $J_{\gamma A} x - x$
examples:
- mirror or reflective projection: $\operatorname{refl}_C = 2\operatorname{proj}_C - I$
- reflective proximal map: for a closed proper convex function $f$, $R_{\gamma \partial f} = 2 J_{\gamma \partial f} - I = 2\operatorname{prox}_{\gamma f} - I$
15 Operator splitting
16 Steps to build an operator-splitting algorithm
Problem → Monotone Maps → Standard Operators → Algorithm
17 The big three two-operator splitting schemes for $0 \in Ax + Bx$:
- forward-backward (Mercier '79) for (maximally monotone) + (cocoercive)
- Douglas-Rachford (Lions-Mercier '79) for (maximally monotone) + (maximally monotone)
- forward-backward-forward (Tseng '00) for (maximally monotone) + (Lipschitz & monotone)
all of these schemes are built from forward operators and backward operators
18 Forward-backward splitting
require: $A$ maximally monotone, $B$ cocoercive (thus single-valued)
forward-backward splitting (FBS) operator (Lions-Mercier '79): $T_{\mathrm{FBS}} := J_{\gamma A} \circ (I - \gamma B)$
reduces to the forward operator if $A = 0$, and to the backward operator if $B = 0$
equivalent conditions: $0 \in Ax + Bx \iff x = T_{\mathrm{FBS}}(x)$
backward-forward splitting (BFS) operator: $T_{\mathrm{BFS}} := (I - \gamma B) \circ J_{\gamma A}$; then $0 \in Ax + Bx \iff z = T_{\mathrm{BFS}}(z)$ with $x = J_{\gamma A} z$
19 Regularized least squares
$\text{minimize}_x\ r(x) + \underbrace{\tfrac{1}{2}\|Kx - b\|^2}_{f(x)}$
- $K$: linear operator; $b$: input data (observation)
- $r$: enforces a structure on $x$; examples: $\ell_2^2$, $\ell_1$, sorted $\ell_1$, $\ell_2$, TV, nuclear norm, ...
equivalent condition: $0 \in \partial r(x) + \nabla f(x)$
forward-backward splitting iteration: $x^{k+1} = \operatorname{prox}_{\gamma r}(I - \gamma \nabla f)\, x^k$
20 Constrained minimization
$C$ is a convex set; $f$ is a proper closed convex function.
$\text{minimize}_x\ f(x) \quad \text{subject to } x \in C$
equivalent condition: $0 \in N_C(x) + \partial f(x)$
if $f$ is Lipschitz differentiable, then forward-backward splitting $x^{k+1} = \operatorname{proj}_C (I - \gamma \nabla f)\, x^k$ recovers the projected gradient method
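A minimal projected-gradient sketch, assuming a gradient oracle `grad_f` and the box $C = [lo, hi]^n$; all names are illustrative:

```python
import numpy as np

def projected_gradient(grad_f, x0, gamma, n_iters=200, lo=0.0, hi=1.0):
    """x^{k+1} = proj_C(x^k - gamma*grad f(x^k)) with C = [lo, hi]^n."""
    x = x0.copy()
    for _ in range(n_iters):
        x = np.clip(x - gamma * grad_f(x), lo, hi)  # forward step, then projection
    return x
```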
21 Douglas-Rachford splitting
require: $A$, $B$ both monotone
Douglas-Rachford splitting (DRS) operator (Lions-Mercier '79): $T_{\mathrm{DRS}} := \tfrac{1}{2} I + \tfrac{1}{2} R_{\gamma A} \circ R_{\gamma B}$
note: switching $A$ and $B$ gives a different DRS operator
(relaxed) Peaceman-Rachford splitting (PRS) operator, $\lambda \in (0, 1]$: $T^{\lambda}_{\mathrm{PRS}} := (1 - \lambda) I + \lambda R_{\gamma A} \circ R_{\gamma B}$; also, $T_{\mathrm{PRS}} = T^{1}_{\mathrm{PRS}}$
equivalent conditions: $0 \in Ax + Bx \iff z = T^{\lambda}_{\mathrm{PRS}}(z)$ with $x = J_{\gamma B} z$
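A minimal DRS sketch, assuming the two resolvents `JA` $= J_{\gamma A}$ and `JB` $= J_{\gamma B}$ are available as callables; the fixed point $z$ encodes the solution via $x = J_{\gamma B} z$:

```python
def drs(JA, JB, z0, n_iters=500):
    """Douglas-Rachford: z^{k+1} = (1/2) z^k + (1/2) R_A(R_B(z^k))."""
    z = z0.copy()
    for _ in range(n_iters):
        x = JB(z)                 # x = J_{gamma B} z
        y = JA(2 * x - z)         # reflect through B, then backward step on A
        z = z + y - x             # algebraically equals (1/2)z + (1/2)R_A R_B z
    return JB(z)                  # solution of 0 in Ax + Bx
```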
22 Constrained minimization, cont.
if $f$ is non-differentiable, then apply Douglas-Rachford splitting (DRS):
$z^{k+1} = \big(\tfrac{1}{2} I + \tfrac{1}{2}(2\operatorname{prox}_{\gamma f} - I)(2\operatorname{proj}_C - I)\big)\, z^k$ (where $x^k = \operatorname{proj}_C z^k$)
dual approach: introduce $x - y = 0$ and apply ADMM to
$\text{minimize}_{x,y}\ f(x) + \iota_C(y) \quad \text{subject to } x - y = 0$
(indicator function: $\iota_C(y) = 0$ if $y \in C$, and $\infty$ otherwise)
equivalence: the ADMM iteration = the DRS iteration
23 Multi-function minimization
$f_1, \dots, f_N : H \to (-\infty, \infty]$ are proper closed convex functions.
product-space trick for $\text{minimize}_x\ f_1(x) + \cdots + f_N(x)$:
- introduce copies $x^{(i)} \in H$ of $x$; let $\mathbf{x} = (x^{(1)}, \dots, x^{(N)}) \in H^N$
- let $C = \{\mathbf{x} : x^{(1)} = \cdots = x^{(N)}\}$
- equivalent problem in $H^N$: $\text{minimize}_{\mathbf{x}}\ \iota_C(\mathbf{x}) + \sum_{i=1}^N f_i(x^{(i)})$
- then apply a two-operator splitting scheme
24 Operator splitting: Dual application
25 Duality
convex (and some nonconvex) optimization problems have two perspectives: the primal problem and the dual problem
duality brings us:
- alternative or relaxed problems, lower bounds
- certificates for optimality or infeasibility
- economic interpretations
duality + operator splitting:
- decouples objective functions and constraint components
- gives rise to parallel and distributed algorithms
26 Lagrange duality
original problem: $\text{minimize}_x\ f(x) \quad \text{subject to } Ax = b$
relaxations:
- Lagrangian: $L(x; w) := f(x) + w^T (Ax - b)$
- augmented Lagrangian: $L(x; w, \gamma) := f(x) + w^T (Ax - b) + \tfrac{\gamma}{2} \|Ax - b\|^2$
dual function, which is convex (even if $f$ is not): $d(w) = -\min_x L(x; w)$
dual problem: $\text{minimize}_w\ d(w)$
27 Monotropic program
definition:
$\text{minimize}_{x_1, \dots, x_m}\ f_1(x_1) + \cdots + f_m(x_m) \quad \text{subject to } A_1 x_1 + \cdots + A_m x_m = b$
where $f_i(x_i)$ may include $\iota_{C_i}(x_i)$ for a constraint $x_i \in C_i$
$x_1, \dots, x_m$ are separable in the objective and coupled in the constraints
dual problem has the form $\text{minimize}_w\ d_1(w) + \cdots + d_m(w)$, where $d_i(w) := -\min_{x_i} \big( f_i(x_i) + w^T (A_i x_i - \tfrac{1}{m} b) \big)$
28 Examples of monotropic programs
- linear programs
- $\min\{f(x) : Ax \in C\} \iff \min_{x,y}\{f(x) + \iota_C(y) : Ax - y = 0\}$
- consensus problem $\min\{f_1(x_1) + \cdots + f_n(x_n) : A\mathbf{x} = 0\}$, where $A\mathbf{x} = 0 \iff x_1 = \cdots = x_n$; the structure of $A$ enables distributed computing
- exchange problem
29 Dual forward-backward splitting
original problem: $\text{minimize}_{x_1, x_2}\ f_1(x_1) + f_2(x_2) \quad \text{subject to } A_1 x_1 + A_2 x_2 = b$
require: strongly convex $f_1$ (thus Lipschitz $\nabla d_1$)
FBS iteration on the dual: $z^{k+1} = \operatorname{prox}_{\gamma d_2}(I - \gamma \nabla d_1)\, z^k$
expressed in terms of the (augmented) Lagrangian, with multiplier $w$:
$x_1^{k+1} = \arg\min_{x_1} f_1(x_1) + w^{kT} A_1 x_1$
$x_2^{k+1} \in \arg\min_{x_2} f_2(x_2) + w^{kT} A_2 x_2 + \tfrac{\gamma}{2}\|A_1 x_1^{k+1} + A_2 x_2 - b\|^2$
$w^{k+1} = w^k - \gamma(b - A_1 x_1^{k+1} - A_2 x_2^{k+1})$
we have recovered Tseng's Alternating Minimization Algorithm
30 Dual Douglas-Rachford splitting
original problem (no strong-convexity requirement): $\text{minimize}_{x_1, x_2}\ f_1(x_1) + f_2(x_2) \quad \text{subject to } A_1 x_1 + A_2 x_2 = b$
DRS iteration on the dual: $z^{k+1} = \big(\tfrac{1}{2} I + \tfrac{1}{2}(2\operatorname{prox}_{\gamma d_2} - I)(2\operatorname{prox}_{\gamma d_1} - I)\big)\, z^k$
expressed in terms of the augmented Lagrangian:
$x_1^{k+1} \in \arg\min_{x_1} f_1(x_1) + w^{kT} A_1 x_1 + \tfrac{\gamma}{2}\|A_1 x_1 + A_2 x_2^k - b\|^2$
$x_2^{k+1} \in \arg\min_{x_2} f_2(x_2) + w^{kT} A_2 x_2 + \tfrac{\gamma}{2}\|A_1 x_1^{k+1} + A_2 x_2 - b\|^2$
$w^{k+1} = w^k - \gamma(b - A_1 x_1^{k+1} - A_2 x_2^{k+1})$
we recover the Alternating Direction Method of Multipliers (ADMM)
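A minimal ADMM sketch for one concrete instance, the LASSO split minimize $\tfrac{1}{2}\|Ax-b\|^2 + \lambda\|y\|_1$ subject to $x - y = 0$; the direct matrix inverse is for clarity only, and all names are illustrative:

```python
import numpy as np

def admm_lasso(A, b, lam, gamma=1.0, n_iters=300):
    """ADMM on: minimize 0.5||Ax-b||^2 + lam*||y||_1  subject to  x - y = 0."""
    n = A.shape[1]
    AtA, Atb = A.T @ A, A.T @ b
    M = np.linalg.inv(AtA + gamma * np.eye(n))   # factor once; reused every iteration
    x = y = w = np.zeros(n)
    for _ in range(n_iters):
        x = M @ (Atb + gamma * y - w)            # x-update: quadratic subproblem
        v = x + w / gamma
        y = np.sign(v) * np.maximum(np.abs(v) - lam / gamma, 0.0)  # y-update: prox of l1
        w = w + gamma * (x - y)                  # dual (multiplier) update
    return y
```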
31 Operator splitting: Primal-dual application
32 Nonsmooth linear composition
problem: minimize (nonsmooth $\circ$ linear) + nonsmooth + smooth:
$\text{minimize}_x\ r_1(Lx) + r_2(x) + f(x)$
equivalent condition: $0 \in (L^T \circ \partial r_1 \circ L + \partial r_2 + \nabla f)\, x$
decouple $r_1$ from $L$: introduce the dual variable $y \in \partial r_1(Lx) \iff Lx \in \partial r_1^*(y)$
equivalent condition:
$0 \in \begin{bmatrix} 0 & L^T \\ -L & 0 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} \partial r_2(x) \\ \partial r_1^*(y) \end{bmatrix} + \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix}$
33 equivalent condition (copied from the last slide):
$0 \in \underbrace{\begin{bmatrix} 0 & L^T \\ -L & 0 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} \partial r_2(x) \\ \partial r_1^*(y) \end{bmatrix}}_{Az} + \underbrace{\begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix}}_{Bz}$
primal-dual variable: $z = \begin{bmatrix} x \\ y \end{bmatrix}$
apply forward-backward splitting to $0 \in Az + Bz$: $z^{k+1} = J_{\gamma A} \circ F_{\gamma B}\, z^k$, i.e., $z^{k+1} = (I + \gamma A)^{-1} (I - \gamma B)\, z^k$
solve:
$x^{k+1} + \gamma L^T y^{k+1} + \gamma \partial r_2(x^{k+1}) \ni x^k - \gamma \nabla f(x^k)$
$y^{k+1} - \gamma L x^{k+1} + \gamma \partial r_1^*(y^{k+1}) \ni y^k$
issue: both $x^{k+1}$ and $y^{k+1}$ appear in both equations!
34 solution: introduce the metric
$U = \begin{bmatrix} I & -\gamma L^T \\ -\gamma L & I \end{bmatrix}$
apply forward-backward splitting to $0 \in U^{-1} A z + U^{-1} B z$: $z^{k+1} = J_{\gamma U^{-1} A}(I - \gamma U^{-1} B)\, z^k$, i.e., $z^{k+1} = (I + \gamma U^{-1} A)^{-1}(I - \gamma U^{-1} B)\, z^k$
solve $U z^{k+1} + \gamma A z^{k+1} \ni U z^k - \gamma B z^k$:
$x^{k+1} - \gamma L^T y^{k+1} + \gamma L^T y^{k+1} + \gamma \partial r_2(x^{k+1}) \ni x^k - \gamma L^T y^k - \gamma \nabla f(x^k)$
$y^{k+1} - \gamma L x^{k+1} - \gamma L x^{k+1} + \gamma \partial r_1^*(y^{k+1}) \ni y^k - \gamma L x^k$
(like Gaussian elimination, $y^{k+1}$ is cancelled from the first equation)
35 strategy: obtain $x^{k+1}$ from the first equation; plug $x^{k+1}$ in as a constant in the second equation and then obtain $y^{k+1}$
final iteration:
$x^{k+1} = \operatorname{prox}_{\gamma r_2}(x^k - \gamma L^T y^k - \gamma \nabla f(x^k))$
$y^{k+1} = \operatorname{prox}_{\gamma r_1^*}(y^k + \gamma L(2x^{k+1} - x^k))$
nice properties:
- applies $L$ and $\nabla f$ explicitly
- solves proximal subproblems of $r_1^*$ and $r_2$ (the prox of $r_1^*$ is available from the prox of $r_1$ by Moreau's identity)
- convergence follows from standard forward-backward splitting
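A minimal sketch of this final iteration for the case $r_1 = \lambda\|\cdot\|_1$ composed with a linear map $L$, $r_2 = \iota_{[lo,hi]}$, and smooth $f$; by conjugacy, the prox of $\gamma r_1^*$ is the projection onto the $\ell_\infty$-ball of radius $\lambda$, which makes both updates one line each (all names are illustrative):

```python
import numpy as np

def primal_dual(L, grad_f, lam, x0, gamma=0.1, n_iters=500, lo=0.0, hi=255.0):
    """x^{k+1} = prox_{g*r2}(x^k - g*L^T y^k - g*grad f(x^k));
       y^{k+1} = prox_{g*r1*}(y^k + g*L(2x^{k+1} - x^k))."""
    x = x0.copy()
    y = np.zeros(L.shape[0])
    for _ in range(n_iters):
        x_new = np.clip(x - gamma * (L.T @ y) - gamma * grad_f(x), lo, hi)  # prox of box
        y = np.clip(y + gamma * (L @ (2 * x_new - x)), -lam, lam)  # proj = prox of r1*
        x = x_new
    return x
```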
36 Example: total variation (TV) deblurring
Rudin-Osher-Fatemi '92:
$\text{minimize}_u\ \tfrac{1}{2}\|Ku - b\|^2 + \lambda \|Du\|_1 \quad \text{subject to } 0 \le u \le 255$
simple parts: smooth + simple $\circ$ linear + simple
- smooth function: $f(u) = \tfrac{1}{2}\|Ku - b\|^2$
- linear operator: $D$
- simple nonsmooth function: $r_1 = \lambda \|\cdot\|_1$
- simple indicator function: $r_2 = \iota_{[0,255]}$
37 (We only show the results, skipping the details)
equivalent condition:
$0 \in \begin{bmatrix} \partial r_2 & D^T \\ -D & \partial r_1^* \end{bmatrix} \begin{bmatrix} u \\ w \end{bmatrix} + \begin{bmatrix} \nabla f(u) \\ 0 \end{bmatrix}$
(where $w$ is the auxiliary (dual) variable)
forward-backward splitting algorithm under a special metric:
$u^{k+1} = \operatorname{proj}_{[0,255]^n}(u^k - \gamma D^T w^k - \gamma \nabla f(u^k))$
$w^{k+1} = \operatorname{prox}_{\gamma r_1^*}(w^k + \gamma D(2u^{k+1} - u^k))$
every step is simple to implement
38 Operator splitting: Parallel and distributed applications
39 Huge matrix A
background: you wish to distribute a huge matrix $A$ in your problem
three schemes to distribute $A$:
- scheme 1: by rows, $A = [A_{(1)}; \dots; A_{(M)}]$
- scheme 2: by columns, $A = [A_1, \dots, A_N]$
- scheme 3: by blocks, $A = [A_{i,j}]$, $i = 1, \dots, M$, $j = 1, \dots, N$
40 choose a scheme based on the structures of the functions/operators
example: convex and smooth $f$; convex $r$ (possibly nonsmooth):
$\text{minimize}_x\ r(x) + f(Ax)$
choose scheme 1 for separable $f$, that is, $f(y) = \sum_{i=1}^M f_i(y_i)$
apply forward-backward splitting:
$x^{k+1} = \operatorname{prox}_{\gamma r}(x^k - \gamma A^T \nabla f(Ax^k)) = \operatorname{prox}_{\gamma r}\big(x^k - \gamma \sum_{i=1}^M A_{(i)}^T \nabla f_i(A_{(i)} x^k)\big)$
we can distribute/parallelize the computation of $A_{(i)}^T \nabla f_i(A_{(i)} x^k)$ (see the sketch below)
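A minimal sketch of this row-block forward-backward iteration; the per-block terms in the gradient sum are independent, which is exactly what a distributed implementation exploits (here they are computed in a plain loop; names are illustrative):

```python
def fbs_row_blocks(A_blocks, grads, prox_r, x0, gamma, n_iters=200):
    """FBS with the gradient assembled from per-block terms A_(i)^T grad f_i(A_(i) x).

    A_blocks : list of row blocks A_(1), ..., A_(M) (NumPy arrays)
    grads    : list of gradient oracles, grads[i](y) = grad f_i(y)
    prox_r   : prox operator of gamma*r, called as prox_r(v)
    """
    x = x0.copy()
    for _ in range(n_iters):
        # each term below is independent -> can be computed on M workers in parallel
        g = sum(Ai.T @ gi(Ai @ x) for Ai, gi in zip(A_blocks, grads))
        x = prox_r(x - gamma * g)
    return x
```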
41 choose scheme 2 for separable $r$:
$\operatorname{prox}_{\gamma r}(y) = \big(\operatorname{prox}_{\gamma r_1}(y_1), \dots, \operatorname{prox}_{\gamma r_N}(y_N)\big)$
apply forward-backward splitting $x^{k+1} = \operatorname{prox}_{\gamma r}(x^k - \gamma A^T \nabla f(Ax^k))$:
- cache $g = \nabla f(Ax^k)$
- update $x_j^{k+1} = \operatorname{prox}_{\gamma r_j}(x_j^k - \gamma A_j^T g)$ for $j = 1, \dots, N$
- broadcast $g = \nabla f\big(\sum_{j=1}^N A_j x_j^{k+1}\big)$
we distribute the computation of $A_j x_j^{k+1} = A_j \operatorname{prox}_{\gamma r_j}(x_j^k - \gamma A_j^T g)$
this is also known as parallel coordinate descent
42 choose scheme 3 for separable $f$ and $r$ (we skip the details)
Remarks:
- parallel computing is enabled by the separability structure
- try not to introduce extra variables, for the sake of convergence speed
- parallelism can be further accelerated by asynchronous parallelism
43 Duality and structure trade
assume scheme 1: $A = [A_{(1)}; \dots; A_{(M)}]$
primal problem: convex nonsmooth $r_1, \dots, r_M$; strongly convex $f$:
$\text{minimize}_w\ \sum_{i=1}^M r_i(A_{(i)} w) + f(w)$
structure: nonsmooth $\circ$ linear + strongly convex
let $f^*(y) = \sup_x \langle y, x \rangle - f(x)$ denote the convex conjugate of $f$
dual problem: $\text{minimize}_y\ f^*\big(-\sum_{i=1}^M A_{(i)}^T y_i\big) + \sum_{i=1}^M r_i^*(y_i)$
dual structure: nonsmooth + smooth $\circ$ linear (strong convexity of $f$ makes $f^*$ smooth)
44 Example: support vector machine (SVM)
given $N$ sample-label pairs $(x_i, y_i)$, where $x_i \in \mathbb{R}^d$ and $y_i \in \{1, -1\}$
primal problem: $\text{minimize}_{w,b}\ \sum_{i=1}^N \iota_{\mathbb{R}_+}\big(y_i(x_i^T w - b) - 1\big) + \|w\|^2$
dual problem: $\text{maximize}_{\alpha \in \mathbb{R}^N}\ \text{quadratic}(\alpha; X) + \text{nonsmooth}(\alpha)$
apply scheme 1
45 Operator splitting: Decentralized applications
46 Decentralized computing
$n$ agents in a connected network $G = (V, E)$ with bi-directional links $E$; each agent $i$ has a private function $f_i$
problem: find a consensus solution $x$ to
$\text{minimize}_{x \in \mathbb{R}^p}\ f(\mathbf{x}) := \sum_{i=1}^n f_i(x_i) \quad \text{subject to } x_i = x_j,\ \forall i, j$
challenges: no center, only between-neighbor communication
benefits: fault tolerance, no long-distance communication, privacy
47 Decentralized ADMM
decentralized consensus optimization problem:
$\text{minimize}_{x_i, i \in V}\ \sum_{i \in V} f_i(x_i) \quad \text{subject to } x_i = x_j,\ \forall (i, j) \in E$
ADMM reformulation:
$\text{minimize}_{x_i, i \in V;\ y_{ij}, (i,j) \in E}\ \sum_{i \in V} f_i(x_i) \quad \text{subject to } x_i = y_{ij},\ x_j = y_{ij},\ \forall (i, j) \in E$
ADMM (Douglas-Rachford splitting) alternates between two steps:
- each agent $i$: update $x_i$ while the related $y_{ij}$ are fixed
- each pair of agents $(i, j)$: update $y_{ij}$ and the dual variable while $x_i$, $x_j$ are fixed
48 Primal-dual splitting
problem:
$\text{minimize}_{x_i, i \in V}\ \sum_{i \in V} \big( r_i(x_i) + f_i(x_i) \big) \quad \text{subject to } W\mathbf{x} = \mathbf{x}$
where the $r_i$ are convex and the $f_i$ are convex and smooth
the mixing matrix $W \in \mathbb{R}^{n \times n}$: $w_{ij} \ne 0$ only if $i = j$ or agents $i, j$ are neighbors; symmetric ($W = W^T$) and doubly stochastic ($W\mathbf{1} = \mathbf{1}$); thus $I - W \succeq 0$
consensus: $x_i = x_j,\ \forall (i, j) \in E \iff W\mathbf{x} = \mathbf{x}$, where $\mathbf{x}$ stacks all $x_i^T$
equivalent problem:
$\text{minimize}_{\mathbf{x}}\ r(\mathbf{x}) + f(\mathbf{x}) = \sum_{i \in V} \big( r_i(x_i) + f_i(x_i) \big) \quad \text{subject to } W\mathbf{x} = \mathbf{x}$
49 let $V^T V = \tfrac{1}{2}(I - W)$
equivalent problem (KKT conditions):
$0 \in \begin{bmatrix} \partial r & V^T \\ -V & 0 \end{bmatrix} \begin{bmatrix} \mathbf{x} \\ q \end{bmatrix} + \begin{bmatrix} \nabla f(\mathbf{x}) \\ 0 \end{bmatrix}$
applying forward-backward splitting with a special metric, skipping the details and eliminating $q$, we obtain
$\mathbf{x}^{k+1+1/2} = W \mathbf{x}^{k+1} + \mathbf{x}^{k+1/2} - \tfrac{1}{2}(W + I)\mathbf{x}^k - \alpha \big[ \nabla f(\mathbf{x}^{k+1}) - \nabla f(\mathbf{x}^k) \big]$
$\mathbf{x}^{k+2} = \arg\min_{\mathbf{x}}\ r(\mathbf{x}) + \tfrac{1}{2\alpha} \|\mathbf{x} - \mathbf{x}^{k+1+1/2}\|_F^2$
this recovers the PG-EXTRA decentralized algorithm
50 Operator-splitting analysis
51 How to analyze splitting algorithms?
problem: find $x$ such that $0 \in (A + B)x$ (or $0 \in (A + B + C)x$)
iteration: $z^{k+1} = T(z^k)$
require:
- a fixed point $z^*$ of $T$ encodes a solution $x^*$
- $\|z^{k+1} - z^*\|^2 \le \|z^k - z^*\|^2 - C\|z^{k+1} - z^k\|^2$
sufficiency: $T$ is $\alpha$-averaged, $\alpha \in (0, 1)$
52 Averaged operator
weaker than contractive operators; stronger than nonexpansive operators
$T$ is $\alpha$-averaged, $\alpha \in (0, 1)$, if for any $z, \bar{z} \in H$:
$\|Tz - T\bar{z}\|^2 \le \|z - \bar{z}\|^2 - \tfrac{1-\alpha}{\alpha}\|(I - T)z - (I - T)\bar{z}\|^2$
assume $z^{k+1} = Tz^k$ and $\bar{z} = T\bar{z}$; then $\|z^{k+1} - \bar{z}\|^2 \le \|z^k - \bar{z}\|^2 - \tfrac{1-\alpha}{\alpha}\|z^{k+1} - z^k\|^2$
consequences:
- $\|z^{k+1} - z^k\| \to 0$ and boundedness of $\{z^k\}$
- a subsequence $z^{k_j} \rightharpoonup z^*$ weakly (by demiclosedness and monotonicity)
- $z^k \rightharpoonup z^*$ weakly and $z^* = Tz^*$
- $\|z^{k+1} - z^k\|^2 = o(1/k)$ (Davis-Yin '15)
53 Examples
- $A$ is monotone $\Rightarrow$ $J_{\gamma A} := (I + \gamma A)^{-1}$ is $(1/2)$-averaged²; $R_{\gamma A} = 2J_{\gamma A} - I$ is nonexpansive
- $T$ is nonexpansive $\Rightarrow$ $T_\alpha := (1 - \alpha)I + \alpha T$ is $\alpha$-averaged, $\alpha \in (0, 1)$
- $A$ is $\beta$-cocoercive $\Rightarrow$ $F_{\gamma A} := I - \gamma A$ is $(\tfrac{\gamma}{2\beta})$-averaged
- Baillon-Haddad: if $f$ is convex and $\nabla f$ is $\tfrac{1}{\beta}$-Lipschitz, then $I - \gamma \nabla f$ is $(\tfrac{\gamma}{2\beta})$-averaged
² also known as firmly nonexpansive
54 Composition properties
- $T_1$, $T_2$ are nonexpansive $\Rightarrow$ $T_1 \circ T_2$ is nonexpansive
- $T_1$, $T_2$ are averaged $\Rightarrow$ $T_1 \circ T_2$ is averaged
55 Consequences
assume $A$, $B$ are monotone and $B$ is $\beta$-cocoercive, with $\gamma \in (0, 2\beta)$:
- FBS: $J_{\gamma A} \circ F_{\gamma B}$ is averaged
- PRS: $(2J_{\gamma A} - I) \circ (2J_{\gamma B} - I)$ is nonexpansive
- DRS: $\tfrac{1}{2} I + \tfrac{1}{2}(2J_{\gamma A} - I) \circ (2J_{\gamma B} - I)$ is $(1/2)$-averaged
56 A three-operator splitting scheme
require: $A$, $B$ maximally monotone, $C$ cocoercive
Davis and Yin '15: $T_{\mathrm{DYS}} := I - J_{\gamma B} + J_{\gamma A} \circ (2J_{\gamma B} - I - \gamma C \circ J_{\gamma B})$
(evaluating $T_{\mathrm{DYS}} z$ evaluates $J_{\gamma A}$, $J_{\gamma B}$, and $C$ only once each)
reduces to BFS if $A = 0$, to FBS if $B = 0$, and to DRS if $C = 0$
equivalent conditions: $0 \in Ax + Bx + Cx \iff z = T_{\mathrm{DYS}}(z)$ with $x = J_{\gamma B} z$
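A minimal Davis-Yin sketch, assuming the resolvents `JA`, `JB` and the cocoercive map `C` are given as callables:

```python
def davis_yin(JA, JB, C, z0, gamma, n_iters=500):
    """z^{k+1} = z^k - J_B(z^k) + J_A(2*J_B(z^k) - z^k - gamma*C(J_B(z^k)))."""
    z = z0.copy()
    for _ in range(n_iters):
        xB = JB(z)                            # backward step on B
        xA = JA(2 * xB - z - gamma * C(xB))   # reflect, forward step on C, backward on A
        z = z + xA - xB                       # z^{k+1} = T_DYS(z^k)
    return JB(z)                              # solution of 0 in Ax + Bx + Cx
```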
57 Abstraction: KM (Krasnosel'skiĭ-Mann) iteration
require: $T$ is nonexpansive (1-Lipschitz)
choose $\lambda > 0$ so that $T_\lambda = (1 - \lambda)I + \lambda T$ is an averaged operator
KM iteration: $z^{k+1} = T_\lambda z^k$
special cases: $J_{\gamma A}$, $(I - \gamma A)$, $T_{\mathrm{FBS}}$, $T_{\mathrm{BFS}}$, $T_{\mathrm{DRS}}$, $T^{\lambda}_{\mathrm{PRS}}$, and $T_{\mathrm{DYS}}$ (the step size of any cocoercive map therein must be bounded)
convergence: if $\operatorname{Fix} T \ne \emptyset$, then $z^k \rightharpoonup z^* \in \operatorname{Fix} T$
divergence: if $\operatorname{Fix} T = \emptyset$, then $(z^k)_{k \ge 0}$ goes unbounded
58 Open question
find an operator-splitting scheme for $0 \in (T_1 + \cdots + T_m)x$, $m \ge 4$
require:
- no use of auxiliary variables
- convergence is guaranteed under monotone $T_i$'s
59 Summary
- monotone operator splitting is a set of powerful and elegant tools for many problems in signal processing, machine learning, computer vision, etc.
- they give rise to parallel, distributed, and decentralized algorithms
- under the hood: fixed-point and nonexpansive-operator theory
- not covered: the convergence rates of the objective error $f^k - f^*$ and the point error $\|z^k - z^*\|^2$, and accelerated rates by averaging and extrapolation
60 Thank you!
References:
- H. Bauschke and P. Combettes. Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer.
- N. Komodakis and J.-C. Pesquet. Playing with duality: an overview of recent primal-dual approaches for solving large-scale optimization problems. IEEE Signal Processing Magazine.
- D. Davis and W. Yin. Convergence rate analysis of several splitting schemes. UCLA CAM 14-51.
- D. Davis and W. Yin. A three-operator splitting method and its acceleration. UCLA CAM 15-13.