Constrained optimization

1 Constrained optimization DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis Carlos Fernandez-Granda

2 Compressed sensing Convex constrained problems Analyzing optimization-based methods

3 Magnetic resonance imaging 2D DFT (magnitude) 2D DFT (log. of magnitude)

4 Magnetic resonance imaging Data: Samples from spectrum Problem: Sampling is time consuming (annoying, kids move... ) Images are compressible (sparse in wavelet basis) Can we recover compressible signals from less data?

5 2D wavelet transform

6 2D wavelet transform

7 Full DFT matrix

8 Full DFT matrix

9 Regular subsampling

10 Regular subsampling

11 Random subsampling

12 Random subsampling

13 Toy example

14 Regular subsampling

15 Random subsampling

16 Linear inverse problems. Linear inverse problem: $Ax = y$. Linear measurements: $y[i] = \langle A_{i\,:}, x \rangle$, $1 \le i \le n$, with $A \in \mathbb{R}^{n \times m}$. Aim: recover the data signal $x \in \mathbb{R}^m$ from the data $y \in \mathbb{R}^n$. We need $n \ge m$, otherwise the problem is underdetermined. If $n < m$ there are infinitely many solutions $x + w$ where $w \in \operatorname{null}(A)$.

17 Sparse recovery. Aim: recover a sparse $x$ from linear measurements $Ax = y$. When is the problem well posed? There shouldn't be two sparse vectors $x_1$ and $x_2$ such that $Ax_1 = Ax_2$.

18 Spark. The spark of a matrix is the smallest number of columns that form a linearly dependent subset.

19 Spark. The spark of a matrix is the smallest number of columns that form a linearly dependent subset. Let $y := Ax$, where $A \in \mathbb{R}^{n \times m}$, $y \in \mathbb{R}^n$ and $x \in \mathbb{R}^m$ is a sparse vector with $s$ nonzero entries. The vector $x$ is the only vector with sparsity $s$ consistent with the data, i.e. it is the solution of $\min_{\tilde{x}} \|\tilde{x}\|_0$ subject to $A\tilde{x} = y$, for any choice of $x$, if and only if $\operatorname{spark}(A) > 2s$.
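A minimal sketch of what this definition means computationally: a brute-force search over column subsets, assuming NumPy; the function name and the small example matrix are made up for illustration and the cost is exponential.

```python
import itertools
import numpy as np

def spark(A, tol=1e-10):
    """Brute-force spark: size of the smallest linearly dependent
    subset of columns (illustration only, exponential cost)."""
    n_cols = A.shape[1]
    for k in range(1, n_cols + 1):
        for cols in itertools.combinations(range(n_cols), k):
            # A subset of k columns is dependent iff its rank is < k.
            if np.linalg.matrix_rank(A[:, cols], tol=tol) < k:
                return k
    return np.inf  # all columns are linearly independent

# Example: spark(A) > 2s guarantees that an s-sparse vector is the
# unique sparsest solution of A x = y.
A = np.array([[1., 0., 1., 0.],
              [0., 1., 1., 0.],
              [0., 0., 0., 1.]])
print(spark(A))  # 3: columns 0, 1, 2 are linearly dependent
```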

20-24 Proof. Equivalent statements:
(i) For any $x$, $x$ is the only vector with sparsity $s$ consistent with the data.
(ii) For any pair of distinct $s$-sparse vectors $x_1$ and $x_2$, $A(x_1 - x_2) \ne 0$.
(iii) For any pair of subsets of $s$ indices $T_1$ and $T_2$, $A_{T_1 \cup T_2}\,\alpha \ne 0$ for any nonzero $\alpha \in \mathbb{R}^{|T_1 \cup T_2|}$.
(iv) All submatrices with at most $2s$ columns have no nonzero vectors in their null space.
(v) All submatrices with at most $2s$ columns are full rank.

25 Restricted-isometry property. Robust version of the spark: if two $s$-sparse vectors $x_1, x_2$ are far apart, then $Ax_1, Ax_2$ should be far apart. The linear operator should preserve distances (be an isometry) when restricted to act upon sparse vectors.

26-27 Restricted-isometry property. $A$ satisfies the restricted isometry property (RIP) with constant $\kappa_s$ if for any $s$-sparse vector $x$, $(1 - \kappa_s)\,\|x\|_2 \le \|Ax\|_2 \le (1 + \kappa_s)\,\|x\|_2$. If $A$ satisfies the RIP for sparsity level $2s$ then for any $s$-sparse $x_1, x_2$, $\|y_2 - y_1\|_2 = \|A(x_2 - x_1)\|_2 \ge (1 - \kappa_{2s})\,\|x_2 - x_1\|_2$.

28 Regular subsampling

29 Regular subsampling

30 Correlation with column (plot).

31 Random subsampling

32 Random subsampling

33 Correlation with column (plot).

34 Restricted-isometry property Deterministic matrices tend to not satisfy the RIP It is NP-hard to check if spark or RIP hold Random matrices satisfy RIP with high probability We prove it for Gaussian iid matrices, ideas in proof for random Fourier matrices are similar

35 Restricted-isometry property for Gaussian matrices. Let $A \in \mathbb{R}^{m \times n}$ be a random matrix with iid standard Gaussian entries. $\frac{1}{\sqrt{m}} A$ satisfies the RIP with constant $\kappa_s$ with probability $1 - \frac{C_2}{n}$, for two fixed constants $C_1, C_2 > 0$, as long as $m \ge \frac{C_1\, s}{\kappa_s^2} \log\left(\frac{n}{s}\right)$.
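The statement can be probed numerically. The sketch below, assuming NumPy and made-up problem sizes, samples random supports and uses the singular values of the corresponding submatrices to lower-bound the RIP constant of a scaled Gaussian matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, s = 200, 1000, 10
A = rng.standard_normal((m, n)) / np.sqrt(m)   # (1/sqrt(m)) * Gaussian matrix

worst_low, worst_high = 1.0, 1.0
for _ in range(500):  # random supports give a lower bound on the RIP constant
    T = rng.choice(n, size=s, replace=False)
    sing = np.linalg.svd(A[:, T], compute_uv=False)
    worst_low = min(worst_low, sing[-1])
    worst_high = max(worst_high, sing[0])

print("empirical kappa_s >=", max(1 - worst_low, worst_high - 1))
```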

36 Proof. For a fixed support $T$ of size $s$, the bounds follow from bounds on the singular values of Gaussian matrices.

37 Singular values $\sigma_i$ of an $m \times s$ Gaussian matrix, $s = 100$ (plot of $\sigma_i$ against $i$).

38 Singular values $\sigma_i$ of an $m \times s$ Gaussian matrix, $s = 1000$ (plot of $\sigma_i$ against $i$).

39 Proof. For a fixed submatrix the singular values are bounded by $\sqrt{m}\,(1 - \kappa_s) \le \sigma_s \le \sigma_1 \le \sqrt{m}\,(1 + \kappa_s)$ with probability at least $1 - 2\left(\frac{12}{\kappa_s}\right)^s \exp\left(-\frac{m \kappa_s^2}{32}\right)$. For any vector $x$ with support $T$, $(1 - \kappa_s)\,\|x\|_2 \le \frac{1}{\sqrt{m}}\,\|A x\|_2 \le (1 + \kappa_s)\,\|x\|_2$.

40 Union bound. For any events $S_1, S_2, \dots, S_n$ in a probability space, $\mathrm{P}\left(\bigcup_i S_i\right) \le \sum_{i=1}^n \mathrm{P}(S_i)$.

41-43 Proof. Number of different supports of size $s$: $\binom{n}{s} \le \left(\frac{en}{s}\right)^s$. By the union bound, $(1 - \kappa_s)\,\|x\|_2 \le \frac{1}{\sqrt{m}}\,\|Ax\|_2 \le (1 + \kappa_s)\,\|x\|_2$ holds for any $s$-sparse vector $x$ with probability at least
$1 - \left(\frac{en}{s}\right)^s 2\left(\frac{12}{\kappa_s}\right)^s \exp\left(-\frac{m \kappa_s^2}{32}\right) = 1 - \exp\left(\log 2 + s + s \log\left(\frac{n}{s}\right) + s \log\left(\frac{12}{\kappa_s}\right) - \frac{m \kappa_s^2}{32}\right) \ge 1 - \frac{C_2}{n}$
as long as $m \ge \frac{C_1\, s}{\kappa_s^2} \log\left(\frac{n}{s}\right)$.

44 Sparse recovery via $\ell_1$-norm minimization. $\ell_0$-"norm" minimization is intractable. (As usual) we can minimize the $\ell_1$ norm instead: the estimate $x_{\ell_1}$ is the solution of $\min_{\tilde{x}} \|\tilde{x}\|_1$ subject to $A\tilde{x} = y$.
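A minimal sketch of this estimator in code, assuming the cvxpy package and made-up random Gaussian data, solving the equality-constrained $\ell_1$ problem directly:

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
m, n, s = 40, 100, 5
A = rng.standard_normal((m, n))
x_true = np.zeros(n)
x_true[rng.choice(n, s, replace=False)] = rng.standard_normal(s)
y = A @ x_true

# min ||x||_1  subject to  A x = y
x = cp.Variable(n)
prob = cp.Problem(cp.Minimize(cp.norm1(x)), [A @ x == y])
prob.solve()
print(np.max(np.abs(x.value - x_true)))  # close to 0 when recovery succeeds
```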

45 Minimum $\ell_2$-norm solution (regular subsampling): plot of the true signal and the minimum $\ell_2$-norm solution.

46 Minimum $\ell_1$-norm solution (regular subsampling): plot of the true signal and the minimum $\ell_1$-norm solution.

47 Minimum $\ell_2$-norm solution (random subsampling): plot of the true signal and the minimum $\ell_2$-norm solution.

48 Minimum $\ell_1$-norm solution (random subsampling): plot of the true signal and the minimum $\ell_1$-norm solution.

49 Geometric intuition: $\ell_2$ norm vs. $\ell_1$ norm (figure).

50-51 Sparse recovery via $\ell_1$-norm minimization. If the signal is sparse in a transform domain, $x = Wc$, then solve $\min_{\tilde{c}} \|\tilde{c}\|_1$ subject to $AW\tilde{c} = y$. If we want to recover the original $c$ then $AW$ should satisfy the RIP. However, we might be fine with any $\tilde{c}$ such that $W\tilde{c} = x$.

52 Regular subsampling

53 Minimum $\ell_2$-norm solution (regular subsampling)

54 Minimum $\ell_1$-norm solution (regular subsampling)

55 Random subsampling

56 Minimum $\ell_2$-norm solution (random subsampling)

57 Minimum $\ell_1$-norm solution (random subsampling)

58 Compressed sensing Convex constrained problems Analyzing optimization-based methods

59 Convex sets. A convex set $S$ is any set such that for any $x, y \in S$ and $\theta \in (0, 1)$, $\theta x + (1 - \theta) y \in S$. The intersection of convex sets is convex.

60 Convex vs nonconvex Nonconvex Convex

61 Epigraph. The epigraph $\operatorname{epi}(f)$ is the set of points lying on or above the graph of $f$ (figure). A function is convex if and only if its epigraph is convex.

62 Projection onto a convex set. The projection of any vector $x$ onto a non-empty closed convex set $S$ exists and is unique: $P_S(x) := \arg\min_{y \in S} \|x - y\|_2$.
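Since the projection is itself defined by a convex program, it can be computed numerically. A small sketch assuming cvxpy, with an arbitrarily chosen closed convex set $S$ (the $\ell_1$ ball) and a made-up point $x$:

```python
import cvxpy as cp
import numpy as np

# Project x onto S = {y : ||y||_1 <= 1} by solving min_y ||x - y||_2.
x = np.array([1.5, -0.2, 0.7])
y = cp.Variable(3)
prob = cp.Problem(cp.Minimize(cp.norm(x - y, 2)), [cp.norm1(y) <= 1])
prob.solve()
print(y.value)  # the unique projection of x onto the l1 ball
```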

63-69 Proof. Assume there are two distinct projections $y_1 \ne y_2$. Consider $y := \frac{y_1 + y_2}{2}$, which belongs to $S$ (why? $S$ is convex). Then
$\langle x - y,\, y_1 - y \rangle = \left\langle x - \frac{y_1 + y_2}{2},\, y_1 - \frac{y_1 + y_2}{2} \right\rangle = \frac{1}{4}\,\big\langle (x - y_1) + (x - y_2),\, (x - y_2) - (x - y_1) \big\rangle = \frac{1}{4}\left(\|x - y_2\|_2^2 - \|x - y_1\|_2^2\right) = 0.$
By Pythagoras' theorem,
$\|x - y_1\|_2^2 = \|x - y\|_2^2 + \|y_1 - y\|_2^2 = \|x - y\|_2^2 + \frac{1}{4}\,\|y_1 - y_2\|_2^2 > \|x - y\|_2^2,$
which contradicts the assumption that $y_1$ is a projection, since $y \in S$.

70 Convex combination. Given $n$ vectors $x_1, x_2, \dots, x_n \in \mathbb{R}^n$, $x := \sum_{i=1}^n \theta_i x_i$ is a convex combination of $x_1, x_2, \dots, x_n$ if $\theta_i \ge 0$, $1 \le i \le n$, and $\sum_{i=1}^n \theta_i = 1$.

71 Convex hull. The convex hull $\mathcal{C}(S)$ of $S$ is the set of convex combinations of points in $S$. The $\ell_1$-norm ball is the convex hull of the intersection between the $\ell_0$-"norm" ball and the $\ell_\infty$-norm ball.

72 $\ell_1$-norm ball (figure).

73-74 $\mathcal{B}_{\ell_1} \subseteq \mathcal{C}\,(\mathcal{B}_{\ell_0} \cap \mathcal{B}_{\ell_\infty})$. Let $x \in \mathcal{B}_{\ell_1}$. Set $\theta_i := |x[i]|$ for $1 \le i \le n$ and $\theta_0 := 1 - \sum_{i=1}^n \theta_i$. By construction $\theta_i \ge 0$, $\sum_{i=0}^n \theta_i = 1$, and $\theta_0 = 1 - \sum_{i=1}^n \theta_i = 1 - \|x\|_1 \ge 0$ because $x \in \mathcal{B}_{\ell_1}$. Then $x \in \mathcal{C}\,(\mathcal{B}_{\ell_0} \cap \mathcal{B}_{\ell_\infty})$ because $x = \sum_{i=1}^n \theta_i \operatorname{sign}(x[i])\, e_i + \theta_0 \cdot 0$.

75-80 $\mathcal{C}\,(\mathcal{B}_{\ell_0} \cap \mathcal{B}_{\ell_\infty}) \subseteq \mathcal{B}_{\ell_1}$. Let $x \in \mathcal{C}\,(\mathcal{B}_{\ell_0} \cap \mathcal{B}_{\ell_\infty})$, then $x = \sum_{i=1}^m \theta_i\, y_i$. Then
$\|x\|_1 \le \sum_{i=1}^m \theta_i\, \|y_i\|_1$ by the triangle inequality
$\le \sum_{i=1}^m \theta_i\, \|y_i\|_\infty$ because each $y_i$ only has one nonzero entry
$\le \sum_{i=1}^m \theta_i = 1$.

81 Convex optimization problem. $f_0, f_1, \dots, f_m, h_1, \dots, h_p : \mathbb{R}^n \to \mathbb{R}$. Minimize $f_0(x)$ subject to $f_i(x) \le 0$, $1 \le i \le m$, and $h_i(x) = 0$, $1 \le i \le p$.

82 Definitions. A feasible vector is a vector that satisfies all the constraints. A solution is any feasible vector $x^\ast$ such that $f_0(x) \ge f_0(x^\ast)$ for all feasible vectors $x$. If a solution exists, $f_0(x^\ast)$ is the optimal value or optimum of the problem.

83 Convex optimization problem. The optimization problem is convex if $f_0$ is convex, $f_1, \dots, f_m$ are convex, and $h_1, \dots, h_p$ are affine, i.e. $h_i(x) = a_i^T x + b_i$ for some $a_i \in \mathbb{R}^n$ and $b_i \in \mathbb{R}$.

84 Linear program. Minimize $a^T x$ subject to $c_i^T x \le d_i$, $1 \le i \le m$, and $Ax = b$.

85 $\ell_1$-norm minimization as an LP. The optimization problem minimize $\|x\|_1$ subject to $Ax = b$ can be recast as the LP: minimize $\sum_{i=1}^m t[i]$ subject to $t[i] \ge e_i^T x$, $t[i] \ge -e_i^T x$, and $Ax = b$.
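A sketch of this reformulation with a generic LP solver, assuming SciPy and a made-up random instance; the LP variable vector stacks $x$ and $t$:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n_meas, d = 10, 30
A = rng.standard_normal((n_meas, d))
x_true = np.zeros(d); x_true[[3, 17]] = [1.0, -2.0]
b = A @ x_true

# Variables z = [x, t]; minimize sum(t) s.t. -t <= x <= t and A x = b.
c = np.concatenate([np.zeros(d), np.ones(d)])
I = np.eye(d)
A_ub = np.block([[I, -I], [-I, -I]])      # x - t <= 0 and -x - t <= 0
b_ub = np.zeros(2 * d)
A_eq = np.hstack([A, np.zeros((n_meas, d))])
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b,
              bounds=[(None, None)] * (2 * d))
x_l1 = res.x[:d]                          # the minimum l1-norm solution
```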

86-93 Proof. Solution to the $\ell_1$-norm minimization problem: $x_{\ell_1}$. Solution to the linear program: $(x_{\mathrm{lp}}, t_{\mathrm{lp}})$. Set $t_{\ell_1}[i] := |x_{\ell_1}[i]|$, so that $(x_{\ell_1}, t_{\ell_1})$ is feasible for the linear program. Then
$\|x_{\ell_1}\|_1 = \sum_{i=1}^m t_{\ell_1}[i] \ge \sum_{i=1}^m t_{\mathrm{lp}}[i]$ by optimality of $t_{\mathrm{lp}}$, and $\sum_{i=1}^m t_{\mathrm{lp}}[i] \ge \|x_{\mathrm{lp}}\|_1$,
so $x_{\mathrm{lp}}$ is a solution to the $\ell_1$-norm minimization problem. Conversely,
$\sum_{i=1}^m t_{\ell_1}[i] = \|x_{\ell_1}\|_1 \le \|x_{\mathrm{lp}}\|_1$ by optimality of $x_{\ell_1}$, and $\|x_{\mathrm{lp}}\|_1 \le \sum_{i=1}^m t_{\mathrm{lp}}[i]$,
so $(x_{\ell_1}, t_{\ell_1})$ is a solution to the linear program.

94 Quadratic program. For a positive semidefinite matrix $Q \in \mathbb{R}^{n \times n}$: minimize $x^T Q x + a^T x$ subject to $c_i^T x \le d_i$, $1 \le i \le m$, and $Ax = b$.

95 $\ell_1$-norm regularized least squares as a QP. The optimization problem minimize $\|Ax - y\|_2^2 + \alpha\,\|x\|_1$ can be recast as the QP: minimize $x^T A^T A x - 2\, y^T A x + \alpha \sum_{i=1}^n t[i]$ subject to $t[i] \ge e_i^T x$ and $t[i] \ge -e_i^T x$.
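For reference, a short sketch solving the same $\ell_1$-regularized least-squares problem directly, assuming cvxpy; the data and the value of $\alpha$ are made up:

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((30, 80))
y = rng.standard_normal(30)
alpha = 0.5

# min ||A x - y||_2^2 + alpha ||x||_1
x = cp.Variable(80)
objective = cp.sum_squares(A @ x - y) + alpha * cp.norm1(x)
cp.Problem(cp.Minimize(objective)).solve()
print(np.sum(np.abs(x.value) > 1e-6))  # number of (numerically) nonzero entries
```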

96 Lagrangian. The Lagrangian of a canonical optimization problem is $L(x, \alpha, \nu) := f_0(x) + \sum_{i=1}^m \alpha[i]\, f_i(x) + \sum_{j=1}^p \nu[j]\, h_j(x)$, where $\alpha \in \mathbb{R}^m$, $\nu \in \mathbb{R}^p$ are called Lagrange multipliers or dual variables. If $x$ is feasible and $\alpha[i] \ge 0$ for $1 \le i \le m$, then $L(x, \alpha, \nu) \le f_0(x)$.

97 Lagrange dual function. The Lagrange dual function of the problem is $l(\alpha, \nu) := \inf_{x \in \mathbb{R}^n} \left( f_0(x) + \sum_{i=1}^m \alpha[i]\, f_i(x) + \sum_{j=1}^p \nu[j]\, h_j(x) \right)$. Let $p^\ast$ be an optimum of the optimization problem. As long as $\alpha[i] \ge 0$ for $1 \le i \le m$, $l(\alpha, \nu) \le p^\ast$.

98 Dual problem. The dual problem of the (primal) optimization problem is: maximize $l(\alpha, \nu)$ subject to $\alpha[i] \ge 0$, $1 \le i \le m$. The dual problem is always convex, even if the primal isn't!

99 Maximum/supremum of convex functions. The pointwise maximum of $m$ convex functions $f_1, \dots, f_m$ is convex: $f_{\max}(x) := \max_{1 \le i \le m} f_i(x)$. The pointwise supremum of a family of convex functions indexed by a set $I$ is convex: $f_{\sup}(x) := \sup_{i \in I} f_i(x)$.

100-103 Proof. For any $0 \le \theta \le 1$ and any $x, y$,
$f_{\sup}(\theta x + (1 - \theta) y) = \sup_{i \in I} f_i(\theta x + (1 - \theta) y) \le \sup_{i \in I} \big( \theta f_i(x) + (1 - \theta) f_i(y) \big)$ by convexity of the $f_i$
$\le \theta \sup_{i \in I} f_i(x) + (1 - \theta) \sup_{j \in I} f_j(y) = \theta f_{\sup}(x) + (1 - \theta) f_{\sup}(y)$.

104 Weak duality. If $p^\ast$ is a primal optimum and $d^\ast$ a dual optimum, then $d^\ast \le p^\ast$.

105 Strong duality. For convex problems $d^\ast = p^\ast$ under very weak conditions. LPs: the primal optimum is finite. General convex programs (Slater's condition): there exists a point that is strictly feasible, $f_i(x) < 0$ for $1 \le i \le m$.

106 $\ell_1$-norm minimization. The dual problem of $\min_x \|x\|_1$ subject to $Ax = y$ is $\max_\nu \; y^T \nu$ subject to $\|A^T \nu\|_\infty \le 1$.
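A numerical sanity check of this duality, assuming cvxpy and made-up random data: solve both problems and verify that the optimal values coincide and that the entries of $A^T\nu$ are bounded by one in absolute value.

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(2)
m, n = 20, 50
A = rng.standard_normal((m, n))
x0 = np.zeros(n); x0[[4, 9, 33]] = [1.0, -2.0, 0.5]
y = A @ x0

x = cp.Variable(n)
primal = cp.Problem(cp.Minimize(cp.norm1(x)), [A @ x == y])
primal.solve()

nu = cp.Variable(m)
dual = cp.Problem(cp.Maximize(y @ nu), [cp.norm(A.T @ nu, 'inf') <= 1])
dual.solve()

print(primal.value, dual.value)        # equal up to solver tolerance
print(np.max(np.abs(A.T @ nu.value)))  # <= 1
```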

107-112 Proof. Lagrangian: $L(x, \nu) = \|x\|_1 + \nu^T (y - Ax)$. Lagrange dual function: $l(\nu) := \inf_{x \in \mathbb{R}^n} \left( \|x\|_1 - (A^T \nu)^T x \right) + \nu^T y$.
What if $|(A^T \nu)[i]| > 1$? We can send $x[i] \to \pm\infty$ and $l(\nu) \to -\infty$.
What if $\|A^T \nu\|_\infty \le 1$? Then $(A^T \nu)^T x \le \|A^T \nu\|_\infty \|x\|_1 \le \|x\|_1$, so $l(\nu) = \nu^T y$.

113 Strong duality. The solution $\nu^\ast$ to $\max_\nu \; y^T \nu$ subject to $\|A^T \nu\|_\infty \le 1$ satisfies $(A^T \nu^\ast)[i] = \operatorname{sign}(x^\ast[i])$ for all $x^\ast[i] \ne 0$, for all solutions $x^\ast$ to the primal problem $\min_x \|x\|_1$ subject to $Ax = y$.

114 Dual solution: plot of the sign of the true signal and the minimum $\ell_1$-norm dual solution.

115 Proof. By strong duality, $\|x^\ast\|_1 = y^T \nu^\ast = (Ax^\ast)^T \nu^\ast = (x^\ast)^T (A^T \nu^\ast) = \sum_{i=1}^m (A^T \nu^\ast)[i]\; x^\ast[i]$. By Hölder's inequality, $\|x^\ast\|_1 \ge \sum_{i=1}^m (A^T \nu^\ast)[i]\; x^\ast[i]$, with equality if and only if $(A^T \nu^\ast)[i] = \operatorname{sign}(x^\ast[i])$ for all $x^\ast[i] \ne 0$.

116-120 Another algorithm for sparse recovery. Aim: find the nonzero locations of a sparse vector $x$ from $y = Ax$. Insight: we have access to the inner product of $x$ and $A^T w$ for any $w$: $y^T w = (Ax)^T w = x^T (A^T w)$. Idea: maximize $y^T w$ while bounding the magnitude of the entries of $A^T w$ by 1. The entries of $A^T w$ where $x$ is nonzero should saturate to 1 or -1.

121 Compressed sensing Convex constrained problems Analyzing optimization-based methods

122 Analyzing optimization-based methods Best case scenario: Primal solution has closed form Otherwise: Use dual solution to characterize primal solution

123 Minimum $\ell_2$-norm solution. Let $A \in \mathbb{R}^{m \times n}$ be a full-rank matrix such that $m < n$. For any $y \in \mathbb{R}^m$ the solution to the optimization problem $\arg\min_x \|x\|_2$ subject to $Ax = y$ is $x^\ast := V S^{-1} U^T y = A^T (A A^T)^{-1} y$, where $A = U S V^T$ is the SVD of $A$.
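A quick numerical check of the closed form, assuming NumPy and a made-up underdetermined full-rank system: the formula matches the pseudoinverse solution and satisfies the constraint.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 10, 30
A = rng.standard_normal((m, n))
y = rng.standard_normal(m)

x_l2 = A.T @ np.linalg.solve(A @ A.T, y)  # A^T (A A^T)^{-1} y
x_pinv = np.linalg.pinv(A) @ y            # same vector via the SVD

print(np.allclose(x_l2, x_pinv), np.allclose(A @ x_l2, y))
```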

124-128 Proof. Decompose $x = P_{\operatorname{row}(A)}\, x + P_{\operatorname{row}(A)^\perp}\, x$. Since $A$ is full rank, $\operatorname{row}(A)$ is spanned by the columns of $V$, so $P_{\operatorname{row}(A)}\, x = V c$ for some vector $c \in \mathbb{R}^m$. Then $A x = A P_{\operatorname{row}(A)}\, x = U S V^T V c = U S c$, so $Ax = y$ is equivalent to $U S c = y$, i.e. $c = S^{-1} U^T y$. Hence for all feasible vectors $x$, $P_{\operatorname{row}(A)}\, x = V S^{-1} U^T y$. By Pythagoras' theorem, minimizing $\|x\|_2$ is equivalent to minimizing $\|x\|_2^2 = \|P_{\operatorname{row}(A)}\, x\|_2^2 + \|P_{\operatorname{row}(A)^\perp}\, x\|_2^2$, so the minimum is attained by setting $P_{\operatorname{row}(A)^\perp}\, x = 0$.

129 Regular subsampling

130 Minimum $\ell_2$-norm solution (regular subsampling): plot of the true signal and the minimum $\ell_2$-norm solution.

131 Regular subsampling. $A := \frac{1}{\sqrt{2}} \begin{bmatrix} F_{m/2} & F_{m/2} \end{bmatrix}$, where $F_{m/2} F_{m/2}^\ast = I$ and $F_{m/2}^\ast F_{m/2} = I$, and $x := \begin{bmatrix} x_{\mathrm{up}} \\ x_{\mathrm{down}} \end{bmatrix}$.

132-138 Regular subsampling.
$x_{\ell_2} = \arg\min_{A x = y} \|x\|_2 = A^\ast (A A^\ast)^{-1} y$
$= \frac{1}{\sqrt{2}} \begin{bmatrix} F_{m/2}^\ast \\ F_{m/2}^\ast \end{bmatrix} \left( \frac{1}{2} \begin{bmatrix} F_{m/2} & F_{m/2} \end{bmatrix} \begin{bmatrix} F_{m/2}^\ast \\ F_{m/2}^\ast \end{bmatrix} \right)^{-1} \frac{1}{\sqrt{2}} \begin{bmatrix} F_{m/2} & F_{m/2} \end{bmatrix} \begin{bmatrix} x_{\mathrm{up}} \\ x_{\mathrm{down}} \end{bmatrix}$
$= \frac{1}{2} \begin{bmatrix} F_{m/2}^\ast \\ F_{m/2}^\ast \end{bmatrix} \left( \frac{1}{2} \left( F_{m/2} F_{m/2}^\ast + F_{m/2} F_{m/2}^\ast \right) \right)^{-1} \left( F_{m/2}\, x_{\mathrm{up}} + F_{m/2}\, x_{\mathrm{down}} \right)$
$= \frac{1}{2} \begin{bmatrix} F_{m/2}^\ast \\ F_{m/2}^\ast \end{bmatrix} I^{-1} \left( F_{m/2}\, x_{\mathrm{up}} + F_{m/2}\, x_{\mathrm{down}} \right)$
$= \frac{1}{2} \begin{bmatrix} F_{m/2}^\ast \left( F_{m/2}\, x_{\mathrm{up}} + F_{m/2}\, x_{\mathrm{down}} \right) \\ F_{m/2}^\ast \left( F_{m/2}\, x_{\mathrm{up}} + F_{m/2}\, x_{\mathrm{down}} \right) \end{bmatrix}$
$= \frac{1}{2} \begin{bmatrix} x_{\mathrm{up}} + x_{\mathrm{down}} \\ x_{\mathrm{up}} + x_{\mathrm{down}} \end{bmatrix}$

139 Minimum $\ell_1$-norm solution. Problem: $\arg\min_{Ax = y} \|x\|_1$ doesn't have a closed form. Instead we can use a dual variable to certify optimality.

140 Dual solution. The solution $\nu^\ast$ to $\max_\nu \; y^T \nu$ subject to $\|A^T \nu\|_\infty \le 1$ satisfies $(A^T \nu^\ast)[i] = \operatorname{sign}(x^\ast[i])$ for all $x^\ast[i] \ne 0$, where $x^\ast$ is a solution to the primal problem $\min_x \|x\|_1$ subject to $Ax = y$.

141 Dual certificate. If there exists a vector $\nu \in \mathbb{R}^n$ such that $(A^T \nu)[i] = \operatorname{sign}(x^\ast[i])$ if $x^\ast[i] \ne 0$ and $|(A^T \nu)[i]| < 1$ if $x^\ast[i] = 0$, then $x^\ast$ is the unique solution to the primal problem $\min_x \|x\|_1$ subject to $Ax = y$, as long as the submatrix $A_T$ (the columns of $A$ on the support $T$ of $x^\ast$) is full rank.

142-146 Proof 1. $\nu$ is feasible for the dual problem, so for any primal feasible $x$,
$\|x\|_1 \ge y^T \nu = (A x^\ast)^T \nu = (x^\ast)^T (A^T \nu) = \sum_{i \in T} x^\ast[i]\,\operatorname{sign}(x^\ast[i]) = \|x^\ast\|_1,$
so $x^\ast$ must be a solution.

147-149 Proof 2. $A^T \nu$ is a subgradient of the $\ell_1$ norm at $x^\ast$. For any other feasible vector $x$,
$\|x\|_1 \ge \|x^\ast\|_1 + (A^T \nu)^T (x - x^\ast) = \|x^\ast\|_1 + \nu^T (A x - A x^\ast) = \|x^\ast\|_1.$

150 Random subsampling

151 Minimum $\ell_1$-norm solution (random subsampling): plot of the true signal and the minimum $\ell_1$-norm solution.

152 Exact sparse recovery via $\ell_1$-norm minimization. Assumption: there exists a signal $x^\ast \in \mathbb{R}^n$ with $s$ nonzeros such that $A x^\ast = y$ for a random $A \in \mathbb{R}^{m \times n}$ (random Fourier, Gaussian iid, ...). Exact recovery: if the number of measurements satisfies $m \ge C\, s \log n$, the solution of the problem minimize $\|x\|_1$ subject to $Ax = y$ is the original signal with probability at least $1 - \frac{1}{n}$.

153 Proof. Show that a dual certificate always exists. We need $A_T^T \nu = \operatorname{sign}(x^\ast_T)$ ($s$ constraints) and $\|A_{T^c}^T \nu\|_\infty < 1$. Idea: impose $A_T^T \nu = \operatorname{sign}(x^\ast_T)$ and minimize $\|A_{T^c}^T \nu\|_\infty$. Problem: no closed-form solution. How about minimizing the $\ell_2$ norm instead?

154 Proof of exact recovery. Prove that a dual certificate exists for any $s$-sparse $x^\ast$. Dual certificate candidate: the solution of minimize $\|v\|_2$ subject to $A_T^T v = \operatorname{sign}(x^\ast_T)$. Closed-form solution: $\nu_{\ell_2} := A_T \left( A_T^T A_T \right)^{-1} \operatorname{sign}(x^\ast_T)$. $A_T^T A_T$ is invertible with high probability. We need to prove that $A^T \nu_{\ell_2}$ satisfies $\|(A^T \nu_{\ell_2})_{T^c}\|_\infty < 1$.
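A small numerical illustration of this construction, assuming NumPy and a made-up Gaussian matrix and support: it builds $\nu_{\ell_2}$ and checks the two certificate conditions empirically.

```python
import numpy as np

rng = np.random.default_rng(4)
m, n, s = 100, 200, 5
A = rng.standard_normal((m, n)) / np.sqrt(m)
T = rng.choice(n, s, replace=False)
sign_T = rng.choice([-1.0, 1.0], s)       # stands in for sign(x*_T)

A_T = A[:, T]
nu = A_T @ np.linalg.solve(A_T.T @ A_T, sign_T)   # nu_l2

corr = A.T @ nu
off_support = np.delete(corr, T)
# On-support entries equal the signs; off-support entries are < 1 w.h.p.
print(np.allclose(corr[T], sign_T), np.max(np.abs(off_support)))
```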

155 Dual certificate: plot of the sign pattern and the dual certificate $A^T \nu_{\ell_2}$.

156 Proof of exact recovery. To control $\|(A^T \nu_{\ell_2})_{T^c}\|_\infty$, we need to bound, for $i \in T^c$, the inner product $A_i^T\, w$ where $w := A_T \left( A_T^T A_T \right)^{-1} \operatorname{sign}(x^\ast_T)$. Since $A_i$ is independent of $w$, $A_i^T w$ can be bounded using this independence. The result then follows from the union bound.
