Some Recent Advances in Non-convex Optimization, Purushottam Kar, IIT KANPUR

1 Some Recent Advances in Non-convex Optimization Purushottam Kar IIT KANPUR

2 Outline of the Talk: Recap of Convex Optimization; Why Non-convex Optimization?; Non-convex Optimization: A Brief Introduction; Robust Regression: A Non-convex Approach; Robust Regression: Application to Face Recognition; Robust PCA: A Sketch and Application to Foreground Extraction in Images

3 Recap of Convex Optimization

4 Convex Optimization: Convex function, Convex set

5 Examples: Linear Programming, Quadratic Programming, Semidefinite Programming

6 Applications: Resource Allocation, Regression, Classification, Clustering/Partitioning, Signal Processing, Dimensionality Reduction

7 Techniques: Projected (Sub)gradient Methods (stochastic, mini-batch variants; primal, dual, primal-dual approaches; coordinate update techniques). Interior Point Methods (barrier methods; annealing methods). Other Methods (cutting plane methods; accelerated routines; proximal methods; distributed optimization; derivative-free optimization).

8 Why Non-convex Optimization?

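9 Gene Expression Analysis: DNA micro-array gene expression data

10 Recommender Systems [Figure: matrix equation with dimension labels n, k, m]

11 Image Reconstruction and Robust Face Recognition [Figure: image equations]

12 Image Denoising and Robust Face Recognition [Figure: image equations, dimension label n]

13 Large Scale Surveillance: Foreground-background separation [Figure: matrix with dimension labels n, m written as a sum of two terms]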

14 Non-convex Optimization: Sparse Recovery, Matrix Completion, Robust Regression, Robust PCA

15 Non-convex Optimization: A Brief Introduction

16 Relaxation-based Techniques: Convexify the feasible set

17 Alternating Minimization: Matrix Completion, Robust PCA (also Robust Regression, coming up)
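The alternating idea on this slide can be illustrated on matrix completion. What follows is only a minimal sketch under stated assumptions, not the method from any cited paper: the function name `altmin_matrix_completion`, the factorisation M ≈ U Vᵀ, the small ridge term `reg`, and the random initialisation are all illustrative choices. With one factor fixed, the other is the solution of a set of small least-squares problems, so the two updates are alternated.

```python
import numpy as np

def altmin_matrix_completion(M, mask, k, iters=50, reg=1e-6, seed=0):
    """Fit M ~ U @ V.T on the observed entries (mask == True).

    Illustrative sketch: alternating least squares with a tiny ridge
    term (reg) that keeps each small linear system well posed.
    """
    rng = np.random.default_rng(seed)
    n, m = M.shape
    U = rng.standard_normal((n, k))
    V = rng.standard_normal((m, k))
    ridge = reg * np.eye(k)
    for _ in range(iters):
        # Fix V: each row of U solves its own least-squares problem
        # over the entries observed in that row.
        for i in range(n):
            obs = mask[i]
            Vo = V[obs]
            U[i] = np.linalg.solve(Vo.T @ Vo + ridge, Vo.T @ M[i, obs])
        # Fix U: each row of V is updated the same way, column by column.
        for j in range(m):
            obs = mask[:, j]
            Uo = U[obs]
            V[j] = np.linalg.solve(Uo.T @ Uo + ridge, Uo.T @ M[obs, j])
    return U, V
```

Each update can only lower the (regularised) squared error on the observed entries, which is why the procedure is attractive even though the joint problem in U and V is non-convex.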

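18 Projected Gradient Descent: take a gradient step, then project back onto the non-convex feasible set. Sparse Recovery: keep the top s elements by magnitude. Low-rank problems: perform a k-truncated SVD.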
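The two projections named on this slide are short to write down. The sketch below is illustrative only: the function names, the least-squares objective, and the step size 1/‖A‖₂² are assumptions made for the example, and no recovery guarantee is claimed for arbitrary inputs.

```python
import numpy as np

def hard_threshold(x, s):
    """Projection onto s-sparse vectors: keep the top s entries by magnitude."""
    out = np.zeros_like(x)
    if s > 0:
        keep = np.argpartition(np.abs(x), -s)[-s:]
        out[keep] = x[keep]
    return out

def truncated_svd(X, k):
    """Projection onto rank-k matrices: keep the top k singular triplets."""
    U, sig, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :k] * sig[:k]) @ Vt[:k]

def projected_gradient_sparse(A, y, s, iters=200):
    """Projected gradient descent for min ||Ax - y||^2 over s-sparse x.

    Assumed step size: 1 / ||A||_2^2, which keeps the objective from
    increasing between iterations.
    """
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ x - y)               # gradient of 0.5 * ||Ax - y||^2
        x = hard_threshold(x - step * grad, s)  # project back onto sparse vectors
    return x
```

Replacing `hard_threshold` by `truncated_svd` in the same loop gives the matrix analogue, with the iterate now a matrix and the projection a rank constraint.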

19 Pursuit and Greedy Methods: Set of atoms; Sparse Recovery
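One well-known greedy method of this kind is orthogonal matching pursuit. The slide does not name a specific algorithm, so this choice, and the function name `omp`, are assumptions made for the example. The columns of `A` play the role of the atoms.

```python
import numpy as np

def omp(A, y, s):
    """Orthogonal matching pursuit: greedily select s atoms (columns of A)."""
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(s):
        # Pick the atom most correlated with what is still unexplained.
        corr = np.abs(A.T @ residual)
        if support:
            corr[support] = 0.0  # never pick the same atom twice
        support.append(int(np.argmax(corr)))
        # Re-fit all selected atoms jointly, then update the residual.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x = np.zeros(A.shape[1])
    x[support] = coef
    return x, sorted(support)
```

When the atoms are orthonormal, the correlations equal the magnitudes of the true coefficients, so the greedy choice picks the support exactly; for general dictionaries additional conditions are needed.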

20 Robust Regression: A Non-convex Approach

21-23 Linear Regression [Figure sequence; image credit on slide 23: image.frompo.com]

24-28 Linear Regression with Noise [Figure sequence; slide 25 labels the residual]
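For reference, the ordinary least-squares fit and its residual can be computed as follows. This is a minimal sketch; the function name `least_squares` and the symbols `X`, `y`, `w` are chosen for the example.

```python
import numpy as np

def least_squares(X, y):
    """Return the least-squares model w and the residual vector y - X w."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w, y - X @ w
```

With well-behaved noise the residuals are small on every point; the slides that follow ask what happens when a few responses are corrupted arbitrarily.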

29 Linear Regression with Corruptions

30 Robust Regression: Corruptions are adversarial, adaptive, but only on a few locations

31-33 Robust Regression: Attempt 1 [Figure sequence; stray figure labels "3", "10", "10"]

34 Robust Regression: Attempt 2 [Wright and Ma 2010*, Nguyen et al, 2013*]

35 Lessons from History: "If among these errors are some which appear too large to be admissible, then those equations which produced these errors will be rejected, as coming from too faulty experiments, and the unknowns will be determined by means of the other equations, which will then give much smaller errors." Adrien-Marie Legendre, On the Method of Least Squares, 1805

36-38 Linear Regression with Corruptions [Figure sequence]

39 Linear Regression with Corruptions: TORRENT-FC (Thresholding Operator-based Robust RegrEssioN method) [Bhatia et al, 2015]
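A fully corrective thresholding scheme in the spirit of TORRENT-FC can be sketched as below. This is an illustration under stated assumptions rather than the authors' reference implementation: the function name `torrent_fc`, the parameter `n_clean` (how many points to trust), and the stopping rule are choices made for the example.

```python
import numpy as np

def torrent_fc(X, y, n_clean, iters=20):
    """Alternate between a least-squares fit on the active set and
    re-selecting the n_clean points with the smallest residuals."""
    n = X.shape[0]
    active = np.arange(n)          # start by trusting every point
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        # Fully corrective step: exact least squares on the active set.
        w, *_ = np.linalg.lstsq(X[active], y[active], rcond=None)
        # Thresholding step: keep the points the current model explains best.
        resid = np.abs(y - X @ w)
        new_active = np.sort(np.argpartition(resid, n_clean - 1)[:n_clean])
        if np.array_equal(new_active, active):
            break                  # active set is stable, so w is its fit
        active = new_active
    return w, active
```

This mirrors Legendre's advice: equations with inadmissibly large errors are rejected and the unknowns are determined from the rest, repeatedly.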

40-62 TORRENT in Action! [Animated demonstration spanning 23 slides]

63-64 Alt-Min in Theory: Recovery Guarantees. Robust against adaptive adversaries: the adversary has access to data, gold model, and noise. Requirements: the data needs to satisfy some nice properties; enough data needs to be present. Guarantee: TORRENT will recover the gold model if [condition lost in extraction].

65-72 Alt-Min in Theory: Convergence Rates. Linear rate of convergence. Suppose each alternation counts as one step; convergence is reached after T = log(1/ε) time steps. Invariant: at time t, the active set is s.t. [condition lost in extraction].
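The step count T = log(1/ε) follows from a generic contraction argument. The worked step below assumes that each alternation shrinks the distance to the gold model by a constant factor η < 1. The symbols η, w^t and w^* are introduced here for illustration and are not taken from the slides.

```latex
\|w^{t+1} - w^*\|_2 \le \eta \, \|w^{t} - w^*\|_2
\quad\Longrightarrow\quad
\|w^{T} - w^*\|_2 \le \eta^{T} \, \|w^{0} - w^*\|_2 ,
\qquad
\eta^{T} \le \epsilon
\iff
T \ge \frac{\log(1/\epsilon)}{\log(1/\eta)} .
```

For a fixed η, the number of alternations therefore grows only logarithmically in the target accuracy 1/ε, which is what "linear rate of convergence" means here.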

73 Alt-Min in Practice: Quality of Recovery [Bhatia et al 2015]

74 Alt-Min in Practice: Speed of Recovery [Bhatia et al 2015]

75 Robust Regression: Application to Face Recognition. Extended Yale B dataset, 38 people, 800 images

76 Face Recognition [Figure panels: 10% noise, 30% noise, 50% noise, 70% noise] [Bhatia et al 2015]

78 Robust PCA: A Sketch and Application to Foreground Extraction in Images

79 The Alternating Projection Procedure [Netrapalli et al 2014]
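The flavour of an alternating projection procedure for robust PCA can be conveyed with a heavily simplified sketch. It is not the algorithm of [Netrapalli et al 2014]: the fixed rank `k`, the single fixed `threshold`, and the function names are assumptions made for illustration. The matrix M is modelled as a low-rank part L plus a sparse part S, and the two are refreshed in turn.

```python
import numpy as np

def rank_k_projection(X, k):
    """Best rank-k approximation of X via a truncated SVD."""
    U, sig, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :k] * sig[:k]) @ Vt[:k]

def alt_proj_rpca(M, k, threshold, iters=30):
    """Alternate a rank-k projection of M - S with entrywise hard
    thresholding of M - L (simplified: fixed rank, fixed threshold)."""
    S = np.zeros_like(M)
    for _ in range(iters):
        L = rank_k_projection(M - S, k)               # low-rank part
        R = M - L
        S = np.where(np.abs(R) > threshold, R, 0.0)   # large entries only
    return L, S
```

In foreground extraction, each column of M would be a vectorised video frame, with L capturing the static background and S the moving foreground.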

81 Concluding Comments: Non-convex optimization is an exciting area. Widespread applications; much better modelling of problems; much more scalable algorithms; provable guarantees. So: full of opportunities, full of challenges.

82 Acknowledgements: Portions of this talk were based on joint work with Kush Bhatia (Microsoft Research), Prateek Jain (Microsoft Research), and Ambuj Tewari (U. Michigan, Ann Arbor)

83 The Data Sciences: Medha Atre, Arnab Bhattacharya, Sumit Ganguly, Purushottam Kar, Harish Karnick, Vinay Namboodiri, Piyush Rai, Indranil Saha, Gaurav Sharma, Sandeep Shukla

84 Our Strengths: Machine Learning; Vision, Image Processing; Databases, Data Mining; Online, Streaming Algorithms; Cyber-physical Systems

85 Questions?

86 TORRENT as an Alt-Min Procedure: TORRENT indeed performs Alt-Min. There are two variables in TORRENT: the active set and the model. The active set encodes the complement of the corruption vector. TORRENT alternates between fixing the model and choosing the active set, and fixing the active set and choosing the model. Both steps reduce the residual as much as possible.
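The alternation can be written as a joint minimisation over the model and the active set. The notation below is assumed for illustration: n data points (x_i, y_i), a model w, an active set S, and a fraction β of points treated as corrupted. The model step shown is the exact minimiser, as in the fully corrective variant.

```latex
\min_{w \in \mathbb{R}^d,\; S \subseteq [n],\; |S| = (1-\beta) n}
\;\; \sum_{i \in S} \left( y_i - x_i^\top w \right)^2

S^{t+1} = \arg\min_{|S| = (1-\beta) n} \; \sum_{i \in S} \left( y_i - x_i^\top w^{t} \right)^2
\qquad \text{(fix the model, choose the active set)}

w^{t+1} = \arg\min_{w} \; \sum_{i \in S^{t+1}} \left( y_i - x_i^\top w \right)^2
\qquad \text{(fix the active set, choose the model)}
```

The first update simply keeps the (1-β)n points with the smallest residuals, and the second is an ordinary least-squares fit on those points.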

87 Linear Regression with Corruptions: TORRENT-GD (Thresholding Operator-based Robust RegrEssioN method) [Bhatia et al, 2015]
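A gradient-descent variant replaces the exact least-squares fit with a single gradient step on the active set. The sketch below is an illustration only, not the authors' implementation: the function name `torrent_gd`, the parameter `n_clean`, the fixed iteration count, and the step size 1/‖X‖₂² are all assumptions made for the example.

```python
import numpy as np

def torrent_gd(X, y, n_clean, iters=300):
    """Alternate one gradient step on the active set with re-selecting
    the n_clean points that have the smallest residuals."""
    step = 1.0 / np.linalg.norm(X, 2) ** 2   # assumed conservative step size
    w = np.zeros(X.shape[1])
    active = np.arange(X.shape[0])           # start by trusting every point
    for _ in range(iters):
        Xa, ya = X[active], y[active]
        # Gradient step on 0.5 * ||Xa w - ya||^2 instead of a full solve.
        w = w - step * Xa.T @ (Xa @ w - ya)
        # Thresholding step, as in the fully corrective variant.
        resid = np.abs(y - X @ w)
        active = np.argpartition(resid, n_clean - 1)[:n_clean]
    return w, active
```

Each iteration is cheaper than a least-squares solve, at the price of needing more alternations to reach the same accuracy.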

88 Linear Regression with Corruptions: TORRENT-HYB (Thresholding Operator-based Robust RegrEssioN method) [Bhatia et al, 2015]
