Statistical Optimization for Big Data. Zhaoran Wang and Han Liu (Joint work with Tong Zhang)

1 Statistical Optimization for Big Data. Zhaoran Wang and Han Liu (Joint work with Tong Zhang)

2 Big Data Movement. Big Data = Massive Data Size + High-Dimensional + Complex Structure + Highly Noisy. Big Data give rise to Big Models.

3 Challenges of Big Models. Question: How to effectively fit these Big Models?

4 Challenges of Big Models. Estimator [equation], annotated: Nonconvex & Complicated; Nonconvex & Massive Data Size; Infinite-dimensional.

5 Challenges of Big Models. Current Solution: General-purpose Finite-dimensional Convex Optimization Methods.

6 Challenges of Big Models. Our Mission: Optimization Methods Tailored to Statistical Models.

7 Challenges of Big Models. In this talk: Taming Nonconvexity.

8 Key to Taming Nonconvexity. Exploration of Local Convex Region. Key: Good Initialization.

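9 Nonconvexity. Penalized M-Estimator: $\operatorname{argmin}_{\beta \in \mathbb{R}^d} L(\beta) + P_\lambda(\beta)$

10 Nonconvexity. $\operatorname{argmin}_{\beta \in \mathbb{R}^d} L(\beta) + P_\lambda(\beta)$, where $L$: Loss Function; $P_\lambda$: Penalty; $\lambda$: Regularization Parameter.

11 Nonconvexity. Convex Loss Function. Least Squares: $L(\beta) =$ [equation]

12 Nonconvexity. Nonconvex Loss Function. Semiparametric Elliptical Design Loss: $L(\beta) =$ [equation], annotated: Semiparametric Covariance Estimator.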

13 Nonconvex Loss Function: Robustness. Gaussian. [Figure: 3D plot with axes x, y, z]

14 Nonconvex Loss Function: Robustness. Beyond Gaussian. [Figure: 3D plot with axes x, y, z]

15 Nonconvex Loss Function: Robustness. Beyond Gaussian. [Figure: 3D plot with axes x, y, z]

16 Nonconvex Penalty: Oracle Property. [Figure: penalty $p_\lambda(\beta_j)$ against $\beta_j$ for MCP, $\ell_1$, and SCAD. Annotations: Introduces Bias; Corrects Bias; Oracle Property.]

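17 Nonconvex Penalty: Oracle Property. $\operatorname{argmin}_{\beta \in \mathbb{R}^d} L(\beta) + P_\lambda(\beta) = \operatorname{argmin}_{\operatorname{supp}(\beta) \subseteq \operatorname{supp}(\beta^*)} L(\beta)$. As if we are solving a low-dimensional problem knowing the true support.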

18 Challenge of Nonconvexity. Global Solution: Intractable to Compute. Question: Good Local Solution?

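19 Challenge of Nonconvexity. $\operatorname{argmin}_{\beta \in \mathbb{R}^d} L(\beta) + P_\lambda(\beta)$, where $L$: Loss Function; $P_\lambda$: Penalty; $\lambda$: Regularization Parameter.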

20 Revisiting Nonconvexity with Statistical Model in Mind. $L(\beta) = L(\beta; \underbrace{\{x_1, \ldots, x_n\}}_{\text{Randomness}})$. Nonconvex in the Worst Case.

21 Revisiting Nonconvexity with Statistical Model in Mind. $L(\beta) = L(\beta; \underbrace{\{x_1, \ldots, x_n\}}_{\text{Randomness}})$

22 Revisiting Nonconvexity with Statistical Model in Mind. $L(\beta) = L(\beta; \underbrace{\{x_1, \ldots, x_n\}}_{\text{Randomness}})$. Provably Strongly Convex with High Probability for Sparse $\beta$.

23 Revisiting Nonconvexity with Statistical Model in Mind. $L(\beta) = L(\beta; \underbrace{\{x_1, \ldots, x_n\}}_{\text{Randomness}})$

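24 Revisiting Nonconvexity with Statistical Model in Mind. $\operatorname{argmin}_{\beta \in \mathbb{R}^d} L(\beta) + P_\lambda(\beta)$, where $L$: Loss Function; $P_\lambda$: Penalty; $\lambda$: Regularization Parameter.

25 Revisiting Nonconvexity with Statistical Model in Mind. $P_\lambda(\beta) = Q_\lambda(\beta) + \lambda \|\beta\|_1$, where $P_\lambda$ is Nonconvex and $Q_\lambda$ is Concave.

26 Revisiting Nonconvexity with Statistical Model in Mind. $P_\lambda(\beta) = Q_\lambda(\beta) + \lambda \|\beta\|_1$. [Figure, two panels: $p_\lambda(\beta_j)$ against $\beta_j$ for MCP, $\ell_1$, and SCAD; $q_\lambda(\beta_j)$ against $\beta_j$ for MCP and SCAD]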
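The two panels on slide 26 can be reproduced numerically. The sketch below uses the standard textbook forms of MCP and SCAD, which are an assumption here since the slides do not print them; the shape parameters `gamma` and `a` and all function names are illustrative.

```python
def l1_penalty(t, lam):
    """Convex l1 penalty lam * |t|."""
    return lam * abs(t)


def mcp_penalty(t, lam, gamma=3.0):
    """Standard MCP (assumed form), gamma > 1: flat once |t| exceeds gamma * lam."""
    x = abs(t)
    if x <= gamma * lam:
        return lam * x - x * x / (2.0 * gamma)
    return 0.5 * gamma * lam * lam


def scad_penalty(t, lam, a=3.7):
    """Standard SCAD (assumed form), a > 2: l1 near zero, flat beyond a * lam."""
    x = abs(t)
    if x <= lam:
        return lam * x
    if x <= a * lam:
        return (2.0 * a * lam * x - x * x - lam * lam) / (2.0 * (a - 1.0))
    return 0.5 * (a + 1.0) * lam * lam


def concave_part(penalty, t, lam):
    """q_lam(t) = p_lam(t) - lam * |t|, the concave component of the penalty."""
    return penalty(t, lam) - lam * abs(t)
```

For example, `concave_part(mcp_penalty, 1.0, 1.0)` evaluates the MCP curve of the second panel at $\beta_j = 1$, and `concave_part(l1_penalty, t, lam)` is zero everywhere, which is why the $\ell_1$ penalty has no concave component to plot.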

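27 Revisiting Nonconvexity with Statistical Model in Mind. $L(\beta) + P_\lambda(\beta) = \underbrace{L(\beta) + Q_\lambda(\beta)}_{\text{Strongly Convex with High Probability for Sparse } \beta} + \lambda \|\beta\|_1$, where $L$ is Strongly Convex and $Q_\lambda$ is Concave.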
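A short worked argument for why the sum of a strongly convex loss and a concave term can remain strongly convex, sketched for the MCP only. The constants $\rho$ (strong convexity of $L$ along sparse directions) and $\gamma$ (MCP shape parameter), and the standard MCP form itself, are assumptions and do not appear on the slides.

```latex
\begin{align*}
% Assumption: L is rho-strongly convex along sparse directions beta' - beta
L(\beta') &\ge L(\beta) + \nabla L(\beta)^\top (\beta' - \beta)
            + \frac{\rho}{2} \|\beta' - \beta\|_2^2 , \\
% Standard MCP concave part: q(t) + t^2 / (2 gamma) is convex
q_\lambda(t) &=
  \begin{cases}
    -\dfrac{t^2}{2\gamma}, & |t| \le \gamma\lambda, \\[6pt]
    \dfrac{\gamma\lambda^2}{2} - \lambda |t|, & |t| > \gamma\lambda,
  \end{cases} \\
% Hence Q_lambda(beta) = sum_j q_lambda(beta_j) has bounded concavity
Q_\lambda(\beta') &\ge Q_\lambda(\beta) + \nabla Q_\lambda(\beta)^\top (\beta' - \beta)
            - \frac{1}{2\gamma} \|\beta' - \beta\|_2^2 , \\
% Adding the first and third lines
(L + Q_\lambda)(\beta') &\ge (L + Q_\lambda)(\beta)
            + \nabla (L + Q_\lambda)(\beta)^\top (\beta' - \beta)
            + \frac{\rho - 1/\gamma}{2} \|\beta' - \beta\|_2^2 .
\end{align*}
```

Under these assumptions $L + Q_\lambda$ is strongly convex along sparse directions whenever $\gamma > 1/\rho$, and adding the convex term $\lambda \|\beta\|_1$ preserves that.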

28 Key to Taming Nonconvexity. Local Convex Region: Sparse Set [equation]

29 Key to Taming Nonconvexity. Local Convex Region: Sparse Set. Key: Good Initialization and then

30 Key to Taming Nonconvexity. Walk in the Local Convex Region [equation]

33 Taming Nonconvexity. Good Initialization: Sparse Approximate KKT Condition with Precision [equation]

34 Approximate KKT Condition. [Figure: Exact Local Solution and Approx. Local Solution]
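A minimal sketch of how an approximate KKT condition for an $\ell_1$-penalized problem can be checked in practice. It assumes the common form "the smallest sup-norm of gradient plus $\lambda$ times a subgradient of $\|\beta\|_1$ is at most a tolerance"; the slides do not show the exact definition, and `kkt_residual`, `grad`, and `eps` are hypothetical names.

```python
def kkt_residual(grad, beta, lam):
    """Smallest sup-norm of grad + lam * xi over subgradients xi of ||beta||_1.

    grad: gradient of the smooth part (here meant as L + Q_lam) at beta.
    A value of 0 means beta is an exact stationary point.
    """
    worst = 0.0
    for g, b in zip(grad, beta):
        if b > 0:
            r = abs(g + lam)              # subgradient is +1
        elif b < 0:
            r = abs(g - lam)              # subgradient is -1
        else:
            r = max(abs(g) - lam, 0.0)    # subgradient free in [-1, 1]
        worst = max(worst, r)
    return worst


def is_approx_kkt(grad, beta, lam, eps):
    """True if beta satisfies the (assumed) approximate KKT condition to precision eps."""
    return kkt_residual(grad, beta, lam) <= eps
```

With `beta` all zeros the residual is `max(|grad_j| - lam, 0)` over coordinates, so zero passes the check exactly once `lam` is at least the largest gradient entry in absolute value.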

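36 Taming Nonconvexity. Is Zero a Good Initialization? Sparse Approximate KKT Condition For Large $\lambda$.

37 Taming Nonconvexity. Local Solution for Larger $\lambda$: Sparse Approximate KKT Condition For Slightly Smaller $\lambda$.

38 Taming Nonconvexity. Path Following: $\lambda_0 > \lambda_1 > \cdots > \lambda_t > \cdots > \lambda_N = \lambda_{\mathrm{tgt}}$. $\lambda_0$: Sufficiently Large, Zero is the Exact Solution. $\lambda_{\mathrm{tgt}}$: Target Regularization Parameter.

39 Taming Nonconvexity

40 Taming Nonconvexity. Path Following: $\lambda_0 > \lambda_1 > \cdots > \lambda_t > \cdots > \lambda_N = \lambda_{\mathrm{tgt}}$, [equation]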
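A minimal sketch of building a decreasing regularization sequence that starts where zero is an exact solution and ends at the target. It assumes a least-squares loss $L(\beta) = \frac{1}{2n}\|y - X\beta\|_2^2$, a penalty whose concave part has zero gradient at zero (true for MCP and SCAD), and a geometric decay with ratio `eta`; the decay rule and all names are assumptions, not taken from the slides.

```python
def lambda_path(X, y, lam_target, eta=0.9):
    """Decreasing sequence from lam_0 = ||grad L(0)||_inf down to lam_target.

    X: list of n rows, each a list of d floats; y: list of n floats.
    Geometric decay with ratio eta in (0, 1) is an illustrative choice.
    """
    n, d = len(y), len(X[0])
    # Gradient of the least-squares loss at beta = 0 is -X^T y / n.
    grad0 = [-sum(X[i][j] * y[i] for i in range(n)) / n for j in range(d)]
    lam = max(abs(g) for g in grad0)
    if lam <= lam_target:
        # Zero already satisfies the exact KKT condition at the target.
        return [lam_target]
    path = [lam]
    while lam * eta > lam_target:
        lam *= eta
        path.append(lam)
    path.append(lam_target)  # finish exactly at the target value
    return path
```

Each value on the returned path is meant to be solved in turn, warm-starting from the solution at the previous, slightly larger value.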

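41 Taming Nonconvexity. Path Following. For each $\lambda$ on the path: [update equation]

42 Taming Nonconvexity. Path Following. [update equation], annotated: Quadratic Approximation of $L(\beta) + Q_\lambda(\beta)$.

43 Taming Nonconvexity. $L(\beta) + P_\lambda(\beta) = \underbrace{L(\beta) + Q_\lambda(\beta)}_{\text{Strongly Convex with High Probability for Sparse } \beta} + \lambda \|\beta\|_1$, where $L$ is Strongly Convex and $Q_\lambda$ is Concave.

44 Taming Nonconvexity. Path Following. [update equation], annotated: Soft-thresholding.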
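Slides 41 to 44 describe an update built from a quadratic approximation of $L(\beta) + Q_\lambda(\beta)$ and solved by soft-thresholding. The sketch below is one standard way to realize that idea, a proximal gradient step, and is not claimed to be the authors' exact algorithm. It assumes a least-squares loss, the standard MCP concave part, a fixed step constant `step_L` no smaller than the Lipschitz constant of the smooth gradient, and a fixed iteration count in place of a stopping rule; all names are hypothetical.

```python
def soft_threshold(z, tau):
    """Closed-form minimizer of 0.5 * (b - z)**2 + tau * |b| over b."""
    if z > tau:
        return z - tau
    if z < -tau:
        return z + tau
    return 0.0


def mcp_concave_grad(b, lam, gamma=3.0):
    """Derivative of q_lam(b) = p_lam(b) - lam * |b| for the standard MCP."""
    if abs(b) <= gamma * lam:
        return -b / gamma
    return -lam if b > 0 else lam


def prox_grad_step(X, y, beta, lam, step_L, gamma=3.0):
    """One update: minimize the quadratic model of L + Q_lam plus lam * ||.||_1."""
    n, d = len(y), len(beta)
    resid = [sum(X[i][j] * beta[j] for j in range(d)) - y[i] for i in range(n)]
    new_beta = []
    for j in range(d):
        # Gradient of (1 / 2n) * ||y - X beta||^2 plus gradient of the concave part.
        g = sum(X[i][j] * resid[i] for i in range(n)) / n
        g += mcp_concave_grad(beta[j], lam, gamma)
        new_beta.append(soft_threshold(beta[j] - g / step_L, lam / step_L))
    return new_beta


def solve_path(X, y, lambdas, step_L, iters=200):
    """Warm-started pass over a decreasing list of lambdas, starting from zero."""
    beta = [0.0] * len(X[0])
    for lam in lambdas:
        for _ in range(iters):  # fixed count: a simplification of a real stopping rule
            beta = prox_grad_step(X, y, beta, lam, step_L)
    return beta
```

On a toy identity design, `solve_path([[1.0, 0.0], [0.0, 1.0]], [3.0, 0.1], [1.5, 0.75, 0.375, 0.2], step_L=0.5)` keeps the strong coordinate without shrinkage and sets the weak one to zero, which illustrates the bias correction attributed to nonconvex penalties earlier in the deck.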

45 Empirical Results. [Figure: True Positive and False Positive]
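For reference, true and false positives in support recovery are usually counted as below. This is a generic sketch; the function names and the zero tolerance `tol` are assumptions, and the slides do not state how the reported figures were computed.

```python
def support(beta, tol=0.0):
    """Indices of coefficients whose magnitude exceeds tol."""
    return {j for j, b in enumerate(beta) if abs(b) > tol}


def true_false_positives(beta_hat, beta_true, tol=0.0):
    """Return (true positives, false positives) of the estimated support."""
    s_hat = support(beta_hat, tol)
    s_true = support(beta_true)
    return len(s_hat & s_true), len(s_hat - s_true)
```

Exact support recovery corresponds to the true positive count equalling the size of the true support with zero false positives.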

46 Computational Results. 1. Iteration Complexity [equation]: Optimal, No Better First-order Method Even for Convex Problems. 2. Uniqueness of the Attained Local Solution. Is it Good?

47 Statistical Results. 1. Rate of Convergence: [equation], For Entire Regularization Path.

48 Statistical Results. 1. Rate of Convergence: [equation], For [equation]

49 Statistical Results. 1. Rate of Convergence: [equation], annotated: Oracle; Lasso; Much Sharper. For [equation]

50 Statistical Results. 2. Exact Support Recovery. Required Signal Strength: [equation]. Then [equation]

51 Statistical Results. 2. Exact Support Recovery. Required Signal Strength: [equation]. Weakest Requirement Possible. Empirical Results Explained.

53 Take Home Message. 1. Taming nonconvexity by exploring the statistical model; 2. Local convex region and smart initialization are the keys; 3. Nonconvex penalized M-estimators: Path-following solves all problems.
