The Projected Power Method: An Efficient Algorithm for Joint Alignment from Pairwise Differences
Yuxin Chen and Emmanuel Candès
Department of Statistics, Stanford University, Sep. 2016
Nonconvex optimization is everywhere

For instance, maximum likelihood estimation is nonconvex in numerous problems:

    maximize_x  l(x; y)    subject to  x ∈ S

Examples: matrix completion, phase retrieval, dictionary learning, blind deconvolution, robust PCA, ...
Recent flurry of research in nonconvex procedures

(Keshavan et al. '08; Netrapalli et al. '13; Candès et al. '14; Soltanolkotabi '14; Jain et al. '14; Sun et al. '14; Chen et al. '15; Cai et al. '15; Tu et al. '15; Sun et al. '15; White et al. '15; Li et al. '16; Yi et al. '16; Zhang et al. '16; Wang et al. '16; ...)

Nice geometry within a neighborhood around x (basin of attraction) suggests two-stage paradigms:
1. Start from an appropriate initial point
2. Proceed via some iterative updates
This talk: a discrete nonconvex problem
Joint alignment from pairwise differences

n unknown variables: x_1, ..., x_n
m possible states: x_i ∈ {1, 2, ..., m}
Measurements: pairwise differences

    y_{i,j} = x_i - x_j + η_{i,j}  (mod m),    i ≠ j,

where η_{i,j} denotes noise.

(Bandeira, Charikar, Singer, Zhu '13; Chen, Guibas, Huang '14)
e.g. random corruption model:

    y_{i,j} = x_i - x_j (mod m)    with prob. π_0
    y_{i,j} ~ Uniform(m)           else

where π_0 is the non-corruption rate.

Goal: recover {x_i} (up to a global offset)
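To make the measurement model concrete, here is a minimal simulation sketch of the random corruption model (function and variable names are mine, not from the talk):

```python
import numpy as np

def generate_measurements(x, m, pi0, rng):
    """y[i, j] = x[i] - x[j] (mod m) w.p. pi0; Uniform{0, ..., m-1} otherwise."""
    n = len(x)
    y = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if rng.random() < pi0:
                y[i, j] = (x[i] - x[j]) % m
            else:
                y[i, j] = rng.integers(m)
    return y

rng = np.random.default_rng(0)
x = rng.integers(5, size=20)                    # n = 20 variables, m = 5 states
y = generate_measurements(x, m=5, pi0=0.8, rng=rng)
```

Setting pi0 = 1 recovers the noiseless pairwise differences, which makes a handy sanity check.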
Motivation: multi-image alignment (computer vision/graphics)

Jointly align a collection of images/shapes of the same physical object; x_i is the angle of rotation associated with each shape.

Step 1: compute pairwise estimates of the relative angles of rotation
Step 2: aggregate this pairwise information for joint alignment
Maximum likelihood estimation (MLE)

    maximize_{x_i}  Σ_{i,j} l(x_i, x_j; y_{i,j})
    subject to      x_i ∈ {1, ..., m},  1 ≤ i ≤ n

The log-likelihood function l may be complicated, and the input space is discrete. Looks daunting.
Another look in lifted space

Encode each discrete variable as an indicator vector in a higher-dimensional space:

    x_i = 1  ⟺  x_i = e_1
    x_i = 2  ⟺  x_i = e_2
    ...
    x_i = j  ⟺  x_i = e_j
For each pairwise sample y_{i,j}, encode the log-likelihood l(x_i, x_j) by a matrix L_{i,j} ∈ R^{m×m}:

    [L_{i,j}]_{α,β} = l(x_i = α, x_j = β)

e.g. under the random corruption model,

    l(x_i, x_j) = log(π_0 + (1 - π_0)/m)   if x_i - x_j = y_{i,j}
    l(x_i, x_j) = log((1 - π_0)/m)         else

This enables the quadratic representation

    l(x_i, x_j) = x_i^T L_{i,j} x_j
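A sketch of one block L_{i,j} under the random corruption model, checking the quadratic representation against the two log-likelihood values above (helper names are illustrative, not from the talk):

```python
import numpy as np

def loglik_block(y_ij, m, pi0):
    """[L_ij]_{a,b} = l(x_i = a, x_j = b; y_ij), random corruption model."""
    hit = np.log(pi0 + (1 - pi0) / m)   # value when (a - b) mod m == y_ij
    miss = np.log((1 - pi0) / m)        # value otherwise
    a = np.arange(m)
    return np.where((a[:, None] - a[None, :]) % m == y_ij % m, hit, miss)

m, pi0 = 5, 0.8
L_ij = loglik_block(2, m, pi0)
e = np.eye(m)
# Quadratic representation: l(x_i = a, x_j = b) = e_a^T L_ij e_b
assert np.isclose(e[4] @ L_ij @ e[2], np.log(pi0 + (1 - pi0) / m))
```

Here states are indexed 0, ..., m-1 for convenience, whereas the slides use 1, ..., m.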
MLE is equivalent to a binary quadratic program

Collect the blocks into L ∈ R^{nm×nm}:

    L = [ L_{1,1} ... L_{1,n}
          ...
          L_{n,1} ... L_{n,n} ]

Then

    maximize  Σ_{i,j} l(x_i, x_j; y_{i,j})    subject to  x_i ∈ {1, ..., m}

becomes

    maximize_x  x^T L x    subject to  x = [x_1; ...; x_n],  x_i ∈ {e_1, ..., e_m}

This is essentially nonconvex constrained PCA.
How to solve nonconvex constrained PCA?

PCA:
    maximize_x  x^T L x    subject to  ||x|| = 1
Power method: for t = 1, 2, ...
    z^(t) = normalize(L z^(t-1))

Constrained PCA:
    maximize_x  x^T L x    subject to  x_i ∈ {e_1, ..., e_m}
Projected power method: for t = 1, 2, ...
    z^(t) = Project_{Δ^n}(μ L z^(t-1)),    μ: scaling factor
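For contrast with the projected variant, the classical power method alone can be sketched as follows (a minimal illustration, not code from the slides):

```python
import numpy as np

def power_method(L, iters=200, seed=0):
    """Power method: z <- L z / ||L z||, converging to the top eigenvector."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(L.shape[0])
    for _ in range(iters):
        z = L @ z
        z /= np.linalg.norm(z)
    return z

# Sanity check on a small symmetric matrix with a clear spectral gap
A = np.diag([3.0, 1.0, 0.5])
v = power_method(A)
assert np.isclose(abs(v[0]), 1.0, atol=1e-6)   # aligns with the top eigenvector e_1
```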
Projection onto the standard simplex

    z^(t) = Project_{Δ^n}(μ L z^(t-1))

Δ^n is the convex hull of the feasible set, i.e.

    Δ^n = { z = [z_i]_{1 ≤ i ≤ n} : ∀i, 1^T z_i = 1, z_i ≥ 0 },

and the projection is applied block-wise to μz^(t)_1, ..., μz^(t)_n.
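Euclidean projection onto the standard simplex {z : z ≥ 0, 1^T z = 1} admits a classical O(m log m) sort-based solution, which in PPM would be applied to each block μz_i^(t). A sketch assuming that standard algorithm (not code from the talk):

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto {z : z >= 0, sum(z) = 1} (sort-based)."""
    u = np.sort(v)[::-1]                      # sort in decreasing order
    css = np.cumsum(u)
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u + (1.0 - css) / k > 0)[0][-1]
    theta = (1.0 - css[rho]) / (rho + 1)      # shift so active coords sum to 1
    return np.maximum(v + theta, 0.0)

z = project_simplex(np.array([0.5, 1.2, -0.3]))
assert np.isclose(z.sum(), 1.0) and (z >= 0).all()
```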
Illustration

Projected power method: z^(t+1) = Project_{Δ^n}(μ L z^(t)). Since Lz is the gradient of (1/2) z^T L z, each step is a power iteration (an ascent step along the gradient) followed by a projection.
Initialization?

    L = E[L] (approximately low-rank) + (L - E[L])

Spectral initialization:
1. L̂ ← rank-m approximation of L
2. z^(0) ← Project_{Δ^n}(μ ẑ), where ẑ is a random column of L̂
Summary of the projected power method (PPM)

1. Spectral initialization
2. For t = 1, 2, ...:  z^(t) ← Project_{Δ^n}(μ L z^(t-1))
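The full pipeline might be sketched as follows on a small random-corruption instance. This is a simplified, hedged rendering of the method: the scaling μ, the handling of diagonal blocks, and the initialization details are my choices here, not the authors' exact recipe.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection onto {z : z >= 0, sum(z) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u + (1.0 - css) / k > 0)[0][-1]
    return np.maximum(v + (1.0 - css[rho]) / (rho + 1), 0.0)

def block_project(z, n, m):
    """Apply the simplex projection to each length-m block of z."""
    return np.concatenate([project_simplex(z[i*m:(i+1)*m]) for i in range(n)])

def ppm(L, n, m, mu=100.0, iters=100, seed=0):
    """Projected power method: spectral init, then z <- Proj(mu * L z)."""
    rng = np.random.default_rng(seed)
    U, s, Vt = np.linalg.svd(L)                  # rank-m approximation of L
    Lhat = (U[:, :m] * s[:m]) @ Vt[:m]
    z = block_project(mu * Lhat[:, rng.integers(n * m)], n, m)
    for _ in range(iters):
        z = block_project(mu * (L @ z), n, m)
    return z.reshape(n, m).argmax(axis=1)        # decode each block to a state

# Small random-corruption instance with strong signal
rng = np.random.default_rng(1)
n, m, pi0 = 20, 3, 0.9
x = rng.integers(m, size=n)
hit, miss = np.log(pi0 + (1 - pi0) / m), np.log((1 - pi0) / m)
L = np.zeros((n * m, n * m))
a = np.arange(m)
for i in range(n):
    for j in range(n):
        if i == j:
            continue
        y = (x[i] - x[j]) % m if rng.random() < pi0 else rng.integers(m)
        L[i*m:(i+1)*m, j*m:(j+1)*m] = np.where(
            (a[:, None] - a[None, :]) % m == y, hit, miss)
est = ppm(L, n, m)
# In this high-SNR regime, est should typically match x up to a global offset.
```

Note that the simplex projection is invariant to adding a constant to every entry of a block, so the negative log-likelihood values pose no difficulty.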
Random corruption model

    y_{i,j} = x_i - x_j (mod m)    with prob. π_0
    y_{i,j} ~ Uniform(m)           else

Theorem (Chen–Candès '16). Fix m > 0 and set μ ≍ 1/σ_2(L). With high probability, PPM recovers the truth exactly within O(log n) iterations if the signal-to-noise ratio (SNR) is not too small:

    π_0 > 2 √(log n / (mn))
Implications

PPM succeeds within O(log n) iterations if the non-corruption rate π_0 > 2 √(log n / (mn)).

- PPM succeeds even when most entries (a fraction as large as 1 - O(√(log n / n))) are corrupted
- Nearly linear-time algorithm
- Works for any initialization obeying ||z^(0) - x|| < 0.5 ||x||
Empirical misclassification rate

[Figure: misclassification rate (from 10% down to 0) as the number of variables n and the input corruption rate 1 - π_0 vary, with m = 3 and μ = 10/σ_2(L); the theoretical threshold is overlaid.]
More general noise models

    y_{i,j} = x_i - x_j + η_{i,j} (mod m),    where η_{i,j} ~ i.i.d. P_0

Under the hypothesis x_i - x_j = l, the distribution of y_{i,j} is P_l, a shifted copy of P_0 (e.g. P_0 for x_i - x_j = 0, P_1 for x_i - x_j = 1, ..., P_9 for x_i - x_j = 9). Distinguishability is governed by KL(P_0 ‖ P_1), ..., KL(P_0 ‖ P_9).

Theorem (Chen–Candès '16). Fix m > 0 and set μ ≍ 1/σ_2(L). Under mild conditions, PPM succeeds within O(log n) iterations with high probability, provided that

    KL_min := min_{1 ≤ l < m} KL(P_0 ‖ P_l) > 4 log n / n
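For the random corruption model, KL_min can be computed in closed form. The sketch below (my derivation, not from the slides) shows KL(P_0 ‖ P_l) = π_0 log(1 + π_0 m/(1 - π_0)), which behaves like π_0^2·m for small π_0; plugging that into KL_min > 4 log n / n recovers a √(log n/(mn))-type threshold on π_0.

```python
import numpy as np

def corruption_pmf(l, m, pi0):
    """P_l: distribution of y_{i,j} when x_i - x_j = l (mod m)."""
    p = np.full(m, (1 - pi0) / m)
    p[l % m] += pi0
    return p

def kl(p, q):
    """Discrete KL divergence KL(p || q)."""
    return float(np.sum(p * np.log(p / q)))

m, pi0 = 10, 0.1
P0 = corruption_pmf(0, m, pi0)
kl_min = min(kl(P0, corruption_pmf(l, m, pi0)) for l in range(1, m))
closed_form = pi0 * np.log(1 + pi0 * m / (1 - pi0))
assert np.isclose(kl_min, closed_form)   # every shift l != 0 gives the same KL
```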
Interpretation: why KL_min matters

Suppose x_1 = ... = x_n = 1. Then the peaks of E[L] reveal the ground truth (look at E[L_{i,j}]), and L ≈ E[L] whenever KL_min is sufficiently large.
Empirical misclassification rate: modified Gaussian noise

Modified Gaussian noise model:

    P{η_{i,j} = z} ∝ exp(-z²/(2σ²)),

where σ is the standard deviation of the Gaussian density.

[Figure: misclassification rate (from 1% down to 0) vs. the number of variables n, with m = 9 and μ = 20/σ_m(L); the theoretical threshold is overlaid.]
PPM is information-theoretically optimal

Theorem (Chen–Candès '16). Fix m > 0. No method achieves exact recovery if

    KL_min < 4 log n / n
Large-m case: random corruption model

(Singer '09; Wang & Singer '12; Bandeira et al. '14; Boumal '16; Liu et al. '16; Perry et al.)

    y_{i,j} = x_i - x_j    with prob. π_0
    y_{i,j} ~ Unif(m)      else

Theorem (Chen–Candès '16). Suppose log n ≲ m ≲ poly(n). PPM succeeds if π_0 ≳ 1/√n.

- Spiky model: as m → ∞, the model converges to x_i ∈ [0, 1) with

      y_{i,j} = x_i - x_j    with prob. π_0
      y_{i,j} ~ Unif(0, 1)   else

  PPM succeeds even if a dominant fraction 1 - O(1/√n) of the inputs are corrupted.
- Smooth noise model: if m ≫ √n, PPM recovers each x_i ∈ [0, 1) up to a resolution of 1/m ≪ 1/√n.
Joint shape alignment: Chair dataset from ShapeNet

- 20 representative shapes (out of 50); extra noise is added to each point of the shapes to make the task more challenging
- pairwise cost l_{i,j}(x_i, x_j): average nearest-neighbor squared distance
- [Figure: the aligned shapes]
Joint shape alignment: angular estimation errors

[Figure: cumulative distribution of the absolute angular estimation error, projected power method (PPM) vs. semidefinite relaxation (SDP).]

Runtime: projected power method, 2.4 sec.
Joint graph matching: CMU House dataset

- 111 images of a toy house (thumbnails shown)
- [Figure: input matches vs. optimized matches on 3 representative images]
Dixon imaging in body MRI (Zhang et al., Magn. Reson. Med., 2016)

- 2 phasor candidates for the field inhomogeneity at each voxel (candidate 1, candidate 2)
- recovery by optimizing a pairwise cost function:

      maximize  Σ_{i,j} l(x_i, x_j)    subject to  x_i ∈ {1, 2}

- [Figure: representative cases of water signal recovery, commercial software vs. projected power method]
Things I have not talked about

1. General noise model with large m
2. Incomplete data
Concluding remarks

- A new approach to discrete assignment problems
- Finds the MLE in suitable regimes
- Computationally efficient

Paper: "The projected power method: an efficient algorithm for joint alignment from pairwise differences," Y. Chen and E. Candès, 2016.
Recovering any low-rank matrix, provably Rachel Ward University of Texas at Austin October, 2014 Joint work with Yudong Chen (U.C. Berkeley), Srinadh Bhojanapalli and Sujay Sanghavi (U.T. Austin) Matrix
More informationInderjit Dhillon The University of Texas at Austin
Inderjit Dhillon The University of Texas at Austin ( Universidad Carlos III de Madrid; 15 th June, 2012) (Based on joint work with J. Brickell, S. Sra, J. Tropp) Introduction 2 / 29 Notion of distance
More informationPhase retrieval with random Gaussian sensing vectors by alternating projections
Phase retrieval with random Gaussian sensing vectors by alternating projections arxiv:609.03088v [math.st] 0 Sep 06 Irène Waldspurger Abstract We consider a phase retrieval problem, where we want to reconstruct
More informationFast and Robust Phase Retrieval
Fast and Robust Phase Retrieval Aditya Viswanathan aditya@math.msu.edu CCAM Lunch Seminar Purdue University April 18 2014 0 / 27 Joint work with Yang Wang Mark Iwen Research supported in part by National
More informationCS598 Machine Learning in Computational Biology (Lecture 5: Matrix - part 2) Professor Jian Peng Teaching Assistant: Rongda Zhu
CS598 Machine Learning in Computational Biology (Lecture 5: Matrix - part 2) Professor Jian Peng Teaching Assistant: Rongda Zhu Feature engineering is hard 1. Extract informative features from domain knowledge
More informationOn the local stability of semidefinite relaxations
On the local stability of semidefinite relaxations Diego Cifuentes Department of Mathematics Massachusetts Institute of Technology Joint work with Sameer Agarwal (Google), Pablo Parrilo (MIT), Rekha Thomas
More informationOverfitting, Bias / Variance Analysis
Overfitting, Bias / Variance Analysis Professor Ameet Talwalkar Professor Ameet Talwalkar CS260 Machine Learning Algorithms February 8, 207 / 40 Outline Administration 2 Review of last lecture 3 Basic
More informationStructured matrix factorizations. Example: Eigenfaces
Structured matrix factorizations Example: Eigenfaces An extremely large variety of interesting and important problems in machine learning can be formulated as: Given a matrix, find a matrix and a matrix
More informationProbabilistic Low-Rank Matrix Completion with Adaptive Spectral Regularization Algorithms
Probabilistic Low-Rank Matrix Completion with Adaptive Spectral Regularization Algorithms Adrien Todeschini Inria Bordeaux JdS 2014, Rennes Aug. 2014 Joint work with François Caron (Univ. Oxford), Marie
More informationThe role of dimensionality reduction in classification
The role of dimensionality reduction in classification Weiran Wang and Miguel Á. Carreira-Perpiñán Electrical Engineering and Computer Science University of California, Merced http://eecs.ucmerced.edu
More informationSome Recent Advances. in Non-convex Optimization. Purushottam Kar IIT KANPUR
Some Recent Advances in Non-convex Optimization Purushottam Kar IIT KANPUR Outline of the Talk Recap of Convex Optimization Why Non-convex Optimization? Non-convex Optimization: A Brief Introduction Robust
More informationHigh-Rank Matrix Completion and Subspace Clustering with Missing Data
High-Rank Matrix Completion and Subspace Clustering with Missing Data Authors: Brian Eriksson, Laura Balzano and Robert Nowak Presentation by Wang Yuxiang 1 About the authors
More informationLow-rank Matrix Completion from Noisy Entries
Low-rank Matrix Completion from Noisy Entries Sewoong Oh Joint work with Raghunandan Keshavan and Andrea Montanari Stanford University Forty-Seventh Allerton Conference October 1, 2009 R.Keshavan, S.Oh,
More informationRobust Camera Location Estimation by Convex Programming
Robust Camera Location Estimation by Convex Programming Onur Özyeşil and Amit Singer INTECH Investment Management LLC 1 PACM and Department of Mathematics, Princeton University SIAM IS 2016 05/24/2016,
More informationA new method on deterministic construction of the measurement matrix in compressed sensing
A new method on deterministic construction of the measurement matrix in compressed sensing Qun Mo 1 arxiv:1503.01250v1 [cs.it] 4 Mar 2015 Abstract Construction on the measurement matrix A is a central
More informationCommunication-efficient and Differentially-private Distributed SGD
1/36 Communication-efficient and Differentially-private Distributed SGD Ananda Theertha Suresh with Naman Agarwal, Felix X. Yu Sanjiv Kumar, H. Brendan McMahan Google Research November 16, 2018 2/36 Outline
More informationThresholds for the Recovery of Sparse Solutions via L1 Minimization
Thresholds for the Recovery of Sparse Solutions via L Minimization David L. Donoho Department of Statistics Stanford University 39 Serra Mall, Sequoia Hall Stanford, CA 9435-465 Email: donoho@stanford.edu
More informationAdaptive Affinity Matrix for Unsupervised Metric Learning
Adaptive Affinity Matrix for Unsupervised Metric Learning Yaoyi Li, Junxuan Chen, Yiru Zhao and Hongtao Lu Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering,
More informationIntroduction to graphical models: Lecture III
Introduction to graphical models: Lecture III Martin Wainwright UC Berkeley Departments of Statistics, and EECS Martin Wainwright (UC Berkeley) Some introductory lectures January 2013 1 / 25 Introduction
More informationPHASE RETRIEVAL OF SPARSE SIGNALS FROM MAGNITUDE INFORMATION. A Thesis MELTEM APAYDIN
PHASE RETRIEVAL OF SPARSE SIGNALS FROM MAGNITUDE INFORMATION A Thesis by MELTEM APAYDIN Submitted to the Office of Graduate and Professional Studies of Texas A&M University in partial fulfillment of the
More informationSum-of-Squares Method, Tensor Decomposition, Dictionary Learning
Sum-of-Squares Method, Tensor Decomposition, Dictionary Learning David Steurer Cornell Approximation Algorithms and Hardness, Banff, August 2014 for many problems (e.g., all UG-hard ones): better guarantees
More informationProbabilistic Graphical Models
School of Computer Science Probabilistic Graphical Models Gaussian graphical models and Ising models: modeling networks Eric Xing Lecture 0, February 7, 04 Reading: See class website Eric Xing @ CMU, 005-04
More informationAdaptive one-bit matrix completion
Adaptive one-bit matrix completion Joseph Salmon Télécom Paristech, Institut Mines-Télécom Joint work with Jean Lafond (Télécom Paristech) Olga Klopp (Crest / MODAL X, Université Paris Ouest) Éric Moulines
More informationBinary matrix completion
Binary matrix completion Yaniv Plan University of Michigan SAMSI, LDHD workshop, 2013 Joint work with (a) Mark Davenport (b) Ewout van den Berg (c) Mary Wootters Yaniv Plan (U. Mich.) Binary matrix completion
More informationLecture Notes 10: Matrix Factorization
Optimization-based data analysis Fall 207 Lecture Notes 0: Matrix Factorization Low-rank models. Rank- model Consider the problem of modeling a quantity y[i, j] that depends on two indices i and j. To
More informationBregman Divergences for Data Mining Meta-Algorithms
p.1/?? Bregman Divergences for Data Mining Meta-Algorithms Joydeep Ghosh University of Texas at Austin ghosh@ece.utexas.edu Reflects joint work with Arindam Banerjee, Srujana Merugu, Inderjit Dhillon,
More informationSolving DC Programs that Promote Group 1-Sparsity
Solving DC Programs that Promote Group 1-Sparsity Ernie Esser Contains joint work with Xiaoqun Zhang, Yifei Lou and Jack Xin SIAM Conference on Imaging Science Hong Kong Baptist University May 14 2014
More informationNon-convex Robust PCA: Provable Bounds
Non-convex Robust PCA: Provable Bounds Anima Anandkumar U.C. Irvine Joint work with Praneeth Netrapalli, U.N. Niranjan, Prateek Jain and Sujay Sanghavi. Learning with Big Data High Dimensional Regime Missing
More informationImplicit Regularization in Nonconvex Statistical Estimation: Gradient Descent Converges Linearly for Phase Retrieval and Matrix Completion
: Gradient Descent Converges Linearly for Phase Retrieval and Matrix Completion Cong Ma 1 Kaizheng Wang 1 Yuejie Chi Yuxin Chen 3 Abstract Recent years have seen a flurry of activities in designing provably
More informationNeural Network Training
Neural Network Training Sargur Srihari Topics in Network Training 0. Neural network parameters Probabilistic problem formulation Specifying the activation and error functions for Regression Binary classification
More informationLinear Models for Regression CS534
Linear Models for Regression CS534 Example Regression Problems Predict housing price based on House size, lot size, Location, # of rooms Predict stock price based on Price history of the past month Predict
More informationECE G: Special Topics in Signal Processing: Sparsity, Structure, and Inference
ECE 18-898G: Special Topics in Signal Processing: Sparsity, Structure, and Inference Low-rank matrix recovery via nonconvex optimization Yuejie Chi Department of Electrical and Computer Engineering Spring
More informationSymmetric Factorization for Nonconvex Optimization
Symmetric Factorization for Nonconvex Optimization Qinqing Zheng February 24, 2017 1 Overview A growing body of recent research is shedding new light on the role of nonconvex optimization for tackling
More informationLinear dimensionality reduction for data analysis
Linear dimensionality reduction for data analysis Nicolas Gillis Joint work with Robert Luce, François Glineur, Stephen Vavasis, Robert Plemmons, Gabriella Casalino The setup Dimensionality reduction for
More informationBias-free Sparse Regression with Guaranteed Consistency
Bias-free Sparse Regression with Guaranteed Consistency Wotao Yin (UCLA Math) joint with: Stanley Osher, Ming Yan (UCLA) Feng Ruan, Jiechao Xiong, Yuan Yao (Peking U) UC Riverside, STATS Department March
More informationGlobally Convergent Levenberg-Marquardt Method For Phase Retrieval
Globally Convergent Levenberg-Marquardt Method For Phase Retrieval Zaiwen Wen Beijing International Center For Mathematical Research Peking University Thanks: Chao Ma, Xin Liu 1/38 Outline 1 Introduction
More informationSpatially Smoothed Kernel Density Estimation via Generalized Empirical Likelihood
Spatially Smoothed Kernel Density Estimation via Generalized Empirical Likelihood Kuangyu Wen & Ximing Wu Texas A&M University Info-Metrics Institute Conference: Recent Innovations in Info-Metrics October
More information10-725/36-725: Convex Optimization Prerequisite Topics
10-725/36-725: Convex Optimization Prerequisite Topics February 3, 2015 This is meant to be a brief, informal refresher of some topics that will form building blocks in this course. The content of the
More information