COR-OPT Seminar Reading List Sp 18
Damek Davis
January 28, 2018

References

[1] S. Tu, R. Boczar, M. Simchowitz, M. Soltanolkotabi, and B. Recht. Low-rank Solutions of Linear Matrix Equations via Procrustes Flow. In: arXiv [math] (July 13, 2015).
[2] R. Meka, P. Jain, and I. S. Dhillon. Guaranteed Rank Minimization via Singular Value Projection. In: arXiv [cs, math] (Sept. 30, 2009).
[3] R. Ge, C. Jin, and Y. Zheng. No Spurious Local Minima in Nonconvex Low Rank Problems: A Unified Geometric Analysis. In: arXiv [cs, math, stat] (Apr. 3, 2017).
[4] S. Bhojanapalli, B. Neyshabur, and N. Srebro. Global Optimality of Local Search for Low Rank Matrix Recovery. In: arXiv [cs, math, stat] (May 23, 2016).
[5] R. Ge, J. D. Lee, and T. Ma. Matrix Completion has No Spurious Local Minimum. In: arXiv [cs, stat] (May 23, 2016).
[6] R. Ge and T. Ma. On the Optimization Landscape of Tensor Decompositions. In: arXiv [cs, math, stat] (June 17, 2017).
[7] R. Ge, F. Huang, C. Jin, and Y. Yuan. Escaping From Saddle Points - Online Stochastic Gradient for Tensor Decomposition. In: arXiv [cs, math, stat] (Mar. 6, 2015).
[8] A. S. Bandeira, N. Boumal, and V. Voroninski. On the low-rank approach for semidefinite programs arising in synchronization and community detection. In: arXiv [math] (Feb. 14, 2016).
[9] C. Kim, A. S. Bandeira, and M. X. Goemans. Community Detection in Hypergraphs, Spiked Tensor Models, and Sum-of-Squares. In: arXiv [cs, math, stat] (May 8, 2017).
[10] E. Abbe. Community detection and stochastic block models: recent developments. In: arXiv [cs, math, stat] (Mar. 29, 2017).
[11] K. Kawaguchi. Deep Learning without Poor Local Minima. In: arXiv [cs, math, stat] (May 23, 2016).
[12] M. Hardt and T. Ma. Identity Matters in Deep Learning. In: arXiv [cs, stat] (Nov. 13, 2016).
[13] M. Hardt, T. Ma, and B. Recht. Gradient Descent Learns Linear Dynamical Systems. In: arXiv [cs, math, stat] (Sept. 16, 2016).
[14] A. S. Bandeira, N. Boumal, and A. Singer. Tightness of the maximum likelihood semidefinite relaxation for angular synchronization. In: Mathematical Programming (May 2017).
[15] A. Bandeira, P. Rigollet, and J. Weed. Optimal rates of estimation for multireference alignment. In: arXiv [math, stat] (Feb. 27, 2017).
[16] D. Boob and G. Lan. Theoretical properties of the global optimizer of two layer neural network. In: arXiv [cs] (Oct. 30, 2017).
[17] D. Boob and G. Lan. Theoretical properties of the global optimizer of two layer neural network. In: arXiv [cs] (Oct. 30, 2017).
[18] L. Wang and A. Singer. Exact and Stable Recovery of Rotations for Robust Synchronization. In: arXiv [cs, math] (Nov. 11, 2012).
[19] R. Vershynin. Estimation in high dimensions: a geometric perspective. In: arXiv [math, stat] (May 20, 2014).
[20] H. Liu, M.-C. Yue, and A. M.-C. So. On the Estimation Performance and Convergence Rate of the Generalized Power Method for Phase Synchronization. In: arXiv [math] (Mar. 1, 2016).
[21] N. Boumal. Nonconvex phase synchronization. In: arXiv [math] (Jan. 22, 2016).
[22] Y. Zhong and N. Boumal. Near-optimal bounds for phase synchronization. In: arXiv [math] (Mar. 20, 2017).
[23] V. Roulet, N. Boumal, and A. d'Aspremont. Computational Complexity versus Statistical Performance on Sparse Recovery Problems. In: arXiv [math] (June 10, 2015).
[24] M. Simchowitz, A. E. Alaoui, and B. Recht. On the Gap Between Strict-Saddles and True Convexity: An Omega(log d) Lower Bound for Eigenvector Approximation. In: arXiv [cs, math, stat] (Apr. 14, 2017).
[25] Y. Chen and E. Candes. The Projected Power Method: An Efficient Algorithm for Joint Alignment from Pairwise Differences. In: arXiv [cs, math, stat] (Sept. 19, 2016).
[26] P.-L. Loh. Statistical consistency and asymptotic normality for high-dimensional robust M-estimators. In: arXiv [cs, math, stat] (Jan. 1, 2015).
[27] G. B. Arous, S. Mei, A. Montanari, and M. Nica. The landscape of the spiked tensor model. In: arXiv [math, stat] (Nov. 15, 2017).
[28] E. Abbe, L. Massoulie, A. Montanari, A. Sly, and N. Srivastava. Group Synchronization on Grids. In: arXiv [cs, math, stat] (June 26, 2017).
[29] S. S. Du, J. D. Lee, Y. Tian, B. Poczos, and A. Singh. Gradient Descent Learns One-hidden-layer CNN: Don't be Afraid of Spurious Local Minima. In: arXiv [cs, math, stat] (Dec. 3, 2017).
[30] Q. Qu, Y. Zhang, Y. C. Eldar, and J. Wright. Convolutional Phase Retrieval via Gradient Descent. In: arXiv [cs, math, stat] (Dec. 3, 2017).
[31] J. C. Duchi and F. Ruan. Solving (most) of a set of quadratic equalities: Composite optimization for robust phase retrieval. In: arXiv [cs, math, stat] (May 5, 2017).
[32] E. Abbe, J. Fan, K. Wang, and Y. Zhong. Entrywise Eigenvector Analysis of Random Matrices with Low Expected Rank. In: arXiv [math, stat] (Sept. 27, 2017).
[33] Y. Chen, Y. Chi, and A. Goldsmith. Exact and Stable Covariance Estimation from Quadratic Sampling via Convex Programming. In: arXiv [cs, math, stat] (Oct. 2, 2013).
[34] J. Tang, F. Bach, M. Golbabaee, and M. Davies. Structure-Adaptive, Variance-Reduced, and Accelerated Stochastic Optimization. In: arXiv [math] (Dec. 8, 2017).
[35] M. Soltanolkotabi. Structured signal recovery from quadratic measurements: Breaking sample complexity barriers via nonconvex optimization. In: arXiv [cs, math, stat] (Feb. 20, 2017) (visited on 12/30/2017).
[36] A. Ahmed, B. Recht, and J. Romberg. Blind Deconvolution using Convex Programming. In: arXiv [cs, math] (Nov. 21, 2012) (visited on 01/10/2018).
[37] H. Namkoong and J. C. Duchi. Variance-based Regularization with Convex Objectives. In: Advances in Neural Information Processing Systems. 2017.
[38] G. Liu, Q. Liu, and X. Yuan. A New Theory for Matrix Completion. In: Advances in Neural Information Processing Systems. 2017.
[39] N. Chatterji and P. L. Bartlett. Alternating minimization for dictionary learning with random initialization. In: Advances in Neural Information Processing Systems. 2017.
[40] Z. Artstein and R. J.-B. Wets. Consistency of minimizers and the SLLN for stochastic programs. IBM Thomas J. Watson Research Division.
[41] S. Goel and A. Klivans. Eigenvalue decay implies polynomial-time learnability for neural networks. In: Advances in Neural Information Processing Systems. 2017.
[42] K. Hayashi and Y. Yoshida. Fitting Low-Rank Tensors in Constant Time. In: Advances in Neural Information Processing Systems. 2017.
[43] C. Ma, K. Wang, Y. Chi, and Y. Chen. Implicit regularization in nonconvex statistical estimation: Gradient descent converges linearly for phase retrieval, matrix completion and blind deconvolution. In: arXiv preprint (2017).
[44] V. I. Norkin and R. J.-B. Wets. Law of small numbers as concentration inequalities for sums of independent random sets and random set valued mappings. In: The Association of Lithuanian Serials, July 3, 2012. url: http://www.moksloperiodika.lt/STOPROG_2012/abstract/017.html (visited on 01/17/2018).
[45] M. Soltanolkotabi. Learning ReLUs via Gradient Descent. In: arXiv preprint (2017).
[46] S. S. Du, Y. Wang, and A. Singh. On the Power of Truncated SVD for General High-rank Matrix Estimation Problems. In: arXiv preprint (2017).
[47] G. Wang, G. B. Giannakis, Y. Saad, and J. Chen. Solving Almost all Systems of Random Quadratic Equations. In: arXiv preprint (2017).
[48] X. Huang, Z. Liang, C. Bajaj, and Q. Huang. Translation Synchronization via Truncated Least Squares. In: Advances in Neural Information Processing Systems. 2017.
[49] D. Cohen, Y. C. Eldar, and G. Leus. Universal lower bounds on sampling rates for covariance estimation. In: Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on. IEEE, 2015.
[50] S. Mei, Y. Bai, and A. Montanari. The Landscape of Empirical Risk for Non-convex Losses. In: arXiv [stat] (July 21, 2016) (visited on 01/17/2018).
More informationLearning with stochastic proximal gradient
Learning with stochastic proximal gradient Lorenzo Rosasco DIBRIS, Università di Genova Via Dodecaneso, 35 16146 Genova, Italy lrosasco@mit.edu Silvia Villa, Băng Công Vũ Laboratory for Computational and
More informationStochastic Optimization Methods for Machine Learning. Jorge Nocedal
Stochastic Optimization Methods for Machine Learning Jorge Nocedal Northwestern University SIAM CSE, March 2017 1 Collaborators Richard Byrd R. Bollagragada N. Keskar University of Colorado Northwestern
More informationSparse Solutions of an Undetermined Linear System
1 Sparse Solutions of an Undetermined Linear System Maddullah Almerdasy New York University Tandon School of Engineering arxiv:1702.07096v1 [math.oc] 23 Feb 2017 Abstract This work proposes a research
More informationA random perturbation approach to some stochastic approximation algorithms in optimization.
A random perturbation approach to some stochastic approximation algorithms in optimization. Wenqing Hu. 1 (Presentation based on joint works with Chris Junchi Li 2, Weijie Su 3, Haoyi Xiong 4.) 1. Department
More informationAsynchronous Mini-Batch Gradient Descent with Variance Reduction for Non-Convex Optimization
Proceedings of the hirty-first AAAI Conference on Artificial Intelligence (AAAI-7) Asynchronous Mini-Batch Gradient Descent with Variance Reduction for Non-Convex Optimization Zhouyuan Huo Dept. of Computer
More informationarxiv: v2 [math.oc] 5 Nov 2017
Gradient Descent Can Take Exponential Time to Escape Saddle Points arxiv:175.1412v2 [math.oc] 5 Nov 217 Simon S. Du Carnegie Mellon University ssdu@cs.cmu.edu Jason D. Lee University of Southern California
More informationECS289: Scalable Machine Learning
ECS289: Scalable Machine Learning Cho-Jui Hsieh UC Davis Sept 29, 2016 Outline Convex vs Nonconvex Functions Coordinate Descent Gradient Descent Newton s method Stochastic Gradient Descent Numerical Optimization
More informationCSC 576: Variants of Sparse Learning
CSC 576: Variants of Sparse Learning Ji Liu Department of Computer Science, University of Rochester October 27, 205 Introduction Our previous note basically suggests using l norm to enforce sparsity in
More informationSpectral Method and Regularized MLE Are Both Optimal for Top-K Ranking
Spectral Method and Regularized MLE Are Both Optimal for Top-K Ranking Yuxin Chen Electrical Engineering, Princeton University Joint work with Jianqing Fan, Cong Ma and Kaizheng Wang Ranking A fundamental
More informationCOMPRESSED Sensing (CS) is a method to recover a
1 Sample Complexity of Total Variation Minimization Sajad Daei, Farzan Haddadi, Arash Amini Abstract This work considers the use of Total Variation (TV) minimization in the recovery of a given gradient
More informationTensor network vs Machine learning. Song Cheng ( 程嵩 ) IOP, CAS
Tensor network vs Machine learning Song Cheng ( 程嵩 ) IOP, CAS physichengsong@iphy.ac.cn Outline Tensor network in a nutshell TN concepts in machine learning TN methods in machine learning Outline Tensor
More informationEstimators based on non-convex programs: Statistical and computational guarantees
Estimators based on non-convex programs: Statistical and computational guarantees Martin Wainwright UC Berkeley Statistics and EECS Based on joint work with: Po-Ling Loh (UC Berkeley) Martin Wainwright
More information