COR-OPT Seminar Reading List, Spring 2018
Damek Davis
January 28, 2018

References

[1] S. Tu, R. Boczar, M. Simchowitz, M. Soltanolkotabi, and B. Recht. Low-rank Solutions of Linear Matrix Equations via Procrustes Flow. In: arXiv:1507.03566 [math] (July 13, 2015). arXiv: 1507.03566. url: http://arxiv.org/abs/1507.03566.

[2] R. Meka, P. Jain, and I. S. Dhillon. Guaranteed Rank Minimization via Singular Value Projection. In: arXiv:0909.5457 [cs, math] (Sept. 30, 2009). arXiv: 0909.5457. url: http://arxiv.org/abs/0909.5457.

[3] R. Ge, C. Jin, and Y. Zheng. No Spurious Local Minima in Nonconvex Low Rank Problems: A Unified Geometric Analysis. In: arXiv:1704.00708 [cs, math, stat] (Apr. 3, 2017). arXiv: 1704.00708. url: http://arxiv.org/abs/1704.00708.

[4] S. Bhojanapalli, B. Neyshabur, and N. Srebro. Global Optimality of Local Search for Low Rank Matrix Recovery. In: arXiv:1605.07221 [cs, math, stat] (May 23, 2016). arXiv: 1605.07221. url: http://arxiv.org/abs/1605.07221.

[5] R. Ge, J. D. Lee, and T. Ma. Matrix Completion has No Spurious Local Minimum. In: arXiv:1605.07272 [cs, stat] (May 23, 2016). arXiv: 1605.07272. url: http://arxiv.org/abs/1605.07272.

[6] R. Ge and T. Ma. On the Optimization Landscape of Tensor Decompositions. In: arXiv:1706.05598 [cs, math, stat] (June 17, 2017). arXiv: 1706.05598. url: http://arxiv.org/abs/1706.05598.

[7] R. Ge, F. Huang, C. Jin, and Y. Yuan. Escaping From Saddle Points – Online Stochastic Gradient for Tensor Decomposition. In: arXiv:1503.02101 [cs, math, stat] (Mar. 6, 2015). arXiv: 1503.02101. url: http://arxiv.org/abs/1503.02101.

[8] A. S. Bandeira, N. Boumal, and V. Voroninski. On the low-rank approach for semidefinite programs arising in synchronization and community detection. In: arXiv:1602.04426 [math] (Feb. 14, 2016). arXiv: 1602.04426. url: http://arxiv.org/abs/1602.04426.
[9] C. Kim, A. S. Bandeira, and M. X. Goemans. Community Detection in Hypergraphs, Spiked Tensor Models, and Sum-of-Squares. In: arXiv:1705.02973 [cs, math, stat] (May 8, 2017). arXiv: 1705.02973. url: http://arxiv.org/abs/1705.02973.

[10] E. Abbe. Community detection and stochastic block models: recent developments. In: arXiv:1703.10146 [cs, math, stat] (Mar. 29, 2017). arXiv: 1703.10146. url: http://arxiv.org/abs/1703.10146.

[11] K. Kawaguchi. Deep Learning without Poor Local Minima. In: arXiv:1605.07110 [cs, math, stat] (May 23, 2016). arXiv: 1605.07110. url: http://arxiv.org/abs/1605.07110.

[12] M. Hardt and T. Ma. Identity Matters in Deep Learning. In: arXiv:1611.04231 [cs, stat] (Nov. 13, 2016). arXiv: 1611.04231. url: http://arxiv.org/abs/1611.04231.

[13] M. Hardt, T. Ma, and B. Recht. Gradient Descent Learns Linear Dynamical Systems. In: arXiv:1609.05191 [cs, math, stat] (Sept. 16, 2016). arXiv: 1609.05191. url: http://arxiv.org/abs/1609.05191.

[14] A. S. Bandeira, N. Boumal, and A. Singer. Tightness of the maximum likelihood semidefinite relaxation for angular synchronization. In: Mathematical Programming 163.1 (May 2017), pp. 145–167. issn: 0025-5610, 1436-4646. doi: 10.1007/s10107-016-1059-6. arXiv: 1411.3272. url: http://arxiv.org/abs/1411.3272.

[15] A. Bandeira, P. Rigollet, and J. Weed. Optimal rates of estimation for multi-reference alignment. In: arXiv:1702.08546 [math, stat] (Feb. 27, 2017). arXiv: 1702.08546. url: http://arxiv.org/abs/1702.08546.

[16] D. Boob and G. Lan. Theoretical properties of the global optimizer of two layer neural network. In: arXiv:1710.11241 [cs] (Oct. 30, 2017). arXiv: 1710.11241. url: http://arxiv.org/abs/1710.11241.

[17] L. Wang and A. Singer. Exact and Stable Recovery of Rotations for Robust Synchronization. In: arXiv:1211.2441 [cs, math] (Nov. 11, 2012). arXiv: 1211.2441. url: http://arxiv.org/abs/1211.2441.

[18] R. Vershynin. Estimation in high dimensions: a geometric perspective. In: arXiv:1405.5103 [math, stat] (May 20, 2014). arXiv: 1405.5103. url: http://arxiv.org/abs/1405.5103.

[19] H. Liu, M.-C. Yue, and A. M.-C. So. On the Estimation Performance and Convergence Rate of the Generalized Power Method for Phase Synchronization. In: arXiv:1603.00211 [math] (Mar. 1, 2016). arXiv: 1603.00211. url: http://arxiv.org/abs/1603.00211.
[20] N. Boumal. Nonconvex phase synchronization. In: arXiv:1601.06114 [math] (Jan. 22, 2016). arXiv: 1601.06114. url: http://arxiv.org/abs/1601.06114.

[21] Y. Zhong and N. Boumal. Near-optimal bounds for phase synchronization. In: arXiv:1703.06605 [math] (Mar. 20, 2017). arXiv: 1703.06605. url: http://arxiv.org/abs/1703.06605.

[22] V. Roulet, N. Boumal, and A. d'Aspremont. Computational Complexity versus Statistical Performance on Sparse Recovery Problems. In: arXiv:1506.03295 [math] (June 10, 2015). arXiv: 1506.03295. url: http://arxiv.org/abs/1506.03295.

[23] M. Simchowitz, A. E. Alaoui, and B. Recht. On the Gap Between Strict-Saddles and True Convexity: An Omega(log d) Lower Bound for Eigenvector Approximation. In: arXiv:1704.04548 [cs, math, stat] (Apr. 14, 2017). arXiv: 1704.04548. url: http://arxiv.org/abs/1704.04548.

[24] Y. Chen and E. Candès. The Projected Power Method: An Efficient Algorithm for Joint Alignment from Pairwise Differences. In: arXiv:1609.05820 [cs, math, stat] (Sept. 19, 2016). arXiv: 1609.05820. url: http://arxiv.org/abs/1609.05820.

[25] P.-L. Loh. Statistical consistency and asymptotic normality for high-dimensional robust M-estimators. In: arXiv:1501.00312 [cs, math, stat] (Jan. 1, 2015). arXiv: 1501.00312. url: http://arxiv.org/abs/1501.00312.

[26] G. B. Arous, S. Mei, A. Montanari, and M. Nica. The landscape of the spiked tensor model. In: arXiv:1711.05424 [math, stat] (Nov. 15, 2017). arXiv: 1711.05424. url: http://arxiv.org/abs/1711.05424.

[27] E. Abbe, L. Massoulié, A. Montanari, A. Sly, and N. Srivastava. Group Synchronization on Grids. In: arXiv:1706.08561 [cs, math, stat] (June 26, 2017). arXiv: 1706.08561. url: http://arxiv.org/abs/1706.08561.

[28] S. S. Du, J. D. Lee, Y. Tian, B. Poczos, and A. Singh. Gradient Descent Learns One-hidden-layer CNN: Don't be Afraid of Spurious Local Minima. In: arXiv:1712.00779 [cs, math, stat] (Dec. 3, 2017). arXiv: 1712.00779. url: http://arxiv.org/abs/1712.00779.

[29] Q. Qu, Y. Zhang, Y. C. Eldar, and J. Wright. Convolutional Phase Retrieval via Gradient Descent. In: arXiv:1712.00716 [cs, math, stat] (Dec. 3, 2017). arXiv: 1712.00716. url: http://arxiv.org/abs/1712.00716.

[30] J. C. Duchi and F. Ruan. Solving (most) of a set of quadratic equalities: Composite optimization for robust phase retrieval. In: arXiv:1705.02356 [cs, math, stat] (May 5, 2017). arXiv: 1705.02356. url: http://arxiv.org/abs/1705.02356.

[31] E. Abbe, J. Fan, K. Wang, and Y. Zhong. Entrywise Eigenvector Analysis of Random Matrices with Low Expected Rank. In: arXiv:1709.09565 [math, stat] (Sept. 27, 2017). arXiv: 1709.09565. url: http://arxiv.org/abs/1709.09565.
[32] Y. Chen, Y. Chi, and A. Goldsmith. Exact and Stable Covariance Estimation from Quadratic Sampling via Convex Programming. In: arXiv:1310.0807 [cs, math, stat] (Oct. 2, 2013). arXiv: 1310.0807. url: http://arxiv.org/abs/1310.0807.

[33] J. Tang, F. Bach, M. Golbabaee, and M. Davies. Structure-Adaptive, Variance-Reduced, and Accelerated Stochastic Optimization. In: arXiv:1712.03156 [math] (Dec. 8, 2017). arXiv: 1712.03156. url: http://arxiv.org/abs/1712.03156.

[34] M. Soltanolkotabi. Structured signal recovery from quadratic measurements: Breaking sample complexity barriers via nonconvex optimization. In: arXiv:1702.06175 [cs, math, stat] (Feb. 20, 2017). arXiv: 1702.06175. url: http://arxiv.org/abs/1702.06175.

[35] A. Ahmed, B. Recht, and J. Romberg. Blind Deconvolution using Convex Programming. In: arXiv:1211.5608 [cs, math] (Nov. 21, 2012). arXiv: 1211.5608. url: http://arxiv.org/abs/1211.5608.

[36] H. Namkoong and J. C. Duchi. Variance-based Regularization with Convex Objectives. In: Advances in Neural Information Processing Systems. 2017, pp. 2975–2984.

[37] G. Liu, Q. Liu, and X. Yuan. A New Theory for Matrix Completion. In: Advances in Neural Information Processing Systems. 2017, pp. 785–794.

[38] N. Chatterji and P. L. Bartlett. Alternating minimization for dictionary learning with random initialization. In: Advances in Neural Information Processing Systems. 2017, pp. 1994–2003.

[39] Z. Artstein and R. J.-B. Wets. Consistency of minimizers and the SLLN for stochastic programs. IBM Thomas J. Watson Research Division, 1994.

[40] S. Goel and A. Klivans. Eigenvalue decay implies polynomial-time learnability for neural networks. In: Advances in Neural Information Processing Systems. 2017, pp. 2189–2199.

[41] K. Hayashi and Y. Yoshida. Fitting Low-Rank Tensors in Constant Time. In: Advances in Neural Information Processing Systems. 2017, pp. 2470–2478.

[42] C. Ma, K. Wang, Y. Chi, and Y. Chen. Implicit regularization in nonconvex statistical estimation: Gradient descent converges linearly for phase retrieval, matrix completion and blind deconvolution. In: arXiv preprint arXiv:1711.10467 (2017).

[43] V. I. Norkin and R. J.-B. Wets. Law of small numbers as concentration inequalities for sums of independent random sets and random set-valued mappings. In: The Association of Lithuanian Serials, July 3, 2012, pp. 94–99. isbn: 978-609-95241-4-6. doi: 10.5200/stoprog.2012.17. url: http://www.moksloperiodika.lt/STOPROG_2012/abstract/017.html.
[44] M. Soltanolkotabi. Learning ReLUs via Gradient Descent. In: arXiv preprint arXiv:1705.04591 (2017).

[45] S. S. Du, Y. Wang, and A. Singh. On the Power of Truncated SVD for General High-rank Matrix Estimation Problems. In: arXiv preprint arXiv:1702.06861 (2017).

[46] G. Wang, G. B. Giannakis, Y. Saad, and J. Chen. Solving Almost all Systems of Random Quadratic Equations. In: arXiv preprint arXiv:1705.10407 (2017).

[47] X. Huang, Z. Liang, C. Bajaj, and Q. Huang. Translation Synchronization via Truncated Least Squares. In: Advances in Neural Information Processing Systems. 2017, pp. 1458–1467.

[48] D. Cohen, Y. C. Eldar, and G. Leus. Universal lower bounds on sampling rates for covariance estimation. In: Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on. IEEE, 2015, pp. 3272–3276.

[49] S. Mei, Y. Bai, and A. Montanari. The Landscape of Empirical Risk for Non-convex Losses. In: arXiv:1607.06534 [stat] (July 21, 2016). arXiv: 1607.06534. url: http://arxiv.org/abs/1607.06534.