IE598 Big Data Optimization Introduction Instructor: Niao He Jan 17, 2018 1
A little about me: Assistant Professor, ISE & CSL, UIUC, 2016; Ph.D. in Operations Research and M.S. in Computational Sci. & Eng., Georgia Tech, 2010–2015; B.S. in Mathematics, University of Sci. & Tech. of China, 2006–2010 2
A little about the course: Big Data Optimization. Explore modern optimization theories, algorithms, and big data applications. Emphasize a deep understanding of the structure of optimization problems and the computational complexity of numerical algorithms. Expose students to the frontier of research at the intersection of large-scale optimization and machine learning. 3
A little about you PhD or Master? ISE, ECE, Stat, CS? Took any optimization courses? 4
Course Details. Prerequisites: no formal ones, but the course assumes knowledge of linear algebra, real analysis, and probability theory, plus basic machine learning and optimization at the graduate level. Textbooks: none required, but reading the references listed on the syllabus is recommended: Ben-Tal & Nemirovski (2011), Nesterov (2003), Beck (2017), Bubeck (2015). 5
Course Details. Evaluation: groups of size 1~3. Paper Presentation (25%). Final Project (75%): Proposal (10%), 1~3 pages; Report (40%), 10~15 pages in the given format; Presentation (25%), 15~20 mins. Bonus (20 pts): a conference or journal submission. Deadlines: see syllabus. 6
Course Admin. Syllabus & Website: http://niaohe.ise.illinois.edu/ie598/ (pwd: spring2018). Where to get help: no TAs; Email: niaohe@illinois.edu with [IE 598] in the subject line; Office Location: 211 Transport Building; Office Hours: Mon. 3:00-4:00 or by appointment via email. 7
Introduction 8
Starts with the buzzword 9
Era of Big Data Big data heat in academia 10
Era of Big Data Big data heat in industry LinkedIn: 48,000+ Data Scientist jobs in United States 11
More than a buzzword. Big data revolution in various areas: robotics and autonomous cars, natural language processing, computer vision, healthcare, finance, environment, lifestyle, aerospace. 12
How to do data analysis? Key Steps Pose a problem Collect data Pre-process and clean data Formulate a mathematical model Find a solution Evaluate and interpret the results 13
What is Optimization? Find the optimal solution that minimizes or maximizes an objective function subject to constraints. 14
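In symbols, a generic template for such a problem might read as follows. The names $f$, $g_i$, $d$, and $m$ are illustrative and not taken from the slide; a maximization problem fits the same template by minimizing $-f$.

```latex
\min_{x \in \mathbb{R}^d} \; f(x)
\quad \text{subject to} \quad g_i(x) \le 0, \quad i = 1, \dots, m
```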
Why do we care? Optimization lies at the heart of many fields, especially machine learning. Finance: portfolio selection, asset pricing, etc. Electrical Engineering: signal and image processing, control and robotics, etc. Industrial Engineering: supply chain, revenue management, transportation, etc. Computer Science: machine learning, computer vision, etc. 15
Example: Portfolio Selection. Markowitz Mean-Variance Model, where $w$ is the vector of portfolio weights, $R$ is the vector of expected returns, $\Sigma$ is the covariance matrix of asset returns, and $\lambda > 0$ is the risk tolerance factor. 16
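The displayed formula did not survive in this text. One common way to write a mean-variance problem with these symbols is sketched below; the sign convention, the scaling of $\lambda$, and the budget constraint $\mathbf{1}^\top w = 1$ are assumptions and may differ from the original slide.

```latex
\min_{w} \; w^\top \Sigma w - \lambda R^\top w
\quad \text{subject to} \quad \mathbf{1}^\top w = 1
```

In this form a larger $\lambda$ puts more weight on expected return relative to risk, which matches its reading as a risk tolerance.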
Example: Image Denoising. Total Variation Denoising Model, where $x$ is the image matrix, $O$ is the noisy image, $P$ denotes the observed entries, and $\mathrm{TV}(x)$ is the total variation. 17
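The slide's formula is missing here. A typical total variation denoising objective is sketched below under two assumptions that are not in the source: $P$ is read as the operator that keeps only the observed entries, and $\lambda > 0$ is a trade-off weight introduced for illustration.

```latex
\min_{x} \; \tfrac{1}{2} \, \| P(x) - P(O) \|_F^2 + \lambda \, \mathrm{TV}(x)
```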
Example: Inventory. Newsvendor Model, where $q$ is the number of newspapers to be stocked, $D$ is the random demand, $c$ is the unit purchase price, and $p$ is the selling price. 18
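As a rough illustration, suppose the model maximizes the expected profit $\mathbb{E}[p \min(q, D)] - cq$ (an assumed form, since the slide's formula is not reproduced here). With demand samples in hand, the sample-average version is maximized at an empirical quantile of demand at level $(p - c)/p$. The sketch below checks this against a brute-force search; the demand values and prices are made up.

```python
import math

def expected_profit(q, demands, c, p):
    """Sample-average profit: revenue p*min(q, D) minus purchase cost c*q."""
    return sum(p * min(q, d) for d in demands) / len(demands) - c * q

def newsvendor_quantile(demands, c, p):
    """Smallest sample value q whose empirical CDF is at least (p - c) / p."""
    ratio = (p - c) / p
    ordered = sorted(demands)
    k = max(1, math.ceil(ratio * len(ordered)))
    return ordered[k - 1]

# Hypothetical demand samples and prices (not from the slides).
demands = [20, 35, 40, 50, 55, 60, 70, 80, 90, 100]
c, p = 1.0, 4.0

q_star = newsvendor_quantile(demands, c, p)
# Brute-force check over integer order quantities 0..120.
best_grid = max(range(0, 121), key=lambda q: expected_profit(q, demands, c, p))
print(q_star, best_grid, expected_profit(q_star, demands, c, p))
```

Both approaches agree on this data, which reflects the fact that the sample-average profit is concave and piecewise linear in $q$.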
Example: Regression. Linear Regression Model, where $x_i$ is the predictor vector (feature), $y_i$ is the response (label), $w$ is the vector of parameters to be learned, and $n$ is the number of data points. 19
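With these symbols, the usual least squares formulation would be the following. The $1/n$ scaling is an assumption, and the slide may have used a different constant.

```latex
\min_{w} \; \frac{1}{n} \sum_{i=1}^{n} \left( w^\top x_i - y_i \right)^2
```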
Example: Regularization. Ridge Regression Model: $\|w\|_2^2 = \sum_{j=1}^{d} w_j^2$ is the $L_2$-regularization. LASSO (Least Absolute Shrinkage and Selection Operator): $\|w\|_1 = \sum_{j=1}^{d} |w_j|$ is the $L_1$-regularization. 20
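To see how the two penalties behave differently, it can help to look at the scalar case, where both problems have closed-form minimizers. The sketch below assumes the objectives $\tfrac{1}{2}(w - z)^2 + \tfrac{\lambda}{2} w^2$ (ridge) and $\tfrac{1}{2}(w - z)^2 + \lambda |w|$ (LASSO); the scaling constants and the input values are illustrative choices, not taken from the slide.

```python
def ridge_shrink(z, lam):
    """Minimizer of 0.5*(w - z)**2 + 0.5*lam*w**2 over scalar w."""
    return z / (1.0 + lam)

def soft_threshold(z, lam):
    """Minimizer of 0.5*(w - z)**2 + lam*abs(w) over scalar w."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

# Hypothetical unregularized coefficients.
z_values = [3.0, 0.4, -0.2, -2.5]
lam = 0.5

ridge = [ridge_shrink(z, lam) for z in z_values]
lasso = [soft_threshold(z, lam) for z in z_values]
print("ridge:", ridge)
print("lasso:", lasso)
```

The ridge minimizer rescales every coefficient but never sets one exactly to zero, while the LASSO minimizer zeroes out any coefficient whose magnitude is at most $\lambda$. This is the usual explanation for why $L_1$-regularization produces sparse solutions.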
Example: Classification. Maximum Margin Classifier Model, where $x_i$ is the predictor vector (feature), $y_i \in \{-1, 1\}$ is the label/class, $w$ is the vector of parameters to be learned, and $n$ is the number of data points. 21
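For linearly separable data, the hard-margin problem is commonly written as below. This sketch omits a bias term because the slide lists only $w$ as a parameter; that omission is an assumption.

```latex
\min_{w} \; \tfrac{1}{2} \, \| w \|_2^2
\quad \text{subject to} \quad y_i \, w^\top x_i \ge 1, \quad i = 1, \dots, n
```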
Example: More Classification. Soft Margin SVM (support vector machine); Logistic Regression. 22
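Both models can be viewed as minimizing an average loss of the margin $y_i \, w^\top x_i$: the hinge loss $\max(0, 1 - m)$ for the soft margin SVM and the logistic loss $\log(1 + e^{-m})$ for logistic regression. The sketch below evaluates only these data-fit terms (a regularization term such as $\tfrac{\lambda}{2}\|w\|_2^2$ is typically added for the SVM); the data points and the weight vector are hypothetical.

```python
import math

def hinge_loss(margin):
    """Hinge loss max(0, 1 - m), used by the soft margin SVM."""
    return max(0.0, 1.0 - margin)

def logistic_loss(margin):
    """Logistic loss log(1 + exp(-m)), written to avoid overflow."""
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))

def average_loss(loss, w, data):
    """Average of loss(y * <w, x>) over labeled pairs (x, y)."""
    total = 0.0
    for x, y in data:
        margin = y * sum(wj * xj for wj, xj in zip(w, x))
        total += loss(margin)
    return total / len(data)

# Hypothetical labeled points with labels in {-1, +1}.
data = [((1.0, 2.0), 1), ((2.0, 0.5), 1), ((-1.0, -1.5), -1), ((-0.5, 0.2), -1)]
w = (1.0, 1.0)

print("hinge:   ", average_loss(hinge_loss, w, data))
print("logistic:", average_loss(logistic_loss, w, data))
```

The hinge loss is exactly zero once the margin reaches 1, whereas the logistic loss stays positive for every finite margin.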
Example: Maximum Likelihood Estimation. Assume data points $x_1, \ldots, x_n$ are drawn i.i.d. from some distribution and we want to fit the data with a model $p(x \mid w)$ with parameter $w$. Maximum likelihood estimation then solves $\max_{w} \sum_{i=1}^{n} \log p(x_i \mid w)$. Special cases: least squares regression and logistic regression. 23
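The least squares special case can be worked out in a few lines. The derivation below assumes a Gaussian noise model $y_i = w^\top x_i + \varepsilon_i$ with $\varepsilon_i \sim \mathcal{N}(0, \sigma^2)$ and uses the conditional likelihood $p(y_i \mid x_i, w)$; the noise model and $\sigma$ are assumptions made for illustration.

```latex
\begin{align*}
\log p(y_i \mid x_i, w)
  &= -\frac{(y_i - w^\top x_i)^2}{2\sigma^2} - \frac{1}{2} \log(2\pi\sigma^2), \\
\arg\max_{w} \sum_{i=1}^{n} \log p(y_i \mid x_i, w)
  &= \arg\min_{w} \sum_{i=1}^{n} (y_i - w^\top x_i)^2 .
\end{align*}
```

The second line follows because the log-normalizer does not depend on $w$ and $1/(2\sigma^2)$ is a positive constant.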
Example: Clustering. K-Means Model, where $x_1, \ldots, x_n$ are the data, $\mu_1, \ldots, \mu_k$ are the cluster centers to be learned, and $C_1, \ldots, C_k$ are the clusters to be assigned to. 24
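The K-means objective is usually the total squared distance from each point to the center of its assigned cluster. A minimal sketch of Lloyd's alternating heuristic for it is given below; the slide does not name a solution method, so this choice, the toy points, and the initial centers are all assumptions. The heuristic only finds a local solution in general.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def kmeans_objective(points, centers, labels):
    """Total squared distance from each point to its assigned center."""
    return sum(squared_distance(p, centers[j]) for p, j in zip(points, labels))

def lloyd(points, centers, iterations=20):
    """Alternate between assigning points and recomputing centers."""
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(len(centers)), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        # Update step: each center becomes the mean of its members.
        new_centers = []
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                dim = len(members[0])
                new_centers.append(
                    tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
                )
            else:
                new_centers.append(centers[j])  # keep the center of an empty cluster
        if new_centers == centers:
            break  # assignments and centers no longer change
        centers = new_centers
    return centers, labels

# Hypothetical 2-D points forming two well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (9.0, 9.0), (10.0, 9.0), (9.0, 10.0)]
centers, labels = lloyd(points, centers=[(0.0, 0.0), (10.0, 10.0)])
print(centers, labels, kmeans_objective(points, centers, labels))
```

Each of the two steps can only decrease the objective or leave it unchanged, which is why the loop terminates, but the final answer depends on the initial centers.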
Many More Examples in ML Supervised learning (predictive models) Regression Classification Neural networks Boosting Unsupervised learning (data exploration) Clustering (K-means) Dimension reduction (PCA) Density estimation Reinforcement learning Collaborative filtering Graphical models Probabilistic inference 25
Theme of This Course How to solve optimization problems efficiently in the Big Data environment? 26
Structure of Optimization Linear vs. Nonlinear Deterministic vs. Stochastic Continuous vs. Combinatorial Smooth vs. Nonsmooth Convex vs. Nonconvex Low-dimensional vs. High-dimensional Static vs. Online Single vs. Sequential Decision Making 27
Easy or Hard? What makes an optimization problem easy or hard? Find minimum volume ellipsoid Find maximum volume ellipsoid NP-hard Polynomial solvable Example from L. Xiao, CS286 seminar 28
Easy or Hard? What makes an optimization problem easy or hard? Linear Optimization: polynomial-time solvable (P). Polynomial Optimization: NP-hard. 29
Complexity and Convexity. "The great watershed in optimization isn't between linearity and nonlinearity, but convexity and nonconvexity." R. Rockafellar, SIAM Review 1993. Non-Convex Optimization vs. Convex Optimization. 30
Types of Algorithms. Polynomial-time algorithms (dating back to the 1970s or so), e.g., ellipsoid method, interior point method (IPM). First-order algorithms (dating back to the 1900s, resurgent since 1980), e.g., accelerated gradient descent method (AGD). Second-order algorithms, e.g., Newton method, L-BFGS. Stochastic and randomized algorithms (dating back to the 1950s, resurgent since 2004), e.g., stochastic approximation (SA). 31
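As a small illustration of the first-order family, the sketch below runs plain gradient descent on a least squares objective. This is the basic method, not the accelerated variant (AGD) named above; the data, step size, and iteration count are made-up choices for a toy problem.

```python
def gradient(w, data):
    """Gradient of f(w) = (1/(2n)) * sum_i (<w, x_i> - y_i)^2."""
    n = len(data)
    grad = [0.0] * len(w)
    for x, y in data:
        residual = sum(wj * xj for wj, xj in zip(w, x)) - y
        for j, xj in enumerate(x):
            grad[j] += residual * xj / n
    return grad

def gradient_descent(data, dim, step, iterations):
    """Iterate w <- w - step * gradient(w), starting from the origin."""
    w = [0.0] * dim
    for _ in range(iterations):
        g = gradient(w, data)
        w = [wj - step * gj for wj, gj in zip(w, g)]
    return w

# Hypothetical noiseless data generated by y = 2*x1 - 1*x2.
data = [((1.0, 0.0), 2.0), ((0.0, 1.0), -1.0), ((1.0, 1.0), 1.0), ((2.0, 1.0), 3.0)]
w_hat = gradient_descent(data, dim=2, step=0.5, iterations=500)
print(w_hat)
```

Each iteration uses only gradient information and costs one pass over the data, which is the property that makes first-order methods attractive at large scale. The step size must be small relative to the curvature of the objective for the iteration to converge; 0.5 is small enough for this toy data.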
Central Topics Algorithms Large-Scale Optimization Complexity Applications 32
Courtesy Warning. If you are looking for software/tools for big data analytics: IE 529 Stats of Big Data & Clustering; also check CS/CSE related data mining courses. If you are looking for introduction-level optimization: ECE 490 Introduction to Optimization; IE 510 Advanced Nonlinear Programming. If you are looking for combinatorial optimization: IE 598 Advanced Integer Programming. Otherwise, welcome to the class and see you next week! 33