Advanced Mixed Integer Programming Formulations for Non-Convex Optimization Problems in Statistical Learning

1 Advanced Mixed Integer Programming Formulations for Non-Convex Optimization Problems in Statistical Learning Juan Pablo Vielma Massachusetts Institute of Technology 2016 IISA International Conference on Statistics. Corvallis, Oregon, August, 2016.

2 (Custom) Product Recommendations via CBCA. [Example adaptive questionnaire: pairs of cameras compared on features such as zoom, price, weight, waterproofing and viewfinder type (SX530 vs. RX100; TG-4 vs. G9), ending with a recommendation screen comparing the TG-4 and the Galaxy 2.] 1 / 22

3 Towards Optimal Product Recommendation. Find enough information about preferences to recommend. [Same example questionnaire and recommendation screen as the previous slide.] How do I pick the next (1st) question to obtain the largest reduction of uncertainty or variance on preferences? 2 / 22

4 Choice-based Conjoint Analysis. Example question: toy Chewbacca vs. toy BB-8, compared on the features Wookiee (Yes / No), Droid (No / Yes) and Blaster (Yes / No); the respondent marks which toy they would buy. The product profiles are the binary feature vectors $x^1 = (1, 0, 1)$ and $x^2 = (0, 1, 0)$. 3 / 22

5 MNL Preference Model. Utilities for 2 products with n features (e.g. n = 12): $U_1 = \beta \cdot x^1 + \epsilon_1 = \sum_{i=1}^{n} \beta_i x^1_i + \epsilon_1$ and $U_2 = \beta \cdot x^2 + \epsilon_2 = \sum_{i=1}^{n} \beta_i x^2_i + \epsilon_2$, where $\beta$ are the part-worths, $x^1, x^2$ the product profiles, and $\epsilon_1, \epsilon_2$ Gumbel noise. A utility-maximizing customer prefers $x^1$ to $x^2$ ($x^1 \succ x^2$) if and only if $U_1 \ge U_2$. Noise can result in response error: $L(x^1 \succ x^2 \mid \beta) = P(x^1 \succ x^2) = \frac{e^{\beta \cdot x^1}}{e^{\beta \cdot x^1} + e^{\beta \cdot x^2}}$. 4 / 22
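As a quick numeric illustration of this choice probability, a short Julia snippet (the part-worths and profiles below are made-up values, not from the talk):

```julia
# Logit choice probability P(x1 ≻ x2 | β) for illustrative values of β, x1, x2.
using LinearAlgebra

β  = [0.8, -0.3, 0.5]            # hypothetical part-worths
x1 = [1, 0, 1]                   # product profile 1 (binary features)
x2 = [0, 1, 0]                   # product profile 2

p = exp(dot(β, x1)) / (exp(dot(β, x1)) + exp(dot(β, x2)))
println("P(x1 ≻ x2 | β) = ", p)  # approaches 1 as β⋅x1 grows relative to β⋅x2
```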

6 Next Question To Reduce Variance. Bayesian workflow: start from a prior distribution of the part-worths $\beta$; after each answered question, perform a Bayesian update (via MCMC) to obtain the posterior distribution, and repeat. [Figure: example questions alternating with prior and posterior distributions of $\beta$.] With a black-box objective, question selection requires enumeration; here, question selection is done by Mixed Integer Programming (MIP). 5 / 22

7 Avoiding Enumeration with MIP

8 Traveling Salesman Problem (TSP): Visit Cities Fast 7 / 22

9 How about 49 cities? Number of tours: 48!/2. Assuming one floating point operation per tour, the fastest supercomputer would need many times the age of the universe! How long does it take on an iPhone? Less than a second, using 4 iterations of a cutting plane method! Dantzig, Fulkerson and Johnson (1954) did it by hand! For more info see the tutorial in the ConcordeTSP app. Cutting planes are the key for effectively solving (even NP-hard) MIP problems in practice. 8 / 22
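The tour count is easy to sanity-check; a one-liner, assuming the standard $(n-1)!/2$ count of distinct symmetric tours:

```julia
# Number of distinct tours through 49 cities in a symmetric TSP: (49 - 1)!/2.
tours = factorial(big(48)) ÷ 2
println(tours)   # a 61-digit number, far beyond brute-force enumeration
```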

10 50+ Years of MIP = Significant Solver Speedups. Algorithmic improvements (machine independent): CPLEX v1.2 (1991) to v11 (2007): 29,000x speedup; Gurobi v1 (2009) to v6.5 (2015): 48.7x speedup; both commercial, but free for academic use. (Reasonably) effective free / open-source solvers: GLPK, CBC and SCIP (free only for non-commercial use). Easy to use, fast and versatile modeling languages, e.g. the Julia-based JuMP modeling language; a minimal example is sketched below. Linear MIP solvers are very mature and effective; convex nonlinear MIP is getting there (quadratic nearly there). 9 / 22
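To illustrate the modeling-language point, a minimal JuMP model solved with the free GLPK solver; the tiny knapsack instance is an invented example, not from the talk:

```julia
# A minimal mixed integer program in JuMP (Julia), solved with GLPK.
using JuMP, GLPK

profit   = [6, 5, 4, 3]
weight   = [4, 3, 2, 1]
capacity = 6

model = Model(GLPK.Optimizer)
@variable(model, x[1:4], Bin)                                  # select item i or not
@constraint(model, sum(weight[i] * x[i] for i in 1:4) <= capacity)
@objective(model, Max, sum(profit[i] * x[i] for i in 1:4))
optimize!(model)

println("objective = ", objective_value(model))
println("selection = ", value.(x))
```

The same model runs unchanged with a commercial solver by swapping `GLPK.Optimizer` for, e.g., the Gurobi or CPLEX optimizer object.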

11 Question Selection with MIP

12 Bayesian Update and Geometric Updates. Prior distribution: $\beta \sim N(\mu, \Sigma)$ with density $\varphi(\beta; \mu, \Sigma)$. Answer likelihood: $L(x^1 \succ x^2 \mid \beta)$. Posterior density: $f_{x^1 \succ x^2}(\beta) = \frac{\varphi(\beta; \mu, \Sigma)\, L(x^1 \succ x^2 \mid \beta)}{\int \varphi(\beta; \mu, \Sigma)\, L(x^1 \succ x^2 \mid \beta)\, d\beta}$. This requires multidimensional integration and is non-convex in $x^1, x^2 \in \{0,1\}^n$. 11 / 22
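The MCMC update can be sketched with a plain random-walk Metropolis sampler; everything below (names, step size, number of draws) is an illustrative assumption rather than the talk's actual implementation:

```julia
# Random-walk Metropolis sketch for the posterior over part-worths β,
# given a N(μ, Σ) prior and logit likelihoods of the answered questions.
using LinearAlgebra

logit_like(β, x1, x2) = 1 / (1 + exp(-dot(β, x1 - x2)))   # P(x1 ≻ x2 | β)

function log_posterior(β, μ, Σ, answers)
    lp = -0.5 * dot(β - μ, Σ \ (β - μ))                    # log prior, up to a constant
    for (x1, x2) in answers                                # each answer: "x1 preferred to x2"
        lp += log(logit_like(β, x1, x2))
    end
    return lp
end

function metropolis(μ, Σ, answers; n_draws = 5_000, step = 0.1)
    β  = float.(μ)
    lp = log_posterior(β, μ, Σ, answers)
    draws = Vector{Vector{Float64}}()
    for _ in 1:n_draws
        cand = β .+ step .* randn(length(β))
        lp_cand = log_posterior(cand, μ, Σ, answers)
        if log(rand()) < lp_cand - lp                      # accept / reject
            β, lp = cand, lp_cand
        end
        push!(draws, copy(β))
    end
    return draws      # estimate the posterior mean and covariance from these draws
end
```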

13 D-Efficiency and Posterior Covariance Matrix. Starting from $\beta \sim N(\mu, \Sigma)$, the answer $x^1 \succ x^2$ leads to posterior covariance $\mathrm{cov}(\beta) = \Sigma_1$, while $x^2 \succ x^1$ leads to $\mathrm{cov}(\beta) = \Sigma_2$. Variance criterion = D-efficiency, the non-convex function $f(x^1, x^2) := \mathbb{E}\left[\det(\Sigma_i)^{1/p}\right]$, where the expectation is over $\beta$ and the answer ($x^1 \succ x^2$ or $x^2 \succ x^1$). Even evaluating the expected D-efficiency of a single question requires multidimensional integration. 12 / 22
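To make the integration burden concrete, a Monte Carlo sketch of the expected D-efficiency of one candidate question, reusing the illustrative `metropolis` and `logit_like` helpers above (this is not the talk's evaluation procedure):

```julia
# Monte Carlo sketch of f(x1, x2) = E[ det(Σ_i)^(1/p) ] for one candidate question.
using LinearAlgebra, Statistics

function expected_defficiency(x1, x2, μ, Σ, answers)
    p = length(μ)
    current = metropolis(μ, Σ, answers)                         # draws from the current belief
    prob1 = mean(logit_like(β, x1, x2) for β in current)        # P(answer is x1 ≻ x2)
    val = 0.0
    for (ans, w) in (((x1, x2), prob1), ((x2, x1), 1 - prob1))  # both possible answers
        post = metropolis(μ, Σ, vcat(answers, [ans]))           # posterior after that answer
        Σi = cov(reduce(hcat, post)')                           # posterior covariance estimate
        val += w * det(Σi)^(1 / p)
    end
    return val
end
```

Enumerating this quantity over every feasible question pair is exactly the cost the MIP approach is meant to avoid.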

14 Standard Question Selection Criteria. Credibility ellipsoid for the part-worths: $(\beta - \mu)' \Sigma^{-1} (\beta - \mu) \le r$. Choice balance: minimize the distance of the question to the center $\mu$, i.e. keep $\mu'(x^1 - x^2)$ small. Postchoice symmetry: maximize the variance of the question, $(x^1 - x^2)' \Sigma (x^1 - x^2)$. 13 / 22

15 D-efficiency: Balance / Question-Variance Trade-off. D-efficiency is a non-convex function $f(d, v)$ of the distance $d := \mu'(x^1 - x^2)$ and the variance $v := (x^1 - x^2)' \Sigma (x^1 - x^2)$, and $f(d, v)$ can be evaluated with a one-dimensional integral. 14 / 22

16 Optimization Model: $\min\ f(d, v)$ subject to $\mu'(x^1 - x^2) = d$, $(x^1 - x^2)' \Sigma (x^1 - x^2) = v$, $A^1 x^1 + A^2 x^2 \le b$, $x^1 \ne x^2$, $x^1, x^2 \in \{0,1\}^n$. 15 / 22

17 Technique 1: Binary Quadratic. For $x^1, x^2 \in \{0,1\}^n$, linearize the quadratic constraint $(x^1 - x^2)' \Sigma (x^1 - x^2) = v$ by introducing product variables $X^l_{i,j} = x^l_i x^l_j$ for $l \in \{1, 2\}$ and $i, j \in \{1, \dots, n\}$, enforced by $X^l_{i,j} \le x^l_i$, $X^l_{i,j} \le x^l_j$, $X^l_{i,j} \ge x^l_i + x^l_j - 1$, $X^l_{i,j} \ge 0$, and $W_{i,j} = x^1_i x^2_j$, enforced by $W_{i,j} \le x^1_i$, $W_{i,j} \le x^2_j$, $W_{i,j} \ge x^1_i + x^2_j - 1$, $W_{i,j} \ge 0$. The constraint then becomes $\sum_{i,j=1}^{n} \left(X^1_{i,j} + X^2_{i,j} - W_{i,j} - W_{j,i}\right) \Sigma_{i,j} = v$. 16 / 22
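A sketch of this linearization in JuMP; the dimension and covariance matrix below are placeholders, and the talk does not prescribe this exact code:

```julia
# Technique 1 sketch: linearize (x1 - x2)' Σ (x1 - x2) = v with product variables.
using JuMP, GLPK, LinearAlgebra

n = 4
Σ = Matrix(1.0I, n, n)                     # stand-in for the prior covariance
model = Model(GLPK.Optimizer)

@variable(model, x1[1:n], Bin)
@variable(model, x2[1:n], Bin)
@variable(model, 0 <= X1[1:n, 1:n] <= 1)   # X1[i,j] models x1[i] * x1[j]
@variable(model, 0 <= X2[1:n, 1:n] <= 1)   # X2[i,j] models x2[i] * x2[j]
@variable(model, 0 <= W[1:n, 1:n] <= 1)    # W[i,j]  models x1[i] * x2[j]
@variable(model, v >= 0)

for i in 1:n, j in 1:n
    @constraint(model, X1[i, j] <= x1[i]);  @constraint(model, X1[i, j] <= x1[j])
    @constraint(model, X1[i, j] >= x1[i] + x1[j] - 1)
    @constraint(model, X2[i, j] <= x2[i]);  @constraint(model, X2[i, j] <= x2[j])
    @constraint(model, X2[i, j] >= x2[i] + x2[j] - 1)
    @constraint(model, W[i, j] <= x1[i]);   @constraint(model, W[i, j] <= x2[j])
    @constraint(model, W[i, j] >= x1[i] + x2[j] - 1)
end

# Linearized quadratic form, plus the x1 ≠ x2 condition from the next slide
@constraint(model, sum(Σ[i, j] * (X1[i, j] + X2[i, j] - W[i, j] - W[j, i])
                       for i in 1:n, j in 1:n) == v)
@constraint(model, sum(X1[i, j] + X2[i, j] - W[i, j] - W[j, i]
                       for i in 1:n, j in 1:n) >= 1)
```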

18 Technique 1: Binary Quadratic (continued). The requirement $x^1 \ne x^2$ is imposed as $\|x^1 - x^2\| \ge 1$. With the same product variables $X^l_{i,j} = x^l_i x^l_j$ and $W_{i,j} = x^1_i x^2_j$ (and the same linearization constraints as above), this becomes $\sum_{i,j=1}^{n} \left(X^1_{i,j} + X^2_{i,j} - W_{i,j} - W_{j,i}\right) \ge 1$. 17 / 22

19 Technique 2: Piecewise Linear Functions. Recall that D-efficiency is a non-convex function $f(d, v)$ of the distance $d := \mu'(x^1 - x^2)$ and the variance $v := (x^1 - x^2)' \Sigma (x^1 - x^2)$, and that $f(d, v)$ can be evaluated with a one-dimensional integral. Approximate it by piecewise linear interpolation, which admits a MIP formulation. 18 / 22

20 Simple Formulation for Univariate Functions. [Figure: piecewise linear interpolation of $z = f(x)$ through the breakpoints $d_1, \dots, d_5$ with values $f(d_1), \dots, f(d_5)$.] Write $x = \sum_{j=1}^{5} d_j \lambda_j$, $z = \sum_{j=1}^{5} f(d_j) \lambda_j$, $\sum_{j=1}^{5} \lambda_j = 1$, $\lambda_j \ge 0$, with binary variables forcing the nonzero $\lambda_j$ onto a single segment. Size = O(# of segments); a sketch follows. 19 / 22
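A minimal sketch of this simple ("convex combination") formulation in JuMP, assuming breakpoints d[1] < … < d[K] and sampled function values fvals[j] = f(d[j]); the function name is illustrative:

```julia
# Simple piecewise linear formulation: one binary per segment, O(# segments) size.
using JuMP, GLPK

function build_pwl_simple!(model, x, z, d, fvals)
    K = length(d)
    λ = @variable(model, [1:K], lower_bound = 0)
    u = @variable(model, [1:K-1], Bin)            # u[s] = 1  ⇔  segment s is active
    @constraint(model, sum(λ) == 1)
    @constraint(model, sum(u) == 1)
    @constraint(model, x == sum(d[j] * λ[j] for j in 1:K))
    @constraint(model, z == sum(fvals[j] * λ[j] for j in 1:K))
    # λ[j] may be positive only if a segment adjacent to breakpoint j is selected
    @constraint(model, λ[1] <= u[1])
    @constraint(model, λ[K] <= u[K-1])
    @constraint(model, [j = 2:K-1], λ[j] <= u[j-1] + u[j])
    return λ, u
end
```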

21 Advanced Formulation for Univariate Functions. [Same piecewise linear interpolation of $z = f(x)$ through $d_1, \dots, d_5$.] Keep $x = \sum_{j=1}^{5} d_j \lambda_j$, $z = \sum_{j=1}^{5} f(d_j) \lambda_j$, $\sum_{j=1}^{5} \lambda_j = 1$, $\lambda_j \ge 0$, but now only $y \in \{0,1\}^2$ binary variables are needed. Size = O(log_2 # of segments); a sketch of this logarithmic formulation follows. 20 / 22
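A sketch of the logarithmic formulation, which replaces the per-segment binaries with ⌈log2(# segments)⌉ binaries via a reflected Gray code of the segments; again, the function and variable names are illustrative:

```julia
# Logarithmic piecewise linear formulation: O(log2 # segments) binary variables.
using JuMP, GLPK

function build_pwl_log!(model, x, z, d, fvals)
    K = length(d)                              # K breakpoints, K - 1 segments
    L = max(1, ceil(Int, log2(K - 1)))         # number of binary variables
    gray(s) = [(xor(s, s >> 1) >> (b - 1)) & 1 for b in 1:L]
    code = [gray(s - 1) for s in 1:K-1]        # Gray code of each segment

    λ = @variable(model, [1:K], lower_bound = 0)
    y = @variable(model, [1:L], Bin)
    @constraint(model, sum(λ) == 1)
    @constraint(model, x == sum(d[j] * λ[j] for j in 1:K))
    @constraint(model, z == sum(fvals[j] * λ[j] for j in 1:K))

    segs(j) = [s for s in (j - 1, j) if 1 <= s <= K - 1]   # segments touching breakpoint j
    for b in 1:L
        J1 = [j for j in 1:K if all(code[s][b] == 1 for s in segs(j))]
        J0 = [j for j in 1:K if all(code[s][b] == 0 for s in segs(j))]
        isempty(J1) || @constraint(model, sum(λ[j] for j in J1) <= y[b])
        isempty(J0) || @constraint(model, sum(λ[j] for j in J0) <= 1 - y[b])
    end
    return λ, y
end

# Example: 5 breakpoints (4 segments) need only y ∈ {0,1}^2, as on the slide.
model = Model(GLPK.Optimizer)
@variable(model, x); @variable(model, z)
d = [0.0, 1.0, 2.0, 3.0, 4.0]
build_pwl_log!(model, x, z, d, sin.(d))        # any sampled values f(d_j)
@objective(model, Min, z)
optimize!(model)
```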

22 Computational Performance. Advanced formulations provide a computational advantage, and the advantage is significantly more important for free solvers. State-of-the-art commercial solvers can be significantly better than free solvers; still, free is free! [Figure: solve times (in seconds) of the simple vs. advanced formulation for CPLEX and GLPK.] 21 / 22

23 Summary and Main Messages. Always choose Chewbacca! MIP can solve very challenging problems in practice. Commercial solvers are best, but free solvers are reasonable. MIP is easily accessible and can be integrated into complex systems through the JuMP modeling language (github.com/juliaopt/jump.jl). Formulations = speed-ups, and they are (relatively) easy to learn: Mixed integer linear programming formulation techniques, J. P. Vielma, SIAM Review 57, pp. CBC application: 22 / 22
