Mixed Integer Programming Approaches for Experimental Design
1 Mixed Integer Programming Approaches for Experimental Design. Juan Pablo Vielma, Massachusetts Institute of Technology. Joint work with Denis Saure. DRO brown bag lunch seminar, Columbia Business School, New York, NY, October 2016.
2 Motivation: (Custom) Product Recommendations
[Slide shows example camera-choice questions: SX530 vs. RX100 (zoom: 50x vs. 3.6x; weight in ounces), TG-4 vs. G9 (waterproof: yes vs. no; weight: 7.36 lb vs. 7.5 lb), and a recommendation screen, TG-4 vs. Galaxy 2 (waterproof: yes vs. no; viewfinder: electronic vs. optical).]
3 Motivation: (Custom) Product Recommendations
[Same camera-choice questions and recommendation screen as the previous slide.]
Toubia, Hauser and Dahan (2003); Toubia, Hauser and Simester (2004)
4 Towards Optimal Product Recommendation
Find enough information about preferences to recommend.
[Camera-choice questions and recommendation screen as before.]
How do I pick the next (1st) question to obtain the largest reduction of uncertainty or variance on preferences?
Compensatory model estimation (part-worths), not just assortment.
5 Next Question To Reduce Variance: Bayesian
[Diagram: prior distribution of $\beta$ → question and answer → Bayesian update (MCMC) → posterior distribution, repeated for each question.]
Black-box objective: question selection = enumeration.
Instead: question selection by Mixed Integer Programming (MIP).
6 Traveling Salesman Problem (TSP): Visit Cities Fast
"A computer would have to check all these possible routes to find the shortest one." — Noson S. Yanofsky, Paradoxes, Contradictions, and the Limits of Science ("Many research results define boundaries of what cannot be known, predicted, or described. Classifying these limitations shows us the structure of science and reason.")
7 MIP = Avoid Enumeration
Number of tours for 49 cities = 48!/2. Assuming one floating-point operation per tour, even the fastest supercomputer would need many times the age of the universe!
How long does it take on an iPhone? Less than a second — 4 iterations of a cutting-plane method! Dantzig, Fulkerson and Johnson (1954) did it by hand! For more info, see the tutorial in the Concorde TSP app.
Cutting planes are the key to effectively solving (even NP-hard) MIP problems in practice.
8 50+ Years of MIP = Significant Solver Speedups
Algorithmic improvements (machine-independent):
- CPLEX v1.2 (1991) → v11 (2007): 29,000x speedup
- Gurobi v1 (2009) → v6.5 (2015): 48.7x speedup
- Commercial, but free for academic use.
(Reasonably) effective free / open-source solvers: GLPK, COIN-OR (CBC) and SCIP (the latter only for non-commercial use).
Easy-to-use, fast and versatile modeling languages: the Julia-based JuMP modeling language.
Linear MIP solvers are very mature and effective; convex nonlinear MIP is getting there (even MI-SDP!), and quadratic is nearly there.
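To make the modeling-language point concrete, here is a minimal sketch of a MIP in JuMP; the tiny knapsack instance is illustrative, not from the talk, and it assumes the GLPK solver mentioned above is installed.

```julia
using JuMP, GLPK

# A tiny illustrative MIP: pick items maximizing value under a weight budget.
model = Model(GLPK.Optimizer)
@variable(model, x[1:3], Bin)                      # binary decisions
@objective(model, Max, 3x[1] + 4x[2] + 2x[3])      # item values
@constraint(model, 2x[1] + 3x[2] + x[3] <= 4)      # weight budget
optimize!(model)
value.(x)                                          # optimal 0/1 choices
```

The same model runs unchanged with CPLEX or Gurobi by swapping the optimizer, which is what makes integration into larger systems easy.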
9 Choice-based Conjoint Analysis (CBCA)
Which toy would you buy?

Feature    Chewbacca   BB-8
Wookiee    Yes         No
Droid      No          Yes
Blaster    Yes         No

Product profiles: $x^1 = (1, 0, 1)^T$ (Chewbacca), $x^2 = (0, 1, 0)^T$ (BB-8). Answer: "I would buy toy 1" (or toy 2).
10 MNL Preference Model
Utilities for 2 products with n features (e.g. n = 12):
$U_1 = \beta \cdot x^1 + \epsilon_1 = \sum_{i=1}^n \beta_i x^1_i + \epsilon_1$
$U_2 = \beta \cdot x^2 + \epsilon_2 = \sum_{i=1}^n \beta_i x^2_i + \epsilon_2$
where $\beta$ holds the part-worths, $x^k$ is the product profile, and $\epsilon_k$ is (Gumbel) noise.
A utility-maximizing customer picks $x^1$ over $x^2$ iff $U_1 \ge U_2$. Noise can result in response error:
$L(x^1 \succ x^2) = P(x^1 \succ x^2) = \frac{e^{\beta \cdot x^1}}{e^{\beta \cdot x^1} + e^{\beta \cdot x^2}}$
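A quick numeric check of this logit choice probability; the part-worth values below are illustrative assumptions, not estimates from the talk.

```julia
# P(x¹ ≻ x²) under the MNL model with part-worths β.
logit_prob(β, x1, x2) = exp(β'x1) / (exp(β'x1) + exp(β'x2))

β  = [0.8, -0.3, 0.5]          # assumed part-worths (illustrative)
x1 = [1, 0, 1]                 # Chewbacca profile from the previous slide
x2 = [0, 1, 0]                 # BB-8 profile
logit_prob(β, x1, x2)          # ≈ 0.83
```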
11 Next Question To Reduce Variance: Bayesian
[Same prior → Bayesian update (MCMC) → posterior diagram as slide 5.]
MNL preference model = logistic regression:
$x^1 \succ x^2 \iff \beta \cdot z \ge 0$, where $z = x^1 - x^2$.
12 Linear Experimental Design
Unknown part-worths $\beta \in \mathbb{R}^n$.
Questions: $Z = (z^1, \dots, z^q)^T \in \mathbb{R}^{q \times n}$.
Answers: $Y = (y_1, \dots, y_q)^T$.
Model: $P(Y \mid \beta, Z) = L(Y \mid \beta, Z) = \prod_{i=1}^q h(y_i, \beta \cdot z^i)$.
Objective: choose $Z$ to learn $\beta$ fast.
13 Bayesian Framework
Prior distribution $\beta \sim N(\mu, \Sigma)$, answer likelihood $L(Y \mid \beta, Z)$, posterior distribution
$g(\beta \mid Y, Z) = \frac{\varphi(\beta; \mu, \Sigma)\, L(Y \mid \beta, Z)}{\int \varphi(\beta; \mu, \Sigma)\, L(Y \mid \beta, Z)\, d\beta}$
"Fast" = minimize posterior variance.
14 Goal = Minimize Expected Posterior Variance
$f(Z) = E_Y\{(\det \mathrm{cov}(\beta \mid Y, Z))^{1/m}\}$
Possible solution approaches:
$g(\beta \mid Y, Z) \propto \varphi(\beta; \mu, \Sigma)\, L(Y \mid \beta, Z)$
$I(\beta \mid Y, Z) = -\nabla^2_\beta \ln g(\beta \mid Y, Z) \propto -\nabla^2_\beta \ln L(Y \mid \beta, Z)$
$\mathrm{cov}(\beta \mid Y, Z) \approx I(\hat\beta \mid Y, Z)^{-1}$, or $E_{\beta \sim N(\mu, \Sigma)}\{I(\beta \mid Y, Z)^{-1}\}$
$\max_Z E_Y\{\det I(\hat\beta \mid Y, Z)\}^{1/m}$?
15 A Really Good Case = Linear Regression
$f(Z) = E_Y\{(\det \mathrm{cov}(\beta \mid Y, Z))^{1/m}\}$
$y_i = \beta \cdot z^i + \epsilon_i$, $\epsilon_i \sim N(0, 1)$
$g(\beta \mid Y, Z) = \varphi(\beta; \mu_0, \Sigma_0)$ with $\Sigma_0 = \mathrm{cov}(\beta \mid Y, Z) = (Z^T Z + \Sigma^{-1})^{-1}$
$\min_Z f(Z) = \max_Z \det(Z^T Z + \Sigma^{-1})^{1/m}$
$Z$ discrete ⇒ MISDP, or MISOCP for $m = n$.
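For small instances this D-optimality criterion can be checked by brute force over candidate question sets; a sketch below, with hypothetical candidate vectors (the MISDP/MISOCP formulations on the slide are what actually scale).

```julia
using LinearAlgebra, Combinatorics

# D-optimality score det(Z'Z + Σ⁻¹) of a design Z whose rows are questions.
dscore(Z, Σinv) = det(Z' * Z + Σinv)

# Exhaustive search over all q-subsets of a finite candidate set.
function best_design(candidates, q, Σinv)
    best, bestval = nothing, -Inf
    for S in combinations(candidates, q)
        Z = reduce(vcat, (z' for z in S))    # stack rows into a q×n matrix
        v = dscore(Z, Σinv)
        v > bestval && ((best, bestval) = (Z, v))
    end
    return best, bestval
end

candidates = [[1.0, 0, 1], [0, 1.0, 0], [1.0, 1, 0], [0, 0, 1.0]]  # hypothetical
best_design(candidates, 2, Matrix(1.0I, 3, 3))
```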
16 A Relatively Good Case = Few Questions
$f(Z) = E_Y\{(\det \mathrm{cov}(\beta \mid Y, Z))^{1/m}\}$
For $Z = (z^i)_{i=1}^q$, $z^i \in \mathbb{R}^n$, $q \le n$:
$E(\beta \mid Y, Z) = m\big(Y, \{\mu \cdot z^i\}_{i=1}^q, \{z^{iT} \Sigma z^j\}_{i,j=1}^q\big)$
$\mathrm{cov}(\beta \mid Y, Z) = M\big(Y, \{\mu \cdot z^i\}_{i=1}^q, \{z^{iT} \Sigma z^j\}_{i,j=1}^q\big)$
$f(Z) = f\big(\{\mu \cdot z^i\}_{i=1}^q, \{z^{iT} \Sigma z^j\}_{i,j=1}^q\big)$
17 A Relatively Good Case = Few Questions
[Same moment identities as the previous slide.]
Only requirements: prior $\beta \sim N(\mu, \Sigma)$ and likelihood $L(Y \mid \beta, Z) = \prod_{i=1}^q h(y_i, \beta \cdot z^i)$ — e.g. logistic regression with small $q$.
18 Question Selection for CBCA
Question $x^1$ vs. $x^2$, prior $\beta \sim N(\mu, \Sigma)$; the answer yields posterior covariance $\Sigma_1$ (if $x^1 \succ x^2$) or $\Sigma_2$ (if $x^2 \succ x^1$).
Variance = D-efficiency:
$f(x^1, x^2) := E_{\beta,\, x^1 \succ / \prec x^2}\{\det(\Sigma_i)^{1/p}\}$
A non-convex function; without the previous slide's simplification, even evaluating it requires $\dim(\beta)$-dimensional integration.
19 D-efficiency Simplification for CBCA
D-efficiency = non-convex function $f(d, v)$ of
distance: $d := \mu \cdot (x^1 - x^2)$
variance: $v := (x^1 - x^2)^T \Sigma\, (x^1 - x^2)$
Can evaluate $f(d, v)$ with a 1-dim integral.
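A sketch of such a 1-dimensional evaluation using Gauss–Kronrod quadrature; the integrand below is an illustrative stand-in, not the talk's exact D-efficiency formula.

```julia
using QuadGK, Distributions

# (d, v) induce a normal on the projected utility w = β⋅(x¹ − x²);
# integrate a logistic response curve against it in one dimension.
function f_dv(d, v)
    ϕ = Normal(d, sqrt(v))
    p(w) = 1 / (1 + exp(-w))      # probability the customer picks x¹
    val, err = quadgk(w -> pdf(ϕ, w) * p(w), -Inf, Inf)
    return val
end

f_dv(0.0, 1.0)   # ≈ 0.5 for a perfectly balanced question
```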
20 Simplification = Trade-off for Known Criteria
Ellipsoid of likely $\beta$: $(\beta - \mu)^T \Sigma^{-1} (\beta - \mu) \le r$.
Choice balance: minimize the distance to the center, $\mu \cdot (x^1 - x^2)$.
Postchoice symmetry: maximize the variance of the question, $(x^1 - x^2)^T \Sigma\, (x^1 - x^2)$.
21 Optimization Model
min $f(d, v)$
s.t. $\mu \cdot (x^1 - x^2) = d$
  $(x^1 - x^2)^T \Sigma\, (x^1 - x^2) = v$
  $A^1 x^1 + A^2 x^2 \le b$
  $x^1 \ne x^2$, $x^1, x^2 \in \{0, 1\}^n$
Technique 1: linearize the bilinear products $x^k_i x^l_j$.
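The standard linearization of a product of two binary variables, sketched in JuMP (variable names are illustrative):

```julia
using JuMP

model = Model()
n = 12
@variable(model, x1[1:n], Bin)
@variable(model, x2[1:n], Bin)

# w[i,j] stands in for the product x1[i]*x2[j]; for binary variables these
# three inequalities force w[i,j] = x1[i]*x2[j] at every integer point.
@variable(model, 0 <= w[1:n, 1:n] <= 1)
@constraint(model, [i = 1:n, j = 1:n], w[i, j] <= x1[i])
@constraint(model, [i = 1:n, j = 1:n], w[i, j] <= x2[j])
@constraint(model, [i = 1:n, j = 1:n], w[i, j] >= x1[i] + x2[j] - 1)
```

Analogous variables handle the x¹x¹ and x²x² products in the variance constraint (note x_i² = x_i for binaries), so d and v become linear in the lifted variables.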
22 Technique 2: Piecewise Linear Functions
[D-efficiency $f(d, v)$ of distance $d := \mu \cdot (x^1 - x^2)$ and variance $v := (x^1 - x^2)^T \Sigma\, (x^1 - x^2)$, as before.]
Can evaluate $f(d, v)$ with a 1-dim integral ⇒ piecewise linear interpolation ⇒ MIP formulation.
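One standard MIP encoding of such a piecewise linear interpolation is the convex-combination (λ) form with an SOS2 constraint, sketched below for a univariate function over assumed breakpoints; solvers such as CPLEX, Gurobi and CBC support SOS2 directly.

```julia
using JuMP

model = Model()
xk = collect(-3.0:0.5:3.0)   # breakpoints (illustrative)
fk = abs.(xk)                # sampled function values (illustrative)
K  = length(xk)

@variable(model, λ[1:K] >= 0)
@variable(model, x)
@variable(model, y)
@constraint(model, sum(λ) == 1)
@constraint(model, x == sum(λ[k] * xk[k] for k in 1:K))
@constraint(model, y == sum(λ[k] * fk[k] for k in 1:K))
# SOS2: at most two λ's are nonzero and they are adjacent, so (x, y)
# lies on a single linear piece of the interpolant.
@constraint(model, λ in SOS2())
```

For the bivariate f(d, v) the same idea applies over a triangulated grid; packages such as PiecewiseLinearOpt.jl automate these formulations.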
23 MIP-based Adaptive Questionnaires
[Camera-choice questions and recommendation screen as before.]
Objective: $E_{Y, X^1, X^2}\{\mathrm{cov}(\beta \mid Y, X^1, X^2)\}$ — an optimal one-step look-ahead, moment-matching approximate Bayesian approach.
24 Optimal One-Step Look-Ahead
Given the current prior $\beta \sim N(\mu_i, \Sigma_i)$, choose the next question by
$\min_{x^1, x^2} f(x^1, x^2)$
and solve with the MIP formulation.
25 Moment-Matching Approximate Bayesian Update
Prior $N(\mu_i, \Sigma_i)$ × answer likelihood → posterior ≈ $N(\mu_{i+1}, \Sigma_{i+1})$, where
$\mu_{i+1} = E(\beta \mid y, x^1, x^2)$, $\Sigma_{i+1} = \mathrm{cov}(\beta \mid y, x^1, x^2)$
each computable with a 1-dim integral.
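A sketch of the 1-dimensional moment computation behind such an update, applied to the projected utility $w = \beta \cdot (x^1 - x^2) \sim N(d, v)$ with a logit answer likelihood; the forms here are assumptions for illustration, not the talk's exact update.

```julia
using QuadGK, Distributions

# Moment-matched update of w = β⋅(x¹ − x²) ~ N(d, v) after an answer
# y = +1 (picked x¹) or y = −1 (picked x²), under a logit likelihood.
function update_moments(d, v, y)
    prior  = Normal(d, sqrt(v))
    lik(w) = 1 / (1 + exp(-y * w))
    Zc, _ = quadgk(w -> pdf(prior, w) * lik(w), -Inf, Inf)           # evidence
    m1, _ = quadgk(w -> w   * pdf(prior, w) * lik(w) / Zc, -Inf, Inf)
    m2, _ = quadgk(w -> w^2 * pdf(prior, w) * lik(w) / Zc, -Inf, Inf)
    return m1, m2 - m1^2    # posterior mean and variance of w
end

update_moments(0.5, 1.0, +1)
```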
26 Computational Experiments
16 questions, 2 options, 12 and 24 features; simulate MNL responses with known $\beta$.
Question selection:
- MIP-based, using CPLEX and the open-source COIN-OR solver
- knapsack-based geometric heuristic by Toubia et al.
Time limits of 1 s and 10 s.
Metrics (computed for the true posterior with MCMC):
- estimator variance = $\det(\mathrm{cov}(\beta \mid Y, X^1, X^2))^{1/2}$
- estimator distance = expected distance between the posterior $\beta$ and the true $\beta$.
27 Results for 12 Features, 1 s Time Limit
[Plots: estimator variance and estimator distance by question.]
Heuristic (avg. 0.04 s, max 0.61 s); COIN-OR (avg. 0.93 s, max 1 s); CPLEX (avg. 0.21 s, max 0.48 s).
28 Does it Scale? Results for 24 Features
[Plots: estimator variance and estimator distance by question.]
Heuristic (avg. 0.19 s, max 3 s); CPLEX 1 s (avg. 1 s, max 1 s); CPLEX 10 s (avg. 7.7 s, max 10 s).
29 Some Improvements for 24 Features
[Plots: estimator variance and estimator distance by question.]
Heuristic (avg. 0.19 s, max 3 s); CPLEX 1 s (avg. 1 s, max 1 s); CPLEX 10 s (avg. 7.7 s, max 10 s).
30 Summary and Main Messages
- Always choose Chewbacca!
- MIP can now solve challenging problems in practice, even in near-real time.
- Appropriate domain expertise can be crucial for "MIP-ing" a problem.
- Commercial solvers are best, but free solvers are reasonable.
- Integration into complex systems is easy with JuMP.
- Some scalability: get the most out of small data.
- Adaptive choice-based conjoint analysis improves on existing geometric methods.