Estimation and Optimization: Gaps and Bridges. MURI Meeting June 20, Laurent El Ghaoui. UC Berkeley EECS

Similar documents
Convex Optimization in Classification Problems

Robust Optimization for Risk Control in Enterprise-wide Optimization

Semidefinite and Second Order Cone Programming Seminar Fall 2012 Project: Robust Optimization and its Application of Robust Portfolio Optimization

EE 227A: Convex Optimization and Applications April 24, 2008

Robust Novelty Detection with Single Class MPM

Robust Fisher Discriminant Analysis

Distributionally Robust Convex Optimization

Handout 8: Dealing with Data Uncertainty

Robust Kernel-Based Regression

Distributionally Robust Convex Optimization

Second Order Cone Programming, Missing or Uncertain Data, and Sparse SVMs

Practical Robust Optimization - an introduction -

Robust Model Predictive Control through Adjustable Variables: an application to Path Planning

LIGHT ROBUSTNESS. Matteo Fischetti, Michele Monaci. DEI, University of Padova. 1st ARRIVAL Annual Workshop and Review Meeting, Utrecht, April 19, 2007

A Geometric Characterization of the Power of Finite Adaptability in Multistage Stochastic and Adaptive Optimization

Partially Observable Markov Decision Processes (POMDPs) Pieter Abbeel UC Berkeley EECS

Scalable robust hypothesis tests using graphical models

Multi-Range Robust Optimization vs Stochastic Programming in Prioritizing Project Selection

Maximum Likelihood (ML), Expectation Maximization (EM) Pieter Abbeel UC Berkeley EECS

A Hierarchy of Suboptimal Policies for the Multi-period, Multi-echelon, Robust Inventory Problem

Mathematical Optimization Models and Applications

The Social Cost of Stochastic & Irreversible Climate Change

Robust linear optimization under general norms

Short Course Robust Optimization and Machine Learning. Lecture 6: Robust Optimization in Machine Learning

Quantifying Stochastic Model Errors via Robust Optimization

EE/ACM Applications of Convex Optimization in Signal Processing and Communications Lecture 18

Robust conic quadratic programming with ellipsoidal uncertainties

SOME HISTORY OF STOCHASTIC PROGRAMMING

Advances in Convex Optimization: Theory, Algorithms, and Applications

Likelihood Bounds for Constrained Estimation with Uncertainty

ECOM 009 Macroeconomics B. Lecture 2

A General Projection Property for Distribution Families

LINEAR MODELS FOR CLASSIFICATION. J. Elder CSE 6390/PSYC 6225 Computational Modeling of Visual Perception

Discriminant Analysis and Statistical Pattern Recognition

Applications of Robust Optimization in Signal Processing: Beamforming and Power Control Fall 2012

Handout 6: Some Applications of Conic Linear Programming

Selected Examples of CONIC DUALITY AT WORK Robust Linear Optimization Synthesis of Linear Controllers Matrix Cube Theorem A.

A Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models

R O B U S T E N E R G Y M AN AG E M E N T S Y S T E M F O R I S O L AT E D M I C R O G R I D S

Introduction p. 1 Fundamental Problems p. 2 Core of Fundamental Theory and General Mathematical Ideas p. 3 Classical Statistical Decision p.

APPROXIMATE SOLUTION OF A SYSTEM OF LINEAR EQUATIONS WITH RANDOM PERTURBATIONS

Robust Partially Observable Markov Decision Processes

Convex Optimization: Applications

Planning and Model Selection in Data Driven Markov models

Advanced Machine Learning

Distributionally Robust Discrete Optimization with Entropic Value-at-Risk

Almost Robust Optimization with Binary Variables

On the Power of Robust Solutions in Two-Stage Stochastic and Adaptive Optimization Problems

Polynomial-Time Verification of PCTL Properties of MDPs with Convex Uncertainties and its Application to Cyber-Physical Systems

Robustness and Regularization: An Optimization Perspective

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE GRAVIS 2016 BASEL. Logistic Regression. Pattern Recognition 2016 Sandro Schönborn University of Basel

Sparse and Robust Optimization and Applications

STA 4273H: Sta-s-cal Machine Learning

Practicable Robust Markov Decision Processes

Optimisation and Operations Research

Stat 5101 Lecture Notes

Improvements to Benders' decomposition: systematic classification and performance comparison in a Transmission Expansion Planning problem

Robust optimization to deal with uncertainty

Machine Learning Lecture 2

Machine Learning Lecture 2

Lightning Does Not Strike Twice: Robust MDPs with Coupled Uncertainty

A Decentralized Approach to Multi-agent Planning in the Presence of Constraints and Uncertainty

A Robust Minimax Approach to Classification

Linear Models for Classification

A Geometric Characterization of the Power of Finite Adaptability in Multi-stage Stochastic and Adaptive Optimization

Partially Observable Markov Decision Processes (POMDPs)

Stochastic resource allocation Boyd, Akuiyibo

2D Image Processing. Bayes filter implementation: Kalman filter

Modeling Uncertainty in Linear Programs: Stochastic and Robust Linear Programming

VCMC: Variational Consensus Monte Carlo

MARKOV DECISION PROCESSES (MDP) AND REINFORCEMENT LEARNING (RL) Versione originale delle slide fornita dal Prof. Francesco Lo Presti

Toward Probabilistic Forecasts of Convective Storm Activity for En Route Air Traffic Management

Curve Fitting Re-visited, Bishop1.2.5

ORIGINS OF STOCHASTIC PROGRAMMING

Reinforcement Learning Wrap-up

On the Power of Robust Solutions in Two-Stage Stochastic and Adaptive Optimization Problems

Robust Dual-Response Optimization

Pareto Efficiency in Robust Optimization

Robust Growth-Optimal Portfolios

Alireza Shafaei. Machine Learning Reading Group The University of British Columbia Summer 2017

Empirical Risk Minimization

Data-Driven Optimization under Distributional Uncertainty

BAYESIAN APPROACH TO DECISION MAKING USING ENSEMBLE WEATHER FORECASTS

Optimization Tools in an Uncertain Environment

Robust Scheduling with Logic-Based Benders Decomposition

Statistical Analysis of Online News

Interval Data Classification under Partial Information: A Chance-constraint Approach

POLYNOMIAL-TIME IDENTIFICATION OF ROBUST NETWORK FLOWS UNDER UNCERTAIN ARC FAILURES

Hidden Markov Models (HMM) and Support Vector Machine (SVM)

7. Statistical estimation

Riccati difference equations to non linear extended Kalman filter constraints

1 Robust optimization

Natural Language Processing. Classification. Features. Some Definitions. Classification. Feature Vectors. Classification I. Dan Klein UC Berkeley

Sparse Covariance Selection using Semidefinite Programming

COPYRIGHTED MATERIAL CONTENTS. Preface Preface to the First Edition

Weather routing using dynamic programming to win sailing races

Robust portfolio selection under norm uncertainty

Robust discrete optimization and network flows

CS 4100 // artificial intelligence. Recap/midterm review!

Introduction to Machine Learning

Transcription:

MURI Meeting June 20, 2001 Estimation and Optimization: Gaps and Bridges Laurent El Ghaoui EECS UC Berkeley 1

goals currently, estimation (of model parameters) and optimization (of decision variables) are done separately in decision-making our goal is to integrate these two steps into one: better handling of uncertainties (estimation errors) data-driven optimization testbed involves a routing problem for aerial vehicles 2

agenda a driving application classical approach to decision-making advances in optimization robust optimization moment-based optimization joint estimation and optimization 3

driving application dynamic routing of aerial vehicles under uncertainty: g replacements R1 R3a R3 R3b A R2 B uncertain zone 4

driving application: challenges this is a problem with recourse (information is revealed as process evolves) it has stochastic uncertainty (network capacity is random) the model parameters (transition matrix governing state of nodes) are full of estimation errors a robust solution, which is not too conservative, is needed (trade-off risk vs. reward) 5

placements classical approach to decision-making model parameters estimation optimization decisions data estimation phase: compute an estimate of model parameters fields: identification, estimation, data-fitting, etc decision phase: choose decision variables in an optimal way fields: optimization, control, etc above processes can be done iteratively ( adaptive control ) 6

optimization under uncertainty traditional optimization assumes perfect knowledge of data in effect, tunes solution to (imperfect) data hence, approach is not reliable in practice recent study (Ben-Tal, Nemirovski, 2000) found nearly 90% of all linear programs proposed in NETLIB to be extremely sensitive to implementation errors 7

advances in optimization under uncertainty recent approaches account for data uncertainty: stochastic optimization: distribution of (random) uncertainty is known robust optimization: bounds on uncertainty are known moment-based optimization: use partial knowledge (eg, moments) on distribution of random uncertainties main challenge: obtain tractable, yet accurate, approximations to these hard problems 8

robust optimization principle: assume data is only known within given bounds find a solution that is guaranteed to meet the constraints whatever the uncertainty within its bounds similar to a game between decision-maker and uncertainty leads to very hard problems in general 9

robust optimization: results efficient approximations, ie, solutions with guaranteed robustness, based on convex optimization (uses a tractable sufficient condition for robustness) evaluation of quality of results (how conservative is the sufficient condition?) vast array of success stories: control systems, circuit design, network design, etc 10

example: worst-case simulation simulate state bounds for an uncertain dynamical system 3 2 1 0 1 2 3 0 10 20 30 40 50 60 11

robust optimization: challenges handling random uncertainty possible answer: moment-based optimization obtain the bounds on uncertainty from data possible answer: joint estimation and optimization handling recourse 12

moment-based optimization assume uncertainty is random, with partially known distribution (typically, moments, such as expected values) design against worst-case distribution results in an estimate of the (worst-case) distribution that is most prudent to choose hard problems in general but efficient approximations based on convex optimization 13

example: minimax probability classifier (with G. Lanckriet, C. Bhattacharyya, M. Jordan) problem: classify data points in two separate clusters approach: assume data points are distributed according to a distribution of which we only known moments (mean and covariances) minimize the worst-case probability of misclassification, over all distributions having the observed mean and covariances problem is solved exactly as a simple convex optimization problem (second-order cone program) 14

minimax probability classifier 10 minimax probabilistic separator x points y points mean of x mean of y 5 0 5 5 0 5 10 15

joint estimation and optimization robust optimization requires bounds on model parameters (a point estimate is not sufficient anymore!) how do we get these bounds from data? (these bounds should be accurate!) we ll show two examples of joint approach 16

worst-case identification/simulation for dynamical systems, in a sliding window fashion: build an uncertain system model for example, a set of recursive, linear equations with unknown-bout-bounded coefficients do a one-step ahead prediction based on above model above can be solved using convex optimization 17

output example 1.1 one step ahead worst case prediction with AR model of order 4 1.05 1 0.95 150 200 250 300 350 time 18

estimating a transition matrix many applications require the knowledge of the transition matrix of a Markov chain this matrix often arises in the context of stochastic dynamic programming, Markov decision processes, etc in driving application (optimal routing of aerial vehicles), transition matrix describes (say, weather) uncertainty 19

estimating the transition matrix to estimate transition matrix, solve a maximum-likelihood problem: maximize L(P ) subject to P P where P is the transition matrix set P describes bounds (a priori knowledge) on P (such as, sign) L(P ) is the (concave) log-likelihood function (depends on data) 20

estimating bounds classical approach: estimate intervals of confidence on P leads to very conservative results confidence region approach: describe uncertainty by a set of the form {P P L(P ) β} where the level β is appropriately chosen (below the maximal value of L over P) Fisher information matrix approach: use quadratic approximation of the above set (turns out, not necessary to do this approximation) 21

Constrained loglikelihood and its approximation example 5.15 x 104 True loglikelihood and quadratic approximation vs. P(1,1) 5.2 5.25 5.3 5.35 5.4 5.45 True loglikelihood function Quadratic approximation to the loglikelihood Beta level 5.5 5.55 0.97 0.975 0.98 0.985 0.99 0.995 1 1.005 P(1,1) 22

results maximal and average difference between the ML estimate and its bounds: Ell (Ellipsoid approach) and CL (using CL-regions) max(p Pl) mean(p Pl) max(pu P ) mean(pu P ) CL 0.0381 0.0035 0.0339 0.0060 Ell 0.1485 0.0114 0.0407 0.0052 23

associated decision problem often, we are interested in optimal resource allocation problems involving a transition matrix classical approach: get best estimate of transition matrix, P solve decision problem min V P V V where means scalar product, V is decision set 24

robust decision problem robust version: estimate confidence region (i.e., parameter β) solve game min max V V P Cβ V P above problem can be expressed as a classical convex program (easy to solve) 25

summary of results extended robust optimization to random uncertainty with moment-based optimization, and applied it to a new classification problem developed a framework for a better (less conservative) treatment of uncertainty in optimization: joint estimation and optimization obtained specific results (algorithm) for joint estimation/optimization of a transition matrix (problem forms the basis of many others) developed a joint identification and prediction scheme for dynamical systems with bounded uncertainty obtained an O(n 3 ) algorithm solving this problem students: A. Varma, G. Lanckriet 26

conclusions a better integration of estimation and optimization is necessary to address real-life decision-making challenges our goal is to make this a unified, data-driven process next steps: robust dynamic programming, robust optimization problems with recourse, robust combinatorial optimization 27