Optimal Transport in Risk Analysis
Jose Blanchet (based on work with Y. Kang and K. Murthy)
Stanford University (Management Science and Engineering) and Columbia University (Department of Statistics and Department of IEOR)
Blanchet (Columbia U. and Stanford U.) 1 / 25

Goal

Present a comprehensive framework for decision making under model uncertainty. This presentation is an invitation to read the two papers discussed below.

Example: Model Uncertainty in Ruin Probabilities

R(t) = the reserve (perhaps multiple lines) at time t.
Ruin probability (in finite time horizon T): u_T = P_true(R(t) ∈ B for some t ∈ [0, T]).
B is a set which models bankruptcy.
Problem: the model P_true may be complex, intractable, or simply unknown.

A Distributionally Robust Risk Analysis Formulation

Our solution: estimate u_T by solving
  sup_{D_c(P_0, P) ≤ δ} P(R(t) ∈ B for some t ∈ [0, T]),
where P_0 is a suitable model.
P_0 = proxy for P_true.
P_0 strikes the right trade-off between fidelity and tractability.
δ is the distributional uncertainty size.
D_c(·, ·) defines the distributional uncertainty region.

Desirable Elements of a Distributionally Robust Formulation

Would like D_c(·, ·) to have wide flexibility (even non-parametric).
Want the optimization to be tractable.
Want to preserve the advantages of using P_0.
Want a way to estimate δ.

Connections to Distributionally Robust Optimization

Standard choices are based on divergences (such as Kullback-Leibler), as in Hansen & Sargent (2016):
  D(ν ∥ μ) = E_ν[log(dν/dμ)].
Robust Optimization: Ben-Tal, El Ghaoui, Nemirovski (2009).
Big problem: absolute continuity may typically be violated. Think of using Brownian motion as a proxy model for R(t).
We advocate using optimal transport costs (e.g. the Wasserstein distance).

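To make the absolute-continuity issue concrete, here is a minimal sketch (with made-up discrete distributions) of the Kullback-Leibler divergence above: it is finite only when ν is absolutely continuous with respect to μ, so a model that puts mass where the baseline puts none is infinitely far away.

```python
import numpy as np

# KL divergence D(nu || mu) = E_nu[log(d nu / d mu)] for discrete distributions.
def kl(nu, mu):
    nu, mu = np.asarray(nu, float), np.asarray(mu, float)
    if np.any((nu > 0) & (mu == 0)):
        return np.inf          # absolute continuity fails: nu charges a mu-null set
    mask = nu > 0              # terms with nu = 0 contribute nothing
    return float(np.sum(nu[mask] * np.log(nu[mask] / mu[mask])))

print(kl([0.5, 0.5, 0.0], [0.4, 0.6, 0.0]))   # finite: supports are compatible
print(kl([0.5, 0.4, 0.1], [0.5, 0.5, 0.0]))   # infinite: support mismatch
```

This is exactly why a continuous proxy such as Brownian motion cannot be compared, via divergences, to the piecewise-constant processes one actually observes.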
Elements of Optimal Transport Costs

Let S_X and S_Y be Polish spaces.
Let B_{S_X} and B_{S_Y} be the associated Borel σ-fields.
Let c(·, ·) : S_X × S_Y → [0, ∞) be lower semicontinuous.
Let μ(·) and ν(·) be Borel probability measures defined on S_X and S_Y, respectively.
Given π, a Borel probability measure on S_X × S_Y, write π_X(A) = π(A × S_Y) and π_Y(C) = π(S_X × C).

Definition of Optimal Transport Costs

Define
  D_c(μ, ν) = min_π { E_π[c(X, Y)] : π_X = μ and π_Y = ν }.
This is the so-called Kantorovich problem (see Villani (2008)).
If c(·, ·) is a metric, then D_c(μ, ν) is a Wasserstein distance of order 1.
If c(x, y) = 0 if and only if x = y, then D_c(μ, ν) = 0 if and only if μ = ν.
Kantorovich's problem is a "nice" infinite-dimensional linear programming problem.

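For discrete marginals the Kantorovich problem is literally a finite linear program over couplings. A minimal sketch using SciPy; the two distributions and the cost c(x, y) = |x − y| are illustrative:

```python
import numpy as np
from scipy.optimize import linprog

# Two toy discrete distributions on the real line.
x = np.array([0.0, 1.0])        # support of mu
y = np.array([0.0, 1.0, 2.0])   # support of nu
mu = np.array([0.5, 0.5])
nu = np.array([0.2, 0.5, 0.3])

# Cost c(x, y) = |x - y| (a metric, so D_c is the Wasserstein distance of order 1).
C = np.abs(x[:, None] - y[None, :])

# Kantorovich problem: minimize sum(pi * C) over couplings pi with the
# prescribed marginals; pi is flattened row-major into a vector of length 6.
A_eq = []
for i in range(len(x)):              # row marginals: pi_X = mu
    row = np.zeros(C.size); row[i * len(y):(i + 1) * len(y)] = 1; A_eq.append(row)
for j in range(len(y)):              # column marginals: pi_Y = nu
    col = np.zeros(C.size); col[j::len(y)] = 1; A_eq.append(col)
b_eq = np.concatenate([mu, nu])

res = linprog(C.ravel(), A_eq=np.array(A_eq), b_eq=b_eq, bounds=(0, None))
print(res.fun)  # D_c(mu, nu)
```

For these marginals the optimal value equals the area between the two CDFs, 0.6, which is a handy sanity check for Wasserstein-1 on the line.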
Illustration of Optimal Transport Costs

[Figure.]

Fundamental Theorem of Robust Performance Analysis

Theorem (B. and Murthy (2016)). Suppose that c(·, ·) is lower semicontinuous and that f(·) is upper semicontinuous with E_{P_0}[f(X)] < ∞. Then
  sup_{D_c(P_0, P) ≤ δ} E_P[f(Y)] = inf_{λ ≥ 0} { λδ + E_{P_0}[ sup_z { f(z) − λ c(X, z) } ] }.
Moreover, π* and the dual λ* are primal-dual solutions if and only if
  f(y) − λ* c(x, y) = sup_z { f(z) − λ* c(x, z) }  π*-a.s. in (x, y), and
  λ* (E_{π*}[c(X, Y)] − δ) = 0.

Implications for Risk Analysis

Theorem (B. and Murthy (2016)). Suppose that c(·, ·) is lower semicontinuous and that B is a closed set. Let c_B(x) = inf{ c(x, y) : y ∈ B }. Then
  sup_{D_c(P_0, P) ≤ δ} P(Y ∈ B) = P_0(c_B(X) ≤ 1/λ*),
where λ* ≥ 0 satisfies (under mild assumptions on c_B(X))
  δ = E_{P_0}[ c_B(X) I(c_B(X) ≤ 1/λ*) ].

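The theorem reduces the worst-case probability to a one-dimensional search for λ*. A Monte Carlo sketch under assumed ingredients (standard normal baseline, B = {x ≥ b}, cost c(x, y) = |x − y|, so c_B(x) = max(b − x, 0)):

```python
import numpy as np

rng = np.random.default_rng(0)

# Baseline model P0: X ~ N(0,1); bankruptcy set B = {x >= b}; c(x, y) = |x - y|,
# so c_B(x) = max(b - x, 0).  Values of b and delta are illustrative.
b, delta = 3.0, 0.05
cB = np.maximum(b - rng.standard_normal(200_000), 0.0)

# Find the threshold t = 1/lambda* solving delta = E0[c_B(X) I(c_B(X) <= t)]
# by bisection; the map t -> E0[c_B I(c_B <= t)] is nondecreasing.
lo, hi = 0.0, cB.max()
for _ in range(80):
    t = 0.5 * (lo + hi)
    if np.mean(cB * (cB <= t)) < delta:
        lo = t
    else:
        hi = t

plain  = np.mean(cB == 0.0)   # P0(X in B)
robust = np.mean(cB <= t)     # sup over the delta-ball of P(X in B)
print(plain, robust)
```

The robust probability inflates the baseline probability by exactly the P_0-mass of points within "budgeted distance" 1/λ* of bankruptcy.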
Application 1: Back to the Classical Risk Problem

Suppose that c(x, y) = d_J(x(·), y(·)), the Skorokhod J_1 metric:
  d_J(x, y) = inf_{φ(·) bijection} max{ sup_{t ∈ [0,1]} |x(t) − y(φ(t))|, sup_{t ∈ [0,1]} |φ(t) − t| }.
If R(t) = b − Z(t), then ruin during the time interval [0, 1] is
  B_b = { z(·) : sup_{t ∈ [0,1]} z(t) ≥ b }.
Let P_0(·) be the Wiener measure; we want to compute
  sup_{D_c(P_0, P) ≤ δ} P(Z ∈ B_b).

Application 1: Computing the Distance to Bankruptcy

So { c_{B_b}(z) ≤ 1/λ } = { sup_{t ∈ [0,1]} z(t) ≥ b − 1/λ }, and
  sup_{D_c(P_0, P) ≤ δ} P(Z ∈ B_b) = P_0( sup_{t ∈ [0,1]} Z(t) ≥ b − 1/λ* ).

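Under the Wiener measure the right-hand side is explicit via the reflection principle, P_0(sup_{t∈[0,1]} Z(t) ≥ a) = 2(1 − Φ(a)) for a ≥ 0. A short sketch with an illustrative barrier and dual variable (in practice 1/λ* would come from the calibration equation for δ):

```python
from statistics import NormalDist

# Reflection principle for standard Brownian motion under P0:
#   P0( sup_{t in [0,1]} Z(t) >= a ) = 2 * (1 - Phi(a))  for a >= 0.
def ruin_prob(a: float) -> float:
    return 2.0 * (1.0 - NormalDist().cdf(a)) if a > 0 else 1.0

b, inv_lambda = 3.0, 0.5   # illustrative barrier and dual variable 1/lambda*

baseline = ruin_prob(b)                 # P0(sup Z >= b)
robust   = ruin_prob(b - inv_lambda)    # robust value: barrier lowered by 1/lambda*
print(baseline, robust)
```

The robust estimate is simply the baseline ruin probability with the barrier moved closer by 1/λ*, which is why the formulation stays as tractable as P_0 itself.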
Application 1: Computing the Uncertainty Size

Note that any coupling π with π_X = P_0 and π_Y = P satisfies
  D_c(P_0, P) ≤ E_π[c(X, Y)],
so setting δ = E_π[c(X, Y)] for any such coupling gives a valid uncertainty size.
So use any coupling between the evidence and P_0, or expert knowledge.
We discuss choosing δ non-parametrically in a moment.

Application 1: Illustration of Coupling

Given arrivals and claim sizes, let
  Z(t) = m_2^{−1/2} Σ_{k=1}^{N(t)} (X_k − m_1).

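A simulation sketch of this centered and scaled claim-surplus process. The arrival rate, the Pareto index, and the choices m_1 = E[V] and m_2 = rate · Var(V) (which make Z(1) roughly standardized) are our illustrative assumptions, not values from the slides:

```python
import numpy as np

rng = np.random.default_rng(1)

# Z(t) = m2^(-1/2) * sum_{k=1}^{N(t)} (X_k - m1), with Poisson(rate) arrivals
# and Pareto claims P(V > t) = (1 + t)^(-alpha).
rate, alpha = 200.0, 2.2
m1 = 1.0 / (alpha - 1.0)                               # E[V]
ev2 = 2.0 / ((alpha - 1.0) * (alpha - 2.0))            # E[V^2]
m2 = rate * (ev2 - m1 ** 2)                            # rate * Var(V), an assumption

n = rng.poisson(rate)                                  # N(1): number of claims in [0, 1]
jump_times = np.sort(rng.uniform(0.0, 1.0, n))         # claim arrival times (jump locations)
claims = (1.0 - rng.uniform(size=n)) ** (-1.0 / alpha) - 1.0  # Pareto via inverse CDF

# Z is piecewise constant between jumps; its values just after each jump are:
Z = np.cumsum(claims - m1) / np.sqrt(m2)
print(Z[-1], Z.max())
```

Pairing each simulated path with its Brownian approximation gives exactly the kind of coupling whose expected J_1 cost can be used to calibrate δ.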
Application 1: Coupling in Action

[Figure.]

Application 1: Numerical Example

Assume Poisson arrivals and Pareto claim sizes with index 2.2 (P(V > t) = 1/(1 + t)^{2.2}).
Cost c(x, y) = d_J(x, y)^2 (note the power of 2).
Used Algorithm 1 to calibrate δ (estimating means and variances from data).
[Table: ratios P_0(Ruin)/P_true(Ruin) and P_robust(Ruin)/P_true(Ruin) across barrier levels b.]

Additional Applications: Multidimensional Ruin Problems

The paper "Quantifying Distributional Model Risk via Optimal Transport" (B. & Murthy '16) contains more applications:
Multidimensional risk processes (explicit evaluation of c_B(x) for the d_J metric).
Control: min_θ sup_{P : D(P, P_0) ≤ δ} E[L(θ, Z)] (robust optimal reinsurance).

Connections to Machine Learning

The connection to machine learning helps further understand why optimal transport costs are sensible choices.
Paper: "Robust Wasserstein Profile Inference and Applications to Machine Learning" (B., Murthy & Kang '16).

Robust Performance Analysis in Machine Learning

Consider estimating β ∈ R^m in the linear regression Y_i = β^T X_i + e_i, where {(Y_i, X_i)}_{i=1}^n are data points.
The Ordinary Least Squares approach consists in estimating β via
  min_β MSE_n(β) = min_β (1/n) Σ_{i=1}^n (Y_i − β^T X_i)^2.
Apply the distributionally robust estimator based on optimal transport.

Connection to Sqrt-Lasso

Theorem (B., Kang, Murthy (2016)). Suppose that
  c((x, y), (x′, y′)) = ‖x − x′‖_q^2 if y = y′, and ∞ if y ≠ y′.
Then, if 1/p + 1/q = 1,
  max_{P : D_c(P, P_n) ≤ δ} E_P^{1/2}[ (Y − β^T X)^2 ] = √(MSE_n(β)) + √δ ‖β‖_p.
Remark 1: This is sqrt-lasso (Belloni et al. (2011)).
Remark 2: There are also representations for support vector machines, LAD-lasso, group lasso, adaptive lasso, and more!

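A small numeric sketch of the right-hand side: for a fixed β, the worst-case root-mean-squared error is just the empirical RMSE plus the penalty √δ ‖β‖_p. The data, β values, and δ below are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic regression data: Y = beta0 . X + noise (illustrative).
n, m = 200, 3
X = rng.standard_normal((n, m))
beta0 = np.array([1.0, -2.0, 0.5])
Y = X @ beta0 + 0.1 * rng.standard_normal(n)

def sqrt_lasso_objective(beta, delta=0.1, p=2):
    # Worst-case RMSE over the Wasserstein ball: sqrt(MSE_n) + sqrt(delta) * ||beta||_p.
    mse = np.mean((Y - X @ beta) ** 2)
    return np.sqrt(mse) + np.sqrt(delta) * np.linalg.norm(beta, ord=p)

print(sqrt_lasso_objective(beta0))
print(sqrt_lasso_objective(np.zeros(m)))  # all-zero benchmark
```

Minimizing this objective over β recovers the sqrt-lasso estimator, so the regularization parameter acquires a distributional-robustness interpretation.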
For Instance... Regularized Estimators

Theorem (B., Kang, Murthy (2016)). Suppose that
  c((x, y), (x′, y′)) = ‖x − x′‖_q if y = y′, and ∞ if y ≠ y′.
Then
  sup_{P : D_c(P, P_n) ≤ δ} E_P[ log(1 + e^{−Y β^T X}) ] = (1/n) Σ_{i=1}^n log(1 + e^{−Y_i β^T X_i}) + δ ‖β‖_p.
Remark 1: This is regularized logistic regression (see also Esfahani and Kuhn 2015).

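Analogously, for a fixed β the worst-case logistic loss is the empirical loss plus δ ‖β‖_p. A sketch with synthetic ±1 labels (all data and parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic binary data with labels in {-1, +1}.
n, m = 200, 3
X = rng.standard_normal((n, m))
beta0 = np.array([1.5, -1.0, 0.0])
Y = np.where(X @ beta0 + 0.5 * rng.standard_normal(n) > 0, 1.0, -1.0)

def robust_logistic_objective(beta, delta=0.05, p=2):
    # Empirical logistic loss plus the Wasserstein-DRO penalty delta * ||beta||_p.
    loss = np.mean(np.log1p(np.exp(-Y * (X @ beta))))
    return loss + delta * np.linalg.norm(beta, ord=p)

print(robust_logistic_objective(beta0))
print(robust_logistic_objective(np.zeros(m)))  # equals log(2) at beta = 0
```

Note the penalty enters with δ (not √δ) here because the cost uses the norm to the first power rather than squared.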
Connections to Empirical Likelihood

The paper also chooses δ optimally by introducing an extension of Empirical Likelihood called "Robust Wasserstein Profile Inference".

The Robust Wasserstein Profile Function

Pick δ = the 95% quantile of R_n(β*); we show that n R_n(β*) converges in distribution to an explicit limit law expressed in terms of E[e^2] and ‖N(0, Cov(X))‖_q^2.

Conclusions

Presented a systematic approach for quantifying model misspecification.
Approach based on optimal transport:
  sup_{D(P, P_0) ≤ δ} P(X ∈ B) = P_0(c_B(X) ≤ 1/λ*).
Closed-form solutions, tractable in terms of P_0, feasible calibration of δ.
New statistical estimators, connections to machine learning & regularization.
Extensions of Empirical Likelihood & connections to optimal regularization.
More informationACTEX CAS EXAM 3 STUDY GUIDE FOR MATHEMATICAL STATISTICS
ACTEX CAS EXAM 3 STUDY GUIDE FOR MATHEMATICAL STATISTICS TABLE OF CONTENTS INTRODUCTORY NOTE NOTES AND PROBLEM SETS Section 1 - Point Estimation 1 Problem Set 1 15 Section 2 - Confidence Intervals and
More informationConvex Optimization Lecture 16
Convex Optimization Lecture 16 Today: Projected Gradient Descent Conditional Gradient Descent Stochastic Gradient Descent Random Coordinate Descent Recall: Gradient Descent (Steepest Descent w.r.t Euclidean
More informationDuality in Regret Measures and Risk Measures arxiv: v1 [q-fin.mf] 30 Apr 2017 Qiang Yao, Xinmin Yang and Jie Sun
Duality in Regret Measures and Risk Measures arxiv:1705.00340v1 [q-fin.mf] 30 Apr 2017 Qiang Yao, Xinmin Yang and Jie Sun May 13, 2018 Abstract Optimization models based on coherent regret measures and
More informationEstimation and Optimization: Gaps and Bridges. MURI Meeting June 20, Laurent El Ghaoui. UC Berkeley EECS
MURI Meeting June 20, 2001 Estimation and Optimization: Gaps and Bridges Laurent El Ghaoui EECS UC Berkeley 1 goals currently, estimation (of model parameters) and optimization (of decision variables)
More informationTHE stochastic and dynamic environments of many practical
A Convex Optimization Approach to Distributionally Robust Markov Decision Processes with Wasserstein Distance Insoon Yang, Member, IEEE Abstract In this paper, we consider the problem of constructing control
More informationSTA414/2104 Statistical Methods for Machine Learning II
STA414/2104 Statistical Methods for Machine Learning II Murat A. Erdogdu & David Duvenaud Department of Computer Science Department of Statistical Sciences Lecture 3 Slide credits: Russ Salakhutdinov Announcements
More informationGeneralized quantiles as risk measures
Generalized quantiles as risk measures Bellini, Klar, Muller, Rosazza Gianin December 1, 2014 Vorisek Jan Introduction Quantiles q α of a random variable X can be defined as the minimizers of a piecewise
More informationUNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2013
UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2013 Exam policy: This exam allows two one-page, two-sided cheat sheets; No other materials. Time: 2 hours. Be sure to write your name and
More informationEconometrics I, Estimation
Econometrics I, Estimation Department of Economics Stanford University September, 2008 Part I Parameter, Estimator, Estimate A parametric is a feature of the population. An estimator is a function of the
More informationDiscussion of Sensitivity and Informativeness under Local Misspecification
Discussion of Sensitivity and Informativeness under Local Misspecification Jinyong Hahn April 4, 2019 Jinyong Hahn () Discussion of Sensitivity and Informativeness under Local Misspecification April 4,
More informationStatistical Measures of Uncertainty in Inverse Problems
Statistical Measures of Uncertainty in Inverse Problems Workshop on Uncertainty in Inverse Problems Institute for Mathematics and Its Applications Minneapolis, MN 19-26 April 2002 P.B. Stark Department
More informationMachine Learning Support Vector Machines. Prof. Matteo Matteucci
Machine Learning Support Vector Machines Prof. Matteo Matteucci Discriminative vs. Generative Approaches 2 o Generative approach: we derived the classifier from some generative hypothesis about the way
More informationRobust Hypothesis Testing Using Wasserstein Uncertainty Sets
Robust Hypothesis Testing Using Wasserstein Uncertainty Sets Rui Gao School of Industrial and Systems Engineering Georgia Institute of Technology Atlanta, GA 3332 rgao32@gatech.edu Yao Xie School of Industrial
More informationGenerative Models and Optimal Transport
Generative Models and Optimal Transport Marco Cuturi Joint work / work in progress with G. Peyré, A. Genevay (ENS), F. Bach (INRIA), G. Montavon, K-R Müller (TU Berlin) Statistics 0.1 : Density Fitting
More informationA PARAMETRIC MODEL FOR DISCRETE-VALUED TIME SERIES. 1. Introduction
tm Tatra Mt. Math. Publ. 00 (XXXX), 1 10 A PARAMETRIC MODEL FOR DISCRETE-VALUED TIME SERIES Martin Janžura and Lucie Fialová ABSTRACT. A parametric model for statistical analysis of Markov chains type
More informationCovariance function estimation in Gaussian process regression
Covariance function estimation in Gaussian process regression François Bachoc Department of Statistics and Operations Research, University of Vienna WU Research Seminar - May 2015 François Bachoc Gaussian
More informationMachine learning, shrinkage estimation, and economic theory
Machine learning, shrinkage estimation, and economic theory Maximilian Kasy December 14, 2018 1 / 43 Introduction Recent years saw a boom of machine learning methods. Impressive advances in domains such
More informationLecture 6: Conic Optimization September 8
IE 598: Big Data Optimization Fall 2016 Lecture 6: Conic Optimization September 8 Lecturer: Niao He Scriber: Juan Xu Overview In this lecture, we finish up our previous discussion on optimality conditions
More informationOn Information Divergence Measures, Surrogate Loss Functions and Decentralized Hypothesis Testing
On Information Divergence Measures, Surrogate Loss Functions and Decentralized Hypothesis Testing XuanLong Nguyen Martin J. Wainwright Michael I. Jordan Electrical Engineering & Computer Science Department
More informationBayesian Machine Learning
Bayesian Machine Learning Andrew Gordon Wilson ORIE 6741 Lecture 2: Bayesian Basics https://people.orie.cornell.edu/andrew/orie6741 Cornell University August 25, 2016 1 / 17 Canonical Machine Learning
More informationarxiv: v1 [cs.it] 30 Jun 2017
1 How biased is your model? Concentration Inequalities, Information and Model Bias Konstantinos Gourgoulias, Markos A. Katsoulakis, Luc Rey-Bellet and Jie Wang Abstract arxiv:1706.10260v1 [cs.it] 30 Jun
More informationCS-E3210 Machine Learning: Basic Principles
CS-E3210 Machine Learning: Basic Principles Lecture 4: Regression II slides by Markus Heinonen Department of Computer Science Aalto University, School of Science Autumn (Period I) 2017 1 / 61 Today s introduction
More informationHabilitationsvortrag: Machine learning, shrinkage estimation, and economic theory
Habilitationsvortrag: Machine learning, shrinkage estimation, and economic theory Maximilian Kasy May 25, 218 1 / 27 Introduction Recent years saw a boom of machine learning methods. Impressive advances
More informationICML Scalable Bayesian Inference on Point processes. with Gaussian Processes. Yves-Laurent Kom Samo & Stephen Roberts
ICML 2015 Scalable Nonparametric Bayesian Inference on Point Processes with Gaussian Processes Machine Learning Research Group and Oxford-Man Institute University of Oxford July 8, 2015 Point Processes
More informationSparse and Robust Optimization and Applications
Sparse and and Statistical Learning Workshop Les Houches, 2013 Robust Laurent El Ghaoui with Mert Pilanci, Anh Pham EECS Dept., UC Berkeley January 7, 2013 1 / 36 Outline Sparse Sparse Sparse Probability
More informationMachine Learning Basics: Maximum Likelihood Estimation
Machine Learning Basics: Maximum Likelihood Estimation Sargur N. srihari@cedar.buffalo.edu This is part of lecture slides on Deep Learning: http://www.cedar.buffalo.edu/~srihari/cse676 1 Topics 1. Learning
More informationMachine Learning
Machine Learning 10-601 Tom M. Mitchell Machine Learning Department Carnegie Mellon University October 11, 2012 Today: Computational Learning Theory Probably Approximately Coorrect (PAC) learning theorem
More informationRegression I: Mean Squared Error and Measuring Quality of Fit
Regression I: Mean Squared Error and Measuring Quality of Fit -Applied Multivariate Analysis- Lecturer: Darren Homrighausen, PhD 1 The Setup Suppose there is a scientific problem we are interested in solving
More informationMachine Learning 3. week
Machine Learning 3. week Entropy Decision Trees ID3 C4.5 Classification and Regression Trees (CART) 1 What is Decision Tree As a short description, decision tree is a data classification procedure which
More informationStatistical Inference
Statistical Inference Robert L. Wolpert Institute of Statistics and Decision Sciences Duke University, Durham, NC, USA Week 12. Testing and Kullback-Leibler Divergence 1. Likelihood Ratios Let 1, 2, 2,...
More informationDecomposition Algorithms for Two-Stage Distributionally Robust Mixed Binary Programs
Decomposition Algorithms for Two-Stage Distributionally Robust Mixed Binary Programs Manish Bansal Grado Department of Industrial and Systems Engineering, Virginia Tech Email: bansal@vt.edu Kuo-Ling Huang
More informationMachine Learning. Support Vector Machines. Manfred Huber
Machine Learning Support Vector Machines Manfred Huber 2015 1 Support Vector Machines Both logistic regression and linear discriminant analysis learn a linear discriminant function to separate the data
More informationAmbiguous Chance Constrained Programs: Algorithms and Applications
Ambiguous Chance Constrained Programs: Algorithms and Applications Emre Erdoğan Adviser: Associate Professor Garud Iyengar Partial Sponsor: TÜBİTAK NATO-A1 Submitted in partial fulfilment of the Requirements
More informationAssessing financial model risk
Assessing financial model risk and an application to electricity prices Giacomo Scandolo University of Florence giacomo.scandolo@unifi.it joint works with Pauline Barrieu (LSE) and Angelica Gianfreda (LBS)
More informationRISK AND RELIABILITY IN OPTIMIZATION UNDER UNCERTAINTY
RISK AND RELIABILITY IN OPTIMIZATION UNDER UNCERTAINTY Terry Rockafellar University of Washington, Seattle AMSI Optimise Melbourne, Australia 18 Jun 2018 Decisions in the Face of Uncertain Outcomes = especially
More information