Online supplementary material for the paper: Multistep stochastic mirror descent for risk-averse convex stochastic programs based on extended polyhedral risk measures

Vincent Guigues, FGV/EMAp, Rio de Janeiro, Brazil

1 Numerical experiments

1.1 Comparison of the confidence intervals from Section 3 and from [18]

We compare the coverage probabilities and the computational time of two confidence intervals, with confidence level at least 1 − α = 0.9, on the optimal value of problems (.6) and (.7), built using a sample ξ_N = (ξ₁, ..., ξ_N) of size N of ξ:

1. the (non-asymptotic) confidence interval C₁ = [Low₁(Θ, N), Up₁(Θ, N)] proposed in Section 3, with the parameters Θ chosen as in the corresponding corollary;

2. the (non-asymptotic) confidence interval C₂ = [Low₂(Θ, N), Up₂(Θ, N)] proposed in [18], where

   Low₂(Θ, N) = f̂_N − ∆(θ, Θ, N),   (1.1)

with ∆(θ, Θ, N) an explicit deviation term of order D_{ω,X} M/√(µ(ω)N), depending on θ and Θ (see [18] for its exact expression), and

   f̂_N = min_{x ∈ X} (1/N) Σ_{t=1}^{N} [ g(x_t, ξ_t) + G(x_t, ξ_t)ᵀ(x − x_t) ],

taking for x₁, ..., x_N the sequence of points generated by the stochastic mirror descent (SMD) algorithm with a constant step γ proportional to θ √(µ(ω)) D_{ω,X}/(M√N). In this expression, M satisfies E[exp{‖G(x, ξ)‖²/M²}] ≤ exp{1} for all x ∈ X. (Note that the parameter D_{ω,X} of [18] is the parameter D_{ω,X} given by (.9), up to a constant factor.) Using the deviation theorem of [18], we have P(f(x⋆) < Low₂(Θ, N)) ≤ 6 exp{−Θ/c₁} + exp{−Θ/c₂} + exp{−c₃ΘN}, where c₁, c₂, c₃ are numerical constants given in [18]. Recalling that P(f(x⋆) > Up₂(Θ, N)) ≤ exp{−Θ/c₄} for a numerical constant c₄, it follows that we can take for the upper bound the parameter Θ solving exp{−Θ/c₄} = α/2, and for the lower bound the parameter Θ solving 6 exp{−Θ/c₁} + exp{−Θ/c₂} + exp{−c₃ΘN} = α/2. All simulations were implemented in Matlab using the Mosek Optimization Toolbox.

1.1.1 Comparison of the confidence intervals on a risk-neutral problem

We consider problem (.6) with α₁ = ., α₂ = 0.9, λ = b = , a = , n ∈ {, 6, 8, }, and where ξ is a random vector with i.i.d. Bernoulli entries: Prob(ξ(i) = 1) = Ψ(i), Prob(ξ(i) = −1) = 1 − Ψ(i), with Ψ(i) randomly drawn over [0, 1].
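As a point of reference, the constant-step mirror descent recursion with the entropy distance-generating function on the simplex has a closed-form prox step: a multiplicative update followed by renormalization. A minimal sketch of this recursion (the function names and the averaging convention are ours, not the paper's):

```python
import numpy as np

def entropy_prox_step(x, g, gamma):
    """Mirror-descent step for omega(x) = sum_i x_i ln(x_i) on the simplex:
    the prox mapping reduces to a multiplicative update, renormalized."""
    y = x * np.exp(-gamma * g)
    return y / y.sum()

def smd_average(stoch_grad, n, N, gamma, rng):
    """Constant-step stochastic mirror descent started at the simplex center;
    returns the averaged iterate, the usual output of the method."""
    x = np.full(n, 1.0 / n)
    avg = np.zeros(n)
    for _ in range(N):
        avg += x / N
        x = entropy_prox_step(x, stoch_grad(x, rng), gamma)
    return avg
```

For instance, minimizing a stochastic linear objective over the simplex with this routine concentrates the averaged iterate on the coordinate with the smallest expected cost.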
Table 1: Average ratio of the widths of the confidence intervals (width of C₂ over width of C₁) for problem (.6), for several problem sizes n and sample sizes N.

Table 2: Average computational time (in seconds) of a confidence interval, estimated over the computed confidence intervals for problem (.6).

It follows that f(x) = α₁ µᵀx + α₂ xᵀV x, where µ(i) = E[ξ(i)] = 2Ψ(i) − 1 and V(i, j) = E[ξ(i)] E[ξ(j)] = (2Ψ(i) − 1)(2Ψ(j) − 1) for i ≠ j, while V(i, i) = E[ξ(i)²] = 1. For the norm ‖·‖ we take the ℓ₁-norm, and for the distance-generating function the entropy function ω(x) = Σ_{i=1}^{n} x(i) ln(x(i)). We first take θ = 1 in (1.1), meaning that C₂ is obtained running the SMD algorithm with a constant step γ proportional to √(µ(ω)) D_{ω,X}/(M√N), where M = α₁ + α₂. We simulate a set of instances of this problem and compute for each instance the confidence intervals C₁ and C₂. The coverage probabilities of the two non-asymptotic confidence intervals are equal to one for all parameter combinations. We report in Table 1 the mean ratio of the widths of the non-asymptotic confidence intervals. Interestingly, we observe that the confidence interval C₁ we proposed in Section 3 is less conservative than C₂: in these experiments, the mean width of C₂ divided by the mean width of C₁ varies between .8 and .8, as can be seen in Table 1. Another advantage of C₁ is that it tends to be computed more quickly (see Table 2 for problem sizes n = , 6, 8, and ), especially when the problem size n increases (see Table 3 for n = , , , and ), due to the fact that C₁ is computed from an analytic formula, while computing C₂ requires solving an additional optimization problem of size n. We now fix the problem size n =  and compute realizations of the confidence intervals on the optimal value of that problem. On the top left plot of Figure 1, we report the optimal value as well as the approximate optimal values ĝ_N using both variants of the SMD algorithm for three sample sizes N.
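The mean and second-moment formulas above follow directly from ξ(i) ∈ {−1, 1} with Prob(ξ(i) = 1) = Ψ(i): E[ξ(i)] = Ψ(i) − (1 − Ψ(i)) = 2Ψ(i) − 1, E[ξ(i)²] = 1, and independence factorizes the off-diagonal entries. A quick numerical check (the helper name is ours):

```python
import numpy as np

def pm1_moments(psi):
    """Mean vector and second-moment matrix of a vector of independent
    {-1, +1} entries with Prob(xi_i = 1) = psi_i."""
    mu = 2.0 * psi - 1.0            # E[xi_i] = psi_i - (1 - psi_i)
    V = np.outer(mu, mu)            # E[xi_i xi_j] = E[xi_i] E[xi_j] for i != j
    np.fill_diagonal(V, 1.0)        # E[xi_i^2] = 1 since xi_i^2 = 1
    return mu, V
```

A Monte Carlo draw of such a vector matches these closed forms to sampling accuracy.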
Table 3: Average computational time (in seconds) of a confidence interval, estimated over the computed confidence intervals for problem (.6).

On the remaining plots of this figure, the upper and
lower bounds of the confidence intervals C₁ and C₂ are reported for three sample sizes N. We observe that the upper limits of C₁ and C₂ are very close (though not identical, since the two variants use different steps). When the sample size N increases, ĝ_N gets closer to the optimal value and the upper (resp. lower) limits tend to decrease (resp. increase). In this figure, we also see that the C₁ lower limit is much larger than the C₂ lower limit (in accordance with the results of Table 1). We also note that the C₁ and C₂ lower bounds appear to be almost straight lines in these simulations; this comes from the fact that the random part ĝ_N in these bounds is quite small compared to the deterministic part (the remaining terms).

Figure 1: Approximate optimal value, upper and lower bounds for C₁ and C₂, on instances of problem (.6) of size n = .

Finally, we consider for the parameter θ involved in the computation of C₂ the range of values considered in [18]. For these values of θ, the average ratios of the widths of C₂ and C₁ are given in Table 4. These average ratios are all above .79 and as high as  for (θ, N, n) = (., , ), which shows again that C₂ is much more conservative than the interval C₁ proposed in Section 3 for this range of values of θ.

1.1.2 Comparison of the confidence intervals on a risk-averse problem

We reproduce the experiments of the previous section for problem (.7), with the norm ‖·‖ = ‖·‖₂ and the distance-generating function ω(x) = ½‖x‖₂². We take for M its corresponding expression for this problem (a function of α₁, α₂, ε, and n),
and two sets of values for (α₁, α₂, ε): (α₁, α₂, ε) = (0.9, ., 0.9) and the more risk-averse variant (α₁, α₂, ε) = (., 0.9, .). For these problems, we first discretize ξ, generating a sample which becomes the sample space. We compute the optimal value of (.7) using this sample, and we sample from this set of scenarios to generate the problem instances. For different problem and sample sizes, we generate the instances as before. Coverage probabilities of the non-asymptotic confidence intervals are equal to one for all parameter combinations. The time required to compute these confidence intervals is given in Table 5, while the average ratios of the widths of C₂ and C₁ are reported in Table 6. We observe again on this problem that C₂ is much more conservative than C₁, and that C₁ is computed more quickly than C₂ for all problem sizes. When ε is small and more weight is given to the CVaR, the optimization problem becomes more difficult, i.e., we need a larger sample size to obtain a solution of good quality. This can be seen in Figures 2 and 3. On the top left plots of Figures 2 and 3, for a problem of size n = , we plot realizations of the approximate optimal values ĝ_N using both variants of the SMD algorithm for two sample sizes (ε = . for Figure 2 and ε = 0.9 for Figure 3).

Table 4: Average ratio of the widths of the confidence intervals C₂ and C₁ for problem (.6), for several values of θ, sample sizes N, and problem sizes n.

Table 5: CVaR optimization (problem (.7)). Average computational time (in seconds) of a confidence interval, for both values of ε and several problem and sample sizes.

Table 6: CVaR optimization (problem (.7)). Average ratio of the widths of the confidence intervals C₁ and C₂.
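For readers reproducing the CVaR experiments, the sample CVaR of a loss distribution can be computed by averaging the worst ε-fraction of the losses, equivalently via the Rockafellar-Uryasev representation. A minimal estimator (our helper, with ε the tail probability; the exact parameterization used in problem (.7) may differ):

```python
import numpy as np

def cvar(losses, eps):
    """Sample CVaR_eps via the Rockafellar-Uryasev formula
    min_u u + E[(L - u)+]/eps, whose minimum is attained at
    u = VaR_eps, the (1 - eps)-quantile of the losses."""
    u = np.quantile(losses, 1.0 - eps)
    return u + np.mean(np.maximum(losses - u, 0.0)) / eps
```

With a mean-CVaR objective of the form α₁ E[L] + α₂ CVaR_ε[L], a small ε puts the weight on the far tail of the loss distribution, which is why larger sample sizes are needed to reach the same accuracy.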
Figure 2: CVaR optimization (problem (.7)). Approximate optimal value ĝ_N, upper and lower bounds of C₁ and C₂ on instances, problem size n =  and the smaller value of ε.
Figure 3: CVaR optimization (problem (.7)). Approximate optimal value ĝ_N, upper and lower bounds of C₁ and C₂ on instances, problem size n =  and ε = 0.9.

For a fixed sample size N, the realizations for ε = 0.9 are much closer to the optimal value than those for the smaller ε. On the remaining plots of Figures 2 and 3, we report the upper and lower bounds of the confidence intervals C₁ and C₂. We observe again that (i) the upper (resp. lower) bounds decrease (resp. increase) when the sample size increases, (ii) the C₁ and C₂ upper bounds are very close, and (iii) the C₁ lower bound is much larger than the C₂ lower bound (reflecting the fact that C₂ is much more conservative than C₁). Additionally, we observe that when ε is small and more weight is given to the CVaR (α₂ = 0.9), the upper and lower bounds become more distant from the optimal value, i.e., the width of the confidence intervals increases.

To conclude, the confidence intervals C₁ and C₂ cannot be compared directly, because both the constants involved and the steps used to generate the points x₁, ..., x_N are different. However, we hypothesize that the additional optimization problem involved in the computation of C₂ accounts for both its conservativeness and the difference in computation time.

1.2 Comparing the multistep and non-multistep variants of SMD to solve problem (.6)

We solve various instances of problem (.6) (with a = , b = ) using SMD and its multistep version MS-SMD defined in Section , taking ω(x) = ½‖x‖₂². In this case, these algorithms are the RSA and multistep RSA (MS-RSA). We fix the parameters α₁ = 0.9, α₂ = ., λ = , the initial point x₁ = [; ; ... ; ], and D_X = , and recall that µ(ω) = M(ω) = µ(f) = , ρ = , with the constants L, M, and M taking their corresponding expressions for this problem (functions of α₁, α₂, λ, and n).
Figure 4: Steps (left plot), average (computed over the runs) approximate optimal values (middle plot), and average (computed over the runs) value of the objective function at the solution (right plot), along the iterations of the SMD and MS-SMD algorithms run on problem (.6) with n =  and N = 8.

In this and the next section, ξ is again a random vector with i.i.d. Bernoulli entries: Prob(ξ(i) = 1) = Ψ(i), Prob(ξ(i) = −1) = 1 − Ψ(i), with Ψ(i) randomly drawn over [0, 1]. We first take n =  and choose the number of iterations using Proposition ., namely we take N =  + 78 A(f, ω) = 8, which ensures for the MS-SMD algorithm that E[f(y_{steps+1}^N) − f(x⋆)] ≤ . (we also check that for this value of N, relation (.76), an assumption of Proposition ., holds). For this value of N, the values of γ_t for each iteration of the MS-SMD algorithm, as well as the constant value of γ for the SMD algorithm, are represented in the left plot of Figure 4. We observe that the MS-RSA algorithm starts with larger steps (when we are still far from the optimal solution) and ends with smaller steps (when we get closer to the optimal solution) than the RSA algorithm. We run each algorithm several times and report in the middle plot of Figure 4 the average (over the runs) of the approximate optimal values computed along the iterations with both algorithms. We also report in the right plot of Figure 4 the average (over these runs) of the value of the objective function at the SMD and MS-SMD solutions. More precisely, for each run of the SMD algorithm, the approximate optimal value at iteration i is ĝ_i = (1/i) Σ_{k=1}^{i} g(x_k, ξ_k) (defined in Algorithm ), while at iteration j of the i-th step of the MS-SMD algorithm the approximate optimal value is ĝ_{i,j} = (1/j) Σ_{k=1}^{j} g(x_{i,k}, ξ_{i,k}) (defined in Algorithm ), where ξ_{i,k} and x_{i,k} are, respectively, the k-th realization of ξ and the k-th point generated within step i (of course, for a given run, the same samples are used for SMD and MS-SMD).
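The running approximate optimal value ĝ_i defined above is simply a prefix average of the observed stochastic costs, which can be computed in one vectorized pass. A small sketch (the function name is ours):

```python
import numpy as np

def running_averages(costs):
    """Prefix averages g_i = (1/i) * sum_{k<=i} g(x_k, xi_k) of the observed
    stochastic costs, for i = 1, ..., N."""
    costs = np.asarray(costs, dtype=float)
    return np.cumsum(costs) / np.arange(1, costs.size + 1)
```

Plotting these prefix averages along the iterations gives exactly the "approximate optimal value" curves shown in the figures.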
We observe that we get better (lower) approximations of the optimal value using the MS-RSA algorithm. After a large number of iterations, the two algorithms provide very close approximations of the optimal value (themselves close to the optimal value of the problem), which is in agreement with the results of Sections  and , which state that for both algorithms the approximate optimal values converge in probability to the optimal value of the problem. However, the MS-RSA algorithm provides an approximate solution of good quality much more quickly than the RSA algorithm. We also observe that while the sample size N = 8 chosen on the basis of Proposition . indeed allows us to solve the problem with good accuracy, it is very conservative. In a second series of experiments, we choose various problem sizes n and smaller sample sizes N, namely (n, N) = (, ), (n, N) = (, ), (n, N) = (, ), and (n, N) = (, ), still
observing solutions of good quality. For these values of the pair (n, N), the values of the steps used by the SMD and MS-SMD algorithms are reported in Figure 5. Here again, the MS-RSA algorithm starts with larger steps and ends with smaller steps.

Figure 5: Steps used by the SMD and MS-SMD algorithms to solve problem (.6) with (n, N) = (, ) (top left plot), (n, N) = (, ) (top right plot), (n, N) = (, ) (bottom left), and (n, N) = (, ) (bottom right).

The average (over the runs) of the approximate optimal value and of the value of the objective function at the SMD and MS-SMD solutions are reported in Figures 6 and 7. We still observe in these simulations that MS-SMD obtains a solution of good quality much more quickly than SMD and ends up with a better solution, even when only two different step sizes are used for MS-SMD.
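The step pattern visible in these figures (a constant step within each stage, shrinking from one stage to the next) can be reproduced with a piecewise-constant schedule. A sketch of such a schedule; the halving factor and stage lengths here are illustrative choices of ours, not the paper's formulas:

```python
import numpy as np

def multistep_schedule(stage_lengths, gamma0, shrink=0.5):
    """Piecewise-constant step schedule: gamma0 on the first stage, multiplied
    by `shrink` at the start of each subsequent stage."""
    return np.concatenate([np.full(n_k, gamma0 * shrink**k)
                           for k, n_k in enumerate(stage_lengths)])
```

Compared with a single constant step tuned for the whole horizon, such a schedule starts with larger steps and ends with smaller ones, matching the qualitative behavior reported for the multistep variant.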
Figure 6: Average over the realizations of the approximate optimal values computed by the SMD and MS-SMD algorithms to solve (.6). Top left: (n, N) = (, ), top right: (n, N) = (, ), bottom left: (n, N) = (, ), bottom right: (n, N) = (, ).
Figure 7: Average over the realizations of the values of the objective function at the approximate solutions computed by the SMD and MS-SMD algorithms to solve (.6). Top left: (n, N) = (, ), top right: (n, N) = (, ), bottom left: (n, N) = (, ), bottom right: (n, N) = (, ).
1.3 Comparing the multistep and non-multistep variants of SMD to solve problem (.8)

We reproduce the experiment of the previous section, running SMD and MS-SMD on problem (.8), taking ω(x) = ½‖x‖₂², ε = 0.9, α₁ = ., α₂ = 0.9, λ = , the initial point x₁ = [; ; ... ; ], and D_X = , and recall that µ(ω) = M(ω) = µ(f) = , ρ = , with the constants L, M, and M taking their corresponding expressions for this problem (functions of α₁, α₂, ε, λ, and n). We consider again four combinations of the pair (n, N): (n, N) = (, ), (, ), (, ), and (, ). The steps used along the iterations of the SMD and MS-SMD algorithms are reported in Figure 8.

Figure 8: Steps used by the SMD and MS-SMD algorithms to solve problem (.8) with (n, N) = (, ) (top left plot), (n, N) = (, ) (top right plot), (n, N) = (, ) (bottom left), and (n, N) = (, ) (bottom right).

The average (computed running the algorithms several times) of the approximate optimal values
Figure 9: Average over the realizations of the approximate optimal values computed by the SMD and MS-SMD algorithms to solve (.8). Top left: (n, N) = (, ), top right: (n, N) = (, ), bottom left: (n, N) = (, ), bottom right: (n, N) = (, ).

and of the values of the objective function at the approximate solutions are reported in Figures 9 and 10. In these experiments, we observe again that the MS-SMD approximate solutions are better along the iterations and at the end of the optimization process.
Figure 10: Average over the realizations of the values of the objective function at the approximate solutions computed by the SMD and MS-SMD algorithms to solve (.8). Top left: (n, N) = (, ), top right: (n, N) = (, ), bottom left: (n, N) = (, ), bottom right: (n, N) = (, ).
More informationLinearly-Convergent Stochastic-Gradient Methods
Linearly-Convergent Stochastic-Gradient Methods Joint work with Francis Bach, Michael Friedlander, Nicolas Le Roux INRIA - SIERRA Project - Team Laboratoire d Informatique de l École Normale Supérieure
More informationMGMT 69000: Topics in High-dimensional Data Analysis Falll 2016
MGMT 69000: Topics in High-dimensional Data Analysis Falll 2016 Lecture 14: Information Theoretic Methods Lecturer: Jiaming Xu Scribe: Hilda Ibriga, Adarsh Barik, December 02, 2016 Outline f-divergence
More informationData-Driven Risk-Averse Stochastic Optimization with Wasserstein Metric
Data-Driven Risk-Averse Stochastic Optimization with Wasserstein Metric Chaoyue Zhao and Yongpei Guan School of Industrial Engineering and Management Oklahoma State University, Stillwater, OK 74074 Department
More informationORIGINS OF STOCHASTIC PROGRAMMING
ORIGINS OF STOCHASTIC PROGRAMMING Early 1950 s: in applications of Linear Programming unknown values of coefficients: demands, technological coefficients, yields, etc. QUOTATION Dantzig, Interfaces 20,1990
More informationWeighted uniform consistency of kernel density estimators with general bandwidth sequences
E l e c t r o n i c J o u r n a l o f P r o b a b i l i t y Vol. 11 2006, Paper no. 33, pages 844 859. Journal URL http://www.math.washington.edu/~ejpecp/ Weighted uniform consistency of kernel density
More informationAn Introduction to Laws of Large Numbers
An to Laws of John CVGMI Group Contents 1 Contents 1 2 Contents 1 2 3 Contents 1 2 3 4 Intuition We re working with random variables. What could we observe? {X n } n=1 Intuition We re working with random
More informationFinancial Optimization ISE 347/447. Lecture 21. Dr. Ted Ralphs
Financial Optimization ISE 347/447 Lecture 21 Dr. Ted Ralphs ISE 347/447 Lecture 21 1 Reading for This Lecture C&T Chapter 16 ISE 347/447 Lecture 21 2 Formalizing: Random Linear Optimization Consider the
More informationAdaptive Rejection Sampling with fixed number of nodes
Adaptive Rejection Sampling with fixed number of nodes L. Martino, F. Louzada Institute of Mathematical Sciences and Computing, Universidade de São Paulo, Brazil. Abstract The adaptive rejection sampling
More informationEstimating Unknown Sparsity in Compressed Sensing
Estimating Unknown Sparsity in Compressed Sensing Miles Lopes UC Berkeley Department of Statistics CSGF Program Review July 16, 2014 early version published at ICML 2013 Miles Lopes ( UC Berkeley ) estimating
More informationRome - May 12th Université Paris-Diderot - Laboratoire Jacques-Louis Lions. Mean field games equations with quadratic
Université Paris-Diderot - Laboratoire Jacques-Louis Lions Rome - May 12th 2011 Hamiltonian MFG Hamiltonian on the domain [0, T ] Ω, Ω standing for (0, 1) d : (HJB) (K) t u + σ2 2 u + 1 2 u 2 = f (x, m)
More informationLecture 4: Exponential family of distributions and generalized linear model (GLM) (Draft: version 0.9.2)
Lectures on Machine Learning (Fall 2017) Hyeong In Choi Seoul National University Lecture 4: Exponential family of distributions and generalized linear model (GLM) (Draft: version 0.9.2) Topics to be covered:
More informationAdvanced computational methods X Selected Topics: SGD
Advanced computational methods X071521-Selected Topics: SGD. In this lecture, we look at the stochastic gradient descent (SGD) method 1 An illustrating example The MNIST is a simple dataset of variety
More informationChapter 2. Poisson point processes
Chapter 2. Poisson point processes Jean-François Coeurjolly http://www-ljk.imag.fr/membres/jean-francois.coeurjolly/ Laboratoire Jean Kuntzmann (LJK), Grenoble University Setting for this chapter To ease
More informationStochastic Quasi-Newton Methods
Stochastic Quasi-Newton Methods Donald Goldfarb Department of IEOR Columbia University UCLA Distinguished Lecture Series May 17-19, 2016 1 / 35 Outline Stochastic Approximation Stochastic Gradient Descent
More informationOptimization Tools in an Uncertain Environment
Optimization Tools in an Uncertain Environment Michael C. Ferris University of Wisconsin, Madison Uncertainty Workshop, Chicago: July 21, 2008 Michael Ferris (University of Wisconsin) Stochastic optimization
More informationLecture 9: October 25, Lower bounds for minimax rates via multiple hypotheses
Information and Coding Theory Autumn 07 Lecturer: Madhur Tulsiani Lecture 9: October 5, 07 Lower bounds for minimax rates via multiple hypotheses In this lecture, we extend the ideas from the previous
More informationBandits : optimality in exponential families
Bandits : optimality in exponential families Odalric-Ambrym Maillard IHES, January 2016 Odalric-Ambrym Maillard Bandits 1 / 40 Introduction 1 Stochastic multi-armed bandits 2 Boundary crossing probabilities
More informationMaster s Written Examination
Master s Written Examination Option: Statistics and Probability Spring 016 Full points may be obtained for correct answers to eight questions. Each numbered question which may have several parts is worth
More information5 December 2016 MAA136 Researcher presentation. Anatoliy Malyarenko. Topics for Bachelor and Master Theses. Anatoliy Malyarenko
5 December 216 MAA136 Researcher presentation 1 schemes The main problem of financial engineering: calculate E [f(x t (x))], where {X t (x): t T } is the solution of the system of X t (x) = x + Ṽ (X s
More informationAlternative Characterizations of Markov Processes
Chapter 10 Alternative Characterizations of Markov Processes This lecture introduces two ways of characterizing Markov processes other than through their transition probabilities. Section 10.1 describes
More informationComputational statistics
Computational statistics Markov Chain Monte Carlo methods Thierry Denœux March 2017 Thierry Denœux Computational statistics March 2017 1 / 71 Contents of this chapter When a target density f can be evaluated
More information3.4 Linear Least-Squares Filter
X(n) = [x(1), x(2),..., x(n)] T 1 3.4 Linear Least-Squares Filter Two characteristics of linear least-squares filter: 1. The filter is built around a single linear neuron. 2. The cost function is the sum
More informationLeast squares under convex constraint
Stanford University Questions Let Z be an n-dimensional standard Gaussian random vector. Let µ be a point in R n and let Y = Z + µ. We are interested in estimating µ from the data vector Y, under the assumption
More information1 Overview. 2 Learning from Experts. 2.1 Defining a meaningful benchmark. AM 221: Advanced Optimization Spring 2016
AM 1: Advanced Optimization Spring 016 Prof. Yaron Singer Lecture 11 March 3rd 1 Overview In this lecture we will introduce the notion of online convex optimization. This is an extremely useful framework
More informationGaussian processes. Chuong B. Do (updated by Honglak Lee) November 22, 2008
Gaussian processes Chuong B Do (updated by Honglak Lee) November 22, 2008 Many of the classical machine learning algorithms that we talked about during the first half of this course fit the following pattern:
More informationHmms with variable dimension structures and extensions
Hmm days/enst/january 21, 2002 1 Hmms with variable dimension structures and extensions Christian P. Robert Université Paris Dauphine www.ceremade.dauphine.fr/ xian Hmm days/enst/january 21, 2002 2 1 Estimating
More informationStochastic Dual Dynamic Programming with CVaR Risk Constraints Applied to Hydrothermal Scheduling. ICSP 2013 Bergamo, July 8-12, 2012
Stochastic Dual Dynamic Programming with CVaR Risk Constraints Applied to Hydrothermal Scheduling Luiz Carlos da Costa Junior Mario V. F. Pereira Sérgio Granville Nora Campodónico Marcia Helena Costa Fampa
More informationPerformance Evaluation and Comparison
Outline Hong Chang Institute of Computing Technology, Chinese Academy of Sciences Machine Learning Methods (Fall 2012) Outline Outline I 1 Introduction 2 Cross Validation and Resampling 3 Interval Estimation
More informationChapter 9: Basic of Hypercontractivity
Analysis of Boolean Functions Prof. Ryan O Donnell Chapter 9: Basic of Hypercontractivity Date: May 6, 2017 Student: Chi-Ning Chou Index Problem Progress 1 Exercise 9.3 (Tightness of Bonami Lemma) 2/2
More informationRevisiting some results on the complexity of multistage stochastic programs and some extensions
Revisiting some results on the complexity of multistage stochastic programs and some extensions M.M.C.R. Reaiche IMPA, Rio de Janeiro, RJ, Brazil October 30, 2015 Abstract In this work we present explicit
More informationUniform Convergence of a Multilevel Energy-based Quantization Scheme
Uniform Convergence of a Multilevel Energy-based Quantization Scheme Maria Emelianenko 1 and Qiang Du 1 Pennsylvania State University, University Park, PA 16803 emeliane@math.psu.edu and qdu@math.psu.edu
More informationInstitute of Actuaries of India
Institute of Actuaries of India Subject CT3 Probability & Mathematical Statistics May 2011 Examinations INDICATIVE SOLUTION Introduction The indicative solution has been written by the Examiners with the
More informationAPPLICATIONS OF DIFFERENTIABILITY IN R n.
APPLICATIONS OF DIFFERENTIABILITY IN R n. MATANIA BEN-ARTZI April 2015 Functions here are defined on a subset T R n and take values in R m, where m can be smaller, equal or greater than n. The (open) ball
More informationSome new facts about sequential quadratic programming methods employing second derivatives
To appear in Optimization Methods and Software Vol. 00, No. 00, Month 20XX, 1 24 Some new facts about sequential quadratic programming methods employing second derivatives A.F. Izmailov a and M.V. Solodov
More information