OPTIMIZATION UNDER UNCERTAINTY A unified framework (Draft)


OPTIMIZATION UNDER UNCERTAINTY: A unified framework (Draft)

Warren B. Powell

September 15, 2016

A JOHN WILEY & SONS, INC., PUBLICATION

Copyright © 2016 by John Wiley & Sons, Inc. All rights reserved.

Published by John Wiley & Sons, Inc., Hoboken, New Jersey. Published simultaneously in Canada.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) , fax (978) , or on the web at . Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) , fax (201) .

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services please contact our Customer Care Department within the U.S. at , outside the U.S. at or fax . Wiley also publishes its books in a variety of electronic formats. Some content that appears in print, however, may not be available in electronic format.

Library of Congress Cataloging-in-Publication Data:
Optimization Under Uncertainty: A unified framework

Printed in the United States of America


CONTENTS

1 Decisions and Uncertainty
   Some sample problems
   Problem types
   State variables
   Types of decisions
   Types of uncertainty
   Models of system dynamics
   Objectives
   Staging of information and decisions
   Formulating a stochastic optimization problem
   A deterministic inventory problem
   The transition to a stochastic formulation
   Choosing inventory policies
   A generic formulation
   Solution strategies
   Designing policies for sequential decision problems
   Policy search
   Policies based on lookahead approximations
   Mixing and matching
   Pulling it all together
   Sequencing problem classes
   Bridging to statistics
   From deterministic to stochastic optimization
   Pedagogy
   Bibliographic notes

2 Canonical Problems
   The basic stochastic optimization problem
   Final reward formulation
   Cumulative reward
   Inventory/storage problems
   The multiarmed bandit problem
   Decision trees
   Online computation
   Two-stage stochastic programming
   Chance constrained problems
   Shortest paths
   A deterministic shortest path problem
   A stochastic shortest path problem
   Optimal stopping
   Markov decision processes
   Reinforcement learning
   Optimal control
   Multi-stage stochastic programming
   The nomadic trucker
   Statistics and machine learning
   A simple modeling framework for dynamic programs
   Bibliographic notes

3 Approximation Strategies
   Lookup tables with frequentist updating
   Lookup tables with Bayesian updating
   The updating equations for independent beliefs
   Updating for correlated beliefs
   Lookup tables and aggregation
   Hierarchical aggregation
   Computing bias and variance
   Modeling aggregation
   Combining multiple levels of aggregation
   Linear parametric models
   Linear regression review
   Regression variations
   Sparse additive models and Lasso
   Nonlinear parametric models
   Maximum likelihood estimation
   Sampled nonlinear models
   Neural networks - I
   Nonparametric models
   K-nearest neighbor
   Kernel regression
   Local polynomial regression
   Neural networks - II
   Support vector machines
   Indexed functions, tree structures and clustering
   Approximations and the curse of dimensionality
   Why does it work?**
   Correlations in hierarchical estimation
   Proof of Proposition
   Bibliographic notes
   Problems

4 Introduction to stochastic optimization
   An overview of learning problems
   Deterministic methods
   A stochastic shortest path problem
   A newsvendor problem with known distribution
   Chance constrained optimization
   Optimal control
   Discrete Markov decision processes
   Remarks
   Sampled models
   Formulating a sampled model
   Benders decomposition for a sampled convex problem
   Convergence
   Decomposition strategies
   Creating a sampled model
   Adaptive learning algorithms
   Modeling adaptive learning problems
   Objective functions for learning
   Designing policies
   Closing remarks
   Bibliographic notes

5 Derivative-Based Stochastic Search
   Some sample applications
   Stochastic gradient methods
   A stochastic gradient algorithm - asymptotic analysis
   A note on notation
   A finite horizon formulation
   Stepsizes
   Finite differences
   Derivatives of simulations
   Transient problems
   Recursive Benders decomposition
   The basic algorithm
   Benders with regularization
   Properties of algorithms
   Theoretical properties
   Empirical issues
   Why does it work?**
   Some probabilistic preliminaries
   An older proof
   A more modern proof
   Bibliographic notes

6 Adaptive estimation and stepsize policies
   Deterministic stepsize recipes
   Properties for convergence
   Stochastic stepsizes
   The case for stochastic stepsizes
   Convergence conditions
   Recipes for stochastic stepsizes
   Experimental notes
   Optimal stepsizes for nonstationary time series
   Optimal stepsizes for stationary data
   Optimal stepsizes for nonstationary data - I
   Optimal stepsizes for nonstationary data - II
   Optimal stepsizes for approximate value iteration
   Convergence
   Guidelines for choosing stepsize formulas
   Bibliographic notes
   Problems

7 Derivative-Free Stochastic Search
   Examples
   Lookup table belief models
   Frequentist belief model
   Bayesian belief model
   Frequentist or Bayesian?
   Objective functions for learning policies
   Policies
   Policy function approximations
   Cost function approximations
   Policies based on value function approximations
   Single period lookahead policies
   Multiperiod lookahead policies
   Hybrid policies
   Properties of policies
   Learning and the multiarmed bandit problem
   Gittins indices for learning with cumulative rewards
   Foundations
   Basic theory of Gittins indices
   Gittins indices for normally distributed rewards
   Comments
   Value of information policies
   The belief model
   The knowledge gradient for offline (final reward) learning
   Knowledge gradient for correlated beliefs
   Linear belief models
   Nonlinear belief models
   Concavity of information
   The knowledge gradient for online (cumulative reward) learning
   Some properties of the knowledge gradient
   Tuning policies
   The effect of units
   MOLTE - Optimal learning testing system
   Stochastic optimization with exogenous state information
   Learning on continuous decisions
   Physical states vs. belief states
   Bibliographic notes
   Problems

8 Physical state applications
   Deterministic problems
   A learning exercise: the nomadic trucker
   The shortest path problem
   The discrete budgeting problem
   The continuous budgeting problem
   Stochastic problems
   Decision trees
   A stochastic shortest path problem
   The gambling problem
   Asset valuation
   The asset acquisition problem - I
   The asset acquisition problem - II
   The lagged asset acquisition problem
   The batch replenishment problem
   The transformer replacement problem
   The dynamic assignment problem
   Information acquisition problems
   An information-collecting shortest path problem
   Bibliographic notes
   Problems

9 Modeling dynamic programs
   Notational style
   Modeling time
   The states of our system
   Defining the state variable
   The three states of our system
   The initial state S
   The post-decision state variable
   Partially observable states*
   Latent variables
   Forecasts and the transition kernel
   Flat vs. factored state representations*
   A shortest path illustration
   Modeling decisions
   Decisions, actions, and controls
   Making decisions
   The exogenous information process
   Basic notation for information processes
   Outcomes and scenarios
   Lagged information processes
   Models of information processes
   Supervisory processes*
   Policies in the information process*
   The transition function
   A general model
   Model-free dynamic programming
   The resource transition function
   Exogenous transitions
   The objective function
   The contribution function
   Random contributions
   The value of a policy
   Finding the best policy
   Risk-based and robust objective functions
   Illustration: An energy storage model
   Base models and lookahead models*
   Advanced probabilistic modeling concepts**
   A measure-theoretic view of information
   Conditional expectations for sequential decision problems**
   Bibliographic notes
   Problems

10 Modeling uncertainty
   Types of uncertainty
   Observational errors
   Prognostic uncertainty
   Experimental noise
   Transitional uncertainty
   Inferential (or diagnostic) uncertainty
   Model uncertainty
   Systematic exogenous uncertainty
   Control uncertainty
   Algorithmic instability
   Discussion
   Creating random processes
   Sample paths
   State/action dependent processes
   Types of distributions
   Monte Carlo simulation
   Generating uniform [0, 1] random variables
   Uniform and normal random variables
   Generating random variables from inverse cumulative distributions
   Inverse cumulative from quantile distributions
   Distributions with uncertain parameters
   Sampling vs. sampled models
   Iterative sampling: A stochastic gradient algorithm
   Static sampling: Solving a sampled model
   Sampled representation with Bayesian updating
   Efficient sampling
   Variance reduction methods
   Importance sampling
   Quantization methods
   Sampling in high dimensions
   Adversarial sampling
   Bibliographic notes

11 Policies
   Classes of policies
   Policy function approximations
   Cost function approximations
   Value function approximations
   Lookahead policies
   Hybrid strategies
   Randomized policies
   Illustration: An energy storage model revisited
   Policy function approximation
   Cost function approximation
   Value function approximation
   Deterministic lookahead
   Hybrid lookahead-cost function approximation
   Experimental testing
   How to choose a policy?
   Bibliographic notes
   Problems

12 Policy function approximations and policy search
   Classes of policy function approximations
   Boltzmann policies for discrete actions
   Affine policies
   Constraints
   Monotone policies
   Lookup table policies
   Nonparametric policies
   Locally parametric policies
   Policy search
   Online vs. offline learning
   Derivative-based
   Derivative-free - active search
   Derivative-free - passive search
   Introduction to policy search
   Derivative-free optimization
   The ranking and selection problem
   The frequentist approach
   The Bayesian view
   Bayesian updating with correlated beliefs
   Some exploration policies
   The knowledge gradient algorithm for discrete alternatives
   The basic idea
   Computation
   Simulation optimization
   An indifference zone algorithm
   Optimal computing budget allocation
   Online vs. offline objectives
   Closing remarks
   Bibliographic notes

13 Cost function approximations
   Optimal myopic policies
   Bibliographic notes

14 Discrete Markov decision processes
   The optimality equations
   Bellman's equations
   Computing the transition matrix
   Random contributions
   Bellman's equation using operator notation*
   Finite horizon problems
   Infinite horizon problems
   Value iteration
   A Gauss-Seidel variation
   Relative value iteration
   Bounds and rates of convergence
   Policy iteration
   Hybrid value-policy iteration
   Average reward dynamic programming
   The linear programming method for dynamic programs
   Monotone policies*
   The model
   Submodularity and other stories
   From submodularity to monotonicity
   Why does it work?**
   The optimality equations
   Convergence of value iteration
   Monotonicity of value iteration
   Bounding the error from value iteration
   Randomized policies
   Optimality of monotone policies
   Bibliographic notes
   Problems

15 Dynamic programs with special structure
   Monotone dynamic programming
   Special cases with analytical solutions
   Linear-quadratic regulation
   Bibliographic notes

16 Backward approximate dynamic programming
   Numerical approximation methods
   Linear models using sampled states
   Low rank approximations
   Bibliographic notes

17 Forward ADP I: The value of a policy
   Sampling the value of a policy
   Direct policy evaluation for finite horizon problems
   Policy evaluation for infinite horizon problems
   Temporal difference updates
   TD(λ)
   TD(0) and approximate value iteration
   TD learning for infinite horizon problems
   Stochastic approximation methods
   Recursive least squares for linear models
   Recursive least squares for stationary data
   Recursive least squares for nonstationary data
   Recursive estimation using multiple observations
   Recursive time-series estimation*
   Bellman's equation using a linear model
   A matrix-based derivation*
   A simulation-based implementation
   Least squares temporal differences (LSTD)
   Least squares policy evaluation (LSPE)
   Analysis of TD(0), LSTD and LSPE using a single state
   Recursive least squares and TD(0)
   LSPE
   LSTD
   Discussion
   Gradient-based methods for approximate value iteration*
   Least squares temporal differencing with kernel regression*
   Value function approximations based on Bayesian learning*
   Minimizing bias
   Lookup tables with correlated beliefs
   Parametric models
   Creating the prior
   Learning algorithms and stepsizes
   Least squares temporal differences
   Least squares policy evaluation
   Recursive least squares
   Bounding 1/n convergence for approximate value iteration
   Discussion
   Why does it work?*
   Derivation of the recursive estimation equations
   The Sherman-Morrison updating formula
   Bibliographic notes
   Problems

18 Forward ADP II: Policy optimization
   Overview of algorithmic strategies
   Approximate value iteration and Q-learning using lookup tables
   Value iteration using a pre-decision state variable
   On-policy, off-policy and the exploration-exploitation problem
   Q-learning
   Value iteration using a post-decision state variable
   Value iteration using a backward pass
   Statistical bias in the max operator
   A stochastic shortest path problem
   Approximate value iteration using linear models
   Illustrations using regression models
   A geometric view of basis functions*
   Approximate policy iteration
   Finite horizon problems using lookup tables
   Finite horizon problems using basis functions
   LSTD for infinite horizon problems using basis functions
   The actor-critic paradigm
   Policy gradient methods
   The linear programming method using basis functions
   Approximate policy iteration using kernel regression*
   Finite horizon approximations for steady-state applications
   Bibliographic notes

19 Forward ADP III: Convex functions
   Piecewise linear approximations for scalar functions
   Multiple convex dimensions
   Benders decomposition
   Benders with exogenous state variable
   High-dimensional applications
   Values versus marginal values
   Linear approximations
   Solving a resource allocation problem using piecewise linear functions
   Regression methods
   Value iteration for multidimensional decision vectors
   Cutting planes for multidimensional functions
   Convexity with exogenous information state
   Why does it work?**
   The projection operation
   Bibliographic notes
   Problems

20 Lookahead policies
   Optimal policies using lookahead
   Strategies for approximating the lookahead model
   Deterministic lookahead
   A shortest path illustration
   Decision trees
   Stochastic lookahead models with discrete actions
   Lookahead using backward dynamic programming
   Monte Carlo tree search
   Stochastic lookahead with vector decisions
   Sparse sampling tree search
   Roll-out heuristics
   Rolling horizon procedures
   Discussion
   Bibliographic notes

21 Risk and Robustness
   Bibliographic notes

PART I - INTRODUCTION

We begin our journey by providing an overview of the diversity of optimization problems under uncertainty. These have been introduced over the years by a number of different communities, motivated by different applications. Many of these problems have motivated entire fields of research under names such as dynamic programming, stochastic programming, optimal control, stochastic search, ranking and selection, and multiarmed bandit problems.

CHAPTER 1

DECISIONS AND UNCERTAINTY

There are few problems that offer the richness and diversity of making decisions in the presence of uncertainty. While often presented under impenetrable names such as stochastic programming, stochastic control, and Markov decision processes, decision making under uncertainty is a universal experience, something every human has had to manage since our first experiments trying new foods when we were two years old. Some samples of everyday problems where we have to manage uncertainty include:

Personal decisions - These might include how much to withdraw from an ATM, finding the best path to a new job, and deciding what time to leave to make an important appointment.

Health decisions - Examples include joining a health club, getting annual checkups, having that mole checked, using dental floss, and scheduling a colonoscopy.

Investment decisions - What mutual fund should you use? How should you allocate your investments? How much should you put away for retirement? Should you rent or purchase a house?

Diet decisions - Protein or carbohydrates? Diet cola vs. orange juice? Eggs or cereal?

Decisions under uncertainty span virtually every major field. Samples include:

Business - What products to sell, with what features? Which supplies should you use? What price should you charge? How should we manage our fleet of delivery vehicles? Which menu attracts the most customers?

Internet - What ads to display to maximize ad-clicks? Which movies attract the most attention? When/how should mass notices be sent?

Engineering - How to design devices from aerosol cans to an electric vehicle, bridges to transportation systems, transistors to computers?

Materials science - What combination of temperatures, pressures and concentrations should we use to create a material with the highest strength?

Medical research - What molecular configuration will produce the drug which kills the most cancer cells? What set of steps are required to produce single-walled nanotubes?

Economics - What interest rate should the Federal Reserve charge? What levels of market liquidity should be provided? What guidelines should be imposed on investment banks?

Needless to say, listing every possible type of decision is an impossible task. However, we would argue that in the context of a particular problem, listing the decisions is easier than identifying all the sources of uncertainty.

The diversity of problems is so broad that one might ask: can all these problems be covered in a single book? There are two reasons for pursuing such an ambitious goal:

A universal formulation - The entire range of problems suggested above can be modeled, with perhaps a few adjustments, using a single formulation. This formulation is quite general, and makes it possible to model a truly diverse range of problems with a high level of realism.

Cross fertilization - Ideas developed from one problem class or discipline can be used to help solve problems traditionally associated with different areas. There has been a historical pattern for each community to pick up the modeling styles and solution approaches used in the different books captured in figure 1.1.
These fields include:

Decision analysis - This community generally works with discrete actions, possibly discrete random outcomes, but often features complex utility functions and the handling of risk. Problems are relatively small.

Stochastic search (derivative based) - This field is centered on the basic problem min_x E F(x, W), where x is continuous (scalar or vector), W is a random variable, and where the expectation typically cannot be computed. However, we assume we can compute gradients ∇_x F(x, W) for a known W.

Ranking and selection (derivative free) - This field is also focused on min_x E F(x, W), but now we assume that x can take on one of a finite set of outcomes {x_1, x_2, ..., x_M}.

Simulation-optimization - This community evolved from within the setting of discrete event simulations, where we need to use a simulator (such as one of a manufacturing system) to compare different designs. The field of simulation optimization started

with the ranking and selection problem, but has evolved to span a wider range of problems.

Online computation - This field describes methods where decisions are made by simply reacting to information as it comes in, without considering the impact of decisions now on the future. This field was originally motivated by mobile applications where energy and computing capacity were limited.

Optimal control - The roots of this community are in engineering, focusing on the control of aircraft, spacecraft, and robots, but the field has expanded to economics and computer science. The original problems were written in continuous time with continuous controls, but models are often written in discrete time (typically with discrete controls), since this is how computation is done. Problems are typically deterministic, possibly with uncertain parameters, and possibly with additive noise in the transition function, but these methods have been widely adopted, especially in finance, where problems are purely stochastic.

Robust optimization - This is an extension of stochastic search with roots in engineering, where the goal is to find the best design x (of a building, an aircraft, a car) that works under the worst instance of a random variable W (which could be wind, temperature, crash conditions). Instead of min_x E F(x, W), robust optimization problems seek to solve min_x max_w F(x, w). For example, we might want to design a wing to handle the worst possible wind stresses.

Optimal stopping - This is an important problem in finance, where we have to study when to stop and sell (or buy) an asset. It also arises in engineering when we have to decide when to stop and repair a piece of machinery. The problem is to find a time τ to sell or repair, where τ can only depend on the information available at that time. The problem is popular within the applied probability community.

Markov decision processes - This community evolved primarily within applied probability, and describes a system that takes on a discrete state s, and transitions to s' when we take (discrete) action a with (known) probability p(s'|s, a).

Reinforcement learning - This field started by modeling animal behavior seeking to solve a problem (such as finding a path through a maze), where experiences (in the form of successes or failures) were captured by estimating the value of a state-action pair.

Approximate dynamic programming - Several communities have developed ways of overcoming the curse of dimensionality inherent in the tools of discrete Markov decision processes by using simulation-based methods to solve dynamic programs. There are very close parallels between the fields of approximate dynamic programming (also known as adaptive dynamic programming) and reinforcement learning, although the motivating problems tend to be somewhat different, as are the research styles.

Stochastic programming - This community evolved from math programming with the desire to insert random variables into linear programs. The classical problem is the two-stage problem where you pick x_0 (e.g. how many Christmas trees to plant), after which we learn random information W_1, and then we make a second decision x_1 (e.g. shipping Christmas trees to customers).

Figure 1.1 A sampling of major books representing different fields in stochastic optimization.

Sequential kriging - This community evolved within geosciences, where we need to learn the largest value of a continuous function f(x) through expensive experiments (originally field experiments). The vector x was originally a two-dimensional point in space, where f(x) might be the amount of oil or natural gas yielded by drilling a hole at x.

Multiarmed bandit problems - The roots of this problem come from applied probability, where the goal is to identify a discrete alternative (known as an arm in this community) that yields the highest reward, where we do not know the value of each arm. We learn the value through repeated experimentation, accumulating rewards as we progress.

This is hardly a comprehensive list of stochastic optimization problems, but it provides a sense of the different communities that have been drawn into the arena of making decisions under uncertainty. Some of these communities have their roots in deterministic problems (optimal control, stochastic programming), while others have their roots in communities such as applied probability and simulation.

A byproduct of this confluence of communities is a variety of notational systems and mathematical styles. This diversity of languages disguises common approaches developed in different communities. More important than this process of re-invention of methods is the potential for cross-fertilization of ideas across communities. Indeed, as of this writing there persists a fair amount of competition between communities, where proponents of one methodological approach will insist that their approach is better than another.

This book is not intended to replace the much more thorough treatments in these books. Rather, our goal is to provide a unified framework that offers a more comprehensive perspective of these fields. We have found that a single problem can be reasonably approached by techniques from multiple fields such as dynamic programming (operations research), model predictive control (control theory) and policy search (computer science),

where any one of these methods may work best, depending on the specific characteristics of the data. At the same time, powerful hybrid strategies can be created by combining the tools from different fields.

1.1 SOME SAMPLE PROBLEMS

A few sample problems provide a hint into the major classes of decision problems that involve uncertainty.

The newsvendor problem

The newsvendor problem is one of the most widely used examples of stochastic optimization problems. Imagine that you have to decide on a quantity x of newspapers to purchase at a unit price c that will be placed in the bin the next day. You then sell up to a random demand D, charging a price p. Your profit F(x, D) is given by

F(x, D) = p min{x, D} - cx.

We do not know the demand D when we decide x, so we have to find x that maximizes the expected profit, given by

max_x E{p min{x, D} - cx}.   (1.1)

The newsvendor problem comes in many variations, which explains its continued popularity after decades of study. One of the most important variations depends on whether the distribution of D is known (which allows us to solve (1.1) analytically) or unknown (or not easily computable). Prices and costs may be random (we may be purchasing energy from the power grid at highly random prices, storing it in a battery to be used later). Newsvendor problems can be formulated in two ways.

Offline (final reward) formulation - The most classical formulation, stated in (1.1), is one where we find a single solution x which, when implemented, solves (1.1). In this setting, we are allowed to search for the best solution without worrying about how we arrived at x, which means we are only interested in the final reward (that is, the quality of the solution after we have finished our search). We can further divide this problem into two formulations:

The asymptotic formulation - This is the formulation in (1.1) - we are looking for a single, deterministic solution x to solve (1.1).
The finite time formulation - Imagine that we are allowed N samples of the function F(x, D) to find a solution x^N, which will depend on both how we have gone about finding our solution, as well as the noisy observations we made along the way.

Online (cumulative reward) formulation - Now imagine that we do not know the distribution of the demand D, but rather have to experiment by choosing x and then observing D, after which we can figure out how much money we made that day. The problem is that we have to maximize profits over all the days while we are learning the best solution, which means we have to live with the profits (or losses) earned along the way.
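When the demand distribution is unknown but we can draw samples, the offline formulation above is often approached by optimizing over a sampled model. A minimal sketch follows; the prices, the demand distribution, and the search range are illustrative choices, not values from the text:

```python
import random

# Sampled-model version of the newsvendor problem (1.1): choose the
# order quantity x that maximizes the average profit over N sampled
# demands, rather than the (uncomputable) expectation.

def profit(x, d, p=5.0, c=3.0):
    """F(x, D) = p * min(x, D) - c * x."""
    return p * min(x, d) - c * x

random.seed(1)
demands = [random.randint(50, 150) for _ in range(1000)]  # samples of D

# Exhaustive search over candidate order quantities.
best_x = max(range(0, 201),
             key=lambda x: sum(profit(x, d) for d in demands) / len(demands))

# With p = 5 and c = 3 the classical critical fractile is (p - c)/p = 0.4,
# so best_x should land near the 40th percentile of the sampled demand.
print(best_x)
```

In the online formulation we would instead have to commit to an order quantity each day while learning, so the profits earned (or lost) during the search would count toward the objective.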

A stochastic shortest path problem

Imagine that we are trying to find the best path over an urban network. As a result of congestion, travel times on each link (i, j) joining nodes i and j may be random. We may assume that we know the cost from i to j as soon as we arrive at node i. We may assume the distribution of the cost c_ij is known, or unknown. In addition, the distribution may be stationary (it does not change with time) or nonstationary (reflecting either predictable congestion patterns, or random accidents).

Optimizing medical decisions

During a routine medical exam, a physician realizes that the patient has high blood sugar. Courses of treatment can include diet and exercise, or a popular drug called metformin that is known to reduce blood sugar. Information collected as part of the medical history can be used to guide this decision, since not every patient can handle metformin. The doctor will have to learn how this patient responds to a particular course of treatment, which is information that he can use to help guide his treatment of not only this patient, but others with similar characteristics.

Pricing an electricity contract

A utility has been asked to price a contract to sell electricity over a 5-year horizon (which is quite long). The utility can exploit 5-year contracts on fuels (coal, oil, natural gas), which provide a forecast (by financial markets) of the price of the fuels in the future. Fuel prices can be translated to the cost of producing electricity from different generators, each of which has a heat rate that translates energy input (typically measured in millions of BTUs) into electricity output (measured in megawatt-hours). We can predict the price of electricity by finding the intersection between the supply curve, constructed by sorting generators from lowest to highest cost, and the projected demand.
Five years from now, there is uncertainty in both the prices of different fuels, as well as the demand for electricity (known as the load in the power community).

Inverted pendulum problem

This is one of the most classical problems in engineering control. Imagine you have a vertical rod hinged at the bottom on a cart that can move left and right. The rod tends to fall in whatever direction it is leaning, but this can be countered by moving the cart in the same direction to push the rod back up. The challenge is to control the cart in a way that the rod remains upright, ideally minimizing the energy to maintain the rod in its vertical position. These problems are typically low-dimensional (this problem has a one- or two-dimensional controller, depending on how the dynamics are modeled), and deterministic, although uncertainty can be introduced in the transition (for example to reflect wind) or in the implementation of the control.

Managing blood inventories

There are eight blood types (A, B, AB, O, each of which can be either positive or negative). Blood can be stored for up to six weeks, and it may be frozen so that it can be held for longer periods of time. Each blood type has different substitution options (see figure 1.2). For example, anyone can accept O-negative blood (known as the universal donor), while A-positive blood can only be used for people with A-positive or AB-positive blood (AB-positive is known as the universal recipient). As a result of different substitution options, it is not necessarily the case that you want to use, say, A-positive blood for an A-positive patient, who can also be handled with O-negative, O-positive, or A-negative as well as A-positive blood.

Figure 1.2 The different substitution possibilities between donated blood and patient types (from Cant (2006)).

Hospitals (or the Red Cross) have to periodically run blood drives, which produce an uncertain response. At the same time, demand for blood comes from a mixture of routine, scheduled surgeries and bursts from large accidents, storms and domestic violence. If the problem is modeled in weekly time increments, blood may have an age from 0 to 5 weeks. These six values, times the eight blood types, times two (frozen or not), give us 96 values for the blood attribute. There are hundreds of possible assignments of these blood types to the different patient types.

These problems are but a tiny sample of the wide range of problems we may encounter that combine decisions and uncertainty. These applications illustrate both offline (design first) and online (decisions have to be made and experienced over time) settings. Decisions may be scalars, vectors (the blood), or categorical (the medical decisions). Uncertainty can be introduced in different forms. And finally, there are different types of objectives, including a desire to do well on average (the newsvendor problem is typically repeated many times), as well as to handle risk (of a power outage or a blood shortage).

1.2 PROBLEM TYPES

Although we revisit this later in more detail, it is useful to have a sense of the different problem classes. We provide this by running down the different dimensions of a sequential decision problem, identifying the varieties of each of the dimensions. We wait until chapter 9 before providing a much more comprehensive presentation of how to model this rich problem class.

State variables

The state variable S_t captures the information available at time t (we may also use S^n to describe the state at iteration n) to make a decision. State variables come in three general flavors:

Physical state R_t - This might be inventory, the location of a vehicle on a network, or the amount of money invested in a stock, where R_t may be a scalar (money in the bank), a low-dimensional vector (the inventory of different blood types) or a very high-dimensional vector (the number of different types of aircraft described by a vector of attributes). Physical states restrict the decisions we can make in some way. For example, we cannot sell more shares of a stock than we own. The location on a network determines the decisions we can make. In the vast majority of problems, decisions affect the physical state, directly or indirectly.

Informational state I_t - This includes information that affects the behavior of a problem, such as the temperature (which influences evaporation from a reservoir), economic variables (which influence stock prices and interest rates), or medical factors (e.g. whether someone needing a knee replacement is also a smoker). An informational state variable is any relevant piece of information that we know perfectly, and that we have not classified as a physical state.

Knowledge state K_t - The knowledge state describes any variable or parameter that we do not know perfectly, and instead describe with a probability distribution. These problems are studied under different names, such as multiarmed bandit problems (or, more broadly, optimal learning problems) and partially observable Markov decision processes (POMDPs).

When we return to modeling in chapter 9, we will see that state variables come in a variety of styles, as we can mix and match physical, informational and knowledge state variables in different combinations.
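One way to picture how the three flavors combine is as fields of a single state object. A minimal sketch follows; the particular fields (a blood inventory, a temperature, a belief about demand) are hypothetical examples chosen to mirror the discussion above:

```python
# Sketch: a state S_t bundling a physical state R_t, an informational state
# I_t, and a knowledge state K_t. All example fields are hypothetical.
from dataclasses import dataclass

@dataclass
class State:
    # Physical state R_t: resources the decision directly moves around.
    blood_inventory: dict       # e.g. {"A+": 12, "O-": 5} units on hand
    # Informational state I_t: known perfectly, but not a resource.
    temperature: float          # influences evaporation, demand, ...
    # Knowledge state K_t: known only up to a probability distribution,
    # here a (mean, variance) belief about an unknown demand parameter.
    demand_belief: tuple = (100.0, 25.0)

S_t = State(blood_inventory={"A+": 12, "O-": 5}, temperature=21.5)
print(S_t.demand_belief)  # (100.0, 25.0)
```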
In addition, we can see different behaviors depending on the relationship between the state variable now and in the past.

Types of decisions

Decisions come in many different styles, and this has produced a variety of notational systems. The most common canonical notational systems for decisions are:

Discrete action a - This notation is typically used when a is discrete (binary, integer, categorical). It is widely used in computer science, which inherited the notation from the Markov decision process community in operations research.

Continuous control u - In the controls community, u is typically a low-dimensional continuous vector (say, 1-10 dimensions).

General vectors x - In operations research, x is typically a vector of continuous or discrete (integer) variables, where it is not unusual to solve problems with tens or hundreds of thousands of variables (dimensions).

Regardless of the community, decisions may come in many forms. We will use our default notation of x, where X is the set of possible values of x.

Binary - X = {0, 1}. Binary choices arise frequently in finance (hold or sell an asset) and in internet applications, where x = 0 means run the current website while x = 1 means run the redesigned website (this is known as A/B testing).

Discrete - X = {1, 2, ..., M}.

Subset - x may be a vector (0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1) indicating, for example, the starting lineup of a basketball team.

Scalar continuous - X = (a, b) for some b > a. This is typically written X = R.

Continuous vector - x = (x_1, ..., x_n) might be an n-dimensional vector, where we may write X = R^n.

Discrete vector - x can be a vector of binary elements, or a vector of integers (0, 1, 2, ...).

Categorical - x may be a category such as a type of drug, a choice of employee described by a long list of attributes, or a choice of a movie (also described by a long list of attributes). If we let a_1, a_2, ..., a_K be the different attributes of a choice, we see that the number of possible categories can be extremely large.

The nature of the decision variable, along with the properties of the objective function, can have a major impact on the design of a solution strategy.

Types of uncertainty

When first starting out, it is natural to begin with a simple problem such as the newsvendor problem (with random demand), a stochastic shortest path problem with random link costs, or perhaps the process of selling an asset where the price process is random. Randomness can arise in the objective function (costs or prices), the constraints (such as the supply of or demand for blood), or the transition (the evolution of the price of a stock, the amount of wind energy being generated, or the uncertain evaporation of water from a reservoir). In more complex problems, uncertainty can arise in a variety of different ways. Some examples are:

Parameter and model uncertainty - We have a model describing how a system should behave, but its behavior is determined by a set of parameters θ that are not known perfectly. We may use (noisy) observations to estimate these parameters, but they are at best known up to a probability distribution. We may also have uncertainty about the fundamental structure of the model.
We may adopt a Bayesian setting and put a prior on either the structure of the model or the parameters governing the behavior of the model.

Forecasting errors - We may be trying to forecast demands, prices, wind, and customer arrivals.

Observational errors - We often cannot know the true state of the system exactly. This might arise when we cannot perfectly estimate the location and speed of a robot, or when we have errors in our ability to monitor inventory or the actual amount of energy in a car battery.

Transitional uncertainty - We may be able to deterministically calculate the change in the level of a reservoir, but rainfall and seepage introduce noise into this transition.

Control uncertainty - The decisions we make may not be implemented precisely as we wish.

Goal uncertainty - People may have difficulty articulating their objectives, which might involve balancing priorities between different objectives.

There are different ways to express uncertainty. Some of these include:

Uninformative prior - This means we have no idea what the value is (we do not even know the order of magnitude). We can express this by letting the variance of the distribution of our belief go to infinity.

Unknown distribution - There are ways of modeling stochastic processes by assuming that observations come from an exogenous, and completely unknown, distribution.

Exponential family - This includes many of our most familiar distributions, such as the normal, exponential, log-normal, gamma, chi-squared, beta, Bernoulli, Poisson, and geometric.

Uniform distributions - Including both discrete and continuous.

Heavy-tailed distributions - Such as the Cauchy distribution (which has infinite variance), as well as spikes.

Mixture distributions - For example, jump diffusion models, which use a mixture of a low-variance and a high-variance normal distribution.

Bursts - These arise in nonstationary processes where a set of events (spread of a disease, trending of new topics, a flurry of sales of a low-volume product) occur together.

Rare events - Events that may occur, but rarely. These are not well described by a probability distribution.

A more detailed presentation of the different sources of uncertainty is given in a later chapter.

Models of system dynamics

Almost all of the problems we consider in this volume can be modeled sequentially, starting with some state which is then modified by a decision and then new information, which then leads to a new state. However, there are different ways that we can compute, measure or observe these transitions. These include:

Model-based - Here we assume we can model the transition as a system of equations.
For example, an inventory problem might include the equation R_{t+1} = R_t + x_t + R̂_{t+1}, where R_t is our current inventory (say, water in a reservoir or product on a store shelf), x_t may represent purchases or (for water) releases, and R̂_{t+1} represents random, exogenous changes (rainfall or leakage, sales or returns).

Model-free - This describes situations where we can observe a state S_t, then take an action a_t, but after that all we can do is observe the next state S_{t+1}. This might arise whenever we have a complex process such as global climate change, the dynamics of a complex production plant, a complex computer simulation, or the behavior of a human being.

Both of these are very important problem classes, and there are specialized modeling and algorithmic strategies designed for each of them.

Objectives

There are many ways to evaluate the performance of a system, which may involve a single metric, or multiple goals or objectives. Different classes of metrics include:

Costs or contributions - These include financial metrics that we are minimizing (costs) or maximizing (profits, revenue, contributions).

Performance metrics - Here we would include non-financial metrics such as the strength of a material, cancer cells killed, or post-operative recovery from a surgical procedure.

Faults - We may wish to maximize a performance metric (or minimize cost), but we have to monitor service failures or flaws in a product.

Time - We may be trying to minimize the time required to complete a task.

We also need to consider how costs (or rewards/contributions) are being accumulated:

Cumulative rewards - We may accumulate rewards (or costs) as we progress, as would happen in many online applications where we have to actually experience the process (e.g. testing different prices and observing revenues, trying out different paths through a network, or making medical decisions for patients).

Final rewards - We may be able to run a series of tests (in a lab, or using a computer simulation) looking to identify the best design, and then evaluate performance based on how well the design works, without regard to poor results that arise while searching for the best design.

We next have to consider the different ways of evaluating these metrics in the presence of uncertainty. These include:

Expectations - This is the most widely used strategy, which involves averaging across the different outcomes.

Risk measures - This includes a variety of metrics that capture the variability of our performance metric.
Examples include:

Variance of the metric.

A quantile (e.g. maximizing the 10th percentile of profits).

Probability of being above or below some quantile.

Expected value given that it is above or below some threshold.

Worst case - Often associated with robust optimization, we may wish to focus on the worst possible outcome, as might arise in the design of an engineering part where we want the lowest-cost design that can handle the worst case. This is technically a form of risk measure, but robust optimization is a distinct field.
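Several of these risk measures can be computed directly from a sample of simulated profits. A quick sketch, using a synthetic normal profit sample purely for illustration:

```python
# Sketch: a few risk measures computed from a sample of simulated profits.
# The sample below is synthetic, generated only to illustrate the formulas.
import random
import statistics

random.seed(42)
profits = [random.gauss(100.0, 30.0) for _ in range(10_000)]

mean = statistics.mean(profits)               # expectation
var = statistics.pvariance(profits)           # variance of the metric
q10 = statistics.quantiles(profits, n=10)[0]  # 10th percentile of profits
p_below = sum(p < q10 for p in profits) / len(profits)  # P(profit < q10)
# Expected value given we fall below the quantile (a conditional tail mean):
tail_mean = statistics.mean([p for p in profits if p <= q10])

print(f"mean={mean:.1f}  10th pct={q10:.1f}  tail mean={tail_mean:.1f}")
```

Each quantity summarizes the same sample differently: the tail mean is more pessimistic than the quantile, which in turn sits below the expectation.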

We also need to consider the characteristics of the function being used to evaluate our system. These include:

Computational cost - We may be able to compute performance in fractions of a second (e.g. analytical functions), seconds to minutes (complex analytical functions, small simulations), hours (more complex simulations, some laboratory experiments, observing the response to prices on the internet), days (very complex simulations, more complex laboratory experiments), or weeks to months (very complex laboratory experiments, observing market response to business initiatives).

Noise level - Function evaluations may be fairly reliable (e.g. variations less than 20 percent of the average), medium to high noise, or binomial (success/failure).

Analytical behavior - Is the function differentiable? Convex (if minimizing) or concave (if maximizing)? Unimodal? Monotone? Smooth?

Staging of information and decisions

It is useful to distinguish problem classes in terms of the staging of information and decisions. Below we list major problem classes, and describe each in terms of the sequencing of decisions and information.

Offline static stochastic optimization - Decision-information

Offline learning (final reward) - Decision-information-decision-information-...-decision-information

Online learning (cumulative reward) - Decision-information-decision-information-...-decision-information

Contextual bandits (cumulative reward) - Information-decision-information-decision-information-...

Two-stage stochastic programming (vector decisions) - Decision-information-decision

Multistage stochastic programming (vector decisions) - Decision-information-decision-information-...-decision-information

Stochastic control (continuous decisions) - Decision-information-decision-information-...-decision-information

Finite horizon Markov decision processes (discrete actions, cumulative reward) - Decision-information-decision-information-...-decision-information

Infinite horizon Markov decision processes (discrete actions, cumulative reward) - Decision-information-decision-information-...

Entire fields have evolved around each of these problem classes. Classification can be tricky because it depends in part on how the problem is solved. For example, the stochastic optimization problem max_x E F(x, W) is often described as static stochastic optimization because you choose x, observe W, and then stop. Yet this problem is often solved using an iterative online learning algorithm where we successively choose x^n and then observe W^{n+1}, in the hope that x^n will converge to the optimal solution. Under the right conditions, both will yield the same optimal solution to the same objective function, and yet the styles and resulting analyses of the solution approaches are quite different.
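The iterative approach to max_x E F(x, W) can be made concrete with the newsvendor profit F(x, W) = p min(x, W) - c x: we repeatedly choose an order quantity x^n, observe a demand W^{n+1}, and take a stochastic gradient step. The sketch below is one such algorithm; the prices, the exponential demand distribution, and the step-size rule are all illustrative choices, not prescriptions from this chapter:

```python
# Sketch: solving max_x E F(x, W) for the newsvendor profit
# F(x, W) = p*min(x, W) - c*x with a stochastic gradient (online) method.
import random

random.seed(0)
p, c = 10.0, 4.0                         # sales price and purchase cost

def demand():
    return random.expovariate(1 / 50.0)  # W ~ Exponential with mean 50

x = 10.0                                 # initial order quantity x^0
for n in range(1, 20_001):
    W = demand()                         # observe W^{n+1} after choosing x^n
    grad = (p - c) if x < W else -c      # stochastic (sub)gradient of F at x
    x = max(0.0, x + (20.0 / n) * grad)  # step size alpha_n = 20/n

# The optimum satisfies P(W > x*) = c/p, so here x* = 50*ln(p/c) ≈ 45.8.
print(round(x, 1))
```

Sample average approximation would instead draw all the W's first and solve one static problem; under the right conditions both recover the same x*, illustrating why the offline/online boundary depends on the solution method as much as on the problem.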


More information

Information Choice in Macroeconomics and Finance.

Information Choice in Macroeconomics and Finance. Information Choice in Macroeconomics and Finance. Laura Veldkamp New York University, Stern School of Business, CEPR and NBER Spring 2009 1 Veldkamp What information consumes is rather obvious: It consumes

More information

16.4 Multiattribute Utility Functions

16.4 Multiattribute Utility Functions 285 Normalized utilities The scale of utilities reaches from the best possible prize u to the worst possible catastrophe u Normalized utilities use a scale with u = 0 and u = 1 Utilities of intermediate

More information

Decision-making, inference, and learning theory. ECE 830 & CS 761, Spring 2016

Decision-making, inference, and learning theory. ECE 830 & CS 761, Spring 2016 Decision-making, inference, and learning theory ECE 830 & CS 761, Spring 2016 1 / 22 What do we have here? Given measurements or observations of some physical process, we ask the simple question what do

More information

(MATH 1203, 1204, 1204R)

(MATH 1203, 1204, 1204R) College Algebra (MATH 1203, 1204, 1204R) Departmental Review Problems For all questions that ask for an approximate answer, round to two decimal places (unless otherwise specified). The most closely related

More information

TRANSPORT PHENOMENA FOR CHEMICAL REACTOR DESIGN

TRANSPORT PHENOMENA FOR CHEMICAL REACTOR DESIGN TRANSPORT PHENOMENA FOR CHEMICAL REACTOR DESIGN Laurence A. Belfiore Department of Chemical Engineering Colorado State University Fort Collins, CO A JOHN WILEY & SONS, INC., PUBLICATION TRANSPORT PHENOMENA

More information

MODELING DYNAMIC PROGRAMS

MODELING DYNAMIC PROGRAMS CHAPTER 5 MODELING DYNAMIC PROGRAMS Perhaps one of the most important skills to develop in approximate dynamic programming is the ability to write down a model of the problem. Everyone who wants to solve

More information

LEGAL DISCLAIMER. APG Coin (APG) White Paper (hereinafter 'the White Paper', 'the Document') is presented for informational purposes only

LEGAL DISCLAIMER. APG Coin (APG) White Paper (hereinafter 'the White Paper', 'the Document') is presented for informational purposes only LEGAL DISCLAIMER THIS DOCUMENT DOES NOT GIVE PERSONAL LEGAL OR FINANCIAL ADVICE. YOU ARE STRONGLY ENCOURAGED TO SEEK YOUR OWN PROFESSIONAL LEGAL AND FINANCIAL ADVICE. APG Coin (APG) White Paper (hereinafter

More information

Introduction. Chapter 1

Introduction. Chapter 1 Chapter 1 Introduction In this book we will be concerned with supervised learning, which is the problem of learning input-output mappings from empirical data (the training dataset). Depending on the characteristics

More information

Probability and Probability Distributions. Dr. Mohammed Alahmed

Probability and Probability Distributions. Dr. Mohammed Alahmed Probability and Probability Distributions 1 Probability and Probability Distributions Usually we want to do more with data than just describing them! We might want to test certain specific inferences about

More information

56:171 Operations Research Final Exam December 12, 1994

56:171 Operations Research Final Exam December 12, 1994 56:171 Operations Research Final Exam December 12, 1994 Write your name on the first page, and initial the other pages. The response "NOTA " = "None of the above" Answer both parts A & B, and five sections

More information

SpringerBriefs in Statistics

SpringerBriefs in Statistics SpringerBriefs in Statistics For further volumes: http://www.springer.com/series/8921 Jeff Grover Strategic Economic Decision-Making Using Bayesian Belief Networks to Solve Complex Problems Jeff Grover

More information

Qualifying Exam in Machine Learning

Qualifying Exam in Machine Learning Qualifying Exam in Machine Learning October 20, 2009 Instructions: Answer two out of the three questions in Part 1. In addition, answer two out of three questions in two additional parts (choose two parts

More information

Today s s Lecture. Applicability of Neural Networks. Back-propagation. Review of Neural Networks. Lecture 20: Learning -4. Markov-Decision Processes

Today s s Lecture. Applicability of Neural Networks. Back-propagation. Review of Neural Networks. Lecture 20: Learning -4. Markov-Decision Processes Today s s Lecture Lecture 20: Learning -4 Review of Neural Networks Markov-Decision Processes Victor Lesser CMPSCI 683 Fall 2004 Reinforcement learning 2 Back-propagation Applicability of Neural Networks

More information

Approximate Dynamic Programming in Transportation and Logistics: A Unified Framework

Approximate Dynamic Programming in Transportation and Logistics: A Unified Framework Approximate Dynamic Programming in Transportation and Logistics: A Unified Framework Warren B. Powell, Hugo P. Simao and Belgacem Bouzaiene-Ayari Department of Operations Research and Financial Engineering

More information

CS6375: Machine Learning Gautam Kunapuli. Decision Trees

CS6375: Machine Learning Gautam Kunapuli. Decision Trees Gautam Kunapuli Example: Restaurant Recommendation Example: Develop a model to recommend restaurants to users depending on their past dining experiences. Here, the features are cost (x ) and the user s

More information

Warwick Business School Forecasting System. Summary. Ana Galvao, Anthony Garratt and James Mitchell November, 2014

Warwick Business School Forecasting System. Summary. Ana Galvao, Anthony Garratt and James Mitchell November, 2014 Warwick Business School Forecasting System Summary Ana Galvao, Anthony Garratt and James Mitchell November, 21 The main objective of the Warwick Business School Forecasting System is to provide competitive

More information

Balancing and Control of a Freely-Swinging Pendulum Using a Model-Free Reinforcement Learning Algorithm

Balancing and Control of a Freely-Swinging Pendulum Using a Model-Free Reinforcement Learning Algorithm Balancing and Control of a Freely-Swinging Pendulum Using a Model-Free Reinforcement Learning Algorithm Michail G. Lagoudakis Department of Computer Science Duke University Durham, NC 2778 mgl@cs.duke.edu

More information

Clearing the Jungle of Stochastic Optimization

Clearing the Jungle of Stochastic Optimization Clearing the Jungle of Stochastic Optimization Warren B. Powell Department of Operations Research and Financial Engineering Princeton University Prepared for Informs TutORials, 2014. DRAFT April 7, 2014

More information

IE598 Big Data Optimization Introduction

IE598 Big Data Optimization Introduction IE598 Big Data Optimization Introduction Instructor: Niao He Jan 17, 2018 1 A little about me Assistant Professor, ISE & CSL UIUC, 2016 Ph.D. in Operations Research, M.S. in Computational Sci. & Eng. Georgia

More information

Week 1 Quantitative Analysis of Financial Markets Distributions A

Week 1 Quantitative Analysis of Financial Markets Distributions A Week 1 Quantitative Analysis of Financial Markets Distributions A Christopher Ting http://www.mysmu.edu/faculty/christophert/ Christopher Ting : christopherting@smu.edu.sg : 6828 0364 : LKCSB 5036 October

More information

Machine Learning, Midterm Exam

Machine Learning, Midterm Exam 10-601 Machine Learning, Midterm Exam Instructors: Tom Mitchell, Ziv Bar-Joseph Wednesday 12 th December, 2012 There are 9 questions, for a total of 100 points. This exam has 20 pages, make sure you have

More information

Basic Concepts of Probability. Section 3.1 Basic Concepts of Probability. Probability Experiments. Chapter 3 Probability

Basic Concepts of Probability. Section 3.1 Basic Concepts of Probability. Probability Experiments. Chapter 3 Probability Chapter 3 Probability 3.1 Basic Concepts of Probability 3.2 Conditional Probability and the Multiplication Rule 3.3 The Addition Rule 3.4 Additional Topics in Probability and Counting Section 3.1 Basic

More information

INTRODUCTION TO PATTERN RECOGNITION

INTRODUCTION TO PATTERN RECOGNITION INTRODUCTION TO PATTERN RECOGNITION INSTRUCTOR: WEI DING 1 Pattern Recognition Automatic discovery of regularities in data through the use of computer algorithms With the use of these regularities to take

More information

STRESS IN ASME PRESSURE VESSELS, BOILERS, AND NUCLEAR COMPONENTS

STRESS IN ASME PRESSURE VESSELS, BOILERS, AND NUCLEAR COMPONENTS STRESS IN ASME PRESSURE VESSELS, BOILERS, AND NUCLEAR COMPONENTS Wiley-ASME Press Series List Stress in ASME Pressure Vessels, Boilers, and Nuclear Jawad October 2017 Components Robust Adaptive Control

More information

Stat 5101 Lecture Notes

Stat 5101 Lecture Notes Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random

More information

Advanced analysis and modelling tools for spatial environmental data. Case study: indoor radon data in Switzerland

Advanced analysis and modelling tools for spatial environmental data. Case study: indoor radon data in Switzerland EnviroInfo 2004 (Geneva) Sh@ring EnviroInfo 2004 Advanced analysis and modelling tools for spatial environmental data. Case study: indoor radon data in Switzerland Mikhail Kanevski 1, Michel Maignan 1

More information

RISK AND RELIABILITY IN OPTIMIZATION UNDER UNCERTAINTY

RISK AND RELIABILITY IN OPTIMIZATION UNDER UNCERTAINTY RISK AND RELIABILITY IN OPTIMIZATION UNDER UNCERTAINTY Terry Rockafellar University of Washington, Seattle AMSI Optimise Melbourne, Australia 18 Jun 2018 Decisions in the Face of Uncertain Outcomes = especially

More information

Antti Salonen PPU Le 2: Forecasting 1

Antti Salonen PPU Le 2: Forecasting 1 - 2017 1 Forecasting Forecasts are critical inputs to business plans, annual plans, and budgets Finance, human resources, marketing, operations, and supply chain managers need forecasts to plan: output

More information

Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institution of Technology, Kharagpur

Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institution of Technology, Kharagpur Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institution of Technology, Kharagpur Lecture No. # 36 Sampling Distribution and Parameter Estimation

More information

1 [15 points] Frequent Itemsets Generation With Map-Reduce

1 [15 points] Frequent Itemsets Generation With Map-Reduce Data Mining Learning from Large Data Sets Final Exam Date: 15 August 2013 Time limit: 120 minutes Number of pages: 11 Maximum score: 100 points You can use the back of the pages if you run out of space.

More information

MODULE -4 BAYEIAN LEARNING

MODULE -4 BAYEIAN LEARNING MODULE -4 BAYEIAN LEARNING CONTENT Introduction Bayes theorem Bayes theorem and concept learning Maximum likelihood and Least Squared Error Hypothesis Maximum likelihood Hypotheses for predicting probabilities

More information

PPU411 Antti Salonen. Forecasting. Forecasting PPU Forecasts are critical inputs to business plans, annual plans, and budgets

PPU411 Antti Salonen. Forecasting. Forecasting PPU Forecasts are critical inputs to business plans, annual plans, and budgets - 2017 1 Forecasting Forecasts are critical inputs to business plans, annual plans, and budgets Finance, human resources, marketing, operations, and supply chain managers need forecasts to plan: output

More information

Irr. Statistical Methods in Experimental Physics. 2nd Edition. Frederick James. World Scientific. CERN, Switzerland

Irr. Statistical Methods in Experimental Physics. 2nd Edition. Frederick James. World Scientific. CERN, Switzerland Frederick James CERN, Switzerland Statistical Methods in Experimental Physics 2nd Edition r i Irr 1- r ri Ibn World Scientific NEW JERSEY LONDON SINGAPORE BEIJING SHANGHAI HONG KONG TAIPEI CHENNAI CONTENTS

More information

Chapter 6 Continuous Probability Distributions

Chapter 6 Continuous Probability Distributions Math 3 Chapter 6 Continuous Probability Distributions The observations generated by different statistical experiments have the same general type of behavior. The followings are the probability distributions

More information

Antti Salonen KPP Le 3: Forecasting KPP227

Antti Salonen KPP Le 3: Forecasting KPP227 - 2015 1 Forecasting Forecasts are critical inputs to business plans, annual plans, and budgets Finance, human resources, marketing, operations, and supply chain managers need forecasts to plan: output

More information

Kernel-based Approximation. Methods using MATLAB. Gregory Fasshauer. Interdisciplinary Mathematical Sciences. Michael McCourt.

Kernel-based Approximation. Methods using MATLAB. Gregory Fasshauer. Interdisciplinary Mathematical Sciences. Michael McCourt. SINGAPORE SHANGHAI Vol TAIPEI - Interdisciplinary Mathematical Sciences 19 Kernel-based Approximation Methods using MATLAB Gregory Fasshauer Illinois Institute of Technology, USA Michael McCourt University

More information

Lecture 3: Policy Evaluation Without Knowing How the World Works / Model Free Policy Evaluation

Lecture 3: Policy Evaluation Without Knowing How the World Works / Model Free Policy Evaluation Lecture 3: Policy Evaluation Without Knowing How the World Works / Model Free Policy Evaluation CS234: RL Emma Brunskill Winter 2018 Material builds on structure from David SIlver s Lecture 4: Model-Free

More information

Bayesian Congestion Control over a Markovian Network Bandwidth Process: A multiperiod Newsvendor Problem

Bayesian Congestion Control over a Markovian Network Bandwidth Process: A multiperiod Newsvendor Problem Bayesian Congestion Control over a Markovian Network Bandwidth Process: A multiperiod Newsvendor Problem Parisa Mansourifard 1/37 Bayesian Congestion Control over a Markovian Network Bandwidth Process:

More information

Demand Forecasting. for. Microsoft Dynamics 365 for Operations. User Guide. Release 7.1. April 2018

Demand Forecasting. for. Microsoft Dynamics 365 for Operations. User Guide. Release 7.1. April 2018 Demand Forecasting for Microsoft Dynamics 365 for Operations User Guide Release 7.1 April 2018 2018 Farsight Solutions Limited All Rights Reserved. Portions copyright Business Forecast Systems, Inc. This

More information

Probability and Information Theory. Sargur N. Srihari

Probability and Information Theory. Sargur N. Srihari Probability and Information Theory Sargur N. srihari@cedar.buffalo.edu 1 Topics in Probability and Information Theory Overview 1. Why Probability? 2. Random Variables 3. Probability Distributions 4. Marginal

More information

CS599 Lecture 1 Introduction To RL

CS599 Lecture 1 Introduction To RL CS599 Lecture 1 Introduction To RL Reinforcement Learning Introduction Learning from rewards Policies Value Functions Rewards Models of the Environment Exploitation vs. Exploration Dynamic Programming

More information

Machine Learning Overview

Machine Learning Overview Machine Learning Overview Sargur N. Srihari University at Buffalo, State University of New York USA 1 Outline 1. What is Machine Learning (ML)? 2. Types of Information Processing Problems Solved 1. Regression

More information

Chapter 7 Forecasting Demand

Chapter 7 Forecasting Demand Chapter 7 Forecasting Demand Aims of the Chapter After reading this chapter you should be able to do the following: discuss the role of forecasting in inventory management; review different approaches

More information

A General Overview of Parametric Estimation and Inference Techniques.

A General Overview of Parametric Estimation and Inference Techniques. A General Overview of Parametric Estimation and Inference Techniques. Moulinath Banerjee University of Michigan September 11, 2012 The object of statistical inference is to glean information about an underlying

More information