AMPLDev User Guide. OptiRisk Systems



Version:
Prepared by Neha Murarka, Victor Zverovich, Christian Valente, Cristiano Arbex Valle and Gautam Mitra
OptiRisk Systems

Copyright © 2014 OptiRisk Systems
DO NOT DUPLICATE WITHOUT PERMISSION

All brand names and product names are trademarks or registered trademarks of their respective holders. The material presented in this manual is subject to change without prior notice and is intended for general information only. The views of the authors expressed in this document do not represent the views and/or opinions of OptiRisk Systems.

Disclaimer: This manual is a work in progress and is updated regularly.

OptiRisk Systems
One Oxford Road
Uxbridge, Middlesex, UB9 4DA
United Kingdom
(0)

Contents

I Overview of AMPLDev

1 Scope and Purpose
  What is AMPL?
  Who can use AMPLDev?
  Why use AMPLDev?
2 Installing AMPLDev
  AMPLDev stand-alone application
  AMPLDev plug-in for Eclipse users (Not supported yet)
  Java Runtime Environment installation
  Getting Eclipse
  Installing AMPLDev plug-in
3 AMPLDev User Interface
  Workbench
  Perspectives
  Views
  Project Explorer
  Console
  Outline
  Solution
  Editors
  Content Assist
  Preferences
  Pre launch script
  Search paths

II Modelling with AMPLDev

4 Projects within AMPLDev
  Concept of a project
  Creating a new project
  Using the File menu in a non-AMPL perspective
  Using the File menu in the AMPL perspective
  Using the Project Explorer context menu in the AMPL perspective
  Creating and adding files to an AMPL project
5 Running Models and Analysing Results
  Single file launch
  Context menu launch
  Launch toolbar
  Multiple file launch
  The File Selection tab
  Default launch
  Viewing output and errors
  Viewing the solution

III AMPL: A Mathematical Programming Language

6 Learning AMPL by Example
  A Linear Programming Model
  Translating the model into AMPL
  Sets
  Parameters
  Variables
  Objective
  Constraints
7 Databases and Spreadsheets
  Example: A Diet Problem
  Preparing the Data
  Spreadsheets
  Databases
  Reading Data from Tables
  Read Parameters only
  Read Sets and Parameters
  Establishing Correspondences
  Other values
  Writing Data into Tables
  Rows inferred from the data specifications (data-specs)
  Rows inferred from a key specification (key-spec)
  Reading and Writing into the same Table
  Using two table declarations
  Using the same table declaration
  Indexed collections of Tables and Columns
  Indexed collections of Tables
  Indexed collections of Data Columns

IV SAMPL: Extensions to AMPL for Stochastic Programming and Robust Optimisation

8 Introduction to modelling under uncertainty
  Background
  Limitations of deterministic models
  Paradigms for modelling under uncertainty
  Stochastic Programming
  Robust Optimisation
  Modelling random parameters using discrete scenarios
  Modelling uncertainties (robust)
9 Modelling with SAMPL and Tutorial
  How to represent models under uncertainty
  Stochastic programming model entities
  Robust Optimisation model entities
  Asset & Liability Management (ALM) Model (Tutorial 1)
  Introduction
  Algebraic Formulation
  ALM Dataset Representation
  Mapping algebraic entities to AMPL
  Expected Value Problem in AMPL
  Stochastic ALM in AMPL
  Stochastic ALM Formulated in SAMPL
  CC and ICC Formulations in AMPL
  CC and ICC Formulations in SAMPL
  Robust Formulations in AMPL
  Robust Formulations in SAMPL
SAMPL: Language Reference
  Introduction
  Stages
  Scenario
  Probabilities
  Random Data
  Chance Constraints
  Integrated Chance Constraints
  Scenario Tree
The Dakota Problem (Tutorial 2)
  Introduction
  Deterministic Algebraic Formulation
  Data Representation
  Expected Value Problem in AMPL
  A Stochastic Formulation
  A Stochastic Formulation in AMPL
  A Stochastic Formulation in SAMPL
  A Chance Constraint Formulation in AMPL
  A Chance Constraint Formulation in SAMPL
  An Integrated Chance Constraint Formulation in AMPL
  An Integrated Chance Constraint Formulation in SAMPL
News Vendor Problem (Tutorial 3)
  News Vendor Problem: Single Resource
  Introduction
  Model Summary
  Expected Value Problem in AMPL
  A Stochastic Formulation
  A Stochastic Formulation in AMPL
  A Stochastic Formulation in SAMPL
  News Vendor Problem: Multiple Resources
  Model Summary
  Data Representation
  Expected Value Problem in AMPL
  A Two-Stage Stochastic Formulation
  12.7 A Two-Stage Stochastic Formulation in AMPL
  A Two-Stage Stochastic Formulation in SAMPL
  A Chance Constraint Formulation in AMPL
  A Chance Constraint Formulation in SAMPL
  An Integrated Chance Constraint Formulation in AMPL
  An Integrated Chance Constraint Formulation in SAMPL
Supply Chain (Tutorial 5)
  Introduction
  Algebraic Formulation
  Model Summary
  Data Representation
  Expected Value Formulation in AMPL
  A Stochastic Formulation in AMPL
  A Stochastic Formulation in AMPL
  A Stochastic Formulation in SAMPL
  A Chance Constraint Formulation in AMPL
  A Chance Constraint Formulation in SAMPL
  An Integrated Chance Constraint Formulation in AMPL
  An Integrated Chance Constraint Formulation in SAMPL
Scenario Generation
  Issues and desirable properties in scenario generation
  Overview of SG methods
Decision Evaluation

References
Index
Appendices
A SAMPL known weaknesses

Part I Overview of AMPLDev

Chapter 1 Scope and Purpose

1.1 What is AMPL?

AMPL [48] is an algebraic modelling language for formulating and solving optimization problems. AMPL supports linear and non-linear optimization models with both discrete and continuous variables. SAMPL, the stochastic extension of AMPL, adds comprehensive support for stochastic programming. Within AMPL you can easily create models and connect them to various data sources, such as spreadsheets, databases or plain text files. From the model and data, the AMPL translator generates an internal representation, passes it to one of the many solvers supported by AMPL and returns the results to you. The language is simple and generally follows the mathematical notation used by both mathematicians and practitioners in the field of operations research.

AMPLDev is a graphical interface for AMPL. It is based on the popular Eclipse development platform and is available as a stand-alone application and as a plug-in for Eclipse. For a general introduction to AMPL see [48]; a description of SAMPL may be found in (reference).

AMPLDev as a stand-alone application

This version is a complete bundle of the components required to run AMPLDev, without the complication of installing any prerequisites. You can think of it as a core Eclipse IDE with the AMPLDev plug-in preinstalled. The only drawback is that the stand-alone version of AMPLDev may be limited in how far it can be extended with other Eclipse plug-ins. If you do not require Eclipse for any other reason, we recommend installing this version of AMPLDev.

1.2 Who can use AMPLDev?

AMPLDev is intended for anyone who is considering an algebraic modelling language and would like a quick and easy way to start learning AMPL, as well as for those who already use AMPL and would like a modern IDE for the language. Common application areas for such a tool are distribution, production, scheduling and other areas that give rise to large-scale optimization problems. To use AMPLDev, you are expected to have a basic knowledge of programming and some optimisation modelling experience. Experience with Eclipse or any other development environment, although not required, will considerably speed up the learning process.

1.3 Why use AMPLDev?

AMPLDev gives AMPL users the benefits of a modern Integrated Development Environment (IDE), which has the following features.

- A smart editor with context-sensitive syntax highlighting.
- Efficient error reporting with the ability to instantly go to the error location.
- A solution view which organises and separates the results from solving the model.
- A project explorer that allows you to organise all your projects and corresponding folders, with useful context menus that directly allow you to run AMPL files.
- Built-in interactive AMPL console.
- Outline view that shows the model components: parameters, sets, variables, objectives and constraints.

For users who are new to AMPL, such as students who are learning modelling with AMPL for the first time, the carefully designed Graphical User Interface (GUI) makes it easy to get started. At the same time AMPLDev is beneficial for advanced users who need features such as the built-in interactive console, integration with the version control systems provided by Eclipse, and projects supporting other programming languages in addition to AMPL.

For users who wish to install the plug-in version of AMPLDev, Eclipse is free to use and has minimal installation requirements. Integrating AMPL with Eclipse is a considerable advantage if you are already an Eclipse user, since a single piece of software can then take care of all your development needs. In addition, you can extend the AMPLDev plug-in if you wish to customize it further for your needs.

Chapter 2 Installing AMPLDev

This chapter describes how to set up your computer for using AMPLDev. The first section covers installing the AMPLDev stand-alone application, which is the essential step in getting started with AMPLDev. The second section describes how to install the AMPLDev plug-in in Eclipse and is intended primarily for advanced users.

2.1 AMPLDev stand-alone application

This is the fastest way to start using AMPLDev. Download the archive from the website onto your local machine, unless you have already received the archive by other means. Extract all the files from the archive; in the root folder you will find the AMPLDev executable. Double-click it and you are good to go! The stand-alone application does not require a separate Java Runtime Environment (JRE) or Eclipse installation, as both are bundled with the executable.

2.2 AMPLDev plug-in for Eclipse users (Not supported yet)

If you already have Eclipse set up on your machine (version 4.1 or later), you may skip the next two subsections. Otherwise, first install the JRE on your machine; the subsection after that describes how to get and install Eclipse. The last subsection explains how to install the AMPLDev plug-in into your Eclipse Platform so that you can program in AMPL using Eclipse.

Java Runtime Environment installation

The JRE needs to be installed to run Eclipse. You can download the latest version of the JRE from the Java website. We recommend JRE version 7 or higher for the Eclipse version described in this manual (Eclipse Juno 4.2), on which our system is based. If you already have a JDK installed, make sure it is version 1.7 or later.

TIP: Many computers already have the JRE installed. If you are not sure, the Java website has a tool that checks your computer and asks you to download the latest version if you do not have any or if you need an upgrade. You can find it by searching for "Do I have Java?".

Getting Eclipse

Once you have the JRE set up, your computer is ready for Eclipse. The version used throughout this guide is Eclipse Juno 4.2, which can be freely downloaded from the Eclipse website (eclipse.org/downloads/). For users who wish to use an earlier version, we recommend using only Eclipse 4.1 or later. The download page offers many different Eclipse Juno packages; we recommend that you download the Classic version. Most of the others focus on a specific development area such as Java or PHP, whereas the Classic version is an all-round Integrated Development Environment (IDE) with all the tools needed for our purpose. Eclipse is available for the major operating systems (Windows, Linux and Mac OS) and supports 32-bit and 64-bit platforms.

TIP: Some 64-bit machines may have a 32-bit JRE installed; that is fine as long as Eclipse is also 32-bit. The Eclipse build you download must therefore match the installed JRE, not the machine. For example, a 64-bit Eclipse will not work on a 64-bit machine with a 32-bit JRE; either get a 32-bit Eclipse or a 64-bit JRE.

For better download speeds you can choose from numerous mirror sites across Europe, Asia, North America, South America and Australia, depending on your location. Once the file is downloaded, unzip it at your desired location. There is no setup file that needs to be run for the installation. Then run the eclipse executable (eclipse.exe for Windows users), which is located inside the extracted eclipse folder. But don't forget to install the AMPLDev plug-in if you would like to use AMPL!

Installing AMPLDev plug-in

There are two ways to install the AMPLDev plug-in; both are described below.

Installing AMPLDev plug-in via update site - recommended

1. Once you have downloaded the AMPL plug-in archive to your computer, start Eclipse and click on Install New Software... in the Help menu.
2. Click on the Add... button.
3. If you have extracted the plug-in file, click on Local... and navigate to the AMPL plug-in folder; if you have not extracted the AMPL plug-in, do the same with Archive...
4. Click on OK and the Install New Software... window will list the available AMPL plug-in.
5. Select the plug-in and click Next.
6. Follow the default settings and the AMPL plug-in will be installed.
7. Restart Eclipse and you should have AMPL support!

TIP: If you cannot see the AMPL plug-in listed as an option to install, uncheck Group items by category.

Installing AMPLDev plug-in manually

1. Extract all the files from the AMPL plug-in archive to any location. The location does not matter, as you will soon be copying the required folders into Eclipse.
2. Once the archive has been completely extracted, you will find a plugins folder inside it. Copy the contents.
3. Navigate to the folder where you have installed Eclipse. Inside the eclipse folder, you will see a folder named dropins. Paste the previously copied plugins folder into the dropins folder.
4. Repeat the previous two steps for the features folder in the extracted plug-in archive by copying the contents into Eclipse's dropins features folder.
5. Start Eclipse, or restart it if it was already running, and you should have AMPL support!

Chapter 3 AMPLDev User Interface

This chapter describes the Graphical User Interface of AMPLDev: the basic interface, its various parts and some important menu items. You will notice that Eclipse provides many features which are aimed at more advanced users and are not essential for using AMPL. For Eclipse features, we recommend consulting the Eclipse Help sections. This document pertains to AMPLDev and describes the functionality, views and graphical tools relevant to AMPL.

3.1 Workbench

Once the computer has been set up for AMPLDev (see chapter 2), run the ampldev executable (ampldev.exe for Windows users) to start an AMPLDev session. When AMPLDev starts, it will ask for a workspace (see figure 3.1); this is essentially a directory on the computer in which it will save the projects you create.

Figure 3.1: Workspace selection

TIP: You can change the workspace at any time by choosing Switch workspace... from the File menu.

The workbench is the interface that you see when you start AMPLDev (figure 3.2). This is where you can manage the whole project life-cycle, from creation to the final product. A workbench window has menus and toolbars along with perspectives, where a perspective is a customized collection of views and editors chosen to suit the purpose of the perspective.

AMPLDev allows users to open multiple sessions of the workbench. While a workbench window is open, running the eclipse executable file again opens another window that can be used with a different workspace.

Figure 3.2: AMPLDev workbench

3.2 Perspectives

Although perspectives are an advanced topic, aimed particularly at people interested in using AMPLDev along with Eclipse, we mention them briefly here because the stand-alone AMPLDev does allow you to change perspectives. A perspective is a customized layout of views, arranged in a way that is efficient and useful for the purpose for which it was created. These views are arranged around an editor, which is used to edit your files. You can move the views around to suit your preferences, and the arrangement will be saved for the next time you start a session.

The AMPLDev stand-alone application generally starts in the AMPL perspective, but with some system settings it might not. You can switch to the AMPL perspective in one of the following ways:

- In the Window menu, select Open Perspective..., choose Other... and then click on AMPL.
- Create an AMPL project; if you are not in the AMPL perspective, AMPLDev will ask you to change to it when the project is created.

Although the AMPL perspective is not required for using AMPL, its layout is the one best suited to AMPL modelling. Many AMPL-specific shortcuts are also readily available in this perspective. They are discussed in later chapters; one example is the toolbar, which has buttons to create AMPL projects and files without going through the File menu.

3.3 Views

Views provide convenient ways to represent many kinds of information. For example, the Project Explorer shows an easy-to-understand tree structure of all the parts of a project (files, folders and so on), collectively known as resources. Views are also handy for navigating through large systems of information. A perspective contains a set of views that open automatically when the perspective is selected. You can also open views that are not part of the current perspective via the Window menu item, by selecting Show View... As described previously, these additional views opened in a perspective are saved under the perspective.

A view can be moved at any time by clicking and holding the left mouse button on its title bar; it can either be left detached or docked around the Workbench window. A view can also be maximised by double-clicking on its title bar; double-clicking a second time restores its original size and position. To manage screen space better, views can be stacked on top of each other and selected by clicking on the tabs above the stacked views. The active view has a highlighted tab. Most views also have menus of their own, which can be accessed by clicking the drop-down arrow in the top right corner of the view. Some views also have their own toolbars. The following sections describe the views that form the AMPL perspective.

Project Explorer

As explained in section 3.3, the Project Explorer view (figure 3.3) displays the projects in the workspace in a tree structure. All the functions are the same as described in the Eclipse manual for the Project Explorer view, with the addition of some AMPL-specific functions which are active in the AMPL perspective.

Figure 3.3: The Project Explorer view in the AMPL perspective

The context menu (the pop-up menu shown on right-clicking, figure 3.4) allows you to add AMPL projects and files to the workspace without going through the File menu wizards. The Run As... and Debug As... items also have an AMPL quick launch option if the selection is an AMPL file. Clicking this quick launch runs only the selected file in AMPL. In order to run multiple files, you must create a launch configuration (see section 5.2).

Console

This view displays all the output and runtime errors from AMPL and SAMPL and allows the user to enter commands and input data. For more details, see section 5.4.

Outline

This view (figure 3.5) shows the list of model components, such as parameters, sets, variables, objectives and constraints, for the AMPL file currently open in the editor.

Solution

This view displays the solution report for the last run. For more details, see section 5.5.


Figure 3.4: Context menu in Project Explorer

Figure 3.5: Outline view

3.4 Editors

The model and data source files are opened, modified and saved using the editor. Different types of files have their own editors associated with them. When you double-click on any file in the views, it opens automatically in the editor area with its respective editor. If multiple editors are open at one time, they are stacked in the editor area; they can be separated in the same way that views are moved around, except that they cannot be detached completely. When a file's name tab shows an asterisk (*), the file has unsaved changes.

Content Assist

The AMPL and SAMPL editors support content assist, which greatly improves productivity when coding. Based on the context of the editing session, content assist provides the developer with a list of accessible keywords according to the AMPL specification, together with the currently defined entities. It is sufficient to begin writing a word in the editor and press CTRL+SPACE to bring up the content assist window, which suggests possible completions for the prefix that has been typed.

Figure 3.6: Content assist

3.5 Preferences

AMPLDev has many options which can be configured by opening the Window menu and clicking the Preferences item (see figure 3.7). These options affect, among other things, appearance (e.g. colors and fonts) and the default behaviour of the workspace management system.

Pre launch script

Some AMPL- and SAMPL-specific configuration options can be set from the same menu under the AMPL and SAMPL labels. Clicking on the AMPL or SAMPL heading brings up the pre-launch window for the respective environment. The pre-launch script is an AMPL or SAMPL script which is executed every time a launch configuration is launched. It can be used to specify options which need to be global and common to all scripts executed on a particular machine (e.g. which solver to use, or some commands to send e-mails after finishing the solution process).

Search paths

Under AMPL, clicking on Search path lets the user select the search path for the current AMPLDev installation (see figure 3.8). Any directory specified here will be searched for executables when running any launch configuration. It can be used to specify the location of solvers or other utility programs which are used only in an AMPLDev session, without modifying the global system path.
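As an illustration of the pre-launch script described above, a minimal sketch might look as follows. The solver name cplexamp is borrowed from the example in section 5.1 and is an assumption about what is installed on your machine; display_precision is a standard AMPL option chosen here only for illustration.

```ampl
# Hypothetical pre-launch script, executed before every launch.
# Assumes the cplexamp solver can be found on the search path.
option solver cplexamp;

# Illustrative global setting: significant digits shown by display.
option display_precision 6;
```

Because the script applies to every launch on the machine, settings that belong to a single model are probably better kept in that model's own script.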

Figure 3.7: Preferences

Figure 3.8: Search Path

TIP: Eclipse has extensive documentation on the Eclipse Workbench, which can be found online on their website at http://help.eclipse.org/juno/index. (Eclipse Juno 4.2) and can also be accessed via the software's Help menu. This is the version AMPLDev is built on.

Part II Modelling with AMPLDev

Chapter 4 Projects within AMPLDev

The following sections describe the different ways you can create a project, add files to it and use some of the available templates. If you are starting Eclipse for the first time, follow the steps described in section 3.1 to start the Eclipse workbench.

4.1 Concept of a project

A project is a collection of source code and any supporting files, such as documentation and data, that may be arranged using folders in a way similar to organizing files in the filesystem. Projects are contained within workspaces, described in section 3.1. Some projects are associated with one primary language and occasionally some supporting languages; it is not possible to add files in an unsupported language to such a project. This does not apply to the workspace, which can hold a mix of various types of projects. You can organise files and folders under the project root in any manner you please.

4.2 Creating a new project

There are numerous ways to create a project. In general, you do not need to be in the AMPL perspective to create an AMPL project.

Using the File menu in a non-AMPL perspective

This is the most universal method of creating a new project.

1. Go to the File menu.
2. Select New.
3. Select Project...
4. A New Project dialog will open and ask you to select a project wizard. Choose AMPL Project under the AMPL category (see figure 4.1) and then click Next.
5. Give the project a name. You can either choose the default workspace location or browse for another location. The Create model, data and script directories option, as its name suggests, creates the three respective folders in your project if you check it (figure 4.2). You can always create the folders later (using the context menu of the project). These folders are independent of the files you put in them and serve only for your personal organization; they are not restricted by file type (see figure 3.3). For example, you can put a .mod file in the data folder.
6. After you click Finish, AMPLDev will ask whether you want to change to the AMPL perspective, if you are not already in it.
7. The project has been created and you will be able to see it in Project Explorer.

Figure 4.1: Eclipse New Project dialog

Using the File menu in the AMPL perspective

Creating a project in the AMPL perspective is fairly straightforward.

1. Go to the File menu.
2. Select New.
3. Choose AMPL Project and follow the steps described in the previous section to create the project.

TIP: The New button in the toolbar is present in all perspectives by default. Clicking this button opens a dialog to create a new file or a project.

Figure 4.2: AMPL Project wizard

Using the Project Explorer context menu in the AMPL perspective

The context menu is the menu that pops up on a right click (or command + click on a Mac). In the AMPL perspective, right-clicking in the Project Explorer view shows shortcuts to create a project.

4.3 Creating and adding files to an AMPL project

Before describing the steps to create and add files, here is a brief description of the types of files supported by the AMPL plug-in for Eclipse.

Supported AMPL file extensions

The three traditional AMPL file extensions are .mod for model files, .dat for data files and .run for script files. They are all supported by AMPLDev. In addition, we have introduced a new file extension for AMPL files: you can now create an .ampl file, which can also be used for AMPL code. Similarly, we have .sampl for SAMPL code. The .ampl files encompass all aspects of the .mod, .dat and .run file types; for example, you can have TestModel.ampl, TestData.ampl and TestScript.ampl, which can contain the same code as TestModel.mod, TestData.dat and TestScript.run. This makes no difference when executing the files, because the AMPL translator treats all files in the same way regardless of their extension. We recommend using the .ampl extension for new files because the .mod and .dat extensions are often associated with other programs. For example, the .mod extension is often recognized as an extension for audio files.

Create AMPL file

Most of the steps are the same as for creating a project, except that this time we choose to create a new file.

1. Go to the File menu.
2. Select New.
3. If you do not see the options for all the different AMPL file types, select Other... A wizard dialog will open, where you can choose the file you would like to add under the AMPL category and click Next.
4. The first field requires you to choose the project, or a folder in the project, in which the file must be placed. Then choose a filename. As you can see from figure 4.3, the title of the wizard is AMPL File, so you do not need to enter the extension; this wizard will automatically create a .ampl file.
5. Once you have filled in the required information, AMPLDev will create a new file in the chosen project or project folder and open it in the editor.

Using other means

Creating and adding a file via the Project Explorer context menu in the AMPL perspective works exactly like the project creation described in section 4.2. The New File shortcuts are in the same locations, and you can follow the same steps for the file wizard as described in section 4.3.
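To illustrate the point made above that the translator treats all files alike regardless of extension, the following sketch shows what the script file TestScript.ampl might contain, using the standard AMPL model and data commands to load the other two files. The contents are hypothetical, and the file names are assumed to resolve relative to the directory in which AMPL runs.

```ampl
# TestScript.ampl (hypothetical contents)
model TestModel.ampl;   # read declarations, although the extension is not .mod
data TestData.ampl;     # read data statements, although the extension is not .dat
solve;
```

The same three statements would work unchanged with TestModel.mod and TestData.dat, since only the file names differ.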

Figure 4.3: New AMPL File wizard

Chapter 5 Running Models and Analysing Results

Running a model written in AMPL requires the AMPL compiler/interpreter, which translates the model, combines the model and data into a machine-readable, solver-specific form and passes it to one of the many solvers supported by AMPL. It then retrieves the results and executes any script commands. In Eclipse, these runs are referred to as launches. It is possible to run a single file, or multiple files in the order you want. There is an additional default launch available for projects that contain only one model and one data file.

Note: At this stage, debug functionality is not implemented, so the Run and Debug buttons have the same effect for AMPL projects.

5.1 Single file launch

The previous chapter covered how to create a project and add files to it. When you do a single file run, there is no relationship between consecutive launches. For example, you cannot run a model file first and then a corresponding data file; each launch is independent, and Eclipse will not be able to associate the model with the data. (To do that, refer to section 5.2.) The single file launch allows you to quickly run a file without creating a custom launch configuration. A simple AMPL model that will work in a single file launch is given below:

    set PROD;                    # products
    param rate {PROD} > 0;       # tons produced per hour
    param avail >= 0;            # hours available in week
    param profit {PROD};         # profit per ton
    param market {PROD} >= 0;    # limit on tons sold in week

    var Make {p in PROD} >= 0, <= market[p];   # tons produced

    # Objective: total profits from all products
    maximize Total_Profit: sum {p in PROD} profit[p] * Make[p];

    # Constraint: total of hours used by all
    # products may not exceed hours available
    subject to Time: sum {p in PROD} (1/rate[p]) * Make[p] <= avail;

    data;

    set PROD := bands coils;

    param:    rate  profit  market :=
      bands    200     25    6000
      coils    140     30    4000 ;

    param avail := 40;

    option solver cplexamp;
    solve;

You may want to change the solver to one installed on your machine by changing the statement option solver cplexamp; accordingly. There are two ways to run this code: the context menu or the launch toolbar. Both can be used in any perspective.

Context menu launch

This method is fairly straightforward.

1. In the Project Explorer, select the file you wish to execute. For this example, we have placed the code in the file steel.ampl.
2. Right click (or command + click on a Mac) to open the context menu.
3. Select Run As or Debug As and you will see another menu pop up (figure 5.1).
4. This pop-up has an option labelled 1 AMPL. When you select it, AMPLDev automatically creates a launch configuration for this file and executes the file. The launch configuration has the same name as the file.

Figure 5.1: Context menu launch

Note: In the Project Explorer, if you right click on a project folder, Run As or Debug As will show you an AMPL option, but this is only valid for a default launch; refer to section 5.3.

Launch toolbar

Once you have created a launch configuration as in the steps above, you can see these launches in the drop-down menus accessible from the Run and Debug toolbar buttons (figure 5.2).

Figure 5.2: Launch toolbar

When you click the down arrow next to these buttons, you will see the previously launched configuration for steel.ampl. You will also see the other method of launching the file, which is the Run As or Debug As option (see figure 5.3).

Figure 5.3: Launch toolbar Run menu: 1 steel.ampl is the previously created launch configuration

This shortcut creates a launch configuration for the current selection, which means that if a file is selected in the editor or in the Project Explorer, that file will be launched. In the case of steel.ampl, since a launch configuration already exists, no new one is created. You can also choose 1 steel.ampl if you would like to launch that file again.

As mentioned before, launches are independent of the file extension; whether you run an .ampl file, a .mod file or a .run file, AMPLDev treats them the same way. The only exception is files with the .dat extension, which are run in data mode (as if the files contained the data statement at the beginning).

5.2 Multiple file launch

Single file execution is not always desirable; you may want to separate your model from the data, or to have multiple models for the same data, and putting all the code in one file is then not ideal. For this reason, there is a multiple file launch.
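As a sketch of the separation described above, the data part of the steel.ampl example from section 5.1 could be moved into a file of its own. The file name steel.dat is hypothetical, and the numeric values are those of the well-known steel example from the AMPL book, assumed here for illustration. Because files with the .dat extension are run in data mode, the file does not need to start with a data; statement.

```ampl
# steel.dat (hypothetical file name): read in data mode, so no leading "data;"
set PROD := bands coils;

param:    rate  profit  market :=
  bands    200     25    6000
  coils    140     30    4000 ;

param avail := 40;
```

If the same statements were saved with the .ampl extension instead, the file would need data; as its first statement.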

In order to run multiple files, you will need to use the launch configuration dialog. It can be reached from either the context menu or the launch toolbar shortcuts described for the single file launch (section 5.1). Looking at figures 5.1 and 5.3, you will see the option Run Configurations... (or Debug Configurations...). Selecting it opens the launch configuration dialog (figure 5.4).

Figure 5.4: Launch configuration dialog

Double click AMPL/SAMPL in the tree on the left, or right click it and select New. This creates a new configuration and opens a clean File Selection tab on the right. You can give the configuration a name to help differentiate between various configurations.

TIP: You will notice that the configuration created earlier, steel.ampl, is also present on the left hand side. By selecting it, you can add more files to it later if you wish to split that code into multiple files. In this way a single file launch can later be adapted for multiple files.

The File Selection tab

In this tab, you add the files you wish to run together. Start by clicking the Add button. This opens a window listing all the files in their respective projects and folders (figure 5.5). If you click a folder, all the files in it are selected. The selected files are added to the table automatically; you can then set the order in which they are executed by selecting a file in the table and clicking the Up and Down buttons to move it (figure 5.4). Clicking Apply just saves the changes, although Eclipse will always ask you to save any unsaved changes before you launch.

You will also notice two checkboxes under the table. The first one (Solve after executing the files listed above) is the equivalent of writing solve; at the end of your AMPL code.
If it is checked, AMPL solves automatically after running the files selected in the table. This also populates the Solution view (see section 5.5). The second one (Use the SAMPL translator) specifies whether to use the SAMPL translator instead of the AMPL one.
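As a rough guide to what a multiple file launch does, the sketch below shows AMPL commands with a similar effect to a configuration that lists a model file followed by a data file and has Solve after executing the files listed above checked. The file names steel.mod and steel.dat are hypothetical, and how AMPLDev actually passes the files to AMPL is an assumption; only the ordering of the files and the final solve; come from the description above.

```ampl
# Sketch only: steel.mod and steel.dat are hypothetical file names.
model steel.mod;   # first file in the table: declarations, read in model mode
data steel.dat;    # second file: a .dat file is read in data mode
solve;             # effect of the "Solve after executing the files listed above" checkbox
```

Swapping the two files in the table would correspond to reading the data before the sets and parameters are declared, which is why the order set with the Up and Down buttons matters.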

Figure 5.5: Selecting the files for the launch configuration

Once you are ready to run the set of files, hit the Run or Debug button and your files will be executed.

5.3 Default launch

This launch is available for projects that contain only one model (.mod) file and one data (.dat) file. For such a simple project structure it avoids the hassle of creating a launch configuration by hand (which would otherwise be needed, as this is a multiple file launch). You do not need to do anything special to use this type of launch; AMPLDev automatically checks whether the project satisfies the condition of one model file and one data file and, if so, creates a default configuration. You use the same shortcuts as described for the single file launch (section 5.1). The main difference is that, for a project with one model file and one data file, you can also launch from the project folder's context menu via Run As or Debug As. If the condition is not satisfied, AMPLDev will ask you to create a launch configuration.

5.4 Viewing output and errors

While a run is in progress, the output and any errors are shown in the Console view. This view is generally located in the bottom panel of AMPLDev. If it is not visible, you can open it via the Window menu by selecting Show View and then choosing the Console view. The console is interactive; you can enter and execute arbitrary commands there (see figure 5.6). Text entered by the user is coloured differently from the output text.

In the case of an error, detailed information and, if available, a hyperlink to the error location are shown in the console (figure 5.7). In this example, there is a syntax error because cost has been mistyped. The location hyperlink contains the name of the file, the line number and the offset from the beginning of the file. Clicking the hyperlink opens the file in the editor and highlights the line presumably containing the error.
Note that in some cases the error may be located in the code above this line, for example in the case of a missing semicolon. You can use the display and print commands to output various information from your AMPL code to the console.
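The commands mentioned above can be tried directly in the interactive console. The following is a minimal sketch; the declarations are toy values introduced only so that the commands have something to show, and they are not part of the steel example.

```ampl
# Toy declarations (illustrative names and values only)
param avail = 40;
set PROD = {"bands", "coils"};
param rate {PROD} default 100;

display avail;                       # shows the value in "name = value" form
display rate;                        # an indexed parameter is listed member by member
print "hours available:", avail;     # prints the items without the name = value layout
printf "%s: %d tons per hour\n", "bands", rate["bands"];   # C-style formatted output
```

display is usually the quickest way to inspect a model entity, while printf gives control over the exact layout of the text written to the console.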


Figure 5.6: Console view

Figure 5.7: Error shown in the Console view

5.5 Viewing the solution

The solution report is displayed in the Solution view, which is generally located in the right panel of the AMPL perspective. If you cannot see it, you can open it via Window -> Show View. This view is only populated if the Solve after executing the files listed above option is checked for the launch configuration (see figure 5.4).

There are three sections in the Solution view (see figure 5.8). The first part states the Objective value, with the name of the objective function in brackets. The first table lists the variables and their corresponding values. Reduced Cost is the amount by which the objective value would change if the value of the corresponding variable were increased by one. Slack is the distance of the variable's value from its bound. The second table lists the constraints, where Body is the value of the body of the constraint, Dual Value indicates how much the objective value would change if the constraint were relaxed by one unit, and Slack is the distance between the body of the constraint and its bound.
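The same quantities can also be printed in the console with AMPL's built-in suffixes. The sketch below uses a toy model whose names (x, y, profit, capacity) are purely illustrative; that the Solution view columns correspond exactly to these suffixes is an assumption, not something stated in this guide.

```ampl
# Toy linear program (illustrative only)
var x >= 0, <= 4;
var y >= 0;
maximize profit: 3*x + 2*y;
subject to capacity: x + y <= 6;

solve;    # uses whichever solver is currently set by "option solver"

display profit;                                          # objective value
display x, x.rc, x.slack;                                # value, reduced cost, slack of a variable
display y, y.rc, y.slack;
display capacity.body, capacity.dual, capacity.slack;    # body, dual value, slack of a constraint
```

This is handy when the Solve after executing the files listed above option is not checked and the solve; statement is written in the code instead.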

Figure 5.8: Solution view

Part III AMPL: A Mathematical Programming Language

Chapter 6 Learning AMPL by Example

This chapter gives an overview of the basic syntax required to start modelling in AMPL. We use a simple real world example to introduce this syntax.

6.1 A Linear Programming Model

National Insurance Associates (NIA) carries an investment portfolio of stocks, bonds and other investment alternatives. Currently £200,000 of funds are available and must be considered for new investment opportunities. The four stock options NIA is considering and the relevant financial data are as follows:

    Stock                          A       B       C       D
    Price per share (£)            100     50      80      40
    Annual rate of return          0.12    0.08    0.06    0.10
    Risk measure per £ invested    0.10    0.07    0.05    0.08

Table 6.1: Financial Data

The risk measure quantifies the risk associated with the stock. It could simply be the mean absolute deviation, and it indicates the uncertainty that the stock will realise its projected annual return: higher values indicate greater risk. NIA's top management has stipulated the following investment guidelines:

1. The annual rate of return for the portfolio must be at least 9%.
2. No one stock can account for more than 50% of the total sterling investment.

For the NIA problem, the Linear Programming (LP) model needs to determine the number of shares of each stock that NIA should buy so that risk is minimal and the guidelines are met. These are referred to as the objective and the constraints, respectively.

Objective: Minimise the risk as much as possible while buying the shares. Based on table 6.1, we risk 0.10 on each pound invested in stock A, 0.07 for stock B, 0.05 for stock C and 0.08 for stock D. Therefore, if we buy $x_1$ shares of stock A, our risk exposure is $0.10 \times 100\,x_1$, since each share of stock A costs £100. Along the same lines, we can formulate our risk as

\[ \text{Risk} = (0.10 \times 100\,x_1) + (0.07 \times 50\,x_2) + (0.05 \times 80\,x_3) + (0.08 \times 40\,x_4) \]

Next, we formulate the constraints. Some constraints are given explicitly, such as the investment guidelines in our problem, but some need a little more logical reasoning, such as constraint 1, described below.

Constraint 1: NIA has given a budget (£200,000) for the amount available to invest in these four stocks, so we must comply with the budget:

\[ (100\,x_1) + (50\,x_2) + (80\,x_3) + (40\,x_4) \le 200000 \]

Constraint 2: The first guideline says that the annual rate of return must be at least 9%. Table 6.1 contains the annual rate of return for each stock. From that, we have the following:

\[ (0.12 \times 100\,x_1) + (0.08 \times 50\,x_2) + (0.06 \times 80\,x_3) + (0.10 \times 40\,x_4) \ge 0.09 \times 200000 \]

Constraint 3: The second guideline says that no single stock can be more than 50% of the total sterling investment. So the investment in each stock can be at most £100,000:

\[ 100\,x_1 \le 100000, \quad 50\,x_2 \le 100000, \quad 80\,x_3 \le 100000, \quad 40\,x_4 \le 100000 \]

This linear program is summarised in listing 6.3. An algebraic formulation of the complete model is given below, followed by the corresponding data for the NIA problem.

Listing 6.1: Algebraic Formulation of NIA Model

Given:
    $S$                 a set of stocks
    $a_i,\ i \in S$     price per share of stock $i$
    $b_i,\ i \in S$     annual rate of return for stock $i$
    $c_i,\ i \in S$     risk per pound sterling invested in stock $i$
    $R$                 required annual rate of return for portfolio
    $M$                 maximum fraction of total pound sterling investment a stock can account for
    $F$                 total funds available

Define:
    $x_i,\ i \in S$     number of shares bought of stock $i$

\[
\begin{aligned}
\text{Minimise:}\quad & \sum_{i \in S} c_i a_i x_i && \text{risk over all shares bought for all stocks} \\
\text{subject to:}\quad & \sum_{i \in S} a_i x_i \le F && \text{amount invested must be less than or equal to available funds} \\
& \sum_{i \in S} b_i a_i x_i \ge R F && \text{total return must be greater than or equal to required return of portfolio} \\
& a_i x_i \le M F \quad \forall i \in S && \text{investment in stock } i \text{ must not exceed the maximum a stock can account for}
\end{aligned}
\]

Listing 6.2: Data for the NIA Model

    S = {A, B, C, D}
    R = 0.09
    M = 0.5
    F = 200000

    i      A       B       C       D
    a_i    100     50      80      40
    b_i    0.12    0.08    0.06    0.10
    c_i    0.10    0.07    0.05    0.08

Translating the model into AMPL

Listing 6.1 describes a generalised model which does not have to be specific to the NIA problem; it is simply a minimisation model subject to some constraints. Only when the data in Listing 6.2 is applied to the model are we addressing a particular problem. The NIA problem can now be written out explicitly as shown in Listing 6.3.

Listing 6.3: Linear Program of NIA problem

\[
\begin{aligned}
\text{Minimise:}\quad & 10x_1 + 3.5x_2 + 4x_3 + 3.2x_4 \\
\text{subject to:}\quad & 100x_1 + 50x_2 + 80x_3 + 40x_4 \le 200000 \\
& 12x_1 + 4x_2 + 4.8x_3 + 4x_4 \ge 18000 \\
& 100x_1 \le 100000 \\
& 50x_2 \le 100000 \\
& 80x_3 \le 100000 \\
& 40x_4 \le 100000
\end{aligned}
\]

Such a formulation is manageable when the data set is small enough to be written explicitly, but it becomes harder as the data set grows. AMPL is a language that encourages separation of model and data; it allows you to define your model and data as their corresponding algebraic formulations, described in Listings 6.1 and 6.2. This makes the model independent of the data and lets the language manage the formulation of the optimisation problem corresponding to Listing 6.3. The equivalent AMPL representation of the NIA problem is specified in Listings 6.4 and 6.5.

Listing 6.4: AMPL code for Model

    # Sets
    set stocks ;

    # Parameters
    param reqreturn ;
    param maxallow ;
    param totalfunds ;
    param price { stocks };
    param return { stocks };

    param risk { stocks };

    # Variables
    var buyamount { stocks } >= 0;

    # Objective
    minimize riskobj :
        sum {i in stocks } ( risk [i] * price [i] * buyamount [i]);

    # Constraints
    subject to investment :
        sum {i in stocks } ( price [i] * buyamount [i]) <= totalfunds ;
    subject to ret :
        sum {i in stocks } ( return [i] * price [i] * buyamount [i]) >= reqreturn * totalfunds ;
    subject to invest {i in stocks }:
        price [i] * buyamount [i] <= maxallow * totalfunds ;

Listing 6.5: AMPL code for Data

    data ;

    # Sets
    set stocks := A B C D;

    # Parameters
    param reqreturn := 0.09;
    param maxallow := 0.5;
    param totalfunds := 200000;
    param price := A 100 B 50 C 80 D 40;
    param return := A 0.12 B 0.08 C 0.06 D 0.10;
    param risk := A 0.10 B 0.07 C 0.05 D 0.08;

Because the model is defined separately from the data, it can be re-used in different situations, for example when the total funds have increased or when the portfolio can include more than the four stocks described. The AMPL syntax follows conventional mathematical notation very closely, so it is easy to take the algebraic model and convert it into AMPL. The basic components of AMPL are:

- Sets

- Parameters
- Variables (whose values the solver is to determine)
- Objective (to be maximised or minimised)
- Constraints (which the solution must satisfy)

Comments can appear anywhere in the AMPL code. They start with a # symbol and extend to the end of the line.

Sets

A set can be defined as a collection of well-defined objects. In our example, we have one fundamental set: the stocks. To declare a set, AMPL uses the set keyword before the unique name that identifies the set.

    set stocks ;

The data associated with this set is given in Listing 6.5.

Note: All declarations in AMPL must end with a semi-colon (;).

Parameters

Parameters are the numerical values associated with the model. They are generally used to define the available data. In AMPL, we use the param keyword to declare a parameter. We have three parameters that take scalar values:

    param reqreturn ;
    param maxallow ;
    param totalfunds ;

They represent the required annual rate of return for the portfolio, the maximum fraction of the investment that a single stock can account for, and the total funds available for the investment, respectively. The scalar values are defined in the data (Listing 6.5).

Parameters do not necessarily represent a single value; they can also be vectors or matrices of numerical values. The following parameters are groups of values indexed over the set of stocks, meaning that each parameter has one numerical value for each object in the set. The indexing set is written after the unique identifier, within curly brackets.

    param price { stocks };
    param return { stocks };
    param risk { stocks };

The parameter's data assignment demonstrates its ability to take more than one numerical value:

    param price := A 100 B 50 C 80 D 40;

Here, the parameter price is assigned a value for each stock defined in the set stocks.
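When several parameters share the same indexing set, their data can also be written as one table instead of one param statement each. The sketch below restates the stock data of Listing 6.5 in that tabular form; it is an alternative layout offered for illustration rather than the form used in the listing, and here the members of the set stocks are taken from the first column of the table.

```ampl
set stocks;
param price {stocks};
param return {stocks};
param risk {stocks};

data;
# One row per stock; columns follow the order of the names after "param: stocks:".
param: stocks: price  return  risk :=
       A       100    0.12    0.10
       B        50    0.08    0.07
       C        80    0.06    0.05
       D        40    0.10    0.08 ;

model;                          # switch back from data mode
display price, return, risk;    # check what was read
```

With this layout, adding a fifth stock means adding a single row, which keeps the values that belong to one stock together.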

Variables

These are the decision variables whose values are determined by the optimising algorithm, unlike parameters, whose values are given by the modeller. Apart from that, variable declarations are the same as parameter declarations. In AMPL, a variable is declared using the keyword var, and variables can also be indexed over sets.

    var buyamount { stocks } >= 0;

In our example, there is only one decision variable, which represents the amount of each stock to buy. Hence, it is indexed over the set stocks. Variables and parameters can also carry restrictions on their numerical values. For the variable buyamount, we have imposed a non-negativity restriction, i.e. the value determined for every stock must be greater than or equal to 0.

Objective

The objective function to be optimised is defined by a linear or non-linear expression. The expression is generally defined using the sets, parameters and variables. Depending on the model, we write the keyword minimize or maximize before the name of the function.

    minimize riskobj : sum {i in stocks } ( risk [i] * price [i] * buyamount [i]);

Please take special care with the spelling, as AMPL uses American English. Here, there are two new concepts: the dummy index i and the keyword sum. The dummy index is declared by associating it with a set using the in keyword: {i in stocks}. It is used in a subscript expression ([ ]) to obtain a member of a parameter or variable that has been indexed over this set. For example, the declaration of the parameter risk (param risk {stocks};) has the simplest form of an indexing expression: {stocks}. When we wish to refer to a particular member, we use the dummy index: risk[i]. The keyword sum is used with an indexing expression to sum over all the values defined by the index. In our example, we sum the products of the risk, price and amount to buy for each stock.
The scope of the dummy index extends only to the end of this linear expression: the scope of the sum.

Constraints

Constraints generally start with the keywords subject to, but even this is optional; AMPL assumes that any declaration not beginning with a keyword is a constraint. The algebraic description of a constraint may be an equality or an inequality composed of the parameters and variables. The simplest constraint imposes a limitation; for example, the investment constraint:

    investment : sum {i in stocks } ( price [i] * buyamount [i]) <= totalfunds ;

Here, we make sure that the total cost of the shares bought is less than or equal to the available funds: an upper limit. Similarly, the ret constraint places a lower limit on the return expected from our investment. Most of the constraints in large linear programming models are defined as indexed collections by giving an indexing expression after the constraint name.

    invest {i in stocks }: price [i] * buyamount [i] <= maxallow * totalfunds ;

The constraint invest imposes a limit on each stock, represented by the dummy index i, by not letting any one stock account for more than the fraction maxallow of the total sterling investment.
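To see the model at work, it can be solved and inspected from the console or from a short script. The sketch below assumes that Listing 6.4 has been saved as nia.mod and Listing 6.5 as nia.dat (both file names are hypothetical) and that the parameter names assigned in the data file match those declared in the model; the solver name is the one used in the steel example and may need to be changed.

```ampl
# Sketch only: nia.mod and nia.dat are hypothetical file names.
model nia.mod;
data nia.dat;
option solver cplexamp;    # replace with a solver installed on your machine
solve;

display riskobj;                                    # minimised total risk
display buyamount;                                  # number of shares to buy of each stock
display {i in stocks} (price[i] * buyamount[i]);    # pounds invested in each stock
display investment.slack, ret.slack;                # unused funds, return above the required minimum
```

The third display statement re-uses the indexing expression and dummy index introduced for the objective, which shows that the same notation works in commands as well as in declarations.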

Chapter 7 Databases and Spreadsheets

AMPL's structure of indexed data is very similar to the structure of relational tables commonly used in database applications. Users can take advantage of this similarity through AMPL's table declaration. This chapter illustrates the use of databases and spreadsheets in AMPL; a simple model (the diet problem) is used for that purpose and is introduced in the first section. The chapter then describes how to prepare the data for AMPL to use, how to establish correspondences between AMPL and data entities, and how to perform the actual read/write operations.

7.1 Example: A Diet Problem

In the diet problem, we must choose a mix of prepared foods that satisfies the daily required nutrients. The foods must be chosen so that these requirements are met at minimal cost for a week's worth of food, i.e. at least 700% of each required nutrient. Table 7.1 lists the prepared food items with their corresponding costs.

    FOOD    Food Description    Cost    f_min    f_max
    BEEF    Beef
    CHK     Chicken
    FISH    Fish
    HAM     Ham
    MCH     Mac & Cheese
    MTL     Meat Loaf
    SPG     Spaghetti
    TUR     Turkey

Table 7.1: Prepared Food Data

f_min and f_max impose a minimum and maximum limit on the amount of each food chosen. If we try to solve the diet problem without any limit on the chosen foods, the solution tends to contain too much of specific food types and none at all of others, which is not ideal in a real world situation. In the same manner we put minimum and maximum limits on the amount of each nutrient (table 7.3): 700% and 20000% respectively, 0 mg and mg for sodium (NA), and Cal and Cal for total calories (CAL).

We have two sets that contain the food and nutrient items: FOOD and NUTR. For the data already described, we require 6 parameters: cost, f_min, f_max, amt, n_min and n_max. To represent the data shown in table 7.2, we use the parameter amt. It stands for the amount of each nutrient present in each type of food and is therefore indexed over both sets.
The AMPL model is shown in listing

    FOOD    A(%)    C(%)    B1(%)    B2(%)    NA(mg)    CAL(Cal)
    BEEF
    CHK
    FISH
    HAM
    MCH
    MTL
    SPG
    TUR

Table 7.2: Nutrient Data

    NUTR    n_min    n_max
    A
    C
    B1
    B2
    NA
    CAL

Table 7.3: Nutrient Limits

Listing 7.1: Model file for Diet problem

    set FOOD ;
    set NUTR ;

    param cost { FOOD } > 0;
    param f_min { FOOD } >= 0;
    param f_max {j in FOOD } >= f_min [j];

    param n_min { NUTR } >= 0;
    param n_max {i in NUTR } >= n_min [i];

    param amt { NUTR, FOOD } >= 0;

    var Buy {j in FOOD } >= f_min [j], <= f_max [j];

    minimize total_cost : sum {j in FOOD } cost [j] * Buy [j];

    subject to diet {i in NUTR }:
        n_min [i] <= sum {j in FOOD } amt [i,j] * Buy [j] <= n_max [i];

7.2 Preparing the Data

There are various ways to pass the data introduced in the previous section to AMPL. The most obvious is to write it directly as AMPL data; a possible implementation is given in listing 7.2.

Listing 7.2: Data file for Diet problem

    set NUTR := A C B1 B2 NA CAL ;
    set FOOD := BEEF CHK FISH HAM MCH MTL SPG TUR ;

    param : cost f_min f_max := BEEF CHK FISH HAM MCH MTL SPG TUR ;

    param : n_min n_max := A C B1 B2 NA CAL ;

    param amt (tr): A C B1 B2 NA CAL := BEEF CHK FISH HAM MCH MTL SPG TUR ;

Realistically, we may want to store the data in an external database or spreadsheet, either because the data is very large and cumbersome to type out in AMPL or because it is already stored externally. AMPL provides commands which allow you to import data from an external database or spreadsheet into AMPL's indexed data structure. This is primarily done by the table declaration,

    table table-name inout_opt string-list_opt : key-spec, data-spec, data-spec, ... ;

(where the subscript opt marks an optional part) and by the corresponding read table command,

    read table table-name ;

The various parts of the above declaration are described in the following sections.

Spreadsheets

In the following syntax and examples, we use Microsoft Excel to describe connections to an external spreadsheet. In an Excel file called diet.xls, we create a range called Foods with the column names cost, f_min and f_max. Since these parameters are indexed over the set FOOD, we add a column to represent it


(see AMPL code listing 7.1). It is important to create a range and give it a suitable unique name, as this is what we will use when reading and writing data between the spreadsheet and AMPL (see figure 7.1(a)).

(a) Foods data (b) Nutrients Data
Figure 7.1: Excel Spreadsheet for Foods and Nutrients data

In the Foods range, the key column is FOOD. The key columns are determined by the sets a parameter is indexed over; they can also be inferred from the table (figure 7.1(a)) as the minimum set of columns needed to identify one row without ambiguity. Similarly, we create a range called Nutrients for n_min and n_max, indexed over NUTR, and a range called Amounts for the third parameter, amt. The latter is slightly different because the parameter is indexed over two sets: NUTR and FOOD. To handle this, there is one column for each set, and the rows list the unique pairs of members of the two sets (see figure 7.2).

In order to connect to the spreadsheet, the optional string-list of the table declaration is used. For spreadsheets, the string-list consists of the table handler name, the filename and the optional range name. The range name needs to be specified only if it differs from the AMPL table-name in the table declaration.

    table Foods IN "ODBC" "diet.xls": ... ;

The table handler used here is ODBC (Open Database Connectivity); as the name suggests, it provides access to databases and spreadsheets through an ODBC connection. The Excel filename is "diet.xls". The filename may also contain the directory if the file is at a different location. If the AMPL table-name is different from the range name, the declaration changes to

    table FoodInput IN "ODBC" "diet.xls" "Foods": ... ;

Since the AMPL table name is now FoodInput, we must specify the name of the range in the Excel file that contains the data: "Foods".
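Putting the three ranges together, the declarations for the whole workbook might look like the sketch below. It assumes that diet.xls contains ranges named Foods, Nutrients and Amounts whose column headings match the AMPL names, as described above; the part after the colon follows the forms used later in section 7.3, and the path in the commented-out variant is hypothetical.

```ampl
# Sets and parameters as declared in the diet model (listing 7.1)
set FOOD;
set NUTR;
param cost {FOOD} > 0;
param f_min {FOOD} >= 0;
param f_max {j in FOOD} >= f_min[j];
param n_min {NUTR} >= 0;
param n_max {i in NUTR} >= n_min[i];
param amt {NUTR, FOOD} >= 0;

# Range names equal the AMPL table names, so no third string is needed.
table Foods     IN "ODBC" "diet.xls": FOOD <- [FOOD], cost, f_min, f_max;
table Nutrients IN "ODBC" "diet.xls": NUTR <- [NUTR], n_min, n_max;
table Amounts   IN "ODBC" "diet.xls": [NUTR, FOOD], amt;

# Hypothetical variant: workbook in another directory, AMPL name differs from the range name
# table NutrInput IN "ODBC" "C:/data/diet.xls" "Nutrients": NUTR <- [NUTR], n_min, n_max;

read table Foods;
read table Nutrients;
read table Amounts;
```

Reading Foods and Nutrients first populates the sets FOOD and NUTR, so the pairs found in the Amounts range refer to members that already exist.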
Databases

AMPL allows connections to any SQL database as long as the appropriate ODBC handler is installed. For the purposes of this example we have used MySQL.

Figure 7.2: The amount of nutrients in each food. (The list is not completely visible.)

We create an SQL database called diet, in which there is a table called Foods with column names that match the parameter and set names. Similarly, we create the tables Nutrients and Amounts, with columns corresponding to the appropriate parameters and sets. The main difference between spreadsheets and databases lies in the connection strings. There are two ways to connect to a database: a Data Source Name (DSN) or a standard connection string.

Standard Connection String

The string-list component of the table declaration is similar to the one used for Excel; it typically has three parts: the table handler name (which is "ODBC" again), a connection string and a relational table name (again, if this is omitted, the relational table name is taken to be the AMPL table-name). The interesting part is the connection string; a typical example of a MySQL connection string is

    "DRIVER={MySQL ODBC 5.1 Driver}; SERVER=localhost; DATABASE=diet; USER=myusername; PASSWORD=mypassword; OPTION=3; INITSTMT=SET sql_mode=ANSI_QUOTES;"

The connection parameters are:

- DRIVER: The name of the driver which allows the ODBC handler to connect AMPL and the SQL database.
- SERVER: The name of the server that hosts the database. If the database is on the machine you are using (a local database), the server name is localhost.
- DATABASE: The name of the database. In this case, it is diet.

- USER: The username of the account being used. If the account is the root account, use root.
- PASSWORD: The password associated with the user account.
- OPTION: This is specific to MySQL and makes the server behave in a specific manner (please refer to the MySQL reference for details). We always set OPTION=3; within this manual.
- INITSTMT: This is also specific to MySQL and is used here to switch to ANSI mode (SET sql_mode=ANSI_QUOTES). When connecting to a MySQL database through AMPL, the quotes must be in ANSI mode, or else AMPL throws a table not found error.

The connection string changes with the MySQL ODBC connector version and the SQL application used. Information regarding connection strings is easily available online (Example: http: //).

Hence, the complete table declaration for Foods is

    table dietfoods IN "ODBC"
        "DRIVER={MySQL ODBC 5.1 Driver}; SERVER=localhost; DATABASE=diet; USER=myusername; PASSWORD=mypassword; OPTION=3; INITSTMT=SET sql_mode=ANSI_QUOTES;"
        "foods": ... ;

If we use foods instead of dietfoods as the AMPL table-name, the third string in the above declaration is not required.

Data Source Name (DSN)

An alternative to a complete connection string is to use the ODBC configuration utility to create a DSN. For Windows users, go to Control Panel -> Administrative Tools -> Data Sources (ODBC). The ODBC Data Source Administrator dialog will pop up; select Add under User DSN or System DSN. You will be asked to select a driver; in our case, we choose MySQL ODBC 5.1 Driver and click Finish. Once Finish is clicked, MySQL opens its Data Source Configuration utility (see figure 7.3). The connection parameters are the same as described in the previous section. The Initial Statement corresponds to INITSTMT and can be found under Details, in the Connection tab. It contains the SQL statement: SET sql_mode=ANSI_QUOTES;.
With the help of the utility, you can also Test the connection to the database to make sure it is working. The table declaration using a DSN is given below:

    table dietfoods IN "ODBC" "Diet DSN" "foods": ... ;

Here the second string contains the data source name, which is "Diet DSN" (figure 7.3). Again, if the AMPL table-name is the same as the SQL table name, the third string ("foods") is not required.

7.3 Reading Data from Tables

In order to use an external relational table, we employ a table declaration that specifies a read/write status of IN. The general form of this kind of declaration is

    table table-name IN string-list_opt : key-spec, data-spec, data-spec, ... ;


Figure 7.3: MySQL Data Source Configuration Utility

Each table declaration has two parts divided by the colon. Before the colon, the declaration provides general information:

- table-name: the name by which the table is known within AMPL;
- the keyword IN: states that the default for all non-key table columns is read-only (AMPL uses the columns only as input and does not write to them);
- the optional string-list: provides the information needed to locate the table in an external database file, and is specific to the database type and access method being used (described in section 7.2).

After the colon, the declaration gives the details of the correspondence between AMPL entities and relational table columns. The key-spec names the key columns and is surrounded by square brackets, [ ... ]; each data-spec gives a data column.

The table declaration only defines a correspondence. To read values from the columns of a relational table into the AMPL sets and parameters, we must use an explicit read table command. The table-name must be the same in the declaration and in the command.

    read table table-name ;

In the Diet problem, to read data from the table Foods in the spreadsheet, we would use the following declaration followed by the read command:

    table Foods IN "ODBC" "diet.xls": FOOD <- [FOOD], cost, f_min, f_max ;
    read table Foods ;

The string-list "ODBC" "diet.xls" specifies that we are connecting to the external relational database through ODBC (Open Database Connectivity). For the sake of simplicity, we will not

keep writing out the string-list; it is the only part that varies with the database, while the rest of the syntax remains the same. In the second part, the expression FOOD <- [FOOD] indicates that the entries in the key column FOOD will populate the AMPL set FOOD. cost, f_min and f_max are the names of the other three columns in the relational table, from which the values are read into the corresponding parameters cost, f_min and f_max.

Read Parameters only

Values from the data columns are assigned to like-named parameters in AMPL. It is sufficient to give a square-bracketed list of key columns followed by a list of data columns. The simplest case is when there is only one key column:

    table Foods IN : [FOOD], cost, f_min, f_max ;

Here, the columns cost, f_min and f_max are associated with their corresponding parameters in the current AMPL model. Note that we are not using the FOOD <- [FOOD] expression, since in this case we do not read the key column values into a set, only the parameters. When the following command is executed,

    read table Foods ;

the relational table is read one row at a time. The key column entry is used as a subscript to the parameters; for example, referring to table 7.1, cost["HAM"] is assigned the cost value listed in the HAM row.

Reading multidimensional parameters is done in the same manner as for a single indexing set. The name of each data column must match an AMPL parameter, and the dimension of the parameter's indexing set must equal the number of key columns.

    table Amounts IN : [NUTR, FOOD], amt ;
    read table Amounts ;

Values of unindexed (scalar) parameters may also be supplied by a relational table; the table then has only one row and no key columns, i.e. each data column contains a single value. The table declaration has an empty key-spec, [].
For example, if we have a parameter that holds the number of time periods (param T;), the table declaration can be written as

    table TimePeriods IN : [], T ;
    read table TimePeriods ;

Read Sets and Parameters

As mentioned at the start of section 7.3, we can read the members of a set from the key columns at the same time as the parameters indexed over that set are read from the table, by using the following expression for the key-spec:

    set-name <- [key-col-spec, key-col-spec, ... ]

The simplest case involves reading a one-dimensional set and the parameters indexed over it. In the diet problem, we have

    table Foods IN : FOOD <- [FOOD], cost, f_min, f_max ;

In this particular case, where the key column FOOD has the same name as the AMPL set FOOD, the table declaration can have an abbreviated key-spec,

    table Foods IN: [FOOD] IN, cost, f_min, f_max;

We can also write the table declaration in the following manner if the key column's name in the relational table is different from the name of the AMPL set,

    table Foods IN: FOOD <- [FoodKey], cost, f_min, f_max;

Here, the entries of the key column FoodKey from the relational table are read into the AMPL set FOOD.

A similar syntax is used for the multidimensional case. To read the parameter amt, which is indexed over two sets, NUTR and FOOD, we can use the following declaration,

    table Amounts IN: PAIR <- [NUTR, FOOD], amt;

When read table Amounts; is executed, each row of the relational table provides a pair of entries from the key columns NUTR and FOOD. These are read into AMPL as members of the two-dimensional set PAIR. Hence, we have to update the code of the diet model from listing 7.1 to the following,

    set FOOD;
    set NUTR;
    set PAIR within {NUTR, FOOD};
    ...
    param amt {PAIR} >= 0;
    ...

We have added a new set, PAIR, whose members are pairs drawn from NUTR and FOOD, and we have updated the indexing set of the amt parameter.

Establishing Correspondences

There are times when the AMPL model's set and parameter declarations do not conform in all respects to the organization of tables in the external databases. The most common difference is that the names of the parameters differ from their corresponding data columns in the relational tables. To resolve such an issue, we can use the following form in the data-spec,

    param-name ~ data-col-name

Thus, in the diet problem, if Foods is defined as,

    table Foods IN: [FOOD], cost, f_min ~ lowerlim, f_max ~ upperlim;

then the AMPL parameters f_min and f_max are read from the data columns lowerlim and upperlim in the external relational table. Similarly, index ~ key-col-name can be used in the key-spec to associate a dummy index with a key column; the index can then be used to subscript the param-name in the data-specs.
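Reading a set from a key column and renaming data columns can be combined in one declaration. The following is a minimal sketch, assuming an ODBC table handler is available and that diet.xls contains a range named Foods with the columns FoodKey, cost, lowerlim and upperlim. That file layout, and the parameter bounds in the declarations, are assumptions made for illustration.

```ampl
# Sketch only. Assumed (not stated in this manual): diet.xls holds a range
# "Foods" with columns FoodKey, cost, lowerlim, upperlim.
set FOOD;
param cost  {FOOD} > 0;
param f_min {FOOD} >= 0;
param f_max {j in FOOD} >= f_min[j];

# FOOD is filled from the key column FoodKey; f_min and f_max are read
# from the differently named columns lowerlim and upperlim.
table Foods IN "ODBC" "diet.xls":
   FOOD <- [FoodKey], cost, f_min ~ lowerlim, f_max ~ upperlim;

read table Foods;
display FOOD, cost, f_min, f_max;
```

The display command is only a quick check that the set and the three parameters were populated by the read.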
There are three common cases to benefit from such a correspondence,

Case 1: When the numbering of a parameter index differs between the AMPL parameter and the relational table. For example, if the relational table counts time periods from 0 but in the model the time periods start at 1, the table declaration can be as follows,

    table BondPrices IN: [b ~ BOND, t ~ TIME], price[b,t+1] ~ price;

Case 2: When the order of the indices of a multidimensional parameter in AMPL does not match the order of the key columns. For example, if we have another amount parameter in the diet model called amt2, declared as,

    param amt {NUTR, FOOD} >= 0;
    param amt2 {FOOD, NUTR} >= 0;

The table declaration for these two parameters would be as follows,

    table Amounts IN: [n ~ NUTR, f ~ FOOD], amt, amt2[f,n] ~ amt2;

Case 3: When the values of an AMPL parameter are divided among several data columns. For example, if the nutritional data of amt for BEEF and CHK are given in separate data columns, amtbeef and amtchk respectively, then the table declaration is,

    table Amounts IN: [n ~ NUTR], amt[n,"BEEF"] ~ amtbeef, amt[n,"CHK"] ~ amtchk;

In all these cases, when using dummy indices, it is important to use a correspondence (~ data-col-name) even when the parameter name is the same in AMPL and in the relational table. This is because a subscripted expression such as amt2[f,n] is not a column name in the relational table, and hence we must associate it with the column amt2. In general, wherever the AMPL expression for the recipient is not a valid data column name in the relational table, a ~ data-col-name must be used.

Other values

A table declaration used for reading data can have assignable expressions anywhere a parameter is permitted. In AMPL, an assignable expression is one which can have a value assigned to it. Hence variables and constraint names can be assigned values while reading from a relational table. This is primarily done when evaluating a previously stored solution or to provide initial values to the solver.
For example,

    table Foods IN: [FOOD] IN, cost, f_min, f_max, Buy;
    read table Foods;

We read the members of the set FOOD, the parameter values and also the initial values for the variable Buy.

7.4 Writing Data into Tables

When writing data to a relational table, the table declaration uses a read/write status of OUT.

    table table-name OUT string-list_opt : key-spec, data-spec, data-spec, ... ;

The optional string-list is the same as the one used when reading from a table; it provides the information about the external database file and is specific to the database type and access method being used. Once again, we omit this string-list from the following examples where it is irrelevant.

Just as when reading from a table, the actual writing only takes place when the write table table-name; command is executed. Generally, the external file specified in the table declaration is created if it does not exist and overwritten if it does. The same applies to a relational table that is specified in the string-list. The table declaration can also specify appending or overwriting columns in an already existing table, which is described in section 7.5.

TIP: Generally, the write command should be called after the model is solved if there are entities in the data-specs that are populated while solving the model. Otherwise, those data columns will be empty and the output table may not be useful.

Although the key-specs and data-specs are similar to those used for reading from a table, there are some differences. When writing to a table, the syntax allows a broader range of AMPL expressions, because when reading, the data already exists in the table, whereas when writing, the data needs to be determined by AMPL. The next two sections describe how the rows to be written are inferred from either the data-specs or the key-spec.

Rows inferred from the data specifications (data-specs)

When only the key column names for the relational table are specified in a bracketed list, i.e. without specifying any indexing AMPL sets,

    [key-col-name, key-col-name, ...]

then the table's rows are determined by the union of the indexing sets of the AMPL entities stated in the data-specs: inference is implicit.
For this reason, all the items listed in the data-specs must have the same dimension. For example, in the case of entities indexed over one set,

    table Foods OUT "ODBC" "diet.xls" "FoodsOut": [FoodName], f_min, Buy, f_max;

the string-list here specifies the connection type and the external filename, plus the relational table name "FoodsOut". When write table Foods; is executed, the following columns are created in the table: FoodName, f_min, Buy and f_max. Here, the implicit indexing set shared by the AMPL entities (f_min, Buy and f_max) is FOOD. Therefore, each row receives a member of the set FOOD in the column FoodName, and the remaining columns receive the values of f_min, Buy and f_max subscripted by that member.

Tables with more than one dimension are handled in a similar manner. The following commands,

    table Amounts OUT: [NUTR, FOOD], amt;
    write table Amounts;

will produce an output similar to figure 7.2. There is one row for each pair in the indexing set {NUTR, FOOD} of amt.
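The order of commands matters when a data-spec such as Buy is only populated by the solver. The following is a minimal sketch of a complete write sequence; the file names diet.mod and diet.dat are hypothetical placeholders for a model and data file that define FOOD, f_min, f_max and Buy, and an ODBC table handler is assumed to be available.

```ampl
# Sketch only. diet.mod and diet.dat are hypothetical file names.
model diet.mod;
data diet.dat;

table Foods OUT "ODBC" "diet.xls" "FoodsOut":
   [FoodName], f_min, Buy, f_max;

solve;               # Buy has meaningful values only after the solve
write table Foods;   # one row per member of FOOD, the shared indexing set
```

Declaring the table before the solve is harmless, since nothing is written until write table is executed.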

Using Correspondences

We can also export a relational table with suffixed variables or constraints, such as the dual and slack values of the constraint diet.

    table Nutrs OUT: [Nutrient], diet.lslack, diet.ldual, diet.uslack, diet.udual;

But this will produce an error, since most database software does not allow a dot in column names. Hence we establish a correspondence (~).

    table Nutrs OUT: [Nutrient],
        diet.lslack ~ lb_slack, diet.ldual ~ lb_dual,
        diet.uslack ~ ub_slack, diet.udual ~ ub_dual;

This gives each data column the name on the right of ~. A correspondence can also be used with unsuffixed names when you wish the data column name to differ from the AMPL entity name, for instance because the AMPL name is not a valid column name.

Correspondences can also be used with an AMPL expression instead of only an entity. These generally require indexing over the corresponding set using a dummy index. The following two declarations are examples of an unsuffixed name and of indexing over a dummy index to calculate the data:

    table Purchases OUT: [FoodName],
        Buy ~ servings, {f in FOOD} 100*Buy[f]/f_max[f] ~ percent;

or

    table Purchases OUT: [FoodName],
        {f in FOOD} (Buy[f] ~ servings, 100*Buy[f]/f_max[f] ~ percent);

Figure 7.4 shows a spreadsheet with the output of either of the previous table declarations.

Figure 7.4: Output for table Purchases

The data-spec can also contain expressions with operators like sum that need to define their own dummy indices. For example, the following declaration calculates the total of each nutrient over all the food products,

    table Totals OUT: [NUTR], {n in NUTR} (sum {f in FOOD} amt[n,f] ~ TotNut);

Rows inferred from a key specification (key-spec)

The other option for writing to a relational table is to determine the table rows by explicitly specifying the AMPL sets.

    set-spec -> [key-col-spec, key-col-spec, ...]

set-spec can be the name of an AMPL set or a set-expression enclosed in { }, and the key-col-specs are the names of the corresponding key columns of the relational table. In the section on reading sets and parameters, the arrow <- points in the other direction, indicating values that are read into the set; this declaration uses the opposite arrow, ->, to indicate that information is written from the set into the key columns. In the case of a one-dimensional set, the following declaration,

    table Foods OUT: FOOD -> [FoodName], f_min, Buy, f_max;

will create a table row for each member of the AMPL set FOOD when the write table Foods; command is executed. The key column can have the same name as the AMPL set,

    table Foods OUT: FOOD -> [FOOD], f_min, Buy, f_max;

In this special case, FOOD -> [FOOD] can also be written as [FOOD] OUT.

Tables with more than one dimension are handled by enclosing the list of sets in curly brackets; the number of key-col-specs must equal the dimension of the set-spec. The table declaration is therefore written as,

    table Amounts OUT: {NUTR, FOOD} -> [Nutrient, FoodName], amt;

Using Correspondences

The use of ~ is the same as in the previous Using Correspondences section. The only time the syntax differs is when using dummy indices. Since the rows are now determined from the key-spec, the definition of the dummy index can appear either in the set-spec or in the key-col-spec, as shown below respectively,

    table Purchases OUT: {f in FOOD} -> [FoodName],
        Buy[f] ~ servings, 100*Buy[f]/f_max[f] ~ percent;

and

    table Purchases OUT: FOOD -> [f ~ FoodName],
        Buy[f] ~ servings, 100*Buy[f]/f_max[f] ~ percent;

Figure 7.4 represents the output of both of the above declarations.
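The explicit key-spec form can also be combined with renamed suffix columns. The following is a sketch under the assumption that the constraint diet is indexed over the set NUTR, which this section implies but does not show, and that the model has already been loaded.

```ampl
# Sketch only. Assumed: constraint diet is indexed over NUTR and the
# model and data have been loaded; an ODBC handler is available.
table Nutrs OUT "ODBC" "diet.xls":
   NUTR -> [Nutrient],
   diet.lslack ~ lb_slack, diet.ldual ~ lb_dual,
   diet.uslack ~ ub_slack, diet.udual ~ ub_dual;

solve;               # slacks and duals are defined only after the solve
write table Nutrs;   # one row per member of NUTR
```

Naming the set explicitly makes the row set independent of the data-specs, which can be easier to read when several suffixed expressions are written at once.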

7.5 Reading and Writing into the same Table

The previous sections describe how to import data from an external relational table and how to export data into a different relational table. There could be cases in which we want to use the same external relational table for both actions: importing and exporting. There are two ways to handle this: either by using two separate table declarations, or by combining them into one declaration which specifies which columns are to be read and which are to be written.

Using two table declarations

By using the declarations described in the previous sections, one table declaration can be used for reading the data and a second one for writing into the same table. However, the write command will then overwrite the entire relational table. This is not always ideal because, when writing to an already existing table, one usually prefers to add or rewrite certain columns without rewriting the whole table. To distinguish between columns meant for reading and columns meant for writing, each column can be given a read/write status. For example, if the diet problem has an external table called Foods, with the columns cost, f_min and f_max for reading and Buy for writing, then the table declaration to read the data will be,

    table FoodInput IN "ODBC" "diet.xls" "Foods":
        FOOD <- [FoodName], cost, f_min, f_max;

In order to write the results into the same table without overwriting the table, the declaration can be as follows:

    table FoodOutput "ODBC" "diet.xls" "Foods":
        [FoodName], cost IN, f_min IN, Buy OUT, f_max IN;

First, read table FoodInput; is executed, wherein the three columns cost, f_min and f_max are read into AMPL; if the Buy column exists in the relational table, it is ignored. Once the model is solved, write table FoodOutput; is executed.
The only column that will be written is the column with an OUT status, Buy; the other columns are left as-is. If the AMPL table for output is declared in the following manner,

    table FoodOutput "ODBC" "diet.xls" "Foods": [FoodName], Buy OUT;

where all the columns have an OUT status, then most database software assumes that the entire table must be rewritten. Hence the above declaration will delete all the previously read data columns and replace them with the Buy column. If, however, we use the following declaration instead,

    table FoodOutput "ODBC" "diet.xls" "Foods": [FoodName], Buy;

then only the Buy column will be overwritten; the default status of the columns in such a situation is taken to be INOUT.

Using the same table declaration

When using the same declaration to perform reading and writing, the columns in the data-spec can carry the following read/write statuses:

    IN: for a column that is only for reading
    OUT: for a column that is only for writing
    INOUT: for a column that is for both reading and writing

The key-spec may use the following arrows:

    <- : reads the data in the key columns into an AMPL set
    -> : writes the data from the AMPL set into the key columns
    <-> : does both, reading and writing data between the AMPL set and the key columns

In the diet example, a single table declaration can be written as,

    table Foods "ODBC" "diet.xls" "Foods":
        FOOD <- [FoodName], cost IN, f_min IN, Buy OUT, f_max IN;

When read table Foods; is executed, the data from FoodName is read into the AMPL set FOOD and the data columns cost, f_min and f_max are read into their corresponding parameters. When the write table Foods; command is executed, data is written into the table's Buy column only.

7.6 Indexed collections of Tables and Columns

Sometimes declaring an indexed collection of tables, or defining an indexed collection of data columns within a table, is more convenient for the purposes of the problem. This can be done using the table declaration.

Indexed collections of Tables

Just as sets, parameters and other AMPL components can have an indexing expression, tables can also have an indexing expression.

    table table-name {indexing-expr}_opt string-list_opt : ... ;

Each member of the indexing set defines an AMPL table, denoted by appending a bracketed subscript or subscripts to the table-name. In the diet problem, the following declaration defines one table for each member of the set FOOD,

    table DietSens {j in FOOD} OUT "ODBC" "diet.xls" ("Sens" & j):
        [Food], f_min, Buy, f_max;

When write table DietSens; is executed, diet.xls will contain a range for every member of FOOD, each in its corresponding sheet (see figure 7.5).
Each of these ranges is named using the third item in the string-list (which defines the table/range name): ("Sens" & j), where j is the index running over the set FOOD. Therefore, the AMPL table DietSens["BEEF"] has a corresponding range called SensBEEF, and so on.

Figure 7.5: Table with indexed ranges (Shown above only for BEEF)

If, instead of the third item in the string-list, the second item (which defines the filename) contains a string expression with the index,

    table DietSens {j in FOOD} OUT "ODBC" ("DietSens" & j & ".xls"):
        [Food], f_min, Buy, f_max;

then on execution of the write command, an Excel file is created for each member of FOOD, with names DietSensBEEF.xls and so on. Each of these files will contain a single table whose default name will be DietSens.

Similarly, the table indexing expression can be used in a string expression to create different data-col-names within the same relational table,

    table DietSens {j in FOOD} "ODBC" "diet.xls": [Food], Buy ~ ("Buy" & j);

After the write command, the table contains a column for each member of FOOD, named according to the string expression ("Buy" & j), as figure 7.6 illustrates. This declaration does not have a read/write status of OUT; if it did, each column would have overwritten the previous one. Instead the data-spec Buy has been left without a read/write status, so in this case the default, INOUT, applies.

Figure 7.6: Table with indexed data columns


Indexed collections of Data Columns

In a table declaration, each data-spec generally refers to a different AMPL parameter, variable or expression. However, there are times when the data values that correspond to a single AMPL entity are split into multiple data columns, one for each member of a specified indexing set. This is most common when reading or writing two-dimensional tables. For example, the parameter,

    param amt {NUTR, FOOD} >= 0;

can also be represented as a two-dimensional table (see figure 7.7) instead of a list of unique pairs generated by {NUTR,FOOD} (as in figure 7.2).

Figure 7.7: Two-dimensional data table in Excel

The general form for specifying an indexed collection of data columns is,

    {indexing-expr} < data-spec, data-spec, ... >

Each data-spec has any of the forms previously seen, and the indexing-expr defines one or more dummy indices that run over the indexing set. These indices are used in expressions within the data-specs and also appear in string expressions that give the names of the columns in the external database. In the diet problem, such a table declaration can be written as,

    table dietamts IN "ODBC" "diet2.xls":
        [i ~ NUTR], {j in FOOD} <amt[i,j] ~ (j)>;

In figure 7.7 there is one key column, NUTR, and the rest are data columns headed by the members of the set FOOD. The key column is represented by [i ~ NUTR], which associates the first table column with the set NUTR and the index i. The data-specs are generated by {j in FOOD} <...>, one for each member of FOOD. Each data-spec has the form amt[i,j] ~ (j), where amt[i,j] denotes the AMPL parameter into which the data is read, subscripted with the dummy indices for its two sets (NUTR and FOOD), and (j) is a string expression for the name of the data column. Without the parentheses, j would denote a column literally named j.

When writing to a two-dimensional table, the following declaration is used to get a result as shown in figure 7.7,

    table AmountsOutput OUT "ODBC" "diet2.xls":
        {i in NUTR} -> [NUTR], {j in FOOD} <amt[i,j] ~ (j)>;
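Because the data columns of a two-dimensional read are generated from FOOD, and the rows are subscripted by members of NUTR, both sets need members before the table is read. The following is a sketch of one possible read order. It assumes that diet2.xls also holds ranges named Foods and Nutrs with the columns shown; those ranges, and the parameter names n_min and n_max, are assumptions for illustration.

```ampl
# Sketch only. Assumed: diet2.xls holds ranges "Foods" and "Nutrs" with the
# columns named below; n_min and n_max are assumed parameter names.
set FOOD;
set NUTR;
param cost  {FOOD} > 0;
param f_min {FOOD} >= 0;
param f_max {j in FOOD} >= f_min[j];
param n_min {NUTR} >= 0;
param n_max {i in NUTR} >= n_min[i];
param amt   {NUTR, FOOD} >= 0;

table Foods IN "ODBC" "diet2.xls": FOOD <- [FOOD], cost, f_min, f_max;
table Nutrs IN "ODBC" "diet2.xls": NUTR <- [NUTR], n_min, n_max;
table dietamts IN "ODBC" "diet2.xls":
   [i ~ NUTR], {j in FOOD} <amt[i,j] ~ (j)>;

read table Foods;      # FOOD must be populated first: the data column
read table Nutrs;      # names of dietamts are generated from its members
read table dietamts;
```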

Part IV

SAMPL: Extensions to AMPL for Stochastic Programming and Robust Optimisation

Chapter 8

Introduction to modelling under uncertainty

8.1 Background

Optimum decision making is concerned with the general problem of computing an optimal decision by taking into consideration the parameters, uncertainties and restrictions relevant to it. A significant aspect of moving from a qualitative approach to a quantitative approach is the introduction of the Mathematical Programming (MP) paradigm. Mathematical Programming models enable the modeller to quantify the effects of a decision in terms of the objectives set by the decision maker, and these models are formulated to ensure that the decision does not violate any of the restrictions. Such models express the objectives as functions of the decision variables, which are restricted to take values in certain domains. A single objective optimisation problem is expressed as shown below. Given a function

$$f : A \to \mathbb{R} \qquad (8.1)$$

the computational model (for a minimisation problem) is to search for an element $x_0 \in A$ such that $f(x_0) \le f(x)$, $\forall x \in A$.

Linear Programming (LP) models are characterised by constraints and an objective function which are linear combinations of the decision variables, while Quadratic Programming (QP) models can have quadratic terms in the objective function. Depending on the type of decision variables, the optimisation models are classified as Integer Programming (IP) models if all the variables are integer, or Mixed Integer Programming (MIP) models if some of the variables are integer and the remaining are continuous. Following the criteria regarding the form of $A$, we have Second Order Cone Programming (SOCP) models if some constraints are quadratic, see [44], and generic Conic Optimisation models if $A$ is a convex cone.

The success of such paradigms has also shown up their limitations. A fundamental assumption for this class of decision models is that the parameters which define the models are known with certainty. In many cases this assumption of certain (deterministic) knowledge does not hold. Consider for example the future commodity prices or interest rates in a financial planning model, the hourly energy demand in an energy network problem or the demand for a particular characteristic in a blending model: assuming these parameters are known with certainty at solution time could lead to solutions that are not optimal, or even not feasible, in the real world.

In the field of optimum decision making under uncertainty, the assumption of a deterministic world is relaxed and different procedures and paradigms arise. A first step in applying MP techniques to a non-deterministic problem is to treat parameter estimation as central to the modelling process. However, no matter how much care is put into forecasting the outcome of non-deterministic events, the forecast can always be incorrect or imprecise. The modeller therefore needs to take into account the effects introduced by the uncertainty into the underlying optimisation models and study how the inevitable defects in the forecasting process can affect the quality of the solution obtained.

Sensitivity analysis is therefore introduced to study the effect of changes in the parameter values on the solution obtained. Thus the cost coefficients $c_j$, the right hand side values $b_i$ or the matrix coefficients $a_{ij}$ could be varied over their full range of possible values. But this analysis, which considers one component at a time, can answer questions relating to the uncertain values of parameters only in a very limited way. It is presented in greater detail by many researchers, among them [1]. This approach has drawbacks and limitations, as discussed in [15], [35] and [56].

Scenario Analysis: Another approach to dealing with uncertain parameters, in which the planner assumes that certain combinations of possible values of the uncertain parameters should be considered together: such combinations are called scenarios, and the model is solved for different scenarios. The optimal decisions and the corresponding objective function values are then aggregated in a heuristic way. Through this investigation of a range of solutions, parameter sensitivities may be highlighted and an appropriate solution is chosen in a heuristic way.

Many researchers have postulated MP paradigms such as Stochastic Programming, Dynamic Programming and Robust Optimisation to consider parameter variations. In these approaches the modeller can make better use of the assumptions he is able to make about the uncertainty related to the problem. When it is possible to model the uncertainty itself by means of probability distributions, Stochastic Programming models and Dynamic Programming models can make use of this added information to provide optimal decisions which hedge against future uncertainties. It is not always possible or cost-effective to model the uncertain data with probability distributions. In these cases, Robust Optimisation is an alternative modelling paradigm, in which very few assumptions about the nature of the uncertainty are made, but which nevertheless leads to solutions that are stable with respect to the uncertain future outcomes. AMPLDev SP edition, through the language SAMPL and the embedded solver FortSP, supports Stochastic Programming and Robust Optimisation modelling with extended syntax and specific solution methods.

8.2 Limitations of deterministic models

Deterministic models work perfectly well in many situations. For instance consider:

    scheduling of airlines and buses to a predetermined timetable
    scheduling of the crews who operate the above
    scheduling of vehicles which carry out deliveries to retailers

In all these cases, if no exceptional events such as vehicle failures or unexpected staff shortages are considered, deterministic assumptions are essential and adequate. OR specialists who are used to teaching, model building, applying the models to real problems and explaining these and their scope of application to the decision makers are aware of many situations where a deterministic approach is inadequate. [47] explain this rather lucidly in the following way:

"As taught in introductory OR/MS courses, the dual variable $\pi_i$ corresponding to the i-th constraint indicates the rate of change in the optimal objective value as the RHS of the i-th constraint changes. A large dual variable signifies that the solution is highly sensitive to changes in the RHS coefficient. A small dual variable signifies a relatively insensitive solution to small data perturbations. At this point, students often ask the difficult and sometimes embarrassing question: what should we do if the solution is highly sensitive? Many, though none completely satisfactory answers are possible. Some examples are the following: 1) be careful to get the RHS demand value correct; 2) conduct a marketing effort to reduce the uncertainty in customer demand by increasing brand loyalty; 3) alert the user to the model's sensitivity; 4) make clear that the model's recommendations depend upon the model's assumptions - one of which states that the data coefficients are correct."

Consider for instance the operation planning problem of an electrical power generation system. If we make assumptions about generator availability, grid usage, operating characteristics and also


about consumer demand, we can determine an optimal electricity generation plan for the future. But the optimal solution will be optimal for only a particular set of parameter values. At the time of implementing the optimal plan, the actual values may be different due to unplanned failures, weather changes and so on.

In a financial portfolio optimisation problem, for instance, we may note down the prices of equities (stocks) and their returns and consider these to be known parameters. We can then construct an optimum portfolio planning model which is deterministic in structure. But as is well known to anyone aware of the vagaries of the financial market, the prices and returns of stocks vary considerably, and the essential aspect of this problem, known as volatility, is not captured by the deterministic model: in any case no one would consider implementing the solution of a deterministic portfolio optimisation model.

8.3 Paradigms for modelling under uncertainty

One of the major aspects of model building and model investigation for a given problem is to gain an understanding of the problem at hand. Different paradigms can then be followed to model the problem subject to uncertainty and solve it. Figure 8.1 proposes a taxonomy of such paradigms based on [29]. The two main paradigms in the scope of this manual are, as mentioned in section 8.1, Stochastic Programming and Robust Optimisation.

Figure 8.1: Taxonomy of paradigms for modelling under uncertainty

Stochastic Programming (SP): This paradigm addresses the problem of uncertain data by considering the distributions of the uncertain parameters. SP extends the optimisation paradigm to the domain of descriptive models, and some comparison can be made with simulation models, which also provide insight by studying the possible outcomes of different inputs and aggregating the results. To gain an understanding and ultimately to get a solution through the SP paradigm, we have to seek answers to the following questions:

    What is an optimal policy for the underlying deterministic version of the problem?
    Which parameters of the model can be considered known and which are uncertain/random?
    How are the parameters which are considered uncertain distributed?

SP models are ultimately solved numerically, hence the distributions of the uncertain parameters must be discrete. If, as is often the case, the parameters are modelled through continuous distributions,

a sampling process is needed to be able to use them in an SP problem. This process leads to a finite number of realisations of the uncertain parameters. Such a realisation is called a scenario, and the procedure by which such realisations are generated is called Scenario Generation.

Chance Constrained Programming: This kind of constraint, applicable to SP problems, limits the probability of a constraint being violated, given the distributions of the random parameters.

Integrated Chance Constrained Programming: Similar to the CCP above, but here the expected violation of the specified constraints is limited. It is closely related to limiting CVaR in finance.

Robust Optimisation (RO): Modelling a problem following the Stochastic Programming approach requires the analyst to make strong assumptions about the nature of the uncertainty, that is, to supply or to postulate probability distributions of the random parameters. There are cases in which it is impossible, or not practical, to give reasonable estimates of these probability distributions, but in which the robustness of the solution obtained is vital anyway. The first set of studies which addressed these questions was due to Soyster [52] and led to a framework which is now well established and comprises various formulations.

Figure 8.2: Components of Stochastic Programming and Robust Optimisation

From the descriptions above, it should be clear that the two main approaches considered here for making quantitative decisions under uncertainty are composite procedures. They share the need for a decision model, which decides what policy is optimal for the problem at hand, and they differ in how the uncertainty is included in such models. Figure 8.2 shows this in a simple diagram. The following subsections give a quick mathematical overview of the paradigms above. For a more hands-on introduction, see chapter 9.
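The two chance constraint variants described above can be stated more precisely. The following is a sketch in assumed notation that does not appear in this manual: a single random inequality row $a(\omega)x \ge b(\omega)$, a tolerance $\alpha \in (0,1)$ on the probability of violation, and a budget $\beta \ge 0$ on the expected shortfall.

```latex
% Chance constraint: the row may be violated with probability at most \alpha
P\{\, \omega \in \Omega : a(\omega)\,x \ge b(\omega) \,\} \;\ge\; 1 - \alpha

% Integrated chance constraint: the expected shortfall is at most \beta
E\big[\, \max\{\, b(\omega) - a(\omega)\,x,\; 0 \,\} \,\big] \;\le\; \beta
```

The first form only counts how often the row is violated, whereas the second also weighs by how much, which is why it is the one related to CVaR-type limits.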
Stochastic Programming

These problem classes can be illustrated by first considering the linear programming problem:

$$
\begin{aligned}
Z = \min\ & cx \\
\text{subject to}\ & Ax = b \\
& x \ge 0
\end{aligned}
\tag{8.2}
$$

where $A \in \mathbb{R}^{m \times n}$; $c, x \in \mathbb{R}^{n}$; $b \in \mathbb{R}^{m}$.

Let $P$ denote a probability distribution, $\mathcal{F}$ be a sigma field and the triple $(\Omega, \mathcal{F}, P)$ be a probability space, where $\omega \in \Omega$ denotes the realisations of the uncertain parameters. Let the realisations of $A$, $b$, $c$ for a given event $\omega$ be defined as:

$$\xi(\omega) \ \text{or} \ \xi_\omega = (A, b, c)_\omega \tag{8.3}$$

The associated probabilities of these realisations are often denoted as $p(\xi(\omega))$ or $p_{\xi(\omega)}$. For notational convenience these probabilities are denoted simply as $p(\omega)$. For the same reason, let the feasible regions corresponding to the problem stated in 8.2 and 8.3 be defined as:

$$F_\omega = \{x \mid Ax = b,\ x \ge 0\} \quad \text{for } \xi(\omega) \tag{8.4}$$

Distribution based versus Scenario based recourse problems

The problem defined in 8.2, 8.3 and 8.4 is a mathematical programming model with uncertainty about the values of some of the parameters. If the distribution of $\xi(\omega)$ is continuous, the problem is called a distribution based recourse problem [29]; except for some trivial cases, such problems cannot be solved. If the distribution is discrete, the cardinality of the support is limited by the available computing power; therefore, in most practical applications the distributions of the stochastic parameters have to be approximated by discrete distributions with a limited number of outcomes [37]. In a discrete setting of the problem given by 8.2, 8.3 and 8.4, the event parameter takes the range of values $\omega = 1, \dots, |\Omega|$; there are associated random vector realisations $\xi(\omega)$ and probabilities $p(\omega)$ such that:

$$\sum_{\omega \in \Omega} p(\omega) = 1 \quad \text{and} \quad \Xi = \bigcup_{\omega \in \Omega} \xi(\omega) \tag{8.5}$$

The discretisation $\Xi$ is usually called a set of scenarios, and its representation following the dynamic structure of the problem is a scenario tree. A stochastic problem whose event outcomes are represented by a scenario tree is called a scenario based recourse problem.
Stochastic Programming Problems with Recourse (Here and Now)

A Single Stage SP Problems

A simple (single stage) stochastic programming model is formulated as:

$$Z_{HN} = \min E[c(\omega)x] \quad \text{where } x \in F \text{ and } F = \bigcap_{\omega \in \Omega} F_\omega \tag{8.6}$$

The optimal objective function value $Z_{HN}$ denotes the minimum expected cost of the stochastic optimisation problem. The optimal solution $x_{HN} \in F$ hedges against all possible events $\omega \in \Omega$ that may occur in the future.

B Two-Stage SP Problems

The classical two-stage SP model with recourse is formulated as:

$$
\begin{aligned}
Z_{HN} = \min\ & c(\omega)x + E_\omega[Q(x, \omega)] \\
\text{subject to}\ & Ax = b \\
& x \ge 0
\end{aligned}
\tag{8.7}
$$

where:

$$
\begin{aligned}
Q(x, \omega) = \min\ & f(\omega)y(\omega) \\
\text{subject to}\ & D(\omega)y(\omega) = d(\omega) + B(\omega)x, \\
& y(\omega) \ge 0, \quad \omega \in \Omega
\end{aligned}
\tag{8.8}
$$

The matrix $A$ and the vector $b$ are known with certainty. The function $Q(x, \omega)$, referred to as the recourse function, is in turn defined by the linear program set out in 8.8. The recourse matrix $D(\omega)$, the right-hand side $d(\omega)$, the technology matrix $B(\omega)$, and the objective function coefficients $f(\omega)$ of this model may be random. If the recourse matrix $D$ is fixed for all realisations, the problem is known as an SP problem with fixed recourse; if $D$ takes the form $D = (I, -I)$, it is known as an SP problem with simple recourse.

Two-stage Stochastic Programming problems with recourse separate the model's decision variables into first stage and second stage. The dynamic nature of the problem can be easily seen: an optimal first-stage decision $x$ is determined such that it is feasible for all realisations $\omega \in \Omega$ and has the minimum cost, while the second-stage decision $y(\omega)$ is taken after the outcome $\omega$ is observed, and compensates for and adapts to the different realisations.

C Multi-Stage SP Problems

The class of two-stage problems specified by 8.7 and 8.8 can be extended to the multistage recourse program by considering a more complex dynamic setting: instead of having the two decisions $x$ and $y(\omega)$, we now consider $T$ sequential decisions $x_0, x_1, \dots, x_T$ to be taken at the stages $t = 1, 2, \dots, T$. The term "stages" can, but need not, be interpreted as "time periods"; although these concepts coincide in many applications, a stage can be regarded in general as a step where new information about the state of nature is provided.
A decision made in stage $t$ should be based on the knowledge of the previous decisions and realisations $(x_i, \xi_i \mid i \in 1, \dots, t-1)$, and such a decision only affects the subsequent decisions $(x_i \mid i \in t+1, \dots, T)$. In Stochastic Programming this concept is known as non-anticipativity and has to be taken into account when formulating the problem in a deterministic equivalent setting. The multistage stochastic programming recourse problem has the form (following [13], [24]):

$$
\begin{aligned}
\min\ & c_1 x_1 + E_{\xi_2}\Big[\min_{x_2} c_2 x_2 + \dots + E_{\xi_T \mid \xi_{T-1} \dots \xi_2}\big[\min_{x_T} c_T x_T\big]\Big] \\
\text{subject to}\ & A_{11} x_1 = b_1 \\
& A_{21} x_1 + A_{22} x_2 = b_2 \\
& A_{31} x_1 + A_{32} x_2 + A_{33} x_3 = b_3 \\
& \qquad \vdots \\
& A_{T1} x_1 + A_{T2} x_2 + A_{T3} x_3 + \dots + A_{TT} x_T = b_T \\
\text{with}\ & l_t \le x_t \le u_t, \quad t = 1, \dots, T
\end{aligned}
\tag{8.9}
$$

D Chance Constraints

Chance-constrained programming (CCP) problems were first introduced in [11]. This class of problems addresses the fact that, when representing an SP problem, modellers often use the goal programming approach (i.e. penalties in the objective for violations of the constraints) to account for constraint violations. Sometimes it is not possible to quantify the penalty, or penalties cannot be modelled in any reasonable way. The CCP approach considers a decision feasible whenever it is feasible with a high probability. A probabilistic or chance constraint can be expressed as follows:

$$P(h(x, \xi) \ge 0) \ge p \tag{8.10}$$

where $x$ and $\xi$ are respectively the decision and random vectors, $P$ is a probability measure and $p \in [0, 1]$ is called the probability or reliability level. In a two-stage SP problem with $m$ random constraints, and defining $I = \{1, \dots, m\}$, we distinguish between individual chance constraints

$$P(h_i(x, \xi) \ge 0) \ge p_i, \quad \forall i \in I \tag{8.11}$$

and joint chance constraints

$$P(h_i(x, \xi) \ge 0,\ \forall i \in I) \ge p \tag{8.12}$$

Chance constraints are inherently a qualitative risk measure, and have been used in a wide range of applications (see, among many others, [50] and references in [54]). However, there are applications in which quantitative risk measures are more appropriate. Another sometimes undesirable characteristic of chance constrained problems is that they are non-convex in general; in particular, this is true if the underlying random vector $\omega$ follows a discrete distribution [12].

E Integrated Chance Constraints

The arguments given in the paragraph above motivated research into a different approach; Integrated Chance Constrained Programming (ICCP) was introduced in [30] as an alternative, quantitative and in general convex approach to control and measure feasibility in an SP problem. The ICCP approach considers a problem to be feasible if the expected violation of the constraint is less than a predefined value.
Integrated Chance Constraints (ICC) are defined in [30] as the individual integrated chance constraints:

$$E_\omega[\eta_i(x, \omega)^{-}] \le \beta_i, \quad \beta_i \ge 0, \quad \forall i \in I \tag{8.13}$$

and the joint integrated chance constraints:

$$E_\omega\big[\max_{i \in I} \eta_i(x, \omega)^{-}\big] \le \beta, \quad \beta \ge 0 \tag{8.14}$$

where $\eta_i(x, \omega)^{-}$ represents the under-deviation that occurs in constraint $i$ under realisation $\omega$, and $\beta_i$ is called the shortfall parameter; it limits the (maximum) expected shortfall in the (set of) ICCs. Applications of Integrated Chance Constraints are found in many fields, but their application in finance defines one important class, as it connects to the well-known Conditional Surplus-at-Risk (CSaR, a variant of Conditional Value at Risk (CVaR)), see [26].
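The difference between a chance constraint and an integrated chance constraint can be made concrete on a finite scenario set. The following is a minimal sketch, not part of SAMPL: it assumes a single hypothetical constraint h(x, ξ) = x − ξ ≥ 0 (a supply x that must cover a random demand ξ), and the function names, data, reliability level and shortfall parameter are all illustrative choices.

```python
# Discrete-scenario check of one chance constraint and one integrated chance
# constraint. Assumed constraint (illustrative only): h(x, xi) = x - xi >= 0.

def satisfaction_probability(x, scenarios, probs):
    """P(h(x, xi) >= 0): total probability of the scenarios where x covers xi."""
    return sum(p for xi, p in zip(scenarios, probs) if x - xi >= 0)

def expected_shortfall(x, scenarios, probs):
    """E[h(x, xi)^-]: expected amount by which the constraint is violated."""
    return sum(p * max(xi - x, 0.0) for xi, p in zip(scenarios, probs))

demand = [10.0, 20.0, 30.0]      # hypothetical scenario values of xi
probs = [0.25, 0.25, 0.5]        # scenario probabilities, summing to 1
x = 20.0                         # candidate decision

prob_ok = satisfaction_probability(x, demand, probs)
shortfall = expected_shortfall(x, demand, probs)

chance_constraint_holds = prob_ok >= 0.5   # reliability level p = 0.5 (assumed)
icc_holds = shortfall <= 4.0               # shortfall parameter beta = 4 (assumed)
```

With these numbers the decision satisfies the chance constraint (it is feasible in scenarios carrying half of the probability mass) but violates the integrated chance constraint, because the chance constraint ignores how large the violation is in the remaining scenario.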

Expected Value Problem

The Expected Value (EV) model is constructed by replacing the random parameters by their expected values. Such an EV model is thus a linear program, as the uncertainty is dealt with before it is introduced into the underlying linear optimisation model. It is common practice to formulate and solve the EV problem in order to gain some insight into the decision problem. Denoting by $X_{EV}$ the decision vector resulting from the optimisation of the expected value problem, its evaluation against all possible scenarios:

$$Z_{EEV} = \sum_{s=1}^{S} p_s Z_s \quad \text{where } Z_s = c_s X_{EV} \tag{8.15}$$

is called the Expectation of the Expected Value solution. If there are scenarios $s$ for which $X_{EV}$ is not feasible, then $Z_{EEV}$ is set to $+\infty$.

Wait and See Problems

Wait and See (WS) problems assume that the decision-maker is somehow able to wait until the uncertainty is resolved before implementing the optimal decisions. This approach therefore relies upon perfect information about the future; operationally, we solve one separate LP problem for each available scenario, thus obtaining the optimal strategy in each scenario. Because of these assumptions such a solution cannot be implemented, and it is known as the "passive approach". Wait and See models are often used to analyse the probability distribution of the objective value. We denote by $Z_{WS}$ the expectation of the objective values of all the solved Wait and See models:

$$Z_{WS} = \sum_{s=1}^{S} p_s Z_s \quad \text{where } Z_s = \min c(\omega)x,\ x \in F_\omega \tag{8.16}$$

Stochastic Measures

It can be shown that the three objective function values $Z_{WS}$, $Z_{HN}$ and $Z_{EEV}$ are connected by the following ordered relationship:

$$Z_{WS} \le Z_{HN} \le Z_{EEV} \tag{8.17}$$

The inequality:

$$Z_{HN} \le Z_{EEV} \tag{8.18}$$

can be argued in the following way: any feasible solution of the average value approximation is already considered in the Here and Now model, therefore the optimal Here and Now objective must be at least as good.
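The ordering in 8.17 can be checked numerically on a very small problem. The sketch below is an illustration under stated assumptions rather than anything taken from this manual: it uses a hypothetical single-product ordering problem (order x now at unit cost 1, pay a penalty of 3 per unit of unmet demand), three made-up demand scenarios, and brute-force enumeration over integer order quantities instead of an LP solver.

```python
# Toy minimisation problem with hypothetical data, solved by enumeration.
demands = [10, 20, 30]             # scenario realisations of demand
probs = [0.25, 0.25, 0.5]          # scenario probabilities
candidates = range(0, 41)          # integer order quantities to search

def cost(x, d):
    """Cost of ordering x when demand turns out to be d."""
    return 1.0 * x + 3.0 * max(d - x, 0)

def expected_cost(x):
    """Expected cost of a single decision x over all scenarios."""
    return sum(p * cost(x, d) for p, d in zip(probs, demands))

# Wait and See: optimise separately per scenario, then take the expectation.
z_ws = sum(p * min(cost(x, d) for x in candidates)
           for p, d in zip(probs, demands))

# Here and Now: one decision minimising the expected cost.
z_hn = min(expected_cost(x) for x in candidates)

# Expected Value problem: optimise against mean demand, then evaluate that
# decision against every scenario (Expectation of the Expected Value solution).
mean_demand = sum(p * d for p, d in zip(probs, demands))
x_ev = min(candidates, key=lambda x: cost(x, mean_demand))
z_eev = expected_cost(x_ev)

print(z_ws, z_hn, z_eev)
```

For this data the three values are 22.5, 30.0 and 33.5, so the chain Z_WS ≤ Z_HN ≤ Z_EEV holds with strict inequalities: knowing the demand in advance saves 7.5 on average, and ignoring the spread of demand costs an extra 3.5.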
A The Value of the Stochastic Solution (VSS)

The difference between these two solutions defines the Value of the Stochastic Solution (VSS) for a minimisation problem:

$$VSS = Z_{EEV} - Z_{HN} \tag{8.19}$$

This is a measure of how much can be saved by implementing the (computationally expensive) Here and Now solution as opposed to the deterministic expected value solution. The practical computation of VSS is strictly related to the approach used in the computation of $Z_{EEV}$.

B The Expected Value of Perfect Information (EVPI)

Another important index is the Expected Value of Perfect Information (EVPI):

$$EVPI = Z_{HN} - Z_{WS} \tag{8.20}$$

This measure of a stochastic optimisation problem is interpreted as the expected value of the amount the decision maker is willing to pay to have perfect information (i.e. knowledge) about the future scenarios. A relatively small EVPI indicates that better forecasts will not lead to much improvement; a relatively large EVPI means that incomplete information about the future may prove costly.

C Bounds on EVPI and VSS

Some useful bounds on the EVPI and VSS are presented below, where $Z_{EV}$ denotes the optimal objective value of the Expected Value problem:

$$
\begin{aligned}
0 \le EVPI &\le Z_{HN} - Z_{EV} \le Z_{EEV} - Z_{EV} \\
0 \le VSS &\le Z_{EEV} - Z_{EV}
\end{aligned}
\tag{8.21}
$$

These can help in estimating the relative benefit of implementing the computationally costly Stochastic Programming solution, as opposed to approximate solutions obtained by processing the Expected Value LP problem.

Robust Optimisation

Compared to Stochastic Programming, Robust Optimisation deals with uncertainty in a completely different way. Modelling a problem following the Stochastic Programming approach requires the analyst to make strong assumptions on the nature of the uncertainty, that is, to supply or postulate probability distributions of the random parameters. There are cases in which it is impossible, or not practical, to give reasonable estimates of these probability distributions but in which the robustness of the solution obtained is nevertheless vital. The first set of studies which addressed these questions was due to Soyster [52] and led to a framework which is now established as Robust Optimisation (RO).
Following this approach, no problem-specific randomness model has to be found; instead, the modeller only has to choose one formulation among the few available. Choosing the formulation implies choosing an uncertainty model; moreover, some of these models allow the modeller to choose how much of the optimality of the solution is traded for robustness. There are now three well-known formulations of RO problems; these are given by the above-mentioned Soyster, Ben-Tal and Nemirovski, and Bertsimas and Sim (see [2], [3], [4] and [5]). They all share the advantage that minimal assumptions about the nature of the uncertainties have to be made, and they differ in the ways in which they represent the uncertainty sets. More specifically, the formulations by Soyster and by Bertsimas and Sim use polyhedral uncertainty sets, while the formulation by Ben-Tal and Nemirovski considers an ellipsoidal uncertainty set, transforming the original LP problem into a Second Order Cone Programming (SOCP) problem. The solution of these RO problems addresses the important question of how much optimality for the nominal problem is given up in order to ensure robustness. Consider the following nominal linear optimisation problem:

$$
\begin{aligned}
Z = \max\ & c^T x \\
\text{subject to}\ & \sum_{j=1}^{n} a_j x_j \le b \\
& l \le x \le u
\end{aligned}
\tag{8.22}
$$

where $a_j, b \in \mathbb{R}^{m}$; $c, x, l, u \in \mathbb{R}^{n}$, and assume that data uncertainty only affects elements in matrix $A$. The uncertainty model $U$ we consider is the following: for a particular row $i$ of the matrix $A$, let $J_i$ represent the set of coefficients in row $i$ that are subject to uncertainty. Each entry $a_{ij}$, $j \in J_i$ is modelled as a symmetric and bounded random variable $\tilde{a}_{ij}$, $j \in J_i$ that takes values in $[a_{ij} - \check{a}_{ij},\ a_{ij} + \check{a}_{ij}]$, where $\check{a}_{ij} > 0$ is the deviation of variable $\tilde{a}_{ij}$ around its mean value $a_{ij}$. Associated with the uncertain data $\tilde{a}_{ij}$, we define the random variable $\eta_{ij} = (\tilde{a}_{ij} - a_{ij}) / \check{a}_{ij}$, which obeys an unknown but symmetric distribution and takes values in $[-1, 1]$.

A Soyster's Formulation

In general, Soyster's formulation considers the linear optimisation problem:

$$
\begin{aligned}
Z = \max\ & c^T x \\
\text{subject to}\ & \sum_{j=1}^{n} a_j x_j \le b \\
& x \ge 0
\end{aligned}
\qquad \text{where} \quad
a_j \in
\begin{pmatrix}
[a_{1j} - \check{a}_{1j},\ a_{1j} + \check{a}_{1j}] \\
\vdots \\
[a_{mj} - \check{a}_{mj},\ a_{mj} + \check{a}_{mj}]
\end{pmatrix},
\quad j = 1, \dots, n
\tag{8.23}
$$

This is an adaptation of the formulation given in [5], which was syntactically incorrect. Soyster shows that the problem in 8.23 is equivalent to

$$
\begin{aligned}
Z = \max\ & c^T x \\
\text{subject to}\ & \sum_{j=1}^{n} \bar{a}_j x_j \le b \\
& x \ge 0
\end{aligned}
\qquad \text{where} \quad \bar{a}_{ij} = a_{ij} + \check{a}_{ij} \quad \forall i, j
\tag{8.24}
$$

If the uncertainty sets follow the model $U$, the robust formulation of 8.22 following 8.24 is as follows:

$$
\begin{aligned}
Z = \max\ & c^T x \\
\text{subject to}\ & \sum_{j} a_{ij} x_j + \sum_{j \in J_i} \check{a}_{ij} y_j \le b_i && \forall i \\
& -y_j \le x_j \le y_j && \forall j \\
& l \le x \le u \\
& y \ge 0
\end{aligned}
\tag{8.25}
$$

It can be shown [5] that the solution to the problem above remains feasible for all realisations $\tilde{a}_{ij}$, $j \in J_i$, although concerns have been raised that it trades too much of the optimality of the nominal problem to gain this robustness.

B Ben-Tal and Nemirovski

Considering the problem set out in 8.22, the following robust problem is constructed [4]:

$$
\begin{aligned}
Z = \max\ & c^T x \\
\text{subject to}\ & \sum_{j} a_{ij} x_j + \sum_{j \in J_i} \check{a}_{ij} y_{ij} + \Omega_i \sqrt{\sum_{j \in J_i} \check{a}_{ij}^2 z_{ij}^2} \le b_i && \forall i \\
& -y_{ij} \le x_j - z_{ij} \le y_{ij} && \forall i,\ j \in J_i \\
& l \le x \le u \\
& y \ge 0
\end{aligned}
\tag{8.26}
$$

If the uncertainty is represented by the model $U$, the probability that the $i$th constraint is violated is at most $e^{-\Omega_i^2 / 2}$; furthermore, the model is proven to be less conservative than 8.25, as every solution of the latter problem is feasible for the former problem.

C Bertsimas and Sim

In this approach a parameter $\Gamma_i$ is introduced that intuitively controls the trade-off between robustness and optimality of the solution. The problem, in its equivalent linear formulation, is set out below [5]:

$$
\begin{aligned}
Z = \max\ & c^T x \\
\text{subject to}\ & \sum_{j} a_{ij} x_j + z_i \Gamma_i + \sum_{j \in J_i} p_{ij} \le b_i && \forall i \\
& z_i + p_{ij} \ge \check{a}_{ij} y_j && \forall i,\ j \in J_i \\
& -y_j \le x_j \le y_j && \forall j \\
& l_j \le x_j \le u_j && \forall j \\
& p_{ij} \ge 0 && \forall i,\ j \in J_i \\
& y_j \ge 0 && \forall j \\
& z_i \ge 0 && \forall i
\end{aligned}
\tag{8.27}
$$

Parameter $\Gamma_i$ takes values in the interval $[0, |J_i|]$ and has the effect of protecting the feasibility of the solution against all cases in which up to $\lfloor \Gamma_i \rfloor$ coefficients $a_{ij}$, $j \in J_i$ change, and one coefficient $a_{it}$ changes by $(\Gamma_i - \lfloor \Gamma_i \rfloor)\check{a}_{it}$. More formally, the solution will remain feasible deterministically if the realisations behave as specified above; moreover, even if more than $\lfloor \Gamma_i \rfloor$ parameters change, the robust solution will be feasible with very high probability.

8.4 Modelling random parameters using discrete scenarios

Scenario Generation is the term normally used to describe the process of creating a tree structure and associated discrete scenarios which are used to describe the uncertain parameters in SP models. The uncertainty representation by scenarios can be summarised as a four-step process, of which the first two steps are required at modelling time and the latter two at runtime:

1. Model the uncertainties with (discrete) random processes (or distributions for single stage SP). The modeller is here required to write their assumptions about the uncertainty in mathematical form; the outcome of this step is a random process or a probability distribution. As discussed in section 8.3 and in [6], these analytic models are not suitable for direct use in SP problems.

2. Approximate the chosen random processes with a tree of discrete scenarios (discretise in the case of continuous random processes, aggregate in the case of discrete ones). A range of techniques can be used in this step, which approximates the output of the model defined in step 1 with a scenario tree. There is both a science and an art ([10]) to this process, and a balance has to be found between fine discretisation (which leads to numerically unsolvable problems) and coarse discretisation (which could overlook important realisations). Two related approaches can be identified: one is based on statistical approximations (as in [38]) and the other on approximation theory (as in [36] and [16]).

3. Estimate the parameters for the model of randomness. Given some data about the reality (usually, this data set comprises historical observations), the modeller is then required to estimate the uncertain parameters for a model of randomness (e.g. for a normal distribution, mean and variance).

4. Generate the scenario tree. A scenario tree is then generated by applying the model/algorithm crafted in steps 1 and 2 to the data provided by step 3. This scenario tree is then introduced in the description of the SP model, which in turn is processed by an SP solver.

The role of a scenario generator is very important in describing an SP problem. This thesis is mainly concerned with modelling aspects of SP and, within the scope of this thesis, the concept of a Scenario Generators library is introduced, which is a collection of models of randomness which have been produced by steps 1 and 2 of the modus operandi above. In the context of decision making, a scenario generator captures in a procedural form a domain-specific model of randomness. Operationally, the problem owner may choose between various methods which are part of the SG library to model the uncertainty at hand, evaluate its performance with the current decision model (i.e. stability tests, see paragraph 2.3) and use it to obtain the ex-ante decision (see [14]). The problem owner can then evaluate the decision obtained against real data (back testing, or stress testing) or against realisations obtained using a different scenario generator. A short and not comprehensive list of scenario generators, their application fields and some references is summarised in table

Modelling uncertainties (robust)

Compared to Stochastic Programming, Robust Optimisation deals with uncertainty in a completely different way.
Modelling a problem following the Stochastic Programming approach requires the analyst to make strong assumptions on the nature of the uncertainty, that is, to supply or postulate probability distributions of the random parameters. There are cases in which it is impossible, or not practical, to give reasonable estimates of these probability distributions but in which the robustness of the solution obtained is nevertheless vital. The first set of studies which addressed these questions was due to Soyster [52] and led to a framework which is now established as Robust Optimisation (RO). Following this approach, no problem-specific randomness model has to be found; instead, the modeller only has to choose one formulation among the few available. Choosing the formulation implies choosing an uncertainty model; moreover, some of these models allow the modeller to choose how much of the optimality of the solution is traded for robustness. There are now three well-known formulations of RO problems; these are given by the above-mentioned Soyster, Ben-Tal and Nemirovski [2] and Bertsimas and Sim [5]. They all share the advantage that minimal assumptions about the nature of the uncertainties have to be made, and they differ in the ways in which they represent the uncertainty sets. More specifically, the formulations by Soyster and by Bertsimas and Sim use polyhedral uncertainty sets, while the formulation by Ben-Tal and Nemirovski considers an ellipsoidal uncertainty set, transforming the original LP problem into a Second Order Cone Programming (SOCP) problem. The solution of these RO problems addresses the important question of how much optimality for the nominal problem is given up in order to ensure robustness.
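The trade-off controlled by the parameter Γ in the Bertsimas and Sim formulation can be illustrated for a single constraint and a fixed candidate solution. The sketch below is not the linear formulation 8.27 itself; it assumes the closed-form "protection" term that 8.27 linearises, namely the sum of the ⌊Γ⌋ largest values of (deviation × |x_j|) plus the fractional part of Γ times the next largest value. The function name and all data are hypothetical.

```python
import math

def protection(x, dev, gamma):
    """Worst-case increase of sum_j a_j * x_j when up to floor(gamma)
    coefficients move by their full deviation and one more coefficient
    moves by the fractional part of gamma times its deviation."""
    terms = sorted((d * abs(xj) for d, xj in zip(dev, x)), reverse=True)
    k = math.floor(gamma)
    total = sum(terms[:k])
    if k < len(terms):
        total += (gamma - k) * terms[k]
    return total

# Hypothetical single constraint: sum_j a_j * x_j <= b
a = [1.0, 1.0, 1.0]          # nominal coefficients
dev = [0.5, 0.25, 0.125]     # maximum deviations of each coefficient
x = [2.0, -1.0, 4.0]         # candidate solution
b = 6.5

nominal = sum(aj * xj for aj, xj in zip(a, x))

# Robust feasibility of x for increasing levels of protection.
feasible = {g: nominal + protection(x, dev, g) <= b for g in (0, 1.5, 3)}
```

With Γ = 0 the check reduces to the nominal constraint, and with Γ equal to the number of uncertain coefficients every coefficient is allowed to take its worst value, which corresponds to Soyster's fully conservative case. Here the candidate is feasible for Γ = 0 and Γ = 1.5 but not for Γ = 3, showing how more protection shrinks the feasible region.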

Chapter 9 Modelling with SAMPL and Tutorial

9.1 How to represent models under uncertainty

The AMPL language extensions for stochastic programming provided by SAMPL enable the user to define stochastic programming and robust optimisation models in a simple and concise way. The syntax of the extensions follows that of the AMPL language. Together with the usual entities which form an AMPL model, new keywords are introduced to support these other classes. The current extensions support the definition of:

- Stochastic Programming problems
    - Scenario based recourse problems
    - Chance Constraints
    - Integrated Chance Constraints
- Robust Optimisation problems
    - Soyster formulation
    - Ben-Tal and Nemirovski formulation
    - Bertsimas and Sim formulation

The difficulties that arise when using non-specialised modelling languages to formulate SP problems are mainly due to the lack of constructs for the definition of the randomness of the model coefficients and the scenario tree structure. A stochastic programming model can be considered as a linear programming model extended and refined by the introduction of random parameters, whilst a robust optimisation problem can be considered as a linear programming model combined with the definition of the model of uncertainty and the parameters it needs (see Figure 8.2). More precisely, for SP problems, the underlying LP optimisation model is extended by taking into account the probability distribution of the model's random parameters. Such distributions are provided by the models of randomness used in scenario generators (see chapter 14), which are specific to the particular optimisation problems under investigation. For RO problems, the underlying model is extended by the definition of which parameters are uncertain and by the choice of the uncertainty model they follow (which is equivalent to choosing one of the three available formulations).
In general, different categories of stochastic programming problems require different language features to express the random nature of the problem. We use the term stochastic framework to denote the information represented by these constructs.

Stochastic programming model entities

The first requirement for the formulation of a stochastic programming problem using algebraic modelling languages is the declaration of the random parameters. In scenario-based recourse problems,

the realisations of such parameters are explicitly given in the form of a scenario tree. Each scenario is also associated with a corresponding weight (or probability). In turn, the scenario tree structure is declared in terms of stages. The stages identify the sequence of decisions in the dynamics of the underlying core model. If the temporal dimension is introduced into the model using a specific time set, the stages can be declared as subsets of this set. To summarise, a stochastic framework for scenario-based recourse problems requires constructs for the definition of stages, scenarios and random parameters (see Table 9.1).

| Entities | Language Requirements |
|---|---|
| Stage Information | Assignment of variables and constraints to stages |
| Scenario Information | Scenario set; Tree structure; Scenario probabilities |
| Random parameters | Declaration of the random parameters in terms of the scenario set |
| Chance constraints | Declaration of the chance constraint and its reliability level |
| Integrated Chance constraints | Declaration of the integrated chance constraint and its maximum expected violation |

Table 9.1: Language Entities for SP problems

Robust Optimisation model entities

At the language level, to define an RO problem the modeller needs to decide which formulation to use, so there should be syntax to define it. Once the formulation is chosen, the uncertainty model is set. Not all the parameters need to be random; some can have reasonably certain estimates, and in these cases such estimates are used. So there must be a way to define which parameters follow the uncertainty model and which are fixed. The uncertainty model needs some parameters, and some of the formulations might need additional parameters. All these entities and quantities need to be defined for an RO problem to be fully specified (see Table 9.2).
| Formulation | Entities | Language Requirements |
|---|---|---|
| Soyster | Random parameter definition | Definition of which parameters are random and which are not |
| | Uncertainty model definition | For each random parameter, specify the data needed by the chosen uncertainty model |
| Ben-Tal and Nemirovski | Random parameter definition | Definition of which parameters are random and which are not |
| | Uncertainty model definition | For each random parameter, specify the data needed by the chosen uncertainty model |
| | Additional parameters | Robustness: limit on the probability of violation of the constraint |
| Bertsimas and Sim | Random parameter definition | Definition of which parameters are random and which are not |
| | Uncertainty model definition | For each random parameter, specify the data needed by the chosen uncertainty model |
| | Additional parameters | Robustness: number of coefficient changes for which the solution still remains feasible |

Table 9.2: Language Entities for RO problems

Asset & Liability Management (ALM) Model (Tutorial 1)

We introduce our first example of a stochastic programming model and how it is represented in SAMPL.

Introduction

We consider an asset/liability management model: an investor faces the problem of creating a portfolio by allocating a set of assets belonging to a universe I; each asset is characterised by a price P. The goal of the investor is to maximise the portfolio wealth at the end of a predefined time horizon T. The investor needs to take into account future obligations (liabilities) L and the associated transaction cost G, which is a fraction and is used as a multiplicative weight. At each time period, and for each asset considered, the investor decides the amount to buy, to sell and to hold. Table 9.3 below shows a possible definition of the entities for such a model. Observe that the future price of an asset is unknown; it is therefore represented by its expected value.

| Type | Notation | Description | Range / Dimensions |
|---|---|---|---|
| Indices (sets) | I | Asset universe | i = 1,...,\|I\|; \|I\| = 23 |
| | T | Time periods | t = 1,...,\|T\|; \|T\| = 4 |
| Parameters (data) | P_it | Expected price of asset i at time period t | I, T |
| | L_t | Liability at time period t | T |
| | H_i0 | Initial composition of the portfolio | I |
| | F_t | Funding in time period t | T |
| | G | Transaction cost as a fraction of trade value | |
| Variables | H_it | Quantity of asset i to hold in time period t | I, T |
| | S_it | Quantity of asset i to sell in time period t | I, T |
| | B_it | Quantity of asset i to buy in time period t | I, T |

Table 9.3: ALM Model Entities

Algebraic Formulation

Asset Holding Constraints

During the planning horizon the portfolio is rebalanced at discrete points in time (the beginning of each time period). The asset holdings are expressed using two constraints. Constraint 9.1 takes into consideration the initial holdings, and constraint 9.2 applies for time periods t > 1.
$$H_{i1} = H_{i0} + B_{i1} - S_{i1}, \quad \forall i \in I \tag{9.1}$$

$$H_{it} = H_{i,t-1} + B_{it} - S_{it}, \quad t = 2, \dots, T,\ \forall i \in I \tag{9.2}$$

Fund Balance Constraints

Throughout the planning period cash inflows and cash outflows occur. The former are due to the sale of assets or to profitable performance of the assets, along with additional funding which the investor's company might obtain. The latter are due to the company's payments and other liabilities which have to be fulfilled, as well as to the purchase of assets and the transaction costs associated with their trading (buying and selling). In other words, this constraint reflects the evolution of the fund balance of the investor over time.

$$(1 - G) \sum_{i=1}^{I} P_{it} S_{it} - L_t + F_t = (1 + G) \sum_{i=1}^{I} P_{it} B_{it}, \quad \forall t \in T \tag{9.3}$$

Figure 9.1 below illustrates the concept of fund balance.

Figure 9.1: Fund balance constraint

Objective Function

The goal of the investor is to maximise the terminal wealth of the portfolio. This is expressed as equation 9.4.

$$\sum_{i=1}^{I} P_{iT} H_{iT} \tag{9.4}$$

The expression above can be used to calculate the market value of the portfolio for each time period by substituting T with t = 1,..., T.

Model Summary

A summary of the ALM formulation is given below:

$$
\begin{aligned}
\max\ & \sum_{i=1}^{I} P_{iT} H_{iT} \\
\text{subject to:}\ & H_{i1} = H_{i0} + B_{i1} - S_{i1}, && \forall i \in I \\
& H_{it} = H_{i,t-1} + B_{it} - S_{it}, && t = 2, \dots, T,\ \forall i \in I \\
& (1 - G) \sum_{i=1}^{I} P_{it} S_{it} - L_t + F_t = (1 + G) \sum_{i=1}^{I} P_{it} B_{it}, && \forall t \in T \\
& H_{it},\ B_{it},\ S_{it} \ge 0, && \forall i \in I,\ t \in T
\end{aligned}
\tag{9.5}
$$
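Before moving to the AMPL representation, it can help to see the holding and fund balance constraints evaluated on concrete numbers. The sketch below is a hypothetical feasibility check, not part of the tutorial model: the function name `check_plan`, the one-asset two-period data and the transaction cost of 0.25 are all made up for illustration, and time periods are indexed from 0 in the code.

```python
import math

def check_plan(price, liabilities, funding, h0, buy, sell, g):
    """Check a trading plan against constraints (9.1)-(9.3) and return
    (holdings, terminal_wealth). All time-indexed lists use t = 0..T-1;
    price[t][i], buy[t][i] and sell[t][i] refer to asset i in period t."""
    n_assets = len(h0)
    holdings, prev = [], list(h0)
    for t in range(len(price)):
        # Asset holding constraints (9.1) and (9.2)
        cur = [prev[i] + buy[t][i] - sell[t][i] for i in range(n_assets)]
        if min(cur) < 0:
            raise ValueError(f"negative holding in period {t}")
        # Fund balance constraint (9.3)
        inflow = ((1 - g) * sum(price[t][i] * sell[t][i] for i in range(n_assets))
                  - liabilities[t] + funding[t])
        outflow = (1 + g) * sum(price[t][i] * buy[t][i] for i in range(n_assets))
        if not math.isclose(inflow, outflow, abs_tol=1e-9):
            raise ValueError(f"fund balance violated in period {t}")
        holdings.append(cur)
        prev = cur
    # Objective (9.4): market value of the final holdings
    wealth = sum(price[-1][i] * holdings[-1][i] for i in range(n_assets))
    return holdings, wealth

# Hypothetical data: one asset, two periods, transaction cost of 25%.
holdings, wealth = check_plan(
    price=[[10.0], [12.0]],
    liabilities=[0.0, 45.0],
    funding=[125.0, 0.0],
    h0=[0.0],
    buy=[[10.0], [0.0]],
    sell=[[0.0], [5.0]],
    g=0.25,
)
```

In the first period the funding of 125 exactly pays for 10 units at price 10 plus the 25% transaction cost; in the second period selling 5 units at price 12 yields 45 after costs, which covers the liability. The plan is therefore feasible and ends with 5 units worth 60.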

ALM Dataset Representation

When representing the ALM problem in AMPL we have to define the parameter data. Table 9.4 gives an example of some of the required ALM data.

| Time Period | Liabilities | Income | Target Wealth | Initial Holdings |
|---|---|---|---|---|

Table 9.4: Data for the ALM problem

The next step is to define the future asset prices for all time periods. Recall that the future price is unknown, and we defined it earlier as the expected price. In fact, in a non-stochastic model every random parameter is replaced by its mathematical expectation (given by its weighted average). In our example we have 23 assets and 4 different time periods, so we have 92 different expected prices. However, asset prices are difficult to predict, and generally the expected value is not an accurate prediction. If we want a more accurate depiction, we represent prices as a probability distribution, which can be continuous or discrete. Robust Programming deals with random variables which have continuous probability distributions, whereas Stochastic Programming is based on discrete representations of random variables known as scenarios. Suppose we represent the set of asset prices $P_{its}$, $s \in S$ by $|S| = 64$ different future scenarios (a discussion of techniques to generate scenarios can be found in Chapter 14). The depiction of all the data is given in Table 9.5.

| Scenario | Asset | Time = 1 | Time = 2 | Time = 3 | Time = 4 |
|---|---|---|---|---|---|
| 1 | 1 | P_{1,1,1} | P_{1,2,1} | P_{1,3,1} | P_{1,4,1} |
| 1 | 2 | P_{2,1,1} | P_{2,2,1} | P_{2,3,1} | P_{2,4,1} |
| ... | ... | ... | ... | ... | ... |
| 1 | 23 | P_{23,1,1} | P_{23,2,1} | P_{23,3,1} | P_{23,4,1} |
| 2 | 1 | P_{1,1,2} | P_{1,2,2} | P_{1,3,2} | P_{1,4,2} |
| 2 | 2 | P_{2,1,2} | P_{2,2,2} | P_{2,3,2} | P_{2,4,2} |
| ... | ... | ... | ... | ... | ... |
| 2 | 23 | P_{23,1,2} | P_{23,2,2} | P_{23,3,2} | P_{23,4,2} |
| ... | ... | ... | ... | ... | ... |
| 64 | 1 | P_{1,1,64} | P_{1,2,64} | P_{1,3,64} | P_{1,4,64} |
| 64 | 2 | P_{2,1,64} | P_{2,2,64} | P_{2,3,64} | P_{2,4,64} |
| ... | ... | ... | ... | ... | ... |
| 64 | 23 | P_{23,1,64} | P_{23,2,64} | P_{23,3,64} | P_{23,4,64} |

Table 9.5: Prices for the ALM problem, 64 scenarios

For each scenario $s$ we have an associated probability $p_s$. For all $i \in I$, $t \in T$, we calculate the expected price $P_{it}$ as the probability-weighted average $\sum_{s \in S} p_s P_{its}$; when all scenarios are equally likely ($p_s = 1/|S|$) this reduces to $\sum_{s \in S} P_{its} / |S|$.
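The reduction from scenario prices to expected prices can be written out in a few lines. This is a minimal sketch with made-up numbers (two scenarios, two assets, two periods); the function name `expected_prices` and the nested-list layout `prices[s][i][t]` are assumptions for illustration and do not correspond to any SAMPL data format.

```python
def expected_prices(prices, probs):
    """Return the matrix of expected prices P_it = sum_s p_s * P_its,
    where prices[s][i][t] is the price of asset i at time t in scenario s."""
    n_assets = len(prices[0])
    n_periods = len(prices[0][0])
    return [
        [sum(p * prices[s][i][t] for s, p in enumerate(probs))
         for t in range(n_periods)]
        for i in range(n_assets)
    ]

# Hypothetical data: 2 scenarios, 2 assets, 2 time periods.
probs = [0.25, 0.75]
prices = [
    [[100.0, 104.0], [50.0, 48.0]],   # scenario 1
    [[100.0, 112.0], [50.0, 52.0]],   # scenario 2
]
exp_price = expected_prices(prices, probs)

# Size of the full scenario price data for the tutorial dimensions:
# 23 assets x 4 periods, compared with 23 x 4 x 64 once scenarios are added.
n_expected = 23 * 4
n_scenario_values = 23 * 4 * 64
```

For this data the expected prices are 100 and 110 for the first asset and 50 and 51 for the second. The last two lines show why scenario data grows quickly: the tutorial dimensions give 92 expected prices but 5888 scenario prices.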
The inclusion of scenarios effectively adds another dimension to the price matrix. The number of values we need if we have 64 scenarios is 23 × 4 × 64 = 5888 different prices. In real world applications, the number of scenarios that realistically approximates a set of random variables can be much larger.

9.4 Mapping algebraic entities to AMPL

In this section we introduce the AMPL notation that will be used throughout this tutorial. Table 9.6 presents the mapping of the algebraic entities to the corresponding AMPL sets, parameters and variables.

| Type | Notation | AMPL Name | Range / Dimensions |
|---|---|---|---|
| set (indices) | I | ASSETS | i = 1,...,NA; NA = 23 |
| | T | TIME | t = 1,...,NT; NT = 4 |
| param (parameters) | P_it | price | ASSETS, TIME |
| | L_t | liabilities | TIME |
| | H_i0 | initialholdings | ASSETS |
| | F_t | income | TIME |
| | G | Tcost | |
| var (variables) | H_it | hold | ASSETS, TIME |
| | S_it | sell | ASSETS, TIME |
| | B_it | buy | ASSETS, TIME |

Table 9.6: Mapping algebraic entities to AMPL

9.5 Expected Value Problem in AMPL

The ALM model defined in (9.1)-(9.4) is deterministic, i.e. it does not take into account the fact that price is a random variable, which will be discussed later. Such a deterministic problem is known as the Expected Value Problem, where each random parameter is replaced by its expected value. The expected value is obtained by calculating its weighted average. The AMPL code for the model is set out in listing 9.1. Notice that we have, for now, left the target final wealth out of the model. The comments in the code should provide enough guidance for a basic understanding of the sections of the model, which follow the same order in which they have been presented. For a more detailed description of the features and the syntax of the language, please refer to the previous part of this manual.

Listing 9.1: AMPL representation of the Expected Value ALM problem

    # Parameters for indices
    param NT := 4;
    param NA := 23;

    # Sets
    set ASSETS := 1..NA;
    set TIME := 1..NT;

    # Parameters

75 CHAPTER 9. MODELLING WITH SAMPL AND TUTORIAL 69 param Tcost := ; param liabilities { TIME }; param initialholdings { ASSETS } := 0; param income { TIME }; param price { TIME, ASSETS }; # Variables var hold { TIME, ASSETS } >=0; var buy { TIME, ASSETS } >=0; var sell { TIME, ASSETS } >=0; # Objective function maximize wealth : sum {a in ASSETS } price [NT,a]* hold [NT,a]; # Constraints subject to stockbalance1 { a in ASSETS }: hold [1,a]= initialholdings [a]+ buy [1,a]- sell [1,a]; stockbalance2 { a in ASSETS, t in 2.. NT }: hold [t,a]= hold [t -1,a]+ buy [t,a]- sell [t,a]; fundbalance { t in TIME }: (1- Tcost )*( sum {a in ASSETS } price [t,a]* sell [t,a]) - liabilities [t] + income [t] = (1+ Tcost )*( sum {a in ASSETS } price [t,a]* buy [t,a]); 9.6 Stochastic ALM in AMPL The uncertain parameters are specified using scenarios, which are the discrete realisations of the parameter values (see table 9.5 dimension to the model to include the realizations of the uncertain parameters in the form of scenario trees, as discussed in section 8.3. Given S as the set of scenarios and p s as the probability of occurrence of scenario s S, the deterministic equivalent of the stochastic programming formulation of the ALM problem is given by: subject to: S I max p s P it s H it s (9.6) s=1 i=1 H i1s = H i0 + B i1s S i1s, i I, s S H its = H it 1s + B its S its, t = 2,..., T ; i I, s S I I (1 G) P its S its L t + F t = (1 + G) P its B its, i=1 i=1 s S H it, B it, S it 0, i I, t T, s S
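The link between the scenario prices of the stochastic model and the expected prices used in the Expected Value Problem can be checked numerically. The following is a minimal Python sketch, independent of AMPL; the price values are randomly generated placeholders (an assumption for illustration only), while the sizes NA = 23, NT = 4 and NS = 64 and the equal probabilities 1/NS are those used in the tutorial.

```python
import random

NA, NT, NS = 23, 4, 64  # assets, time periods, scenarios (sizes used in the tutorial)

# Hypothetical scenario prices price[t][a][s]; real values would come from
# a scenario generator, these are placeholders for illustration only.
rng = random.Random(0)
price = [[[rng.uniform(50.0, 150.0) for s in range(NS)]
          for a in range(NA)]
         for t in range(NT)]

# Equiprobable scenarios, mirroring "param prob {SCENARIO} := 1/NS".
prob = [1.0 / NS] * NS

# Expected price: probability-weighted average over the scenarios.
expected_price = [[sum(prob[s] * price[t][a][s] for s in range(NS))
                   for a in range(NA)]
                  for t in range(NT)]

n_expected = NA * NT       # price values needed by the Expected Value Problem
n_scenario = NA * NT * NS  # price values needed by the stochastic model
print(n_expected, n_scenario)  # 92 5888
```

The two counts reproduce the 92 expected prices and the 5888 scenario prices quoted in the text.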

Following the explicit non-anticipativity representation presented earlier, some changes in the indexation of variables, parameters and constraints are needed, to take into consideration the fact that all decisions are now dependent on the scenario, and so is the parameter price. The corresponding AMPL model is given in listing 9.2, where the parameter prob{SCENARIO} is added to represent the probability of each scenario and the objective is now to maximize the expected final wealth. In Table 9.7 we map the stochastic algebraic entities to their AMPL counterparts.

Type                  Notation    Description                                         AMPL Name          Range / Dimensions
set (indices)         I           Asset universe                                      ASSETS             i = 1,..., NA; NA = 23
                      T           Time periods                                        TIME               t = 1,..., NT; NT = 4
                      S           Future scenarios                                    SCENARIO           s = 1,..., NS; NS = 64
param (parameters)    P_its       Price of asset i at time t, scenario s              price              ASSETS, TIME, SCENARIO
                      L_t         Liability at time t                                 liabilities        TIME
                      H_i0        Initial composition of the portfolio                initialholdings    ASSETS
                      F_t         Funding in time t                                   income             TIME
                      G           Transaction cost as a fraction of trade value       Tcost
var (variables)       H_its       Quantity of assets i to hold in time t, scenario s  hold               ASSETS, TIME, SCENARIO
                      S_its       Quantity of assets i to sell in time t, scenario s  sell               ASSETS, TIME, SCENARIO
                      B_its       Quantity of assets i to buy in time t, scenario s   buy                ASSETS, TIME, SCENARIO

Table 9.7: Mapping stochastic algebraic entities to AMPL

Listing 9.2: AMPL representation of stochastic ALM model

    # Parameters for indices
    param NT := 4;
    param NA := 23;
    param NS := 64;

    # Sets
    set ASSETS := 1..NA;
    set TIME := 1..NT;
    set SCENARIO := 1..NS;

    # Parameters
    param Tcost := 0.025;   # transaction cost, value as in Listing 9.3
    param liabilities {TIME};
    param initialholdings {ASSETS} := 0;
    param income {TIME};

    # Stochastic related parameters
    param prob {SCENARIO} := 1/NS;
    param price {TIME, ASSETS, SCENARIO};

    # Variables
    var hold {TIME, ASSETS, SCENARIO} >= 0;
    var buy {TIME, ASSETS, SCENARIO} >= 0;
    var sell {TIME, ASSETS, SCENARIO} >= 0;

    # Objective function
    maximize wealth:
        sum {s in SCENARIO} prob[s] * (sum {a in ASSETS} price[NT,a,s] * hold[NT,a,s]);

    # Constraints
    subject to
    stockbalance1 {a in ASSETS, s in SCENARIO}:
        hold[1,a,s] = initialholdings[a] + buy[1,a,s] - sell[1,a,s];
    stockbalance2 {a in ASSETS, t in 2..NT, s in SCENARIO}:
        hold[t,a,s] = hold[t-1,a,s] + buy[t,a,s] - sell[t,a,s];
    fundbalance {t in TIME, s in SCENARIO}:
        (1 - Tcost) * (sum {a in ASSETS} price[t,a,s] * sell[t,a,s])
        - liabilities[t] + income[t]
        = (1 + Tcost) * (sum {a in ASSETS} price[t,a,s] * buy[t,a,s]);

To complete the explicit non-anticipativity representation, a structure must be enforced to ensure that only the information available at each decision node influences the decision itself. We therefore have to add to the model the non-anticipativity constraints, which depend on the tree structure of our choice. We consider two different tree shapes; alongside the graphical representation of each tree, the non-anticipativity constraint(s) for the variable hold are listed. To ensure a correct representation of the problem, similar constraints must be defined for all the decision variables.

Two stage tree, 64 scenarios

The model is two stage, with stage 1 including all decisions taken for t = 1 and stage 2 including all the others. One non-anticipativity constraint is needed for each variable to ensure that, for t = 1, the values of the variables remain constant for all scenarios.

    # Stage 1
    na1 {a in ASSETS, s in 2..NS}: hold[1,a,1] = hold[1,a,s];

Figure 9.2: Event tree for two-stage formulation

Four stages tree, 64 scenarios, 4 branches at each stage

To ensure this kind of structure, in which the time index is equal to the stage number, one constraint template is needed for stage 1, four for stage 2 and sixteen for stage 3, for each variable.
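The counts of one, four and sixteen constraint templates follow directly from the branching factor. A minimal Python sketch of the grouping is given below; it assumes, as the scenario ranges in the AMPL constraints do, that scenarios are numbered so that scenarios sharing a node are consecutive. The function name `bundles` is hypothetical and used only for this illustration.

```python
def bundles(n_scenarios, branching, n_stages, stage):
    """Groups of scenarios that share one decision node at the given stage.

    Assumes a balanced tree and scenarios numbered 1..n_scenarios so that
    scenarios passing through the same node are consecutive.
    """
    size = branching ** (n_stages - stage)  # scenarios passing through each node
    return [list(range(first, first + size))
            for first in range(1, n_scenarios + 1, size)]

# Four-stage tree, 64 scenarios, 4 branches at each stage.
for stage in (1, 2, 3):
    groups = bundles(64, 4, 4, stage)
    # Each group needs one constraint template tying its members
    # to the first scenario of the group.
    print(stage, len(groups), [g[0] for g in groups])
```

Each group corresponds to one non-anticipativity template: all scenarios of the group are tied to its first member, which gives the reference scenarios 1, 17, 33, 49 at stage 2 and 1, 5, ..., 61 at stage 3.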

    # Stage 1
    na1 {a in ASSETS, s in 2..NS}: hold[1,a,1] = hold[1,a,s];

    # Stage 2
    na1_s2_1 {a in ASSETS, s in 2..16}: hold[2,a,s] = hold[2,a,1];
    ...
    na1_s2_4 {a in ASSETS, s in 50..64}: hold[2,a,s] = hold[2,a,49];

    # Stage 3
    na1_s3_1 {a in ASSETS, s in 2..4}: hold[3,a,s] = hold[3,a,1];
    na1_s3_2 {a in ASSETS, s in 6..8}: hold[3,a,s] = hold[3,a,5];
    ...
    na1_s3_16 {a in ASSETS, s in 62..64}: hold[3,a,s] = hold[3,a,61];

Figure 9.3: Event tree for multi stage ALM model

9.7 Stochastic ALM Formulated in SAMPL

The extended syntax SAMPL enables the modeller to capture the stochasticity in the model in a natural way, using ad-hoc constructs. Most importantly, the non-anticipativity constraints are not needed, as the generated model enforces the desired tree shape automatically. The SAMPL formulation of the model is reported below; apart from the non-anticipativity constraints, which are not part of it, the objective and constraints are identical to the DEQ formulation.

Listing 9.3: SAMPL representation of the stochastic ALM Model

    # Parameters for set ranges
    param NT := 4;
    param NS := 64;
    param NA := 23;

    # Sets
    set ASSETS := 1..NA;
    set TIME := 1..NT;

    # Stochastic information
    scenarioset SCENARIO := 1..NS;
    random param price {TIME, ASSETS, SCENARIO};
    probability Prob {SCENARIO} := 1/NS;

    # PARAMETERS
    param liabilities {TIME};
    param initialholdings {ASSETS} := 0;
    param income {TIME};
    param target {TIME};
    param Tcost := 0.025;

    # VARIABLES
    var hold {t in TIME, a in ASSETS, s in SCENARIO} >= 0;
    var buy {t in TIME, a in ASSETS, s in SCENARIO} >= 0;
    var sell {t in TIME, a in ASSETS, s in SCENARIO} >= 0;

    # OBJECTIVE
    # maximize wealth: sum {s in SCENARIO} Prob[s] * marketvalue[NT,s];
    maximize wealth:
        sum {s in SCENARIO} Prob[s] * (sum {a in ASSETS} price[NT,a,s] * hold[NT,a,s]);

    # CONSTRAINTS
    subject to
    stockbalance1 {a in ASSETS, s in SCENARIO}:
        hold[1,a,s] = initialholdings[a] + buy[1,a,s] - sell[1,a,s];
    stockbalance2 {a in ASSETS, t in 2..NT, s in SCENARIO}:
        hold[t,a,s] = hold[t-1,a,s] + buy[t,a,s] - sell[t,a,s];
    fundbalance1 {t in TIME, s in SCENARIO}:
        (1 - Tcost) * sum {a in ASSETS} price[t,a,s] * sell[t,a,s]
        + income[t] - liabilities[t]
        = (1 + Tcost) * sum {a in ASSETS} price[t,a,s] * buy[t,a,s];

The definition of the tree shape and the partition of decision variables into stages depend on the desired event tree. For the two-stage problem they are:

Listing 9.4: SAMPL representation of the stochastic ALM Model, two stage

    tree thetree := twostage;   # Specify a two-stage tree
    let {t in TIME, a in ASSETS, s in SCENARIO} hold[t,a,s].stage := if t=1 then 1 else 2;
    let {t in TIME, a in ASSETS, s in SCENARIO} buy[t,a,s].stage := if t=1 then 1 else 2;
    let {t in TIME, a in ASSETS, s in SCENARIO} sell[t,a,s].stage := if t=1 then 1 else 2;

This formulation is noticeably more compact than the DEQ one. Moreover, by generating the problem in this way, the system can automatically generate the Wait and See and the Expected Value problems, and calculate the VSS (value of the stochastic solution) and the EVPI (expected value of perfect information).

9.8 CC and ICC Formulations in AMPL

The incorporation of chance constraints and integrated chance constraints into this model allows the planned strategy to have some degree of underfunding, that is, at some point in time the liquidity incomes do not match the liabilities; in our model this can be implemented by allowing the fund balance to be negative. A reformulation of the fund balance constraint of equation 9.3 is given in equation 9.7 below.
The formulation has furthermore been refined with the introduction of scenarios, to reflect the fact that we are now examining the stochastic version of the model.

$$(1 - G) \sum_{i=1}^{I} P_{its} S_{its} - (1 + G) \sum_{i=1}^{I} P_{its} B_{its} + F_t \ge L_t \qquad \forall t \in 1,\dots,T;\; s \in 1,\dots,S \qquad (9.7)$$

To allow underfunding, one approach is to transform the constraint above into a chance constraint; this allows a violation of that constraint with a certain probability among all scenarios. The first step is to define the underfunding for each scenario:

$$(1 - G) \sum_{i=1}^{I} P_{its} S_{its} - (1 + G) \sum_{i=1}^{I} P_{its} B_{its} + F_t + U_{ts} - O_{ts} = L_t \qquad \forall t \in 1,\dots,T;\; s \in 1,\dots,S \qquad (9.8)$$

where $U_{ts}, O_{ts} \ge 0$, and finally:

$$\bar{U}_s = \sum_{t=1}^{T} U_{ts}$$

where $U_{ts}$ and $O_{ts}$ are variables defined for all time periods and all scenarios.

Allowing underfunding in this model has a side effect: the investor can reinvest the amount of money that "appears" from the underfunding. This is not coherent with proper cash balancing; therefore another constraint must be added, to bind the investor to invest just the cash coming from the liquidation of assets and the income at that time period. We call this the cash balance constraint, and the formulation is as follows:

$$(1 - G) \sum_{i=1}^{I} P_{its} S_{its} + F_t \ge (1 + G) \sum_{i=1}^{I} P_{its} B_{its} \qquad \forall t \in 1,\dots,T;\; s \in 1,\dots,S \qquad (9.9)$$

The chance constraint can be written as:

$$P_{s \in 1,\dots,S}\{\bar{U}_s \le 0\} \ge R \qquad (9.10)$$

where $R$ is a reliability level, that is, the probability with which we want to satisfy the constraint. There is a deterministic equivalent formulation (see the system of equations 9.9) for the CCP problem. The changes to the deterministic equivalent AMPL formulation can be seen in listing 9.5.

Listing 9.5: AMPL representation of a chance constraint

    # Parameters for indices (declared before the sets that use them)
    param NT := 4;
    param NA := 23;
    param NS := 64;

    # Sets
    set ASSETS := 1..NA;
    set TIME := 1..NT;
    set SCENARIO := 1..NS;

    # Parameters
    param Tcost := 0.025;   # transaction cost, value as in Listing 9.3
    param liabilities {TIME};
    param initialholdings {ASSETS} := 0;
    param income {TIME};

    # CC additions to parameters
    param M := 50000;
    param Reliability := 0.8;   # do not underfund with probability 80%

    # Stochastic related parameters
    param prob {SCENARIO} := 1/NS;
    param price {TIME, ASSETS, SCENARIO};

    # Variables
    var hold {TIME, ASSETS, SCENARIO} >= 0;
    var buy {TIME, ASSETS, SCENARIO} >= 0;
    var sell {TIME, ASSETS, SCENARIO} >= 0;

    # CC additions to variables
    var count {SCENARIO} binary;
    var over {TIME, SCENARIO} >= 0;
    var under {TIME, SCENARIO} >= 0;
    var underdeviation {SCENARIO} >= 0;

    # Objective function
    maximize wealth:
        sum {s in SCENARIO} prob[s] * (sum {a in ASSETS} price[NT,a,s] * hold[NT,a,s]);

    # Constraints
    subject to
    stockbalance1 {a in ASSETS, s in SCENARIO}:
        hold[1,a,s] = initialholdings[a] + buy[1,a,s] - sell[1,a,s];
    stockbalance2 {a in ASSETS, t in 2..NT, s in SCENARIO}:
        hold[t,a,s] = hold[t-1,a,s] + buy[t,a,s] - sell[t,a,s];

    # CC additions (extra constraints)
    # Original fund balance, replaced by the version with over/under below:
    # fundbalance {t in TIME, s in SCENARIO}:
    #     (1 - Tcost) * (sum {a in ASSETS} price[t,a,s] * sell[t,a,s])
    #     - liabilities[t] + income[t]
    #     = (1 + Tcost) * (sum {a in ASSETS} price[t,a,s] * buy[t,a,s]);
    fundbalance {t in TIME, s in SCENARIO}:
        (1 - Tcost) * (sum {a in ASSETS} price[t,a,s] * sell[t,a,s])
        - (1 + Tcost) * (sum {a in ASSETS} price[t,a,s] * buy[t,a,s])
        + income[t] - over[t,s] + under[t,s] = liabilities[t];
    cashbalance {t in TIME, s in SCENARIO}:
        (1 - Tcost) * (sum {a in ASSETS} price[t,a,s] * sell[t,a,s]) + income[t]
        >= (1 + Tcost) * (sum {a in ASSETS} price[t,a,s] * buy[t,a,s]);
    underdevdef {s in SCENARIO}:
        sum {t in TIME} under[t,s] = underdeviation[s];

    # CC additions
    # Constraints which represent the Chance Constraint as per 2.20
    CC {s in SCENARIO}: underdeviation[s] <= M * count[s];
    cardcc: sum {s in SCENARIO} prob[s] * count[s] <= 1 - Reliability;

The artifices introduced in the model due to the DEQ formulation are marked by the "CC additions" comments. It is worth noticing that this formulation introduces one binary variable for each scenario.
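The logic encoded by the constraints CC and cardcc can be illustrated outside the optimisation model. The Python sketch below is only a check of a given vector of total underfunding per scenario, not a solver: it assumes the values of underdeviation are already known, and the function name, the ten-scenario data and the tolerance are hypothetical choices made for this example.

```python
def chance_constraint_satisfied(underdeviation, prob, reliability,
                                big_m=50000.0, tol=1e-9):
    """Check the big-M chance constraint for fixed underfunding values.

    count[s] must be 1 whenever underdeviation[s] > 0 (constraint CC),
    and the probability mass of those scenarios must not exceed
    1 - reliability (constraint cardcc).
    """
    # Constraint CC also caps the underfunding of any scenario at big_m.
    if any(u > big_m for u in underdeviation):
        return False
    # Smallest feasible choice of the binary indicators.
    count = [1 if u > 0 else 0 for u in underdeviation]
    violated_probability = sum(p * c for p, c in zip(prob, count))
    return violated_probability <= 1.0 - reliability + tol

NS = 10
prob = [1.0 / NS] * NS                    # equiprobable scenarios
two_short = [0.0] * 8 + [10.0, 49000.0]   # two scenarios underfund
three_short = [0.0] * 7 + [1.0, 1.0, 1.0] # three scenarios underfund

print(chance_constraint_satisfied(two_short, prob, 0.8))    # True
print(chance_constraint_satisfied(three_short, prob, 0.8))  # False
```

In the first case an underfunding of 10 and one of 49000 count equally, since only the number of underfunded scenarios (weighted by probability) enters the constraint.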

This model exposes one weakness of chance constrained problems, which is the fact that they represent a qualitative risk measure. The scenarios that are allowed to underfund in the problem above do indeed underfund, and they can do so by up to M; a scenario with very little underfunding is treated in the same way as one which underfunds by the maximum allowed. This does not take into consideration that the amount of underfunding has an important role too. For this reason, in this case the integrated chance constraint approach might be preferable; the ICC takes the amount of underfunding into consideration, limiting the expected underfunding. Given $W$ as the limit on underfunding, the formulation is, starting from the entities defined in the block of equations 9.8:

$$E_{s \in 1,\dots,S}[\bar{U}_s] \le W \qquad (9.11)$$

which can be implemented in AMPL adding:

Listing 9.6: AMPL representation of an integrated chance constraint

    param W := 50000;
    ICCP: sum {s in SCENARIO} prob[s] * underdeviation[s] <= W;

9.9 CC and ICC Formulations in SAMPL

The formulation of the chance constraint using SAMPL extended syntax is:

    CC: { probability s in SCENARIO : underdeviation[s] > 0 } <= Reliability;

This is equivalent to the modifications implemented in listing 9.5. Similarly the integrated chance constraint reads:

    ICCP: expectation {s in SCENARIO} {underdeviation[s]} <= W;

The formulation using the extended syntax is much more compact and readable.
Moreover, this reformulation allows the modelling system to use a solver that is especially designed to solve CCPs or ICCPs through specialized algorithms (see [31] for an example), whenever such a solver is available.

9.10 Robust Formulations in AMPL

When precise assumptions about the distribution of the random parameters cannot be made, reformulating the model as a robust optimization problem can help to maintain feasibility of the solution in the face of an uncertain future. Only a few light assumptions about the random parameters are made in the model of uncertainty U presented earlier; the model is reported below for ease of reading:

Consider a particular row $i$ of the matrix $A$ and let $J_i$ represent the set of coefficients in row $i$ that are subject to uncertainty. Each entry $a_{ij}$, $j \in J_i$, is modelled as a symmetric and bounded random variable $\tilde{a}_{ij}$, $j \in J_i$, that takes values in $[a_{ij} - \check{a}_{ij},\, a_{ij} + \check{a}_{ij}]$. Associated with the uncertain data $\tilde{a}_{ij}$, we define the random variable $\eta_{ij} = (\tilde{a}_{ij} - a_{ij})/\check{a}_{ij}$, which obeys an unknown but symmetric distribution, and takes values in $[-1, 1]$.

We therefore model the future asset prices as described by the model U, that is, as $\tilde{P}_{it}$ with values in $[P_{it} - \check{P}_{it},\, P_{it} + \check{P}_{it}]$, where $P_{it}$ and $\check{P}_{it}$ are respectively the mean value and the half extension of the uniform distribution of asset $i$ at time $t$. It has to be noted that the prices are the only non-deterministic parameters of this model, and that they appear as elements in the matrix with a multiplier, as coefficients of the variables S and B:

$$(1 - G) \sum_{i=1}^{I} P_{it} S_{it} - (1 + G) \sum_{i=1}^{I} P_{it} B_{it} + F_t \ge L_t \qquad (9.12)$$

To keep the discussion of a topic that is not central to this document (the formulation of robust optimisation problems) short, only the formulation given by Soyster is explicitly given here. The linear program that can be inferred from Soyster's formulation, as in 8.25, is reported below:

$$
\begin{aligned}
Z = \max\; & c x \\
\text{subject to}\quad & \sum_{j} a_{ij} x_j + \sum_{j \in J_i} \check{a}_{ij} y_j \le b_i && \forall i \\
& -y_j \le x_j \le y_j && \forall j \\
& l \le x \le u \\
& y \ge 0
\end{aligned}
\qquad (9.13)
$$

The formulation requires the knowledge of the sets $J_i$ of coefficients in each row $i$ that are subject to uncertainty, because an artificial variable $y_j$ must be created for each one of them. In the algebraic perspective, the procedure translates into recognizing the parameters that are defined as part of the model U (in this model, the prices) and adding an artificial variable for each time they appear in each constraint. In our case, the only constraint involved is shown in equation 9.12, and the random parameter $P_{it}$ appears twice in it. We therefore proceed by creating two artificial variables $y_{B_{it}}$ and $y_{S_{it}}$ for each constraint, which takes the final form:

$$(1 + G) \sum_{i=1}^{I} P_{it} B_{it} - (1 - G) \sum_{i=1}^{I} P_{it} S_{it} + (1 + G) \sum_{i=1}^{I} \check{P}_{it}\, y_{B_{it}} - (1 - G) \sum_{i=1}^{I} \check{P}_{it}\, y_{S_{it}} \le F_t - L_t \qquad \forall t \in T \qquad (9.14)$$

To complete the formulation, the constraints which link the artificial variables and the natural ones have to be added, together with the bounds on the variables, namely:

$$
\begin{aligned}
-y_{B_{it}} \le B_{it} \le y_{B_{it}}, &\quad \forall t \in T,\; i \in A \\
-y_{S_{it}} \le S_{it} \le y_{S_{it}}, &\quad \forall t \in T,\; i \in A \\
B_{it} \ge 0 \text{ and } S_{it} \ge 0, &\quad \forall t \in T,\; i \in A
\end{aligned}
\qquad (9.15)
$$

Expressed in AMPL, the steps above are:

1. Declare the artificial variables and the parameters of the uniform distribution (the mean value $P_{it}$ is taken to be the expected value of the realisations used for the SP problem, so only the additional parameter amplitude is needed):

Listing 9.7: AMPL robust formulation: additional variables

    param amplitude {TIME, ASSETS} := 10;
    var artificialbuy {TIME, ASSETS} >= 0;
    var artificialsell {TIME, ASSETS} >= 0;

2. Reformulate the fundbalance constraint to implement 9.14:

Listing 9.8: AMPL robust formulation: constraint

    fundbalance {t in TIME}:
        (1 + Tcost) * (sum {a in ASSETS} price[t,a] * buy[t,a])
        - (1 - Tcost) * (sum {a in ASSETS} price[t,a] * sell[t,a])
        + sum {a in ASSETS} (1 + Tcost) * artificialbuy[t,a] * amplitude[t,a]
        - sum {a in ASSETS} (1 - Tcost) * artificialsell[t,a] * amplitude[t,a]
        <= income[t] - liabilities[t];

3. Implement the variable bounds:

Listing 9.9: AMPL robust formulation: variable bounds

    robustbuy {t in TIME, a in ASSETS}: -artificialbuy[t,a] <= buy[t,a];
    robustbuy2 {t in TIME, a in ASSETS}: buy[t,a] <= artificialbuy[t,a];
    robustsell {t in TIME, a in ASSETS}: -artificialsell[t,a] <= sell[t,a];
    robustsell2 {t in TIME, a in ASSETS}: sell[t,a] <= artificialsell[t,a];

It is now apparent that, even for this simple problem, the reformulation as a deterministic equivalent takes the focus of the modeller away from the problem itself, towards the definition of artificial variables and the reformulation of constraints.

9.11 Robust Formulations in SAMPL

Expressed using SAMPL extended syntax, the steps above are simplified. The definition of the random parameter is changed to:

Listing 9.10: SAMPL robust formulation: variables and parameters

    random param randomprice {t in TIME, a in ASSETS}
        dist symmetric(price[t,a] - amplitude[t,a], price[t,a] + amplitude[t,a]);

This formal definition of the price gives the modelling system all the information it needs regarding the uncertainty model U.
The next step is to choose the form of the robust formulation, which is obtained via the following statement:

Listing 9.11: SAMPL robust formulation: robust type

    option RobustForm Soyster;

Finally, the constraint is expressed identically to the deterministic version, as:

Listing 9.12: SAMPL robust formulation: robust constraint Soyster

    fundbalance {t in TIME}:
        (1 - Tcost) * (sum {a in ASSETS} randomprice[t,a] * sell[t,a])
        - liabilities[t] + income[t]
        = (1 + Tcost) * (sum {a in ASSETS} randomprice[t,a] * buy[t,a]);

The system takes care of generating the artificial variables and the additional constraints automatically, thus allowing the modeller to concentrate on the problem instead of the formal specification of the uncertainty set. To obtain the other formulations (Ben-Tal and Nemirovski, or Bertsimas and Sim), the modeller simply uses a different value for the RobustForm option. These two formulations require an additional parameter to specify the desired trade-off between optimality and robustness. This parameter is specified in the constraint declaration, as:

Listing 9.13: SAMPL robust formulation: robust constraint with robustness

    fundbalance {t in TIME} suffix robustness gamma[t]:
        (1 - Tcost) * (sum {a in ASSETS} randomprice[t,a] * sell[t,a])
        - liabilities[t] + income[t]
        = (1 + Tcost) * (sum {a in ASSETS} randomprice[t,a] * buy[t,a]);

where gamma is an AMPL parameter containing the chosen robustness value.

Chapter 10

SAMPL: Language Reference

10.1 Introduction

The AMPL language extensions for optimisation under uncertainty provided by SAMPL enable the user to define stochastic programming and robust optimisation models in a simple and concise way. The syntax of the extensions follows that of the AMPL language and is presented below.

10.2 Stages

This section allows the modeller to define the grouping of the decision variables and constraints into stages. The staging is achieved through the definition of suffixes. A suffix can be considered as a generic property of a variable or constraint, and can be used for our purpose to declare the stage to which a variable belongs. The stage of a constraint is determined automatically by the stages of the variables which appear in it: the highest stage of any such variable is the stage of the constraint. The syntax for the assignment of a stage number to a variable is very similar to that of AMPL for other suffixes, but makes use of a predefined suffix called stage:

    let indexing_opt name.stage := expr;

Alternatively, AMPL enables the suffix to be given in the variable declarations:

    var name alias_opt indexing_opt attributes_opt, suffix stage expr;

10.3 Scenario

In scenario-based recourse problems, the uncertainty represented by the random parameters introduces a new dimension, identified by the scenario set. This set needs to be explicitly identified, because the random parameters are indexed over it. The syntax used for the declaration of the scenario set follows the syntax of AMPL for sets, but uses scenarioset instead of set in the declaration:

    scenarioset name alias_opt indexing_opt attributes_opt;

There are certain conditions which have to be satisfied by the scenario set depending on the scenario tree structure. These are formulated in the section defining the TREE keyword.
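The rule from Section 10.2, that a constraint takes the highest stage among the variables appearing in it, can be sketched in a few lines of Python. This is an illustration of the rule only, not of how SAMPL implements it; the function name and the dictionary layout are hypothetical, and the stage assignment mirrors the two-stage ALM example (stage 1 for t = 1, stage 2 otherwise).

```python
def constraint_stage(variable_stage, variables_in_constraint):
    """Stage of a constraint: the highest stage of any variable it contains."""
    return max(variable_stage[v] for v in variables_in_constraint)

NT = 4
# Two-stage assignment as in the ALM tutorial: t = 1 is stage 1, later periods stage 2.
variable_stage = {(name, t): (1 if t == 1 else 2)
                  for name in ("hold", "buy", "sell")
                  for t in range(1, NT + 1)}

# stockbalance1 involves only first-period variables.
stage_sb1 = constraint_stage(variable_stage,
                             [("hold", 1), ("buy", 1), ("sell", 1)])

# stockbalance2 at t = 2 links hold[2] with hold[1], buy[2] and sell[2].
stage_sb2 = constraint_stage(variable_stage,
                             [("hold", 2), ("hold", 1), ("buy", 2), ("sell", 2)])

print(stage_sb1, stage_sb2)  # 1 2
```

A constraint that mixes first-stage and second-stage variables, such as the stock balance at t = 2, is therefore a second-stage constraint.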

10.4 Probabilities

The probability section enables the declaration of the probability distribution for the scenarios. The values of the weights can be retrieved from the data or explicitly given in the model declaration itself. The cardinality of the probability vector must be the same as the number of scenarios. The syntax is the following:

    probability param_opt name alias_opt indexing_opt attributes_opt;

These values represent a discrete probability distribution and need to satisfy the following:

$$0 \le p_s \le 1, \quad s = 1,\dots,S; \qquad \sum_{s=1}^{S} p_s = 1 \qquad (10.1)$$

where $S$ is the number of scenarios.

10.5 Random Data

The random parameters of the model have to be explicitly identified and are treated differently from the deterministic data. Every random parameter has to be indexed over the scenario dimension, which links the data vector to the scenario tree structure.

    random param name indexing attributes_opt;

Scenario data is represented in the form of a 2-dimensional matrix (tree matrix). The matrix usually forms a grid where the columns represent time periods and the rows represent scenarios. Each entry (t, s) of the matrix represents the realisation of the random parameter at time period t under scenario s. An entry can be either a scalar or a vector.

Let us consider a 3 time period horizon and a random parameter p, which takes the (known) value 10 in the first time period. Let us consider that 2 realisations of the random parameter can be observed at each time period t = 2..3, as in Figure 10.1.

Figure 10.1: Binomial process

Figure 10.2: Scenarios example

t = 1    t = 2    t = 3    Scenario

Table 10.1: Scenario tree data

t = 1    t = 2    t = 3    Scenario

Table 10.2: Scenario tree sparse matrix

This rule defines a simple scenario generator. The resulting scenario tree is shown in Figure 10.2. The matrix representing the data for this scenario tree is reported in table 10.1. The (sparse) matrix representing the data for this scenario tree is shown in table 10.2. Such a matrix can be represented in SAMPL in a row-wise fashion as:

    scenarioset scen = 1..4;
    random param dem {t, scen};
    random param dem := ;

10.6 Chance Constraints

Probabilistic or chance constraints are characterised by randomness in some of the coefficients and by a level β which indicates the probability of satisfying the constraint. In a scenario-based problem a chance constraint is allowed to be violated in some of the scenarios. The sum of the probabilities of the scenarios in which it is violated is bounded by 1 − β, where β is the reliability level associated with the constraint. Since the reliability parameter represents a probability it should be in the range [0, 1]. Chance constraints are defined using the following syntax:

    subject to name indexing_opt : chance-constraint-expression ;

where

    chance-constraint-expression:
        probability { scenario-index : basic-constraint-expression } rel-op cexpr
    or
        cexpr rel-op probability { scenario-index : basic-constraint-expression }

and where:

    basic-constraint-expression:
        expr rel-op expr
        cexpr <= expr <= cexpr
        cexpr >= expr >= cexpr
    rel-op:
        =  <=  >=
    scenario-index:
        dummy-member in scenarioset-name

The expr construct denotes an arithmetic expression while cexpr denotes a constant expression, one that may not contain variables. The scenario index consists of a scenario set name preceded by the keyword in and a dummy member, the scope of which covers the basic constraint expression.

Consider the definition of the following deterministic constraint in AMPL:

    subject to SatisfyDemand {p in Prod}: Produce[p] >= Demand[p];

Assuming that demand is a random parameter and defining a reliability parameter, this constraint can be reformulated as a chance constraint:

    param Reliability = 0.6;
    subject to SatisfyDemand {p in Prod}:
        probability {s in Scen : Produce[p,s] >= Demand[p,s]} >= Reliability;

10.7 Integrated Chance Constraints

Integrated chance constraints (ICC) were introduced by [30] as a quantitative alternative to chance constraints. Instead of bounding the probability of violating the constraint, an ICC bounds the expectation of a shortfall or a surplus that is generated as a result of constraint violation. This bound is denoted by the parameter β associated with each individual integrated chance constraint. It should be nonnegative but, unlike the reliability parameter of chance constraints, it can be greater than 1. Integrated chance constraints are defined using the following syntax:

    subject to name indexing_opt : icc-expression ;

where

    icc-expression:
        expectation { scenario-index } ( expr less expr ) rel-op cexpr
    or
        cexpr rel-op expectation { scenario-index } ( expr less expr )

and where

    rel-op:
        =  <=  >=
    scenario-index:
        dummy-member in scenarioset-name

The expr construct denotes an arithmetic expression while cexpr denotes a constant expression, one that may not contain variables. The scenario index consists of a scenario set name preceded by the keyword in and a dummy member, the scope of which covers the less-expression in brackets but not cexpr. In AMPL the expression (a less b) is equivalent to max{a − b, 0}, so it can be viewed as the amount of violation of the constraint a ≤ b.

Consider the definition of the following deterministic constraint in AMPL:

    subject to SatisfyDemand {p in Prod}: Produce[p] >= Demand[p];

Assuming that the demand is a random parameter and defining the parameter MaxExpShortfall, this constraint can be reformulated as an integrated chance constraint:

    param MaxExpShortfall = 100;
    subject to SatisfyDemand {p in Prod}:
        expectation {s in Scen} (Demand[p,s] less Produce[p,s]) <= MaxExpShortfall;

10.8 Scenario Tree

This section is of key importance, as it provides the modelling system with the information related to the scenario tree structure. In a scenario-based problem, a scenario is represented by a path from the root node of the tree to one of the leaves. Depending on the process that drives the scenario generation, different ways of specifying this structure are allowed. It can be defined in the model itself, it can be provided externally by the scenario generator, or it can be retrieved in an automatic fashion from the scenario data.

TREE

A tree is described in terms of time stages, as opposed to time periods. SAMPL provides alternative ways of defining the tree structure.
The syntax is as follows:

    tree name :=_opt tree_declaration ;

where <tree_declaration> is one of:

    bundle_list
    tlist
    nway {n}
    multibranch {n1, n2, ..., nst}
    binary
    twostage {ns}_opt

twostage

If the stages section defines a two-stage aggregation or the model itself is two-stage, then the tree is simply formulated using the twostage keyword. This is equivalent to defining an nway{S} tree, where S is the number of scenarios as provided by the scenarioset section.
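As a closing numerical illustration of the constructs in Sections 10.4, 10.6 and 10.7, the following Python sketch evaluates, for one product and a fixed production level, the quantities that the probability and expectation expressions of the SatisfyDemand examples refer to. It is not SAMPL code and does not optimise anything; the probabilities, demands and production level are hypothetical values chosen for the example.

```python
def less(a, b):
    """AMPL's `a less b`: max(a - b, 0)."""
    return max(a - b, 0.0)

# Hypothetical data for a single product over three scenarios.
prob = [0.3, 0.4, 0.3]
demand = [50.0, 150.0, 250.0]
produce = [150.0, 150.0, 150.0]  # same production level in every scenario

# Conditions (10.1): each weight in [0, 1] and the weights sum to one.
valid_distribution = (all(0.0 <= p <= 1.0 for p in prob)
                      and abs(sum(prob) - 1.0) < 1e-9)

# Chance constraint: probability that production covers demand.
prob_satisfied = sum(p for p, q, d in zip(prob, produce, demand) if q >= d)

# Integrated chance constraint: expected shortfall E[Demand less Produce].
expected_shortfall = sum(p * less(d, q)
                         for p, q, d in zip(prob, produce, demand))

reliability = 0.6
max_exp_shortfall = 100.0
print(valid_distribution,
      prob_satisfied >= reliability,
      expected_shortfall <= max_exp_shortfall)
```

With these numbers demand is covered in the first two scenarios (probability about 0.7) and the expected shortfall is about 30, so both the chance constraint with reliability 0.6 and the integrated chance constraint with a bound of 100 hold.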

Chapter 11

The Dakota Problem (Tutorial 2)

In this chapter we introduce the second tutorial, the Dakota Problem.

11.1 Introduction

The Dakota Furniture Company manufactures desks, tables, and chairs. A desk sells for $60, a table sells for $40, and a chair sells for $10. The manufacture of each type of furniture requires lumber and two types of skilled labour: finishing and carpentry. The cost of each resource and the resource levels required for each item produced are given below.

Resource             Cost    Production Requirements
                             Desk    Table    Chair
Lumber (bd. ft.)
Finishing (hrs.)
Carpentry (hrs.)

Table 11.1

The question we need to answer is how much of each item should be produced in order to maximise profit. However, the Dakota Company needs to know the demand for each of the products above, and to this end they have the following forecasts with appropriate probabilities:

Products       Demand
               Low    Likely    High
Desks
Tables
Chairs
Probability

Table 11.2

11.2 Deterministic Algebraic Formulation

In this section we algebraically formulate the Dakota problem as a deterministic linear problem. We first introduce some notation. Let P = [desks, tables, chairs] be the set of products and R = [lumber, finishing, carpentry] the set of resources. The production cost of each resource is given by $C_r$, $r \in R$, and the resource requirement for each product is given by $Q_{rp}$, $r \in R$, $p \in P$. The available budget is given by $B$ and the selling price is given by $V_p$, $p \in P$. The demand is given by $D_{ps}$, $p \in P$, $s \in S$, where $S$ is the scenario set, with $|S| = 3$. We also define the probability $prob_s$ for each scenario $s$. In a deterministic model we calculate the expected value of the demand as $D_p = \sum_{s \in S} prob_s D_{ps}$.

The decision variables are:

    b_r    Amount to buy of resource r
    x_p    Amount to produce of product p
    y_p    Amount to sell of product p

The objective function of the Dakota problem is given by:

$$\max \sum_{p \in P} V_p y_p - \sum_{r \in R} C_r b_r \qquad (11.1)$$

subject to several constraints. The purchase of each resource must be enough to cover all products made:

$$b_r \ge \sum_{p \in P} Q_{rp} x_p, \quad \forall r \in R \qquad (11.2)$$

We cannot sell more than we produce and we cannot sell more than the demand:

$$x_p \ge y_p, \quad \forall p \in P \qquad (11.3)$$

$$D_p \ge y_p, \quad \forall p \in P \qquad (11.4)$$

And, finally, the expenses cannot exceed the budget:

$$\sum_{r \in R} C_r b_r \le B \qquad (11.5)$$

To finalise the model, we have to include the nonnegativity constraints:

$$b_r \ge 0, \quad \forall r \in R; \qquad x_p,\, y_p \ge 0, \quad \forall p \in P \qquad (11.6)$$

11.3 Data Representation

The demand is the only stochastic parameter in the Dakota model. The expected demand D is given as a one-dimensional vector with one entry for each product in P. The uncertainty of the demand is represented by a discrete set of scenarios S and a two-dimensional matrix such as:

Demand
Product    Scen = 1    Scen = 2    ...    Scenario S
1          D_{1,1}     D_{1,2}     ...    D_{1,S}
2          D_{2,1}     D_{2,2}     ...    D_{2,S}
...
P          D_{P,1}     D_{P,2}     ...    D_{P,S}

Table 11.3: Demand for the Dakota problem, S scenarios
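The expected demand can be reproduced with a short Python sketch. The scenario demands and probabilities below are an assumption: they are the values commonly quoted for the textbook version of the Dakota problem and are not taken from the forecast table of this chapter. Under that assumption they give the expected demands of 150 desks, 125 tables and 300 chairs that appear in the expected value data file of this tutorial.

```python
# Assumed scenario data (low, likely, high); textbook Dakota values,
# not read from this manual's forecast table.
prob = [0.3, 0.4, 0.3]
demand = {
    "desk":   [50.0, 150.0, 250.0],
    "tables": [20.0, 110.0, 250.0],
    "chair":  [200.0, 225.0, 500.0],
}

# Expected demand per product: probability-weighted average over scenarios.
expected_demand = {
    product: sum(p * d for p, d in zip(prob, values))
    for product, values in demand.items()
}

for product, value in expected_demand.items():
    print(product, round(value, 6))
```

Note that the weighted sum is not divided by the number of scenarios, because the probabilities already sum to one.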

11.4 Expected Value Problem in AMPL

In this section we introduce the expected value formulation for the Dakota problem in AMPL. We begin by introducing the notation that will be used throughout this tutorial. Table 11.4 presents the mapping of the algebraic entities to the corresponding AMPL sets, parameters and variables.

Type                 Notation   AMPL Name    Range / Dimensions
set (indices)        P          Prod
                     R          Resource
param (parameters)   C          Cost         Resource
                     Q          ProdReq      Resource, Prod
                     V          Price        Prod
                     B          Budget
                     D          Demand       Prod
var (variables)      b          buy          Resource
                     x          amountprod   Prod
                     y          amountsell   Prod

Table 11.4: Mapping algebraic entities to AMPL

The AMPL expected value formulation is given in Listing 11.1.

Listing 11.1: Dakota model: expected value problem in AMPL

    #SETS
    set Prod;
    set Resource;

    #PARAMETERS
    param Cost{Resource};
    param ProdReq{Resource, Prod};
    param Price{Prod};
    param Budget;
    param Demand{Prod};

    #VARIABLES
    var buy{r in Resource} >= 0;
    var amountprod{p in Prod} >= 0;
    var amountsell{p in Prod} >= 0;

    #OBJECTIVE
    maximize wealth:
        (sum{p in Prod} Price[p] * amountsell[p]
         - sum{r in Resource} Cost[r] * buy[r]);

    #CONSTRAINTS
    subject to balance{r in Resource}:
        buy[r] >= sum{p in Prod} ProdReq[r,p] * amountprod[p];

    subject to production{p in Prod}:
        amountsell[p] <= amountprod[p];

    subject to sales{p in Prod}:
        amountsell[p] <= Demand[p];

    subject to budgetres:
        sum{r in Resource} Cost[r] * buy[r] <= Budget;

The data as given in the problem description is represented in a separate data file:

Listing 11.2: Dakota model: data file

    set Prod := desk tables chair;
    set Resource := lumber finishing carpentry;

    param Cost :=
        lumber     2.0
        finishing  4.0
        carpentry  5.2 ;

    param ProdReq : desk tables chair :=
        lumber
        finishing
        carpentry ;

    param Price :=
        desk    60
        tables  40
        chair   10 ;

    param Budget := 60000;

    param Demand :=
        desk    150
        tables  125
        chair   300 ;

Observe that we have replaced the stochastic demand as presented in the description by its expected value.

11.5 A Stochastic Formulation

In this section we formulate the Dakota problem as a two-stage stochastic problem which considers future scenarios for the demand. In the stochastic version, we have to decide how much of each resource to buy before the actual demand is revealed. Thus, the variables that represent the amount to buy of each resource are first-stage variables, and the remaining variables are second-stage variables. After adding the scenario set $S$ as a new dimension, the stochastic formulation is given by:

$$\max \sum_{s \in S} prob_s \left( \sum_{p \in P} V_p y_{ps} - \sum_{r \in R} C_r b_{rs} \right) \tag{11.7}$$

subject to:

$$b_{rs} \ge \sum_{p \in P} Q_{rp} x_{ps}, \quad \forall r \in R, s \in S \tag{11.8}$$

$$x_{ps} \ge y_{ps}, \quad \forall p \in P, s \in S \tag{11.9}$$

$$D_{ps} \ge y_{ps}, \quad \forall p \in P, s \in S \tag{11.10}$$

$$\sum_{r \in R} C_r b_{rs} \le B, \quad \forall s \in S \tag{11.11}$$

$$b_{r1} = b_{rs}, \quad \forall s \in S, r \in R \tag{11.12}$$

$$b_{rs} \ge 0, \quad \forall r \in R, s \in S; \qquad x_{ps}, y_{ps} \ge 0, \quad \forall p \in P, s \in S \tag{11.13}$$

Equation (11.12) is a non-anticipativity constraint.

11.6 A Stochastic Formulation in AMPL

Before presenting the AMPL stochastic formulation of the Dakota problem, Table 11.5 shows the mapping from the algebraic entities to the AMPL formulation. Only the entities that differ from Table 11.4 are presented.

Type                 Notation   AMPL Name    Range / Dimensions
set (indices)        S          Scen
param (parameters)   D          Demand       Prod, Scen
var (variables)      b          buy          Resource, Scen
                     x          amountprod   Prod, Scen
                     y          amountsell   Prod, Scen

Table 11.5: Mapping algebraic entities to AMPL

The AMPL stochastic formulation of the Dakota problem is shown in Listing 11.3.

Listing 11.3: Dakota model: stochastic formulation in AMPL

    #PARAMETERS
    param NS;   # number of scenarios

    #SETS
    set Prod;
    set Resource;

    #SCENARIO SET
    set Scen := 1..NS;

    #PARAMETERS
    param Cost{Resource};
    param ProdReq{Resource, Prod};
    param Price{Prod};
    param Budget;
    param Demand{Prod, Scen};

    param Prob{Scen};

    #VARIABLES
    var buy{r in Resource, s in Scen} >= 0;
    var amountprod{p in Prod, s in Scen} >= 0;
    var amountsell{p in Prod, s in Scen} >= 0;

    #OBJECTIVE
    maximize wealth:
        sum{s in Scen} Prob[s] *
            (sum{p in Prod} Price[p] * amountsell[p,s]
             - sum{r in Resource} Cost[r] * buy[r,s]);

    #CONSTRAINTS
    subject to balance{r in Resource, s in Scen}:
        buy[r,s] >= sum{p in Prod} ProdReq[r,p] * amountprod[p,s];

    subject to production{p in Prod, s in Scen}:
        amountsell[p,s] <= amountprod[p,s];

    subject to sales{p in Prod, s in Scen}:
        amountsell[p,s] <= Demand[p,s];

    subject to budgetres{s in Scen}:
        sum{r in Resource} Cost[r] * buy[r,s] <= Budget;

    subject to nonanticipativity{r in Resource, s in Scen}:
        buy[r,1] = buy[r,s];

The data is defined as in Listing 11.2, with the following differences:

Listing 11.4: Dakota model: stochastic data file

    param Prob := ;

    param Demand : :=
        desk
        tables
        chair ;

The probability vector was added and we replaced the demand with the forecast of the three scenarios defined earlier.

11.7 A Stochastic Formulation in SAMPL

The SAMPL formulation of the Dakota problem is given in Listing 11.5.

Listing 11.5: Dakota model: stochastic formulation in SAMPL

    #PARAMETERS
    param NS;   # number of scenarios

    #SETS
    set Prod;
    set Resource;

    #SCENARIO SET
    scenarioset Scen := 1..NS;

    #TREE
    tree Tree := twostage;

    #RANDOM PARAMETERS
    random param Demand{Prod, Scen};

    #PROBABILITIES
    probability param Prob{Scen};

    #PARAMETERS
    param Cost{Resource};
    param ProdReq{Resource, Prod};
    param Price{Prod};
    param Budget;

    #VARIABLES
    var buy{r in Resource} >= 0, suffix stage 1;
    var amountprod{p in Prod, s in Scen} >= 0, suffix stage 2;
    var amountsell{p in Prod, s in Scen} >= 0, suffix stage 2;

    #OBJECTIVE
    maximize wealth:
        sum{s in Scen} Prob[s] *
            (sum{p in Prod} Price[p] * amountsell[p,s]
             - sum{r in Resource} Cost[r] * buy[r]);

    #CONSTRAINTS
    subject to balance{r in Resource, s in Scen}:
        buy[r] >= sum{p in Prod} ProdReq[r,p] * amountprod[p,s];

    subject to production{p in Prod, s in Scen}:
        amountsell[p,s] <= amountprod[p,s];

    subject to sales{p in Prod, s in Scen}:
        amountsell[p,s] <= Demand[p,s];

    subject to budgetres:
        sum{r in Resource} Cost[r] * buy[r] <= Budget;

The stage of each variable is defined with the suffix keyword, so the non-anticipativity constraint is no longer necessary. The data file is the same for the SAMPL and the stochastic AMPL formulations.

11.8 A Chance Constraint Formulation in AMPL

Assume that Dakota wishes to meet demand with a reliability level of 60% in the future. In this section, we reformulate the Dakota model to include chance constraints (CCs) that represent this new requirement. Let $0 \le R \le 1$ be the desired reliability. In the deterministic CC formulation, we have to add binary variables to represent the chance constraint correctly. We define:

$h_{ps}$ to be 1 if a shortage of product $p$ happens in scenario $s$, 0 otherwise.

We add the following constraints to ensure that we meet the required reliability level:

$$\sum_{s \in S} prob_s h_{ps} \le 1 - R, \quad \forall p \in P \tag{11.14}$$

The model is also subject to the constraints of the stochastic formulation. The problem with the constraints above is that nothing guarantees that the $h$ variables will be one when a shortage happens. We have to add further constraints to guarantee that:

$$D_{ps} - x_{ps} \le M h_{ps}, \quad \forall p \in P, s \in S \tag{11.15}$$

Here $M$ is a constant that is large enough to be greater than all demands $D_{ps}$. The AMPL code for the deterministic CC formulation (where we use shortage to represent the variables $h$) is given as:

Listing 11.6: Dakota model: CC formulation in AMPL

    # The declaration of the remaining parameters and sets is
    # the same as for the stochastic model in AMPL
    param M = ;
    param Reliability = 0.6;

    ### VARIABLES ###
    var buy{r in Resource, s in Scen} >= 0;
    var amountprod{p in Prod, s in Scen} >= 0;
    var amountsell{p in Prod, s in Scen} >= 0;
    var shortage{p in Prod, s in Scen} binary;

    # The objective function is the same as for the
    # stochastic model in AMPL

    ### CONSTRAINTS ###
    subject to satdemand{p in Prod, s in Scen}:
        Demand[p,s] - amountprod[p,s] <= M * shortage[p,s];

    subject to cc{p in Prod}:
        sum{s in Scen} Prob[s] * shortage[p,s] <= 1 - Reliability;

    # The remaining constraints are the same as
    # for the stochastic model in AMPL

Observe that the shortage variables are binary, and that we added the newly defined constraints (the value of M is chosen arbitrarily). We do not repeat the declaration of sets, parameters and the remaining constraints, as they are the same as in Listing 11.3.

11.9 A Chance Constraint Formulation in SAMPL

When implementing CCs in SAMPL, there is no need to define additional binary variables.
There is also no need to define constraints similar to (11.15), as this is handled by SAMPL itself. In the SAMPL formulation, we remove the shortage variables and update the objective function. We also define the chance constraints as given in the listing below.

Listing 11.7: Dakota model: CC formulation in SAMPL

    # The declaration of the remaining parameters and sets is
    # the same as for the stochastic model in AMPL
    param Reliability = 0.6;

    ### VARIABLES ###
    var buy{r in Resource, s in Scen} >= 0;
    var amountprod{p in Prod, s in Scen} >= 0;
    var amountsell{p in Prod, s in Scen} >= 0;

    # The objective function is the same as for the
    # stochastic model in AMPL

    ### CONSTRAINTS ###
    subject to cc{p in Prod}:
        probability{s in Scen: amountprod[p,s] >= Demand[p,s]} >= Reliability;

    # The remaining constraints are the same as
    # for the stochastic model in AMPL

11.10 An Integrated Chance Constraint Formulation in AMPL

The integrated chance constraint (ICC) approach considers the Dakota problem to be feasible if the expected shortfall does not exceed a predefined value. To define the deterministic ICC formulation of the Dakota problem, let $U \ge 0$ be the maximum allowed expected shortfall. We then add variables to represent the shortage and excess of the items produced in relation to the demand. We define:

$h_{ps}$ to represent the shortage of product $p$ in scenario $s$.
$e_{ps}$ to represent the excess of product $p$ in scenario $s$.
We add the following balance constraints:

$$x_{ps} - D_{ps} = e_{ps} - h_{ps}, \quad \forall p \in P, s \in S \tag{11.16}$$

The model is also subject to the constraints of the stochastic formulation. Then we can add the ICC in its deterministic form:

$$\sum_{s \in S} prob_s h_{ps} \le U, \quad \forall p \in P \tag{11.17}$$

The AMPL code for the deterministic ICC formulation (where we use shortage to represent the variables $h$, excess for $e$ and shortfall for $U$) is given as:

Listing 11.8: Dakota model: ICC formulation in AMPL

    # The declaration of the remaining parameters and sets is
    # the same as for the stochastic model in AMPL
    param shortfall = 100;

    ### VARIABLES ###
    var buy{r in Resource, s in Scen} >= 0;
    var amountprod{p in Prod, s in Scen} >= 0;

    var amountsell{p in Prod, s in Scen} >= 0;
    var shortage{p in Prod, s in Scen} >= 0;
    var excess{p in Prod, s in Scen} >= 0;

    # The objective function is the same as for the
    # stochastic model in AMPL

    ### CONSTRAINTS ###
    subject to satdemand{p in Prod, s in Scen}:
        amountprod[p,s] - Demand[p,s] = excess[p,s] - shortage[p,s];

    subject to icc{p in Prod}:
        sum{s in Scen} Prob[s] * shortage[p,s] <= shortfall;

    # The remaining constraints are the same as
    # for the stochastic model in AMPL

11.11 An Integrated Chance Constraint Formulation in SAMPL

The ICC formulation in SAMPL requires neither the extra variables nor the balance constraint. The formulation is given below:

Listing 11.9: Dakota model: ICC formulation in SAMPL

    # The declaration of the remaining parameters and sets is
    # the same as for the stochastic model in AMPL
    param shortfall = 100;

    ### VARIABLES ###
    var buy{r in Resource, s in Scen} >= 0;
    var amountprod{p in Prod, s in Scen} >= 0;
    var amountsell{p in Prod, s in Scen} >= 0;

    # The objective function is the same as for the
    # stochastic model in AMPL

    ### CONSTRAINTS ###
    subject to icc{p in Prod}:
        expectation{s in Scen}(Demand[p,s] less amountprod[p,s]) <= shortfall;

    # The remaining constraints are the same as
    # for the stochastic model in AMPL
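Both risk measures used in this chapter can be evaluated by hand for a fixed production plan, which is a useful sanity check on a solver's output. The Python sketch below computes, for a single product, the probability of meeting demand (the quantity bounded from below by the chance constraint) and the expected shortage (the quantity bounded from above by the ICC). The function names and every number are hypothetical and chosen only for illustration.

```python
def reliability(production, demand, prob):
    """Total probability of the scenarios in which production covers demand."""
    return sum(p for x, d, p in zip(production, demand, prob) if x >= d)


def expected_shortage(production, demand, prob):
    """Probability-weighted shortage max(d - x, 0) over all scenarios.

    max(d - x, 0) mirrors what the AMPL operator `less` computes.
    """
    return sum(p * max(d - x, 0) for x, d, p in zip(production, demand, prob))


# Hypothetical data for one product over three scenarios.
prob = [0.25, 0.5, 0.25]
demand = [40, 100, 160]
production = [100, 100, 100]

rel = reliability(production, demand, prob)
short = expected_shortage(production, demand, prob)

print("reliability:", rel, "meets 60% target:", rel >= 0.6)
print("expected shortage:", short, "within limit of 100:", short <= 100)
```

With these figures the plan covers demand in the first two scenarios only, so a reliability target above 75% would be violated even though the expected shortage is small.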

Chapter 12

News Vendor Problem (Tutorial 3)

In this chapter we introduce deterministic and stochastic formulations for the News Vendor problem with single and multiple resources.

12.1 News Vendor Problem: Single Resource

Introduction

The Informer, a newspaper with a wide circulation, is interested in improving the efficiency of its distribution. Every day, the company needs to decide the optimal number $x$ of newspaper copies to print in order to maximise its profits. This, of course, depends on the expected demand $D$ for the newspaper during the next day. There is also a limit on the number of copies that can be printed, due to the capacity of the printing lines. We denote this limit by $M$. Each copy of the newspaper has a unit selling price $V$ = 90p and a unit printing cost $C$ = 81p. Unsold copies $o$ (i.e. copies printed in excess) represent a loss for the company. Moreover, if the number of copies printed is less than the demand, the company misses the opportunity to increase profits. We denote this shortfall by $u$. Shortfall and excess are only revealed once the demand is known, i.e. the day after the copies have been printed. In stochastic programming jargon, this means that $x$ is a first-stage variable, while $o$ and $u$ are second-stage variables. In this problem, the total profit of the company is given by:

$$\max \; (V - C)x - (V - C)u - V o \tag{12.1}$$

The number of copies that can be printed is limited by $M$. The following constraint expresses this bound:

$$x \le M$$

The constraint above is a first-stage constraint. The number of copies printed, the excess and shortfall of copies and the demand are linked by the following balance constraint:

$$x + u - o = D$$

This constraint can only be verified once the demand is revealed, that is, in the second stage. It is therefore a second-stage constraint.

Model Summary

The news vendor problem with a single resource is summarised below:

$$\max \; (V - C)x - (V - C)u - V o \tag{12.2}$$

subject to:

$$x \le M$$

$$x + u - o = D$$

Expected Value Problem in AMPL

The demand $D$ is unknown and is therefore a stochastic parameter. Let us consider $|S|$ different possible future scenarios for the demand, which is redefined as $D_s$, $s \in S$. Each scenario $s$ is assigned a probability $p_s$, where $\sum_{s \in S} p_s = 1$. To formulate the expected value problem, we calculate the expected value of the demand as $D = \sum_{s \in S} p_s D_s$. The decision variables of the problem are $x$, $o$ and $u$. We define these decision variables in AMPL as:

    var x >= 0;
    var u >= 0;
    var o >= 0;

The expected demand (given an arbitrary value of 250), together with the selling price V, the printing cost C and the maximum number of printable copies M, are defined in AMPL as follows:

    param V := 0.9;
    param C := ;
    param D := 250;
    param M := 1000;

The objective function and constraints are expressed as:

    maximize profit: (x * (V - C)) - (u * (V - C) + V * o);

    subject to Limit: x <= M;
    subject to Balance: x + u - o = D;

To summarise, the problem above can be implemented in AMPL as follows.

Listing 12.1: Informer model: expected value formulation in AMPL

    param V := 0.9;
    param C := ;
    param D := 250;
    param M := 1000;

    var x >= 0;
    var u >= 0;
    var o >= 0;

    maximize profit: (x * (V - C)) - (u * (V - C) + V * o);

    subject to Limit: x <= M;
    subject to Balance: x + u - o = D;

A Stochastic Formulation

For the stochastic formulation, we include the set of scenarios $S$ as a new dimension of the demand and of the second-stage variables $o$ and $u$. The stochastic algebraic formulation is given by:

$$\max \; (V - C)x - \sum_{s \in S} p_s \left( (V - C)u_s + V o_s \right) \tag{12.3}$$

subject to:

$$x \le M$$

$$x + u_s - o_s = D_s, \quad \forall s \in S$$

A Stochastic Formulation in AMPL

The stochastic formulation in AMPL is given by:

Listing 12.2: Informer model: stochastic formulation in AMPL

    param NS := 10;
    set scenario := 1..NS;

    param V := 0.9;
    param C := ;
    param M := 1000;
    param prob{scenario} := 1/NS;
    param D{scenario};

    var x >= 0;
    var u{s in scenario} >= 0;
    var o{s in scenario} >= 0;

    maximize profit:
        (x * (V - C))
        - sum{s in scenario} prob[s] * (u[s] * (V - C) + V * o[s]);

    subject to Limit: x <= M;
    subject to Balance{s in scenario}: x + u[s] - o[s] = D[s];

Notice that D is declared but not given any values in the listing above, so a separate data file with the actual values has to be supplied.

A Stochastic Formulation in SAMPL

The SAMPL stochastic programming version of the Informer model requires the definition of the scenario dimension. This can be done with the following declaration:

    scenarioset scen;

We have seen that the demand parameter D is now a (random) vector indexed over the scenario set. The declaration of D is removed from the DATA section. In SAMPL, the random parameters of the problem are declared in the RANDOM section. Vector D can be declared as follows:

    random param D{scen};

We declare an equally weighted probability vector P[s] as follows:

    probability param P{scen} = 1 / card(scen);

The Informer model is a two-stage recourse model. The scenario tree for a two-stage problem can be defined very easily using the extended AMPL constructs provided by SAMPL:

    tree thetree := twostage{2};

For more information about the declaration of the tree structure associated with a stochastic programming model, see the corresponding section of this guide.

To summarise, the stochastic programming version of the Informer model is formulated as follows:

Listing 12.3: Informer model: stochastic formulation in SAMPL

    scenarioset scen :=      # to account for 10 scenarios
        ;
    tree thetree := twostage{2};

    param V := 0.9;
    param C := ;
    param M := 1000;

    random param D{scen};
    probability param P{scen} := 1 / card(scen);

    var x >= 0, suffix stage 1;
    var u{s in scen} >= 0, suffix stage 2;
    var o{s in scen} >= 0, suffix stage 2;

    maximize profit:
        sum{s in scen} P[s] * (x * (V - C) - (u[s] * (V - C) + V * o[s]));

    subject to Limit: x <= M;
    subject to Balance{s in scen}: x + u[s] - o[s] = D[s];

12.2 News Vendor Problem: Multiple Resources

In this section we define an enhanced News Vendor Problem, which extends the classical problem: we (a) introduce two resources (magazines) and (b) place the problem in a multi-period, that is, temporal setting. A chain of Retail Stationers (RS), classically known as News Vendors, buys magazines from a publisher who produces two monthly magazines (Espresso and Panorama) with a wide circulation.

The weekly demand for these magazines can be viewed as a random process that is partly predictable from factors such as the importance of the headline story. The RS needs to make purchase and related inventory decisions in the face of uncertain demand. The objective is to maximise the expected profit, which depends on purchase cost, resale price, excess and shortfall. At the end of every month the RS can sell excess unsold magazines for a small price to some hotel outlets for their coffee tables, and so recover some residual value. Further data for the problem are given in summary form as available budget per month, sell price, residual price and purchase cost. The sell price and purchase cost may vary with time. As a consequence, the process may result in an excess inventory or a shortfall of magazines. The purchase decisions are made at the beginning of each week, while sales take place throughout the week. The RS has the additional restriction that its weekly purchase quantity cannot differ by more than 10% from the quantity ordered in the previous week. As opposed to the news vendor problem with a single resource, in this problem we have multiple time periods. Thus, we have to explicitly include time in the model. We define:

Sets                Indices     Description
$T = \{1, ..., 4\}$   $t \in T$   weeks in our time horizon
$N = \{1, 2\}$        $i \in N$   set of magazines
$S$                 $s \in S$   set of future scenarios

We also define the following parameters:

Parameters
$B_t$       budget per week, $t = 1, ..., T-1$
$C_{it}$    purchase cost
$V_{it}$    sell price
$R_{it}$    residual price
$U_{it}$    penalty for not meeting the demand
$D_{its}$   demand

Demand is uncertain and hence defined as a random parameter $D_{its}$, $i \in N$, $t \in T$, $s \in S$. We also assign each scenario $s$ a probability $p_s$, where $\sum_{s \in S} p_s = 1$. For our algebraic formulation we consider a deterministic demand by computing the expected demands $D_{it} = \sum_{s \in S} p_s D_{its}$.
The purchase cost $C_{it}$, sell price $V_{it}$ and residual price $R_{it}$ are defined in Table 12.1.

Week    Panorama                          Espresso
        $C_{1t}$   $V_{1t}$   $R_{1t}$    $C_{2t}$   $V_{2t}$   $R_{2t}$

Table 12.1: Purchase cost, sell price and residual prices

The penalty for not meeting the demand is defined as $U_{it} = 1.5\, C_{it}$. We then define the following decision variables:

$x_{it}$   Purchase quantity over time $t$ and magazine $i$
$u_{it}$   Shortfall over time $t$ and magazine $i$
$o_i$      Excess (unsold copies) over magazine $i$
$y_{it}$   Inventory over time $t$ and magazine $i$

The objective function of the problem is defined as:

$$\max \sum_{i \in N} \left( \sum_{t=1}^{T-1} (V_{it} - C_{it})(x_{it} - y_{it}) + \sum_{t=2}^{T} (V_{it} - C_{i,t-1})\, y_{i,t-1} - (C_{i,T-1} - R_{iT})\, o_i - \sum_{t \in T} U_{it} u_{it} \right) \tag{12.4}$$

In this approach we consider the objective to be made of three parts:

1. the addition of the potential profit for totally meeting the demand, composed of the first two inner summations. They represent the amount purchased plus the inventory from the previous time period, less the amount left as inventory for the current time period,
2. the subtraction of the cost of any excess magazines less the revenue from selling them to the hotel outlets, and
3. the subtraction of the penalty for failing to meet the demand.

The model is subject to the following constraints:

$$\sum_{i \in N} C_{it} x_{it} \le B_t, \quad t = 1, ..., T-1 \tag{12.5}$$

The constraints above ensure that we stay within budget during the periods when we purchase magazines.

$$0.9\, x_{i,t-1} \le x_{it} \le 1.1\, x_{i,t-1}, \quad \forall i \in N, \; t = 2, ..., T-1 \tag{12.6}$$

These constraints enforce the publisher's rule of smooth production orders.

$$x_{i1} = D_{i1} + y_{i1} - u_{i1}, \quad \forall i \in N \tag{12.7}$$

$$x_{it} + y_{i,t-1} = D_{it} + y_{it} - u_{it}, \quad \forall i \in N, \; t = 2, ..., T-1 \tag{12.8}$$

The constraints above are the inventory balance constraints for the first three weeks of the month. They allow for shortfalls in meeting demand ($u$) while also enabling the RS to store some of the magazines at the end of each week. This is particularly important in week three, since this stored quantity will be used to meet week four's demand. In addition, the storage also helps in meeting the publisher's smooth production rule.

$$y_{i,T-1} = D_{iT} + o_i - u_{iT}, \quad \forall i \in N \tag{12.9}$$

This group of constraints details how the final week's demand is met, and captures the excess magazines to be sold to the hotels.

$$x_{it}, u_{it}, y_{it} \ge 0, \quad \forall i \in N, t \in T \tag{12.10}$$

$$o_i \ge 0, \quad \forall i \in N \tag{12.11}$$
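The balance constraints (12.7) to (12.9) and the smoothing rule (12.6) can be traced by hand for a single magazine. The Python sketch below does so under an assumption that is not part of the model itself: in each week, demand is served as far as stock allows, so inventory and shortfall are never both positive. The function names and all quantities are hypothetical.

```python
def simulate_magazine(purchases, demand):
    """Roll the inventory balance forward for one magazine.

    purchases: x_1 .. x_{T-1} (no purchase is made in the final week)
    demand:    D_1 .. D_T
    Returns (inventory y_1..y_{T-1}, shortfall u_1..u_T, excess o).
    Assumes demand is served greedily from the available stock.
    """
    if len(demand) != len(purchases) + 1:
        raise ValueError("expected one more demand value than purchases")
    inventory, shortfall = [], []
    carried = 0  # stock carried over from the previous week
    for x, d in zip(purchases, demand):
        available = x + carried
        carried = max(available - d, 0)
        inventory.append(carried)
        shortfall.append(max(d - available, 0))
    # Final week: only carried stock is available, leftovers become excess.
    final_demand = demand[-1]
    excess = max(carried - final_demand, 0)
    shortfall.append(max(final_demand - carried, 0))
    return inventory, shortfall, excess


def is_smooth(purchases, tolerance=0.1):
    """Check that each purchase is within +/- tolerance of the previous one."""
    return all(
        (1 - tolerance) * prev <= cur <= (1 + tolerance) * prev
        for prev, cur in zip(purchases, purchases[1:])
    )


# Hypothetical four-week example for one magazine.
purchases = [100, 105, 110]
demand = [90, 130, 100, 5]

inventory, shortfall, excess = simulate_magazine(purchases, demand)
print(inventory, shortfall, excess, is_smooth(purchases))
```

In the optimisation model the solver chooses the purchases; the sketch only shows how inventory, shortfall and excess follow from them once demand is known.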

12.3 Model Summary

The news vendor problem with multiple resources and multiple time periods is summarised below:

$$\max \sum_{i \in N} \left( \sum_{t=1}^{T-1} (V_{it} - C_{it})(x_{it} - y_{it}) + \sum_{t=2}^{T} (V_{it} - C_{i,t-1})\, y_{i,t-1} - (C_{i,T-1} - R_{iT})\, o_i - \sum_{t \in T} U_{it} u_{it} \right) \tag{12.12}$$

subject to:

$$\sum_{i \in N} C_{it} x_{it} \le B_t, \quad t = 1, ..., T-1$$

$$0.9\, x_{i,t-1} \le x_{it} \le 1.1\, x_{i,t-1}, \quad \forall i \in N, \; t = 2, ..., T-1$$

$$x_{i1} = D_{i1} + y_{i1} - u_{i1}, \quad \forall i \in N$$

$$x_{it} + y_{i,t-1} = D_{it} + y_{it} - u_{it}, \quad \forall i \in N, \; t = 2, ..., T-1$$

$$y_{i,T-1} = D_{iT} + o_i - u_{iT}, \quad \forall i \in N$$

$$x_{it}, u_{it}, y_{it} \ge 0, \quad \forall i \in N, t \in T$$

$$o_i \ge 0, \quad \forall i \in N$$

12.4 Data Representation

In our example, we have $|N| = 2$ magazines and $|T| = 4$ time periods, and thus the expected demand is given by a two-dimensional matrix with a total of 8 values. The actual demand forecast is composed of $|S|$ scenarios, which turns the demand into a three-dimensional matrix. For example, if $|S| = 64$, the demand is represented as:

Demand
Scenario   Magazine   Time = 1     Time = 2     Time = 3     Time = 4
1          1          D_{1,1,1}    D_{1,2,1}    D_{1,3,1}    D_{1,4,1}
1          2          D_{2,1,1}    D_{2,2,1}    D_{2,3,1}    D_{2,4,1}
2          1          D_{1,1,2}    D_{1,2,2}    D_{1,3,2}    D_{1,4,2}
2          2          D_{2,1,2}    D_{2,2,2}    D_{2,3,2}    D_{2,4,2}
...        ...        ...          ...          ...          ...
64         1          D_{1,1,64}   D_{1,2,64}   D_{1,3,64}   D_{1,4,64}
64         2          D_{2,1,64}   D_{2,2,64}   D_{2,3,64}   D_{2,4,64}

Table 12.2: Demand for the news vendor problem, 64 scenarios

12.5 Expected Value Problem in AMPL

The AMPL deterministic formulation is given by:

Listing 12.4: RS model: expected value problem in AMPL

    param T;
    param N;

    set maga := 1..N;
    set time := 1..T;
    set time1 within {time} := 1..T-1;
    set time2 within {time} := 2..T-1;
    set time3 within {time} := 2..T;

    param C{maga, time};         # purchase price
    param V{maga, time};         # selling price
    param R{maga};               # resale value
    param penaltyRate;
    param penalty{maga, time};   # penalty for not meeting the demand
    param B{time1}, default 10000;   # budget
    param D{maga, time};         # demand

    var x{maga, time1} >= 0;     # purchase quantity
    var u{maga, time} >= 0;      # shortage
    var o{maga} >= 0;            # excess
    var y{maga, time1} >= 0;     # inventory

    maximize profit:
        ( sum{i in maga, t in time1} (V[i,t] - C[i,t]) * (x[i,t] - y[i,t])
        + sum{i in maga, t in time3} (V[i,t] - C[i,t-1]) * (y[i,t-1])
        - sum{i in maga, t in time} penalty[i,t] * u[i,t]
        - sum{i in maga} (C[i,T-1] - R[i]) * o[i] );

    subject to
    budget{t in time1}:
        sum{i in maga} (x[i,t] * C[i,t]) <= B[t];
    demand_upper{i in maga, t in time2}:
        x[i,t] <= 1.1 * x[i,t-1];
    demand_lower{i in maga, t in time2}:
        x[i,t] >= 0.9 * x[i,t-1];
    init_purchase{i in maga}:
        x[i,1] = D[i,1] + y[i,1] - u[i,1];
    demand_balance{i in maga, t in time2}:
        x[i,t] + y[i,t-1] = D[i,t] + y[i,t] - u[i,t];
    inventory_end{i in maga}:
        y[i,T-1] = D[i,T] + o[i] - u[i,T];

Observe that we defined three time subsets (time1, time2 and time3) to simplify the definition of the constraints and the objective function.
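To see which weeks each subset covers, the ranges can be enumerated for the four-week horizon used in this chapter. The short Python sketch below only mirrors the AMPL range expressions; the function name `time_subsets` is a hypothetical helper introduced for this illustration.

```python
def time_subsets(T):
    """Enumerate the time sets declared in the AMPL model for horizon T."""
    time = list(range(1, T + 1))   # 1..T   : all weeks
    time1 = list(range(1, T))      # 1..T-1 : weeks with purchases and inventory
    time2 = list(range(2, T))      # 2..T-1 : weeks with a previous purchase
    time3 = list(range(2, T + 1))  # 2..T   : weeks that can use carried stock
    return time, time1, time2, time3


time, time1, time2, time3 = time_subsets(4)
print(time, time1, time2, time3)
```

For T = 4, the smoothing and balance constraints indexed over time2 therefore apply to weeks 2 and 3 only, while week 4 is handled by the separate end-of-month constraint.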
Listing 12.5 provides an example of the corresponding data file, which includes made-up demand values:

Listing 12.5: RS model: example data file

    param T := 4;
    param N := 2;

    param penaltyRate := 1.5;

    param C : := ;

    param V : := ;

    param R := ;

    param D : := ;

    for {t in 1..T} {
        for {i in 1..N} {
            let penalty[i,t] := penaltyRate * C[i,t];
        }
    }

12.6 A Two-Stage Stochastic Formulation

In this section we present a two-stage stochastic formulation of the news vendor problem with multiple resources. We consider the first stage to be time $t = 1$, while the second stage is composed of the time intervals $t > 1$. We can think of this as making a decision for time $t = 1$ under uncertainty and taking subsequent corrective actions in the following time periods. Since this problem has multiple time periods, it could benefit from a multi-stage formulation (instead of a two-stage one); for the sake of this tutorial, however, we will treat it as two-stage. For the stochastic formulation we include the scenario set $S$, and redefine the decision variables accordingly:

$x_{its}$   Purchase quantity over time $t$, magazine $i$ and scenario $s$
$u_{its}$   Shortfall over time $t$, magazine $i$ and scenario $s$
$o_{is}$    Excess (unsold copies) over magazine $i$ and scenario $s$
$y_{its}$   Inventory over time $t$, magazine $i$ and scenario $s$

Before the demand for the first time period is revealed, we have to decide how many magazines we are going to buy at time 1. Therefore, the $x$ variables for time $t = 1$ are first-stage variables. The number of magazines in inventory and the shortfall at $t = 1$ depend on the demand, so they are second-stage variables. All the remaining variables for $t > 1$ are also second-stage variables. The stochastic formulation is given by:

$$\max \sum_{s \in S} p_s \sum_{i \in N} \left( \sum_{t=1}^{T-1} (V_{it} - C_{it})(x_{its} - y_{its}) + \sum_{t=2}^{T} (V_{it} - C_{i,t-1})\, y_{i,t-1,s} - (C_{i,T-1} - R_{iT})\, o_{is} - \sum_{t \in T} U_{it} u_{its} \right)$$

subject to:

$$\sum_{i \in N} C_{it} x_{its} \le B_t, \quad t = 1, ..., T-1, \; \forall s \in S$$

$$0.9\, x_{i,t-1,s} \le x_{its} \le 1.1\, x_{i,t-1,s}, \quad \forall i \in N, \; t = 2, ..., T-1, \; \forall s \in S$$

$$x_{i1s} = D_{i1s} + y_{i1s} - u_{i1s}, \quad \forall i \in N, s \in S$$

$$x_{its} + y_{i,t-1,s} = D_{its} + y_{its} - u_{its}, \quad \forall i \in N, \; t = 2, ..., T-1, \; \forall s \in S$$

$$y_{i,T-1,s} = D_{iTs} + o_{is} - u_{iTs}, \quad \forall i \in N, s \in S$$

We impose the two-stage structure on the model by forcing the first-stage decisions to be identical in every scenario. Notice that the only first-stage variables are $x_{i1s}$.

$$x_{i11} = x_{i1s}, \quad \forall i \in N, \; s = 2, ..., |S|$$

The constraint above is known as an explicit non-anticipativity constraint. Finally we specify the nonnegativity of the variables:

$$x_{its}, u_{its}, y_{its} \ge 0, \quad \forall i \in N, t \in T, s \in S$$

$$o_{is} \ge 0, \quad \forall i \in N, s \in S$$

12.7 A Two-Stage Stochastic Formulation in AMPL

The two-stage stochastic formulation in AMPL is defined as:

Listing 12.6: RS model: two-stage stochastic formulation in AMPL

    param T;
    param N;
    param S;

    set maga := 1..N;
    set time := 1..T;
    set time1 within {time} := 1..T-1;
    set time2 within {time} := 2..T-1;
    set time3 within {time} := 2..T;
    set scen := 1..S;
    set scen_1 := 1..S-1;

    param pi{scen} >= 0, default 1/S;   # equal probability
    param C{maga, time};         # purchase price
    param V{maga, time};         # selling price
    param R{maga};               # resale value
    param penaltyRate;
    param penalty{maga, time};   # penalty for not meeting the demand
    param B{time1}, default 10000;   # budget
    param D{maga, time, scen};   # demand

    var x{maga, time1, scen} >= 0;   # purchase quantity
    var u{maga, time, scen} >= 0;    # shortage
    var o{maga, scen} >= 0;          # excess
    var y{maga, time1, scen} >= 0;   # inventory

    maximize profit:
        sum{s in scen} pi[s] * (
          sum{i in maga, t in time1} (V[i,t] - C[i,t]) * (x[i,t,s] - y[i,t,s])
        + sum{i in maga, t in time3} (V[i,t] - C[i,t-1]) * (y[i,t-1,s])

        - sum{i in maga, t in time} penalty[i,t] * u[i,t,s]
        - sum{i in maga} (C[i,T-1] - R[i]) * o[i,s] );

    subject to
    budget{t in time1, s in scen}:
        sum{i in maga} (x[i,t,s] * C[i,t]) <= B[t];
    demand_upper{i in maga, t in time2, s in scen}:
        x[i,t,s] <= 1.1 * x[i,t-1,s];
    demand_lower{i in maga, t in time2, s in scen}:
        x[i,t,s] >= 0.9 * x[i,t-1,s];
    init_purchase{i in maga, s in scen}:
        x[i,1,s] = D[i,1,s] + y[i,1,s] - u[i,1,s];
    demand_balance{i in maga, t in time2, s in scen}:
        x[i,t,s] + y[i,t-1,s] = D[i,t,s] + y[i,t,s] - u[i,t,s];
    inventory_end{i in maga, s in scen}:
        y[i,T-1,s] = D[i,T,s] + o[i,s] - u[i,T,s];
    two_stage{i in maga, s in scen_1}:
        x[i,1,s] = x[i,1,s+1];

12.8 A Two-Stage Stochastic Formulation in SAMPL

The equivalent formulation in SAMPL is given in Listing 12.7. We make use of the same structure as the SAMPL formulation for the news vendor problem with a single resource, given in Listing 12.3.

Listing 12.7: RS model: two-stage stochastic formulation in SAMPL

    param T = 4;
    param N = 2;
    param S;

    scenarioset scen := 1..S;
    tree thetree := twostage{2};

    set maga := 1..N;
    set time := 1..T;
    set time1 within {time} := 1..T-1;
    set time2 within {time} := 2..T-1;
    set time3 within {time} := 2..T;

    probability param pi{scen} >= 0, default 1/S;
    param C{maga, time};         # purchase price
    param V{maga, time};         # selling price
    param R{maga};               # resale value
    param penaltyRate;
    param penalty{maga, time};   # penalty for not meeting the demand
    param B{time1}, default 10000;   # budget
    random param D{maga, time, scen};

    var x{maga, t in time1, scen} >= 0, suffix stage if t=1 then 1 else 2;

112 CHAPTER 12. NEWS VENDOR PROBLEM (TUTORIAL 3) 106 var u{maga, t i n time, scen } >= 0, s u f f i x s t a g e 2 ; var o{maga, scen } >= 0, s u f f i x s t a g e 2 ; var y{maga, t i n time1, scen } >= 0, s u f f i x s t a g e 2 ; maximize p r o f i t : sum{ s i n scen } p i [ s ] ( sum{ i i n maga, t i n time1 } (V[ i, t ] C [ i, t ] ) ( x [ i, t, s ] y [ i, t, s ] ) + sum{ i i n maga, t i n time3 } (V[ i, t ] C [ i, t 1]) ( y [ i, t 1, s ] ) sum{ i i n maga, t i n time } p e n a l t y [ i, t ] u [ i, t, s ] sum{ i i n maga} (C [ i, T 1] R [ i ] ) o [ i, s ] ) ; subject to budget { t i n time1, s i n scen } : sum{ i i n maga} ( x [ i, t, s ] C [ i, t ] ) <= B[ t ] ; demand_upper{ i i n maga, t i n time2, s i n scen } : x [ i, t, s ] <= 1. 1 x [ i, t 1, s ] ; demand_lower{ i i n maga, t i n time2, s i n scen } : x [ i, t, s ] >= 0. 9 x [ i, t 1, s ] ; i n i t _ p u r c h a s e { i i n maga, s i n scen } : x [ i, 1, s ] = D[ i, 1, s ] + y [ i, 1, s ] u [ i, 1, s ] ; demand_balance { i i n maga, t i n time2, s i n scen } : x [ i, t, s ] + y [ i, t 1, s ]= D[ i, t, s ] + y [ i, t, s ] u [ i, t, s ] ; invent_end { i i n maga, s i n scen } : y [ i, T 1, s ]= D[ i, T, s ] + o [ i, s ] u [ i, T, s ] ; Observe that we define the variables stages at their declaration, and there is no need to add explicit non-anticipativity constraints. The data for the above model has to be defined in a separate file A Chance Constraint Formulation in AMPL Assume that the news vendor wishes to meet the demand with a reliability level of 90% in the future. In this section, we reformulate the News Vendor model to include chance constraints to represent this new requirement. Let 0 R 1 be the desired reliability. In the CC deterministic formulation, we have to add binary variables to correct represent the chance constraint. We define: s its to be 1 if a shortage of magazine i happens at time t and scenario s, 0 otherwise. 
We add the following constraints to the two-stage stochastic model to ensure that we meet the required reliability level:

$$\sum_{s \in S} p_s\, s_{its} \le 1 - R, \quad i \in N,\ t \in T \tag{12.13}$$

The problem with the constraints above is that there is nothing guaranteeing that the $s$ variables will be one in case a shortage happens. We have to add other constraints to guarantee that:

$$D_{i1s} - x_{i1s} \le M s_{i1s}, \quad i \in N,\ s \in S \tag{12.14}$$

$$D_{its} - x_{its} - y_{i,t-1,s} \le M s_{its}, \quad i \in N,\ s \in S,\ t = 2, \dots, T-1 \tag{12.15}$$

$$D_{iTs} - y_{i,T-1,s} \le M s_{iTs}, \quad i \in N,\ s \in S \tag{12.16}$$

Here $M$ is a constant that is large enough to be greater than all demands $D_{its}$. The AMPL code for the deterministic CC formulation (where we use shortage to represent the binary variables $s_{its}$) is given as:

Listing 12.8: RS model: CC formulation in AMPL

    # The declaration of the remaining parameters and sets is
    # the same as for the stochastic model in AMPL
    param M := 10000;
    param Reliability := 0.9;

    var x{maga, time1, scen} >= 0;      # purchase quantity
    var u{maga, time, scen} >= 0;       # shortage
    var o{maga, scen} >= 0;             # excess
    var y{maga, time1, scen} >= 0;      # inventory
    var shortage{maga, time, scen} binary;

    # The objective function is the same as for the
    # stochastic model in AMPL

    subject to satdemandt1{i in maga, s in scen}:
      D[i,1,s] - x[i,1,s] <= M * shortage[i,1,s];
    satdemand{i in maga, t in time2, s in scen}:
      D[i,t,s] - x[i,t,s] - y[i,t-1,s] <= M * shortage[i,t,s];
    satdemandt{i in maga, s in scen}:
      D[i,T,s] - y[i,T-1,s] <= M * shortage[i,T,s];
    cc{t in time, i in maga}:
      sum{s in scen} pi[s] * shortage[i,t,s] <= 1 - Reliability;

    # The remaining constraints are the same as
    # for the stochastic model in AMPL

12.10 A Chance Constraint Formulation in SAMPL

When implementing CCs in SAMPL, there is no need to define additional binary variables. There is also no need to define constraints similar to (12.14)-(12.16), as this is handled by SAMPL itself. In the SAMPL formulation, we remove the shortage variables and update the objective function. We also define the chance constraints as given in the listing below.
Listing 12.9: RS model: CC formulation in SAMPL

    # The declaration of the remaining parameters and sets is
    # the same as for the stochastic model in SAMPL
    param Reliability = 0.9;

    # The declaration of the other variables is the same
    # as for the stochastic model in SAMPL
    var sumprod{maga, time2, scen} >= 0, suffix stage 2;

    # The objective function is the same as for the
    # stochastic model in SAMPL

    subject to cc_time1{i in maga}:
      probability{s in scen: x[i,1,s] >= D[i,1,s]} >= Reliability;
    cc_t{i in maga}:
      probability{s in scen: y[i,T-1,s] >= D[i,T,s]} >= Reliability;
    cc_time{i in maga, t in time2}:
      probability{s in scen: sumprod[i,t,s] >= D[i,t,s]} >= Reliability;
    socccanwork{i in maga, t in time2, s in scen}:
      sumprod[i,t,s] = x[i,t,s] + y[i,t-1,s];

    # The remaining constraints are the same as
    # for the stochastic model in SAMPL

Notice that we defined temporary variables sumprod, which are equal to the sum of x and y. This works around a known SAMPL weakness; please refer to Appendix A for more information.

12.11 An Integrated Chance Constraint Formulation in AMPL

The ICCP approach considers the news vendor problem to be feasible if the expected violation of the shortfall is less than a predefined value. To define the deterministic ICCP formulation of the news vendor problem we can make use of the $u$ variables, which already define the shortfall in our model. Let $0 \le U \le 1$ be the maximum predefined shortfall.
We add the following integrated chance constraints in their deterministic form:

$$p_s\, u_{its} \le U, \quad i \in N,\ t \in T,\ s \in S \tag{12.17}$$

The AMPL code (where we use Shortfall to represent parameter $U$) for the deterministic ICC formulation is given as:

Listing 12.10: RS model: ICC formulation in AMPL

    # The declaration of the remaining parameters and sets is
    # the same as for the stochastic model in AMPL
    param Shortfall = 3000;

    # The declaration of variables is the same as for the
    # stochastic model in AMPL

    # The objective function is the same as for the
    # stochastic model in AMPL

    subject to icc{i in maga, s in scen, t in time}:
      pi[s] * u[i,t,s] <= Shortfall;

    # The remaining constraints are the same as
    # for the stochastic model in AMPL

12.12 An Integrated Chance Constraint Formulation in SAMPL

The ICC formulation in SAMPL for the news vendor problem is given below.

Listing 12.11: RS model: ICC formulation in SAMPL

    # The declaration of the remaining parameters and sets is
    # the same as for the stochastic model in SAMPL
    param Shortfall = 3000;

    # The declaration of variables is the same as for the
    # stochastic model in SAMPL

    # Temporary variable
    param temp{maga, time, scen};
    for {i in maga} {
      for {t in time} {
        for {s in scen} {
          let temp[i,t,s] := 0;
        }
      }
    }

    # The objective function is the same as for the
    # stochastic model in SAMPL

    subject to icc{i in maga, t in time}:
      expectation{s in scen}(u[i,t,s] less temp[i,t,s]) <= Shortfall;

    # The remaining constraints are the same as
    # for the stochastic model in SAMPL

Observe that we defined a temporary parameter temp, set it to zero, and included it in the integrated chance constraints. This is due to a known SAMPL weakness. More details can be found in Appendix A.
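The two ICC listings bound different quantities. Constraint (12.17) limits each scenario's weighted shortage $p_s u_{its}$ on its own, whereas the `expectation` construct in the SAMPL listing, as we read it, limits the probability-weighted sum over all scenarios. The Python sketch below is not part of AMPLDev or SAMPL. Its function names and sample numbers are illustrative assumptions, and it only evaluates both conditions for a single magazine and period so that the difference can be seen on a small example.

```python
# Illustrative check of the two integrated-chance-constraint forms.
# All names and numbers are hypothetical and chosen only for this example.

def expected_shortfall(probs, shortages):
    """Probability-weighted shortage: sum over s of p_s * max(u_s, 0)."""
    return sum(p * max(u, 0.0) for p, u in zip(probs, shortages))


def per_scenario_ok(probs, shortages, limit):
    """Form of (12.17): every single term p_s * u_s must stay within the limit."""
    return all(p * u <= limit for p, u in zip(probs, shortages))


def expectation_ok(probs, shortages, limit):
    """Expectation form: the weighted sum over all scenarios must stay within the limit."""
    return expected_shortfall(probs, shortages) <= limit


probs = [0.25, 0.25, 0.25, 0.25]           # equal scenario probabilities
shortages = [0.0, 4000.0, 8000.0, 2000.0]  # u[i,t,s] for one magazine i and period t
limit = 3000.0                             # plays the role of Shortfall

print(expected_shortfall(probs, shortages))      # 3500.0
print(per_scenario_ok(probs, shortages, limit))  # True
print(expectation_ok(probs, shortages, limit))   # False
```

With these numbers every individual term stays below the limit, yet the expected shortage does not, so the per-scenario form is the weaker of the two conditions here.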

Chapter 13

Supply Chain (Tutorial 5)

In this chapter we introduce the fifth tutorial, a Supply Chain Management problem, also known as a Production/Transportation Model.

13.1 Introduction

A clothing manufacturer produces goods in two factories: Southall and Leeds. The manufacturer supplies three main retailers, called dealers (London, Paris and Wien), with three products: Shirts, Skirts and Jeans. The products are shipped to the dealers in quantities of tens of thousands. The manufacturer can carry over inventory from one period to the next in the factories. A link or inventory variable is introduced which represents the amount of products transferred from one period to the next. The manufacturer knows the production, transportation and inventory costs for the next month with certainty. For simplicity it is assumed that all costs and the production and inventory capacities are known and constant over the time horizon, whereas the dealer requirements can vary for each time period (e.g. for seasonal reasons). Production rates and the initial inventory of each product at each factory are known as well. A shortage penalty is incurred if the demand is not met. Figure 13.1 shows the flow of products in the logistics network.

Figure 13.1: Product flow in logistics network

For simplicity we assume that uncertainty only occurs in the demand for the products in each of the future time periods $T = \{1, 2, 3, 4\}$. If we assume that we know the expected value of the demands, then we can formulate a dynamic linear programming problem, which is a deterministic expected value model.

13.2 Algebraic Formulation

In this section we present the algebraic formulation of the supply chain problem. We begin by introducing some notation. Let $F$, $P$ and $D$ be respectively the sets of factories, products and dealers. We have several costs associated with these three sets. Let $PC_{pf}$ be the cost of producing one unit of product $p$ at factory $f$, $TC_{fd}$ be the cost of transporting one unit of product from factory $f$ to dealer $d$, $IC_{pf}$ be the cost of holding one unit of product $p$ at factory $f$, and $PE_{pd}$ be the cost (penalty) of not meeting demand for product $p$ at dealer $d$.

The production and inventory capacities are limited, hence we have to introduce two new parameters. Let $PQ_{pf}$ be the production capacity of product $p$ at factory $f$, and $IQ_{pf}$ be the storage capacity of product $p$ at factory $f$. The initial inventory may not be empty, so we define $I_{pf}$ as the amount of product $p$ stored at factory $f$ before the first time period. We also introduce $Q_{pdt}$ as the requirement (demand) of dealer $d$ for product $p$ at time $t$.

The demand is not known a priori and its uncertainty is represented as a discrete set of future scenarios $S$, where each scenario $s$ is associated with a probability $prob_s$, and $\sum_{s \in S} prob_s = 1$. We convert the uncertain demand into a deterministic demand by calculating the expected value of the demand from dealer $d$ for product $p$ at time $t$ as $Q_{pdt} = \sum_{s \in S} prob_s\, Q_{pdts}$.

We then have the following decision variables: $p_{pft}$ is the amount of product $p$ manufactured at factory $f$ at time $t$, $s_{pfdt}$ is the amount of product $p$ sent from factory $f$ to dealer $d$ at time $t$, $h_{pft}$ is the amount of product $p$ held in inventory at factory $f$ at time $t$, and $x_{pdt}$ is the amount of shortage of product $p$ at dealer $d$ at time $t$.
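The conversion of scenario demands into a single expected demand is a probability-weighted average over the scenarios. The following Python sketch illustrates the computation outside AMPL. The dictionary layout, the function name `expected_demand` and the sample figures are assumptions made for the example and are not taken from the tutorial data.

```python
from collections import defaultdict


def expected_demand(prob, demand):
    """Return {(p, d, t): sum over s of prob[s] * demand[p, d, t, s]}."""
    expected = defaultdict(float)
    for (p, d, t, s), q in demand.items():
        expected[(p, d, t)] += prob[s] * q
    return dict(expected)


# Hypothetical two-scenario example; the probabilities must sum to one.
prob = {1: 0.5, 2: 0.5}
demand = {
    ("Jeans", "London", 1, 1): 10.0,
    ("Jeans", "London", 1, 2): 20.0,
    ("Jeans", "Paris", 1, 1): 4.0,
    ("Jeans", "Paris", 1, 2): 8.0,
}

q_bar = expected_demand(prob, demand)
print(q_bar[("Jeans", "London", 1)])  # 15.0
print(q_bar[("Jeans", "Paris", 1)])   # 6.0
```

The result is indexed by (product, dealer, time) only, which matches the three-dimensional demand used by the expected value model.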
The whole set of parameters and variables is summarised in Table 13.1.

| Type | Notation | Description | Range / Dimensions |
|---|---|---|---|
| Indices (sets) | F | Factories | |
| Indices (sets) | P | Products | |
| Indices (sets) | D | Dealers | |
| Indices (sets) | T | Time intervals | |
| Parameters (data) | PC | Cost of producing a Product at a certain Factory | P, F |
| Parameters (data) | PQ | Production capacity of Product at Factory | P, F |
| Parameters (data) | TC | Cost of transporting one unit of product from Factory to Dealer | F, D |
| Parameters (data) | PE | Cost of not meeting demand for Product at Dealer | P, D |
| Parameters (data) | I | Initial inventory of Product at Factory | P, F |
| Parameters (data) | IC | Cost of holding one unit of Product at Factory | P, F |
| Parameters (data) | IQ | Storing capacity of Product at Factory | P, F |
| Parameters (data) | Q | Expected Requirement of Dealer for Product at Time | P, D, T |
| Variables | p | Amount of Product manufactured at Factory at Time | P, F, T |
| Variables | s | Amount of Product sent from Factory to Dealer at Time | P, F, D, T |
| Variables | h | Amount of Product held at inventory at Factory at Time | P, F, T |
| Variables | x | Amount of shortage of Product at Dealer at Time | P, D, T |

Table 13.1: Supply Chain Model Entities

Then our objective function is defined as the minimisation of all costs involved in the logistics network:


$$\min \sum_{p \in P}\sum_{f \in F}\sum_{t \in T} PC_{pf}\, p_{pft} + \sum_{p \in P}\sum_{f \in F}\sum_{d \in D}\sum_{t \in T} TC_{fd}\, s_{pfdt} + \sum_{p \in P}\sum_{f \in F}\sum_{t \in T} IC_{pf}\, h_{pft} + \sum_{p \in P}\sum_{d \in D}\sum_{t \in T} PE_{pd}\, x_{pdt} \tag{13.1}$$

The supply chain model is subject to several constraints. First, we have to guarantee that the sum of all products sent to a certain dealer plus its shortage is equal to its demand:

$$\sum_{f \in F} s_{pfdt} + x_{pdt} = Q_{pdt}, \quad p \in P,\ d \in D,\ t \in T \tag{13.2}$$

The next constraints are the Inventory Balance constraints. These are necessary for consistency, as they guarantee that, at time $t$, the sum of products that are sent to dealers plus products that are chosen to be stored is equal to what was stored before time $t$ plus what is produced at time $t$. These are summarised in Figure 13.2.

Figure 13.2: Inventory balance constraint

The inventory balance constraints are given as:

$$p_{pf1} + I_{pf} = h_{pf1} + \sum_{d \in D} s_{pfd1}, \quad p \in P,\ f \in F \tag{13.3}$$

$$p_{pft} + h_{pf,t-1} = h_{pft} + \sum_{d \in D} s_{pfdt}, \quad p \in P,\ f \in F,\ t = 2, \dots, T \tag{13.4}$$

We cannot hold more than the inventory capacity, hence we include the following constraints:

$$h_{pft} \le IQ_{pf}, \quad p \in P,\ f \in F,\ t \in T \tag{13.5}$$

Similarly, we cannot produce more items than the capacity of a factory allows, so we also add the following restrictions:

$$p_{pft} \le PQ_{pf}, \quad p \in P,\ f \in F,\ t \in T \tag{13.6}$$

Finally, we add the nonnegativity constraints on all decision variables:

$$p_{pft},\ h_{pft} \ge 0, \quad p \in P,\ f \in F,\ t \in T \tag{13.7}$$

$$s_{pfdt} \ge 0, \quad p \in P,\ f \in F,\ d \in D,\ t \in T$$

$$x_{pdt} \ge 0, \quad p \in P,\ d \in D,\ t \in T$$
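To see how the inventory balance (13.3)-(13.4) and the capacity limits (13.5)-(13.6) interact, it can help to trace one product at one factory by hand. The Python sketch below does this for a hypothetical production and shipping plan. The function names and all numbers are assumptions for illustration only. It checks a given plan and does not optimise anything.

```python
def inventory_levels(initial_inv, produce, shipped):
    """Apply (13.3)-(13.4): h_t = p_t + h_{t-1} - (total shipped in t), with h_0 = initial_inv."""
    levels = []
    previous = initial_inv
    for p_t, s_t in zip(produce, shipped):
        previous = p_t + previous - s_t
        levels.append(previous)
    return levels


def plan_is_feasible(initial_inv, produce, shipped, prod_capacity, inv_capacity):
    """Check nonnegativity (13.7) and the capacity limits (13.5)-(13.6) for one product and factory."""
    levels = inventory_levels(initial_inv, produce, shipped)
    return (
        all(0 <= p <= prod_capacity for p in produce)
        and all(s >= 0 for s in shipped)
        and all(0 <= h <= inv_capacity for h in levels)
    )


# Hypothetical four-period plan; `shipped` is already summed over all dealers.
produce = [10, 10, 0, 5]
shipped = [8, 12, 3, 5]

print(inventory_levels(5, produce, shipped))                                    # [7, 5, 2, 2]
print(plan_is_feasible(5, produce, shipped, prod_capacity=10, inv_capacity=7))  # True
print(plan_is_feasible(5, produce, shipped, prod_capacity=10, inv_capacity=6))  # False
```

The second feasibility check fails only because the first-period inventory of 7 units exceeds the reduced storage capacity of 6.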

13.3 Model Summary

The Supply Chain (or Production/Transportation) model is summarised below:

$$
\begin{aligned}
\min\ & \sum_{p \in P}\sum_{f \in F}\sum_{t \in T} PC_{pf}\, p_{pft} + \sum_{p \in P}\sum_{f \in F}\sum_{d \in D}\sum_{t \in T} TC_{fd}\, s_{pfdt} \\
 & + \sum_{p \in P}\sum_{f \in F}\sum_{t \in T} IC_{pf}\, h_{pft} + \sum_{p \in P}\sum_{d \in D}\sum_{t \in T} PE_{pd}\, x_{pdt} \\
\text{subject to:}\ & \sum_{f \in F} s_{pfdt} + x_{pdt} = Q_{pdt}, && p \in P,\ d \in D,\ t \in T \\
 & p_{pf1} + I_{pf} = h_{pf1} + \sum_{d \in D} s_{pfd1}, && p \in P,\ f \in F \\
 & p_{pft} + h_{pf,t-1} = h_{pft} + \sum_{d \in D} s_{pfdt}, && p \in P,\ f \in F,\ t = 2, \dots, T \\
 & h_{pft} \le IQ_{pf}, && p \in P,\ f \in F,\ t \in T \\
 & p_{pft} \le PQ_{pf}, && p \in P,\ f \in F,\ t \in T \\
 & p_{pft},\ h_{pft} \ge 0, && p \in P,\ f \in F,\ t \in T \\
 & s_{pfdt} \ge 0, && p \in P,\ f \in F,\ d \in D,\ t \in T \\
 & x_{pdt} \ge 0, && p \in P,\ d \in D,\ t \in T
\end{aligned}
\tag{13.8}
$$

13.4 Data Representation

The only stochastic parameter in the supply chain model is the demand, which depends on the product ($P$), the dealer ($D$) and the time period ($T$). The uncertainty of the demand is represented by a set of discrete scenarios $S$. For the expected value problem, which is deterministic in nature, we calculate the expected value of the demand for each triple $p, d, t$. In this case the expected demand is represented as a three-dimensional matrix. The demand itself is represented by a four-dimensional matrix where we add the scenario index. Table 13.2 gives the demand representation in this case.

| Scenario | Product | Dealer 1, Time = 1 | Dealer 1, Time = 2 | ... | Dealer 1, Time = T | ... | Dealer D, Time = 1 | Dealer D, Time = 2 | ... | Dealer D, Time = T |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | Q_{1,1,1,1} | Q_{1,1,2,1} | ... | Q_{1,1,T,1} | ... | Q_{1,D,1,1} | Q_{1,D,2,1} | ... | Q_{1,D,T,1} |
| 1 | 2 | Q_{2,1,1,1} | Q_{2,1,2,1} | ... | Q_{2,1,T,1} | ... | Q_{2,D,1,1} | Q_{2,D,2,1} | ... | Q_{2,D,T,1} |
| 1 | ... | | | | | | | | | |
| 1 | P | Q_{P,1,1,1} | Q_{P,1,2,1} | ... | Q_{P,1,T,1} | ... | Q_{P,D,1,1} | Q_{P,D,2,1} | ... | Q_{P,D,T,1} |
| 2 | 1 | Q_{1,1,1,2} | Q_{1,1,2,2} | ... | Q_{1,1,T,2} | ... | Q_{1,D,1,2} | Q_{1,D,2,2} | ... | Q_{1,D,T,2} |
| 2 | 2 | Q_{2,1,1,2} | Q_{2,1,2,2} | ... | Q_{2,1,T,2} | ... | Q_{2,D,1,2} | Q_{2,D,2,2} | ... | Q_{2,D,T,2} |
| 2 | ... | | | | | | | | | |
| 2 | P | Q_{P,1,1,2} | Q_{P,1,2,2} | ... | Q_{P,1,T,2} | ... | Q_{P,D,1,2} | Q_{P,D,2,2} | ... | Q_{P,D,T,2} |
| ... | | | | | | | | | | |
| S | 1 | Q_{1,1,1,S} | Q_{1,1,2,S} | ... | Q_{1,1,T,S} | ... | Q_{1,D,1,S} | Q_{1,D,2,S} | ... | Q_{1,D,T,S} |
| S | 2 | Q_{2,1,1,S} | Q_{2,1,2,S} | ... | Q_{2,1,T,S} | ... | Q_{2,D,1,S} | Q_{2,D,2,S} | ... | Q_{2,D,T,S} |
| S | ... | | | | | | | | | |
| S | P | Q_{P,1,1,S} | Q_{P,1,2,S} | ... | Q_{P,1,T,S} | ... | Q_{P,D,1,S} | Q_{P,D,2,S} | ... | Q_{P,D,T,S} |

Table 13.2: Demands for the Supply Chain problem, 64 scenarios

13.5 Expected Value Formulation in AMPL

Before we present the expected value formulation in AMPL, Table 13.3 maps every algebraic entity to its AMPL equivalent. The expected value formulation assumes a deterministic demand.

| Type | Notation | AMPL Name | Range / Dimensions |
|---|---|---|---|
| set (indices) | F | FACTORY | |
| set (indices) | P | PRODUCT | |
| set (indices) | D | DEALER | |
| set (indices) | T | NT | |
| param (parameters) | PC | prodcost | PRODUCT, FACTORY |
| param (parameters) | PQ | prodCapacity | PRODUCT, FACTORY |
| param (parameters) | TC | transCost | FACTORY, DEALER |
| param (parameters) | PE | penalty | PRODUCT, DEALER |
| param (parameters) | I | initialInv | PRODUCT, FACTORY |
| param (parameters) | IC | invCost | PRODUCT, FACTORY |
| param (parameters) | IQ | invCapacity | PRODUCT, FACTORY |
| param (parameters) | Q | demand | PRODUCT, DEALER, NT |
| var (variables) | p | produce | PRODUCT, FACTORY, NT |
| var (variables) | s | send | PRODUCT, FACTORY, DEALER, NT |
| var (variables) | h | hold | PRODUCT, FACTORY, NT |
| var (variables) | x | shortage | PRODUCT, DEALER, NT |

Table 13.3: Mapping algebraic entities to AMPL

The expected value formulation of the Supply Chain problem is then given by:

Listing 13.1: Supply chain model: expected value problem in AMPL

    ### SETS ###
    set FACTORY;
    set PRODUCT;
    set DEALER;

    ### PARAMETERS ###
    param NT := 4;
    param prodcost{PRODUCT, FACTORY};
    param prodCapacity{PRODUCT, FACTORY};
    param transCost{FACTORY, DEALER};
    param demand{PRODUCT, DEALER, 1..NT};
    param penalty{PRODUCT, DEALER};
    param initialInv{PRODUCT, FACTORY};
    param invCost{PRODUCT, FACTORY};
    param invCapacity{PRODUCT, FACTORY};

    ### VARIABLES ###

    var produce{PRODUCT, FACTORY, 1..NT} >= 0;
    var send{PRODUCT, FACTORY, DEALER, 1..NT} >= 0;
    var hold{PRODUCT, FACTORY, 1..NT} >= 0;
    var shortage{PRODUCT, DEALER, 1..NT} >= 0;

    ### OBJECTIVE ###
    # min ProdCost + TransportationCost + InventoryCost + ShortageCost
    minimize totalCosts:
      (sum{p in PRODUCT, f in FACTORY, t in 1..NT}
         (prodcost[p,f] * produce[p,f,t])) +
      (sum{p in PRODUCT, f in FACTORY, d in DEALER, t in 1..NT}
         (transCost[f,d] * send[p,f,d,t])) +
      (sum{p in PRODUCT, f in FACTORY, t in 1..NT}
         (invCost[p,f] * hold[p,f,t])) +
      (sum{p in PRODUCT, d in DEALER, t in 1..NT}
         (penalty[p,d] * shortage[p,d,t]));

    ### CONSTRAINTS ###
    subject to dealerReq{p in PRODUCT, d in DEALER, t in 1..NT}:
      (sum{f in FACTORY} send[p,f,d,t]) + shortage[p,d,t] = demand[p,d,t];
    invtime0{p in PRODUCT, f in FACTORY, t in 1..1}:
      produce[p,f,t] + initialInv[p,f] =
        hold[p,f,t] + sum{d in DEALER} send[p,f,d,t];
    invBalance{p in PRODUCT, f in FACTORY, t in 2..NT}:
      produce[p,f,t] + hold[p,f,(t-1)] =
        hold[p,f,t] + sum{d in DEALER} send[p,f,d,t];
    invCapacityConstraint{p in PRODUCT, f in FACTORY, t in 1..NT}:
      hold[p,f,t] <= invCapacity[p,f];
    prodCapacityConstraint{p in PRODUCT, f in FACTORY, t in 1..NT}:
      produce[p,f,t] <= prodCapacity[p,f];

The data is represented in a separate data file as:

Listing 13.2: Supply chain model: data file for the expected value problem in AMPL

    set DEALER := London Paris Wien;
    set FACTORY := Southall Leeds;
    set PRODUCT := Skirts Shirts Jeans;

    param prodcost: Southall Leeds :=
      Skirts
      Shirts 0 7
      Jeans ;

    param prodCapacity: Southall Leeds :=
      Skirts
      Shirts 0 60
      Jeans ;

    param initialInv: Southall Leeds :=
      Skirts 0 5
      Shirts 5 7
      Jeans 3 5 ;

    param invCost: Southall Leeds :=
      Skirts
      Shirts
      Jeans ;

    param invCapacity: Southall Leeds :=
      Skirts
      Shirts 0 25
      Jeans ;

    param transCost: London Paris Wien :=
      Southall
      Leeds ;

    param penalty: London Paris Wien :=
      Skirts
      Shirts
      Jeans ;

    param demand
      [*, London, *]: :=
        Shirts
        Skirts
        Jeans
      [*, Paris, *]: :=
        Shirts
        Skirts
        Jeans
      [*, Wien, *]: :=
        Shirts
        Skirts
        Jeans ;

The data presented was arbitrarily chosen.

13.6 A Stochastic Formulation in AMPL

In this section we address the changes to the Supply Chain model when we consider an uncertain demand. In this example, we define $|S| = 8$ different scenarios, and for each scenario $s$ we define $\delta_{ts}$, $t = 2, \dots, T$, as an additional parameter which is used to calculate the value of the actual demand in each possible scenario in the following way:

$$Q_{pd1s} = Q_{pd1}, \quad p \in P,\ d \in D,\ s \in S \tag{13.9}$$

$$Q_{pdts} = \delta_{ts}\, Q_{pdt}, \quad p \in P,\ d \in D,\ s \in S,\ t = 2, \dots, T \tag{13.10}$$

Time $t = 1$ represents the first stage, where the demand is known with certainty, and time $t > 1$ represents the second stage. The stochastic formulation of the Supply Chain problem is given by:

$$\min \sum_{s \in S} prob_s \Big( \sum_{p \in P}\sum_{f \in F}\sum_{t \in T} PC_{pf}\, p_{pfts} + \sum_{p \in P}\sum_{f \in F}\sum_{d \in D}\sum_{t \in T} TC_{fd}\, s_{pfdts} + \sum_{p \in P}\sum_{f \in F}\sum_{t \in T} IC_{pf}\, h_{pfts} + \sum_{p \in P}\sum_{d \in D}\sum_{t \in T} PE_{pd}\, x_{pdts} \Big) \tag{13.11}$$

subject to:

$$\sum_{f \in F} s_{pfdts} + x_{pdts} = Q_{pdts}, \quad p \in P,\ d \in D,\ t \in T,\ s \in S \tag{13.12}$$

$$p_{pf1s} + I_{pf} = h_{pf1s} + \sum_{d \in D} s_{pfd1s}, \quad p \in P,\ f \in F,\ s \in S \tag{13.13}$$

$$p_{pfts} + h_{pf,t-1,s} = h_{pfts} + \sum_{d \in D} s_{pfdts}, \quad p \in P,\ f \in F,\ t = 2, \dots, T,\ s \in S \tag{13.14}$$

$$h_{pfts} \le IQ_{pf}, \quad p \in P,\ f \in F,\ t \in T,\ s \in S \tag{13.15}$$

$$p_{pfts} \le PQ_{pf}, \quad p \in P,\ f \in F,\ t \in T,\ s \in S \tag{13.16}$$

$$p_{pft1} = p_{pfts}, \quad p \in P,\ f \in F,\ t = 1,\ s \in S \tag{13.17}$$

$$h_{pft1} = h_{pfts}, \quad p \in P,\ f \in F,\ t = 1,\ s \in S$$

$$x_{pdt1} = x_{pdts}, \quad p \in P,\ d \in D,\ t = 1,\ s \in S$$

$$s_{pfdt1} = s_{pfdts}, \quad p \in P,\ f \in F,\ d \in D,\ t = 1,\ s \in S$$

$$p_{pfts},\ h_{pfts} \ge 0, \quad p \in P,\ f \in F,\ t \in T,\ s \in S \tag{13.18}$$

$$s_{pfdts} \ge 0, \quad p \in P,\ f \in F,\ d \in D,\ t \in T,\ s \in S$$

$$x_{pdts} \ge 0, \quad p \in P,\ d \in D,\ t \in T,\ s \in S \tag{13.19}$$

Equations (13.17) are non-anticipativity constraints that guarantee that the decisions made for time $t = 1$ are the same over all scenarios.

13.7 A Stochastic Formulation in AMPL

In this section we present the deterministic equivalent of the stochastic Supply Chain problem. The AMPL entities are defined similarly to the ones in Table 13.3, with the exceptions presented in Table 13.4. The AMPL stochastic formulation is given by:

Listing 13.3: Supply chain model: stochastic formulation in AMPL

    ### SETS ###
    set FACTORY;
    set PRODUCT;
    set DEALER;

    ### PARAMETERS ###
    param NT := 4;

| Type | Notation | AMPL Name | Range / Dimensions |
|---|---|---|---|
| set (indices) | S | NS | |
| param (parameters) | δ | delta | NT, NS |
| param (parameters) | Q | randomdemand | PRODUCT, DEALER, NT, NS |
| var (variables) | p | produce | PRODUCT, FACTORY, NT, NS |
| var (variables) | s | send | PRODUCT, FACTORY, DEALER, NT, NS |
| var (variables) | h | hold | PRODUCT, FACTORY, NT, NS |
| var (variables) | x | shortage | PRODUCT, DEALER, NT, NS |

Table 13.4: Mapping stochastic algebraic entities to AMPL

    param NS := 8;
    param prodcost{PRODUCT, FACTORY};
    param prodCapacity{PRODUCT, FACTORY};
    param transCost{FACTORY, DEALER};
    param demand{PRODUCT, DEALER, 1..NT};
    param penalty{PRODUCT, DEALER};
    param initialInv{PRODUCT, FACTORY};
    param invCost{PRODUCT, FACTORY};
    param invCapacity{PRODUCT, FACTORY};
    param delta{1..NT, 1..NS};

    # SCENARIO SET (declared before randomdemand, which is indexed over it)
    set Scen := 1..NS;
    param randomdemand{PRODUCT, DEALER, 1..NT, Scen};

    # PROBABILITIES
    param prob{Scen} := 1/NS;

    ### VARIABLES ###
    var produce{PRODUCT, FACTORY, 1..NT, Scen} >= 0;
    var send{PRODUCT, FACTORY, DEALER, 1..NT, Scen} >= 0;
    var hold{PRODUCT, FACTORY, 1..NT, Scen} >= 0;
    var shortage{PRODUCT, DEALER, 1..NT, Scen} >= 0;

    ### OBJECTIVE ###
    minimize totalCosts:
      sum{s in Scen} prob[s] * (
        (sum{p in PRODUCT, f in FACTORY, t in 1..NT}
           (prodcost[p,f] * produce[p,f,t,s])) +
        (sum{p in PRODUCT, f in FACTORY, d in DEALER, t in 1..NT}
           (transCost[f,d] * send[p,f,d,t,s])) +
        (sum{p in PRODUCT, f in FACTORY, t in 1..NT}
           (invCost[p,f] * hold[p,f,t,s])) +
        (sum{p in PRODUCT, d in DEALER, t in 1..NT}
           (penalty[p,d] * shortage[p,d,t,s])) );

    ### CONSTRAINTS ###
    subject to dealerReq{p in PRODUCT, d in DEALER, t in 1..NT, s in Scen}:
      (sum{f in FACTORY} send[p,f,d,t,s]) + shortage[p,d,t,s]
        = randomdemand[p,d,t,s];
    invtime0{p in PRODUCT, f in FACTORY, t in 1..1, s in Scen}:
      produce[p,f,t,s] + initialInv[p,f] =
        hold[p,f,t,s] + sum{d in DEALER} send[p,f,d,t,s];
    invBalance{p in PRODUCT, f in FACTORY, t in 2..NT, s in Scen}:
      produce[p,f,t,s] + hold[p,f,(t-1),s] =
        hold[p,f,t,s] + sum{d in DEALER} send[p,f,d,t,s];
    invCapConstraint{p in PRODUCT, f in FACTORY, t in 1..NT, s in Scen}:
      hold[p,f,t,s] <= invCapacity[p,f];
    prodCapConstraint{p in PRODUCT, f in FACTORY, t in 1..NT, s in Scen}:
      produce[p,f,t,s] <= prodCapacity[p,f];

    # Non-anticipativity: first-period decisions are equal in all scenarios
    nant1{p in PRODUCT, f in FACTORY, s in Scen, t in 1..1}:
      produce[p,f,t,1] = produce[p,f,t,s];
    nant2{p in PRODUCT, f in FACTORY, d in DEALER, s in Scen, t in 1..1}:
      send[p,f,d,t,1] = send[p,f,d,t,s];
    nant3{p in PRODUCT, f in FACTORY, s in Scen, t in 1..1}:
      hold[p,f,t,1] = hold[p,f,t,s];
    nant4{p in PRODUCT, d in DEALER, s in Scen, t in 1..1}:
      shortage[p,d,t,1] = shortage[p,d,t,s];

The data is mostly the same as for the expected value problem, except for the calculation of randomdemand. This is given below, where we arbitrarily defined delta.

Listing 13.4: Supply chain model: data for stochastic formulation in AMPL

    param delta: := ;

    for {s in 1..NS} {
      let {p in PRODUCT, d in DEALER}
        randomdemand[p,d,1,s] := demand[p,d,1];
      for {t in 2..NT} {
        let {p in PRODUCT, d in DEALER}
          randomdemand[p,d,t,s] := randomdemand[p,d,t-1,s] * delta[t,s];
      }
    }

13.8 A Stochastic Formulation in SAMPL

The equivalent SAMPL formulation of the Supply Chain problem is given in listing 13.5.

Listing 13.5: Supply Chain model: stochastic formulation in SAMPL

    ### SETS ###
    set FACTORY;
    set PRODUCT;
    set DEALER;

    ### PARAMETERS ###
    param NT := 4;
    param NS := 8;

    param prodcost{PRODUCT, FACTORY};
    param prodCapacity{PRODUCT, FACTORY};
    param transCost{FACTORY, DEALER};
    param demand{PRODUCT, DEALER, 1..NT};
    param penalty{PRODUCT, DEALER};
    param initialInv{PRODUCT, FACTORY};
    param invCost{PRODUCT, FACTORY};
    param invCapacity{PRODUCT, FACTORY};
    param delta{1..NT, 1..NS};

    # SCENARIO SET
    scenarioset Scen := 1..NS;
    tree Tree := twostage;

    # PROBABILITIES
    probability param prob{Scen} := 1/NS;
    random param randomdemand{PRODUCT, DEALER, 1..NT, Scen};

    ### VARIABLES ###
    var produce{PRODUCT, FACTORY, t in 1..NT, Scen} >= 0,
      suffix stage if (t=1) then 1 else 2;
    var send{PRODUCT, FACTORY, DEALER, t in 1..NT, Scen} >= 0,
      suffix stage if (t=1) then 1 else 2;
    var hold{PRODUCT, FACTORY, t in 1..NT, Scen} >= 0,
      suffix stage if (t=1) then 1 else 2;
    var shortage{PRODUCT, DEALER, t in 1..NT, Scen} >= 0,
      suffix stage if (t=1) then 1 else 2;

    ### OBJECTIVE ###
    minimize totalCosts:
      sum{s in Scen} prob[s] * (
        (sum{p in PRODUCT, f in FACTORY, t in 1..NT}
           (prodcost[p,f] * produce[p,f,t,s])) +
        (sum{p in PRODUCT, f in FACTORY, d in DEALER, t in 1..NT}
           (transCost[f,d] * send[p,f,d,t,s])) +
        (sum{p in PRODUCT, f in FACTORY, t in 1..NT}
           (invCost[p,f] * hold[p,f,t,s])) +
        (sum{p in PRODUCT, d in DEALER, t in 1..NT}
           (penalty[p,d] * shortage[p,d,t,s])) );

    ### CONSTRAINTS ###
    subject to dealerReq{p in PRODUCT, d in DEALER, t in 1..NT, s in Scen}:
      (sum{f in FACTORY} send[p,f,d,t,s]) + shortage[p,d,t,s]
        = randomdemand[p,d,t,s];
    invtime0{p in PRODUCT, f in FACTORY, t in 1..1, s in Scen}:
      produce[p,f,t,s] + initialInv[p,f] =
        hold[p,f,t,s] + sum{d in DEALER} send[p,f,d,t,s];
    invBalance{p in PRODUCT, f in FACTORY, t in 2..NT, s in Scen}:
      produce[p,f,t,s] + hold[p,f,(t-1),s] =
        hold[p,f,t,s] + sum{d in DEALER} send[p,f,d,t,s];
    invCapConstraint{p in PRODUCT, f in FACTORY, t in 1..NT, s in Scen}:
      hold[p,f,t,s] <= invCapacity[p,f];
    prodCapConstraint{p in PRODUCT, f in FACTORY, t in 1..NT, s in Scen}:
      produce[p,f,t,s] <= prodCapacity[p,f];

The stages of the variables are defined with the suffix keyword, and the non-anticipativity constraints are no longer necessary. The data file for both SAMPL and stochastic AMPL is the same.

13.9 A Chance Constraint Formulation in AMPL

In the stochastic formulation we addressed the problem of meeting the demand by introducing a penalty for any shortfall. In this section we present a different approach: we would like to satisfy the demand for each product from each dealer with a probability of at least a given reliability. This is represented as a Chance Constraint (CC). Let $0 \le R \le 1$ be the desired reliability. We no longer need the shortage variables $x$, but we need to know whether the demand will be met or not. In the deterministic CC formulation, we have to add binary variables to represent the chance constraint correctly. We define: $y_{pdts}$ to be 1 if a shortage of product $p$ happens at dealer $d$, time $t$ and scenario $s$, and 0 otherwise. We no longer need to minimise the shortage penalty, so the new objective is given by:

$$\min \sum_{s \in S} prob_s \Big( \sum_{p \in P}\sum_{f \in F}\sum_{t \in T} PC_{pf}\, p_{pfts} + \sum_{p \in P}\sum_{f \in F}\sum_{d \in D}\sum_{t \in T} TC_{fd}\, s_{pfdts} + \sum_{p \in P}\sum_{f \in F}\sum_{t \in T} IC_{pf}\, h_{pfts} \Big) \tag{13.20}$$

We have, however, to add the following constraints to ensure that we meet the required reliability level:

$$\sum_{s \in S} prob_s\, y_{pdts} \le 1 - R, \quad p \in P,\ d \in D,\ t \in T \tag{13.21}$$

The model is also subject to the remaining constraints of the stochastic formulation. The problem with the constraints above is that there is nothing guaranteeing that the $y$ variables will be one in case a shortage happens. We have to add other constraints to guarantee that:

$$Q_{pdts} - \sum_{f \in F} s_{pfdts} \le M y_{pdts}, \quad p \in P,\ d \in D,\ t \in T,\ s \in S \tag{13.22}$$

Here $M$ is a constant that is large enough to be greater than all demands $Q_{pdts}$.
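The interplay between the big-M constraint (13.22) and the reliability constraint (13.21) can be traced on a small example. The Python sketch below fixes one product, dealer and period, derives the smallest binary values $y$ compatible with (13.22), and then tests (13.21). Function names, the tolerance and all figures are assumptions for illustration. No solver is involved and the shipment quantities are simply given.

```python
def shortage_indicators(demand, sent):
    """Smallest binary y_s with demand_s - sent_s <= M * y_s, for a sufficiently large M."""
    return [1 if q > x else 0 for q, x in zip(demand, sent)]


def big_m_is_valid(demand, big_m):
    """M must be at least as large as every demand so that y_s = 1 always relaxes (13.22)."""
    return big_m >= max(demand)


def chance_constraint_ok(probs, demand, sent, reliability, tol=1e-9):
    """Check (13.21): the probability of a shortage must not exceed 1 - R."""
    y = shortage_indicators(demand, sent)
    prob_shortage = sum(p * y_s for p, y_s in zip(probs, y))
    return prob_shortage <= 1.0 - reliability + tol


# Hypothetical data for eight equally likely scenarios.
probs = [1 / 8] * 8
demand = [10, 12, 9, 15, 11, 14, 8, 13]
sent = [10, 12, 9, 12, 11, 12, 8, 13]   # total sent to the dealer in each scenario

print(shortage_indicators(demand, sent))                # [0, 0, 0, 1, 0, 1, 0, 0]
print(big_m_is_valid(demand, 5000))                     # True
print(chance_constraint_ok(probs, demand, sent, 0.5))   # True
print(chance_constraint_ok(probs, demand, sent, 0.9))   # False
```

Two of the eight scenarios fall short, which gives a shortage probability of 0.25. This is acceptable for a reliability of 0.5 but not for 0.9.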
The AMPL code for the deterministic CC formulation is given as:

Listing 13.6: Supply chain model: CC formulation in AMPL

    # The declaration of the remaining parameters and sets is
    # the same as for the stochastic model in AMPL
    param Reliability := 0.5;

    ### VARIABLES ###
    var produce{PRODUCT, FACTORY, 1..NT, Scen} >= 0;
    var send{PRODUCT, FACTORY, DEALER, 1..NT, Scen} >= 0;
    var hold{PRODUCT, FACTORY, 1..NT, Scen} >= 0;
    var shortage{PRODUCT, DEALER, 1..NT, Scen} >= 0 binary;

    ### OBJECTIVE ###
    minimize totalCosts:
      sum{s in Scen} prob[s] * (
        (sum{p in PRODUCT, f in FACTORY, t in 1..NT}
           (prodcost[p,f] * produce[p,f,t,s])) +
        (sum{p in PRODUCT, f in FACTORY, d in DEALER, t in 1..NT}
           (transCost[f,d] * send[p,f,d,t,s])) +
        (sum{p in PRODUCT, f in FACTORY, t in 1..NT}
           (invCost[p,f] * hold[p,f,t,s])) );

    ### CONSTRAINTS ###
    subject to
    # Force shortage to 1 if demand is not satisfied (big-M with M = 5000)
    dealerReq{p in PRODUCT, d in DEALER, t in 1..NT, s in Scen}:
      randomdemand[p,d,t,s] - (sum{f in FACTORY} send[p,f,d,t,s])
        <= 5000 * shortage[p,d,t,s];

    # Force probability of shortage to be less than desired reliability
    carda{p in PRODUCT, d in DEALER, t in 1..NT}:
      sum{s in Scen} prob[s] * shortage[p,d,t,s] <= 1 - Reliability;

    # The remaining constraints are the same as
    # for the stochastic model in AMPL

Observe that we changed the variable shortage to be binary, and we added the newly defined constraints (we arbitrarily used M = 5000). We do not repeat the declaration of sets, parameters and the remaining constraints, as they are the same as in listing 13.3.

13.10 A Chance Constraint Formulation in SAMPL

When implementing CCs in SAMPL, there is no need to define additional binary variables. There is also no need to define constraints similar to (13.22), as this is handled by SAMPL itself. In the SAMPL formulation, we remove the shortage variables and update the objective function. We also define the chance constraints as given in the listing below.
Listing 13.7: Supply chain model: CC formulation in SAMPL

    # The declaration of the remaining parameters and sets is
    # the same as for the stochastic model in SAMPL
    param Reliability := 0.5;

    ### VARIABLES ###
    var produce {PRODUCT, FACTORY, t in 1..NT, Scen} >= 0,
        suffix stage if (t=1) then 1 else 2;
    var send {PRODUCT, FACTORY, DEALER, t in 1..NT, Scen} >= 0,
        suffix stage if (t=1) then 1 else 2;
    var hold {PRODUCT, FACTORY, t in 1..NT, Scen} >= 0,
        suffix stage if (t=1) then 1 else 2;

    ### OBJECTIVES ###
    minimize totalCosts: sum{s in Scen} prob[s] * (
        (sum{p in PRODUCT, f in FACTORY, t in 1..NT}
            (prodcost[p,f] * produce[p,f,t,s])) +
        (sum{p in PRODUCT, f in FACTORY, d in DEALER, t in 1..NT}
            (transCost[f,d] * send[p,f,d,t,s])) +
        (sum{p in PRODUCT, f in FACTORY, t in 1..NT}
            (invCost[p,f] * hold[p,f,t,s])));

    ### CONSTRAINTS ###
    subject to
    dealerReq {p in PRODUCT, d in DEALER, t in 1..NT}:
        probability {s in Scen:
            sum{f in FACTORY} (send[p,f,d,t,s]) >= randomdemand[p,d,t,s]}
        >= Reliability;

    # The remaining constraints are the same as
    # for the stochastic model in SAMPL

An Integrated Chance Constraint Formulation in AMPL

The ICCP approach considers the Supply Chain problem to be feasible if the expected shortfall (the expected violation of the demand constraint) does not exceed a predefined value. In the deterministic ICCP formulation of the Supply Chain problem, let $0 \le U \le 1$ be the maximum predefined shortfall. We then add variables to represent the excess of the items produced in relation to the demand. Notice that this model already includes shortage variables, hence it is not necessary to define additional ones. We define $e_{pdts}$ to represent the excess of product p at dealer d, time t and scenario s.
We add the following balance constraint to the original stochastic model:

$$\sum_{f \in F} s_{pfdts} - D_{pdts} = e_{pdts} - x_{pdts}, \quad \forall p \in P,\ d \in D,\ t \in T,\ s \in S \qquad (13.23)$$

The model is also subject to the remaining constraints of the original stochastic model. Then we can add the ICC constraint in its deterministic form:

$$\sum_{s \in S} prob_s \, h_{pdts} \le U, \quad \forall p \in P,\ d \in D,\ t \in T \qquad (13.24)$$

The AMPL code for the deterministic ICC formulation (where we use excess to represent e and shortfall for U) is given as:

Listing 13.8: Supply chain model: ICC formulation in AMPL

    # The declaration of the remaining parameters and sets is
    # the same as for the stochastic model in AMPL
    param shortfall := 50;

    ### VARIABLES ###
    var produce {PRODUCT, FACTORY, 1..NT, Scen} >= 0;
    var send {PRODUCT, FACTORY, DEALER, 1..NT, Scen} >= 0;
    var hold {PRODUCT, FACTORY, 1..NT, Scen} >= 0;
    var shortage {PRODUCT, DEALER, 1..NT, Scen} >= 0;
    var excess {PRODUCT, DEALER, 1..NT, Scen} >= 0;

    ### CONSTRAINTS ###
    subject to
    dealerReq {p in PRODUCT, d in DEALER, t in 1..NT, s in Scen}:
        (sum{f in FACTORY} send[p,f,d,t,s]) - randomdemand[p,d,t,s]
            = excess[p,d,t,s] - shortage[p,d,t,s];

    Expectation {p in PRODUCT, d in DEALER, t in 1..NT}:
        sum{s in Scen} prob[s] * shortage[p,d,t,s] <= shortfall;

    # The remaining constraints are the same as
    # for the stochastic model in AMPL

An Integrated Chance Constraint Formulation in SAMPL

The ICC formulation in SAMPL does not require the definition of extra variables or of the balance constraint. The formulation is given below:

Listing 13.9: Supply chain model: ICC formulation in SAMPL

    # The declaration of the remaining parameters and sets is
    # the same as for the stochastic model in AMPL
    param shortfall = 50;

    ### VARIABLES ###
    var produce {PRODUCT, FACTORY, 1..NT, Scen} >= 0;
    var send {PRODUCT, FACTORY, DEALER, 1..NT, Scen} >= 0;
    var hold {PRODUCT, FACTORY, 1..NT, Scen} >= 0;
    var shortage {PRODUCT, DEALER, 1..NT, Scen} >= 0;

    # The objective function is the same as for the
    # stochastic model in AMPL

    ### CONSTRAINTS ###
    subject to
    # Integrated Chance constraint, SAMPL formulation
    dealerReq {p in PRODUCT, d in DEALER, t in 1..NT}:
        expectation {s in Scen} (randomdemand[p,d,t,s]
            less (sum{f in FACTORY} send[p,f,d,t,s])) <= shortfall;

    # The remaining constraints are the same as
    # for the stochastic model in AMPL
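To make the two reliability measures concrete, the following Python sketch evaluates, for a single product, dealer and period, the quantity inside the probability { ... } expression of the CC formulation and the quantity inside the expectation { ... } expression of the ICC formulation. It is only an illustration under stated assumptions: the function names and the data values are hypothetical and are not part of AMPLDev or SAMPL.

```python
# Illustrative check of a chance constraint (CC) and an integrated chance
# constraint (ICC) for one fixed decision. All names and numbers below are
# hypothetical and serve only to show what the two constraints measure.

def chance_constraint_holds(supply, demand, prob, reliability):
    """Return True if P(supply >= demand) >= reliability over the scenarios."""
    prob_satisfied = sum(pr for s, d, pr in zip(supply, demand, prob) if s >= d)
    return prob_satisfied >= reliability


def expected_shortfall(supply, demand, prob):
    """Expected value of max(demand - supply, 0), i.e. 'demand less supply'."""
    return sum(pr * max(d - s, 0.0) for s, d, pr in zip(supply, demand, prob))


# Four equally likely demand scenarios and a fixed amount sent to the dealer.
prob = [0.25, 0.25, 0.25, 0.25]
demand = [100.0, 120.0, 150.0, 200.0]
supply = [130.0, 130.0, 130.0, 130.0]

cc_ok = chance_constraint_holds(supply, demand, prob, reliability=0.5)
shortfall = expected_shortfall(supply, demand, prob)
icc_ok = shortfall <= 50.0  # same bound as 'param shortfall' in the listings

print("CC satisfied:", cc_ok)
print("Expected shortfall:", shortfall)
print("ICC satisfied:", icc_ok)
```

The CC only counts the scenarios in which demand is missed, whereas the ICC also weighs by how much it is missed, which is why the two formulations can accept or reject the same decision differently.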

Chapter 14 Scenario Generation

14.1 Issues and desirable properties in scenario generation

The scenario tree used in scenario-based stochastic programming is usually created by specialized applications called scenario generators. Building a computational SP model therefore involves two distinct modelling steps:

- Expressing the logic of the application's domain in an (SP) decision model
- Representing the randomness of the application's domain with stochastic processes and creating scenario trees

Depending on the domain of application, domain experts typically apply well-established models, finely tuned to the problem at hand, to specify the random model parameters. For instance, in financial applications CAPM, GARCH, Geometric Brownian Motion and Regime Switching Markov models have been used extensively. For a more comprehensive description of scenario generators for financial models, readers are referred to some recent research reports and publications of CARISMA ([49], [46]). Other domains have other typical models; a short list of such domains and common techniques is given in Table 14.1.

Table 14.1: List of SG methods and applications

Modelling Paradigm | SG Method | Origin | Application Field | Reference
Econometric Models and Time Series | AR(p) | Autoregressive Models and Generation of Data Trajectories | Finance, Supply chains, Environment models | [8]
Econometric Models and Time Series | MA(q) | Moving Average Models and Generation of Data Trajectories | Finance, Supply chains, Environment models | [8]
Econometric Models and Time Series | ARMA(p, q) | Autoregressive Moving Average Models and Generation of Data Trajectories | Finance, Supply chains, Environment models | [8]
Econometric Models and Time Series | GARCH | Generalised Autoregressive Conditional Heteroscedasticity and Generation of Data Trajectories | Finance, Supply chains, Environment models | [7, 22]
Econometric Models and Time Series | VAR | Vector Auto Regressive Models and Generation of Data Trajectories | Finance, Supply chains, Environment models | [25]
Econometric Models and Time Series | BVAR | Bayesian Vector Auto Regressive Models and Generation of Data Trajectories | Finance, Supply chains, Environment models | [43]
Econometric Models and Time Series | Reduced Rank Regression | Generation of Data Trajectories | Finance, Supply chains, Environment models | [23]
Geometric Brownian Motion | Wiener Processes | Brownian Motion and Diffusion Processes | Finance and Environment models | [27]
Geometric Brownian Motion | Generalised Wiener Processes | Brownian Motion with drift and Diffusion Processes | Finance and Environment models | [7]
Artificial Intelligence | Neural Gas | Neural Networks | Supply Chains, Energy, Environment models | [45, 28]
Statistical Approaches | Property Matching | Statistical Approximation | Supply Chains, Energy models | [38]
Statistical Approaches | Moment Matching | Moment Fitting | Supply Chains, Energy, Environment models | [37]
Statistical Approaches | Non Parametric Methods | Discretisation | Finance and Environment models | [38]
Statistical Approaches | SG Forecasting Methods | Quantile Regression and Forecasting Methods | Finance, Supply chains, Environment models | [18]
Sampling | Random Sampling | Discrete Sampling | Finance and Environment models | [41, 40]
Sampling | Stratified Sampling | Interval Sampling | Finance and Environment models | [41, 40]
Sampling | Bootstrap | Discrete Sampling | Finance models | [19, 20]
Sampling | Monte Carlo | Sampling | Finance models | [39]
 | Markov Chains | Probability Interval Sampling | Finance, Energy, Supply Chains, Environment | [39]
 | VECM | Random path and Vector Error correction | Finance models | [55]

The next necessary step is to connect the chosen scenario generator to the decision model; this is usually done by importing the generated scenario tree into the modelling system as data, typically structured as a multidimensional table. The quality of the decision resulting from the stochastic programming problem depends on both the decision model and the scenario generation process. Some desirable properties for scenario generators are described in [42] and [57]; these can be summarized as:

1. Correctness: The generated scenario sets should be correct representations of the distributions of our random parameters; not knowing the distribution leads us to different descriptive models, which give alternative representations of the dynamics of our parameters. It is important to choose the model that best captures those aspects of the dynamics of the random parameters that are important in the context of the decision problem.

2. Consistency: In the case of multiple related random parameters, their values under any particular scenario should be consistent with each other. This issue arises when there are domain-specific rules which apply to two or more of the generated random parameters: the generated scenarios, which include values for all these parameters, should be consistent with the domain rules (e.g. in finance, generated prices for different types of assets might have to satisfy the arbitrage-free condition, or avoid other logical inconsistencies between parameter values).

3. Stability: Stability of a scenario generation method is considered with respect to a particular decision model. A scenario generator is stable with respect to a decision problem if the decisions which are the outcome of the decision problem do not vary significantly across multiple runs. Defining the decision model as a simple single-stage model:

$$\min_{x \in X} F(x, \tilde{\xi}_t) \qquad (14.1)$$

where $\tilde{\xi}$ is a stochastic variable, and using a notation where $\{\tilde{\xi}\}$ is a stochastic process, $\check{\xi}$ is

a discrete stochastic variable and $\{\check{\xi}\}$ a discrete stochastic process (thus a scenario tree), [42] define, over K generated scenario trees $\check{\xi}_{tk}$ and the same number of obtained solutions $x_k$, $k = 1, \dots, K$, of the problem, in-sample stability:

$$F(x_k, \check{\xi}_{tk}) = F(x_l, \check{\xi}_{tl}) \quad \forall k, l \in \{1, \dots, K\} \qquad (14.2)$$

and out-of-sample stability:

$$F(x_k, \tilde{\xi}_{tk}) = F(x_l, \tilde{\xi}_{tl}) \quad \forall k, l \in \{1, \dots, K\} \qquad (14.3)$$

In-sample stability therefore tells us how stable the scenario generator is when used with the considered decision model or, in other words, how much the objective function value changes when the decision model is solved using different trees generated with the same scenario generator and the same parameters. Out-of-sample stability evaluates the solutions obtained through the scenario generator against the real distribution; in the (very realistic) case that the real distribution is not known, the solutions can be evaluated against another scenario generator that has proven to be reliable with the current decision model, or back-tested against historical data.

Overview of SG methods

Scenarios can be generated following various general approaches; in addition, there are methods that reduce the number of scenarios to a tractable size. The common aim of all methods and techniques is to approximate a distribution with a tractable scenario tree; the following categorization is largely taken from [42].

1. Generating Scenarios

   Conditional Sampling: At every node of the scenario tree, realizations of the stochastic process $\{\tilde{\xi}_t\}$ are sampled, either by sampling directly from the distribution of $\{\tilde{\xi}_t\}$ or by evolving the process discretely, according to a formula of the type $\tilde{\xi}_{t+1} = z(\tilde{\xi}_t, \varepsilon)$, where $\varepsilon$ is the current random vector. Traditional sampling methods have a limitation: when sampling from more than one random variable, every marginal has to be sampled separately and the samples combined afterwards, generating a tree whose size grows exponentially with the dimension of the random vector.

   Sampling from specified marginals and correlations: To overcome the difficulties in generating multivariate vectors, especially if correlated, these methods require the specification of the marginal distributions and the correlation matrix. Copulas are often used in these methods to bind together the various marginals.

   Moment matching: If the distributions are not known, they can be described using their moments (mean, variance, skewness etc.), the correlation matrix in the case of multivariate vectors, and possibly other statistical properties. A discrete distribution can then be constructed that matches the given statistical properties ([36, 51]). Bootstrapping can be seen as a moment matching technique, in which the desired distribution is created by means of values sampled directly from the values of the original distribution.

   Path-based methods: In these methods, whole paths (or fans) are generated by evolving the stochastic process over time, one for each scenario. These scenarios then have to be clustered into a scenario tree of the desired shape.
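The conditional sampling scheme described above can be sketched in a few lines of Python. This is a minimal illustration, assuming a scalar process, equal branch probabilities and a simple random-walk update standing in for the generic function z; the names build_scenario_tree and random_walk_step are hypothetical and do not refer to any AMPLDev or SAMPL facility.

```python
import random


def build_scenario_tree(x0, n_stages, branching, step, rng):
    """Build a scenario tree by conditional sampling.

    Returns a list of stages; each stage is a list of (parent_index, value)
    pairs, where parent_index points into the previous stage.
    """
    stages = [[(None, x0)]]
    for _stage in range(n_stages):
        children = []
        for parent_idx, (_parent, value) in enumerate(stages[-1]):
            for _branch in range(branching):
                eps = rng.gauss(0.0, 1.0)  # current random draw
                children.append((parent_idx, step(value, eps)))
        stages.append(children)
    return stages


def random_walk_step(value, eps, sigma=1.0):
    """Assumed update rule xi_{t+1} = z(xi_t, eps): a plain random walk."""
    return value + sigma * eps


tree = build_scenario_tree(100.0, n_stages=3, branching=2,
                           step=random_walk_step, rng=random.Random(42))
nodes_per_stage = [len(stage) for stage in tree]
leaf_probability = 1.0 / len(tree[-1])  # equal probability for every scenario

print("Nodes per stage:", nodes_per_stage)
print("Probability of each scenario:", leaf_probability)
```

The number of leaves is branching ** n_stages, which shows how quickly a sampled tree grows and why clustering and reduction techniques are needed.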

2. Related techniques

   Clustering: Clustering is the technique used to convert a set of scenarios in the form of fans into a scenario tree. See [16], or [32] for a combined clustering/reduction approach.

   Internal sampling methods: These methods differ from the others in that the sampling of scenarios is performed during the solution procedure. The most important methods of this kind are stochastic decomposition [34], importance sampling in Benders decomposition, and stochastic quasigradient methods.

   Scenario reduction: Scenario reduction is a method to decrease the size of an already generated scenario tree by finding a scenario subset of prescribed size that is closest to the initial distribution in terms of defined probability metrics ([17, 33, 32]).

In general, all these methods can be divided into statistical models and other models, depending on whether the underlying random process is based just on statistical properties (e.g. moments) of observed historical data, or describes the real world based on scientific/mathematical theories of the given field.

Figure 14.1: Scenario Generation methods

Some common models of randomness are used across various application fields and are therefore listed below. They are not direct scenario generation techniques: they generate time series or are continuous-time models, from which a scenario tree can be derived using sampling or clustering methods.

Diffusion processes

Diffusion processes are widely used in finance to model the future evolution of stock prices, interest rates and mortality rates. They are continuous-time models, and include:

Wiener Processes (Brownian motion)

Brownian motion is one of the simplest stochastic processes, and it is described in a mathematically convenient form. It is traditionally regarded as having been discovered by Robert Brown in 1828 [9] in the movement of pollen particles and mathematically formalized in 1880 by Thorvald N. Thiele [53], but its first well-known application is due to Albert Einstein in 1905, as a description of the movement of small particles in a stationary fluid [21]. A Wiener process is a stochastic process which is defined as:

$$dz = \varepsilon \sqrt{dt}, \quad \text{where } \varepsilon \sim N(0, 1) \qquad (14.4)$$

and dz is the change in the value of the process over dt. The process has the following important properties:

1. It is a Markov process, so future probability distributions depend only on the current value of the process and not on past values or other information
2. The increments over two non-overlapping time intervals are independent
3. Changes in the process over any finite time interval are normally distributed with a variance which increases linearly over time

Generalized Wiener Processes (Brownian motion with drift)

A generalized Wiener process is defined as follows:

$$dx = \alpha \, dt + \sigma \, dz \qquad (14.5)$$

where dz is the increment of a standard Wiener process, $\alpha$ is the drift parameter and $\sigma$ is the standard deviation. Over any time interval $\Delta t$, the corresponding change $\Delta x$ is normally distributed with mean $\alpha \Delta t$ and variance $\sigma^2 \Delta t$.

Ito Processes

The Ito process is a generalization of Brownian motion with drift and can be expressed as:

$$dx = a(x, t) \, dt + b(x, t) \, dz \qquad (14.6)$$

where dz is the increment of a standard Wiener process, and a(x, t) and b(x, t) are the drift and the standard deviation expressed as functions of the current state and time. A particular case of the Ito process is the Geometric Brownian Motion (GBM), where $a(x, t) = \alpha x$ and $b(x, t) = \sigma x$.
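As an illustration of how such a process can be turned into sampled trajectories, the following Python sketch simulates a generalized Wiener process with the standard discretisation in which each increment dz is a standard normal draw multiplied by the square root of the time step. The function name and all parameter values are assumptions made for this example only.

```python
import math
import random
import statistics


def simulate_generalized_wiener(x0, alpha, sigma, dt, n_steps, rng):
    """Simulate one path of dx = alpha*dt + sigma*dz, dz = eps*sqrt(dt)."""
    x = x0
    path = [x]
    for _ in range(n_steps):
        dz = rng.gauss(0.0, 1.0) * math.sqrt(dt)  # Wiener increment
        x += alpha * dt + sigma * dz
        path.append(x)
    return path


# Hypothetical parameters: drift 0.5, standard deviation 0.3, horizon 1.0.
rng = random.Random(2014)
alpha, sigma, horizon, n_steps = 0.5, 0.3, 1.0, 10
terminal_values = [
    simulate_generalized_wiener(0.0, alpha, sigma, horizon / n_steps,
                                n_steps, rng)[-1]
    for _ in range(2000)
]

sample_mean = statistics.mean(terminal_values)
sample_var = statistics.variance(terminal_values)
print("Sample mean of x(T):", sample_mean, "theoretical:", alpha * horizon)
print("Sample variance of x(T):", sample_var, "theoretical:", sigma ** 2 * horizon)
```

Across many simulated paths, the sample mean and variance of the terminal value should be close to the drift times the horizon and the squared standard deviation times the horizon, respectively, up to sampling error.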
The choice of one model or the other depends on the assumptions made about the reality being described: when modelling share prices, if we assume that the expected percentage return and the variance of the return are independent of the current price, the price can be modelled by a GBM, otherwise not.

Time series

Time series are commonly used to estimate parameters which explain the behaviour of a random variable based on past observations. Three broad categories of models of practical importance are the autoregressive (AR) models, the integrated (I) models and the moving average (MA) models, which can be combined and which all assume linear dependence between the current data point and the previous one(s). Non-linear dependence is also possible, as in the conditional heteroskedasticity models, in which the variance varies over time.

Autoregressive models: AR(p)

An AR model of order p assumes that the current value of a random variable depends solely on the past p observations of the same variable. The general form of an AR(p) process is:

$$X_t = c + \sum_{i=1}^{p} \varphi_i X_{t-i} + \varepsilon_t, \quad \text{where } \varepsilon_t \sim \text{IID } N(0, \sigma^2) \qquad (14.7)$$

where $X_t$ is the value of the process at time t, $\varphi_i$ are the parameters of the model, c is a constant and $\varepsilon_t$ is white noise. The AR(1) model with $\varphi_1 = 1$ and $c = 0$ is known as a random walk.

Moving Average models: MA(q)

The moving average model is conceptually a linear regression of the current value of the random variable against previous (unobserved) white noise terms. A moving average model of order q is expressed as:

$$X_t = \mu + \varepsilon_t + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i}, \quad \text{where } \varepsilon_t \sim \text{IID } N(0, \sigma^2) \qquad (14.8)$$

where $X_t$ is the value of the series at time t, $\theta_i$ are the parameters of the model, $\mu$ is the mean of the series and $\varepsilon_t$ are white noise error terms.

Autoregressive Moving Average models: ARMA(p,q)

Autoregressive moving average models are defined as a combination of AR and MA models. An ARMA(p, q) model is therefore expressed as:

$$X_t = c + \varepsilon_t + \sum_{i=1}^{p} \varphi_i X_{t-i} + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i}, \quad \text{where } \varepsilon_t \sim \text{IID } N(0, \sigma^2) \qquad (14.9)$$

where all the symbols have the meaning defined for the AR and the MA formulations.

Autoregressive Conditional Heteroskedasticity models: ARCH(q)

This class of models assumes that the random variable is characterized by non-constant variance over time, that is, the variance of the current error is considered to be a function of the errors of the previous time periods. The process is then modelled with an AR(p) model, as:

$$X_t = c + \sum_{i=1}^{p} \varphi_i X_{t-i} + \varepsilon_t \qquad (14.10)$$

where $\varepsilon_t = \sigma_t z_t$, with $z_t \sim \text{IID } N(0, 1)$, and with:

$$\sigma_t^2 = \alpha_0 + \sum_{i=1}^{q} \alpha_i \varepsilon_{t-i}^2 \qquad (14.11)$$

where $\alpha_0 > 0$ and $\alpha_i \ge 0$, $i > 0$.

Generalized Autoregressive Conditional Heteroskedasticity models: GARCH(p,q)

A GARCH(p, q) model assumes an ARMA(p, q) model for the error variance; the conditional variance of the error term of the model process is therefore given by:

$$\sigma_t^2 = \alpha_0 + \sum_{i=1}^{q} \alpha_i \varepsilon_{t-i}^2 + \sum_{i=1}^{p} \beta_i \sigma_{t-i}^2 \qquad (14.12)$$

where $\varepsilon_t = \sigma_t z_t$ with $z_t \sim \text{IID } N(0, 1)$, as in the ARCH model.
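The time series models above can be used directly to generate data trajectories. As a minimal sketch, the following Python code simulates an AR(1) process whose errors follow a GARCH(1,1) variance recursion. The function name, the parameter values and the choice of starting the recursion at the unconditional mean and variance are assumptions made for this example, not part of AMPLDev or SAMPL.

```python
import math
import random


def simulate_ar1_garch11(n, c, phi, alpha0, alpha1, beta1, seed=0):
    """Simulate X_t = c + phi*X_{t-1} + eps_t with GARCH(1,1) errors.

    eps_t = sigma_t * z_t, z_t ~ N(0, 1), and
    sigma_t^2 = alpha0 + alpha1*eps_{t-1}^2 + beta1*sigma_{t-1}^2.
    Returns the simulated values and the conditional variances.
    """
    if not (abs(phi) < 1 and alpha0 > 0 and alpha1 >= 0 and beta1 >= 0
            and alpha1 + beta1 < 1):
        raise ValueError("parameters must describe a stationary process")
    rng = random.Random(seed)
    x_prev = c / (1.0 - phi)                    # unconditional mean
    var_prev = alpha0 / (1.0 - alpha1 - beta1)  # unconditional variance
    eps_prev = 0.0
    values, variances = [], []
    for _ in range(n):
        var_t = alpha0 + alpha1 * eps_prev ** 2 + beta1 * var_prev
        eps_t = math.sqrt(var_t) * rng.gauss(0.0, 1.0)
        x_t = c + phi * x_prev + eps_t
        values.append(x_t)
        variances.append(var_t)
        x_prev, eps_prev, var_prev = x_t, eps_t, var_t
    return values, variances


# Hypothetical parameter values, chosen only to satisfy the conditions above.
xs, variances = simulate_ar1_garch11(n=250, c=0.1, phi=0.6,
                                     alpha0=0.05, alpha1=0.1, beta1=0.8, seed=1)
print("First five values:", xs[:5])
print("Smallest conditional variance:", min(variances))
```

Calling the function with different seeds gives a fan of trajectories, which can then be clustered into a scenario tree as described earlier in this chapter.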

Chapter 15 Decision Evaluation

Introduction, following bits from IBM presentation

Descriptive Models, as defined by a set of mathematical relations, simply predict how a physical, industrial or social system may behave.

Normative Models constitute the basis for (quantitative) decision making by a superhuman following an entirely rational (that is, logically scrupulous) set of arguments. Hence quantitative decision problems and idealised decision makers are postulated in order to define these models.

Prescriptive Models involve systematic analysis of problems as carried out by normally intelligent persons who apply intuition and judgement. Two distinctive features of this approach are uncertainty analysis and preference (or value or utility) analysis.

Decision Models are in some sense a derived category, as they combine the concepts underlying the normative models and the prescriptive models.

Then:

- Data Model
- Decision Model: Constrained optimisation
- Descriptive Model: Simulation and Evaluation

Then steps for decision evaluation

The modelling paradigms revisited

Then

Stage 0: Analyse historical data [data model]
Stage 1: Create data paths for random parameter values [descriptive models]
Stage 2: Make decision using SP or DP [decision models] (data processes and decision processes are inter-twined)
Stage 3: Test the decisions using simulation: back testing, stress testing, out of sample [descriptive models]

Figure 15.1: Decision evaluation methodology
