Incorporating Demand Response with Load Shifting into Stochastic Unit Commitment

Frank Schneider
Supply Chain Management & Management Science, University of Cologne, Cologne, Germany, frank.schneider@uni-koeln.de

Diego Klabjan
Industrial Engineering and Management Sciences, Northwestern University, Evanston, IL, USA, d-klabjan@northwestern.edu

Ulrich W. Thonemann
Supply Chain Management & Management Science, University of Cologne, Cologne, Germany, ulrich.thonemann@uni-koeln.de

High costs for fossil fuels and increasing installations of intermittent energy sources are imposing major challenges on power grid management. Uncertainty in generation and demand for electric energy requires flexible generation capacity and stochastic optimization of generation schedules. Emerging smart grid technology is one means to increase efficiency in power generation and to mitigate the effects of increasing uncertainty. In this paper, we analyze the potential of demand side resources (DSRs) that can be dispatched to reduce load at peak times. We present a stochastic dynamic programming model for the unit commitment problem in a day-ahead market and include dispatch decisions for DSRs. Unlike previous research, we model the stochastic load shifting effect to previous and subsequent periods that must be taken into account when making dispatch decisions. We also present a solution algorithm that combines an approximate dynamic programming algorithm with stochastic progressive hedging to solve the problem. We prove convergence results for our algorithms and derive a lower bound on the optimal solution. Using data from the California Independent System Operator region, we show that our approach can solve real-world instances in reasonable time. We conduct an extensive numerical study using real-world data to show that substantial savings can be achieved by utilizing DSRs.

Key words: Demand Response, Stochastic Unit Commitment, Approximate Dynamic Programming, Stochastic Progressive Hedging

1. Introduction

The share of intermittent power sources in electric power generation has increased considerably during the last few years. Power generation from wind and solar sources in California, for instance, increased by 27% from 2011 to 2012 and amounted to 6% of total generation in 2012 (U.S. Energy Information Administration (2013)). By 2020, one third of generated electricity is expected to be delivered from renewable sources and a large share will be generated by intermittent sources (California State Senate (2011)). The actual generation of intermittent sources depends to a large extent on external factors that cannot be controlled, such as wind speed and solar irradiation.

Schneider, Klabjan, and Thonemann: Stochastic Unit Commitment with Demand Response

Because the generation of intermittent sources fluctuates, a network with a large capacity of intermittent sources requires more flexibility from conventional sources than a distribution network with a low capacity of intermittent sources. Systems with large intermittent capacity require large reserve capacity to maintain system stability and meet demand. Especially during peak load hours, when many units are operating at maximum capacity, the marginal cost of generation is high.

Emerging smart grid technology provides a means to mitigate the cost effect of increased generation uncertainty. By integrating demand side resources (DSRs) into the distribution network, the demand profile can be controlled to match demand and generation. DSRs are grid-connected loads that can, to a certain extent, be remotely controlled. Load aggregators contract a number of loads and bid negative generation capacity into the market for electric energy. When a bid is cleared, i.e., a DSR is dispatched, the aggregator provides a reduction in consumption of electric energy. Prices for electric energy tend to be high during peak demand hours and aggregators offer demand reductions during these hours. However, dispatch of DSRs typically leads to load shifting, i.e., a reduction of demand in one period results in an increase of demand in previous or subsequent periods, and this effect must be taken into account when dispatching DSRs.

A central task of independent system operators (ISOs) of power distribution networks is the economic operation of power generating units to meet the (uncertain) demand for electric energy, i.e., they must determine which power generating units to commit in each period and at what level to operate the units. If DSRs are available, their dispatch must also be taken into account, including the resulting load shifts.

In this paper, we analyze the day-ahead stochastic unit commitment problem with DSRs, i.e., the problem of finding a cost-optimal commitment and dispatch schedule of generators and DSRs for the next-day operation in a power distribution network with uncertain demand. Including DSRs in unit commitment results in a complex stochastic problem, because load can be shifted to previous as well as subsequent periods. We apply stochastic progressive hedging to solve the problem, which allows us to model load shifts to previous periods in a dynamic program. Given accepted DSR loads, the underlying unit commitment problem is modeled as a dynamic program and solved by approximate dynamic programming (ADP) techniques that combine Monte-Carlo simulation and mathematical programming. Stochastic progressive hedging is used to optimize DSR bids, where each function evaluation corresponds to solving a unit commitment problem. We also derive an algorithm that computes a lower bound on the optimal value. In addition, we provide convergence results for the stochastic progressive hedging algorithm. We also perform an extensive numerical study to assess the value of including DSRs in unit commitment, especially when intermittent sources of energy are used.

Our contribution is threefold. First, we present a dynamic programming formulation of the stochastic unit commitment problem with DSRs and stochastic load shifting. Unlike previous research, we allow for stochastic load shifts to previous as well as subsequent time periods, which is important, because the load shift effects are uncertain when the DSRs are dispatched a day ahead. Second, we develop an algorithm that solves the problem. We develop a new solution approach that combines stochastic progressive hedging and approximate dynamic programming and we derive convergence results for the algorithm that are of interest in their own right. The analyses do not rely on our problem and thus are general. Third, we present numerical results based on real-world data that indicate that substantial savings in energy cost can be achieved if DSRs are utilized in unit commitment models.

This paper is organized as follows. In Section 2, we review the related literature on the unit commitment problem, on DSR models, and on the solution approaches that we rely on. In Section 3, we state the problem formally. In Section 4, we present our model. In Section 5, we show how the problem can be solved. Because the problem is non-convex, we cannot guarantee optimality of our solution, but we can derive a lower bound on the optimal solution. In Section 6, we report convergence results of the algorithms. In Section 7, we present numerical results, which are based on actual data from the California ISO region. Concluding remarks are presented in Section 8. Notation is summarized in the appendix.

2. Literature Review

In this section we review literature related to our work. In Subsection 2.1, we provide an overview and a classification of modeling approaches related to our research. In Subsection 2.2, we review the solution approaches.

2.1. Unit Commitment Models

An important element of our model is the unit commitment problem for day-ahead markets and Table 1 provides a classification that we use in our literature review. The early versions of this model consider deterministic settings without demand side resources and differ with respect to the components and constraints that they incorporate. Muckstadt and Koenig (1977) present a basic model and consider only reserve capacity and demand constraints. Later work extends the basic model and includes transmission constraints (Ma and Shahidehpour (1998)), voltage constraints (Ma and Shahidehpour (1999)), and storage (Al-Agtash (2001)). Among the most general models is the one by Baldick (1995), who includes minimum up- and down-time constraints, power-flow constraints, line-flow limits, voltage limits, reserve constraints,

ramp limits, and total fuel and energy limits on hydro and thermal power generating units.

Table 1: Literature on Unit Commitment Models and DSR

| | Without demand side response | With demand side response, without load shifting | With demand side response, with load shifting |
|---|---|---|---|
| Deterministic | Muckstadt and Koenig (1977); Ma and Shahidehpour (1998); Ma and Shahidehpour (1999); Al-Agtash (2001); Baldick (1995) | | Su (2007); Su and Kirschen (2009); Dietrich et al. (2012); De Jonghe (2011) |
| Stochastic | Takriti et al. (2000); Valenzuela and Mazumdar (2003); Papavasiliou and Oren (2011); Sioshansi and Short (2009); Morales et al. (2009); Constantinescu et al. (2011) | Zhao and Zeng (2010); Parvania and Fotuhi-Firuzabad (2010) | this paper |

Several current models on the unit commitment problem include stochastic elements and allow for stochastic prices and cost (Takriti et al. (2000)), stochastic fuel cost (Valenzuela and Mazumdar (2003)), and stochastic load (Papavasiliou and Oren (2011), Sioshansi and Short (2009), Morales et al. (2009), and Constantinescu et al. (2011)). Wood and Wollenberg (2006) provide an extensive overview of unit commitment models.

Some authors have also included demand side resources in their models. Zhao and Zeng (2010) analyze a unit commitment formulation with demand side response that minimizes the worst-case operating cost. Parvania and Fotuhi-Firuzabad (2010) consider a model where DSRs are employed if load exceeds capacity. Both models consider stochastic loads but neglect load shifting effects.

Another set of authors takes load shifting into account, but assumes that loads are deterministic. Su (2007) and Su and Kirschen (2009) analyze a manufacturing process where the manufacturer submits price-responsive bidding curves for electric energy. The bids are incorporated in a unit commitment model, such that the production rate can be optimized based on energy-bid prices. Dietrich et al.
(2012) and De Jonghe (2011) develop unit commitment models where a centralized decision maker has the authority to shift loads between periods. Dietrich et al. (2012) also consider a model with local decision making, where load shifts are based on bidding curves of electricity prices.

Unit commitment problems with DSRs have been modeled under stochastic load without load shifting (Zhao and Zeng (2010) and Parvania and Fotuhi-Firuzabad (2010)) and under deterministic demand with load shifting (Su (2007), Su and Kirschen (2009), Dietrich et al. (2012), and De Jonghe (2011)). DSRs under stochastic load with load shifting have not been analyzed so far and we address this topic in this paper. Unlike previous research, we allow for stochastic load shifts, which is important, because there is considerable uncertainty about the load-shift effect over the course of a day when the DSR dispatch decision is made at the beginning of the day.

2.2. Solution Approaches

The solution approaches that are typically used to solve stochastic instances of the unit commitment problem rely on scenario-based representations of uncertainty (Takriti and Birge (2000), Shiina and Birge (2004), Cerisola et al. (2009), Takriti et al. (2000), Ruiz et al. (2009), and Nowak and Römisch (2000)). Uncertainty about the next stage is incorporated in the model by considering a number of scenarios that are weighted by probabilities.

The scenario approach is not appropriate for our model, because we consider a two-stage decision problem with stochastic components at each stage. In the first stage we decide about the dispatch of the DSRs, which results in stochastic load shifts. In the second stage, we solve a stochastic unit commitment problem. Because we consider stochastic load and stochastic load shifting, we would have to use a nested scenario tree with many scenarios for load shifting of each DSR, which would result in models that become computationally intractable for reasonably sized settings. Another problem is the generation and selection of an appropriate set of scenarios. In addition, a scenario-based approach does not yield valid bounds on the optimal solution, because uncertainty is modeled approximately.

Because the scenario-based approach has drawbacks for solving our model, we use a different approach. We model the unit commitment problem that we solve in the second stage as a dynamic program and embed it into a stochastic proximal point algorithm that solves the first stage problem. Because the resulting dynamic program is very large, we use ADP to solve the problem. We review the related literature on the proximal point algorithm and ADP next.

The proximal point algorithm was introduced by Martinet (1970) and Rockafellar (1976).
It was extended to the progressive hedging algorithm by Rockafellar and Wets (1991), which allows for decomposition of problems with a special structure. Recent work on the proximal point algorithm focuses on convergence rates (Yao and Shahzad (2012)) and extensions to inexact proximal steps (He and Yuan (2012)). The progressive hedging algorithm has been applied as a decomposition approach to various problems in scenario-based stochastic programming (see, e.g., Takriti and Birge (2000) and Haugen et al. (2001)). We extend both the proximal point and the progressive hedging algorithm to stochastic problems, such that the proximal step is taken based on Monte-Carlo sampling, and prove convergence of the algorithms for convex problems. Our line of proof follows Bertsekas (2011), who proves convergence of incremental proximal bundle methods. We use the stochastic progressive hedging algorithm to decompose the first stage decisions in our problem by time period and we use the stochastic proximal point algorithm to develop lower bounds on the optimal solution.

We approximate the second stage of our problem using ADP. ADP has been widely applied to resource allocation problems, which have many similarities with our problem. Especially the

application to transportation problems by Powell and Topaloglu (2003) and the framework provided by Powell et al. (2001) share notation and methodology with our ADP algorithm. For an extensive review of ADP the reader is referred to Powell (2007), Bertsekas and Tsitsiklis (1996), Sutton and Barto (1998), and Powell and Van Roy (2004).

3. Problem Description

We consider an electric power generation system with three types of electric energy resources: conventional generators (thermal, hydro-electric, nuclear, etc.), intermittent renewable generation capacity (wind, solar, etc.), and DSRs. The capacity of the intermittent resources is determined by external factors and cannot be controlled, but we can control the capacity of the conventional generators and we can decide which DSRs to dispatch and when. Our objective is to jointly determine the operating schedule of the conventional generators and the dispatch schedule of the DSRs, such that the total expected cost is minimized. The optimization is subject to a constraint that demand is met with a certain probability and to physical constraints of the conventional generators (minimum up and down times, ramping limits, etc.).

We use a finite planning horizon that is divided into $T$ periods. Because reliable forecasts for intermittent capacity are typically available only one day ahead, we use a planning horizon of one day and a time period of one hour in our numerical experiments. We denote the forecast of the gross base load demand in period $t$ by $L^G_t$ and the forecast of the capacity of the intermittent resources in period $t$ by $W_t$. Quantities $L^G_t$ and $W_t$ are stochastic.

We can control demand $L^G_t$ by dispatching DSRs. If we dispatch a DSR in period $t$, then the demand in that period is reduced, but the demand in previous and subsequent periods is increased. We denote the fraction of demand that is shifted from period $t$ to period $t'$ by $\beta_{t,t'}$. For a given dispatching schedule of the DSRs, we denote the change in demand in period $t$ that is due to the dispatch of DSRs by $L^D_t$. The residual demand that must be filled by conventional generators is $L_t = L^G_t - L^D_t - W_t$. Figure 1 shows an example where DSRs are dispatched. The peak load around noon is reduced and some load is shifted to the morning and the evening hours.

We consider the cost of conventional generators, $C^C_t$, and the cost of dispatching DSRs, $C^D_t$. The cost of the conventional generators consists of the startup cost of the generators and the actual generation cost. The cost of dispatching DSRs is determined based on bid-curves. Demand side resources are typically managed by load aggregators who contract a number of remotely controllable resources. Each resource can offer load reducing capacity at an arbitrary price. Sorting the offers of all resources in ascending order by price and aggregating them results in an aggregated bid-curve that indicates by how much demand for electric energy is reduced at a given price. Figure 2 shows an example where the aggregated bid-curve consists of seven segments, and each segment can be accepted at any fraction, i.e., any quantity of load reduction between 0 MW and 4 MW can be chosen.
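The construction of an aggregated bid-curve described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in the paper: the `Offer` type, the example prices and capacities, and the greedy `accept_reduction` helper are all hypothetical. In the paper the accepted fractions are decision variables of an optimization problem; the greedy fill only shows what partial acceptance of a segment means.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    resource: str    # hypothetical DSR label
    price: float     # bid price (dollars per MW)
    capacity: float  # offered load reduction (MW)


def aggregate_bid_curve(offers):
    """Sort offers ascending by price and return (price, cumulative MW) segments."""
    segments = []
    cumulative = 0.0
    for offer in sorted(offers, key=lambda o: o.price):
        cumulative += offer.capacity
        segments.append((offer.price, cumulative))
    return segments


def accept_reduction(offers, target_mw):
    """Greedy illustration only: fill segments in price order until target_mw
    is reached; the marginal segment is accepted at a fraction in [0, 1]."""
    remaining = target_mw
    fractions = {}
    for offer in sorted(offers, key=lambda o: o.price):
        accepted = min(offer.capacity, max(remaining, 0.0))
        fractions[offer.resource] = accepted / offer.capacity
        remaining -= accepted
    return fractions


# Hypothetical offers from several resources.
offers = [
    Offer("DSR 1", 10.0, 0.5),
    Offer("DSR 2", 20.0, 1.0),
    Offer("DSR 3", 15.0, 0.5),
]
curve = aggregate_bid_curve(offers)
fractions = accept_reduction(offers, 0.75)
```

Here `curve` lists the cumulative reduction available at each price level, and `fractions` gives the accepted share of each segment for a requested reduction of 0.75 MW.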

[Figure 1: Effect of DSR dispatch on load. Load (GW) over periods 0 to 24, before and after DSR dispatch, showing the load reduction by DSR dispatch, the preloading effect, and the rebound effect.]

4. Model

The stochastic unit commitment model without DSRs, introduced next, is the foundation of our model. Then, we present our demand-side-response model and finally show how we integrate both models. We denote random variables by upper case letters, realizations of random variables by lower case letters, and sets by calligraphic upper case letters.

The objective of the stochastic unit commitment problem (without DSRs, i.e., $L^D_t = 0$) is to determine an optimal policy for committing generators and their output levels, such that the total expected cost over the planning horizon is minimized. The sequence of events is as follows. At the beginning of period $t$ ($t = 0, \dots, T-1$), we observe the realization of the gross demand for energy $L^G_t$ and the realization of the intermittent energy generation $W_t$ of period $t$. Then, commitment and dispatch decisions are made. Commitment decisions determine the availability of generators in future periods, but not in the current period. Dispatch decisions affect the current-period output level of those generators that are on line in the current period.

To model the problem as a dynamic program, we recursively define the optimal value functions. Let $S^x_t$ be the state of the system at the end of time period $t$ after all actions have been taken and let $\mathcal{X}_t(S_t)$ be the set of feasible actions given state $S_t$. We denote the optimal value function of the system at post-decision state $S^x_t$ in period $t$ by $V_t(S^x_t)$, which represents the expected cost-to-go from periods $t$ to $T-1$, if optimal actions are taken in future periods. The dynamic program is

$$V_{t-1}(S^x_{t-1}) = \mathbb{E}_{L_t}\left[ Y_t(S_t) \mid S^x_{t-1} \right], \tag{1}$$

with

$$Y_t(S_t) = \min_{x_t \in \mathcal{X}_t(S_t)} \left\{ C^C_t(S_t, x_t, l_t) + V_t(S^x_t) \right\}. \tag{2}$$

Cost $C^C_t$ is the one time period cost and $l_t$ is a realization of $L_t$. We use a terminal value of $V_{T-1} = 0$. Given initial state $S_0$ and demand realizations $l_\tau$ for $0 \le \tau \le t$, the optimal policy can be computed from

$$X^\pi_t = \arg\min_{x_t \in \mathcal{X}_t(S_t)} \left\{ C^C_t(S_t, x_t, l_t) + V_t\left(S^{M,x}(S_t, x_t)\right) \right\}.$$

Function $S^{M,x}(S_t, x_t)$ translates the pre-decision state $S_t$ into the post-decision state $S^x_t$, given action $x_t$. We next explain the components of the dynamic program in detail.

[Figure 2: Bid-curves for two load aggregators and the resulting aggregated bid-curve. Panels: (a) Bid-Curve Aggregator 1, (b) Bid-Curve Aggregator 2, (c) Aggregate Bid-Curve. Axes: capacity (MW) versus price (dollars per MW).]

States. The system consists of a set of conventional generators that are subject to various physical constraints. The relevant characteristics to fully describe a generator's current operating state are captured by attributes. We denote the space of all possible attribute combinations by $\mathcal{A}$ and the attribute vector capturing the operating state of a generator by $a = [\text{generator index}, \text{uptime}, \text{downtime}, \text{output level}] \in \mathcal{A}$. A complete representation of the state of all conventional generators in the system is given by binary resource vector $R_t = (R_{t,a})_{a \in \mathcal{A}}$, where $R_{t,a}$ is the number of resources with attribute vector $a$. The index of the generator is part of the attribute vector to identify individual generators. Because the output level is a continuous quantity, the dimension of $R_t$ is not finite. The full representation of the state of the system at the beginning of period $t$ is $S_t = (R_t, l_{[0,t]})$, with $l_{[0,t]} = [l_0, \dots, l_t]$. We denote the space of all possible resource vectors by $\mathcal{R}_t$, the sample space of net demand for energy from periods $0$ to $t$ by $\mathcal{L}_t$, and the resulting state space by $\mathcal{S}_t = \mathcal{R}_t \times \mathcal{L}_t$.

Actions. Our objective is to minimize the expected cost of generating electric energy by finding an optimal operating schedule for the generators. Possible actions are binary commitment actions, i.e., switching the generator on or off, and continuous dispatching actions, i.e., changing the output level of a generator. Let $\mathcal{D}$ = {switch on/switch off/do nothing; set output level} be the joint set of commitment and dispatching actions that can be performed on a generator. Action vector $x_t \in \mathcal{X}_t(S_t)$ with $x_t = (x_{t,a,d})_{a \in \mathcal{A}, d \in \mathcal{D}}$ denotes the number of resources with attribute vector $a \in \mathcal{A}$

for which action $d \in \mathcal{D}$ is taken. Vector $x_t$ is binary and not of finite dimension. Because each generator is subject to certain physical constraints, e.g., minimum up- and downtimes and ramping limits, and total generation must equal total demand, the set of feasible actions $\mathcal{X}_t(S_t)$ depends on the pre-decision state.

System Dynamics. We define attribute transition function $a' = h(a, d)$ for all $a$ and $d$, where $h(\cdot)$ is the value of the attribute vector of a generator after action $d$. Resource transition function $R^M(x_t)$ translates action vector $x_t$ into the post-decision resource vector and is given by

$$R^x_{t,a'} = R^M_{a'}(x_t) = \sum_{a \in \mathcal{A}} \sum_{d \in \mathcal{D}} x_{t,a,d}\, \delta(a', a, d), \quad \forall a' \in \mathcal{A}, \tag{3}$$

where $\delta(a', a, d) = 1$ for $a' = h(a, d)$ and $\delta(a', a, d) = 0$ otherwise. Note that $R^x_t = (R^x_{t,a})_{a \in \mathcal{A}}$ does not explicitly depend on $R_t$, but actions $x_t$ are restricted by $x_t \in \mathcal{X}_t(S_t)$, where $R_t$ is a component of $S_t$.

Random process $L_t$ is correlated over time and modeled as a stochastic function $L_t = u(l_{[0,t-1]}, \zeta_t)$, where $\zeta_t$ is an exogenous stochastic process following an arbitrary distribution that depends on previous realizations $l_{[0,t-1]}$. The transition functions for pre- and post-decision state variables are given by

$$S_t = (R_t, l_{[0,t]}) = S^M(S^x_{t-1}, l_t) = \left(R^x_{t-1}, [l_{[0,t-1]}, l_t]\right)$$

and

$$S^x_t = (R^x_t, l_{[0,t]}) = S^{M,x}(S_t, x_t) = \left(R^M(x_t), l_{[0,t]}\right).$$

Objective Function. The objective function consists of generation cost $c_t(a)$ and startup cost $s_t(a)$. However, additional cost components (e.g., shutdown cost) can easily be incorporated into the model. Our modeling and solution approach does not rely on specific forms of the cost functions, but we require the generation cost $c_t$ to be a convex function of the energy output level for an output level $> 0$ and $c_t(a) = 0$ for an output level $= 0$. The energy generation cost for action vector $x_t \in \mathcal{X}_t(S_t)$ is given by

$$\sum_{a \in \mathcal{A}} \sum_{d \in \mathcal{D}} x_{t,a,d} \left( c_t(h(a, d)) + s_t(x_{t,a,d}) \right).$$

We require that generation equals demand and define the system balance equation as

$$\sum_{a \in \mathcal{A}} \sum_{d \in \mathcal{D}} x_{t,a,d}\, g(h(a, d)) = l_t,$$

where $g(\bar{a})$ is the current power output of a generator given its attribute vector $\bar{a}$. For certain $S_t$ the case $\mathcal{X}_t(S_t) = \emptyset$ can occur. We therefore add variable $g_t$ that captures the imbalance between generation and net demand. For $g_t > 0$, unit cost $c^+_t$ for positive non-spinning reserve capacity is incurred (Wood and Wollenberg (2006)). Similarly, for $g_t < 0$ unit cost $c^-_t$ is incurred. We add the imbalance terms to the generation cost and the system balance equation and obtain the generation cost function $C^C_t$. Note that our cost function $C^C_t$ does not depend on the state and $l_t$ and we thus write

$$C^C_t(x_t) = \sum_{a \in \mathcal{A}} \sum_{d \in \mathcal{D}} x_{t,a,d} \left( c_t(h(a, d)) + s_t(x_{t,a,d}) \right) + c^+_t [g_t]^+ + c^-_t [-g_t]^+, \quad x_t \in \mathcal{X}_t(S_t), \tag{4}$$

and the system balance equation with non-spinning reserve

$$\sum_{a \in \mathcal{A}} \sum_{d \in \mathcal{D}} x_{t,a,d}\, g(h(a, d)) + g_t = l_t. \tag{5}$$

Spinning-Reserve Requirements. Spinning-reserve requirements are used in unit commitment problems to hedge against uncertainty in demand (Wood and Wollenberg (2006)). Spinning-reserve capacity is capacity provided by on line generators, as opposed to non-spinning reserve capacity. Most previous work on stochastic unit commitment problems models uncertainty by defining scenarios (see, e.g., Takriti et al. (1996), Takriti and Birge (2000), Nowak and Römisch (2000)). Unlike other models, our model does not depend on a fixed number of scenarios and the distribution of demand depends on a continuous decision variable. We therefore cannot model reserve requirements as a deterministic constraint. We formulate spinning-reserve requirements as a chance constraint that requires that the probability of not having sufficient synchronized generation capacity in period $t+1$ is less than $\epsilon$:

$$P\left\{ \sum_{a \in \mathcal{A}} \sum_{d \in \mathcal{D}} x_{t,a,d}\, g^{\max}(h(a, d)) \ge L_{t+1} \right\} \ge 1 - \epsilon. \tag{6}$$

Function $g^{\max}(a)$ denotes the maximum achievable output of a generator with attribute vector $a$. We later use the fact that we can easily obtain a deterministic equivalent of the constraint for any distribution (see, e.g., Birge and Louveaux (1997)).

Dynamic Programming Model. The dynamic programming optimality equation for the stochastic unit commitment problem is

$$V_{t-1}(S^x_{t-1}) = \mathbb{E}_{L_t}\left[ \min_{x_t} \left\{ C^C_t(x_t) + V_t\left(S^{M,x}(S_t, x_t)\right) \right\} \,\middle|\, S^x_{t-1} \right], \tag{7}$$

subject to $x_{t,a,d} \in \{0, 1\}$, Constraints (5) and (6), and

$$\sum_{d \in \mathcal{D}} x_{t,a,d} = R_{t,a}, \quad \forall a \in \mathcal{A}. \tag{8}$$

Next, we present our modeling approach for incorporating DSRs into the problem formulation. We model DSRs as bid-curves. A bid-curve consists of $N$ segments, each characterized by a bid price and a bid quantity. We allow segments to be cleared partially, i.e., an offer is accepted at any fraction. We consider a single aggregate bid-curve in each period and denote bid price and bid quantity of segment $n$ in period $t$ by $c^D_{t,n}$ and $g^D_{t,n}$, respectively. We use a continuous decision variable $z_{t,n} \in [0, 1]$, representing dispatch decisions for each segment $n$ in period $t$. Then, $z = (z_{t,n})_{0 \le t < T,\, 0 \le n < N}$ represents a full dispatch schedule for all DSRs. Total dispatch cost of schedule $z$ is given by

$$C^D(z) = \sum_{t=0}^{T-1} \sum_{n=0}^{N-1} z_{t,n}\, c^D_{t,n}.$$

Dispatching a DSR results in lower demand in the period in which it is dispatched and in an increase in demand in previous or subsequent periods. We model the fraction of load of segment $n$ that is shifted from period $t$ to period $t'$ by an arbitrarily distributed random variable $\beta_{n,t,t'}$. Reducing load in period $t$ by $z_{t,n}\, g^D_{t,n}$ leads to an increase of load in period $t'$ of $z_{t,n}\, g^D_{t,n}\, \beta_{n,t,t'}$. The sum of the shifted load does not necessarily equal the dispatched capacity. We allow net energy conservation ($\sum_{t' \ne t} \beta_{n,t,t'} < 1$) and net increase in demand for electric energy ($\sum_{t' \ne t} \beta_{n,t,t'} > 1$) induced by dispatch of DSRs in $t$. The total load adjustment in period $t$ is given by

$$L^D_t(z) = \sum_{n=0}^{N-1} z_{t,n}\, g^D_{t,n} - \sum_{t'=0}^{T-1} \sum_{n=0}^{N-1} \beta_{n,t',t}\, z_{t',n}\, g^D_{t',n}.$$

Let $L_t(z) = L^G_t - W_t - L^D_t(z)$ denote the residual demand for electric energy from conventional generators after DSR dispatch. Value $L_t(z)$ is a random variable and its probability distribution depends on dispatch schedule $z$.

To integrate the DSR model into the stochastic unit commitment model, we replace $L_t$ by $L_t(z)$ in Constraint (6), Problem (7), and in the definition of $l_{[0,t]}$. Because accepting a bid in a time period has implications on previous and subsequent time periods, it is not possible to derive a simple dynamic programming model. Instead, given fixed DSR decision $z$, the resulting problem is a dynamic program with a value function dependent on $z$. We denote the resulting value function for a given DSR dispatch schedule $z$ by $V_t(z, S^x_t)$. Note that now the action space depends on $z$, i.e., $\mathcal{X} = \mathcal{X}(S_t, z)$.

In the combined model, we seek to minimize the sum of DSR dispatch cost and energy generation cost incurred by conventional generators. Combining both objective functions provides the integrated minimization problem

$$\min_z \left\{ C^D(z) + V_{-1}(z, S^x_{-1}) \right\} \tag{9}$$

subject to Constraints (5), (6), and (8). State $S^x_{-1}$ refers to the state of the system right before the beginning of the planning horizon and is assumed to be known.

5. Solution Approach

Dynamic programming models and solution methods like backwards recursion implicitly assume that actions taken in period $t$ affect periods $t' \ge t$ only. In our model, dispatch of DSRs in $t$ can evoke load shifting to periods $t' < t$, which violates this assumption. It is not straightforward to solve Problem (9), which requires concurrent minimization over $z$ and solving the dynamic program. We approach this problem by incorporating DSR dispatch decisions into the myopic problem in Equation (7). This is possible if we relax the requirement that the dispatch schedule must be
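The load adjustment and residual demand defined in this section can be evaluated for one sampled realization of the shift fractions, as in the following Python sketch. It is an illustration under assumptions, not the authors' implementation: the function names, the nested-list layout `beta[n][s][t]` (fraction of segment `n` shifted from period `s` to period `t`), and all numbers are hypothetical, and the sketch assumes that no load is shifted from a period to itself, so the term with `s == t` is skipped.

```python
def load_adjustment(z, g, beta, t):
    """Net load adjustment in period t for one sample of beta: load removed
    in t minus load shifted into t from dispatch in other periods.

    z[t][n]       dispatched fraction of segment n in period t, in [0, 1]
    g[t][n]       bid quantity of segment n in period t (MW)
    beta[n][s][t] sampled fraction of segment n's load shifted from s to t
    """
    num_periods, num_segments = len(z), len(z[0])
    reduced = sum(z[t][n] * g[t][n] for n in range(num_segments))
    shifted_in = sum(
        beta[n][s][t] * z[s][n] * g[s][n]
        for n in range(num_segments)
        for s in range(num_periods)
        if s != t  # assumption: no shift from a period to itself
    )
    return reduced - shifted_in


def residual_demand(gross, wind, z, g, beta):
    """Residual demand per period: gross load minus intermittent generation
    minus the net load adjustment from DSR dispatch."""
    return [
        gross[t] - wind[t] - load_adjustment(z, g, beta, t)
        for t in range(len(gross))
    ]


# Hypothetical three-period example with a single bid segment.
gross = [20.0, 30.0, 22.0]   # gross load per period
wind = [2.0, 3.0, 2.0]       # intermittent generation per period
g = [[4.0], [4.0], [4.0]]    # bid quantities (MW)
z = [[0.0], [1.0], [0.0]]    # dispatch the segment fully in period 1
beta = [[
    [0.0, 0.0, 0.0],
    [0.25, 0.0, 0.5],        # 25% preloading into period 0, 50% rebound into period 2
    [0.0, 0.0, 0.0],
]]
residual = residual_demand(gross, wind, z, g, beta)
```

In this example 4 MW are removed from the peak period, 1 MW reappears in the preceding period, and 2 MW reappear in the following period, so the shifted load is smaller than the dispatched capacity (net energy conservation).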

12 Schneider, Klabjan, and Thonemann: Stochastic Unit Commitment with Demand Response constant over all periods, i.e., z is replaced by z t. Solving the myopic problems provides T locally optimal dispatch schedules that need to be synchronized to a common implementable z. We synchronize the solutions using stochastic progressive hedging, which is a special case of the stochastic proximal point algorithm. We cannot apply the known deterministic versions of the algorithms, because every load realization results in different optimal values z t and we can obtain z t only for a single realization in each iteration. In Subsection 5.1, we modify the deterministic proximal point algorithm to the stochastic case and extend it to stochastic progressive hedging in Subsection 5.2. Because the dynamic program in Problem (9) is very large, we cannot solve it optimally in each iteration. Instead, we approximately solve it as described in detail in Subsection 5.3. In Subsection 5.4, we reformulate Problem (9) to fit the conditions of the stochastic progressive hedging algorithm, i.e., we show how to disaggregate z. We also show how to integrate the approximation algorithm for the dynamic program and the stochastic proximal point algorithm. Our solution approach uses two approximations. First, we apply the stochastic progressive hedging algorithm to a non-convex function (Problem (9) is non-convex) and the algorithm might not converge to an optimum. Second, we use ADP that does not guarantee optimality, even for fixed DSR decisions. Therefore, the solution is not necessarily optimal, but provides an upper bound on the optimal value. To assess the solution quality, we compare it to a lower bound. In Subsection 5.5, we derive this lower bound which is based on a continuous relaxation, and we show how it can be computed using the stochastic proximal point algorithm. Figure 3 illustrates the relation of the different modules of our solution approach. 
[Figure 3: Solution Approaches. Modules shown: Stochastic Unit Commitment Problem (Section 4); Stochastic Proximal Point Algorithm (Subsection 5.1); Stochastic Progressive Hedging Algorithm (Subsection 5.2); ADP for Stochastic Unit Commitment Problem (Subsection 5.3); Solution of the Integrated Problem (Subsection 5.4); Lower Bound on the Optimal Solution (Subsection 5.5). Edge labels: "Special case", "Continuous relaxation".]

5.1. Stochastic Proximal Point Algorithm

In this section, we introduce the stochastic proximal point algorithm. It forms the basis for the procedure that synchronizes local DSR dispatch schedules across time (Subsection 5.2), and we also rely on it to compute a lower bound on the optimal solution (Subsection 5.5). The algorithm is an extension of the proximal point algorithm (Martinet (1970) and Rockafellar (1976)) to stochastic programming problems. We state the procedure in terms of a generic stochastic function $l(v, w, \omega) : N \times M \times \Omega \to \mathbb{R}$, $N \subseteq \mathbb{R}^n$, $M \subseteq \mathbb{R}^m$, where $(\Omega, \mathcal{F}, P)$ is a probability space. Function $l$ is a saddle-function, i.e., $l$ is convex in $v$ for all $w$ and concave in $w$ for all $v$. The objective is to find $(v^*, w^*) \in \arg\operatorname{minimax}_{v,w} E[l(v, w, \omega)]$.

Algorithm 1 : Stochastic Proximal Point Algorithm
1. Initialization
   a) Set $k = 0$.
   b) Choose $v^0 \in N$, $w^0 \in M$.
   c) Choose $\alpha_0 > 0$.
2. Loop
   a) Sample $\omega^k \in \Omega$.
   b) Solve $(v^{k+1}, w^{k+1}) \in \arg\operatorname{minimax}_{\hat v \in N, \hat w \in M} \left\{ l(\hat v, \hat w, \omega^k) + \frac{1}{2\alpha_k} \|\hat v - v^k\|^2 - \frac{1}{2\alpha_k} \|\hat w - w^k\|^2 \right\}$.
   c) If stopping criterion is met, return $(v^{k+1}, w^{k+1})$; else $k = k + 1$, set $\alpha_k$, and go to 2.a).

Algorithm 1 applies the proximal point algorithm for saddle-functions of Rockafellar (1976) to stochastic functions. It differs from the deterministic version in the sampling step, which allows us to solve the problem $\arg\operatorname{minimax}_{v,w} E[l(v, w, \omega)]$ without explicitly evaluating the expected value. In Step 2.a), samples $\omega^k$ are drawn from $\Omega$, and a sequence of modified subproblems is solved in Step 2.b). In Step 2.c), the step size $\alpha_k$ is updated. In Section 6, we identify conditions for $\alpha_k$ under which Algorithm 1 converges. We next extend the algorithm to stochastic progressive hedging.

5.2. Stochastic Progressive Hedging Algorithm

We derive a procedure based on Algorithm 1 that solves a decomposable problem optimally by solving a sequence of smaller subproblems individually.
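Before turning to the extension, the mechanics of Algorithm 1 can be illustrated on a toy problem. The following is a minimal sketch, not the implementation used in the paper. It assumes the hypothetical scalar saddle function $l(v, w, \omega) = \frac{1}{2}(v - \omega)^2 + vw - \frac{1}{2}w^2$ with $\omega \sim \mathcal{N}(\mu, \sigma^2)$, for which the saddle point of $E[l]$ is $v^* = w^* = \mu/2$ and Step 2.b) has a closed-form solution. The step-size schedule $\alpha_k = \alpha_0/(k+1)$ and the fixed iteration count used as stopping criterion are also assumptions made for illustration only.

```python
import random


def prox_step(v_k, w_k, omega, alpha):
    """Step 2.b) for the toy saddle function
    l(v, w, omega) = 0.5*(v - omega)**2 + v*w - 0.5*w**2.

    The first-order conditions of the regularized minimax problem are
      (1 + 1/alpha) * v + w = omega + v_k / alpha
      v - (1 + 1/alpha) * w = -w_k / alpha
    which is a 2x2 linear system solved here in closed form.
    """
    a = 1.0 + 1.0 / alpha
    w_new = (omega + (v_k + a * w_k) / alpha) / (a * a + 1.0)
    v_new = a * w_new - w_k / alpha
    return v_new, w_new


def stochastic_proximal_point(mu=2.0, sigma=1.0, alpha0=1.0, iters=20000, seed=0):
    """Run the loop of Algorithm 1 on the toy problem (all parameters hypothetical)."""
    rng = random.Random(seed)
    v, w = 0.0, 0.0                      # Step 1.b): initial point
    for k in range(iters):
        omega = rng.gauss(mu, sigma)     # Step 2.a): sample omega^k
        alpha = alpha0 / (k + 1)         # assumed diminishing step size
        v, w = prox_step(v, w, omega, alpha)  # Step 2.b)
    return v, w                          # Step 2.c): fixed iteration budget


if __name__ == "__main__":
    v, w = stochastic_proximal_point()
    print(v, w)  # both should be close to mu / 2
```

Each iteration only sees a single sample of $\omega$, yet the iterates approach the saddle point of the expectation because the proximal terms damp the influence of each individual sample as $\alpha_k$ shrinks.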
We apply this variant in our solution approach to synchronize locally optimal DSR decisions in the dynamic program across time. We use the notation $y_{[t,t']}$ with $0 \le t \le t' \le T-1$ to denote the set of values $(y_t, \dots, y_{t'}) \in Z^{t'-t+1}$, $Z \subseteq \mathbb{R}^n$. We define operator $H : Z^T \to Z$ as $Hy_{[0,T-1]} = \frac{1}{T} \sum_{t=0}^{T-1} y_t$ and operator $K : Z \to Z^T$ as $Ky_t = y_t \otimes e$, where $e$ is a vector of all ones of dimension $T$, so that $Ky_t$ consists of $T$ copies of $y_t$. Note that $H$ is the orthogonal projection from $Z^T$ onto $Z$ and that any $w_{[0,T-1]}$ defined by $w_{[0,T-1]} = y_{[0,T-1]} - KHy_{[0,T-1]}$ is orthogonal to $KHy_{[0,T-1]}$.

Let $h : Z \times \Omega \to \mathbb{R}$ be a stochastic convex function of the form $h(y, \omega) = h(h_0(y, \omega), \dots, h_{T-1}(y, \omega))$, where the $h_t$ are convex and linked by $y$ only for all $t$. Our objective is to find $y^* \in \arg\min_y E[h(y, \omega)]$. Let $\tilde h : Z^T \times \Omega \to \mathbb{R}$ be given by $\tilde h(y_{[0,T-1]}, \omega) = h(h_0(y_0, \omega), \dots, h_{T-1}(y_{T-1}, \omega))$. Because $E[\tilde h(KHy_{[0,T-1]}, \omega)] = E[h(Hy_{[0,T-1]}, \omega)] = E[h(\bar y, \omega)]$ for $\bar y = Hy_{[0,T-1]}$, any $y_t$ obtained from solving the problem $\min_{y_{[0,T-1]}} E[\tilde h(y_{[0,T-1]}, \omega)]$ subject to $y_{[0,T-1]} = KHy_{[0,T-1]}$ solves $\min_{\bar y} E[h(\bar y, \omega)]$.

We now focus on $\min_{y_{[0,T-1]}} E[\tilde h(y_{[0,T-1]}, \omega)]$ subject to $y_{[0,T-1]} = KHy_{[0,T-1]}$. The augmented Lagrangian relaxation (see, e.g., Bertsekas (1982)) obtained from dualizing $y_{[0,T-1]} = KHy_{[0,T-1]}$ reads

$$\tilde l(y_{[0,T-1]}, w, \omega) = \tilde h(y_{[0,T-1]}, \omega) + \langle y_{[0,T-1]} - KHy_{[0,T-1]}, w \rangle + \frac{1}{2\alpha} \| y_{[0,T-1]} - KHy_{[0,T-1]} \|^2,$$

where $w$ is a multiplier, $\alpha$ is a penalty factor, and $\langle \cdot, \cdot \rangle$ denotes the standard inner product. Function $\tilde l$ is a saddle-function and $(y^*_{[0,T-1]}, w^*) \in \arg\operatorname{minimax}_{y_{[0,T-1]}, w} E[\tilde l(y_{[0,T-1]}, w, \omega)]$ can be solved using Algorithm 1. The inner product and the quadratic term in $\tilde l$, however, prevent us from decomposing $\tilde l$ in the optimization in Step 2.b) of Algorithm 1.

Following the ideas of the progressive hedging algorithm (Rockafellar and Wets (1991)), we modify the augmented Lagrangian representation by introducing an iterative update of $w$ and of the quadratic penalty term to obtain a decomposable instance of Algorithm 1. Let $k$ be the current iteration of the procedure. First, we replace $\| y_{[0,T-1]} - KHy_{[0,T-1]} \|^2$ in Step 2.b) of Algorithm 1 by $\| y_{[0,T-1]} - KHy^{k-1}_{[0,T-1]} \|^2$, i.e., we use the solution of iteration $k-1$, $y^{k-1}_{[0,T-1]}$, in the penalty term.
Second, we set $w^0 = y^0_{[0,T-1]} - KHy^0_{[0,T-1]}$ and update $w^k$ by setting $w^{k+1} = w^k + \frac{1}{\alpha_k} (y^{k+1}_{[0,T-1]} - KHy^{k+1}_{[0,T-1]})$ instead of optimizing over $w$. Note that now $\langle y_{[0,T-1]} - KHy_{[0,T-1]}, w \rangle = \langle y_{[0,T-1]}, w \rangle$, because $KHy_{[0,T-1]}$ and $w$ are orthogonal. As a result, the minimization over $y_{[0,T-1]}$ in Step 2.b) of Algorithm 2 is now decomposable. The stochastic progressive hedging algorithm is given in Algorithm 2.

Algorithm 2 : Stochastic Progressive Hedging Algorithm
1. Initialization
   a) Set $k = 0$.
   b) Choose $y^0_{[0,T-1]} \in Z^T$.
   c) Set $w^0 = y^0_{[0,T-1]} - KHy^0_{[0,T-1]}$.
   d) Choose $\alpha_0 > 0$.
2. Loop
   a) Sample $\omega^k \in \Omega$.
   b) Solve $y^{k+1}_{[0,T-1]} = \arg\min_{\hat y_{[0,T-1]} \in Z^T} \left\{ \tilde h(\hat y_{[0,T-1]}, \omega^k) + \langle \hat y_{[0,T-1]}, w^k \rangle + \frac{1}{2\alpha_k} \| \hat y_{[0,T-1]} - KHy^k_{[0,T-1]} \|^2 \right\}$.
   c) Set $w^{k+1} = w^k + \frac{1}{\alpha_k} (y^{k+1}_{[0,T-1]} - KHy^{k+1}_{[0,T-1]})$.
   d) If stopping criterion is met, return $Hy^{k+1}_{[0,T-1]}$; else $k = k + 1$, set $\alpha_k$, and go to 2.a).

In Section 6, we identify conditions for $\alpha_k$ under which Algorithm 2 converges. In our approach, we apply Algorithm 2 to Problem (9). Components $h_t$ correspond to the myopic problems $V_t(z, S^x_t)$ and solutions $y^k_t$ correspond to locally optimal DSR dispatch decisions obtained by solving $\min_{z_t} V_t(z_t, S^x_t)$. In Subsection 5.4, we modify Problem (9) to match the structure of $\tilde h$. In each iteration of Algorithm 2, we solve the stochastic unit commitment problem. Solving it optimally is computationally intractable, and we thus replace the exact solution of the subproblems in Step 2.b) by an approximation that we introduce next.

5.3. ADP for Stochastic Unit Commitment

Given a DSR dispatch schedule $z$, we approximately solve the stochastic unit commitment problem using ADP. Our approach is to employ a value function approximation of finite, sufficiently low dimensionality and to approximate its parameters using samples from the stochastic information process. We next introduce an approximate value function and then provide an algorithm to iteratively update the approximation.

Value function approximations replace the table lookup of the value function by an analytical form that is determined by a set of parameters (Powell (2007)). Often, problems exhibit special structure suggesting a certain form of the approximation. In general, a good match between the approximating function and the structure of the underlying problem crucially impacts the solution quality (Bertsekas (2005)).
We use a piecewise linear approximation, because this form has proven to perform well in many resource allocation problems (see Powell (2007)).

Discretizing the State Space. Because attribute vector $a$ contains the continuous coordinate $a_{\text{level}}$, the state vector $R_t$ is of infinite dimension. We cannot handle infinite dimensional state spaces computationally and thus we discretize coordinate $a_{\text{level}}$ into a number of equally sized intervals. We aggregate all exact attribute vectors having $a_{\text{level}}$ within the same interval into an aggregate attribute vector $\tilde a$ and denote the discretized attribute space by $\tilde A$. Similarly, we discretize all components $l_t$ of the load history $l_{[0,t]}$ into a number of intervals and denote the resulting discretized load component of the state by $\hat l_{[0,t]} \in \hat L_t$.

Aggregating the Attribute Vector. Even after discretization, the state space is still too large to be dealt with explicitly. The standard approach to reduce the size of the state space is aggregation (Powell (2007)). We define $G$ levels of aggregation, in which, with decreasing resolution, we define sets of super attributes $\hat A^g$, $g = 0, \dots, G$, where $\hat A^0 = \tilde A$ and $|\hat A^{g+1}| < |\hat A^g|$. Each super attribute $\hat a^g \in \hat A^g$ is a set of attributes $\tilde a$ with similar values. We define $R_{t,\hat a^g} = \sum_{\tilde a \in \hat a^g} R_{t,\tilde a}$. Quantity $R_{t,\hat a^g}$ is no longer a binary vector for $g > 0$ if we decide to aggregate multiple similar generators into $\hat a^g$. Note that we keep level $g = 0$ in the formulation, which corresponds to the discretized but disaggregate state.

Separable Piecewise Linear Approximation. After discretization and aggregation, the state space is still too large to enumerate, because even small problem instances with 10 generators have state spaces with more than $10^{10}$ states. To reduce the number of parameters of the value function, we use a separable piecewise linear approximation. For each $\hat l_{[0,t]} \in \hat L_t$ and $\hat a^g \in \hat A^g$, $g = 0, \dots, G$, we define a piecewise linear function $f_{t,\hat a^g}(\hat l_{[0,t]}, \hat R_{t,\hat a^g})$ that represents an estimate of the value of having $\hat R_{t,\hat a^g}$ resources with attribute vector $\hat a^g$ available, if we have observed load history $\hat l_{[0,t]}$. Each aggregation level $g$ is assigned a weight $w^g > 0$ with $\sum_g w^g = 1$, defining its contribution to the overall value function approximation. We replace the exact value function $V_t(z, S^x_t)$ by a separable piecewise linear approximation of the form

$$\hat V_t(S^x_t) = \sum_{g=0}^{G} w^g \sum_{\hat a^g \in \hat A^g} f_{t,\hat a^g}(\hat l_{[0,t]}, \hat R_{t,\hat a^g}).$$

We denote the slope vector of $f_{t,\hat a^g}$ by $v_{t,\hat a^g}(\hat l_{[0,t]}, \cdot)$.
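To make the structure of the approximation concrete, the following minimal sketch evaluates a weighted, separable piecewise linear function over two aggregation levels. It is an illustration under stated assumptions, not the paper's implementation: the generator names, slopes, and weights are hypothetical, the dependence on the load history $\hat l_{[0,t]}$ and on the period $t$ is dropped for brevity, and each $f$ is represented by its slope vector with $f(0) = 0$.

```python
from typing import Dict, List


def pwl_value(slopes: List[float], r: int) -> float:
    """Piecewise linear function with integer breakpoints and f(0) = 0:
    f(r) is the sum of the first r slopes. f is convex when the slopes
    are nondecreasing."""
    return sum(slopes[:r])


def aggregate(members: Dict[str, List[str]], r0: Dict[str, int]) -> Dict[str, int]:
    """Resource count of each super attribute: sum over its member attributes."""
    return {sup: sum(r0[a] for a in attrs) for sup, attrs in members.items()}


def approx_value(weights, members_by_level, slopes_by_level, r0) -> float:
    """Weighted sum over aggregation levels g of the separable
    piecewise linear terms, one term per super attribute."""
    total = 0.0
    for w_g, members, slopes in zip(weights, members_by_level, slopes_by_level):
        r_g = aggregate(members, r0)
        total += w_g * sum(pwl_value(slopes[sup], r) for sup, r in r_g.items())
    return total


# Hypothetical example data: three generators, two aggregation levels.
R0 = {"gas_1": 1, "gas_2": 1, "coal_1": 0}  # disaggregate (binary) state
MEMBERS = [
    {"gas_1": ["gas_1"], "gas_2": ["gas_2"], "coal_1": ["coal_1"]},  # g = 0
    {"gas": ["gas_1", "gas_2"], "coal": ["coal_1"]},                 # g = 1
]
SLOPES = [
    {"gas_1": [-5.0], "gas_2": [-4.0], "coal_1": [-3.0]},  # g = 0
    {"gas": [-6.0, -2.0], "coal": [-3.0]},                 # g = 1, nondecreasing
]
WEIGHTS = [0.6, 0.4]  # positive weights that sum to one

if __name__ == "__main__":
    print(approx_value(WEIGHTS, MEMBERS, SLOPES, R0))
```

At level $g = 1$ the two gas units are merged into one super attribute, so its resource count is 2 and its function needs two slopes, which is why the aggregated quantity is no longer binary.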
Note that for different load histories $\hat l_{[0,t]} \in \hat L_t$, we have different sets of piecewise linear functions for all $\hat a^g \in \hat A^g$.

Updating the Approximation. Approximating the value function reduces the number of parameters that must be estimated. It does not, however, solve the problem of explicitly calculating the expected value of the optimal cost from periods $t$ to $T-1$ in Equation (7). We use ADP to circumvent this problem by iteratively executing the following procedure. First, generate a sample path from the distribution of random variables. Then, step forward through time and solve the myopic problems based on the current estimate of the value function and the sample path. Finally, use solution information (e.g., gradients) to update the current value function approximation.

We start by replacing the optimal value function $V_t(z, S^x_t)$ by the approximate value function $\hat V_t(S^x_t)$. The optimality equation for period $t-1$ based on the approximation of the value function in $t$ is given by

$$\tilde V_{t-1}(S^x_{t-1}) = E_{L_t}\left[ \hat Y_t(S_t) \mid S^x_{t-1} \right], \quad \text{with} \quad \hat Y_t(S_t) = \min_{x_t \in X_t(S_t)} \left\{ C^C_t(x_t) + \hat V_t\left( S^{M,x}(S_t, x_t) \right) \right\}. \tag{10}$$

Instead of computing the expected value in Equation (10) explicitly, we approximate it by sampling the residual demand $L_t$ and defining

$$\nu_{t-1}(S^x_{t-1}, \hat V_t, l_t) = \hat Y_t(S^M(S^x_{t-1}, l_t)). \tag{11}$$

We define an initial estimate of $\hat V_t$, i.e., the slopes of the piecewise linear approximation, and iteratively update the approximation. Let $k$ be the current iteration and let $\hat V^{k-1}_t$ and $l^k_t$ denote the current estimate of the value function and the current sample of residual demand, respectively. Given $S^x_{-1}$, $\hat V^{k-1}_t$ and sample path $l^k_{[0,T-1]} = [l^k_0, \dots, l^k_{T-1}]$, we start with $t = 0$ and solve

$$\hat x^k_t = \arg\min_{x_t \in X_t(S^k_t)} \left\{ C^C_t(x_t) + \hat V^{k-1}_t\left( S^{M,x}(S^k_t, x_t) \right) \right\} \tag{12}$$

to obtain actions $\hat x^k_t$ and post- and pre-decision states $S^{x,k}_t = S^{M,x}(S^k_t, \hat x^k_t)$ and $S^k_{t+1} = S^M(S^{x,k}_t, l^k_{t+1})$, respectively.

We complete an iteration by updating the current value function approximation. Starting in $t = T-1$, we use the approximate problem $\nu^k_t(S^{x,k}_t, \hat V^k_{t+1}, l^k_{t+1})$ and the current estimate of state $S^{x,k}_t$ to update the estimate $\hat V^{k-1}_t$ by $\hat V^k_t = B(\nu^k_t, \hat V^{k-1}_t, S^{x,k}_t)$, where $B(\cdot)$ is some updating operator that depends on the analytical form of $\hat V_t$ and the underlying problem.

For stability, we maintain functions $f$ as convex and update the slopes of $f_{t,\hat a^g}(\hat l_{[0,t]}, \cdot)$ at state $S^{x,k}_t$ using Problem (11) and a step size $\gamma^k_t$ that may depend on the state. To estimate the gradient at $S^{x,k}_t$, we relax the integrality constraints of the action space and denote the resulting action space by $\bar X_t(S_t)$. We solve the continuous relaxation of Problem (11) and use the dual vector corresponding to Constraint (8) as an estimate of the gradient. For each $\hat a^g \in \hat A^g$, $g = 0, \dots, G$, we take the mean dual value over all generators with attribute vector $\tilde a \in \hat a^g$ and calculate a temporary slope vector $\bar v^k_{t,\hat a^g}$.
We execute the SPAR algorithm (Powell (2007)) on the temporary slopes to maintain convexity and obtain the new slope vector $v^k_{t,\hat a^g}(\hat l^k_{[0,t]})$. Our ADP algorithm for fixed $z$ can be summarized as follows.

Algorithm 3 : ADP Algorithm for the Stochastic Unit Commitment Problem
1. Initialization
   a) Pick initial values for $v^0_{t,\hat a^g}(\hat l_{[0,t]})$ for all $\hat l_{[0,t]} \in \hat L_t$, $\hat a^g \in \hat A^g$, $g$, and $t$, such that $f_{t,\hat a^g}(\hat l_{[0,t]}, \cdot)$ are convex functions.
   b) Set $S^{x,k}_{-1} = S^x_{-1}$, the initial state of the system, for all $k \ge 0$.
   c) Set $k = 1$, choose $\gamma^1 > 0$.
2. Sample
   Sample the residual demand sequence $l^k_{[0,T]} = [l^k_0, \dots, l^k_T]$.
3. Forward Pass
   For $t = 0, \dots, T-1$:
   a) $S^k_t = S^M(S^{x,k}_{t-1}, l^k_t)$.
   b) $\hat x^k_t = \arg\min_{x_t \in X_t(S^k_t)} \left\{ C^C_t(x_t) + \hat V^{k-1}_t(S^{M,x}(S^k_t, x_t)) \right\}$ subject to Constraints (6)-(8).
   c) $S^{x,k}_t = S^{M,x}(S^k_t, \hat x^k_t)$.
4. Backward Pass
   For all $t = T-1, \dots, 1$:
   a) Solve $\min_{\bar x_t \in \bar X_t(S^k_t)} \left\{ C^C_t(\bar x_t) + \hat V^k_t(S^{M,x}(S^k_t, \bar x_t)) \right\}$.
   b) Record dual vector $\lambda^k_t$ corresponding to Constraint (8).
   c) Do for all $g = 0, \dots, G$, $\hat a^g \in \hat A^g$:
      i. Count number of generators with super attribute $\hat a^g$: $n = \sum_{\tilde a \in \hat a^g} R^k_{t,\tilde a}$.
      ii. Calculate mean dual value $\bar\lambda = \frac{1}{n} \sum_{\tilde a \in \hat a^g} R^k_{t,\tilde a} \lambda^k_{t,\tilde a}$.
      iii. For $m = 0, \dots, M$ set
         $\bar v_{t-1,\hat a^g}(m) = \gamma^k_{t-1} v^{k-1}_{t-1,\hat a^g}(\hat l^k_{[0,t-1]}, n) + (1 - \gamma^k_{t-1}) \bar\lambda$ for $m = n$, and
         $\bar v_{t-1,\hat a^g}(m) = v^{k-1}_{t-1,\hat a^g}(\hat l^k_{[0,t-1]}, m)$ otherwise.
      iv. Execute SPAR projection algorithm on $\bar v_{t-1,\hat a^g}(\hat l^k_{[0,t-1]})$ to maintain convexity.
      v. Obtain $\hat V^k_{t-1}$ by setting $v^k_{t-1,\hat a^g}(\hat l^k_{[0,t-1]}) = \bar v_{t-1,\hat a^g}(\hat l^k_{[0,t-1]})$.
   d) If stopping criterion not met, set $k = k + 1$, set $\gamma^k$, and go to Step 2.

We use a double-pass algorithm to estimate the value function, i.e., we first record the trajectory in a forward pass and then update the value function approximation in a backward pass (Powell (2007)). In Step 4 of the algorithm, we update $\hat V^{k-1}_{t-1}$ using $\hat V^k_t$, which is the already updated estimate of the value function in $t$. In our numerical study, this procedure converged more quickly than single-pass algorithms.

5.4. Solution of the Integrated Problem

To solve Problem (9), we relax the requirement that the dispatch schedule must be constant over all periods and include a local copy of $z$ in each one-period problem. In each iteration of the stochastic progressive hedging algorithm, we solve the dynamic program with local DSR dispatch, and the algorithm forces convergence of the locally optimal schedules to a common constant $z$. We first reformulate Problem (9) to match the structure required by Algorithm 2.
Subsequently, we show how to solve Step 2.b) of the algorithm using standard dynamic programming.

Proposition 1. Problem (9) is equivalent to problem

$$\min_{z_{[0,T-1]}} \; C^D(z_0) + \tilde V_{-1}(z_{[0,T-1]}, S^x_{-1})$$

subject to

$$z_0 = \dots = z_{T-1}, \tag{13}$$

with

$$\tilde V_{t-1}(z_{[t,T-1]}, S^x_{t-1}) = E_{L_t(z_t)}\left[ \min_{x_t \in X(S_t, z_t)} \left\{ C^C_t(x_t) + \tilde V_t\left( z_{[t+1,T-1]}, S^x_t \right) \right\} \right]$$

and $\tilde V_{T-1} = 0$.

All proofs can be found in the electronic companion to the paper. [For the review process, the electronic companion has been attached to the main manuscript.]

Proposition 1 is a technical result that allows reformulating the dynamic program with local DSR dispatch decisions. The value functions $\tilde V_t$ are obtained from the value functions $V_t$ by assuming that in each time period a different DSR decision can be made, i.e., $z$ becomes $z_t$. Note that Constraint (13) is the same as $z_{[0,T-1]} = KHz_{[0,T-1]}$ and that $\tilde V_t$ corresponds to $h_t$ and $z_t$ corresponds to $y_t$ of Subsection 5.2, respectively. The augmented Lagrangian representation obtained by relaxing $z_{[0,T-1]} = KHz_{[0,T-1]}$ is

$$L_\alpha(z_{[0,T-1]}, w_{[0,T-1]}) = C^D(z_0) + \tilde V_{-1}(z_{[0,T-1]}, S^x_{-1}) + \sum_{t=0}^{T-1} \left( \langle z_t - Hz_{[0,T-1]}, w_t \rangle + \frac{1}{2\alpha} \| z_t - Hz_{[0,T-1]} \|^2 \right), \tag{14}$$

where $\alpha$ is a positive constant. It is known that if Problem (9) is convex, then $z^*_{[0,T-1]}$ obtained by solving

$$\max_{w_{[0,T-1]}} \min_{z_{[0,T-1]}} L_\alpha(z_{[0,T-1]}, w_{[0,T-1]}) \tag{15}$$

is a global minimizer of Problem (9) (see, e.g., Bertsekas (1982)). The structure of $L_\alpha$ is similar to $\tilde l$ of Subsection 5.2 and Algorithm 1 is in principle applicable. However, executing Step 2.b) of the algorithm requires minimizing over $z$ and computing $\tilde V_{-1}$ concurrently. Because $Hz_{[0,T-1]}$ is a function of all $z_t$, we cannot simply add the term $\| z_t - Hz_{[0,T-1]} \|^2$ to $\tilde V_t$ and treat $z_t$ as a standard decision variable in the dynamic program. As a result, it is unclear how to carry out Step 2.b) of the algorithm.

We solve this problem by following the ideas of Subsection 5.2 and by applying Algorithm 2 to the problem. We replace $Hz_{[0,T-1]}$ by an exogenous parameter $\bar z \in Z$ in Equation (14) and require $w_t$ to be orthogonal to $\bar z$.
Then, because $\langle \bar z, w_t \rangle = 0$, we can write $\langle z_t, w_t \rangle$ instead of $\langle z_t - \bar z, w_t \rangle$ and obtain

$$L_\alpha(z_{[0,T-1]}, w_{[0,T-1]}, \bar z) = C^D(z_0) + \tilde V_{-1}(z_{[0,T-1]}, S^x_{-1}) + \sum_{t=0}^{T-1} \left( \langle z_t, w_t \rangle + \frac{1}{2\alpha} \| z_t - \bar z \|^2 \right), \tag{16}$$

which is a modified version of the $L_\alpha$ in Equation (14). Now we can augment the action space of the dynamic program by $z_t$ and incorporate optimization over $z_t$ into the dynamic programming formulation. Theorem 1 provides a transformation of the modified $L_\alpha$ that allows us to apply backwards recursion to the problem.
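The effect of treating $\bar z$ as exogenous can be illustrated with a toy instance of Algorithm 2. The sketch below is a simplified illustration under stated assumptions, not the procedure used in the paper: the per-period dynamic programs are replaced by hypothetical scalar quadratics $h_t(y_t, \omega) = \frac{1}{2}(y_t - c_t - \omega_t)^2$ with zero-mean noise $\omega_t$, so that the common minimizer of the expected sum is the mean of the $c_t$. The step-size schedule $\alpha_k = \alpha_0/(k+1)$ and the fixed iteration budget are also assumptions. Because the penalty uses the previous average, each period's subproblem in Step 2.b) is solved independently in closed form.

```python
import random


def stochastic_progressive_hedging(c, sigma=0.5, alpha0=1.0, iters=20000, seed=0):
    """Toy version of Algorithm 2 with h_t(y, omega) = 0.5*(y - c_t - omega_t)**2.

    c      : list of per-period targets (hypothetical data)
    returns: (H y, y) after the last iteration
    """
    rng = random.Random(seed)
    T = len(c)
    y = [0.0] * T                        # Step 1.b): initial local solutions
    y_bar = sum(y) / T                   # H y^0, the average over periods
    w = [y_t - y_bar for y_t in y]       # Step 1.c): w^0 = y^0 - K H y^0
    for k in range(iters):
        alpha = alpha0 / (k + 1)         # assumed diminishing step size
        omega = [rng.gauss(0.0, sigma) for _ in range(T)]  # Step 2.a)
        # Step 2.b): each period t minimizes
        #   0.5*(y - c_t - omega_t)**2 + y*w_t + (y - y_bar)**2 / (2*alpha)
        # separately; setting the derivative to zero gives the closed form below.
        y = [(alpha * (c[t] + omega[t] - w[t]) + y_bar) / (alpha + 1.0)
             for t in range(T)]
        y_bar = sum(y) / T               # H y^{k+1}
        # Step 2.c): multiplier update with the deviation from the average.
        w = [w[t] + (y[t] - y_bar) / alpha for t in range(T)]
    return y_bar, y


if __name__ == "__main__":
    y_bar, y = stochastic_progressive_hedging([1.0, 2.0, 6.0])
    print(y_bar, y)  # local solutions should agree and lie near mean(c) = 3
```

In this toy setting the local solutions are pulled toward their running average while the multipliers absorb the differences between the $c_t$, which mirrors the role of $\bar z$ and $w_t$ in Equation (16).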