Benefits of Storage Control for Wind Power Producers in Power Markets

Size: px

Start display at page:

Download "Benefits of Storage Control for Wind Power Producers in Power Markets"

Sheena Randall
5 years ago
Views:

1 Benefits of Storage Control for Wind Power Producers in Power Markets Mainak Chowdhury, Student Member, IEEE, Milind Rao, Student Member, IEEE, Yue Zhao, Member, IEEE, Tara Javidi, Senior Member, IEEE Andrea Goldsmith, Fellow, IEEE, Abstract We consider a wind power producer (WPP) participating in a dynamically evolving two settlement power market. We study the utility of energy storage for a WPP in maximizing its expected profit. With random wind and price processes, the optimal forward contract and storage charging/discharging decisions are formulated as solutions of an infinite horizon stochastic optimal control problem. For the asymptotically small storage capacity regime, we precisely characterize the maximum profit increase brought by utilizing energy storage. We prove that, in this regime, an optimal policy uses storage to compensate for power delivery shortfall/surplus in real time, without changing the forward contracts from the optimal ones in the absence of energy storage. This policy also serves as an approximately optimal policy for the case of relatively small storage capacity. We also design a policy based on model predictive control (MPC) that is approximately optimal for general storage capacities. We numerically evaluate the developed policies for wind and price processes with representative statistics from real world data. It is observed that, as expected, the simple small storage approximation policy performs closely to the optimum when storage is relatively small, while the more complex stochastic MPC policy performs better for larger storage capacities. I. INTRODUCTION Wind energy contributed to 7% of load serving by renewable energies in the U.S. in 204 [. Globally, the installed wind power capacity has increased eight-fold in the past decade, and continues to grow thanks to the decreasing capital costs of harnessing wind power. A typical approach for integrating wind energy into the electricity grid is to let wind power producers (WPPs) participate in conventional multisettlement power markets, where power is sold in multiple forward (e.g., day-ahead, hour-ahead, and 5-minutes ahead) markets. A summary describing participation of WPPs for multi-settlement markets in several Independent System Operators (ISOs) in the U.S. can be found in [2. Such markets raise significant challenges for WPPs, particularly in the dayahead market, where the vast majority of contracts of power delivery are made. In particular, as wind power generation is non-dispatchable, committing to forward power contracts This research was supported in part by a 3Com Corporation Stanford Graduate Fellowship, and in part by the NSF under CPS Synergy grant M. Chowdhury, M. Rao, and A. Goldsmith are with the Dept. of Electrical Engineering, Stanford University, Stanford, CA, 94305, USA, ( mainakch,milind,andreag@stanford.edu). Y. Zhao is with the Dept. of Electrical and Computer Engineering, Stony Brook University, Stony Brook, NY, 794 USA, ( yue.zhao.2@stonybrook.edu). T. Javidi is with the Dept. of Electrical and Computer Engineering, UC San Diego, La Jolla, CA, 92093, USA, ( tara@ece.ucsd.edu). raises fundamental challenges to WPPs, as future wind power generation is inherently uncertain due to difficulties of wind forecast [3. Hence, a WPP has the risk of running a shortfall in delivering a forward power contract when its actual generation is insufficient. Making up for such a shortfall by fast-responding generation resources (e.g., buying power in the real time market) is usually costly. As wind power generation itself has very low variable cost, the cost due to compensating for the risk of shortfall constitutes a WPP s most significant operating cost. There are a variety of approaches for the WPPs to reduce the risk of delivery shortfall. Methods that exploit statistical characteristics of wind generation such as improving wind power forecast [3 and aggregating diverse wind sources [4 directly reduce the uncertainty in the WPPs knowledge of future wind power generation. Fast-ramping fuel-based backup generators can also compensate for the uncertainty of future wind power, albeit with a high operating cost. Recently, energy storage has emerged as an increasingly viable commercial technology due to its decreasing cost [5; the ability of energy storage to shift energy across time provides great flexibility for WPPs to charge/discharge energy to compensate for any generation shortfall/surplus to meet forward contracts. Nonetheless, as energy storage is currently still expensive for large scale deployments, assessing its value in reducing operating cost and increasing profit is of primary interest to WPPs. Without storage, the problem of a single WPP participating in a two-settlement (day-ahead and real time) market has been studied in [6, [7, [8 where the optimal (i.e. expected profit maximizing) forward contracts based on the statistics of wind generation were developed. With storage, different variants of a stochastic control formulation have been considered in [9, [0, [, [2, [3, [4 for setting either the day ahead contracts or the charging policy or both. Optimal storage charging policy is studied in the absence of a power market in [9 where the total variation of the power output of a WPP is minimized. The value of providing co-located energy storage to a WPP has been studied in [0 and [ for real time market participation only, where dynamic programming (DP) solutions and online algorithms were developed, respectively. Focusing on a single day of wind generation and storage operation, the day-ahead market is considered in assessing the role of storage colocated with wind power in [2, where the optimal day-ahead contracts and storage operation were solved. A dynamically evolving two-settlement market has been considered in [3 and [4 in which optimal operation of co-located storage and wind power are studied. There, the model is limited to

2 2 one in which the forward market trades power one time slot ahead. As will be clarified next, this corresponds to the case of D = in this paper. We note that, when hourly day-ahead and hourly real-time markets are considered, D can be as large as 24, which significantly complicates the study of the problem. Another work that considers storage and WPPs competing (as opposed to cooperating) with each other in a power market is [5, where the Nash equilibria in this competitive setting are derived. We further note that the use of energy storage has also been extensively studied from a system operator s perspective on unit commitment and optimal power flow problems (see, e.g., [6, [7 among others). In this work we consider a WPP participating in a dynamically evolving conventional two-settlement (day-ahead and real time) market, while exploiting co-located energy storage (i.e., a battery). We consider that forward power contracts are made some time slots ahead of delivery, depending on the forward market structure. Wind power and prices are modeled as random processes with general statistics. We study the maximum utility of energy storage in the WPP s operation involving delivery of day ahead forward power contracts while limiting costs due to real time market interactions. Since the computation of a policy minimizing an infinite horizon expected discounted cost is in general intractable due to a policy living in a multi-dimensional continuous space, we first focus on small storage asymptotes. We show that, for a small storage, an optimal policy involves employing storage to reduce the real time market interaction via appropriate charging (when there is excess energy) and discharging (when there is a deficit), without changing the forward contracts from the optimal ones in the absence of energy storage. We also show that the discounted infinite horizon profit is concave and increasing. Thus, calculating the precise incremental benefit of a small storage provides an upper bound that linearly increases with the battery capacity on the utility of a battery. Next, for general battery capacity, we propose a method to obtain a convex quadratic approximation of the value function, namely the expected infinite horizon discounted cost. Based on this approximation, we develop a stochastic model predictive control (MPC) policy. We numerically evaluate the value function associated with the small battery approximation policy and the MPC policy. We observe that the small battery approximation policy performs very well when battery capacity is small, while the stochastic MPC performs better for larger storage. The remainder of the paper is organized as follows. In Section II, we describe our system model, our objective in the optimal policy computation and the underlying assumptions made in the analysis. The optimal policy in the small battery asymptotic regime is characterized in Section III. We then describe a stochastic MPC policy based on quadratic approximations of the value function in Section IV. Numerical results are presented in Section V. Conclusions are drawn in Section VI. Wind Battery w t η + (b t+ b t ) + η (b t b t+ ) + Forward market p f t p b t, p s t s t Wind Farm e t Real time market s t D Grid Fig.. System model of a wind farm participating in a two-settlement market. Red nodes represent the wind farm and the blue nodes represent external systems. Arrows with the labels represent flows between nodes. Solid arrows represent directions of energy flows (bidirectional arrows indicate energy may flow both ways). Dotted lines represent price and contract information flows. A. System model II. PROBLEM DESCRIPTION We consider the infinite horizon problem of a WPP with colocated energy storage participating in a two settlement power market. The system diagram is depicted in Fig.. At each time instant t, the WPP performs the following two actions: Forward market interaction: The WPP promises to deliver a certain amount of electricity some number of hours, denoted by D t, into the future. The WPP s revenue from this forward market interaction depends on the contract price and the amount of electricity promised. More generally, multiple forward contracts to be delivered at different future times can also be formed. Figures 2(a) and 2(b) depict two models of the forward market s timelines. In Figure 2(a), at every hour t, a forward contract in the amount of s t is formed by a WPP, to be delivered D t = 24 hours later. In Figure 2(b), at the 0 th hour of each day, a total of 24 forward contracts are formed to be delivered in each of the 24 hours in the next day, respectively. We employ the st model, namely, D t = D for some fixed D, in deriving our main results. Generalizations of our results to other models is discussed in Section III-C. At each hour t, the contracted amount D hours earlier for the current hour t, denoted by s t D, needs to be delivered by the WPP. This necessarily entails the WPP s interactions with the real time market and its storage operation, as described next. Real time market interaction and storage operation: The WPP fulfils its earlier commitment made to be delivered at the current time. Due to the uncertainties in wind power generation, the WPP may either fall short of

3 S 0 is S S 46 2 S 23 is committed, to be delivered at t=47. committed, S S 26 S 22 S to be 25 S 0 is S S 46 2 S 24 delivered committed, S S 26 at t=24. S to be 25 S 24 delivered t= at t=24. 4 t= All commitments to be delivered from t = 24 to 47 are contracted at t = 0. All commitments to be delivered from t = 48 to 63 are contracted at t = 34. S 23 is committed, to be delivered at t=47. (a) All commitments to be delivered from t = 24 to 47 are contracted at t = 0. All commitments to be delivered from t = 48 to 63 are contracted at t = 34. S 22 t= (b) t= Fig. 2. Examples of the timelines of forward market interactions: a) At every hour t, a forward contract in the amount of s t is formed to be delivered D = 24 hours later, and b) In each day, at the 0 th hour, 24 forward contracts are 4 formed to be delivered in each of the 24 hours of the next day. its commitment, or it may have an excess left over after meeting all the forward contracts. In the former case, the WPP has to either buy power from the real time market or discharge power from its energy storage. In the latter case, the WPP sells power to the real time market or charges power to its energy storage. The objective of the WPP is to minimize the infinite horizon expected discounted cost by choosing its actions appropriately. The notation associated with our model is summarized in Table I. Next, we present a more detailed description of the technical assumptions in our model in Section II-B. B. Technical assumptions and model details Our model makes the following assumptions: The price process (p f t, p b t, p s t ) (where p f t is the forward contract price, p b t is the real time buying price, and p s t is the real time selling price) and wind process w t are bounded and Lebesgue measurable. The wind farm is a price-taker, and actions of the wind farm do not impact the joint exogenous price and wind processes. The price and wind processes can be correlated. The stored energy in the battery is b t. At each time instant, the WPP decides the level b t+ to charge the battery to at the next time instant. Let the charging efficiency be denoted by η + and discharging efficiency by η. Thus, the energy consumed (and an negative amount means energy is extracted) for changing the battery level is given by η + (b t+ b t ) + η (b t b t+ ) +, where ( ) + denotes max(0, ). Due to losses, η η +. Note that in batteries where the efficiency is 00% (lithium ion batteries are very close to ideal [8), η + = η =, and the energy consumed/extracted for changing the battery level is b t+ b t. At each time instant, we have the constraint that b t+ B t where B t is a set dependent on current battery level b t. This allows us to incorporate the following constraints in our model: S 47 S 47 TABLE I SUMMARY OF NOTATIONS USED External processes p f t Forward market price p b t Real time buying price p s t Real time selling price w t Wind power z t (w t, p f t, pb t, ps t ) F t All external realizations till time t: {z s} s t Battery parameters B Battery capacity b t Battery level at the beginning of time t B t Domain for battery level at time t + B If B t is the same for all t we use this shorthand notation R Ramping constraint η + Charging efficiency η Discharging efficiency Other wind farm parameters D The time after which the contract has to be delivered s t The contract formed at time t, to be delivered at time t + D s t t D Vector which contains all the power contracts to be offered from time t to t + D e t The excess energy left over after meeting contract and charging/discharging battery x t All prices (current and past time) and past actions; we call this the state of the system. π B,t ( ) Mapping from state to action at time t when battery capacity is B. Action is in R 2 and specifies the contract decision at time t and the battery level at t +. g Stage cost (time dependence implicit in g) g f Cost of interaction with the forward market g r Cost of interaction with the real time market f(,, ) Dynamics of the state evolution Capacity constraints The capacity of the battery is B. If we have an ideal battery, B t = B = [0, B. If we do not wish to completely charge or discharge the battery to prevent excessive wear, we can limit B = [ɛ, B ɛ. This reduces the effective capacity to B 2ɛ. Ramping constraints If the battery has power limits while charging or discharging, there is a ramping constraint R B which is the maximum change in battery levels in a time instant. Ramping can be modeled as B t = B [b t RB, b t + RB. In addition to setting the battery level at time t, the WPP further makes a decision regarding sales and purchase of energy in the forward market. In particular, the WPP contracts s t in the forward market to be delivered D time units later. The WPP receives contract price p f t for each unit contracted. The maximum contract is bounded by a sufficiently large constant S, i.e., s t [0, S. As the earlier contract s t D needs to be delivered from the realized wind and stored energy, the available energy e t at time t, after delivering the contract and charging/discharging the battery to the appropriate level, is e t = w t η + (b t+ b t ) + +η (b t b t+ ) + s t D. () The WPP sells its excess (e t ) +, if any, at the real-time market selling price p s t. The WPP meets its deficit, if any, by purchasing ( e t ) + units of energy at the real-time buying price p b t. 3

4 4 The state of the WPP x t in general consists of the history of contracts to be delivered, prices and wind energy realized, and the energy storage level. When the price and wind statistics are Markovian (to which our results are not restricted), the state only needs to include the current prices and wind as opposed to the history of prices and wind, i.e., x t (p f t, p s t, p b t, w t, s t t D, b t, t). The state space is X. For brevity, we denote the price and wind processes by z t = (w t, p f t, p b t, p s t) and the natural filtration [9 associated with the external stochastic process by {F t } t 0. Roughly speaking, the natural filtration satisfies the following properties: F t F t+s for s 0, t 0. Conditioning on F t means that all realizations of the external stochastic process until time t, i.e., {z s } s t, are known. The action given a state x t at time t consists of specifying the contract level s t and the battery level b t+ for time t +. Thus, action π B,t (x t ) = (s t, b t+ ) and the space of all actions A is assumed to satisfy A = {(s, b) s [0, S, b [0, B}. We define the stage cost g(x t, π B (x t )) as g(x t, π B (x t )) = g f (p f t, s t ) + g r (p b t, p s t, e t ) + I Bt (b t+ ), (2) where g f is the forward market cost component, g r is the real-time market cost component and the third term limits the feasible range of b t+. I Bt (b t+ ) denotes the standard indicator function which evaluates to zero if b t+ B t and is infinity otherwise. We consider g f (p f t, s t ) = p f t s t, g r (p b t, p s t, e t ) = p b t( e t ) + p s t (e t ) +. Note that g r ( ) captures the risk (the greater the e t is, the more the deficit and purchase from the real time market), whereas g f captures the profits from the forward contract. Note that g, g f and g r are convex functions in their individual arguments. It can also be seen that the dynamics of the state is linear in the action and external processes, i.e. with a linear function f, (3) x t+ = f(x t, π B,t (x t ), z t+ ). (4) We list some additional technical assumptions below. Assumption : The prices are such that β D p s t < min(p f t ) < max(p f t ) < β D p b t with probability for all t 0. This is a sufficient condition to ensure that an agent with only a battery does not make a profit and also eliminates infinite arbitrage opportunities. Assumption 2: We consider contract policies in a (sufficiently large) neighborhood of the optimal batteryless contract policy denoted by P L. Specifically, we consider policies which are Lipschitz in B for all B in the following sense: there exists a finite (could be large) L so that for all considered policies π B,t = (s B,t, b B,t ) with battery of capacity B at time t, s B,t (x t ) s 0,t( x t ) LB, where s 0,t is the optimal batteryless contract policy, and x t equals x t at all coordinates except for the battery capacity where it is zero. Note that s 0,t can be computed based on just the external statistics by solving a news vendor problem [6. From the constraint on the energy capacity, we also have b B,t (x t ) B. Throughout the text, we use o(b) to refer to any function z(b) such that z(b) lim B 0 B = 0, and Θ(B) to refer to any function z(b) such that for some c 0. C. The objective We seek to obtain a policy z(b) lim B 0 B = c π B,t : Domain(x t ) R 2 that maps the state x t at each time instant to an action π B,t (x t ) that minimizes the infinite horizon stage cost. Note that if the wind and prices are Markov processes with memory M, the domain for the state variable is (x t ) = R D+5M+. We consider a discount factor of β <. We formulate the following optimal stochastic control (or dynamic programming) problem: [ V t (x t ) min E β s t g(x s, π B,s (x s )) F t π B,t s=t s.t. x s+ = f(x s, π B,s (x s ), z s+ ) s t.(5) Before we describe our main results we introduce additional notation related to the objective function. Let [ V πb,t,t(x t ) E β s t g(x s, π B,s (x s )) F t s=t be the value function obtained by following some fixed policy π B,t conditioned on the realization at time t, i.e., on F t. Let V + π B,t,t be defined as V + π B,t,t(s t t D, b) E [ V πb,t,t(x t ) F t, (6) where the expectation is taken before the randomness at time t is realized, and b is the initial battery level at time t. III. ANALYSIS OF THE OPTIMAL POLICY In this section we derive some properties of the optimal policies.

5 5 A. Structural properties The Bellman operator of the dynamic program formulated in (5) is T t V t (x t ) inf u t A g(x t, u t )+βe zt+ [V t+ (f(x t, u t, z t+ )) F t. In the simple case where the external statistics are stationary or cyclo-stationary, the value function and optimal policy can be obtained by solving the fixed point equation, V = T V. Since the stage cost is bounded and the Bellman operator is a contraction, the fixed point equation has a unique solution. We call a stationary optimal policy (not necessarily unique) corresponding to this optimal solution πb. In case the statistics are not stationary, the optimal value function V t (x t ) and a corresponding policy πb,t are even harder to compute; however our asymptotic characterizations in the following sections hold even for such general statistics. The evaluation of πb,t for a finite B is in general intractable both analytically and computationally. We can, however, consider asymptotic analysis to give us further insights. In the following, we list some observations about an infinite battery asymptote first, followed by the optimal small battery characterization in Section III-B. We now describe an upper bound on the value function when the battery is large and the discount factor is close to, i.e., we would like to minimize the average cost instead of the discounted total cost. As the energy storage capacity B, it is apparent that the optimal policy of the wind farm is to store all realized wind and contract what is stored when the price at the forward market is highest. i.e. p f t = max p f. Lemma. The average stage profit realized with infinite battery is upper bounded by, V avg = E [w max p f. where E [w = lim T T T t= E [w t is the average wind energy realized over all time. We can realize this only if there are no ramping constraints or other inefficiencies. This is the optimal policy as any interaction with the real time market (shortfall or excess) is suboptimal due to the absence of (infinite) arbitrage opportunities. B. Small battery asymptotic analysis As it is difficult to compute the optimal policy πb,t in general, in this section we focus on the incremental value of storage when the battery capacity B is small. As we consider small battery capacities, ramping constraints of the battery can be negleced in the analysis. In essence, we perform a sensitivity analysis of V + πb,t,0 (6) at B = 0. Assuming that both our initial contract history and the battery level is zero to start with, we set s 0 D = 0 RD, the initial battery b = 0 and seek to compute V + π B,t,0(0,0) B. We now state our main theorem: Theorem. At time t, consider the following policy π B,t : s B,t is set to be the optimal batteryless contract. The optimal storage operation policy is to charge fully or discharge fully depending only on the external wind and price processes. The exact specification is in Program. For a small enough B, the above policy achieves, 0) to within o(b). V + πb,t,t(st t D Program Optimize the next stage battery level If η + p b ti(s t D > w t ) + η + p s ti(s t D < w t ) < βα t (s t t+ D, 0), set b t+ = B. If η p b ti(s t D > w t ) + η p s ti(s t D < w t ) > βα t (s t t+ D, 0), set b t+ = 0. Else: set b t+ = b t. α t (.) V + π B,t,t(,b) b in Program is the expected cost of storing an additional unit of charge in the battery at time t. It is non-positive, since one can always make an expected profit (and the cost is the negative of profit) out of storing an additional unit of charge in the battery. The indicator function I( ) is if the condition is satisfied and 0 otherwise. As a typical case, if p b t < α t ( ) < p s t, and β, η +, η, (7) Program reduces to the specification that we should charge if we have a wind excess event (s t D < w t ), and discharge if we have a wind deficit event (s t D > w t ). This is because, as we show later, for statistics where the distribution of the external processes does not change significantly over time scales of length D units (in a sense made precise in Appendix D), α t is close to the negative of the contract price D time units earlier, p f t D, and the above conditions (7) hold under Assumption. For general statistics and correlation structure α( ) can be computed using a dynamic program, whose complexity depends on the mixing time of the external processes. It may be intractable in general. Note that α t ( ) needs to be computed only within Θ(B), as discussed more after the description of an elaborate version of this program in Appendix B. We observe, in particular, that the optimal policy in the low battery regime does not involve changing the optimal forward contract from that with no storage, but instead involves deciding when to charge or fully discharge the battery to reduce the risk of energy shortfall (measured by g r ( )). For specific cases where the external price and wind processes are cyclostationary, this corresponds to charging when there is excess energy, and discharging when there is a deficit. We prove Theorem in the appendices. We can thereafter compute V + π B,t,0(0,0) B by policy evaluation. For a system with a finite number of states, policy evaluation or evaluating the value functions corresponding to particular states can be done by solving a linear system of equations. It may also be evaluated using Monte Carlo simulations of state trajectories. The latter may be the only tractable option for larger state spaces due to complexity. As shown later in Section V, for some special wind and price processes, the incremental value of a small amount of storage may be evaluated in closed form.

6 6 C. Extension to other forward market models In this section we describe extensions of the small battery analysis to another day-ahead market model commonly used in practice. The main difference of this variant from the model analyzed earlier is as follows: a) Forward contract decisions are made at one specified time every D time units, b) At each time of decision, D forward contracts are determined for the consecutive D time units starting from D time units later. An example with D = 24 and D = 4 is the following: at 0am of each day, 24 forward contracts are made for the 24 hours starting from the beginning of the next day. In comparison, the model analyzed in earlier sections evenly spread the forward contract decisions over all hours. We observe that the same dynamic programming formulation in Section II applies in this market model, and the change to this market model affects the state of the DP. However, in spite of the differences from the model considered earlier, in the low battery limit, there are many similarities about the nature of the policies achieving the optimal profits in both market models. In particular, as we show in Appendix F, Theorem continues to hold under this market model. The reason is similar to that in the earlier market model: it can be shown that the batteryless optimal policy solves the news-vendor problem; by the first order necessary conditions for optimality, any deviation in the contract policy would not affect the profit up to first order terms. The charging policy is also exactly the same as in the earlier case, with a possibly different α t ( ). IV. APPROXIMATE DYNAMIC PROGRAMMING APPROACHES The optimal policy described in Section III-B becomes suboptimal for batteries of large capacities. In this section, we propose an efficiently computable heuristic policy when relatively large batteries are used. We first compute a convex quadratic approximation of the value function by converting costs and constraints to quadratic functions. We then describe how model predictive control approaches can yield better results. A. Quadratic approximation of value function We seek to represent the value function by a quadratic approximation and use this to determine a policy to follow. The non-linearities in the control problem arise from the piecewise linear nature of the real time cost function and constraints (imposed by limited energy storage) on the charging levels and non-negative contracts in the forward market. As introduced in [20, we replace these with quadratic cost functions as follows: g f (p f t, s t ) = h f, (s t + h f,2 ) 2 (8) g r (p b t, p s t, w t + b t s t D b + t ) (9) ( = h r, wt + b t s t D b + ) 2 t h r,2 (0) ( I [0,B (b t +) = γ b + t B ) 2, () 2 where coefficients h,, γ are chosen as functions of ( [ expected external wind and price processes E, E [p s t, E [ ) pt b, E [wt, B to approximate the p f t piecewise linear costs and the energy storage constraint. This converts the problem to a linear-quadratic (LQ) control problem. The state can be written as x t = [w t, b t, s t t D and the action u t = [b t+, s t. The state vector does not include the price, introducing another source of suboptimality. This is because policy recommendations from such an approach are the same irrespective of instantaneous prices. The stage cost and dynamics can now be written as x t [ x Qt q t u t t q u t r t t x t+ = A x t + A 2 u t + A 3 w t, g quad t (x t, u t ) = where Q, q, A are constants that can be straightforwardly computed. This problem has a convex quadratic stage cost and linear dynamics. The convex quadratic value function V quad (x t ) can be found by solving the Algebraic Ricatti Equations (ARE) resulting from the Bellman optimality equation. V quad is a quadratic approximation of our original problem. This approach also delivers an affine policy as u t (x t ) = K t x t + k t for K t, k t determined through the ARE. The quality of the approximation V quad is much higher when the price process does not vary much. Another approach to obtaining a quadratic approximator of the value function which is not problem specific is presented in [2. The authors use the S procedure to find a convex quadratic under-approximation to the value function using iterated Bellman equations. B. Model predictive control With any approximation to the value function, denoted by V app, we can employ a model predictive control (MPC) approach to determine a policy by applying V app as the terminal cost. In the LQ approximation to the problem described in Section IV-A, the variation in price is not taken into account while determining the policy. The proposed MPC approach refines the usage of the approximation by using the current price information. First, the expected value of the wind and price processes can be used for unseen future realizations in a certainty-equivalent MPC approach. The M step certainty-equivalent MPC algorithm utilizes an V app (x t ) to obtain a policy as u t = argmin s t,b t+ min s t+m t+,bt+m+ t+2 M j=0 β j g(s t+j+ t+j D+, bt+j+, z t+j) +β M V app (s t+m t+m D+, b t+m, E [z t+m ). (2) As defined earlier, z t = (p f t, p s t, p b t, w t ) represents the external stochastic processes. At each stage t, we see z t = z t and use z t+j = E [z t+j j [M. Furthermore, as opposed to the certainty-equivalent MPC approach, we also use a stochastic version of the MPC algorithm which is more robust as it samples the external wind and price processes from its probability distribution rather than simply using its expectation. We generate multiple

7 7 such samples of length M for future realizations in a Monte Carlo approach. We choose the day ahead contract and the charging policy for the current time instant that minimizes the average costs across all such M-length samples and the appropriately discounted terminal function to approximate expected future cost. A policy minimizing this cost is selected. Further details explicitly characterizing the objective function to be minimized are presented in Appendix F. The MPC policy can be used in cases where the probability distribution for future realizations depends on past realizations of wind and prices as well. When V app is a quadratic (e.g. the V quad developed in Section IV-A), solving both certainty-equivalent and stochastic MPC is minimizing a quadratic function subject to linear constraints which can be done efficiently. As M gets larger, the terminal cost function V app has a reduced impact because of the discount factor. Hence, for sufficiently large M, finding the MPC solution can be approximated by a linear program. V. NUMERICAL RESULTS In this section, we a) illustrate the optimal policy in the asymptotic regimes of energy storage, and b) extensively evaluate the asymptotically optimal and the proposed heuristic policies for all regimes of energy storage capacity. A. Simulation setup We first perform simulations employing a synthetic model for the external wind and price processes which are fit from PJM interconnection data from the year [22. We consider the case of a WPP in hourly day-ahead and hourly real-time markets for the case where horizon D = 24. For comparison, we also evaluate approximate D = 4 models with prices and wind averaged over 6 hour blocks. We fit Gaussian models to the prices and uniform distributions to the wind. The realizations of the wind and prices are independent across time and each other in this simulations but are cyclo-stationary. Furthermore, real world wind and price traces will be used for simulations toward the end of the section. The code and data files for the simulation are available at [23. We evaluate the averaged discounted profit over 6 realizations for two policies, and also evaluate upper bounds: ) Small battery approximate policy: This policy sets contracts to the optimal batteryless contract s t, and charges or discharges its battery if it sees an excess or deficit based on the realized wind and the contract it has to meet. 2) Stochastic MPC: We use the heuristic policy described in Section IV with 40 Monte Carlo samples for price and wind values with lookahead of M = 40 for D = 4, and 40 samples with lookahead M = 48 for D = 24. The terminal value function approximation used is computed from the LQ method described in Section IV-A. 3) Clairvoyant bound: Assuming complete knowledge of the future wind and price processes, an upper bound on the discounted profit can be computed using a linear program. 4) Linear upper bound: The precise incremental value of storage was computed in the small battery regime as shown in Section III-B. Now as the value function is convex in B, the profit is concave, and the computed derivative at 0 provides an upper bound as follows V Linear B (s D+, b = 0) = V π 0,t,t(s D+, 0)+ V πb,t,t (s D+, 0) B=0 B B V π B,t,t(s D+, 0). 5) Upper bound: The upper bound we present in the figures is the minimum of the above two upper bounds. B. Special case - Constant prices and stationary wind distribution We start with the price process being constant and the wind distribution drawn from a stationary uniform distribution. The battery is assumed to be ideal. While this scenario is not realistic in practice, it allows us to compute the utility of the battery solely due to the variation in wind. In this scenario, we can evaluate the incremental value of energy storage in closed-form. For any finite horizon D, the optimal batteryless contract can be found by solving the following news-vendor problem, s = min p f + β D E [ p b (s w t+d ) + p s (w t+d s) + s ( p s = Fw f β D p s ) β D (p b p s, ) (3) where F w is the cdf of the wind distribution. For a small battery and an optimal batteryless contract, the following holds: V + B (B) = Pr(w < s ) ( p b B + β V + B (0)) (4) +Pr(w > s ) ( β V + B (B)) V + B (0) = Pr(w < s ) ( β V + B (0)) (5) +Pr(w > s ) ( p s B + β V + B (B)). The notations are explained as follows. We let V + B (st t D+, b) be the incremental cost of storage (nonpositive), i.e., the difference of a) the expected batteryless value function, from b) the expected value function at (s t t D+, b) from following the optimal policy under the small battery approximation. Notationally, if the first argument is omitted, the prior D contracted amounts are the optimal batteryless amounts s. In (4), the incremental value function with full battery is the sum of the cost when there is a deficit ( p b B + β V + B (0)) and when there is an excess (β V + B (B)), weighted by the probability of deficit or excess while following the small battery optimal policy. Equation (5) can be similarly interpreted. The above simplification of stagecost is valid to o(b) as described earlier. We can evaluate Pr(w > s ) from the solution to the news-vendor problem described in Equation (3). If we follow the batteryless optimal contract and do not charge or discharge the battery

8 Average discounted profit ($) Upper Bound Small Battery MPC Average discounted profit ($) Upper Bound Small Battery MPC Average discounted profit ($) Upper Bound Small Battery MPC (a) (b) (c) Fig. 3. Average discounted profit as a function of battery capacity for (a) constant price and stationary wind processes and D = 4, and (b) periodic and random price and wind processes with D = 4 modeled over year and (c) periodic and random price and wind processes with D = 24 and 2 months modeled. until the first time we have a non-zero contract to deliver (s t D > 0), the incremental value of battery evaluates to V + B (s D = 0, b = 0) = Pr(w > s ) V + (B) +Pr(w < s ) V + (0) = B (pf β D p s )(β D p b p f ) β D (p b p s )( β) (6) This is confirmed in Fig. 3(a) where the incremental value of the battery using the small battery policy closely matches the asymptote obtained above (6). The approximation degrades when B = 25MW h. Beyond a capacity of 0MW h, the stochastic MPC heuristic outperforms the small battery policy (albeit both are close to each other). To get a sense of the size of storage, the average wind energy generated at every time unit (over 6 hours) in this scenario is 200MW h. In scenarios where the price and wind statistics are periodic, the asymptotes are found numerically as is done in Fig. 3(b) and Fig. 3(c). C. Performance with general battery capacities Fig. 3(a), 3(b), and 3(c) show the performance of the two policies for a wide range of energy storage capacities, in the stationary wind and constant price scenario and periodic external process scenarios with D = 4 and D = 24. The MPC heuristic is seen to perform quite well as it a) matches the asymptote (and the optimal small battery policy) at low battery levels, and b) approaches the clairvoyant bound at high energy levels. The small battery policy, which only employs the battery to minimize real-time interactions without changing the forward contracts from the optimal batteryless ones, is clearly suboptimal for values of energy storage beyond 00MW h. In other words, when the battery capacity is higher than this, it is optimal to contract more than the optimal batteryless contract depending on the stored energy levels. In the extreme case with very high battery capacity, it is optimal to store energy and contract only when price is high. In Fig. 3(a), (finite) arbitrage across time cannot be done as prices are constant. Increasing discounted profits are seen in Fig. 3(b) and 3(c) as increased battery levels enable storing energy across periods until the forward market contract price is higher. Average discounted real-time exchange (MWh) Average discounted contracts (MWh) (a) Clairvoyant Small Battery MPC (b) Clairvoyant Small Battery MPC Fig. 4. Periodically varying wind and fixed price with D = 4 and simulated for year: (a) Average discounted magnitude of the real-time component of the stage cost (b) Average discounted contracts offered on the forward market. In Fig. 4(a), the magnitude of real-time market interactions are plotted for the two policies and in the clairvoyant bound. It can be seen that the interactions, which are a proxy for risk since real-time market interactions are costly, are reduced more significantly for the MPC. Fig. 4(b) shows the average discounted contracts (sum of contracts with a discounting factor) made. The small battery policy does not contract more as capacity increases by design, and this is a source of its suboptimality. The MPC policy has its discounted contracts

9 9 Wind Price (p f ) Contracts (s t) Charging(b t+ b t) MPC Small battery t Average discounted profit ($) Upper Bound η = Small Battery η = 0.9 MPC η = Fig. 5. Time trace for 2 time units with D = 4 with periodic wind and price statistics and capacity 30MW h. All plots have been normalized. converging to that of the optimal clairvoyant policy as B increases; its increased contracts and reduced real-time market interactions contribute to good performance. In this scenario, energy storage allows for increased contracts, and at the same time reduced real time market interaction as it helps mitigate the randomness of wind energy generation. In the general case with varying prices, arbitrage brings additional benefit. It is interesting to note that MPC and the low complexity small battery heuristic have similar performance when B < 20M W h, although the policies have quite different computational requirements. In Fig. 5, we can see the behavior of the MPC and small battery policies for a specific case with D = 4 and periodic wind and price statistics. The small battery policy contracts an amount based on the current forward contract price, statistics of the wind and price for each time instant; this can result in a higher contract for periods in which the contract price realization is low but the expected wind generation is high. The MPC policy actively takes into account arbitrage opportunities afforded by periodic price and wind variations and contracts more when the price is higher by charging and discharging the battery appropriately. This shows that the small battery policy brings gain through minimizing wind variations whereas the MPC policy can also arbitrage with prices. D. Impact of battery inefficiencies Fig. 6(a) highlights the impact of battery charging and discharging inefficiencies in the periodically varying wind and price scenario for D = 4. Performance of both policies and the clairvoyant upper bound flatten out at degraded profit levels. Efficiency of the co-located energy storage is crucial in making a profit and justifying the expense of the battery. Fig. 6(b) illustrates the impact of ramping factor R. A ramping factor R > 0.7 is not seen to degrade performance at higher battery levels, implying that large shifts in battery values are not common in the optimal policy for typical scenarios. The greatest impact of the ramping factor for the small battery policy is at lower battery levels while it affects MPC policy for all battery levels. Average discounted profit ($) (a) Upper Bound R = Small Battery R = 0.7 MPC R = (b) Fig. 6. Periodically varying wind and price statistics with D = 4 and simulated for 2 moths (a) Effect of battery inefficiency (b) Effect of the ramping constraint. E. Performance on Real Data Performance on real data with representative forecasting errors is shown in Fig. 7. In this simulation, PJM interconnection data for the years 2004 and 2005 were used to generate the wind and prices. The algorithms were run over 2 months which allowed for 2 realizations. Representative forecasting error distributions were assumed with the margin of error (% 0%) increasing as the prediction horizon increased. In solving the policy decisions A uniform model was assumed for wind and Gaussian models for the prices. The policies are then evaluated using the real world wind and price traces. As can be seen from the plot, the small battery policy performs quite well compared to the MPC policy for small storage capacities. The clairvoyant bound is also higher as real data can offer arbitrage conditions; this makes it profitable for a battery owner who does not generate wind power to operate on the grid. The performance of the small battery policy is not monotonically increasing as Assumption on prices does not hold for real data. There are wide fluctuations in the prices and the forward contract price now can be larger than the buying price or lower than the real time selling price at a later time instant. This implies that storing energy in the battery may be sometimes not as profitable as selling it on the real time market.

10 0 Average discounted profit ($) Upper Bound Small Battery MPC Fig. 7. Performance with D = 4 for 2 realizations of real data with representative errors over 2 months. factor β <, the series is absolutely summable. This implies that the expected stage cost is finite and the dominated convergence theorem can be used to interchange the expectation and the summation. We note that the incremental change in the value function due to the battery cannot be infinite due to the fact that the prices are bounded. This means that the value function is Lipschitz and absolutely continuous in B at B = 0. This implies differentiability of the value function almost everywhere with respect to the Lebesgue measure, in any interval that B may lie in, in particular, a neighborhood around 0. From the boundedness of prices it follows that V + πb,t,t is Lipschitz in b. Thus V + πb,t,t is differentiable in b (similar argument as in the last paragraph). VI. CONCLUSIONS We have studied the problem of a WPP participating in a dynamically evolving two settlement power market, with the help of a co-located energy storage. To maximize the expected profit of a WPP, the optimal forward contract and the storage operation (i.e. charging/discharging) decisions are formulated as an infinite horizon stochastic optimal control problem. We characterize analytically the optimal operations for the asymptotic regimes of small storage and show that the optimal operation for small storage involves only a reduction in the real time market interaction, without changing the forward contracts from the optimal ones in the absence of storage. For an intermediate storage capacity, we propose a stochastic model predictive control (MPC) policy based on a quadratic approximation of the value function. We numerically evaluate the simple asymptotically optimal policy and the MPC policy. We observe that, as expected, while the simple policy associated with the small battery approximation works well for small batteries, the more complex MPC works better for larger ones. The precise threshold on battery capacity where the different policies are best depends on the exact parameters of the system. ACKNOWLEDGMENT We thank Dr. Mahnoosh Alizadeh and the anonymous reviewers for helpful comments and discussions. APPENDIX A In this appendix, we state and prove Lemma 2 about some properties of V + π B,t. Lemma 2. The following hold: V + π B,t,t is differentiable in B., b) is differentiable in b. V + πb,t,t(st t D Proof. The finite first moment assumption on the wind together with the bounded price processes implies that the expected stage cost at each time is bounded. The batteryless optimal contract is bounded above by an absolute constant if the expected absolute first moment of the wind process is finite [6. Also, if the discount APPENDIX B In this appendix, we prove Theorem. We first show that an optimal storage operation policy is dependent only on the external processes (in particular, it is independent of the contract policy). For any fixed storage operation policy dependent only on external wind and price statistics we show that an optimal contract policy is the batteryless optimal policy. ) Optimal storage operation policy: From the differentiability of V + πb,t,t in b established in Appendix A, we have the following expression: V + πb,t,t(st t D, b) = V + πb,t,t(st t D, 0) + α t(s t t D )b + o(b). α t ( )b in the above expression is the incremental cost (or the negative of the incremental value) of having an initial battery level of b. Note that α t < 0 can be computed based on the history of all realizations till (and including) time t and the future statistics of all external wind and price processes. Note that, as shown in Appendix C, the o(b) term follows from the Lipschitz continuity of α t in its contract arguments. Hence even if each entry of s t t D is specified within Θ(B) of the batteryless optimal contract, the above relation still holds. We now characterize an optimal battery charging policy within P L (defined in Assumption 2). Note that by an optimal policy we mean any policy π B,t such that the corresponding V π + B,t,t is different from the V + πb,t,t by no more than o(b). We have the following inequality. V π B,t,t(x t ) h t (s t, b t+, x t ) = p f t (s t ) + p b t(s t D w t + η + (b t+ b t ) + η (b t b t+ ) + ) + p s t ( s t D + w t η + (b t+ b t ) + + η (b t b t+ ) + ) + + βv + π B,t,t(st t+ D, b t+ ). Note that h t (,, ) differs from V πb,t,t(x t ) in that the former corresponds to the value function for the optimal policy from all time instants t + onwards (with the only potentially suboptimal action being at time t), whereas in the latter, we follow the suboptimal policy π B,t for all time. Note also that x t+ is specified by the choices of the contract and battery pair

11 s t, b t+ and by the external random processes. The (s t, b t+ ) can be chosen either according to the policy πb,t (to get an equality) or can be chosen suboptimally, thereby increasing the one-stage cost. We now look into the optimality conditions that the best one-step b t+ needs to satisfy. We first note that b t + b t+ is bounded in absolute value by B. For a given s t D, b t, w t we then have the following: h(s t, b t+, x t ) b t+ bt+ b t+ = η + p b ti(s t D > w t ) and +η + p s t I(s t D < w t )+βα t (s t t+ D), if s t D w t > 2B, h(s t, b t+, x t ) bt+ b b t t+ = η p b ti(s t D > w t ) +η p s ti(s t D < w t )+βα t (s t t+ D), if s t D w t > 2B. The notation f x x a+ refers to the right derivative of f with respect to x at x = a and f x x a refers to the corresponding left derivative. Note that we are interested in the value of E [h, and specifying the storage operation policy when w t lies in [s t D 2B, s t D + 2B is sufficient for us to know E [h within o(b). The following lemma makes this precise: Lemma 3. Policies differing only in the specification for b t+ [0, B for the wind realization w t lying in a set of measure Θ(B) yield E [h(s t, b t+, x t ) differing by at most o(b). This follows from the fact that the difference in E [h(s t, b t+, x t+ ) for two policies in P L differing only in the behavior when s t D w t 2B is of the form Pr( w t s t D B) Θ(B) = o(b). An implication of the above lemma is that the choice of storage operation policy in s t D w t 2B does not matter and an optimal next stage storage operation policy is dependent only on external wind and price processes. An optimal next stage storage operation policy can be written as in Program (, reproduced ) in Program 2. α t (.) V + π B,t,t(,b) b in Program 2 is the expected cost of storing an additional unit of charge in the battery and is non-positive, since one can always make an expected profit out of storing an additional unit of charge in the battery. Note that α t (s t t+ D ) is Lipschitz in the contract policy, i.e., it is known to be within Θ(B) (discussed in Appendix C) among all policies in P L. Strictly speaking one should have Θ(B) instead of 0 on the right hand side in Program 2 or Program, as done in Program 3. This is because the set of w t for which h b is Θ(B) is also of measure Θ(B). By Lemma 3, all policies satisfying Program 3 yield E [h that are the same up to first order terms in B. Note that, if we start from a battery level b 0 = 0, the battery level is always either 0 or B. Note also that terms in s t t+ D are known to within Θ(B). This is true for time t < D because initial contracts are known, and for time t D by Assumption by which we have that the contract policy does not deviate from the batteryless optimal contract policy by more than Θ(B). Thus α t ( ) can be computed by knowing just the external statistics. In some special cases, this can even be obtained in closed form (Appendix E). We now characterize the optimal contract policy by fixing a (potentially time dependent) storage operation policy dependent only on the external statistics of wind and prices. 2) Optimal contract policy: We start with decomposing the expression V π B,t (x t ) as the summation of two terms: one being the expected real time cost due to the initial contract and the second being the expected revenue from contracts from the time t onwards. We observe that V π B,t can be written as V π B,t (x t ) = E + [ t+d v=t β v t g r (p b v, p s v, e v ) F t [ β v t E p f vs v v=t + β D E [g r (p b v+d, p s v+d, e v+d ) F v (7) We focus on finding the optimal contract policy s t. We observe that the inner expectation conditioned on F v can be computed to o(b) accuracy by using the storage operation policy which is just a function of the random variables defined in F v (this is proved in Appendix D). In addition, for any L, L 2 such that L, L 2 < L, we can consider the batteryless optimal policy at time t, s t and have that for s t = s t + L B, s t = s t + L 2 B, ( ) p f t s t +β D E [g r (p b t+d, p s t+d, ẽ t+d ) F t ( [ ) p f t s t +β D E g r (p b t+d, p s t+d, ē t+d ) F t = o(b). Thus, given a fixed storage operation policy dependent only on the external statistics of the wind and the price processes, we observe that the contract choice does not affect the expected cost by more than o(b) as B approaches 0. Intuitively, the increased profits from the contracts are canceled by the losses due to the increased expected real time market interaction if the contract is already at the one specified by the batteryless optimal policy. Hence an optimal contract policy in the asymptotically small battery regime is just the batteryless optimal contract policy and an optimal storage operation policy is to follow Program. Thus Theorem is proved. APPENDIX C In this appendix, we prove that α t is also locally Lipschitz around the contracts. We note that the expected increase in the real time interaction due to a change in the contract by Θ(B) is also Θ(B) due to the fact that E wt+ [p b t(w t+ s t+ D ) p s t(w t+ s t+ D ) + is Lipschitz in s t+ D. Since b < B, we have bθ(b) = o(b), and α t (s t t+ D + Θ(B))b = α t(s t t+ D )b + o(b)

12 2 Program 2 Optimize the next stage battery level If h(st,bt+,xt) b t+ bt+ b t+ = η + p b ti(s t D > w t ) + η + p s t I(s t D < w t ) + βα t (s t t+ D ) < 0, set b t+ = B. If h(st,bt+,xt) b t+ bt+ b t = η p b ti(s t D > w t ) + η p s t I(s t D < w t ) + βα t (s t t+ D ) > 0, set b t+ = 0. Else: set b t+ = b t. Program 3 Optimize the next stage battery level (similar to Program 2) If h(st,bt+,xt) b t+ bt+ b t+ = η + p b ti(s t D > w t ) + η + p s t I(s t D < w t ) + βα t (s t t+ D ) Θ(B), set b t+ = B. If h(st,bt+,xt) b t+ bt+ b t = η p b ti(s t D > w t ) + η p s t I(s t D < w t ) + βα t (s t t+ D ) Θ(B), set b t+ = 0. Else: set b t+ = b t. where s t t D + Θ(B) is taken to refer to any contract history which is within a constant times B of each contract in s t t D, i.e., s t t D + Θ(B) {zt t D : z i s i < Θ(B) for i {t D,..., t }}. APPENDIX D In this appendix we show that, given a storage operation policy, the expectation at a future time can be computed to within o(b) and hence can be used to compute the optimal contract policy. The batteryless optimal s is such that p f t s [ + β D E g r (p b t+d, p s t+d, s + w t+d η + (b t+d+ b t+d ) + + η ( b t+d+ + b t+d ) + F t = 0 is minimized at s. A first order necessary condition for this to hold is ( ) p f t s + β D E [g r (p b t+d, ps t+d, e t+d) F t = 0 s p f t + β D E [ p b t+d(s w t+d )I(w t+d < s ) F t + β D E [ p s t+d( s + w t+d )I(w t+d > s ) F t = 0. ( = λb p f t +β D E [ p b t+d(s w t+d )I(w t+d < s ) F t +β D E [ p s t+d(w t+d s )I(w t+d > s ) ) F t ( [ +B β D η+ ( b t+d +b t+d+ ) + η (b t+d b t+d+ ) E B F t )+o(b) ( [ = B β D η+ ( b t+d +b t+d+ ) + η (b t+d b t+d+ ) E B F t )+o(b). The first order (in B) terms in the last expression is independent of λ and depends only on the storage operation policy and the external statistics. APPENDIX E We compute α t in this appendix. We note that, by the results of the previous appendix, we can focus only on the real time market interaction at the next time instant, as every subsequent time will only have a o(b) effect on the value function. We also restrict ourselves to a contract s t D which is within Θ(B) of the batteryless optimal contract s. Taking partial derivatives with respect to b t, and using Assumption, we get that α t = β b (E[pb t(s +η + ( b t +b) + η (b t b) w t ) + p s t+(s +η + ( b t +b) + η (b t b) w t+ ) F t )+Θ(B) With a fixed storage operation policy dependent only on the external wind and price statistics, and here we choose s +λb WLOG (as will be shown by the end of this section), the change in the cost is p f t (λb) [ +β D E g r (p b t+d, p s t+d, w t+d s λb η + (b t+d+ b t+d ) + η (b t+d b t+d+ ) + ) F t If the wind and price statistics are such that conditioning on F t is the same as conditioning on F t D (one special case where this condition holds is if the external processes are independent across time) then this expression evaluates (within Θ(B)) to βp f t D, which by assumption (and assuming η = η + = ), is between p s t and pb t for all t, w.p.. This means that optimal battery charging policy in this case (under the assumptions above) is simply to charge if there is an excess and discharge if there is a deficit.

3 APPENDIX F In this appendix, the formulations of the M-step certainty equivalent Model Predictive Control (MPC) and the stochastic MPC model are presented.

13 3 APPENDIX F In this appendix, the formulations of the M-step certainty equivalent Model Predictive Control (MPC) and the stochastic MPC model are presented. The M step certainty-equivalent MPC algorithm utilizes an approximate value function V app (x t ) to obtain a policy as u t = argmin s t,b t+ min s t+m t+,bt+m+ t+2 M j=0 β j g(s t+j+ t+j D+, bt+j+, z t+j) +β M V app (s t+m t+m D+, b t+m, z t+m ). (8) As defined earlier, z t = (p f t, p s t, p b t, w t ) represents the external stochastic processes. At each stage t, we see z t = z t and use z t+j = E [z t+j j [M. In stochastic MPC, we use the distribution of the future realizations of wind and prices to generate multiple samples z (i), i [N, where N is the number of samples. We use z (i) t = z t and sample z (i) t+m iid t+ Pr(zt+ t+m ) to generate N samples indexed by i [N. u t = argmin s t,b t+ N N i= s t+m,i t,i M j=0 min,b t+m+,i t+,i β j g(s t+j+,i t+j D+,i, bt+j+,i t+j,i, z (i) t+j ) + β M V app (s t+m,i t+m D+,i, b t+m,i, E [z(i) t+m ) s.t. s t,i = s t, b t+,i = b t i (9) APPENDIX G In this section we argue that a different market model as discussed in Section III-C would yield the same contract and storage operation policies as the market model assumed in the earlier appendices does. We start with the contract policy described in Appendix B-2, and make the following observation from the discussions after Equation (7): under a different market model as in Section III-C, any deviations in the contract policy which are Θ(B) from the batteryless optimal policy would affect the cost (when the contract is realized) only by a term whose magnitude is less than first order in the size of the battery (i.e., the deviations in the cost are o(b)). The derivation of the storage operation policy follows a very similar derivation as in Appendix B-. REFERENCES [ U.S. Energy Information Administration, Annual energy outlook, Tech. Rep., 204. [2 J. Rogers and K. Porter, Utility Wind Integration Group, Wind power and electricity markets, Tech. Rep., 20. [3 P. Pinson, Wind energy: Forecasting challenges for its operational management, Statistical Science, vol. 28, no. 4, pp , 203. [4 Y. Zhao, J. Qin, R. Rajagopal, A. Goldsmith, and H. V. Poor, Wind aggregation via risky power markets, IEEE Trans. Power Syst., vol. 30, no. 3, pp , 205. [5 H. L. Ferreira, R. Garde, G. Fulli, W. Kling, and J. P. Lopes, Characterisation of electrical energy storage technologies, Energy, vol. 53, pp , 203. [6 E. Y. Bitar, R. Rajagopal, P. P. Khargonekar, K. Poolla, and P. Varaiya, Bringing wind energy to market, IEEE Trans. Power Syst., vol. 27, no. 3, pp , 202. [7 M. Zugno, T. Jónsson, and P. Pinson, Trading wind energy on the basis of probabilistic forecasts both of wind generation and of market quantities, Wind Energy, vol. 6, no. 6, pp , 203. [8 J. M. Morales, A. J. Conejo, and J. Pérez-Ruiz, Short-term trading for a wind power producer, IEEE Trans. Power Syst., vol. 25, no., pp , 200. [9 H.-I. Su and A. El Gamal, Modeling and analysis of the role of energy storage for renewable integration: Power balancing, IEEE Trans. Power Syst., vol. 28, no. 4, pp , 203. [0 P. Harsha and M. Dahleh, Optimal management and sizing of energy storage under dynamic pricing for the efficient integration of renewable energy, IEEE Transactions on Power Systems, vol. 30, no. 3, pp. 64 8, 205. [ J. Qin, Y. Chow, J. Yang, and R. Rajagopal, Online modified greedy algorithm for storage control under uncertainty, IEEE Transactions on Power Systems, vol. 3, no. 3, pp , May 206. [2 E. Bitar, R. Rajagopal, P. Khargonekar, and K. Poolla, The role of co-located storage for wind power producers in conventional electricity markets, in Proc. American Control Conference (ACC), 20, pp [3 J. H. Kim and W. B. Powell, Optimal energy commitments with storage and intermittent supply, Operations research, vol. 59, no. 6, pp , 20. [4 S. Kurandwad, C. Subramanian, A. Vasan, V. Sarangan, V. Chellaboina, A. Sivasubramaniam et al., Windy with a chance of profit: bid strategy and analysis for wind integration, in Proc. 5th International Conf. on Future Energy Syst. ACM, 204, pp [5 J. Taylor, D. S. Callaway, K. Poolla et al., Competitive energy storage in the presence of renewables, IEEE Trans. Power Syst., vol. 28, no. 2, pp , 203. [6 R. Jiang, J. Wang, and Y. Guan, Robust unit commitment with wind power and pumped storage hydro, IEEE Trans. Power Syst., vol. 27, no. 2, pp , 202. [7 D. Gayme and U. Topcu, Optimal power flow with large-scale storage integration, IEEE Trans. Power Syst., vol. 28, no. 2, pp , 203. [8 K. Divya and J. Østergaard, Battery energy storage technology for power systems an overview, Electric Power Syst. Research, vol. 79, no. 4, pp , [9 B. Øksendal, Stochastic differential equations. Springer, [20 M. Rao, M. Chowdhury, Y. Zhao, T. Javidi, and A. Goldsmith, Value of storage for wind power producers in forward power markets, in Proc. American Control Conference (ACC), 205, pp [2 Y. Wang, B. O Donoghue, and S. Boyd, Approximate dynamic programming via iterated bellman inequalities, International Journal of Robust and Nonlinear Control, 204. [22 National Renewable Energy Laboratory, Eastern wind integration and transmission study, Tech. Rep., 200. [23 Github code repository, MAT, accessed: MAINAK CHOWDHURY is pursuing a Ph.D. degree in the Wireless Systems Laboratory at Stanford University. He received a Bachelor of Technology in Electrical Engineering from the Indian Institute of Technology, Kanpur, India (20), and a Master of Science in Electrical Engineering from Stanford University (203). His research interests include communication systems with a large number of transmitters or receivers and applications involving stochastic control.

4 MILIND RAO (S 4) received the B. Tech degree in Electrical Engineering from IIT Madras, India in 203 where he graduated with the Siemens award for academic proficiency. He received the M.S. degree in Electrical Engineering from Stanford University in 205 and is currently pursuing the Ph.

His research interests are in stochastic control, optimization, signal processing, and wireless communications.

S. and Ph.D. degrees in electrical engineering from the University of California, Los Angeles (UCLA), Los Angeles in 2007 and 20, respectively.

Tara Javidi (S 96-M 02) studied electrical engineering at the Sharif University of Technology from 992 to 996.

14 4 MILIND RAO (S 4) received the B. Tech degree in Electrical Engineering from IIT Madras, India in 203 where he graduated with the Siemens award for academic proficiency. He received the M.S. degree in Electrical Engineering from Stanford University in 205 and is currently pursuing the Ph.D. degree in the Wireless Systems Laboratory at Stanford University. He is a recipient of the Stanford Graduate, DAAD and KVPY fellowships. His research interests are in stochastic control, optimization, signal processing, and wireless communications. YUE ZHAO (S 06, M ) is an assistant professor of electrical and computer engineering at Stony Brook University. He received the B.E. degree in electronic engineering from Tsinghua University, Beijing, China in 2006, and the M.S. and Ph.D. degrees in electrical engineering from the University of California, Los Angeles (UCLA), Los Angeles in 2007 and 20, respectively. His current research interests include renewable energy integration, smart grid, optimization and game theory, and statistical signal processing. Tara Javidi (S 96-M 02) studied electrical engineering at the Sharif University of Technology from 992 to 996. She received her MS degrees in Electrical Engineering: Systems, and Applied Mathematics: Stochastics, from the University of Michigan, Ann Arbor, MI. She received her PhD in EECS from University of Michigan, Ann Arbor in From 2002 to 2004, Tara was an assistant professor at the electrical engineering department, University of Washington, Seattle. She joined University of California, San Diego, in 2005, where she is currently an associate professor of electrical and computer engineering. She was a Barbour Scholar during academic year and received an NSF CAREER Award in Her research interests are in communication networks, stochastic resource allocation, and wireless communications ANDREA GOLDSMITH (S 90-M 93-SM 99- F 05) is the Stephen Harris professor in the School of Engineering and a professor of Electrical Engineering at Stanford University. She was previously on the faculty of Electrical Engineering at Caltech. Her research interests are in information theory and communication theory, and their application to wireless communications and related fields. She co-founded and served as Chief Scientist of Plume WiFi, and previously co-founded and served as CTO of Quantenna Communications, Inc. She has also held industry positions at Maxim Technologies, Memorylink Corporation, and AT&T Bell Laboratories. Dr. Goldsmith is a Fellow of the IEEE and of Stanford, and has received several awards for her work, including the IEEE ComSoc Edwin H. Armstrong Achievement Award as well as Technical Achievement Awards in Communications Theory and in Wireless Communications, the National Academy of Engineering Gilbreth Lecture Award, the IEEE ComSoc and Information Theory Society Joint Paper Award, the IEEE ComSoc Best Tutorial Paper Award, the Alfred P. Sloan Fellowship, the WICE Technical Achievement Award, and the Silicon Valley/San Jose Business Journal s Women of Influence Award. She is author of the book Wireless Communications and co-author of the books MIMO Wireless Communications and Principles of Cognitive Radio, all published by Cambridge University Press, as well as an inventor on 28 patents. She received the B.S., M.S. and Ph.D. degrees in Electrical Engineering from U.C. Berkeley. Dr. Goldsmith has served on the Steering Committee for the IEEE Transactions on Wireless Communications and as editor for the IEEE Transactions on Information Theory, the Journal on Foundations and Trends in Communications and Information Theory and in Networks, the IEEE Transactions on Communications, and the IEEE Wireless Communications Magazine. She participates actively in committees and conference organization for the IEEE Information Theory and Communications Societies and has served on the Board of Governors for both societies. She has also been a Distinguished Lecturer for both societies, served as President of the IEEE Information Theory Society in 2009, founded and chaired the student committee of the IEEE Information Theory society, and chaired the Emerging Technology Committee of the IEEE Communications Society. At Stanford she received the inaugural University Postdoc Mentoring Award, served as Chair of Stanford s Faculty Senate and for multiple terms as a Senator, and currently serves on its Budget Group, Committee on Research, and Task Force on Women and Leadership.

Optimal Control of Plug-In Hybrid Electric Vehicles with Market Impact and Risk Attitude

Optimal Control of Plug-In Hybrid Electric Vehicles with Market Impact and Risk Attitude Lai Wei and Yongpei Guan Department of Industrial and Systems Engineering University of Florida, Gainesville, FL