Optimal Control of Plug-In Hybrid Electric Vehicles with Market Impact and Risk Attitude


Lai Wei and Yongpei Guan
Department of Industrial and Systems Engineering, University of Florida, Gainesville, FL 32611
Emails: laiwei@ufl.edu; guan@ise.ufl.edu

Abstract

In this paper, we develop optimal electricity storage control policies to manage charging and discharging activities for plug-in hybrid electric vehicles for the benefit of an energy market participant. We first develop models for both risk-neutral and risk-averse aggregators that participate only in a real-time market. The proposed models capture the impact of the charging and discharging activities on real-time electricity prices. Next, we extend our study to the case in which aggregators participate in both the real-time and day-ahead markets. For each model, we analyze the properties of the optimal objective value function, prove the existence and uniqueness of the optimal policy, and explore the corresponding optimal policy structure. Moreover, through numerical studies, we explore how electricity prices are influenced by charging and discharging activities. In particular, we observe that aggregated charging/discharging activities with market-impact consideration could reduce the variance of real-time electricity prices more efficiently than individual activities. In addition, when market impact is considered, an aggregator tends to use less electricity storage. Finally, it is beneficial to let an aggregator control the electricity storage and participate in both the real-time and day-ahead markets, instead of participating only in the real-time market.

Keywords: risk attitude, electricity storage control, market impact, real-time market, day-ahead market

1 Introduction

Electricity, unlike other commodities, is traditionally considered non-storable (cf. [17]) unless it is transformed into another form of energy, such as the potential energy of water in pumped-storage hydroelectricity. Accordingly, electricity prices behave differently from prices in other commodity markets, because power balance must be maintained at all times in a power grid system. Meanwhile, the penetration of renewable energy into the power grid system has recently been increasing dramatically, and this trend will continue following the Obama Administration's goal of 20% wind power for U.S. electricity generation [24] and the newly announced International Renewable Energy Agency's goal of 30% renewable energy generation worldwide by 2030 [5]. However, due to its intermittent nature, the integration of renewable energy (such as wind and solar) increases the volatility of electricity prices in deregulated electricity markets. Large-scale electricity storage is one of the few available approaches to help flatten the price curve and maintain system stability. Although pumped-storage hydroelectricity can serve as electricity storage, it requires proximity to a river, so the share of hydroelectric power in many wholesale markets is very limited compared to nuclear and thermal generation. However, the recent promotion of plug-in hybrid electric vehicles (PHEVs) makes large-scale electricity storage possible in the near future. PHEVs are vehicles that include both a gasoline engine and a battery storage system of 4 kWh or more, in which the battery can be recharged from an external source and can support at least 10 miles of driving in all-electric mode, as described in [10] and [12].
If the capacity of the battery is large enough, the vehicle operates as an all-electric vehicle (e.g., Nissan Leaf); otherwise, the vehicle first operates in all-electric mode and switches to a charge-sustaining mode once the gasoline engine is triggered (e.g., Toyota Prius) [4]. PHEVs used to be less competitive than gas-fueled vehicles because of their high costs. However, PHEVs have become more competitive for the following reasons: 1) the recent drop in the cost of batteries [8], 2) significant increases in fossil fuel prices during the last decade, 3) the flexibility and extended driving range offered to customers (cf. [13] and [22]), and 4) government promotion of PHEVs as a means of reducing traffic emissions and petroleum dependence. (The Obama Administration in the United States, for example, has proposed to put a total of one million PHEVs on the road by 2015 [26].) Accordingly, there is potential in the near future for major vehicle manufacturers such as Chevrolet to mass-produce PHEVs, which will further reduce the production cost of each PHEV.

As compared to gas-fueled vehicles, PHEVs have the additional advantage of serving as electricity storage devices apart from daily transportation activities [14, 15]. Since most vehicles are parked for 23 hours a day, on average, their batteries could be used to store and release electricity, and the profit is estimated to be up to $4,000 a year [9]. This electricity storage function of PHEVs benefits the electricity system in the following respects:

(1) It reduces the thermal power generation cost by charging during off-peak hours and discharging during peak hours. In this way, the net load during peak hours is reduced (if the amount of electricity discharged is considered to be a negative load), which lowers the cost of the fast-start generators used to serve peak-hour loads. Meanwhile, the net load during off-peak hours is increased, which helps to avoid shutting down slow-start thermal generators and saves large shut-down costs.

(2) Because of their fast response times, on the order of minutes or even seconds, PHEVs are capable of providing frequency regulation, spinning reserve, voltage support, and so on, as ancillary services for most independent system operators (ISOs) (e.g., MISO and PJM). This reduces the need for high-cost, fast-start generators. It also reduces the need to keep thermal generator output below its most efficient level in order to maintain a certain reserve amount.

(3) It helps the system to take advantage of intermittent renewable energy. Due to its intermittent nature, renewable energy (such as wind and solar) is hard to forecast accurately and increases the need for reserve markets.
PHEVs, serving as electricity storage units with short reaction times, offer a natural complement to such intermittent energy.

(4) Aside from reducing traffic emissions and serving as electricity storage, PHEVs can help reduce emissions generated by thermal plants.

Both power system operators and energy market participants thus consider PHEVs important potential elements in their future daily operations: power system operators will consider optimal central control of PHEVs to minimize total system operation costs while maintaining system reliability; energy market participants will aim to maximize their total profits by charging and discharging PHEVs.

In this paper, we seek to optimally control the PHEV charging/discharging activities for market participants with the objective of maximizing total discounted expected profit. We first consider participation in the real-time market only, taking advantage of electricity price differences across time periods. We study the optimal control problem for an aggregator (a controller owning a group of PHEV units) whose action (charging or discharging) can affect the electricity price. We also take into account the risk attitude of the aggregator and analyze its influence on the optimal decision. We then extend our study to the case where the aggregator participates in both the real-time and day-ahead markets. For each model, we prove the concavity of the aggregator's profit function, explore the properties of optimal policies, and analyze the optimal policy structure. Together with numerical studies, we show how an aggregator's charging/discharging activities, with the consideration of market impact in decision making, influence electricity prices as compared to those of individual controllers who do not consider market impact. We also explore how an aggregator's risk-averse attitude affects the optimal trading quantity and the overall usage of PHEV electricity storage, and how much extra profit can be generated by participating in both the real-time and day-ahead markets.

The remainder of this paper is organized as follows. In Section 2, we describe recent related research developments on PHEVs. In Section 3, we study optimal storage control policies for PHEVs participating in the real-time market only. We also compare the behavior and performance of risk-averse and risk-neutral players.
In Section 4, we extend our study to consider both the day-ahead and real-time markets for an aggregator. In Section 5, we report experimental results and, more importantly, explore further insights. Finally, in Section 6, we conclude our research and discuss future research directions.

2 Literature Review

Significant research progress has been made recently on potentially integrating PHEVs with electric power systems and markets. For each electricity market, the benefits of PHEVs to power system operators and market participants are different, and, accordingly, these lead to different research focuses.

System operators aim to 1) perform optimal central control of PHEVs for the benefit of the whole power grid system, 2) perform sensitivity analysis of PHEV activities, and 3) decide on and select charging stations. For the central control approach, a large-scale optimization model is usually developed for the system operator with the objective of maximizing total social welfare, in which the physical constraints of thermal generator and transmission line capacities and the power balance constraints are considered, and PHEV charging and discharging activities are modeled as load and generation, respectively [33]. In this approach, system operators control the charging/discharging activities of PHEVs. For instance, in [33], a new unit commitment model is developed for system operators to simulate the interactions among PHEVs, wind power, and demand response. Four PHEV charging scenarios are simulated for the Illinois power system. Within this central control framework, system operators also consider the impacts of PHEV charging activities on the power grid system, in particular in terms of the electricity price (i.e., the locational marginal price (LMP)). For instance, in [34], a case study based on PJM Interconnection (see, e.g., [1]) shows that a small load increase caused by PHEV charging activities can increase the LMP significantly. Furthermore, it is shown empirically in [35] that charging PHEVs during off-peak hours will flatten the daily load curve and reduce LMP volatility, whereas charging PHEVs during peak hours will increase LMP volatility. Most recently, in [30], the impacts of different electricity tariffs on PHEV drivers' charging decisions have been empirically studied. The cost and emission impacts of these charging patterns are compared to the ideal case in which charging is controlled by the system operator.
To facilitate PHEV charging/discharging activities, government agencies and companies are starting to deploy public charging stations (cf. [25]). For instance, NRG Energy has committed to investing $100 million in building a PHEV charging network in California [11], and similarly a plan to build 570 charging stations has been proposed in British Columbia, Canada [23]. Accordingly, an agent-based decision support system for PHEV charging infrastructure deployment is proposed in [32], optimal deployment of public charging stations for PHEVs has been studied in [10], and infrastructure planning for PHEVs with battery swapping is proposed in [21]. One significant advantage of constructing charging stations, particularly battery swapping stations, is that it makes aggregation of PHEV charging/discharging activities possible. That is, a large number of individual PHEVs can be clustered for simultaneous charging/discharging.

Market participants and electricity users aim to maximize total profit (or minimize total cost) by managing PHEV charging/discharging activities for daily usage and by participating in the ancillary service, real-time, and/or day-ahead markets. In [28], owners of PHEVs are assumed to be price takers. In the proposed approach, forecasted future electricity prices, which are deterministic, are used to calculate the profit. Two dynamic programming algorithms are proposed to find the economically optimal solution for the vehicle owner, with and without participation in ancillary service markets. In [2], vehicle-to-grid service is coordinated with thermal and wind plants for market participants to achieve optimal energy trading in the day-ahead market. In the proposed approach, the wind power outputs, the spot market energy prices, and the imbalance prices are assumed uncertain. A two-stage stochastic programming model is derived and solved by the commercial optimization solver CPLEX. In [19], the authors consider daily energy cost minimization problems for vehicle owners under time-of-use electricity pricing. The price for each time period is assumed deterministic and the energy demand in each time period is random. The PHEV mobility is modeled as a Markov chain whose transition probabilities describe the movements of the PHEVs among states (home, work, and commute). A stochastic dynamic programming algorithm is developed to solve the problem.

Note that, from the modeling perspective, the optimal scheduling of charging/discharging activities for PHEVs is similar to that studied for general electricity storage devices (e.g., pumped-storage hydro). For instance, an optimal bidding strategy is studied for electricity storage in a competitive electricity market in [20].
In the proposed approach, the electricity prices are assumed deterministic and a nonlinear optimization model is proposed to formulate the problem. The model is first solved for the unconstrained case by relaxing the capacity constraints, and then the optimization time intervals are adjusted when the upper reservoir capacity limits are enforced. A study on maximizing the profit of an owner of wind power and pumped-storage hydro is performed in [7]. In the proposed approach, the wind power is assumed random and sampling approaches are used to generate scenarios. For each scenario, a linear programming model with a discretized time horizon is developed to solve the problem. In [3], a combined bidding strategy is developed for joint wind and pumped storage plants in a pool-based electricity market. In the proposed approach, the wind power is assumed uncertain and a stochastic mixed-integer linear programming model is proposed to formulate the problem. In [6], a model for electricity storage is developed that treats the storage problem as an optimal switching problem in a continuous-time setting, where the prices are assumed to follow a continuous-time Markov stochastic process and a diffusion model is proposed to maximize the expected profit. The decision variables include the operating regimes of the storage facility (injection, storage, and withdrawal) and the switching times between refilling the reservoir. In [16], optimal energy commitments with storage and intermittent supply have been studied. In that work, an optimal commitment policy is derived under the assumption that wind energy is uniformly distributed; furthermore, the stationary distribution of the electricity storage level corresponding to the optimal policy is obtained.

In this research, we solve the problem from a market participant's perspective. Instead of considering optimal controls for individual owners participating in a single electricity market without theoretically analyzing the market impact, we consider decisions for an aggregator with the consideration of market impact. Recent works in [29] and [18] are related to our research, in which charging/discharging activities for a group of PHEVs are studied. In [29], optimal scheduling of a group of PHEVs, instead of an individual vehicle, is considered, with the purpose of reducing dependence on small expensive units in existing power systems. The proposed approach is for a regulated market and the objective is to minimize the total generation cost while ensuring that the loads are satisfied. In [18], the authors assume the prices are deterministic for each time period and the aggregator for a group of PHEVs is a price-taker.
A linear programming model, or a quadratic programming model when market impacts are considered, is studied with the objective of minimizing total operation costs.

Our research contributes to the literature by solving the optimal PHEV control problem, under the assumption that electricity prices are uncertain, for both risk-neutral and risk-averse aggregators who consider market impact and participate in both the real-time and day-ahead markets. Moreover, we analyze optimal policy structures and explore insights into the problem.

3 Real-Time Market

In this section, PHEVs are assumed to participate only in the real-time market. Under the current deregulated market framework, the real-time LMP is updated every 15 minutes in most markets by running the security-constrained economic dispatch problem. Accordingly, we discretize the operational time interval into $\tilde{N}$ time periods (i.e., time periods 1 to $\tilde{N}$), each 15 minutes long. In addition, to participate in the real-time market, market participants have to submit their offers $m$ periods (6 to 8, depending on the market) ahead. In other words, at time period $n - m$, we know the historical exogenous real-time electricity prices $\mathbf{p}_{n-m} = (p_1, \ldots, p_{n-m})$, where $p_i$ represents the exogenous price for time period $i$, and we make the charging/discharging offer for time period $n$. We submit this offer (i.e., how much to charge/discharge) as a price-taker (also called self-scheduling at MISO and CAISO). That is, we offer only electricity amounts to buy or sell, and the offer is guaranteed to be accepted. This means there is no bidding price in the offer (or the bidding price is very low for selling electricity and very high for buying electricity). Following this market rule and our offering strategy, at time period $n - m$, the starting electricity storage level at time period $n$, denoted by $x_n$, can be computed from the charging/discharging schedules in time periods $n - m$ to $n - 1$. In addition, there are no charging/discharging activities in the first $m$ time periods (i.e., time periods 1 to $m$). For notational brevity, in the remainder of this paper we use $[1, N]$ as the operational time interval, corresponding to $[m + 1, \tilde{N}]$ truncated from $[1, \tilde{N}]$, where $N = \tilde{N} - m$.

In the charging/discharging process, two aspects must be considered. First, there is conversion loss from storage. The conversion of power into stored electricity and back to the power grid is imperfect.
The amount of electricity returned to the grid during the discharging process is only a portion of the amount originally spent on charging the battery. Let $\alpha_u$ represent the efficiency of converting electric power into stored energy during the charging process, and $\alpha_d$ represent the efficiency of converting the stored energy back into electric power. The overall energy conversion coefficient, i.e., $\alpha_u \alpha_d$, is around 0.6-0.8 based on the description in [31]. Second, the charging/discharging rate is bounded. We let $C_u$ be the upper limit for the charging rate and $C_d$ be the upper limit for the discharging rate in a single time period. Finally, we denote the total battery capacity owned by the aggregator by $C + C_l$, with $C_l$ representing the minimum storage level required to protect the battery, and the financial discount rate by $\beta$. For notational brevity, we use the interval $[0, C]$, instead of $[C_l, C_l + C]$, to represent the allowable charging/discharging range of the battery.

In our approach, the exogenous electricity price could be Markovian or of a more general type. Our decision is how much to charge/discharge in each period $n$, which is equivalent to the decision on the ending storage level in period $n$, denoted by $y_n$, $n \in [1, N]$, so as to maximize the total discounted expected profit.

3.1 Risk-Neutral Aggregator

We first study the risk-neutral aggregator case. For this case, we explore a finite-horizon model to maximize the aggregator's total discounted expected profit and solve the problem using stochastic dynamic programming. Let $V_n(x_n, \mathbf{p}_{n-m})$ be the maximum expected total profit from period $n$ to the end of the horizon (i.e., period $N$). We first initialize the value function $V_{N+1}(x_{N+1}, \mathbf{p}_{N+1-m})$ to be the salvage value of the electricity at the end of the operational time horizon, which is assumed to be a concave and increasing function of $x_{N+1}$ (e.g., $V_{N+1}(x_{N+1}, \mathbf{p}_{N+1-m}) = \alpha \ln x_{N+1}$ for a sufficiently small $\alpha > 0$). For each time period $n$, the charging amount is equal to the ending storage level minus the initial one for the period (i.e., $y_n - x_n$ if $y_n - x_n \geq 0$). Similarly, the discharging amount is equal to $x_n - y_n$ if $y_n - x_n < 0$. Therefore, in general, the charging amount is equal to $[y_n - x_n]^+$ and the discharging amount is equal to $[y_n - x_n]^-$, where $z^- = \max\{-z, 0\}$ and $z^+ = \max\{z, 0\}$. In addition, to consider the market impact, we assume the exogenous real-time electricity price is non-negative and the influence of charging/discharging activities on the real-time electricity price is $b_n$ dollars per unit at period $n$.
For instance, the real-time electricity price is assumed to increase (decrease) by $b_n$ dollars if one additional unit of electricity is charged (discharged). This value can be considered as the slope of the supply curve at the clearing price for the real-time market. In addition, we assume that the controller's action does not affect the evolution of the exogenous price. Based on this assumption, the myopic profit function for the current period can be described as follows:

$$L_n(y_n - x_n, \mathbf{p}_{n-m}) = \Big\{ E_{p_n \mid \mathbf{p}_{n-m}}[p_n] - b_n \alpha_d [y_n - x_n]^- + b_n [y_n - x_n]^+ / \alpha_u \Big\} \Big( \alpha_d [y_n - x_n]^- - [y_n - x_n]^+ / \alpha_u \Big). \tag{1}$$

Accordingly, for a given beginning storage level $x_n$, we let $y_n^{myop}(x_n, \mathbf{p}_{n-m})$ represent the maximizer of the above function. Then the backward Bellman equation for the dynamics of the system can be described as follows:

$$V_n(x_n, \mathbf{p}_{n-m}) = \max_{y_n \in A(x_n)} \Big\{ L_n(y_n - x_n, \mathbf{p}_{n-m}) + \beta E_{\mathbf{p}_{n-m+1} \mid \mathbf{p}_{n-m}} \big[ V_{n+1}(y_n, \mathbf{p}_{n-m+1}) \big] \Big\}, \tag{2}$$

where the action space

$$A(x_n) = \big\{ y_n : (x_n - C_d)^+ \leq y_n \leq \min\{x_n + C_u, C\} \big\} \tag{3}$$

describes the feasible set of the target storage level when the initial storage level is $x_n$. Note that the decision on $y_n$ is made at time period $n - m$. Finally, the objective is to maximize the total expected discounted profit for the given operational time interval (i.e., the value $V_1(x_1, \mathbf{p}_{1-m})$). Before describing the optimal control policy for the problem, we first explore the properties of the value functions as follows:

Proposition 1. For a risk-neutral aggregator with market-impact consideration, we can observe the following properties:

(a) The myopic optimal solution is $y_n^{myop}(x_n, \mathbf{p}_{n-m}) = \max\{x_n - C_d,\ x_n - E_{p_n \mid \mathbf{p}_{n-m}}[p_n] / (2 b_n \alpha_d),\ 0\}$, and the myopic profit function $L_n(z, \mathbf{p}_{n-m})$ is strictly concave in $z$.

(b) The joint value function $J_n(x_n, y_n, \mathbf{p}_{n-m}) := L_n(y_n - x_n, \mathbf{p}_{n-m}) + \beta E_{\mathbf{p}_{n-m+1} \mid \mathbf{p}_{n-m}}[V_{n+1}(y_n, \mathbf{p}_{n-m+1})]$ is jointly concave in $(x_n, y_n)$. For any given $x_n$, $J_n(x_n, y_n, \mathbf{p}_{n-m})$ is strictly concave in $y_n$ and, accordingly, there exists a unique optimal ending electricity storage level $y_n^*(x_n, \mathbf{p}_{n-m})$ for any given $x_n$.

(c) The value function $V_n(x_n, \mathbf{p}_{n-m})$ is increasing and concave in $x_n$.

Proof: The proof is given in Appendix A.1.

Based on the above proposition, we can derive the optimal policy as follows:

Theorem 1. For any price evolution model, there exists a unique optimal target storage level $x_n^* = y_{n-1}^*(y_{n-2}^*(\cdots y_1^*(x_1, \mathbf{p}_{1-m}), \ldots, \mathbf{p}_{n-2-m}), \mathbf{p}_{n-1-m})$, corresponding to any given initial storage level $x_1$ and any available price information $\mathbf{p}_{n-1-m}$ for $n = 2, \ldots, N + 1$.

Proof: The proof is given in Appendix A.2.

Moreover, based on the concavity and monotonicity analysis described in Proposition 1, we can further obtain the following structural results.

Proposition 2. For a risk-neutral aggregator with market-impact consideration, we can observe that

(a) The optimal ending electricity storage level $y_n^*(x_n, \mathbf{p}_{n-m})$ is a non-decreasing function of $x_n$, i.e., $\partial y_n^* / \partial x_n \geq 0$. In addition, $y_n^*(x_n, \mathbf{p}_{n-m}) \geq y_n^{myop}(x_n, \mathbf{p}_{n-m})$.

(b) The optimal charging amount $y_n^*(x_n, \mathbf{p}_{n-m}) - x_n$ is a non-increasing function of $x_n$.

(c) If electricity prices are independent among periods, then $y_n^*(x_n, \mathbf{p}_{n-m})$ is a non-increasing function of $E_{p_n \mid \mathbf{p}_{n-m}}[p_n]$, which is equal to $E[p_n]$.

Proof: The proof is given in Appendix A.3.

Proposition 2 provides a clear optimal policy structure for the problem. Proposition 2 (a) tells us that the higher the electricity storage level at the beginning of a period, the higher the electricity storage level we should keep at the end of the period. In addition, for each time period $n$, the optimal ending electricity storage level is always no smaller than the one obtained by the myopic approach. Proposition 2 (b) indicates that the higher the initial electricity storage level, the more we should discharge or the less we should charge. It also shows that if it is optimal to discharge at a certain initial electricity storage level, then it is optimal to discharge at any higher initial electricity storage level. Similarly, if it is optimal to charge at a certain initial electricity storage level, it is optimal to charge at any lower initial electricity storage level.
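The recursion (2) over the action set (3) can be illustrated numerically. The following is a minimal sketch rather than the authors' implementation: it assumes prices that are independent across periods (so that only $E[p_n]$ enters the profit (1) and the value function depends on $x_n$ alone), restricts storage levels to a uniform grid, and uses an illustrative salvage function $2 \ln(1 + x)$. The function names and all parameter values are hypothetical.

```python
import numpy as np


def myopic_profit(z, expected_price, b, alpha_u, alpha_d):
    """Expected one-period profit (1) for a storage change z = y_n - x_n."""
    charge = np.maximum(z, 0.0)      # [z]^+ : energy added to storage
    discharge = np.maximum(-z, 0.0)  # [z]^- : energy drawn from storage
    # Charging pushes the price up, discharging pushes it down (slope b).
    price = expected_price - b * alpha_d * discharge + b * charge / alpha_u
    # Energy sold to the grid minus energy bought from the grid.
    return price * (alpha_d * discharge - charge / alpha_u)


def solve_storage_dp(expected_prices, grid, b, alpha_u, alpha_d,
                     c_u, c_d, beta, salvage):
    """Backward induction for (2) on a storage grid (independent prices assumed)."""
    capacity = grid[-1]
    value = salvage(grid)  # terminal value V_{N+1}
    policy = np.empty((len(expected_prices), grid.size))
    for n in range(len(expected_prices) - 1, -1, -1):
        new_value = np.empty_like(value)
        for i, x in enumerate(grid):
            # Action set (3): (x - C_d)^+ <= y <= min(x + C_u, C).
            lower = max(x - c_d, 0.0)
            upper = min(x + c_u, capacity)
            feasible = (grid >= lower - 1e-9) & (grid <= upper + 1e-9)
            candidates = grid[feasible]
            total = (myopic_profit(candidates - x, expected_prices[n],
                                   b, alpha_u, alpha_d)
                     + beta * value[feasible])
            best = int(np.argmax(total))
            new_value[i] = total[best]
            policy[n, i] = candidates[best]
        value = new_value
    return value, policy


# Hypothetical data: six periods alternating between cheap and expensive.
grid = np.linspace(0.0, 100.0, 101)
expected_prices = np.array([20.0, 60.0, 20.0, 60.0, 20.0, 60.0])
params = dict(b=0.5, alpha_u=0.9, alpha_d=0.85, c_u=20.0, c_d=40.0, beta=0.99)
value, policy = solve_storage_dp(expected_prices, grid,
                                 salvage=lambda x: 2.0 * np.log1p(x), **params)

for n in (0, 1):
    print(f"period {n + 1}: y*(x=0)={policy[n, 0]:.0f}, "
          f"y*(x=50)={policy[n, 50]:.0f}, y*(x=100)={policy[n, 100]:.0f}")
```

Each row of `policy` holds the computed ending level for every starting level on the grid, so the monotonicity stated in Proposition 2 (a) can be inspected by checking that each row is non-decreasing. The grid restriction means the result only approximates the continuous problem.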
So for each expected price level, there exist two thresholds $s(\vec{p}_{n-m})$ and $S(\vec{p}_{n-m})$. If the initial electricity storage level is lower than $s(\vec{p}_{n-m})$, then it is optimal to charge; if the initial electricity storage level is higher than $S(\vec{p}_{n-m})$, then it is optimal to discharge; if the initial electricity storage level is between $s(\vec{p}_{n-m})$ and $S(\vec{p}_{n-m})$, then it is optimal to do nothing. The optimal policy structure for this general price evolution is shown on the left in Figure 1. Proposition 2 (c) indicates that if prices are independent among periods, then the higher the expected current price, the more electricity we should discharge or the less we should charge. Accordingly, we have the monotone property and optimal policy structure for this price-independent case, which is shown on the right in Figure 1.

[Figure 1: Optimal Policy Structure (General Price vs. Independent Price). Both panels plot expected price against initial inventory, with regions labeled Charge, Do nothing, and Discharge.]

3.2 Risk-Averse Aggregator

In the previous subsection, we assume the electricity storage aggregator is risk-neutral. However, in a realistic situation, people are typically risk-averse (i.e., they care more about loss than benefit). In this subsection, we consider the case where the aggregator has a risk-averse attitude and use an exponential utility function to capture it. In addition to the notation described in the previous subsection, we introduce $\lambda$ to represent the coefficient of risk aversion. Under this setting, we let $V_n^\lambda(x_n, \vec{p}_{n-m})$ be the maximum utility from period $n$ to the end of the operational time interval, and $V_{N+1}^\lambda(x_{N+1}, \vec{p}_{N+1-m})$ be the increasing and concave salvage value of the electricity at the last period, as described in the previous subsection. For each time period $n$, we let $L_n^0(y_n - x_n, p_n)$ denote the myopic profit function when the price for the current period is $p_n$ and $y_n^{myop}(x_n, \vec{p}_{n-m}, \lambda)$ represent its maximizer corresponding to a given $x_n$, where

$$L_n^0(z, p_n) = [p_n - b \alpha_d z^- + b z^+/\alpha_u](\alpha_d z^- - z^+/\alpha_u). \tag{4}$$

Now we can define the myopic utility function $L_n^\lambda(z, \vec{p}_{n-m}) = E_{p_n \mid \vec{p}_{n-m}}[1 - e^{-\lambda L_n^0(z, p_n)}]$. Accordingly, the dynamics of the system can be described as follows:

$$V_n^\lambda(x_n, \vec{p}_{n-m}) = \max_{y_n \in A(x_n)} \Big\{ L_n^\lambda(y_n - x_n, \vec{p}_{n-m}) + \beta E_{p_{n-m+1} \mid \vec{p}_{n-m}}\big[V_{n+1}^\lambda(y_n, \vec{p}_{n-m+1})\big] \Big\}, \tag{5}$$

where

$$A(x_n) = \{ y_n : (x_n - C_d)^+ \le y_n \le \min\{x_n + C_u, C\} \} \tag{6}$$

is the feasible set when the initial electricity storage is $x_n$. The final objective of the problem is to maximize the total discounted expected utility of operating the system, $V_1^\lambda(x_1, \vec{p}_{1-m})$. Similar to the study in the previous subsection, before describing the optimal control policy for the problem, we first explore the properties of the value functions as follows:

Proposition 3 For a risk-averse aggregator with market-impact consideration, we can observe that

(a) The myopic utility function $L_n^\lambda(z, \vec{p}_{n-m})$ is strictly concave in $z$.

(b) The joint value function $J_n^\lambda(x_n, y_n, \vec{p}_{n-m}) = L_n^\lambda(y_n - x_n, \vec{p}_{n-m}) + \beta E_{p_{n-m+1} \mid \vec{p}_{n-m}}[V_{n+1}^\lambda(y_n, \vec{p}_{n-m+1})]$ is jointly concave in $(x_n, y_n)$. For any given $x_n$, $J_n^\lambda(x_n, y_n, \vec{p}_{n-m})$ is strictly concave in $y_n$, and, accordingly, there exists a unique optimal ending electricity storage level $y_n^*(x_n, \vec{p}_{n-m})$ for any given $x_n$.

(c) The value function $V_n^\lambda(x_n, \vec{p}_{n-m})$ is increasing and concave in $x_n$.

Proof: The proof is given in Appendix A.4.

Based on the above proposition, we can show the uniqueness of the optimal control policy for the risk-averse aggregator case.

Theorem 2 For a risk-averse aggregator with market-impact consideration, there exists a unique optimal target storage level $x_n^* = y_{n-1}^*(y_{n-2}^*(\ldots y_1^*(x_1, \vec{p}_{1-m}), \ldots, \vec{p}_{n-2-m}), \vec{p}_{n-1-m})$, corresponding to any given initial storage level $x_1$ and any available price information $\vec{p}_{n-m}$ for $n = 2, \ldots, N + 1$.

Proof: The proof is given in Appendix A.5.

Based on Proposition 3 and Theorem 2, we can further explore the risk-averse control policy as follows:

Proposition 4 For a risk-averse aggregator with market-impact consideration, we can observe:

(a) The discharging amount $x_n - y_n^{myop}(x_n, \vec{p}_{n-m}, \lambda)$ is non-increasing in $\lambda$.

(b) The optimal ending electricity storage level $y_n^*(x_n, \vec{p}_{n-m})$ is a non-decreasing function of $x_n$, i.e., $\partial y_n^*/\partial x_n \ge 0$.

(c) The optimal charging amount $y_n^*(x_n, \vec{p}_{n-m}) - x_n$ is a non-increasing function of $x_n$.

Proof: The proof is given in Appendix A.6.

From Proposition 4, we can observe that, similar to the risk-neutral case, we also have a threshold policy corresponding to each expected price level. That is, based on Proposition 4 (b) and (c), there exist two thresholds $s(\vec{p}_{n-m})$ and $S(\vec{p}_{n-m})$ for which it is optimal to discharge when the initial electricity storage level is higher than $S(\vec{p}_{n-m})$, to charge when the initial electricity storage level is lower than $s(\vec{p}_{n-m})$, and to do nothing when the initial electricity storage level is in between. Moreover, based on Proposition 4 (a), we observe that, under the myopic policy, the aggregator discharges less as it becomes more conservative (i.e., as $\lambda$ increases). Accordingly, for the multi-period case, the aggregator tends to charge/discharge less as $\lambda$ increases. Later on, we explore the insights and discuss the impact of risk-averse attitude on the optimal policy through numerical studies in Section 5.1.3.

4 Both Day-Ahead and Real-Time Markets

In the previous section, we only consider the charging/discharging activities for the real-time market. With the number of PHEVs increasing in the coming years, it is rational for aggregators to participate in both the real-time and day-ahead markets, which can help bring extra profit. However, making optimal decisions is much more challenging for aggregators under the two-market setting.
Day-ahead and real-time market frameworks for energy trading work in different ways. As described in the previous section, to participate in the real-time market, the offer including charging/discharging amounts needs to be submitted $m$ time periods ahead. However, for the day-ahead market, corresponding to each operating day (e.g., day $n$) for most U.S. energy markets, ISOs require market participants to submit their offers at noon the day before (e.g., day $n - 1$). Sometime in the afternoon (depending on the market), the ISOs finish their day-ahead unit commitment runs and announce day-ahead LMPs for each operating hour on day $n$. Without loss of generality, we assume the day-ahead LMPs become available one time period after the offers are submitted. Therefore, under current practice, in order to participate in both markets, an aggregator has to submit the day-ahead charging/discharging offers for each operating hour on an operating day (e.g., day $n$) at noon the day before (e.g., day $n - 1$) and continue submitting real-time charging/discharging offers $m$ periods before each operating time period. The processes to submit day-ahead and real-time offers are shown in Figure 2.

[Figure 2: Process to Submit Day-Ahead and Real-Time Offers. The timeline marks noon as the point at which both a day-ahead offer and a real-time offer are submitted, and the periods in which only a real-time offer is submitted.]

4.1 Notation and Assumption

Instead of using only one index $n$ to represent the time periods, we use $(n, k)$ as the state space index in this section for the convenience of later development of dynamic programming equations, where $n$ represents the $n$th date and $k$ indicates the $k$th period on the $n$th date. Accordingly, we use $p_{n,k}$ and $f_{n,k}$ to denote the exogenous electricity prices in the $k$th period on the $n$th date in the real-time and day-ahead markets, respectively. We also let $q_{n,k}$ and $\hat{q}_{n,k}$ denote the discharging amounts of electricity (or the charging amounts when negative) in the $k$th period on the $n$th date in the day-ahead and real-time markets, respectively.
Finally, we assume there are $K$ time periods in total in one day. We mainly study the risk-neutral case in this section.

Based on the market operation described above, in the decision process, we need to decide the day-ahead charging/discharging amounts for all operating hours of the next day before the day-ahead LMPs are available. Similar to the real-time market only case, the charging/discharging activities affect the day-ahead LMPs as well. Compared with thermal generation sources, the share of PHEV electricity storage energy is relatively small. We assume the influence of charging/discharging amounts on the day-ahead electricity price is $a$ dollars per unit. Corresponding to each particular day $n$, we assume the operational time interval is from 12am the current day to 12am the next day. To consider the action space for each state $(n, k)$, we discuss two cases:

(i) The states in which aggregators submit both the day-ahead and real-time offers. Suppose aggregators submit offers for the day-ahead market $l$ time periods before the operational time interval starts (12am) for the next day. In addition, as described in the previous section, aggregators submit offers for the real-time market $m$ ($m < l$) time periods ahead. Then, at time period $K - l$ for each particular day, the aggregator will submit the charging/discharging amounts for the day-ahead market on the next day and the charging/discharging amount for time period $K - l + m$ for the real-time market for the current day.

(ii) The states in which aggregators can only submit offers for the real-time market. All states on each particular day are in this category except the one described in (i).

For notational brevity, we first introduce the following vectors before we describe the Bellman equation: $\vec{p}_{n,k} = [p_{1,1}, \ldots, p_{n,k}]$, $\vec{f}_n = [f_{1,1}, \ldots, f_{n,K}]$, $\vec{q}_n = [q_{1,1}, \ldots, q_{n,K}]$, and $\vec{q}_{n,k:K} = [q_{n,k}, \ldots, q_{n,K}]$. Now we describe value function expressions.
Because the decision on charging/discharging amounts for the real-time market is made $m$ time periods ahead, corresponding to the value function for each state $(n, k)$ with $k \le K - l + m$, all necessary available information for the decision made in this time period includes $x_{n,k}$, $\vec{p}_{n,k-m}$, $\vec{f}_n$, and $\vec{q}_{n,k:K}$. Accordingly, we can describe the value function, i.e., representing the optimal value (e.g., maximum profit) from the $k$th period on day $n$ to the end of the operational time interval, in the form $V_{n,k}(x_{n,k}, \vec{p}_{n,k-m}, \vec{f}_n, \vec{q}_{n,k:K})$. Note that this value function has the same form for the state $(n, k)$ with $k = K - l + m$, for which we describe the information available for the decisions made in time period $K - l$, for both the day-ahead market for day $n + 1$ and the real-time market for time period $K - l + m$ for day $n$. On the other hand, corresponding to the value function for each state $(n, k)$ with $k \ge K - l + m + 1$, $\vec{f}_{n+1}$ and $\vec{q}_{n+1}$ become available when the decision is to be made. Accordingly, we can describe the value function in the form $V_{n,k}(x_{n,k}, \vec{p}_{n,k-m}, \vec{f}_{n+1}, \vec{q}_{n,k:K}, \vec{q}_{n+1,1:K})$. Similar to the real-time market only case, we initialize the electricity salvage value for the last period $V_{N,K+1}(\cdot)$ to be an increasing and concave function. For notational brevity, we let

$$V_{n,-l}(\cdot) = V_{n-1,K-l}(\cdot) \quad \text{for } 1 \le l \le K - 1. \tag{7}$$

In our approach, we allow both exogenous day-ahead and real-time prices to follow any price evolution process and assume that the controller's action does not affect the evolution of the exogenous prices. In addition, we make the following mild assumptions. For instance, we let $[\underline{p}_{n,k}, \overline{p}_{n,k}]$ denote the support of the random variable $p_{n,k}$ with $\underline{p}_{n,k} > 0$ and $\overline{p}_{n,k} < +\infty$. We assume the day-ahead discharging amount satisfies $\underline{q}_{n,k} \le q_{n,k} \le \overline{q}_{n,k}$, in which $\underline{q}_{n,k}$ is assumed to be $-\underline{p}_{n,k}/(2 b_{n,k})$ and $\overline{q}_{n,k}$ is an arbitrarily large number. Under this setting, we assume the aggregator will not buy enough electricity from the day-ahead market to reduce the real-time price significantly (e.g., by half of the lowest real-time price), because it is more reasonable to buy the electricity from the real-time market when the real-time price is decreased significantly.
4.2 Revenue Function for Each Period (n, k)

Corresponding to each state $(n, k)$, we let $x_{n,k}$ and $y_{n,k}$ represent the beginning and ending electricity storage levels. In addition, we denote $z_{n,k} = y_{n,k} - x_{n,k}$. Accordingly, we have $z_{n,k}^+ = \max\{z_{n,k}, 0\}$ representing the total net charging amount and $z_{n,k}^- = \max\{-z_{n,k}, 0\}$ representing the total net discharging amount, combining the day-ahead and real-time charging/discharging quantities. The electricity storage balance equation can be described as follows:

$$q_{n,k} + \hat{q}_{n,k} = \alpha_d z_{n,k}^- - z_{n,k}^+/\alpha_u. \tag{8}$$

For each time period $(n, k)$, because the charging/discharging amount for the day-ahead market $q_{n,k}$ is known when we make the offer for the real-time market, the charging/discharging amount for the real-time market $\hat{q}_{n,k}$ can be replaced by $\alpha_d z_{n,k}^- - z_{n,k}^+/\alpha_u - q_{n,k}$ in our formulations. Then, the day-ahead profit value for the current period is $q_{n,k}(f_{n,k} - a_{n,k} q_{n,k})$ with the consideration of market impact. The real-time profit value for the current period is equal to

$$(\alpha_d z_{n,k}^- - z_{n,k}^+/\alpha_u - q_{n,k})\big(p_{n,k} - b_{n,k}(\alpha_d z_{n,k}^- - z_{n,k}^+/\alpha_u - q_{n,k})\big). \tag{9}$$

Accordingly, the myopic profit function for state $(n, k)$ includes both the day-ahead and real-time profits and can be described as follows:

$$L_{n,k}(z_{n,k}, \vec{p}_{n,k-m}, \vec{f}_n, q_{n,k}) = q_{n,k}(f_{n,k} - a_{n,k} q_{n,k}) + E_{p_{n,k} \mid (\vec{p}_{n,k-m}, \vec{f}_n)}\Big[\big(\alpha_d z_{n,k}^- - z_{n,k}^+/\alpha_u - q_{n,k}\big)\big(p_{n,k} + b_{n,k} z_{n,k}^+/\alpha_u - b_{n,k} \alpha_d z_{n,k}^- + b_{n,k} q_{n,k}\big)\Big]. \tag{10}$$

4.3 Bellman Equations and Main Results

For each time period $(n, k)$, the action space $A(x_{n,k})$ represents the feasible set for decision making when the initial inventory is $x_{n,k}$. Under the two-market setting, the feasible action set is

$$A(x_{n,k}) = \{ y_{n,k} : (x_{n,k} - C_d)^+ \le y_{n,k} \le \min\{x_{n,k} + C_u, C\} \}. \tag{11}$$

The dynamics of the system can be represented under the following three categories:

(1) The value functions corresponding to states $(n, k)$ with $1 \le k \le K - l + m - 1$:

$$V_{n,k}(x_{n,k}, \vec{p}_{n,k-m}, \vec{f}_n, \vec{q}_{n,k:K}) = \max_{y_{n,k} \in A(x_{n,k})} J_{n,k}(x_{n,k}, y_{n,k}, \vec{p}_{n,k-m}, \vec{f}_n, \vec{q}_{n,k:K}),$$

where the joint value function

$$J_{n,k}(x_{n,k}, y_{n,k}, \vec{p}_{n,k-m}, \vec{f}_n, \vec{q}_{n,k:K}) = L_{n,k}(y_{n,k} - x_{n,k}, \vec{p}_{n,k-m}, \vec{f}_n, q_{n,k}) + \beta E_{p_{n,k-m+1} \mid (\vec{p}_{n,k-m}, \vec{f}_n)}\big[V_{n,k+1}(y_{n,k}, \vec{p}_{n,k-m+1}, \vec{f}_n, \vec{q}_{n,k+1:K})\big].$$

For these states, the corresponding time periods for deciding real-time offers are before period $K - l$, when the day-ahead offers are submitted.

(2) The value function corresponding to state $(n, k)$ with $k = K - l + m$:

$$V_{n,k}(x_{n,k}, \vec{p}_{n,k-m}, \vec{f}_n, \vec{q}_{n,k:K}) = \max_{y_{n,k} \in A(x_{n,k}),\ \vec{q}_{n+1,1:K}} J_{n,k}(x_{n,k}, y_{n,k}, \vec{p}_{n,k-m}, \vec{f}_n, \vec{q}_{n,k:K}, \vec{q}_{n+1,1:K}),$$

where the joint value function

$$J_{n,k}(x_{n,k}, y_{n,k}, \vec{p}_{n,k-m}, \vec{f}_n, \vec{q}_{n,k:K}, \vec{q}_{n+1,1:K}) = L_{n,k}(y_{n,k} - x_{n,k}, \vec{p}_{n,k-m}, \vec{f}_n, q_{n,k}) + \beta E_{(p_{n,k-m+1}, \vec{f}_{n+1}) \mid (\vec{p}_{n,k-m}, \vec{f}_n)}\big[V_{n,k+1}(y_{n,k}, \vec{p}_{n,k-m+1}, \vec{f}_{n+1}, \vec{q}_{n,k+1:K}, \vec{q}_{n+1,1:K})\big].$$

For this case, the decisions for both the day-ahead and real-time offers are made at time period $K - l$, with the available information $(x_{n,k}, \vec{p}_{n,k-m}, \vec{f}_n, \vec{q}_{n,k:K})$. One time period later, the day-ahead LMPs ($\vec{f}_{n+1}$) and offer amounts ($\vec{q}_{n+1,1:K}$) for day $n + 1$ are available, besides $y_{n,k}$ and $\vec{p}_{n,k-m+1}$, which leads to the expression $V_{n,k+1}(y_{n,k}, \vec{p}_{n,k-m+1}, \vec{f}_{n+1}, \vec{q}_{n,k+1:K}, \vec{q}_{n+1,1:K})$.

(3) The value functions corresponding to states $(n, k)$ with $K - l + m + 1 \le k \le K$:

$$V_{n,k}(x_{n,k}, \vec{p}_{n,k-m}, \vec{f}_{n+1}, \vec{q}_{n,k:K}, \vec{q}_{n+1,1:K}) = \max_{y_{n,k} \in A(x_{n,k})} J_{n,k}(x_{n,k}, y_{n,k}, \vec{p}_{n,k-m}, \vec{f}_{n+1}, \vec{q}_{n,k:K}, \vec{q}_{n+1,1:K}),$$

where the joint value function

$$J_{n,k}(x_{n,k}, y_{n,k}, \vec{p}_{n,k-m}, \vec{f}_{n+1}, \vec{q}_{n,k:K}, \vec{q}_{n+1,1:K}) = L_{n,k}(y_{n,k} - x_{n,k}, \vec{p}_{n,k-m}, \vec{f}_{n+1}, q_{n,k}) + \beta E_{p_{n,k-m+1} \mid (\vec{p}_{n,k-m}, \vec{f}_{n+1})}\big[V_{n,k+1}(y_{n,k}, \vec{p}_{n,k-m+1}, \vec{f}_{n+1}, \vec{q}_{n,k+1:K}, \vec{q}_{n+1,1:K})\big].$$

For these states, the day-ahead LMPs and offer amounts for day $n + 1$ are available and the only decision is the real-time offer.

Finally, for the last day (i.e., day $N$), there are no day-ahead offers to be determined. The objective is to maximize the total discounted expected profit of operating the system, $V_{1,1}(x_{1,1}, \vec{p}_{1,1-m}, \vec{f}_1, \vec{q}_1)$, where $\vec{q}_1$ is assumed to be 0.

From the above Bellman equations, we can observe that, compared with the real-time market only case, the two-market case has an additional one-time decision on the day-ahead offer amounts for each particular day. Although this decision happens at only one time period per day, the dimension of the problem increases significantly. In the following, we explore structural properties to help reduce computational times.
Proposition 5 For the risk-neutral aggregators participating in both the day-ahead and real-time markets corresponding to each operating day $n$, $1 \le n \le N$, we can observe the following properties for any feasible policy:

(a) $L_{n,k}(z_{n,k}, \vec{p}_{n,k-m}, \vec{f}_n, q_{n,k})$ is strictly concave in $(z_{n,k}, q_{n,k})$ for $1 \le k \le K - l + m$, and $L_{n,k}(z_{n,k}, \vec{p}_{n,k-m}, \vec{f}_{n+1}, q_{n,k})$ is strictly concave in $(z_{n,k}, q_{n,k})$ for $K - l + m + 1 \le k \le K$.

(b) $J_{n,k}(x_{n,k}, y_{n,k}, \vec{p}_{n,k-m}, \vec{f}_n, \vec{q}_{n,k:K})$ is strictly concave in $(x_{n,k}, y_{n,k}, \vec{q}_{n,k:K})$ for $1 \le k \le K - l + m$, and $J_{n,k}(x_{n,k}, y_{n,k}, \vec{p}_{n,k-m}, \vec{f}_{n+1}, \vec{q}_{n,k:K}, \vec{q}_{n+1,1:K})$ is strictly concave in $(x_{n,k}, y_{n,k}, \vec{q}_{n,k:K}, \vec{q}_{n+1,1:K})$ for $K - l + m + 1 \le k \le K$.

(c) $V_{n,k}(x_{n,k}, \vec{p}_{n,k-m}, \vec{f}_n, \vec{q}_{n,k:K})$ is strictly concave in $(x_{n,k}, \vec{q}_{n,k:K})$ for $1 \le k \le K - l + m$, and $V_{n,k}(x_{n,k}, \vec{p}_{n,k-m}, \vec{f}_{n+1}, \vec{q}_{n,k:K}, \vec{q}_{n+1,1:K})$ is strictly concave in $(x_{n,k}, \vec{q}_{n,k:K}, \vec{q}_{n+1,1:K})$ for $K - l + m + 1 \le k \le K$.

(d) $V_{n,k}(x_{n,k}, \vec{p}_{n,k-m}, \vec{f}_n, \vec{q}_{n,k:K})$ is increasing in $x_{n,k}$ for $1 \le k \le K - l + m$, and $V_{n,k}(x_{n,k}, \vec{p}_{n,k-m}, \vec{f}_{n+1}, \vec{q}_{n,k:K}, \vec{q}_{n+1,1:K})$ is increasing in $x_{n,k}$ for $K - l + m + 1 \le k \le K$.

Proof: The proof is given in Appendix A.7.

Theorem 3 Corresponding to each operating day $n$, $1 \le n \le N$, there exist a unique optimal target electricity storage level $y_{n,k}^*$ and a unique optimal day-ahead market trading amount $q_{n,k}^*$, corresponding to any given initial storage level and any available price information, for each time period $k$: $1 \le k \le K$.

Proof: The proof is given in Appendix A.7.

Following an approach similar to that described for the risk-averse real-time market only case and the risk-neutral two-market case, we can obtain a similar conclusion for a risk-averse aggregator participating in both markets. For instance, we have the same state and action spaces as in the above risk-neutral case. We can also prove the concavity of the myopic expected utility function and the joint value function under the same assumptions. Accordingly, corresponding to each operating day $n$, there exist a unique optimal target electricity storage level $y_{n,k}^*$ and a unique optimal day-ahead market trading amount $q_{n+1,k}^*$ for each time period $k$: $1 \le k \le K$ for given available price information. We omit the detailed description for brevity.
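To make the objects in Proposition 5 concrete, the following minimal sketch evaluates the two-market myopic profit built from the balance equation (8) and the profit terms in (9) and (10). It is illustrative only: the function names and parameter values are hypothetical, and because (9) is linear in the real-time price, the expectation in (10) is taken by passing the conditional expected price as `p_exp`.

```python
def net_sales(z, alpha_d, alpha_u):
    """Right-hand side of (8): total energy sold across both markets
    for a net storage change z = y - x (negative when energy is bought)."""
    z_plus = max(z, 0.0)    # net charging amount
    z_minus = max(-z, 0.0)  # net discharging amount
    return alpha_d * z_minus - z_plus / alpha_u


def two_market_profit(z, q_da, f, a, p_exp, b, alpha_d, alpha_u):
    """Myopic profit in the spirit of (10).

    q_da  : day-ahead discharging amount (already committed)
    f, a  : day-ahead price and day-ahead market-impact coefficient
    p_exp : conditional expected real-time price (assumption: passed in directly)
    b     : real-time market-impact coefficient
    """
    q_rt = net_sales(z, alpha_d, alpha_u) - q_da  # real-time amount implied by (8)
    day_ahead = q_da * (f - a * q_da)             # day-ahead profit with market impact
    real_time = q_rt * (p_exp - b * q_rt)         # expected real-time profit, from (9)
    return day_ahead + real_time


if __name__ == "__main__":
    # Hypothetical numbers: discharge 2 units of storage, 1 unit already sold day-ahead.
    print(two_market_profit(z=-2.0, q_da=1.0, f=35.0, a=5.0,
                            p_exp=40.0, b=10.0, alpha_d=0.9, alpha_u=0.9))
```

Setting `q_da` to zero recovers the single-market myopic profit of Section 3, which is why the real-time problem can be solved with the same machinery once the day-ahead amounts are fixed.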
4.4 Algorithm Development

In this subsection, we describe algorithms to solve the optimal control problem for participating in both the real-time and day-ahead markets. We discretize the state space to approximate the continuous problem described in the previous subsections. To discretize the state space, we evenly divide the storage charging/discharging range into $N_I$ units, the real-time price range into $N_P$ units, and the day-ahead offer amount range into $Q$ units. We assume the real-time prices are Markovian and the day-ahead prices are independent across operating days and independent of the real-time prices.

From the analysis described in the previous subsection, we can observe that once the day-ahead offers are decided, the remaining problem is equivalent to solving the real-time only case. Meanwhile, we notice that the day-ahead offers are decided at one time unit for each operating day. Therefore, the whole dynamic programming framework consists of two main parts: (1) deciding the real-time offer amounts corresponding to the given day-ahead offer amounts (see Algorithm 3, which solves Cases (1) and (3) in Subsection 4.3) and (2) deciding the day-ahead offer amounts (see Algorithm 2, which solves Case (2) in Subsection 4.3).

To solve (1), the concavity derived in Proposition 1 (b) and the monotonicity derived in Proposition 2 (a) can help speed up the algorithm. More specifically, the concavity allows a bisection search and the monotonicity reduces the search space (see, e.g., line 5 in Algorithm 3). To solve (2), we notice that for a continuous jointly strictly concave function $f(x_1, \ldots, x_N)$, if there exists a point $(x_1, \ldots, x_N)$ where all first-order partial derivatives are zero, i.e., $\partial f(x_1, \ldots, x_N)/\partial x_i = 0$ for $i = 1, \ldots, N$, then this point is the global maximizer. Based on this optimality condition and the strict concavity of the value function in the day-ahead offer amounts shown in Proposition 5, we can develop an efficient algorithm, i.e., Algorithm 2. Now we are ready to explain the whole framework, as described in Algorithm 1.
(i) Day $N$ and time periods $K - l + m$ to $K$: For this case, the day-ahead offer amounts for day $N$ are given and there is no need to decide any further day-ahead offer amounts for day $N + 1$. We only need to use Algorithm 3 to decide the real-time offer amounts for these time periods.

(ii) Time periods $K - l + m$ on day $n$ to time periods $K - l + m - 1$ on day $n + 1$, for $n = N - 1$ to 1: Following the definition described in (7), these periods correspond to time periods $-l + m$ to $K - l + m - 1$ on day $n$, for $n = N$ to 2. For this case, in time period $-l + m$, we need to derive the optimal day-ahead offer amounts for day $n$. After this, we only need to derive the optimal real-time offer amount for each particular time period. Algorithm 2 is designed to accomplish this, noting that Algorithm 3 is called in Algorithm 2.

(iii) Time periods 1 to $K - l + m - 1$ on day 1: For this case, the day-ahead offer amounts for day 1 are initialized to be zeros. We can also use Algorithm 3 to accomplish this, with different initial values as compared to (i).

Algorithm 1: The Whole Framework
Output: Optimal day-ahead offers and ending storage levels
1 Initialize $L_k(q, x, y, p)$ as described in (10) for all $q$, $x$, $y$, and $p$, with $f_k$ replaced by $E[f_k]$ and $E[p_k \mid p_{k-m} = p] = \sum_{j=1}^{N_P} j P_{pj}^m$, where $P_{pj}^m$ is the $m$-step transition probability from price $p$ to price $j$
2 Initialize $V_{N,K+1}(x, p) = \alpha \ln(x)$ for all $x$ and $p$
3 for $\vec{q}_{N,K-l+m:K} = [1:Q, \ldots, 1:Q]$ do
4     Obtain $V_{N,K-l+m:K}(\vec{q}_{N,K-l+m:K}, :, :)$ and $y_{N,K-l+m:K}(\vec{q}_{N,K-l+m:K}, :, :)$ from Algorithm 3 with inputs $\vec{q}_{N,K-l+m:K}$ and $V_{N,K+1}(:, :)$
5 for $n = N$ to 2 do
6     for $\vec{q}_{n,-l+m:0} = [1:Q, \ldots, 1:Q]$ do
7         Obtain $\vec{q}_{n,1:K}(\vec{q}_{n,-l+m:0}, :, :)$, $V_{n,-l+m}(\vec{q}_{n,-l+m:0}, :, :)$, $Y_{n,-l+m}(\vec{q}_{n,-l+m:0}, :, :)$, $V_{n,-l+m+1:K-l+m-1}(\vec{q}_{n,-l+m:0}, :, :, :, :)$, and $Y_{n,-l+m+1:K-l+m-1}(\vec{q}_{n,-l+m:0}, :, :, :, :)$ from Algorithm 2 with inputs $\vec{q}_{n,-l+m:0}$ and $V_{n,K-l+m}(:, :, :)$
8 Obtain $V_{1,1:K-l+m-1}(:, :)$ and $Y_{1,1:K-l+m-1}(:, :)$ from Algorithm 3 with zero day-ahead offer amounts and $V_{2,-l+m}(\vec{q}_{2,-l+m:0} = 0, :, :)$ as inputs
9 return $\vec{q}_{2:N,1:K}(:, :, :)$ and $Y_{1:N,1:K}(:, \ldots, :)$

From Algorithms 1 to 3, we can observe that for the general case, Algorithm 1 takes $O(N Q^{l-m+1})$ iterations with Algorithm 2 embedded in each iteration. Similarly, Algorithm 2 takes $O(N_I N_P K \ln Q)$ iterations with Algorithm 3 embedded. Note that in our experiments, the while loop takes no more than five iterations. In addition, Algorithm 3 takes $O(K N_I N_P \ln N_I)$ iterations.
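The $\ln N_I$ factor in the iteration count for Algorithm 3 is what one would expect from a bisection search over a concave objective on the discretized storage grid. Algorithm 3 itself is not reproduced here, so the following is only a minimal sketch of that idea and not the authors' procedure: the names `argmax_concave`, `monotone_policy`, `joint_value`, and `feasible` are hypothetical. The sketch combines a discrete bisection search, valid under concavity in the target level as in Proposition 1 (b), with the monotonicity of Proposition 2 (a), which lets the search for the next storage level start from the previous maximizer.

```python
from typing import Callable, List, Tuple


def argmax_concave(f: Callable[[int], float], lo: int, hi: int) -> int:
    """Return a maximizer of f over the integers lo..hi, assuming f is concave there.

    Each step compares f(mid) with f(mid + 1) and discards half of the range,
    so the loop runs about log2(hi - lo + 1) times.
    """
    while lo < hi:
        mid = (lo + hi) // 2
        if f(mid) < f(mid + 1):
            lo = mid + 1  # still increasing: a maximizer lies to the right of mid
        else:
            hi = mid      # flat or decreasing: a maximizer lies at or left of mid
    return lo


def monotone_policy(
    joint_value: Callable[[int, int], float],
    n_levels: int,
    feasible: Callable[[int], Tuple[int, int]],
) -> List[int]:
    """Optimal target index for every initial storage index.

    Assumes the maximizer is non-decreasing in the initial level, so the
    previous maximizer is used as a lower bound for the next search.
    """
    policy: List[int] = []
    lower = 0
    for i in range(n_levels):
        lo, hi = feasible(i)
        lo = max(lo, lower)  # shrink the search space using monotonicity
        best = argmax_concave(lambda j: joint_value(i, j), lo, hi)
        policy.append(best)
        lower = best
    return policy


if __name__ == "__main__":
    # Hypothetical concave objective on an 11-point grid with a +/-3 step limit.
    demo_value = lambda i, j: -(j - 0.6 * i - 2.2) ** 2
    demo_feasible = lambda i: (max(i - 3, 0), min(i + 3, 10))
    print(monotone_policy(demo_value, 11, demo_feasible))
```

If the maximizer were not monotone in the initial level, the lower bound could exclude the true optimum, so this shortcut is only appropriate where a result such as Proposition 2 (a) applies.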
In sum, the complexity of the whole algorithm for the general case is $O(Q^{l-m+1} K^2 N N_I^2 N_P^2 \ln Q \ln N_I)$. In our specific problem setting, for a parking lot of a company, the reasonable operating time interval is usually from 8am to 4pm. With day-ahead offers submitted at noon and real-time offers submitted two time periods ahead, the corresponding complexity is $O(Q^3 K^2 N N_I^2 N_P^2 \ln Q \ln N_I)$. Meanwhile, we notice that the computational cost can become very large as the state space increases. Accordingly, we will develop efficient approximate dynamic