An Online Learning Algorithm for Demand Response in Smart Grid


Shahab Bahrami, Student Member, IEEE, Vincent W.S. Wong, Fellow, IEEE, and Jianwei Huang, Fellow, IEEE

Abstract: Demand response program with real-time pricing can encourage electricity users towards scheduling their energy usage to off-peak hours. A user needs to schedule the energy usage of his appliances in an online manner since he may not know the energy prices and the demand of his appliances ahead of time. In this paper, we study the users' long-term load scheduling problem and model the changes of the price information and load demand as a Markov decision process, which enables us to capture the interactions among users as a partially observable stochastic game. To make the problem tractable, we approximate the users' optimal scheduling policy by the Markov perfect equilibrium (MPE) of a fully observable stochastic game with incomplete information. We develop an online load scheduling learning (LSL) algorithm based on the actor-critic method to determine the users' MPE policy. When compared with the benchmark of not performing demand response, simulation results show that the LSL algorithm can reduce the expected cost of users and the peak-to-average ratio (PAR) in the aggregate load by 28% and 13%, respectively. When compared with the short-term scheduling policies, the users with the long-term policies can reduce their expected cost by 17%.

Keywords: Demand response, real-time pricing, partially observable stochastic game, online learning, actor-critic method.

I. INTRODUCTION

The future smart grid aims to empower utility companies and users to make more informed energy management decisions. This motivates the utility companies to provide users with incentives to adjust the timing of their electricity usage [1]. The incentives may be through a demand response program with time-varying pricing schemes such as real-time pricing (RTP) and inclining block rate (IBR) pricing [2].
With a properly designed demand response program, the utility company can decrease its generation cost due to the reduction of peak-to-average ratio (PAR) in the aggregate load. Meanwhile, users can reduce their payment by taking advantage of low prices at off-peak hours.

There are several challenges for users to optimally determine their energy schedule in a demand response program. First, if the utility company uses RTP or IBR, the users' scheduling decisions are coupled, since the appliances' energy schedule of a user affects the price that is charged to all users, and hence affects other users' cost. Second, each user is uncertain about the total demand of other users, as well as the time of use and operation constraints of his own appliances. In particular, each appliance's operation depends on its task specifications (e.g., task duration, start time/deadline of the task), which are not known a priori until the user decides to turn on that appliance. Third, the users may not know the price information ahead of time.

Manuscript received on Oct. 8, 2016, revised on Jan. 11, 2017, and accepted on Feb. 2. This work is supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) under Strategic Project Grant (STPGP ), and the Theme-based Research Scheme (Project No. T23-407/13-N) from the Research Grants Council of the Hong Kong Special Administrative Region, China. S. Bahrami and V.W.S. Wong are with the Department of Electrical and Computer Engineering, The University of British Columbia, Vancouver, BC, Canada, V6T 1Z4. J. Huang is with the Department of Information Engineering, The Chinese University of Hong Kong, Hong Kong, email: {bahramis, vincentw}@ece.ubc.ca, jwhuang@ie.cuhk.edu.hk

There have been some efforts in tackling the above challenges. We divide the related literature into two main threads. The first thread is concerned with techniques for scheduling the energy usage of the appliances in a household with a myopic user, who aims to minimize his cost in a short period of time (e.g., one day). Samadi et al.
in [3] proposed pricing algorithms based on stochastic approximation to minimize the PAR of the aggregate load in one day for a single household. Chen et al. in [4] proposed a robust optimization approach to minimize the worst-case daily bill payment of a myopic user in a market with the RTP scheme. Eksin et al. in [5] captured the interactions among myopic users with heterogeneous but correlated consumption preferences with the RTP scheme as a Bayesian game. Forouzandehmehr et al. in [6] proposed a differential stochastic game framework to capture the interactions among myopic users with controllable appliances. In these works, however, it was not mentioned how the proposed scheduling algorithms can be used for foresighted users, who aim to minimize their long-term costs.

The second thread is concerned with techniques for scheduling the appliances in a household with a foresighted user. Wen et al. in [7] proposed a reinforcement learning algorithm to address the appliances scheduling problem in a household. Kim et al. in [8] proposed a load scheduling algorithm based on Q-learning for a microgrid with time-of-use pricing scheme. Liang et al. in [9] proposed a Q-learning approach to minimize the bill payment and discomfort cost of a foresighted user in a household. Ruelens et al. in [10] proposed a batch reinforcement learning algorithm to schedule controllable loads such as water heater and heat-pump thermostat. These works, however, did not mention how the proposed learning algorithms can capture the decision making of multiple foresighted users. Xiao et al. in [11] applied dynamic programming to model the interactions among multiple foresighted suppliers. Yao et al. in [12] studied the electricity sharing problem among multiple foresighted users with the RTP scheme. The scheduling problem of each individual user is formulated as a Markov decision process. A specific structure for the suboptimal policy of each user is determined. Jia et al. in [13] proposed a

learning algorithm based on stochastic approximation for the utility company to determine the day-ahead price values in a market with multiple foresighted users. These works, however, did not study the operation constraints of different electrical appliances in residential sectors.

In this paper, we focus on designing a load scheduling learning (LSL) algorithm for multiple residential users, who schedule their appliances in response to RTP information. Each user is aware that the total energy consumption (not just his own) will affect the price announced by the utility company. Furthermore, each user is selfish and aims to minimize his own bill payment. We study the long-term interactions among foresighted users instead of the short-term interactions among myopic users. It enables us to model the users' decision making with uncertainty about the price information and load demand of their appliances as a Markov decision process with different states for different possible scenarios. We capture the interactions among users as a stochastic game [14].

Markov perfect equilibrium (MPE) is a standard solution concept for analyzing stochastic games. Several algorithms have been proposed to determine an MPE in fully observable stochastic games [15]–[22]. Some algorithms are model-based and require knowledge of the dynamics of the system, i.e., the state transition probabilities. The model-based learning algorithms include rational learning methods [15]–[17], linear programming based algorithms [18], [19], and the homotopy method [20]. Some other learning algorithms are model-free and aim to determine an MPE when the system dynamics are unknown. Examples of model-free approaches include the Lyapunov optimization method [21] and reinforcement learning algorithms [22]. In the demand response program, the underlying game is partially observable [23]–[25], since each user only observes his own state and is uncertain about other users' states. The key challenge in our model is to characterize the MPE under the partial observability of each user and the interdependency among the users' policies.
This paper is an extension of our previous work [26] that takes into account the uncertainty in the energy price and users' load demand. The contributions of this paper are as follows:

Novel Solution Approach: The partially observable stochastic game is a realistic framework to model the interactions among users, but it is difficult to solve. To make the problem tractable, we propose an algorithm executed by each user to approximate the state of all users using some additional information from the utility company. It enables us to approximate the users' optimal policy by the MPE policy in a fully observable stochastic game with incomplete information, which is more tractable.

Learning Algorithm Design: We formulate an individual optimization problem for each household, whose global optimal solution corresponds to the MPE policy of the proposed fully observable stochastic game with incomplete information. We develop a distributed LSL algorithm, based on the actor-critic method [27]–[30], that converges to the MPE policy. The algorithm is online and model-free, which enables users to learn from the consequences of their past decisions and schedule their appliances in an online fashion without knowing the system dynamics.

Performance Evaluation: We evaluate the performance of the LSL algorithm in reducing the PAR in the aggregate load and the expected cost of users. Compared with the benchmark of not performing demand response, our results show that the LSL algorithm can reduce the PAR in the aggregate load and the expected cost of foresighted users by 13% and 28%, respectively. We compare the policy of the foresighted and myopic users, and show that foresighted users can reduce their daily cost by 17%. When compared with the Q-learning method (e.g., in [7] and [8]), the LSL algorithm based on the actor-critic method converges faster to the MPE policy.

The rest of this paper is organized as follows. Section II introduces the system model.
In Section III, we model the interactions among users as a partially observable stochastic game and approximate it by a fully observable stochastic game with incomplete information. In Section IV, we develop a distributed learning algorithm to compute the MPE. In Section V, we evaluate the performance of the proposed algorithm through simulations. Section VI concludes the paper.

II. SYSTEM MODEL

We consider a system with one utility company and a set $\mathcal{N} = \{1, \dots, N\}$ of $N$ households. Each household is equipped with an energy consumption controller (ECC) responsible for scheduling the appliances in that household. The ECC is connected to the utility company via a two-way communication network, which enables the exchange of the price information and the household's load demand. Users participate in the demand response program for a long period of time (e.g., several weeks). We divide the time into a set $\mathcal{T} = \{1, \dots, T\}$ of $T$ equal time slots, e.g., 15 minutes per time slot. In this paper, we use ECC, household, and user interchangeably.

A. Appliances Model

Let $\mathcal{A}_i = \{1, \dots, A_i\}$ denote the set of appliances in household $i \in \mathcal{N}$, where $A_i$ is the total number of appliances. In each time slot, an appliance is either awake or asleep, indicating whether it is ready to operate or not. We define the appliance's operation state as follows:

Definition 1 (Appliance Operation State): For household $i \in \mathcal{N}$, the operation state of appliance $a \in \mathcal{A}_i$ in time slot $t \in \mathcal{T}$ is a tuple $s_{a,i,t} = (r_{a,i,t}, q_{a,i,t}, \delta_{a,i,t})$, where $r_{a,i,t}$ is the number of remaining time slots to complete the current task, $q_{a,i,t}$ is the number of time slots for which the current task can be delayed, and $\delta_{a,i,t}$ is the number of time slots since the most recent time slot in which appliance $a$ became awake with a new task.

Fig. 1 shows the values of $r_{a,i,t}$, $q_{a,i,t}$, and $\delta_{a,i,t}$ for appliance $a \in \mathcal{A}_i$, which has a task that should be operated for three time slots with a maximum delay of three time slots.
When appliance $a$ becomes awake in time slot $t$, $r_{a,i,t}$ and $q_{a,i,t}$ are initialized based on the current task (e.g., here we have $r_{a,i,t} = q_{a,i,t} = 3$), and $\delta_{a,i,t}$ is set to 1. The value of $r_{a,i,t}$ decreases when appliance $a$ executes its task and becomes 0 when the appliance has completed its task and is

asleep in time slot $t$. The value of $q_{a,i,t}$ remains unchanged when the task is executed, and decreases when the task is delayed. When $q_{a,i,t}$ is 0, the ECC cannot delay the appliance's task. The value of $\delta_{a,i,t}$ increases in each time slot and is reset to 1 when appliance $a$ becomes awake with a new task. The appliance may start a new task right after completing the current task. Thus, without becoming asleep, $r_{a,i,t}$ and $q_{a,i,t}$ are initialized based on the new task, and $\delta_{a,i,t}$ is set to 1.

Fig. 1. The values of $r_{a,i,t}$, $q_{a,i,t}$, and $\delta_{a,i,t}$ for appliance $a$, which should be operated for three time slots with a maximum delay of three time slots.

The ECC does not know when an appliance becomes awake ahead of time. Instead, it has a belief regarding $P_{a,i}(\delta_{a,i,t})$, the probability that the difference between two sequential wake-up times for appliance $a$ is $\delta_{a,i,t}$, for $\delta_{a,i,t} \ge 1$. Such a probability distribution can be estimated, for example, based on the awake history for appliance $a$. The ECC can approximate $P_{a,i}(\delta_{a,i,t})$ by the fraction of events in a given historical data record for which the difference between two consecutive wake-up times is $\delta_{a,i,t}$. Appliance $a$ may become awake in the next time slot (for a new task) if either appliance $a$ is asleep or it will complete the current task in the current time slot. In Appendix A, we show that given current time $t$, the probability $P_{a,i,t+1}$ that appliance $a \in \mathcal{A}_i$ becomes awake with a new task in the next time slot $t + 1 \in \mathcal{T}$ is

$$P_{a,i,t+1} = \frac{P_{a,i}(\delta_{a,i,t})}{1 - \sum_{\Delta=1}^{\delta_{a,i,t}-1} P_{a,i}(\Delta)}. \tag{1}$$

We partition the set of appliances into must-run and controllable. Let $\mathcal{A}_i^{M}$ denote the set of must-run appliances in household $i$. Examples of must-run appliances include lighting and TV. The ECC has no control over the operation of must-run appliances. On the other hand, the ECC can control the time of use for the controllable appliances. The set of controllable appliances in household $i$ can further be partitioned into two sets: the set $\mathcal{A}_i^{N}$ of non-interruptible appliances, and the set $\mathcal{A}_i^{I}$ of interruptible appliances.
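The wake-up probability in (1) can be illustrated with a short sketch. It assumes, as the text suggests, that the gap distribution is estimated as an empirical frequency from a record of past wake-up slots; the function names and the sample history are hypothetical and not part of the paper.

```python
from collections import Counter


def estimate_gap_distribution(wake_up_slots):
    """Empirical P(delta): frequency of each gap between consecutive wake-ups."""
    gaps = [later - earlier for earlier, later in zip(wake_up_slots, wake_up_slots[1:])]
    counts = Counter(gaps)
    return {gap: n / len(gaps) for gap, n in counts.items()}


def wake_up_probability(gap_dist, delta):
    """Equation (1): chance of waking in slot t+1 when delta slots have elapsed."""
    # Probability mass of gaps that are still possible (gap >= delta).
    remaining = 1.0 - sum(p for gap, p in gap_dist.items() if gap < delta)
    if remaining <= 0.0:
        return 0.0  # assumption: no support in the history for such a long gap
    return gap_dist.get(delta, 0.0) / remaining


# Hypothetical history of wake-up slots; the gaps are 4, 4, 6, 4.
history = [1, 5, 9, 15, 19]
dist = estimate_gap_distribution(history)
p_after_4 = wake_up_probability(dist, 4)
p_after_5 = wake_up_probability(dist, 5)
p_after_6 = wake_up_probability(dist, 6)
```

In this toy record the probability rises to one at a gap of six slots, because no longer gap was ever observed; a real estimate would need a much longer history.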
Examples of non-interruptible appliances include washing machine and dish washer, and examples of interruptible appliances include air conditioner and electric vehicle (EV). The ECC may schedule a non-interruptible appliance during several consecutive time slots, but cannot interrupt its task. The ECC may delay or interrupt the operation of an interruptible appliance. Each time an appliance $a \in \mathcal{A}_i$ becomes awake, it sends information about its new task's specifications to the ECC.

Definition 2 (Task's Specifications): For an appliance $a \in \mathcal{A}_i$, the specifications of its task include the average power consumption $p^{\text{avg}}_{a,i}$ to execute the task, the scheduling window $\mathcal{T}_{a,i} = [t^{s}_{a,i}, t^{d}_{a,i}]$ corresponding to a time interval which includes the earliest start time $t^{s}_{a,i} \in \mathcal{T}$ and the deadline $t^{d}_{a,i} \in \mathcal{T}$ for the task, the operation duration $d_{a,i}$ for a must-run or non-interruptible appliance corresponding to the total number of time slots required to complete the task, and the interval $[d^{\min}_{a,i}, d^{\max}_{a,i}]$ for an interruptible appliance corresponding to the range of the operation duration.

The value of the average power consumption $p^{\text{avg}}_{a,i}$ is assumed to be fixed and known a priori for each appliance $a$. The operation duration $d_{a,i}$ for a non-interruptible appliance $a \in \mathcal{A}_i^{N}$ is fixed. On the other hand, the operation duration $d_{a,i}$ for a task of an interruptible appliance $a \in \mathcal{A}_i^{I}$ can be any value in the range $[d^{\min}_{a,i}, d^{\max}_{a,i}]$, and we have $d^{\min}_{a,i} \ge 0$ and $d^{\max}_{a,i} \le t^{d}_{a,i} - t^{s}_{a,i}$.

We use the binary decision variable $x_{a,i,t} \in \{0, 1\}$ to indicate whether an appliance $a \in \mathcal{A}_i$ is scheduled to operate in time slot $t$ ($x_{a,i,t} = 1$) or not ($x_{a,i,t} = 0$). Notice that $x_{a,i,t}$ is equal to 0 when appliance $a$ is asleep (i.e., $r_{a,i,t} = 0$). Let $x_{i,t} = (x_{a,i,t}, a \in \mathcal{A}_i)$ denote the scheduling decision vector for all appliances in household $i$ in time slot $t$.
The ECC can infer the state $s_{a,i,t+1}$ of appliance $a$ in the next time slot $t + 1$ from the current state $s_{a,i,t}$, the probability $P_{a,i,t+1}$, the appliance's type, the task's specifications, and the scheduling decision $x_{a,i,t}$ as follows:

1) Must-run appliances: The feasible action for appliance $a \in \mathcal{A}_i^{M}$ in time slot $t \in \mathcal{T}$ is

$$x_{a,i,t} = \begin{cases} 1, & \text{if } r_{a,i,t} \ge 1, \\ 0, & \text{if } r_{a,i,t} = 0. \end{cases} \tag{2}$$

When appliance $a \in \mathcal{A}_i^{M}$ becomes awake with a new task, $r_{a,i,t}$ is set to $d_{a,i}$, and the ECC operates the appliance without delay, i.e., $q_{a,i,t}$ is equal to 0. Given current time $t$, the operation state in time slot $t + 1$ can be obtained as follows:

If either appliance $a \in \mathcal{A}_i^{M}$ is asleep (i.e., $r_{a,i,t} = 0$) or it will complete its task in the current time slot (i.e., $r_{a,i,t} = 1$), then appliance $a$ becomes awake in time slot $t + 1$ with probability $P_{a,i,t+1}$, with the corresponding next state

$$s_{a,i,t+1} = (d_{a,i}, 0, 1), \tag{3}$$

and the appliance is asleep in time slot $t + 1$ with probability $1 - P_{a,i,t+1}$, with the corresponding next state

$$s_{a,i,t+1} = (0, 0, \delta_{a,i,t} + 1). \tag{4}$$

If $r_{a,i,t} \ge 2$, then appliance $a \in \mathcal{A}_i^{M}$ has not completed its task yet. With probability 1, the corresponding next state is

$$s_{a,i,t+1} = (r_{a,i,t} - 1, 0, \delta_{a,i,t} + 1). \tag{5}$$

2) Non-interruptible controllable appliances: The feasible action for appliance $a \in \mathcal{A}_i^{N}$ in time slot $t \in \mathcal{T}$ is

$$x_{a,i,t} = \begin{cases} 0 \text{ or } 1, & \text{if } t \in \mathcal{T}_{a,i},\ r_{a,i,t} \ge 1,\ q_{a,i,t} \ge 1, \\ 1, & \text{if } t \in \mathcal{T}_{a,i},\ r_{a,i,t} \ge 1,\ q_{a,i,t} = 0, \\ 0, & \text{if } r_{a,i,t} = 0. \end{cases} \tag{6}$$

Equation (6) implies that the ECC can decide to operate a non-interruptible appliance $a$ or not when the appliance is awake

($r_{a,i,t} \ge 1$) and its current task can be delayed ($q_{a,i,t} \ge 1$). The ECC has to operate an awake appliance if the task cannot be delayed ($q_{a,i,t} = 0$). The ECC will not schedule appliance $a$ if it is asleep ($r_{a,i,t} = 0$). When appliance $a \in \mathcal{A}_i^{N}$ becomes awake, $r_{a,i,t}$ and $q_{a,i,t}$ are set to $d_{a,i}$ and $t^{d}_{a,i} - t^{s}_{a,i} - d_{a,i} + 1$, respectively. Given current time $t$, the operation state in the next time slot is as follows:

If either appliance $a \in \mathcal{A}_i^{N}$ is asleep (i.e., $r_{a,i,t} = 0$) or it will complete the current task in the current time slot (i.e., $r_{a,i,t} = 1$ and $x_{a,i,t} = 1$), then the appliance becomes awake in time slot $t + 1$ with probability $P_{a,i,t+1}$, with the corresponding next state

$$s_{a,i,t+1} = (d_{a,i},\ t^{d}_{a,i} - t^{s}_{a,i} - d_{a,i} + 1,\ 1), \tag{7}$$

and the appliance is asleep in time slot $t + 1$ with probability $1 - P_{a,i,t+1}$, with the corresponding next state

$$s_{a,i,t+1} = (0, 0, \delta_{a,i,t} + 1). \tag{8}$$

If $r_{a,i,t} \ge 2$ and $x_{a,i,t} = 1$, then appliance $a \in \mathcal{A}_i^{N}$ has not completed its task yet and is scheduled in the current time slot $t$. The appliance cannot be delayed in the next time slot, i.e., $q_{a,i,t+1} = 0$. With probability 1, the corresponding next state is

$$s_{a,i,t+1} = (r_{a,i,t} - 1, 0, \delta_{a,i,t} + 1). \tag{9}$$

If $r_{a,i,t} \ge 1$ and $x_{a,i,t} = 0$, then appliance $a \in \mathcal{A}_i^{N}$ has not completed its task yet and is not scheduled in the current time slot $t$. With probability 1, we have $s_{a,i,t+1} = (r_{a,i,t}, q_{a,i,t} - 1, \delta_{a,i,t} + 1)$. The action set in (6) implies that $x_{a,i,t}$ cannot be equal to 0 if $q_{a,i,t}$ is 0 in time slot $t$.

3) Interruptible controllable appliances: Equation (6) is the feasible action for appliance $a \in \mathcal{A}_i^{I}$ in time slot $t \in \mathcal{T}$. When an interruptible appliance $a \in \mathcal{A}_i^{I}$ becomes awake with a new task, $r_{a,i,t}$ is set to the maximum operation duration $d^{\max}_{a,i}$. To operate the appliance for at least $d^{\min}_{a,i}$ time slots, the ECC can delay the task in at most $t^{d}_{a,i} - t^{s}_{a,i} - d^{\min}_{a,i} + 1$ time slots. The maximum operation duration may not be completed before the deadline within the scheduling horizon $\mathcal{T}_{a,i}$. In this case, if $t + 1 \notin \mathcal{T}_{a,i}$, the interruptible appliance will become either asleep or awake with a new task in the next time slot $t + 1$.
The operation state in the next time slot $t + 1$ is as follows:

If the next time slot is not in the scheduling window (i.e., $t + 1 \notin \mathcal{T}_{a,i}$), appliance $a \in \mathcal{A}_i^{I}$ is asleep (i.e., $r_{a,i,t} = 0$), or the appliance will complete its task in the current time slot (i.e., $r_{a,i,t} = 1$ and $x_{a,i,t} = 1$), then the appliance becomes awake in time slot $t + 1$ with probability $P_{a,i,t+1}$, with the next state

$$s_{a,i,t+1} = (d^{\max}_{a,i},\ t^{d}_{a,i} - t^{s}_{a,i} - d^{\min}_{a,i} + 1,\ 1), \tag{10}$$

and the appliance is asleep in time slot $t + 1$ with probability $1 - P_{a,i,t+1}$, with the corresponding next state

$$s_{a,i,t+1} = (0, 0, \delta_{a,i,t} + 1). \tag{11}$$

If the next time slot is in the scheduling window (i.e., $t + 1 \in \mathcal{T}_{a,i}$), $r_{a,i,t} \ge 2$, and $x_{a,i,t} = 1$, then appliance $a \in \mathcal{A}_i^{I}$ is scheduled in the current time slot $t$. The appliance is awake in the next time slot $t + 1$ with probability 1, and the next state is

$$s_{a,i,t+1} = (r_{a,i,t} - 1, q_{a,i,t}, \delta_{a,i,t} + 1). \tag{12}$$

If $t + 1 \in \mathcal{T}_{a,i}$, $r_{a,i,t} \ge 1$, and $x_{a,i,t} = 0$, then the task of appliance $a \in \mathcal{A}_i^{I}$ is not scheduled in the current time slot $t$. The appliance is awake in the next time slot $t + 1$ with probability 1, with the corresponding next state

$$s_{a,i,t+1} = (r_{a,i,t}, q_{a,i,t} - 1, \delta_{a,i,t} + 1). \tag{13}$$

B. Pricing Scheme and Household's Cost

In a dynamic pricing scheme, the payment by each household depends on the time and total amount of energy consumption. Let $l_{i,t} = \sum_{a \in \mathcal{A}_i} p^{\text{avg}}_{a,i}\, x_{a,i,t}$ denote the aggregate load of household $i$ in time slot $t$. Let $l_t^{\text{others}}$ denote the aggregate background load demand of other users in time slot $t$ that do not participate in the demand response program. The utility company knows $l_t^{\text{others}}$ at the end of time slot $t$. Let $l_t = l_t^{\text{others}} + \sum_{i \in \mathcal{N}} l_{i,t}$ denote the aggregate load demand of all users in time slot $t$. We assume that the utility company uses a combination of RTP and IBR [3], [31]. In time slot $t \in \mathcal{T}$, the unit price $\lambda_t$ is

$$\lambda_t(l_t) = \begin{cases} \lambda_{1,t}, & \text{if } 0 \le l_t \le l_t^{\text{th}}, \\ \lambda_{2,t}, & \text{if } l_t > l_t^{\text{th}}, \end{cases} \tag{14}$$

where $\lambda_{1,t} \le \lambda_{2,t}$, $t \in \mathcal{T}$. Here, $\lambda_{1,t}$ and $\lambda_{2,t}$ are the unit price values in time slot $t$ when the aggregate load is lower and higher than the threshold $l_t^{\text{th}}$, respectively.
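The two-level price in (14) can be sketched in a few lines. The numeric loads, prices, and threshold below are hypothetical values chosen only to exercise both branches; the function names are likewise illustrative.

```python
def aggregate_load(household_loads, background_load):
    """Total load l_t: background demand plus all participating households."""
    return background_load + sum(household_loads)


def unit_price(total_load, price_low, price_high, threshold):
    """Equation (14): lower price up to the threshold, higher price above it."""
    if total_load < 0:
        raise ValueError("aggregate load must be non-negative")
    return price_low if total_load <= threshold else price_high


# Hypothetical slot: three households plus background demand (arbitrary units).
loads = [1.2, 0.8, 2.0]
total = aggregate_load(loads, background_load=3.0)
price = unit_price(total, price_low=0.10, price_high=0.15, threshold=6.5)
payment_first_household = loads[0] * price
```

Because the price depends on the aggregate load, one household moving a task out of this slot could push the total below the threshold and lower the unit price for everyone, which is the coupling the text describes.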
We define the vector of price parameters in time slot $t$ as $\boldsymbol{\lambda}_t = (\lambda_{1,t}, \lambda_{2,t}, l_t^{\text{th}})$. The price parameters are set by the utility company according to different factors such as the time of the day, day of the week, wholesale market conditions, and the operation conditions of the power network. We can capture the price changes by making the following assumption:

Assumption 1: The price parameters are generated according to a hidden Markov model. In each hidden state, the price parameters are generated from a probability distribution which is unknown to the users [32], [33].

Assumption 1 is consistent with many realistic situations of price determination. For example, the price parameters $\boldsymbol{\lambda}_t$ may change periodically. In this case, the hidden states correspond to the time of the day, and the price parameters vector for each hidden state is fixed. In a more general model, a hidden state corresponds to the time of the day and the price parameters are chosen from a known probability distribution (e.g., a truncated normal distribution) in each hidden state. If this is the case, the probability distribution for each time slot can be estimated by examining the historical prices of the same time slot from many days [33]. In Section V, we compare the users' scheduling decisions when the utility company applies the periodic and random price parameters, respectively.

The payment of household $i$ in time slot $t$ is $l_{i,t}\, \lambda_t(l_t)$. When the ECC interrupts the operation of the interruptible appliances, the corresponding user will experience a discomfort

cost. When an interruptible appliance $a \in \mathcal{A}_i^{I}$ becomes awake, it sends the user's desirable operation schedule $x^{\text{des}}_{a,i,t}$ for all time slots $t \in \mathcal{T}_{a,i}$ and the coefficients $\omega_{a,i,t}$, $a \in \mathcal{A}_i^{I}$, $t \in \mathcal{T}_{a,i}$ (measured in dollars) to the ECC to reflect the user's discomfort caused by any potential change of the operation schedule of interruptible appliance $a$. For each household $i$, we capture the discomfort cost from scheduling the interruptible appliances by the weighted Euclidean distance between the operation schedule with demand response and the desirable operation schedule as $\sum_{a \in \mathcal{A}_i^{I}} \omega_{a,i,t}\, |x_{a,i,t} - x^{\text{des}}_{a,i,t}|$, which is also used in [34]. The total cost for each household $i$ in time slot $t$ involves the payment and discomfort cost. That is,

$$c_{i,t}(l_t) = l_{i,t}\, \lambda_t(l_t) + \sum_{a \in \mathcal{A}_i^{I}} \omega_{a,i,t}\, |x_{a,i,t} - x^{\text{des}}_{a,i,t}|. \tag{15}$$

In the long-term scheduling problem, the scheduling horizon $T$ is a large number (e.g., if the scheduling horizon is six months and each time slot is 15 minutes, then we have $T \approx 17000$). Thus, it is reasonable to approximate the problem with an infinite scheduling horizon, and consider the expected discounted cost of each household $i$ with the discount factor $\beta$ [35, p. 150] as

$$(1 - \beta) \sum_{t=1}^{\infty} \beta^{t-1}\, c_{i,t}(l_{i,t}, l_{-i,t}). \tag{16}$$

The parameter $\beta$ in (16) can be used to characterize a wide range of users' behaviour. When $\beta$ is close to zero, the users are myopic, i.e., they aim to minimize their short-term cost (e.g., daily cost) without considering the consequences of their short-term policy on their future cost. When $\beta$ is close to one, the users are foresighted, i.e., they aim to minimize their long-term cost. One may assume different values of $\beta$ for different participating users. In this paper, we assume that all users have the same value of $\beta$. In a more general future study, one may consider the case where different users have different values of $\beta$.

In the cost model (16) with an infinite scheduling horizon, we can consider the stationary scheduling decision making that is independent of time.
Specifically, the decision making only depends on the price parameters and the appliance operation state in a time slot, but is independent of the time slot index $t$. Therefore, we can remove the time index $t$ from the appliances' states, price parameters, and the household's cost.

III. PROBLEM FORMULATION

Due to privacy concerns, each household does not reveal the information about its appliances to other households. We have

Assumption 2: The ECC can only observe the operation state of the appliances in its own household.

We capture the interactions among households in the demand response program as a partially observable stochastic game.

Game 1 (Households' Partially Observable Stochastic Game):

Players: The set of households $\mathcal{N}$.

States: The state of household $i$ is $s_i = (s_{a,i}, a \in \mathcal{A}_i)$.

Observations: The observation of household $i$ is $o_i = (s_i, \boldsymbol{\lambda}) \in \mathcal{O}_i$, where $\mathcal{O}_i$ is the set of possible observations for household $i$. Let $\boldsymbol{o} = (o_i, i \in \mathcal{N}) \in \mathcal{O}$ denote the observation profile of all households, where $\mathcal{O} = \prod_{i \in \mathcal{N}} \mathcal{O}_i$. We use notations $z(o_i)$ and $z(\boldsymbol{o})$ to denote the value of an arbitrary parameter $z$ in observation $o_i$ of household $i$ and in observation profile $\boldsymbol{o}$ of all households, respectively.

Actions: We define the action vector of household $i$ in observation profile $\boldsymbol{o}$ as $x_i(\boldsymbol{o}) = (x_{a,i}(\boldsymbol{o}), a \in \mathcal{A}_i)$. Let $\boldsymbol{x}(\boldsymbol{o}) = (x_i(\boldsymbol{o}), i \in \mathcal{N})$ denote the action profile of all households. Let $\mathcal{X}_i(o_i)$ denote the feasible action space obtained from (2) and (6) for household $i$ with observation $o_i$.

Transition Probabilities: Given the current price parameters, Assumption 1 implies that the price parameters vector is Markovian. From Section II-A, the next state of an appliance depends only on its current state and action. Thus, the transition between the observations of a household is Markovian. Let $P_i(o_i' \mid o_i, x_i(\boldsymbol{o}))$ denote the transition probability from observation $o_i \in \mathcal{O}_i$ to $o_i' \in \mathcal{O}_i$ with action $x_i(\boldsymbol{o})$. It depends on the appliances' wake-up probability in (1). Furthermore, the users have independent preferred plans of using their appliances. Hence, the states of different households are independent.
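The transition probability from observation profile $\boldsymbol{o} \in \mathcal{O}$ to $\boldsymbol{o}' \in \mathcal{O}$ with action profile $\boldsymbol{x}(\boldsymbol{o})$ is $P(\boldsymbol{o}' \mid \boldsymbol{o}, \boldsymbol{x}(\boldsymbol{o})) = \prod_{i \in \mathcal{N}} P_i(o_i' \mid o_i, x_i(\boldsymbol{o}))$.

Stationary Policies: Let $\pi_i(\boldsymbol{o}, x_i(\boldsymbol{o}))$ denote the probability of choosing a feasible action $x_i(\boldsymbol{o})$ in observation profile $\boldsymbol{o}$. Let $\pi_i(\boldsymbol{o}) = (\pi_i(\boldsymbol{o}, x_i(\boldsymbol{o})), x_i(\boldsymbol{o}) \in \mathcal{X}_i(o_i))$ denote the probability distribution over the feasible actions. We define the stationary policy for household $i$ as the vector $\pi_i = (\pi_i(\boldsymbol{o}), \boldsymbol{o} \in \mathcal{O})$. Let $\boldsymbol{\pi} = (\pi_i, i \in \mathcal{N})$ denote the joint policy of all households, and $\pi_{-i}$ denote the policy for all households except household $i$.

Value functions: Under a given joint policy $\boldsymbol{\pi}$, the value function $V_i^{\boldsymbol{\pi}} : \mathcal{O} \to \mathbb{R}$ returns the expected discounted cost for household $i$ starting with observation profile $\boldsymbol{o}$. It can be expressed as the following Bellman equation [14]:

$$V_i^{\boldsymbol{\pi}}(\boldsymbol{o}) = \mathbb{E}_{\pi_i(\boldsymbol{o})}\left\{ Q_i^{\pi_{-i}}\big(\boldsymbol{o}, x_i(\boldsymbol{o})\big) \right\}, \quad \boldsymbol{o} \in \mathcal{O}, \tag{17}$$

where $\mathbb{E}_{\pi_i(\boldsymbol{o})}\{\cdot\}$ denotes the expectation over the probability distribution $\pi_i(\boldsymbol{o})$. Function $Q_i^{\pi_{-i}}\big(\boldsymbol{o}, x_i(\boldsymbol{o})\big)$ is the Q-function for household $i$ with action $x_i(\boldsymbol{o})$ in observation profile $\boldsymbol{o}$ when other households' policy is $\pi_{-i}$ [14]. We have

$$Q_i^{\pi_{-i}}\big(\boldsymbol{o}, x_i(\boldsymbol{o})\big) = \mathbb{E}_{\pi_{-i}(\boldsymbol{o})}\left\{ (1 - \beta)\, c_i(\boldsymbol{o}, \boldsymbol{x}(\boldsymbol{o})) + \beta \sum_{\boldsymbol{o}' \in \mathcal{O}} P(\boldsymbol{o}' \mid \boldsymbol{o}, \boldsymbol{x}(\boldsymbol{o}))\, V_i^{\boldsymbol{\pi}}(\boldsymbol{o}') \right\}. \tag{18}$$

It is computationally difficult to determine the optimal policies for the households in such a partially observable stochastic game. In a partially observable stochastic game among users, each user needs to know what other users are observing in each time slot. Inspired by the works in [23]–[25], we propose an algorithm executed by each ECC to estimate the observation profile of all households. It enables us to study the users' optimal policy in a fully observable stochastic game with incomplete information, in which the households play a sequence of Bayesian games.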
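To make the recursion in (17) and (18) concrete, the following sketch evaluates a fixed policy for a single household on a tiny, fully enumerated problem. It assumes that the expectation over the other households' actions has already been folded into the cost and transition tables, which is a simplification made for this example; the tables themselves are hypothetical.

```python
def evaluate_policy(cost, trans, policy, beta, sweeps=500):
    """Iterate (17)-(18) for one household with the others' policies held fixed.

    cost[o][x]       expected stage cost of action x in observation profile o
    trans[o][x][o2]  probability of moving from o to o2 under action x
    policy[o][x]     probability of choosing action x in o
    """
    n_obs = len(cost)
    value = [0.0] * n_obs
    q = [[0.0] * len(cost[o]) for o in range(n_obs)]
    for _ in range(sweeps):
        for o in range(n_obs):
            for x in range(len(cost[o])):
                future = sum(trans[o][x][o2] * value[o2] for o2 in range(n_obs))
                q[o][x] = (1.0 - beta) * cost[o][x] + beta * future  # (18)
        # (17): the value is the policy-weighted average of the Q-values.
        value = [sum(policy[o][x] * q[o][x] for x in range(len(q[o])))
                 for o in range(n_obs)]
    return value, q


# Hypothetical two-observation, two-action example.
cost = [[1.0, 3.0], [2.0, 0.5]]
trans = [[[0.9, 0.1], [0.2, 0.8]],
         [[0.5, 0.5], [0.1, 0.9]]]
policy = [[0.7, 0.3], [0.4, 0.6]]
value, q = evaluate_policy(cost, trans, policy, beta=0.9)
```

This table-based evaluation needs the transition probabilities, which is exactly what the households lack in practice; it only serves to show what the value function and Q-function represent.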

Algorithm 1: Executed by ECC $i \in \mathcal{N}$.
1: Communicate the average load demand $l_i^{\text{avg}}(o_i)$ for all feasible actions $x_i(o_i) \in \mathcal{X}_i(o_i)$ to the utility company.
2: Receive the average aggregate load $l^{\text{avg}}(\boldsymbol{o})$ from the utility company.
3: Approximate the observation profile by $\hat{\boldsymbol{o}} := (l^{\text{avg}}(\boldsymbol{o}), \boldsymbol{\lambda})$.

A. Observation Profile Approximation Algorithm

To make the analysis of Game 1 tractable, we propose an algorithm executed by each ECC to approximate the observation of all households using some additional information. Let $\hat{\boldsymbol{o}}$ denote the approximate observation profile of all households. Algorithm 1 describes how ECC $i$ obtains $\hat{\boldsymbol{o}}$. ECC $i$ sends the average load demand $l_i^{\text{avg}}(o_i)$ of all feasible actions $x_i(o_i) \in \mathcal{X}_i(o_i)$ to the utility company. ECC $i$ knows $\boldsymbol{\lambda}$ and receives the average aggregate load $l^{\text{avg}}(\boldsymbol{o}) = \frac{1}{N} \sum_{j \in \mathcal{N}} l_j^{\text{avg}}(o_j)$. It approximates the observation profile $\boldsymbol{o}$ by the vector $\hat{\boldsymbol{o}} = (l^{\text{avg}}(\boldsymbol{o}), \boldsymbol{\lambda})$.

In Algorithm 1, each household receives information on the average aggregate load demands. Thus, the privacy of each individual household is protected. All ECCs obtain the same approximation for an observation profile. Thus, we can consider a fully observable stochastic game with incomplete information. Under a given approximate observation profile $\hat{\boldsymbol{o}}$, the households play a Bayesian game, as each household may have different observations $o_i$, and thus different sets of feasible actions.

Game 2 (Households' Fully Observable Stochastic Game with Incomplete Information): This game is constructed from Game 1 if the households define their actions and policy as follows:

Actions: Let $\mathcal{O}_i(\hat{\boldsymbol{o}}) \subseteq \mathcal{O}_i$ denote the set of possible observations for household $i$ in the approximate observation profile $\hat{\boldsymbol{o}}$. We define the set of actions for household $i$ in the approximate observation profile $\hat{\boldsymbol{o}}$ as $\hat{\mathcal{X}}_i(\hat{\boldsymbol{o}}) = \{x_i(o_i) : x_i(o_i) \in \mathcal{X}_i(o_i),\ o_i \in \mathcal{O}_i(\hat{\boldsymbol{o}})\}$. The feasibility of an action $x_i(\hat{\boldsymbol{o}}) \in \hat{\mathcal{X}}_i(\hat{\boldsymbol{o}})$ depends on the observation $o_i$ of household $i$.

Policies: We define the stationary policy $\pi_i(\hat{\boldsymbol{o}}, x_i(o_i))$ as the probability of choosing a feasible action $x_i(o_i) \in \mathcal{X}_i(o_i)$ in an approximate observation profile $\hat{\boldsymbol{o}}$ when the observation of household $i$ is $o_i \in \mathcal{O}_i(\hat{\boldsymbol{o}})$.
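Let $P_i(o_i \mid \hat{\boldsymbol{o}})$ be the probability that household $i$ has observation $o_i \in \mathcal{O}_i(\hat{\boldsymbol{o}})$ when the approximate observation profile is $\hat{\boldsymbol{o}}$. Hence, the probability of choosing any action $x_i(\hat{\boldsymbol{o}}) \in \hat{\mathcal{X}}_i(\hat{\boldsymbol{o}})$ is $\pi_i(\hat{\boldsymbol{o}}, x_i(\hat{\boldsymbol{o}})) = P_i(o_i \mid \hat{\boldsymbol{o}})\, \pi_i(\hat{\boldsymbol{o}}, x_i(o_i))$. Let $\pi_i(\hat{\boldsymbol{o}}) = (\pi_i(\hat{\boldsymbol{o}}, x_i(\hat{\boldsymbol{o}})), x_i(\hat{\boldsymbol{o}}) \in \hat{\mathcal{X}}_i(\hat{\boldsymbol{o}}))$ denote the probability distribution over the actions for household $i$ in an approximate observation profile $\hat{\boldsymbol{o}}$. We define the policy for household $i$ in Game 2 as the vector $\pi_i = (\pi_i(\hat{\boldsymbol{o}}), \hat{\boldsymbol{o}} \in \mathcal{O})$.

B. Markov Perfect Equilibrium (MPE) Policy

In this subsection, we discuss how each household $i$ determines a policy $\pi_i(\hat{\boldsymbol{o}})$ in Game 2 for any approximate observation profile $\hat{\boldsymbol{o}}$ to minimize its value function $V_i^{\boldsymbol{\pi}}(\hat{\boldsymbol{o}})$. The MPE is a standard solution concept for the partially observable stochastic games. The MPE corresponds to the users' policies with Markov properties and is compatible with the assumption for the appliance model in Section II-A. The MPE in Game 2 is defined as follows:

Definition 3: A policy $\boldsymbol{\pi}^{\text{MPE}} = (\pi_i^{\text{MPE}}, i \in \mathcal{N})$ is an MPE if for every household $i \in \mathcal{N}$ with a policy $\pi_i$, we have

$$V_i^{(\pi_i,\, \pi_{-i}^{\text{MPE}})}(\hat{\boldsymbol{o}}) \ge V_i^{(\pi_i^{\text{MPE}},\, \pi_{-i}^{\text{MPE}})}(\hat{\boldsymbol{o}}), \quad i \in \mathcal{N},\ \hat{\boldsymbol{o}} \in \mathcal{O}. \tag{19}$$

The MPE policy is the fixed point solution of every household's best response policy. Household $i$ solves the following Bellman equations when other households' policies are fixed:

$$V_i^{\boldsymbol{\pi}^{\text{MPE}}}(\hat{\boldsymbol{o}}) = \underset{\pi_i(\hat{\boldsymbol{o}})}{\text{minimize}}\ \mathbb{E}_{\pi_i(\hat{\boldsymbol{o}})}\left\{ Q_i^{\pi_{-i}^{\text{MPE}}}\big(\hat{\boldsymbol{o}}, x_i(\hat{\boldsymbol{o}})\big) \right\}, \quad \hat{\boldsymbol{o}} \in \mathcal{O}. \tag{20}$$

As the following theorem states, the existence of the MPE is guaranteed for Game 2.

Theorem 1: Game 2 has at least one MPE in stochastic stationary policies.

The proof of Theorem 1 can be found in Appendix B. The MPE is the fixed point of $N$ recursive problems in (20) for all households. Problem (20) implies that for household $i$ with action $x_i(\hat{\boldsymbol{o}})$ under observation profile $\hat{\boldsymbol{o}}$ in the MPE, we have $V_i^{\boldsymbol{\pi}^{\text{MPE}}}(\hat{\boldsymbol{o}}) \le Q_i^{\pi_{-i}^{\text{MPE}}}(\hat{\boldsymbol{o}}, x_i(\hat{\boldsymbol{o}}))$. We introduce an equivalent non-recursive optimization problem for each household, which is more tractable. For household $i \in \mathcal{N}$, we define the Bellman error [14] for an action $x_i(\hat{\boldsymbol{o}})$ in an approximate observation profile $\hat{\boldsymbol{o}}$ as

$$B_i\big(V_i^{\boldsymbol{\pi}}, \hat{\boldsymbol{o}}, x_i(\hat{\boldsymbol{o}})\big) = Q_i^{\pi_{-i}}\big(\hat{\boldsymbol{o}}, x_i(\hat{\boldsymbol{o}})\big) - V_i^{\boldsymbol{\pi}}(\hat{\boldsymbol{o}}). \tag{21}$$

We define function $f_i(V_i^{\boldsymbol{\pi}}, \boldsymbol{\pi})$ as the sum of the expected Bellman errors for all observations $\hat{\boldsymbol{o}} \in \mathcal{O}$.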
That s (V π, π ) = E π(ô) ô O, π ) as the sum of the expected { B (V π, ô, x (ô)) }. (22) Each household ams to determne the polcy π and the value functon V π to mnmze (V π, π ) by solvng the followng optmzaton problem. mnmze V π,π (V π, π ) (23) subject to B (V π, ô, x (ô)) 0, ô O, x (ô) ˆX (ô). Problem (23) s generally a non-convex problem, and may have several local mnma. We show that the MPE polcy of household s the global mnmum of problem (23). Theorem 2 The polcy π MPE s an MPE of Game 2 f and only f for all households N wth acton x (ô) ˆX (ô), we have π MPE ( (ô, x (ô)) B V π MPE, ô, x (ô) ) = 0, ô O. (24) The proof can be found n Appendx C. Theorem 2 mples that the Bellman error s zero for an acton wth postve probablty at the MPE. Thus, (V πmpe, π MPE ) = 0 and the MPE s the global optmal soluton of problem (23) for all households. Solvng problem (23) s stll challengng, as each ECC requres the values of the unavalable transton probabltes between the observatons. Ths motvates us to develop a model-free learnng algorthm that enables each ECC to sched-

7 ule the applances n an onlne manner wthout knowng the system dynamcs. Bascally, each ECC updates the polcy and value functon based on the consequences of ts past decsons. As part of the learnng algorthm, we need to record the observaton and acton spaces for a household. In order to reduce the complexty, we use the lnear functon approxmaton to estmate the value functon [36, Ch. 3]. For household, let φ (ô) = (φ v, (ô), v V) denote the row vector of bass functons, where V s the set of bass functons. Let θ = (θ v,, u V) denote the row vector of weght coeffcents. The approxmate value functon for household s V π (ô, θ ) = θ φ T (ô), (25) where T s the transpose operator. It enables ECC to compute vector θ wth V elements nstead of the value functon V π (ô) for all approxmate observaton profles ô. We parameterze the polcy π for household va softmax approxmaton [36, Ch. 3]. Let µ (ô, x (ô)) = ( µ p, (ô, x (ô)), p P ) denote the row vector of bass functons, where P s the set of bass functons. Let ϑ = (ϑ p,, p P) denote the row vector of weght coeffcents. The approxmate probablty of choosng acton x (ô) ˆX (ô) s π (ô, x (ô), ϑ )= e (ϑµt (ô,x(ô))) x (ô) ˆX (ô) e(ϑµt (ô,x (ô))). (26) To smplfy the computaton of ths approxmaton, we use the vector of compatble bass functons ψ (ô, x (ô)) = ( ψp, (ô, x (ô)), p P ), where ψ p, (ô, x (ô)) = ln(π (ô, x (ô), ϑ )) ϑ p,. (27) We can show that for the softmax parameterzed polcy, the vector of bass functons µ (ô, x (ô)) can be replaced wth vector ψ (ô, x (ô)) [30]. IV. ONLINE LEARNING ALGORITHM DESIGN In ths secton, we propose a load schedulng learnng (LSL) algorthm executed by the ECC of each household to determne the MPE polcy. We use an actor-crtc learnng method, whch s more robust than the actor-only methods (such as the polcy evaluaton [22, Ch. 2]) and faster than the crtc-only methods (such as the Q-learnng and temporal dfference (TD) learnng [22, Ch. 6]). 
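The linear value approximation (25), the softmax policy (26), and the compatible basis functions (27) can be illustrated with a minimal NumPy sketch. The array shapes, the random basis functions, and the function names below are assumptions made for illustration; they are not taken from the paper. For a softmax policy, the derivative in (27) works out to the basis vector of the chosen action minus the policy-weighted average of all basis vectors, which is what `compatible_features` computes.

```python
import numpy as np


def softmax_policy(vartheta, mu):
    """Action probabilities in the form of Eq. (26).

    vartheta: weight vector, shape (P,)
    mu: basis functions, one row per feasible action, shape (A, P)
    """
    scores = mu @ vartheta
    scores = scores - scores.max()  # shift for numerical stability; probabilities unchanged
    weights = np.exp(scores)
    return weights / weights.sum()


def compatible_features(vartheta, mu):
    """Eq. (27) for every action: d ln(pi) / d vartheta.

    For the softmax form this equals mu(x) minus the policy-weighted mean of mu.
    """
    pi = softmax_policy(vartheta, mu)
    return mu - pi @ mu


def approximate_value(theta, phi):
    """Linear value approximation in the form of Eq. (25): theta * phi^T."""
    return float(theta @ phi)


rng = np.random.default_rng(0)   # arbitrary seed, illustration only
mu = rng.normal(size=(4, 3))     # 4 hypothetical actions, 3 hypothetical basis functions
vartheta = rng.normal(size=3)
pi = softmax_policy(vartheta, mu)
psi = compatible_features(vartheta, mu)
print(pi.sum())    # probabilities sum to one (up to rounding)
print(pi @ psi)    # the policy-weighted mean of the compatible features is numerically zero
```

The zero-mean property printed on the last line is what makes the compatible features a convenient gradient direction: they only shift probability between actions and never change the total.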
The concept of the actor-critic was originally introduced by Witten in [27] and then elaborated by Barto et al. in [28]. A detailed study of the actor-critic algorithm can be found in [29], [30]. Our LSL algorithm is based on the first proposed algorithm in [30]. The ECC is responsible for the actor and critic updates. In the critic update, the ECC evaluates the policy to update the value function. In the actor update, it updates the policy to decrease the objective value of problem (23) based on the updated value function. In the policy update, we use the gradient method with a smaller step size than the one used in the value function update, thereby using a two-timescale update process [30].

Algorithm 2 describes the LSL algorithm executed by ECC \(i\). The index \(k\) refers to both the iteration and the time slot. Our algorithm involves the initialization and scheduling phases. Line 1 describes the initialization in time slot \(k = 1\). The loop involving Lines 2 to 14 describes the scheduling phase, which includes the observation profile approximation, the critic update, the actor update, and the basis function construction. In Line 3, ECC \(i\) executes Algorithm 1 to obtain the approximate observation profile \(\hat{o}^k\). In time slot \(k = 1\), ECC \(i\) does not have any experience from its past decisions and chooses an action in Line 11. For \(k > 1\), the critic and actor updates are executed. ECC \(i\) determines the updated vector \(\theta_i^k\) using the TD approach [22, Ch. 6]. The TD error \(e_{\mathrm{TD},i}^{k-1}\) is
\[ e_{\mathrm{TD},i}^{k-1} = (1-\beta)\, c_i\left(\hat{o}^{k-1}, x^{k-1}(\hat{o}^{k-1})\right) + \beta V_i^{\pi,k-1}(\hat{o}^{k}, \theta_i^{k-1}) - V_i^{\pi,k-1}(\hat{o}^{k-1}, \theta_i^{k-1}). \tag{28} \]
The critic update for ECC \(i\) is
\[ \theta_i^{k} = \theta_i^{k-1} + \gamma_c^{k-1}\, e_{\mathrm{TD},i}^{k-1}\, \phi_i(\hat{o}^{k-1}), \tag{29} \]
where \(\gamma_c^{k}\) is the critic step size in iteration \(k\). In the actor update module, ECC \(i\) determines the updated vector \(\vartheta_i^k\) using the gradient method with a descent direction. In particular, ECC \(i\) uses the descent direction \(-\pi_i^{k-1}(\hat{o}^{k-1}, x_i^{k-1}(\hat{o}^{k-1}), \vartheta_i^{k-1})\, \nabla_{\vartheta_i^{k-1}} f_i^{\mathrm{obj}}(V_i^{\pi,k-1}, \pi_i^{k-1})\) to ensure convergence to the MPE. Since the gradient is not available, ECC \(i\) uses vector \(e_{\mathrm{TD},i}^{k-1}\, \psi_i(\hat{o}^{k-1}, x_i^{k-1}(\hat{o}^{k-1}))\) as an estimate of the gradient [30, Algorithm 1].
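The TD error (28), the critic update (29), and the gradient estimate \(e_{\mathrm{TD}}\,\psi\) can be sketched in a few lines of NumPy. This is a minimal illustration under assumed inputs: the feature vectors, the cost, and the step size are made-up numbers, and the function names are hypothetical rather than part of the LSL algorithm's published description.

```python
import numpy as np


def td_error(cost, v_prev, v_next, beta):
    """TD error in the form of Eq. (28): (1 - beta) * cost + beta * V(next) - V(prev)."""
    return (1.0 - beta) * cost + beta * v_next - v_prev


def critic_update(theta, phi_prev, e_td, step_c):
    """Critic update in the form of Eq. (29): move theta along phi, scaled by the TD error."""
    return theta + step_c * e_td * phi_prev


def gradient_estimate(e_td, psi_prev):
    """Sampled gradient estimate used by the actor: TD error times compatible features."""
    return e_td * psi_prev


beta = 0.995                          # discount factor used for foresighted users
theta = np.array([0.5, 0.0])          # assumed critic weights
phi_prev = np.array([1.0, 0.2])       # assumed basis values at the previous profile
phi_next = np.array([1.0, 0.4])       # assumed basis values at the current profile
psi_prev = np.array([0.3, -0.3])      # assumed compatible features of the chosen action

v_prev = float(theta @ phi_prev)      # linear value estimate, Eq. (25)
v_next = float(theta @ phi_next)
e_td = td_error(cost=2.0, v_prev=v_prev, v_next=v_next, beta=beta)
theta_new = critic_update(theta, phi_prev, e_td, step_c=0.1)
grad = gradient_estimate(e_td, psi_prev)
print(e_td, theta_new, grad)
```

Both updates use only quantities observed in the previous and current time slots, which is why no transition probabilities are needed.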
Therefore, the convergence to the MPE is guaranteed, since the TD error \(e_{\mathrm{TD},i}^{k-1}\) is an estimate of the Bellman error for action \(x_i^{k-1}\) in iteration \(k-1\). Thus, the descent direction is zero if condition (24) is satisfied. The actor update for ECC \(i\) is
\[ \vartheta_i^{k} = \vartheta_i^{k-1} - \gamma_a^{k}\, \pi_i^{k-1}\left(\hat{o}^{k-1}, x_i^{k-1}(\hat{o}^{k-1}), \vartheta_i^{k-1}\right) e_{\mathrm{TD},i}^{k-1}\, \psi_i\left(\hat{o}^{k-1}, x_i^{k-1}(\hat{o}^{k-1})\right), \tag{30} \]
where \(\gamma_a^{k}\) is the actor step size in iteration \(k\).

We use the approach in [37] to autonomously construct the new basis functions \(\psi_{|\mathcal{P}|+1,i}(\hat{o}, x_i(\hat{o}))\) and \(\phi_{|\mathcal{V}|+1,i}(\hat{o})\). The candidate for the basis function \(\psi_{|\mathcal{P}|+1,i}(\hat{o}, x_i(\hat{o}))\) is the TD error \(e_{\mathrm{TD},i}^{k-1}\) in (28), which estimates the Bellman error. The expectation over the Bellman errors of the feasible actions \(x_i(\hat{o}^{k-1}) \in \mathcal{X}_i(o_i^{k-1})\) is the candidate for \(\phi_{|\mathcal{V}|+1,i}(\hat{o})\). We have
\[ \psi_{|\mathcal{P}|+1,i}(\hat{o}, x_i(\hat{o})) = e_{\mathrm{TD},i}^{k-1}, \tag{31} \]
\[ \phi_{|\mathcal{V}|+1,i}(\hat{o}) = E\left\{ B_i^{k-1}\left(V_i^{\pi,k-1}, \hat{o}^{k-1}, x_i(\hat{o}^{k-1})\right) \right\}. \tag{32} \]
The expectation in (32) is over the probability of choosing each feasible action \(x_i(\hat{o}^{k-1}) \in \mathcal{X}_i(o_i^{k-1})\). In Appendix D, we explain how to approximate the Bellman error for each feasible action. In Line 7, ECC \(i\) checks the convergence of \(\theta_i^{k}\) and decides whether or not to add the new basis functions.

In Line 11, ECC \(i\) schedules the appliances in the current time slot \(k\). In Line 12, ECC \(i\) receives the cost \(c_i(\hat{o}^{k}, x^{k}(\hat{o}^{k}))\). The next time slot starts in Line 13. In Line 14, the stopping criterion is given. From Theorem 2, the LSL algorithm converges to the MPE if the objective value \(f_i^{\mathrm{obj}}(V_i^{\pi,k-1}, \pi_i^{k-1})\) is zero. ECC \(i\) computes the approximate objective value by summing over the expected Bellman errors up to iteration \(k-1\) as
\[ \hat{f}_i^{\mathrm{obj}}(V_i^{\pi,k-1}, \pi_i^{k-1}) = \sum_{j=1}^{k-1} E_{\pi_i^{k-1}(\hat{o}^{j})}\left\{ B_i^{j}\left(V_i^{\pi,k-1}, \hat{o}^{j}, x_i(\hat{o}^{j})\right) \right\}. \tag{33} \]
The sufficient conditions for the actor and critic step sizes to ensure the convergence of the LSL algorithm are given in [29]. In the proposed model-free LSL algorithm, ECC \(i\) does not know the next states of the appliances until the next time slot begins in Line 13. The ECC updates its value function using the TD error in (28), which depends on the observation in the next time slot. Therefore, the ECC only goes through one iteration per time slot.

Algorithm 2 LSL Algorithm Executed by ECC \(i \in \mathcal{N}\).
1: Set \(k := 1\) and \(\epsilon := 10^{-3}\), and initialize the stopping threshold \(\xi\). Set \(\phi_{1,i}(\cdot) := 1\) and \(\psi_{1,i}(\cdot) := 1\), and randomly initialize \(\theta_{1,i}^{1}\) and \(\vartheta_{1,i}^{1}\).
2: Repeat
3: Observe \(o_i^{k} := (s_i^{k}, \lambda^{k})\). Approximate \(\hat{o}^{k}\) using Algorithm 1.
4: If \(k > 1\),
5: Determine the updated vector \(\theta_i^{k}\) according to (29).
6: Determine the updated vector \(\vartheta_i^{k}\) according to (30).
7: If \(\|\theta_i^{k} - \theta_i^{k-1}\| < \epsilon\),
8: Construct new basis functions \(\psi_{|\mathcal{P}|+1,i}(\hat{o}, x_i(\hat{o}))\) and \(\phi_{|\mathcal{V}|+1,i}(\hat{o})\) using (31) and (32).
9: End if
10: End if
11: Choose action \(x_i^{k}(\hat{o}^{k})\) using policy \(\pi_i^{k}(\hat{o}^{k}, \vartheta_i^{k})\).
12: Receive the cost \(c_i(\hat{o}^{k}, x^{k}(\hat{o}^{k}))\) from the utility company.
13: \(k := k + 1\).
14: Until \(\hat{f}_i^{\mathrm{obj}}(V_i^{\pi,k-1}, \pi_i^{k-1}) < \xi\).

V. PERFORMANCE EVALUATION

In this section, we evaluate the performance of the LSL algorithm in a system where one utility company serves 200 households that participate in the demand response program. The scheduling horizon is six months. Each time slot is 15 minutes. We consider six controllable appliances for each household: the dish washer, washing machine, and stove are non-interruptible appliances, and the EV, air conditioner, and water heater are interruptible appliances. We model other appliances such as the refrigerator and TV as must-run appliances. Table I summarizes the task specifications of the controllable appliances [38]. For the EV in each household \(i\), we have \(d_{a,i}^{\min} = (B_i^{\mathrm{d}} - B_i^{0})/p_{a,i}^{\mathrm{avg}}\) and \(d_{a,i}^{\max} = (B_i^{\max} - B_i^{0})/p_{a,i}^{\mathrm{avg}}\), where \(B_i^{0}\) is the initial charging level when the EV awakes, \(B_i^{\mathrm{d}}\) is the charging demand for the next trip, and \(B_i^{\max}\) is the battery's maximum capacity.
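As a worked example of the EV duration bounds above, the sketch below evaluates the two formulas and converts the results to 15-minute time slots. The 3 kW average power and 30 kWh capacity follow the values used in this section; the 24 kWh demand lies within the range used in the simulations, and the 6 kWh initial level is purely an assumption for illustration. The function name is hypothetical.

```python
import math


def ev_charging_bounds(b_init_kwh, b_demand_kwh, b_max_kwh, p_avg_kw, slot_hours=0.25):
    """Minimum and maximum EV charging durations, in hours and in time slots.

    d_min = (B_d - B_0) / p_avg and d_max = (B_max - B_0) / p_avg.
    """
    d_min_h = (b_demand_kwh - b_init_kwh) / p_avg_kw
    d_max_h = (b_max_kwh - b_init_kwh) / p_avg_kw
    # Round the minimum up and the maximum down so both slot counts stay feasible.
    d_min_slots = math.ceil(d_min_h / slot_hours)
    d_max_slots = math.floor(d_max_h / slot_hours)
    return d_min_h, d_max_h, d_min_slots, d_max_slots


bounds = ev_charging_bounds(b_init_kwh=6.0, b_demand_kwh=24.0, b_max_kwh=30.0, p_avg_kw=3.0)
print(bounds)  # (6.0, 8.0, 24, 32)
```

Under these assumed numbers the ECC must charge for at least 6 hours to cover the next trip and can charge for at most 8 hours before the battery is full, which is the slack a foresighted policy can exploit.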
The charging demand of the EV in household \(i\) is uniformly chosen at random from the set {18 kWh, ..., 24 kWh}. The battery capacity is set to 30 kWh. Typically, the user is indifferent between the charging patterns for the EV as long as the charging is finished before the deadline. Thus, we set coefficients \(\omega_{a,i,t},\ t \in \mathcal{T}\) to zero for the EV. Coefficients \(\omega_{a,i,t},\ t \in \mathcal{T}\) are chosen uniformly at random from the interval [$0, $0.5] for the air conditioner and water heater. We set the desired load pattern \((x_{a,i,t}^{\mathrm{des}},\ t \in \mathcal{T})\) of the air conditioner to a 16-hour period, during which the appliance turns on for an hour and turns off in the next hour in a periodic fashion. We set the desired load pattern of the water heater to a 5-hour period without interruption. To simulate the non-interruptible appliances, we consider several scheduling windows selected uniformly between 10 am and 10 pm, with a length that is uniformly chosen at random from the set {4 hr, 5 hr, 6 hr, 7 hr}. For the washing machine, we model \((P_{a,i}(\Delta),\ \Delta \ge 1)\) as a truncated normal distribution which is lower bounded by zero, and has a mean value of 288 time slots and a standard deviation of 60 time slots. For other appliances, we use a truncated normal distribution with a mean value of 96 time slots and a standard deviation of 20 time slots. In practical implementations, the probability distribution \((P_{a,i}(\Delta),\ \Delta \ge 1)\) for each appliance \(a\) can be approximated by using the historical record on the usage behaviour of each user.

TABLE I
OPERATING SPECIFICATIONS OF CONTROLLABLE APPLIANCES.
Appliance: \((p_{a,i}^{\mathrm{avg}},\ d_{a,i},\ d_{a,i}^{\min},\ d_{a,i}^{\max})\)
Dish washer: (1.5 kW, 2 hr, -, -)
Washing machine: (2.5 kW, 3 hr, -, -)
Stove: (3 kW, 3 hr, -, -)
EV: (3 kW, -, \((B_i^{\mathrm{d}} - B_i^{0})/p_{a,i}^{\mathrm{avg}}\), \((B_i^{\max} - B_i^{0})/p_{a,i}^{\mathrm{avg}}\))
Air conditioner: (1.5 kW, -, 2 hr, 8 hr)
Water heater: (2.5 kW, -, 0 hr, 5 hr)

Fig. 2. Price parameters over one day: (a) \(l_t^{\mathrm{th}}\); (b) \(\lambda_{1,t}\) and \(\lambda_{2,t}\).

Unless stated otherwise, the price parameters vary periodically with a period of one day.
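The wake-up model described earlier in this section, together with the Bayes-rule expression derived in Appendix A (Eq. (34)), can be sketched as follows. The discretization of the truncated normal distribution onto gaps of 1 to `max_gap` time slots, the choice of `max_gap`, and the function names are assumptions for illustration; the paper does not specify how the distribution is discretized.

```python
import math


def truncated_normal_pmf(mean, std, max_gap):
    """Discretized normal weights over gaps 1..max_gap, renormalized to sum to one.

    Assumption: this is one simple way to obtain P(gap); index 0 holds gap = 1.
    """
    weights = [math.exp(-0.5 * ((gap - mean) / std) ** 2) for gap in range(1, max_gap + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def wake_up_probability(pmf, delta):
    """Probability that the appliance wakes up in the next slot, following Eq. (34).

    P(delta) divided by the probability that no wake-up happened after gaps 1..delta-1.
    """
    survival = 1.0 - sum(pmf[: delta - 1])
    return pmf[delta - 1] / survival if survival > 0.0 else 1.0


# Mean 96 slots and standard deviation 20 slots, as used for most appliances.
pmf = truncated_normal_pmf(mean=96, std=20, max_gap=192)
early = wake_up_probability(pmf, 60)
late = wake_up_probability(pmf, 120)
print(early, late)
```

With this bell-shaped gap distribution, the conditional wake-up probability grows as more time passes since the last task, so `late` is larger than `early`.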
As discussed in Section II-B, the periodic price parameter vector is a special case of the hidden Markov model in Assumption 1. Figs. 2 (a) and (b) show \(l_t^{\mathrm{th}},\ t \in \mathcal{T}\), and \(\lambda_{1,t}\) and \(\lambda_{2,t},\ t \in \mathcal{T}\) over one day, respectively. The actor and critic step sizes in iteration \(k\) of the LSL algorithm are set to \(\gamma_a^{k} = m_a/k^{2/3}\) and \(\gamma_c^{k} = m_c/k\), respectively. Since each ECC may use different values for \(m_a\) and \(m_c\) in practice, we choose \(m_a\) and \(m_c\) uniformly from [0.5, 2] for each household. Unless stated otherwise, the discount factor \(\beta\) is set to 0.995, i.e., the users are foresighted.

For the benchmark scenario without demand response, the non-interruptible appliances are operated as soon as they become awake. The air conditioner and water heater are operated according to their desired load patterns. The EV starts to charge when it is plugged in. We simulate both the benchmark case and the LSL algorithm for several scenarios using Matlab on a PC with processor Intel Core U CPU 1.80 GHz.

Fig. 3. (a) Load demand for household 1 over two days; (b) aggregate load demand of users over one day; (c) aggregate load demands of all users over seven days with and without load scheduling.

First, we compare the load profiles for household 1 over two days in the benchmark scenario (without load scheduling) and the LSL algorithm (with load scheduling) in Fig. 3 (a). The EV charging demands of household 1 in the first and second days are 6 and 8 hours, respectively. With the LSL algorithm, the ECC of household 1 schedules the operating appliances to reduce the payment. In particular, since the peak load with scheduling in the first day is much lower than that in the second day, the ECC of the foresighted household 1 charges the EV for 8.5 hours in the first day (more than the demand of 6 hours in the first day), in order to reduce the charging time to 5.5 hours in the second day. Such a charging schedule reduces the peak load in the second day.

Fig. 3 (b) shows the aggregate load demand of all users during one sample day. The peak load is about 1.9 MW around 8 pm without load scheduling. When the households deploy the LSL algorithm, the ECCs schedule the controllable appliances to off-peak hours at the MPE. The peak load decreases by 27% to 1.4 MW. Fig. 3 (c) shows the aggregate load profile of all users over one week. The peak load reduction can be observed in all days with the LSL algorithm.

Fig. 4. Daily average cost for myopic and foresighted household 1.

The LSL algorithm benefits the users by reducing their daily average cost.
We perform simulations for \(\beta\) = 0.995, 0.8, 0.5, 0.2, 0.05, which include the extreme cases of foresighted users (\(\beta = 0.995\)) and myopic users (\(\beta = 0.05\)). We present the daily average cost of household 1 for different values of \(\beta\) in Fig. 4. The initial value of $4.8 per day is the daily average cost without load scheduling. When household 1 is foresighted, its daily average cost decreases by 28% (from $4.8 per day to $3.5 per day). When \(\beta\) decreases, the daily average cost increases gradually. For a myopic user, the daily average cost decreases by 11% (from $4.8 per day to $4.3 per day). The reason is that the ECC for foresighted users schedules the appliances considering the price in the current and future time slots. Fig. 5 (a) shows the charging profile of the EV for household 1 with a myopic user. Fig. 5 (b) shows the dynamics of the electricity price over two days when the users are myopic. The ECC of the myopic user (with \(\beta = 0.05\)) considers the daily price fluctuations and charges the EV just to fulfill the charging demand (for 6 hours). Fig. 5 (c) shows the charging profile of the EV for household 1 with a foresighted user. Fig. 5 (d) shows the dynamics of the electricity price when the users are foresighted. The ECC of the foresighted user (with \(\beta = 0.995\)), on the other hand, takes advantage of the price fluctuations over multiple days (in particular the low current price) and charges the EV more than the current charging demand (for 8.5 hours) in order to reduce the cost in the following day, when the price in the charging period is high.

The LSL algorithm helps the utility company reduce the PAR in the aggregate load demand. We compute the expected PAR over a period of 2 months in Fig. 6. We consider two special cases of the hidden Markov model in Assumption 1, i.e., the periodic and random price parameters, respectively, to evaluate the performance of the LSL algorithm. With periodic price parameters, the LSL algorithm performs well and reduces the PAR from 2.3 to 2.02 (13% reduction) in 3000 time slots (about a month).
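The PAR discussed above is the peak of the aggregate load divided by its average over the same window. A minimal sketch, with a made-up load series (the numbers are assumptions, not simulation data from the paper):

```python
def peak_to_average_ratio(load):
    """Peak-to-average ratio of an aggregate load series (any consistent unit)."""
    if not load:
        raise ValueError("load series must not be empty")
    average = sum(load) / len(load)
    return max(load) / average


# Hypothetical aggregate load samples in MW over a short window.
unscheduled = [0.6, 0.8, 1.9, 1.1]
scheduled = [0.9, 1.0, 1.4, 1.1]   # same total energy, flatter profile
print(peak_to_average_ratio(unscheduled), peak_to_average_ratio(scheduled))
```

Shifting load from the peak slot to off-peak slots leaves the average unchanged while lowering the peak, so the PAR of the flatter profile is smaller.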
For random price parameters, we assume that the utility company chooses \(l_t^{\mathrm{th}},\ t \in \mathcal{T}\) from a truncated normal distribution with a mean value shown in Fig. 2 (a) and a standard deviation of 0.2 MW. The parameters \(\lambda_{1,t}\) and \(\lambda_{2,t},\ t \in \mathcal{T}\) are also chosen from a truncated normal distribution with a mean value shown in Fig. 2 (b) and a standard deviation of 5 $/MW. The random price parameters can model abnormal fluctuations (such as spikes in the price values). In practice, the probability distributions for the price parameters can be estimated from the historical price data. Nevertheless, our LSL algorithm is model-free, hence the ECCs do not need to know the probability distributions of the price parameters. Results show that the ECCs can still effectively determine their MPE policies through learning, but it takes 6500 time slots (about two months) for the PAR to converge. Thus, the LSL algorithm has a robust performance even in a market with random fluctuations in the price parameters.

Fig. 5. (a) The EV's charging schedule when household 1 is myopic (\(\beta = 0.05\)); (b) the electricity price when users are myopic; (c) the EV's charging schedule when household 1 is foresighted (\(\beta = 0.995\)); (d) the electricity price when users are foresighted.

Fig. 6. Expected PAR of the LSL algorithm with periodic and random price parameters.

We show that Algorithm 2 converges to the MPE by using the MPE characterization in Theorem 2. Fig. 7 depicts the absolute values of the approximate objective function \(\hat{f}_i^{\mathrm{obj}}(V_i^{\pi,k}, \pi_i^{k})\) for households 1, 2, and 3. It shows that the objective values converge to zero (we have the same result for other households), which is the global optimal solution of problem (23). Thus, from Theorem 2, the LSL algorithm converges to the MPE of Game 2. Though the action and state spaces of each household are large, the speed of convergence is acceptable as a result of using the value function and policy approximations. The jumps in the curves in Fig. 7 correspond to the iterations where the basis functions in (31) and (32) are added to the basis function sets. In our simulation, the running time of the LSL algorithm per iteration per household is only a few seconds. As the households only need to go through one iteration of computation per time slot (e.g., 15 minutes), the proposed algorithm is suitable for real-time execution.

We compare the LSL algorithm with a scheduling algorithm based on Q-learning to demonstrate the benefit of the actor-critic method. Q-learning has been used in some existing learning algorithms for demand response (e.g., [7] and [8]). We consider an algorithm based on Q-learning with the same structure as the LSL algorithm, with the only difference being that the ECC updates the Q-functions [22, Ch. 6]. Fig. 8 shows the daily average cost of household 1 using the LSL algorithm and the Q-learning benchmark. In each iteration of the Q-learning benchmark, the policies are obtained from the updated values of the Q-functions (which are computed based on the Boltzmann exploration as in [7]). The policy update suffers from high fluctuations and slow learning. Our proposed algorithm converges much more smoothly, with a total convergence time around 25% of that of the Q-learning benchmark.

To study how the observation profile approximation in Algorithm 1 affects the users' policies, we compare the households' policies in two scenarios. In the first scenario, the states are partially observable to the ECCs, which use Algorithm 1 to approximate the observation profile of all households. In the second scenario, the utility company shares the state of all households with each ECC. Thus, the states become fully observable to the ECCs. The LSL algorithm can be used in both scenarios to determine the MPE policy of the households.

Fig. 7. Objective value \(\hat{f}_i^{\mathrm{obj}}(V_i^{\pi,k}, \pi_i^{k})\) for households 1, 2, and 3.

Fig. 8. Daily average cost for household 1 with the algorithm based on Q-learning and our proposed LSL algorithm.

Fig. 9. The aggregate load demand with the partially observable load scheduling and fully observable load scheduling.

Fig. 9 shows the aggregate load demand in both scenarios over one day, with and without load scheduling. When the states are partially observable, the ECCs play a sequence of Bayesian games in Game 2. As each ECC has incomplete information about other households' states, it determines an optimal policy that minimizes the expected cost in all possible states of other households under a given approximate observation profile. When the states are fully observable, the ECCs play a sequence of normal form games. As each ECC knows the actual state of other households, its policy becomes the best response for the actual state of the system. Fig. 9 shows that when the states become fully observable, the peak in the aggregate load demand further decreases when the aggregate load is around the threshold \(l_t^{\mathrm{th}}\). This reduces the expected cost of the households, e.g., the daily average cost of household 1 is reduced by 6.3% (from $3.5 per day to $3.28 per day).

VI. CONCLUSION

In this paper, we formulated the scheduling problem of the controllable appliances in the residential households as a partially observable stochastic game, where each household aims at minimizing its discounted average cost in a real-time pricing market. We proposed a distributed and model-free learning algorithm based on the actor-critic method to determine the MPE policy. We used the value function and policy approximation technique to reduce the action and state spaces of the households and improve the learning speed. Simulation results show that the expected PAR in the aggregate load can be reduced by 13% when users deploy the proposed algorithm. Furthermore, the foresighted users can benefit from a 28% reduction in their expected discounted cost in the long term, which is 17% lower than the expected cost of the myopic users. For future work, we plan to extend our LSL algorithm to a deregulated market, where multiple households participate in the demand response program and can choose to purchase electricity from multiple utility companies.

APPENDIX

A. The Proof of Equation (1)

Consider appliance \(a \in \mathcal{A}_i\) in household \(i\). According to Definition 1, \(\delta_{a,i,t}\) for \(t \in \mathcal{T}\) is the number of time slots since the most recent time slot in which appliance \(a\) became awake with the most recent new task. In other words, appliance \(a\) has not become awake with a new task again in time slots \(t - \delta_{a,i,t} + 1, \ldots, t\) since it became awake in time slot \(t - \delta_{a,i,t} + 1\). The value of \(P_{a,i}(\delta_{a,i,t})\) is the probability that the difference between two sequential wake-up times for appliance \(a\) is \(\delta_{a,i,t}\). Given the current time slot \(t\), the probability \(P_{a,i,t+1}\) that appliance \(a \in \mathcal{A}_i\) becomes awake with a new task in the next time slot \(t + 1 \in \mathcal{T}\) can be obtained from Bayes' rule as
\[ P_{a,i,t+1} = \frac{\mathrm{Prob}\{E_1 \mid E_2\}\, \mathrm{Prob}\{E_2\}}{\mathrm{Prob}\{E_1\}}, \tag{34} \]
where \(E_1\) is the event that appliance \(a\) has not become awake with a new task until time slot \(t\), and \(E_2\) is the event that appliance \(a\) becomes awake in time slot \(t + 1\), which is \(\delta_{a,i,t}\) time slots after it became awake with the most recent task. With probability \(\mathrm{Prob}\{E_1 \mid E_2\} = 1\), appliance \(a\) has not become awake with a new task until time slot \(t\) conditioned on the event that it becomes awake with a new task in time slot \(t + 1\). With probability \(\mathrm{Prob}\{E_2\} = P_{a,i}(\delta_{a,i,t})\), appliance \(a\) becomes awake in time slot \(t + 1\), which is \(\delta_{a,i,t}\) time slots after it became awake with the most recent task. Appliance \(a\) has not become awake in time slots \(t - \delta_{a,i,t} + 1, \ldots, t\) with probability \(\mathrm{Prob}\{E_1\} = 1 - \sum_{\Delta=1}^{\delta_{a,i,t}-1} P_{a,i}(\Delta)\). Therefore, \(P_{a,i,t+1}\) can be obtained as (1). This completes the proof.

B. The Proof of Theorem 1

The MPE policy in Game 2 is the fixed point solution of every household's best response policy.
Household \(i\) solves the Bellman equations (20) for all approximate observation profiles \(\hat{o} \in \mathcal{O}\) when other households' policies are fixed. We construct a Bayesian game from the underlying fully observable game with incomplete information as follows:

Game 3 Bayesian Game Among Virtual Households:

Players: The set of virtual households, where each virtual household \((i, \hat{o})\) corresponds to a real household \(i \in \mathcal{N}\) and an observation profile \(\hat{o} \in \mathcal{O}\).
Types: The type of each virtual household \((i, \hat{o})\) is the observation \(o_i \in \mathcal{O}_i(\hat{o})\) of household \(i\). \(P_i(o_i \mid \hat{o})\) is the probability that virtual household \((i, \hat{o})\) has type \(o_i\).
Strategies: The strategy for virtual household \((i, \hat{o})\) is the probability distribution \(\pi_i(\hat{o})\) over the actions \(x_i(\hat{o}) \in \hat{\mathcal{X}}_i(\hat{o})\).
Costs: The cost of each virtual household \((i, \hat{o})\) with strategy \(\pi_i(\hat{o})\) is equal to \(E_{\pi_i(\hat{o})}\{Q_i^{\pi}(\hat{o}, x_i(\hat{o}))\}\), where \(Q_i^{\pi}(\hat{o}, x_i(\hat{o}))\) is defined in (18).

We consider the Bayesian Nash equilibrium (BNE) solution concept for the underlying Bayesian game among virtual households. We show that the BNE corresponds to the MPE of Game 2 among households \(i \in \mathcal{N}\). In Game 3, each virtual household \((i, \hat{o})\) aims to determine its BNE strategy \(\pi_i^{\mathrm{BNE}}(\hat{o})\) to minimize \(E_{\pi_i^{\mathrm{BNE}}(\hat{o})}\{Q_i^{\pi^{\mathrm{BNE}}}(\hat{o}, x_i(\hat{o}))\}\) when other virtual households' strategies are fixed. Therefore, in the BNE all virtual households solve the Bellman equations in (20). Consequently, the BNE of Game 3 among virtual households corresponds to the MPE of Game 2 among real households. A BNE always exists for Bayesian games with a finite number of players and actions [17, Ch. 6]. Thus, an MPE exists for the fully observable game with incomplete information among households. This completes the proof.

C. The Proof of Theorem 2

We use an approach similar to [9, Theorem 3.8.2] to show that the joint policy \(\pi\) is an MPE if and only if \(f_i^{\mathrm{obj}}(V_i^{\pi}, \pi_i) = 0\) for all households \(i \in \mathcal{N}\). Then, we obtain the condition in (24) for the policy in an MPE. Our proof involves two steps.

Step (a): Consider the joint policy \(\pi\) and value functions \(V_i^{\pi}(\hat{o}),\ i \in \mathcal{N}\), in the feasible set of problem (23), for which we have \(f_i^{\mathrm{obj}}(V_i^{\pi}, \pi_i) = 0\) for \(i \in \mathcal{N}\). We show that the policy \(\pi\) is an MPE. According to the constraint set of problem (23), the Bellman errors for the actions in an approximate observation profile \(\hat{o}\) are nonnegative. Since \(f_i^{\mathrm{obj}}(V_i^{\pi}, \pi_i)\) is the expectation over the Bellman errors, its value is nonnegative for all feasible policies and value functions.
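If \(f_i^{\mathrm{obj}}(V_i^{\pi}, \pi_i) = 0\) for all \(i \in \mathcal{N}\), then the policy \(\pi\) and the value functions \(V_i^{\pi}(\hat{o}),\ i \in \mathcal{N}\) are the global optimum of problem (23) for all households. Hence, no household has the incentive to unilaterally change its policy in order to further reduce its objective value \(f_i^{\mathrm{obj}}(V_i^{\pi}, \pi_i)\). In other words, the policy \(\pi\) is an MPE.

Next, we show that for an MPE policy \(\pi^{\mathrm{MPE}}\), we can determine a value function \(V_i^{\pi^{\mathrm{MPE}}}(\hat{o})\) such that \(f_i^{\mathrm{obj}}(V_i^{\pi^{\mathrm{MPE}}}, \pi_i^{\mathrm{MPE}}) = 0\). From (22), \(f_i^{\mathrm{obj}}(V_i^{\pi^{\mathrm{MPE}}}, \pi_i^{\mathrm{MPE}}) = 0\) is equivalent to
\[ \sum_{\hat{o} \in \mathcal{O}} E_{\pi_i^{\mathrm{MPE}}(\hat{o})}\left\{ B_i\left(V_i^{\pi^{\mathrm{MPE}}}, \hat{o}, x_i(\hat{o})\right) \right\} = 0, \quad \forall\, i \in \mathcal{N}. \tag{35} \]
According to the constraint set of problem (23), the Bellman errors for the actions in an observation profile \(\hat{o}\) are nonnegative in the MPE. Thus, each term of the summation in (35) should be zero. That is,
\[ E_{\pi_i^{\mathrm{MPE}}(\hat{o})}\left\{ B_i\left(V_i^{\pi^{\mathrm{MPE}}}, \hat{o}, x_i(\hat{o})\right) \right\} = 0, \quad \forall\, \hat{o} \in \mathcal{O},\ i \in \mathcal{N}, \tag{36} \]
which is equivalent to
\[ E_{\pi_i^{\mathrm{MPE}}(\hat{o})}\left\{ Q_i^{\pi^{\mathrm{MPE}}}(\hat{o}, x_i(\hat{o})) - V_i^{\pi^{\mathrm{MPE}}}(\hat{o}) \right\} = 0, \quad \forall\, \hat{o} \in \mathcal{O},\ i \in \mathcal{N}. \tag{37} \]
\(\pi_i^{\mathrm{MPE}}(\hat{o})\) is a randomized policy. Hence, we have \(E_{\pi_i^{\mathrm{MPE}}(\hat{o})}\{V_i^{\pi^{\mathrm{MPE}}}(\hat{o})\} = V_i^{\pi^{\mathrm{MPE}}}(\hat{o})\). Hence, for all approximate observation profiles \(\hat{o} \in \mathcal{O}\), (37) can be rewritten as
\[ V_i^{\pi^{\mathrm{MPE}}}(\hat{o}) = E_{\pi_i^{\mathrm{MPE}}(\hat{o})}\left\{ Q_i^{\pi^{\mathrm{MPE}}}(\hat{o}, x_i(\hat{o})) \right\}, \quad \forall\, i \in \mathcal{N}. \tag{38} \]
For household \(i\), we define the average cost in approximate observation profile \(\hat{o}\) as \(\bar{c}_i(\hat{o}) = E_{\pi^{\mathrm{MPE}}(\hat{o})}\{c_i(\hat{o}, x(\hat{o}))\}\). We define the average transition probability from observation \(\hat{o}\) to \(\hat{o}'\) as \(\bar{P}(\hat{o}' \mid \hat{o}) = E_{\pi^{\mathrm{MPE}}(\hat{o})}\{P(\hat{o}' \mid \hat{o}, x(\hat{o}))\}\). We define vectors \(\bar{c}_i = (\bar{c}_i(\hat{o}),\ \hat{o} \in \mathcal{O})\) and \(V_i^{\pi^{\mathrm{MPE}}} = (V_i^{\pi^{\mathrm{MPE}}}(\hat{o}),\ \hat{o} \in \mathcal{O})\), and define the transition matrix \(\bar{P} = [\bar{P}(\hat{o}' \mid \hat{o}),\ \hat{o}, \hat{o}' \in \mathcal{O}]\). By substituting (18) into (38), we have
\[ V_i^{\pi^{\mathrm{MPE}}} = (1-\beta)\, \bar{c}_i + \beta \bar{P}\, V_i^{\pi^{\mathrm{MPE}}}. \tag{39} \]
By rearranging the terms in (39), we obtain
\[ \left(I - \beta \bar{P}\right) V_i^{\pi^{\mathrm{MPE}}} = (1-\beta)\, \bar{c}_i, \tag{40} \]
where \(I\) is the identity matrix. Matrix \(\bar{P}\) is a stochastic matrix (i.e., each of its entries is a nonnegative real number representing a probability), and thus the magnitudes of its eigenvalues are less than or equal to one. Besides, the discount factor \(\beta\) is less than one. Hence, the eigenvalues of matrix \(I - \beta \bar{P}\) have positive real parts, and thereby it is invertible (or nonsingular). From (40), we can obtain \(V_i^{\pi^{\mathrm{MPE}}}\) as
\[ V_i^{\pi^{\mathrm{MPE}}} = (1-\beta)\left(I - \beta \bar{P}\right)^{-1} \bar{c}_i. \tag{41} \]
Therefore, for the MPE policy \(\pi^{\mathrm{MPE}}\), we obtain the value function \(V_i^{\pi^{\mathrm{MPE}}}(\hat{o})\) in (41) such that \(f_i^{\mathrm{obj}}(V_i^{\pi^{\mathrm{MPE}}}, \pi_i^{\mathrm{MPE}}) = 0\) for all households \(i \in \mathcal{N}\).

Step (b): We obtain the condition in (24) for the policy in an MPE. For each household \(i \in \mathcal{N}\), the objective function \(f_i^{\mathrm{obj}}(V_i^{\pi^{\mathrm{MPE}}}, \pi_i^{\mathrm{MPE}})\) in (22) can be expressed as
\[ f_i^{\mathrm{obj}}(V_i^{\pi^{\mathrm{MPE}}}, \pi_i^{\mathrm{MPE}}) = \sum_{\hat{o} \in \mathcal{O}}\ \sum_{x_i(\hat{o}) \in \hat{\mathcal{X}}_i(\hat{o})} \pi_i^{\mathrm{MPE}}(\hat{o}, x_i(\hat{o}))\, B_i\left(V_i^{\pi^{\mathrm{MPE}}}, \hat{o}, x_i(\hat{o})\right). \tag{42} \]
The Bellman error \(B_i(V_i^{\pi^{\mathrm{MPE}}}, \hat{o}, x_i(\hat{o}))\) is nonnegative. Hence, from (42), \(f_i^{\mathrm{obj}}(V_i^{\pi^{\mathrm{MPE}}}, \pi_i^{\mathrm{MPE}}) = 0\) is equivalent to \(\pi_i^{\mathrm{MPE}}(\hat{o}, x_i(\hat{o}))\, B_i(V_i^{\pi^{\mathrm{MPE}}}, \hat{o}, x_i(\hat{o})) = 0\) for all households \(i \in \mathcal{N}\) with action \(x_i(\hat{o}) \in \hat{\mathcal{X}}_i(\hat{o})\) in observation profile \(\hat{o}\). This completes the proof.

D. Bellman Error Approximation

The basis function in (32) is equal to the expectation over the Bellman errors for all feasible actions \(x_i(\hat{o}^{k-1}) \in \mathcal{X}_i(o_i^{k-1})\). ECC \(i\) knows the observation \(o_i^{k-1}\), the approximate observation profile \(\hat{o}^{k-1}\), and the cost \(c_i(\hat{o}^{k-1}, x_i^{k-1}(\hat{o}^{k-1}), x_{-i}^{k-1}(\hat{o}^{k-1}))\) for the chosen action \(x_i^{k-1}(\hat{o}^{k-1})\) in iteration \(k-1\), as well as the current observation \(o_i^{k}\) and the approximate observation profile \(\hat{o}^{k}\). ECC \(i\) needs to use this available information to approximate the Bellman error for an arbitrary feasible action \(x_i(\hat{o}^{k-1}) \in \mathcal{X}_i(o_i^{k-1})\). We use the TD error as an estimate of the Bellman error [14, Lemma 3]. We have
\[ B_i^{k-1}\left(V_i^{\pi,k-1}, \hat{o}^{k-1}, x_i(\hat{o}^{k-1})\right) \approx (1-\beta)\, c_i\left(\hat{o}^{k-1}, x_i(\hat{o}^{k-1}), x_{-i}^{k-1}(\hat{o}^{k-1})\right) + \beta V_i^{\pi,k-1}\left(\hat{o}^{k}(x_i(\hat{o}^{k-1})), \theta_i^{k-1}\right) - V_i^{\pi,k-1}(\hat{o}^{k-1}, \theta_i^{k-1}), \tag{43} \]
where \(\hat{o}^{k}(x_i(\hat{o}^{k-1}))\) is the approximate observation profile in the current time slot \(k\) if household \(i\) chooses action \(x_i(\hat{o}^{k-1})\) in the previous time slot \(k-1\). ECC \(i\) determines \(\hat{o}^{k}(x_i(\hat{o}^{k-1}))\) in the following two steps:

Step (a): ECC \(i\) knows observations \(o_i^{k-1}\) and \(o_i^{k}\). Thus, it can determine the set of appliances that become awake with a new task in the current time slot \(k\). ECC \(i\) can also determine the state of other operating appliances for an arbitrary feasible action \(x_i(\hat{o}^{k-1})\). Therefore, ECC \(i\) can determine the state of its own household for an arbitrary feasible action \(x_i(\hat{o}^{k-1})\).

Step (b): The states of other households are fixed. Furthermore, ECC \(i\) knows the approximate observation profile \(\hat{o}^{k}\) for the chosen action \(x_i^{k-1}(\hat{o}^{k-1})\). Using the result of Step (a), ECC \(i\) can compute the average aggregate load demands for the feasible actions of all households for an arbitrary feasible action \(x_i(\hat{o}^{k-1}) \in \mathcal{X}_i(o_i^{k-1})\), and thus it can determine the approximate observation profile \(\hat{o}^{k}(x_i(\hat{o}^{k-1}))\) for all households for action \(x_i(\hat{o}^{k-1})\).

In addition to computing the approximate observation profile \(\hat{o}^{k}(x_i(\hat{o}^{k-1}))\), ECC \(i\) needs to compute the cost \(c_i(\hat{o}^{k-1}, x_i(\hat{o}^{k-1}), x_{-i}^{k-1}(\hat{o}^{k-1}))\) for feasible action \(x_i(\hat{o}^{k-1}) \in \mathcal{X}_i(o_i^{k-1})\). ECC \(i\) knows the payment to the utility company for the chosen action \(x_i^{k-1}(\hat{o}^{k-1})\).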
Since the load demand of one household is much smaller than the aggregate load demand of all households, we can assume that the price value is unchanged when household $i$ unilaterally changes its load demand. Thus, ECC $i$ can estimate its payment for an arbitrary feasible action $x_i(\hat{o}^{k-1}) \in \mathcal{X}_i(o^{k-1})$. ECC $i$ can also determine the discomfort cost for action $x_i(\hat{o}^{k-1}) \in \mathcal{X}_i(o^{k-1})$. Therefore, it can compute the cost $c_i\big(\hat{o}^{k-1}, x_i(\hat{o}^{k-1}), x_{-i}^{k-1}(\hat{o}^{k-1})\big)$ for an arbitrary feasible action $x_i(\hat{o}^{k-1})$. Finally, ECC $i$ is able to compute the approximate Bellman error in (43).

Jianwei Huang (S'01-M'06-SM'11-F'16) is an IEEE Fellow, a Distinguished Lecturer of IEEE Communications Society, and a Thomson Reuters Highly Cited Researcher in Computer Science. He is an Associate Professor and Director of the Network Communications and Economics Lab (ncel.ie.cuhk.edu.hk), in the Department of Information Engineering at the Chinese University of Hong Kong. He received the Ph.D. degree from Northwestern University in 2005, and worked as a Postdoc Research Associate at Princeton University. He is the co-recipient of 8 Best Paper Awards, including the IEEE Marconi Prize Paper Award in Wireless Communications. He has coauthored six books, including the textbook on Wireless Network Pricing. He received the CUHK Young Researcher Award in 2014 and the IEEE ComSoc Asia-Pacific Outstanding Young Researcher Award. He has served as an Associate Editor of IEEE/ACM Transactions on Networking, IEEE Transactions on Wireless Communications, IEEE Journal on Selected Areas in Communications - Cognitive Radio Series, and IEEE Transactions on Cognitive Communications and Networking. He has served as the Chair of IEEE ComSoc Cognitive Network Technical Committee and Multimedia Communications Technical Committee.

Shahab Bahrami (S'12) received the B.Sc. and M.A.Sc. degrees both from Sharif University of Technology, Tehran, Iran, in 2010 and 2012, respectively. He is currently a Ph.D. candidate in the Department of Electrical and Computer Engineering, The University of British Columbia (UBC), Vancouver, BC, Canada. His research interests include optimal power flow analysis, game theory, and demand side management, with applications in smart grid.

Vincent W.S. Wong (S'94, M'00, SM'07, F'16) received the B.Sc. degree from the University of Manitoba, Winnipeg, MB, Canada, in 1994, the M.A.Sc. degree from the University of Waterloo, Waterloo, ON, Canada, in 1996, and the Ph.D. degree from the University of British Columbia (UBC), Vancouver, BC, Canada. From 2000 to 2001, he worked as a systems engineer at PMC-Sierra Inc. (now Microsemi). He joined the Department of Electrical and Computer Engineering at UBC in 2002 and is currently a Professor. His research areas include protocol design, optimization, and resource management of communication networks, with applications to wireless networks, smart grid, and the Internet. Dr. Wong is an Editor of the IEEE Transactions on Communications. He was a Guest Editor of IEEE Journal on Selected Areas in Communications and IEEE Wireless Communications. He has served on the editorial boards of IEEE Transactions on Vehicular Technology and Journal of Communications and Networks. He has served as a Technical Program Co-chair of IEEE SmartGridComm'14, as well as a Symposium Co-chair of IEEE SmartGridComm'13 and IEEE Globecom'13. Dr. Wong is the Chair of the IEEE Communications Society Emerging Technical Sub-Committee on Smart Grid Communications and the IEEE Vancouver Joint Communications Chapter. He received the 2014 UBC Killam Faculty Research Fellowship.
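As a rough illustration of the closing argument of the appendix (with the price held fixed, an ECC can evaluate the cost of any feasible action and then form a Bellman error), the following Python sketch may help. Everything in it is an assumption made for illustration: the linear payment, the quadratic discomfort term, the names `stage_cost` and `bellman_residual`, and the one-step residual itself are hypothetical stand-ins, not the cost model or the approximate Bellman error (43) of the paper.

```python
from typing import Callable, Sequence


def stage_cost(price: float, load_kwh: float, desired_kwh: float,
               discomfort_weight: float) -> float:
    """Cost of one action: payment at a fixed price plus a discomfort penalty.

    Both terms are hypothetical forms chosen for illustration only.
    """
    payment = price * load_kwh
    discomfort = discomfort_weight * (desired_kwh - load_kwh) ** 2
    return payment + discomfort


def bellman_residual(price: float,
                     feasible_loads: Sequence[float],
                     desired_kwh: float,
                     discomfort_weight: float,
                     discount: float,
                     value_now: float,
                     value_next: Callable[[float], float]) -> float:
    """Generic one-step Bellman residual for a cost-minimizing user.

    The price does not depend on the chosen load, which mirrors the
    assumption that one household is small relative to the aggregate load.
    value_next(load) is an assumed estimate of the future cost after
    choosing `load` in the current time slot.
    """
    best = min(
        stage_cost(price, load, desired_kwh, discomfort_weight)
        + discount * value_next(load)
        for load in feasible_loads
    )
    return best - value_now


if __name__ == "__main__":
    # Toy numbers: unmet demand is assumed to cost 1.0 per kWh later on.
    residual = bellman_residual(
        price=0.2,
        feasible_loads=[0.0, 1.0, 2.0],
        desired_kwh=2.0,
        discomfort_weight=0.1,
        discount=0.9,
        value_now=1.0,
        value_next=lambda load: 2.0 - load,
    )
    print(residual)
```

Because the price is treated as a constant inside the minimization, the loop over `feasible_loads` needs no information about the other households beyond what is already reflected in the observed price, which is the point the appendix makes.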


More information

Lossy Compression. Compromise accuracy of reconstruction for increased compression.

Lossy Compression. Compromise accuracy of reconstruction for increased compression. Lossy Compresson Compromse accuracy of reconstructon for ncreased compresson. The reconstructon s usually vsbly ndstngushable from the orgnal mage. Typcally, one can get up to 0:1 compresson wth almost

More information

Statistics for Economics & Business

Statistics for Economics & Business Statstcs for Economcs & Busness Smple Lnear Regresson Learnng Objectves In ths chapter, you learn: How to use regresson analyss to predct the value of a dependent varable based on an ndependent varable

More information

A PROBABILITY-DRIVEN SEARCH ALGORITHM FOR SOLVING MULTI-OBJECTIVE OPTIMIZATION PROBLEMS

A PROBABILITY-DRIVEN SEARCH ALGORITHM FOR SOLVING MULTI-OBJECTIVE OPTIMIZATION PROBLEMS HCMC Unversty of Pedagogy Thong Nguyen Huu et al. A PROBABILITY-DRIVEN SEARCH ALGORITHM FOR SOLVING MULTI-OBJECTIVE OPTIMIZATION PROBLEMS Thong Nguyen Huu and Hao Tran Van Department of mathematcs-nformaton,

More information

Chapter Newton s Method

Chapter Newton s Method Chapter 9. Newton s Method After readng ths chapter, you should be able to:. Understand how Newton s method s dfferent from the Golden Secton Search method. Understand how Newton s method works 3. Solve

More information

Calculation of time complexity (3%)

Calculation of time complexity (3%) Problem 1. (30%) Calculaton of tme complexty (3%) Gven n ctes, usng exhaust search to see every result takes O(n!). Calculaton of tme needed to solve the problem (2%) 40 ctes:40! dfferent tours 40 add

More information

Department of Quantitative Methods & Information Systems. Time Series and Their Components QMIS 320. Chapter 6

Department of Quantitative Methods & Information Systems. Time Series and Their Components QMIS 320. Chapter 6 Department of Quanttatve Methods & Informaton Systems Tme Seres and Ther Components QMIS 30 Chapter 6 Fall 00 Dr. Mohammad Zanal These sldes were modfed from ther orgnal source for educatonal purpose only.

More information

An Admission Control Algorithm in Cloud Computing Systems

An Admission Control Algorithm in Cloud Computing Systems An Admsson Control Algorthm n Cloud Computng Systems Authors: Frank Yeong-Sung Ln Department of Informaton Management Natonal Tawan Unversty Tape, Tawan, R.O.C. ysln@m.ntu.edu.tw Yngje Lan Management Scence

More information

For now, let us focus on a specific model of neurons. These are simplified from reality but can achieve remarkable results.

For now, let us focus on a specific model of neurons. These are simplified from reality but can achieve remarkable results. Neural Networks : Dervaton compled by Alvn Wan from Professor Jtendra Malk s lecture Ths type of computaton s called deep learnng and s the most popular method for many problems, such as computer vson

More information

Resource Allocation and Decision Analysis (ECON 8010) Spring 2014 Foundations of Regression Analysis

Resource Allocation and Decision Analysis (ECON 8010) Spring 2014 Foundations of Regression Analysis Resource Allocaton and Decson Analss (ECON 800) Sprng 04 Foundatons of Regresson Analss Readng: Regresson Analss (ECON 800 Coursepak, Page 3) Defntons and Concepts: Regresson Analss statstcal technques

More information

Hidden Markov Models

Hidden Markov Models Hdden Markov Models Namrata Vaswan, Iowa State Unversty Aprl 24, 204 Hdden Markov Model Defntons and Examples Defntons:. A hdden Markov model (HMM) refers to a set of hdden states X 0, X,..., X t,...,

More information

Motion Perception Under Uncertainty. Hongjing Lu Department of Psychology University of Hong Kong

Motion Perception Under Uncertainty. Hongjing Lu Department of Psychology University of Hong Kong Moton Percepton Under Uncertanty Hongjng Lu Department of Psychology Unversty of Hong Kong Outlne Uncertanty n moton stmulus Correspondence problem Qualtatve fttng usng deal observer models Based on sgnal

More information

Report on Image warping

Report on Image warping Report on Image warpng Xuan Ne, Dec. 20, 2004 Ths document summarzed the algorthms of our mage warpng soluton for further study, and there s a detaled descrpton about the mplementaton of these algorthms.

More information

Comparison of Regression Lines

Comparison of Regression Lines STATGRAPHICS Rev. 9/13/2013 Comparson of Regresson Lnes Summary... 1 Data Input... 3 Analyss Summary... 4 Plot of Ftted Model... 6 Condtonal Sums of Squares... 6 Analyss Optons... 7 Forecasts... 8 Confdence

More information

A Robust Method for Calculating the Correlation Coefficient

A Robust Method for Calculating the Correlation Coefficient A Robust Method for Calculatng the Correlaton Coeffcent E.B. Nven and C. V. Deutsch Relatonshps between prmary and secondary data are frequently quantfed usng the correlaton coeffcent; however, the tradtonal

More information

Structure and Drive Paul A. Jensen Copyright July 20, 2003

Structure and Drive Paul A. Jensen Copyright July 20, 2003 Structure and Drve Paul A. Jensen Copyrght July 20, 2003 A system s made up of several operatons wth flow passng between them. The structure of the system descrbes the flow paths from nputs to outputs.

More information

U.C. Berkeley CS294: Beyond Worst-Case Analysis Luca Trevisan September 5, 2017

U.C. Berkeley CS294: Beyond Worst-Case Analysis Luca Trevisan September 5, 2017 U.C. Berkeley CS94: Beyond Worst-Case Analyss Handout 4s Luca Trevsan September 5, 07 Summary of Lecture 4 In whch we ntroduce semdefnte programmng and apply t to Max Cut. Semdefnte Programmng Recall that

More information

On the Multicriteria Integer Network Flow Problem

On the Multicriteria Integer Network Flow Problem BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 5, No 2 Sofa 2005 On the Multcrtera Integer Network Flow Problem Vassl Vasslev, Marana Nkolova, Maryana Vassleva Insttute of

More information

CHAPTER 17 Amortized Analysis

CHAPTER 17 Amortized Analysis CHAPTER 7 Amortzed Analyss In an amortzed analyss, the tme requred to perform a sequence of data structure operatons s averaged over all the operatons performed. It can be used to show that the average

More information