IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY

Optimal Energy Management Policy of Mobile Energy Gateway


Yang Zhang, Dusit Niyato, Senior Member, IEEE, Ping Wang, Senior Member, IEEE, and Dong In Kim, Senior Member, IEEE

Abstract—With the advancement of wireless energy harvesting and transfer technologies, e.g., radio frequency (RF) energy, mobile nodes are fully untethered as energy supply becomes more ubiquitous. The mobile nodes can receive energy from wireless chargers, which can be static or mobile. In this paper, we introduce the use of a mobile energy gateway that can receive energy from a fixed charging facility, as well as move and transfer energy to other users. The mobile energy gateway aims to maximize its utility by optimally taking energy charging/transferring actions. We formulate the optimal energy charging/transferring problem as a Markov decision process (MDP). The MDP model is then solved to obtain the optimal energy management policy for the mobile energy gateway. Furthermore, the optimal energy management policy obtained from the MDP model is proven to have a threshold structure. We conduct an extensive performance evaluation of the MDP-based energy management scheme. The proposed MDP-based scheme outperforms several conventional baseline schemes in terms of expected overall utility.

Index Terms—Markov decision process (MDP), mobile energy gateway, wireless charging.

I. INTRODUCTION

RADIO FREQUENCY (RF) energy is one of the wireless energy harvesting and transfer techniques that support far-field wireless charging services. The other techniques are inductive coupling and magnetic resonance coupling, which are near-field charging techniques. In RF-based wireless charging, an RF signal is used as a carrier to transfer energy from a source (e.g., a wireless charger) to a consumer (e.g., a user).
RF-based wireless charging can support mobile networks composed of energy-constrained nodes and devices. This will help improve not only the energy efficiency but also the performance of the networks. The efficiency of RF-based wireless charging depends largely on the distance between the charger and the charging device. Traditionally, wireless charging is deployed at fixed chargers with a constant power supply (e.g., from a power outlet). However, in many situations (e.g., sensor networks), wireless charging can be used by a mobile energy gateway that moves and reaches other charging devices [1]. Such a mobile energy gateway improves the energy replenishment process for various mobile and wireless networks, particularly when the charging devices cannot or rarely visit a fixed wireless charging facility. However, there are a few issues when the mobile energy gateway is deployed, e.g., optimal deployment, path planning, and energy management.

Manuscript received December 26, 2014; revised April 5, 2015; accepted June 5, 2015. This work was supported in part by the Singapore Ministry of Education (MOE) under Tier-1 Grants RG18/13 and RG33/12 and Tier-2 Grant MOE2014-T ARC 4/15 and in part by the National Research Foundation of Korea funded by the Korean government (MSIP) under Grant 2014R1A5A. The review of this paper was coordinated by Dr. P. Lin. Y. Zhang, D. Niyato, and P. Wang are with the School of Computer Engineering, Nanyang Technological University, Singapore (e-mail: yzhang28@e.ntu.edu.sg; dniyato@ntu.edu.sg; wangping@ntu.edu.sg). D. I. Kim is with the School of Information and Communication Engineering, Sungkyunkwan University (SKKU), Suwon, Korea (e-mail: dikim@skku.ac.kr). Color versions of one or more of the figures in this paper are available online.

We consider a mobile network with the mobile energy gateway.
Unlike most existing works, which assume that the mobility of the energy gateway¹ can be adjusted and its path can be optimized, we consider an energy gateway with noncontrollable mobility. For example, the energy gateway may be attached to a vehicle (e.g., a bicycle or trolley) or carried by a human. The energy gateway works as a self-interested carrier (i.e., an agent) of energy between energy sources and users. It is equipped with RF charging capability, as well as energy storage, e.g., a battery. The fixed chargers and users are geographically distributed at different locations in the network. The energy gateway moves among the locations randomly, pays for the energy received from the chargers, and receives payment from users when energy is transferred. Therefore, the energy gateway aims to maximize its profit by strategically charging and transferring energy at different locations.

To address the problem of energy charging/transferring actions of the energy gateway, we propose a Markov decision process (MDP)-based scheme. By employing the MDP-based scheme, the energy gateway decides whether or not to pay for and receive energy from the fixed charger. Likewise, when the energy gateway meets users (i.e., the users are in the energy transfer range of the energy gateway), it decides whether or not to transfer energy to the users. There could be multiple independent energy gateways in the system. Each energy gateway makes the energy charging/transferring decisions rationally from its own perspective, and thus, the system operates in a distributed manner. The energy charging/transferring decision, which is referred to as a policy, is made based on the states of the energy gateway.
The states are defined as the location, the energy level of the battery, the users in the neighborhood, and the current prices of energy at the fixed chargers.

¹We use mobile energy gateway and energy gateway interchangeably in the rest of this paper.

The contributions of this paper are summarized as follows.

- We propose the concept of a self-interested energy gateway, which is equipped with RF charging capability. The energy gateway acts as an energy carrier to assist the chargers in extending the energy transmission range to remote users.
- We design an MDP-based scheme for the energy gateway to obtain the energy management policy. The optimal performance is achieved in terms of the maximized utility of the energy gateway.
- We study the structure of the optimal energy charging/transferring policy. In particular, we prove that the optimal policy obtained from the MDP-based scheme has a threshold structure with respect to the system states.
- We present an extensive performance evaluation of the MDP-based scheme. We demonstrate that the MDP-based scheme outperforms several baseline schemes for energy management actions. This is due to the fact that the MDP-based scheme takes both the current and the future system states into account.

The rest of this paper is organized as follows. We review related work in Section II. In Section III, we describe the mobile network with energy sources (i.e., fixed chargers) and an energy gateway to carry and transfer energy to users. The RF energy propagation model is presented, and a payment scheme of users is described. Section IV formulates an MDP to maximize the energy gateway's expected utility. The solution method and the existence of threshold policies of the MDP are presented in Section V. Numerical results are provided in Section VI. Finally, Section VII concludes this paper.

II. RELATED WORK

A. RF Energy Harvesting

Recently, RF energy harvesting techniques have been introduced to sustain the operation of wireless devices when wired charging or battery replacement is too costly or practically infeasible.
For example, body area wireless devices for human health monitoring [2], sensors inside civil infrastructures [3], and monitoring devices in airframes [4] can benefit from RF energy harvesting techniques. Numerous applications and research works related to RF energy harvesting were reviewed in [5] and [6].

Nishimoto et al. in [7] developed a prototype of a wireless sensor network with RF energy harvesting capability. The sensors harvest ambient RF energy from far-field (6.6 km) TV broadcast signals. Popovic et al. in [8] also implemented and studied similar RF energy harvesting sensor networks. The experiments in [7] and [8] showed that, with advanced antenna and circuit designs, the harvested RF power is able to support sensor applications. Ostaffe in [9] showed that a mobile phone emitting 0.5 W can charge a device at distances of 1, 5, and 10 m with power densities of 40 mW/m², 1.6 mW/m², and 400 μW/m², respectively. RF energy harvesting to support cognitive radio systems was discussed in [10]. Simultaneous wireless information and power transfer was proposed in [11], which allows the existing wireless network architecture to support RF energy transfer without much modification.

RF energy may be harvested from two types of sources, i.e., ambient sources (e.g., a TV tower [7]) and dedicated sources. For example, in [12] and [13], mobile chargers were deployed to power wireless sensors. The Powercaster transmitters [14], which operate with a transmit power of 1 W/3 W, and the Powerharvester receivers [14], which harvest 6 dBm/ dBm, were also developed as commercialized devices to utilize RF energy.

B. Performance Modeling and Optimization for RF Energy Harvesting

The MDP was introduced as an online optimization approach for energy harvesting in communication systems [15].
For example, in a body sensor network where each sensor node carries a rechargeable battery and an associated energy harvesting facility [16], an MDP model was formulated for the sensors to choose different transmission modes in different system states, achieving optimal energy efficiency in terms of the maximized number of successfully reported health events under constrained energy. The actions of taking different transmission modes are associated with different energy consumption levels and data rates. The system states include the energy level, the health event to be transmitted, and the energy harvesting state, where harvesting RF energy from an ambient source is modeled as a correlated two-state process. Similarly, in [17], Sultan formulated an MDP model to determine an action of sensing/being idle for an RF-energy-powered secondary user in cognitive radio networks. Energy is randomly harvested by the secondary user. An MDP model was developed in [18] to optimize the mean delay of data transmission of a sensor node. To achieve maximized throughput under quality-of-service (QoS) requirements, Niyato et al. [19] studied an MDP-based scheme for a mobile user to balance between energy charging and data transmission.

Users may be far away from RF sources and cannot receive RF energy from wireless chargers [20], [21]. To overcome the transmission range limitation, dedicated energy transmitters, e.g., chargers and relays, were proposed in the literature to move and disseminate RF energy in different areas of the system, so that RF energy can be transferred to areas without RF energy sources. For example, to charge radio frequency identification (RFID) tags with RF energy, He in [22] proposed the optimal placement of stationary RFID readers, which also supply RF energy.
The objective is to minimize the number of readers, given QoS requirements. Erol-Kantarci and Mouftah in [12] and Shi et al. in [23] introduced the idea of using mobile chargers to travel to different locations and charge multiple sensors. Erol-Kantarci and Mouftah [12] proposed an optimal path for mobile chargers, which is obtained based on the shortest Hamiltonian cycle of the locations to visit. Erol-Kantarci and Mouftah in [13] extended the scheme in [12] by considering priorities of different sensors. An integer linear programming optimization model was applied to maximize the power received by the prioritized sensors.

To the best of our knowledge, the works in the literature related to RF energy harvesting in mobile networks did not consider the perspective of a mobile energy transmitter (i.e.,

a mobile energy gateway), which is designed to carry energy from sources to energy users, aiming at maximizing the profit in terms of utility or monetary reward; this is the main issue that is the focus of this paper.

III. SYSTEM MODEL

[Fig. 1. System description.]

We consider the mobile network with fixed wireless chargers (i.e., energy suppliers), an energy gateway (i.e., an energy messenger), and users (i.e., energy consumers), as shown in Fig. 1. There are multiple fixed chargers at different locations in the network. The energy gateway moves among locations, visiting chargers and users. The energy gateway is equipped with a battery and energy transfer interfaces. When the energy gateway is at a charger, a certain amount of energy can be transferred to the battery of the energy gateway. Then, a certain price has to be paid by the energy gateway to the charger. The energy price of different chargers can be different and time varying. We assume that the energy gateway visits only one charger at a time, and thus, it receives energy only from the corresponding charger. By contrast, when the energy gateway is at the users, a certain amount of energy may be transferred to the nearby users.

The energy transfer to the users is performed in a broadcast manner (i.e., multiple users can receive energy simultaneously), which is a typical property of RF energy transfer. The users are assumed to be geographically distributed following a Poisson spatial distribution [24]. A user receives RF energy from the energy gateway and pays a retail energy price to the energy gateway. Note that the amount of energy transferred from the charger to the energy gateway can be different from, and is usually higher than, that transferred from the energy gateway to the users.

Based on the given system model, we aim to design an energy management scheme for the energy gateway. The scheme assists the energy gateway in deciding whether to receive energy from the charger and whether to transfer energy to the users. The decision is based on the system states, which are defined based on the location, the energy prices at chargers, and the number of users that are able to receive energy from the energy gateway. To make the model tractable, we assume that the decision making by the energy gateway is time slotted, and each time slot is called a decision period. The length of a decision period is set to allow the energy to be charged or transferred, which depends on the implementation of the system. We consider that only one decision is made in one decision period.

Note that energy pricing (i.e., from chargers to the energy gateway and from the energy gateway to users) is beyond the scope of this paper. We assume that the set of prices is predetermined. Finding optimal prices is a separate issue that could be studied in future work.

A. RF Energy Propagation Model and User Payment

The energy gateway transfers energy to users by the RF energy transfer technique. With the energy successfully received, the user makes a payment to the energy gateway (e.g., in the form of a real money transaction or a fictitious token). Due to path loss, the amount of energy received by different users may vary. We assume that, if the received energy does not exceed the demand of a user, the payment of the user will be based on the actual energy received.

We assume that the energy transferring time is fixed and less than a time slot, depending on the property of the energy gateway. Signals will be sent to the end users by the energy gateway to start and terminate the energy transfer. As a result, the durations of all the end users receiving energy are identical. The amount of energy (i.e., in joules) received by user n during the energy transfer duration can be expressed using the Friis formula [6] as follows:

$$ e_n^R = \zeta_{\mathrm{RF/DC}}\, G_t\, G_{r,n} \left(\frac{\lambda}{4\pi R_n}\right)^2 E_S \tag{1} $$

where ζ_RF/DC is the RF-to-DC energy conversion efficiency [7], G_t and G_{r,n} are the energy transmitting antenna gain of the energy gateway and the receiving antenna gain of user n, respectively, λ is the wavelength of the energy transfer signal, and R_n is the distance from the energy gateway to user n. For simplicity of notation, we let ζ_RF/DC G_t G_{r,n} (λ/4π)² = g, where g is a constant. From (1), the amount of energy received by any user is inversely proportional to the square of the distance to the energy gateway, given a fixed amount of transmitted energy E_S.

The Friis transmission formula requires that the distance R_n between the user and the energy gateway satisfy R_n > R_f, where R_f is the Fraunhofer distance satisfying the following conditions:

$$ R_f = \frac{2 D_a^2}{\lambda}, \qquad R_f \gg \lambda, \qquad R_f \gg D_a \tag{2} $$

where D_a is the largest dimension of the user's antenna. For users in the near field of the energy gateway, i.e., 0 ≤ R_n ≤ R_f, we assume that the energy can be transferred without loss. Let the maximum energy demand of user n be e_n^D. Then, the distance

$$ R_n^D = \sqrt{\frac{g E_S}{e_n^D}} \tag{3} $$

is a boundary distance: the demand of the user will not be satisfied for R_n > R_n^D and will be fully met otherwise.

We consider that the energy transfer is performed in a spherical spatial area (or a circular area in 2-D space). The area is

centered at the energy gateway with radius R, where R is the longest distance at which a user can receive any energy. The area is divided into the following subareas.

- When user n is at a distance 0 ≤ R_n < max{R_f, R_n^D}, the amount of received energy is larger than the demand of the user, i.e., e_n^R > e_n^D, and thus, the energy demand of the user will be fully met.
- For max{R_f, R_n^D} ≤ R_n ≤ R, the amount of energy e_n^R received by the user will not fully satisfy the energy demand e_n^D.
- For R_n > R, the user cannot receive any energy since the received energy becomes too low. The user is defined to be in the energy outage zone of the energy gateway.

The distance R is treated as a cutoff distance. All the users located in the energy outage zone will not be considered valid users by the energy gateway.

After receiving the energy amount e_n^R, user n will inform the mobile energy gateway of this information together with the payment of the energy price. We assume that the user always reports truthful information. As assumed above, the users are geographically distributed following a Poisson spatial distribution. The probability density function of the distance l between user n (out of a maximum of N users) and the energy gateway (i.e., the origin) is expressed as follows [24]:

$$ f(n, l \mid N) = \frac{3}{R}\,\frac{B\!\left(n+\frac{2}{3},\, N-n+1\right)}{B(N-n+1,\, n)}\;\beta\!\left(\frac{l^3}{R^3};\; n+\frac{2}{3},\; N-n+1\right) \tag{4} $$

where B(a, b) = Γ(a)Γ(b)/Γ(a + b), and Γ(·) is the Gamma function. β(x; y, z) = x^{y−1}(1−x)^{z−1}/B(y, z) is the Beta density function [24].

The expected payment R(n, E_S) made by user n to the energy gateway is obtained as follows:

$$ R(n, E_S) = \int_0^{\tilde{R}} f(n, l \mid N)\, r\!\left(e_n^D\right) dl + \int_{\tilde{R}}^{R} f(n, l \mid N)\, r\!\left(g \frac{E_S}{l^2}\right) dl \tag{5} $$

where R̃ = min{R, max{R_f, R_n^D}}. The first term in (5) indicates the total payment of the users whose demands are all fully satisfied.
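As an illustration, the propagation model (1), the boundary distance (3), the distance density (4), and the payment integral (5) can be evaluated numerically. The sketch below uses assumed parameter values (conversion efficiency, antenna gains, wavelength) and an assumed linear tariff r(e) = e, none of which are specified in the paper; the B-ratio and Beta density in (4) are collapsed algebraically into a single expression.

```python
import math

# Illustrative (assumed) constants; the paper keeps these symbolic.
ZETA, G_T, G_R = 0.6, 8.0, 2.0      # RF-to-DC efficiency and antenna gains
LAM = 0.33                          # carrier wavelength in metres
G = ZETA * G_T * G_R * (LAM / (4 * math.pi)) ** 2   # lumped constant g in (1)

def received_energy(E_S, R_n, R_f):
    """Energy e_n^R received by user n: lossless inside the Fraunhofer
    distance R_f, Friis inverse-square decay (1) beyond it."""
    return E_S if R_n <= R_f else G * E_S / R_n ** 2

def boundary_distance(E_S, e_D):
    """Boundary distance R_n^D of (3): where e_n^R just equals the demand."""
    return math.sqrt(G * E_S / e_D)

def beta_fn(a, b):
    """Beta function B(a, b) via the Gamma function, as in (4)."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def f_dist(n, l, N, R):
    """Density (4) of the distance l to the n-th nearest of N users; the
    B-ratio times the Beta density simplifies to
    (3 l^2 / R^3) * beta(l^3/R^3; n, N-n+1)."""
    x = (l / R) ** 3
    return (3 / R) * x ** (n - 1 / 3) * (1 - x) ** (N - n) / beta_fn(n, N - n + 1)

def expected_payment(n, N, R, E_S, e_D, R_f=0.0, steps=4000):
    """Midpoint-rule evaluation of (5), assuming a linear tariff r(e) = e."""
    R_t = min(R, max(R_f, boundary_distance(E_S, e_D)))   # R-tilde in (5)
    dl, total = R / steps, 0.0
    for i in range(steps):
        l = (i + 0.5) * dl
        # full demand inside R-tilde, partial (path-loss-limited) outside
        e = e_D if l < R_t else received_energy(E_S, l, R_f)
        total += f_dist(n, l, N, R) * e * dl
    return total
```

With a very large transmit energy E_S, the boundary distance exceeds R, every user's demand is met, and the integral reduces to r(e_n^D), matching (7) below.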
The second term in (5) indicates the total payment of the users whose demands are only partly satisfied, which is calculated from (1), due to path loss and the RF-to-DC energy conversion efficiency. The functions r(e_n^D) and r(g E_S/l²) indicate the energy price for the amount of received energy being e_n^D and e_n^R = g E_S/l², respectively.

The total average payment received by the energy gateway is obtained as

$$ R(N, E_S) = \sum_{n=1}^{N} R(n, E_S). \tag{6} $$

B. Uniform Payment for Energy Transfer

With a large enough energy E_S transferred by the energy gateway, the demands of all the users are satisfied, i.e., e_n^R ≥ e_n^D, ∀n ∈ {1, 2, …, N}. This is the case when R̃ = R, ∀n ∈ {1, 2, …, N}. The payment from user n to the energy gateway becomes

$$ R(n, E_S) = r\!\left(e_n^D\right) \tag{7} $$

since ∫₀^R f(n, l | N) dl = 1. Thus, the total average payment from all the users to the energy gateway is

$$ R(N, E_S) = \sum_{n=1}^{N} R(n, E_S) = \sum_{n=1}^{N} r\!\left(e_n^D\right) \tag{8} $$

which can be simplified as R(N, E_S) = N r(e^D) for the case in which all the users have the same energy demand, denoted by e^D.

IV. OPTIMIZATION PROBLEM FORMULATION

We formulate an MDP model to obtain the optimal energy management policy for the energy gateway. The MDP model consists of the system states, the transition matrices among the states, the actions, and the corresponding reward of the energy gateway.

A. State Space and Action Space

The state space of the mobile energy gateway is defined as follows:

$$ \mathcal{S} = \{ S = (L, E, N, P) \mid L \in \mathcal{L},\ E \in \mathcal{G},\ N \in \mathcal{N},\ P \in \mathcal{P} \} \tag{9} $$

where S is a composite state consisting of all the system state variables L, E, N, and P.

- There are, in total, L locations. The location state is denoted by L ∈ 𝓛 = {1, 2, …, L}, where 𝓛 is the set of all locations that the energy gateway can visit.
- Energy state E ∈ 𝓖 = {0, 1, …, Ē} is the current amount of energy in the energy gateway's battery. The capacity of the battery is Ē units of energy.
- User state N ∈ 𝓝 = {0, 1, …, N̄} denotes the number of users that the energy gateway can transfer energy to at the current location. We assume that the maximum number of users that the mobile energy gateway can transfer energy to is finite and denoted by N̄.
- Price state P is a composite state for the energy prices at all the chargers. P is denoted by P = (P_1, P_2, …, P_M), where P_i, i = 1, 2, …, M, is the energy price at location i with a charger, and M is the total number of locations with chargers. We assume that the price state P_i of each charger takes a value from a finite discrete set of energy prices, i.e., P_i ∈ 𝓟 = {ρ_1, ρ_2, …, ρ_K}, where 𝓟 is the set of all K possible prices. This assumption is widely adopted in the literature [25].

The action of the energy gateway is denoted by A ∈ 𝓐 = {0, 1, 2}, where 𝓐 is the action space. The action A = 1 indicates that the energy gateway requests charging from the charger at the current location. The action A = 2 indicates that the energy gateway transfers energy to the users. The action A = 0 indicates that the energy gateway is idle (i.e., doing nothing).
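The composite state space (9) is the Cartesian product of the four state variables, so its size grows as L · (Ē+1) · (N̄+1) · K^M. A minimal enumeration sketch, with toy dimensions that are our own assumptions rather than values from the paper:

```python
import itertools

# Toy dimensions (assumed): L = 3 locations, battery capacity E_bar = 5,
# at most N_bar = 2 users in range, M = 2 chargers with K = 2 price levels.
L_SET = range(1, 4)                 # location state L
G_SET = range(0, 6)                 # energy state E in {0, ..., E_bar}
N_SET = range(0, 3)                 # user state N in {0, ..., N_bar}
P_SET = list(itertools.product([1.0, 2.0], repeat=2))   # composite price P

# Composite state space (9): Cartesian product of all state variables.
STATES = list(itertools.product(L_SET, G_SET, N_SET, P_SET))
ACTIONS = (0, 1, 2)                 # idle, request charging, transfer energy

print(len(STATES))                  # 3 * 6 * 3 * 4 = 216
```

Even these small dimensions already yield 216 composite states, which illustrates why the transition matrices in the next subsection are assembled blockwise via Kronecker products rather than enumerated by hand.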

B. Transition Matrices of System States

The current state S = (L, E, N, P) transits to the next state S′ = (L′, E′, N′, P′). In the following, we derive the probability matrices for the state transitions.

1) Price State Transitions: The price state transition matrix for the charger at location i is expressed as follows:

$$ \mathbf{P}_i = \begin{bmatrix} \psi^{p_i}_{1,1} & \cdots & \psi^{p_i}_{1,K} \\ \vdots & \ddots & \vdots \\ \psi^{p_i}_{K,1} & \cdots & \psi^{p_i}_{K,K} \end{bmatrix} \tag{10} $$

where ψ^{p_i}_{k,k′} indicates the probability that the price state P_i of the charger at location i changes from P_i = k to P′_i = k′, where k, k′ ∈ 𝓟 indicate the current and next price states of the ith charger. Thus, the transition matrix for the composite price state P of all the chargers is obtained as

$$ \mathbf{W}^P = \mathbf{P}_1 \otimes \mathbf{P}_2 \otimes \cdots \otimes \mathbf{P}_M \tag{11} $$

where ⊗ is the Kronecker product. We denote the element of W^P with row p and column p′ by ψ^p_{p,p′}, which is the transition probability that the composite price state P changes from the current state p = (P_1, P_2, …, P_M) to the next state p′ = (P′_1, P′_2, …, P′_M).

2) Location State Transitions: We divide the set of locations 𝓛 into three subsets, i.e., 𝓛_B, 𝓛_S, and 𝓛_NC, based on the attributes of the locations, where 𝓛 = 𝓛_B ∪ 𝓛_S ∪ 𝓛_NC. The subset 𝓛_B includes all the locations with chargers. 𝓛_S includes all the locations where there are users but no chargers. 𝓛_NC is the subset where the energy gateway has contact with neither chargers nor users. We simply assume 𝓛_B ∩ 𝓛_S = ∅, 𝓛_B ∩ 𝓛_NC = ∅, and 𝓛_S ∩ 𝓛_NC = ∅. We denote the total number of locations in the subset 𝓛_B by L_B (i.e., L_B = |𝓛_B|). Likewise, L_S = |𝓛_S| and L_NC = |𝓛_NC|. Clearly, L = L_B + L_S + L_NC.
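The Kronecker construction in (11) can be sketched directly with NumPy. The per-charger matrices below are illustrative assumptions (M = 2 chargers, K = 2 price levels), not values from the paper:

```python
import numpy as np

# Per-charger price transition matrices of the form (10); probabilities assumed.
P1 = np.array([[0.7, 0.3],
               [0.4, 0.6]])
P2 = np.array([[0.9, 0.1],
               [0.2, 0.8]])

# Composite price transition matrix (11): Kronecker product over the chargers.
W_P = np.kron(P1, P2)

# Each row of W_P gives joint transition probabilities of (P_1, P_2); e.g.,
# W_P[0, 0] = P1[0, 0] * P2[0, 0] is the probability that both chargers keep
# their current (first) price level.
```

Because each P_i is row-stochastic, the Kronecker product W_P is also row-stochastic, with K^M rows indexing the joint price states.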
The transition of the location state L of the energy gateway can be expressed by the following transition matrix:

$$ \mathbf{W}^L = \begin{bmatrix} \mathbf{L}_{NC,NC} & \mathbf{L}_{NC,B} & \mathbf{L}_{NC,S} \\ \mathbf{L}_{B,NC} & \mathbf{L}_{B,B} & \mathbf{L}_{B,S} \\ \mathbf{L}_{S,NC} & \mathbf{L}_{S,B} & \mathbf{L}_{S,S} \end{bmatrix} \tag{12} $$

the elements of which denote the transition matrices among the three subsets. L_{a,a′}, a, a′ ∈ {NC, B, S}, contains the transition probabilities when the current location is in subset 𝓛_a and the next location is in subset 𝓛_{a′}. For example, L_{S,B} means that the current location of the energy gateway is in subset 𝓛_S and the next location is in subset 𝓛_B. Each element ψ^l_{m,m′} in L_{a,a′} denotes the transition probability from a current location m in subset a to the next location m′ in subset a′.

3) Energy State Transitions: Next, we derive the energy state transition matrix of the energy gateway. Energy state transitions can be divided into three cases.

- First, the energy state may increase. This occurs when the energy gateway receives E_B units of energy from a charger. Recall that Ē is the capacity of the energy gateway's battery. The (Ē+1) × (Ē+1)-dimensional transition matrix for this case is given in the following equation:

$$ \mathbf{E}^+(E, E' \mid L) = \begin{bmatrix} 1-\eta_L & \mathbf{0}_{1\times(E_B-1)} & \eta_L & & \\ & 1-\eta_L & \mathbf{0}_{1\times(E_B-1)} & \eta_L & \\ & & \ddots & & \vdots \\ & & & 1-\eta_L & \eta_L \\ & & & & 1 \end{bmatrix} \tag{13} $$

where each row of the matrix corresponds to the current energy state E, and each column corresponds to the energy state E′ of the next decision period. η_L is the efficiency of energy charging at location L, i.e., the probability of successful charging. 0_{1×(E_B−1)} is a row vector composed of E_B − 1 zeros.

- Second, the energy state may decrease. This can happen when the energy gateway transfers E_S units of energy to users. In this case, the energy state decreases by E_S, except that, when there are fewer than E_S units of energy in the battery, we assume that the energy gateway still transfers energy, so that the energy state decreases to E′ = 0.
The transition matrix is as shown in (14), which has the dimension (Ē+1) × (Ē+1), as follows:

$$ \mathbf{E}^-(E, E' \mid L) = \begin{bmatrix} 1 & & & \\ \vdots & & & \\ 1 & \mathbf{0}_{1\times(E_S-1)} & & \\ & 1 & \mathbf{0}_{1\times(E_S-1)} & \\ & & \ddots & \\ & & 1 & \mathbf{0}_{1\times(E_S-1)} \end{bmatrix} \tag{14} $$

where 0_{1×(E_S−1)} is a row vector composed of E_S − 1 zeros; that is, row E has its single nonzero entry 1 in column E′ = max{E − E_S, 0}.

- Third, the energy state can remain the same, for example, when the energy gateway does not receive or transfer any energy. In this case, we have the transition matrix E⁰ = I_{Ē+1}, where I_{Ē+1} is an (Ē+1) × (Ē+1) identity matrix.

The change of the energy state depends on the current location and on the action of the energy gateway. Let W^{L,E}((L, E), (L′, E′) | A) denote the transition matrix from the current composite state (L, E) to the next state (L′, E′), which has the dimension (Ē+1)L × (Ē+1)L. When action A = 0 is taken, the corresponding transition matrix is expressed as follows:

$$ \mathbf{W}^{L,E}\left((L,E),(L',E') \mid A=0\right) = \mathbf{W}^L \otimes \mathbf{E}^0 \tag{15} $$

where E⁰ indicates that the energy state E remains the same (i.e., E = E′), regardless of the location state of the energy gateway.
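The energy matrices of (13)-(15) can be sketched as follows. The cap-at-capacity behavior for charges that would overflow the battery is our reading of (13) and is marked as an assumption in the code:

```python
import numpy as np

def energy_up(E_bar, E_B, eta):
    """Charging matrix E^+ of (13): from state E, gain E_B units with
    probability eta (assumed capped at the capacity E_bar), else stay at E."""
    M = np.zeros((E_bar + 1, E_bar + 1))
    for e in range(E_bar + 1):
        e_next = min(e + E_B, E_bar)
        if e_next == e:
            M[e, e] = 1.0          # battery already full: stay w.p. 1
        else:
            M[e, e] = 1.0 - eta
            M[e, e_next] = eta
    return M

def energy_down(E_bar, E_S):
    """Discharging matrix E^- of (14): lose E_S units, flooring at empty."""
    M = np.zeros((E_bar + 1, E_bar + 1))
    for e in range(E_bar + 1):
        M[e, max(e - E_S, 0)] = 1.0
    return M

def idle_transition(W_L, E_bar):
    """Joint location/energy transition for A = 0, i.e., W^L (x) E^0 in (15)."""
    return np.kron(W_L, np.eye(E_bar + 1))
```

Both constructions produce row-stochastic matrices, so any Kronecker combination of them with a row-stochastic W^L is again a valid transition matrix.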

[Fig. 2. Energy charging action. (a) With a charger and (b) without any charger at the current location.]

When action A = 1 is taken, i.e., the energy gateway requests energy charging, the corresponding transition matrix is expressed as follows:

$$ \mathbf{W}^{L,E}\left((L,E),(L',E') \mid A=1\right) = \begin{bmatrix} \mathbf{L}_{NC,NC}\otimes\mathbf{E}^0 & \mathbf{L}_{NC,B}\otimes\mathbf{E}^0 & \mathbf{L}_{NC,S}\otimes\mathbf{E}^0 \\ \mathbf{L}_{B,NC}\otimes\mathbf{E}^+ & \mathbf{L}_{B,B}\otimes\mathbf{E}^+ & \mathbf{L}_{B,S}\otimes\mathbf{E}^+ \\ \mathbf{L}_{S,NC}\otimes\mathbf{E}^0 & \mathbf{L}_{S,B}\otimes\mathbf{E}^0 & \mathbf{L}_{S,S}\otimes\mathbf{E}^0 \end{bmatrix}. \tag{16} $$

In this case, when the energy gateway is not at any charger (i.e., the current location state L belongs to subset 𝓛_S or 𝓛_NC), the energy gateway cannot receive any energy. Consequently, E⁰ is applied to the rows corresponding to the location states L ∈ 𝓛_S and L ∈ 𝓛_NC. Otherwise, the energy gateway will receive energy, and the matrix E⁺ is applied for the location states L ∈ 𝓛_B (see Fig. 2).

When the action A = 2 is taken, the energy gateway transfers energy to users. The corresponding transition matrix is expressed as follows:

$$ \mathbf{W}^{L,E}\left((L,E),(L',E') \mid A=2\right) = \mathbf{W}^L \otimes \mathbf{E}^-. \tag{17} $$

In this case, the energy state of the battery decreases. Note that the energy transferring action can be taken at all the locations. For the locations without energy users (i.e., L ∈ 𝓛_NC ∪ 𝓛_B), the energy gateway can still transfer energy, although no users will pay for and receive the transferred energy, so the transferred energy could be wasted.

4) User State Transitions: Here, the user state is the number of users that the energy gateway can transfer energy to. The transitions of the user state depend on the location of the energy gateway. For the energy gateway moving from the current location L to the next location L′, the transition of the user state from N to N′ has the following three cases.

- The energy gateway moves to a location with users (i.e., L′ ∈ 𝓛_S). Thus, there will be some users that can receive energy from the energy gateway, and the user state is N′ ∈ 𝓝 = {0, 1, …, N̄}.
- However, when the energy gateway moves to a location with a charger (i.e., L′ ∈ 𝓛_B), there will be no users receiving energy from the energy gateway (e.g., the users may receive energy directly from the charger). Consequently, the user state of the energy gateway is N′ ≡ 0.
- When the energy gateway moves to a location with neither chargers nor users (i.e., L′ ∈ 𝓛_NC), similar to the previous case, the user state is N′ ≡ 0.

For ease of presentation, we define the following matrices:

$$ \mathbf{W}^{L,E}\big|_{L'\in\mathcal{L}_B\cup\mathcal{L}_{NC}} = \mathbf{W}^{L,E}\left((L,E),(L',E')\mid A\right)\left( \begin{bmatrix} \mathbf{I}_{(L_{NC}+L_B)\times(L_{NC}+L_B)} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{bmatrix}_{L\times L} \otimes \mathbf{I}_{(\bar{E}+1)\times(\bar{E}+1)} \right) \tag{18} $$

$$ \mathbf{W}^{L,E}\big|_{L'\in\mathcal{L}_S} = \mathbf{W}^{L,E}\left((L,E),(L',E')\mid A\right)\left( \begin{bmatrix} \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{I}_{L_S\times L_S} \end{bmatrix}_{L\times L} \otimes \mathbf{I}_{(\bar{E}+1)\times(\bar{E}+1)} \right) \tag{19} $$

where the matrix I_{Y×Y} is a Y × Y identity matrix. W^{L,E}|_{L′∈𝓛_S} is a matrix of dimensions [L(Ē+1)] × [L(Ē+1)]; its physical meaning is that it represents the part of the transitions in matrix W^{L,E}((L,E),(L′,E′) | A) where the next location L′ of the energy gateway is in subset 𝓛_S (i.e., with users), and it masks the rest with zeros. Similarly, W^{L,E}|_{L′∈𝓛_B∪𝓛_NC} represents the part of the transitions in W^{L,E}((L,E),(L′,E′) | A) where L′ belongs to subset 𝓛_B (i.e., with a charger) or subset 𝓛_NC.

For L′ ∈ 𝓛_B ∪ 𝓛_NC, we have N′ ≡ 0. By contrast, for L′ ∈ 𝓛_S, the transition matrix of the user state N considering states L and E is derived as follows:

$$ \mathbf{W}^{L,E,N}\big|_{L'\in\mathcal{L}_S} = \begin{bmatrix} \psi^u_{0,0} & \cdots & \psi^u_{0,\bar{N}} \\ \vdots & \ddots & \vdots \\ \psi^u_{\bar{N},0} & \cdots & \psi^u_{\bar{N},\bar{N}} \end{bmatrix} \otimes \mathbf{W}^{L,E}\big|_{L'\in\mathcal{L}_S} \tag{20} $$

where ψ^u_{n,n′} is the transition probability of the event that the user state changes from n ∈ 𝓝 (for the current state) to n′ ∈ 𝓝 (for the next decision period). Given that the users are spatially distributed following a Poisson distribution with spatial density α and that the maximum number of users is finite and known (i.e., N̄ < +∞), ψ^u_{n,n′} takes the form of a truncated Poisson distribution defined as follows:

$$ \psi^u_{n,n'} = \begin{cases} e^{-\pi R^2 \alpha} \dfrac{(\pi R^2 \alpha)^{n'}}{n'!}, & n \in \mathcal{N},\ n' \in \{0, 1, \ldots, \bar{N}-1\} \\[6pt] \displaystyle\sum_{k=\bar{N}}^{+\infty} e^{-\pi R^2 \alpha} \dfrac{(\pi R^2 \alpha)^{k}}{k!}, & n \in \mathcal{N},\ n' = \bar{N}. \end{cases} \tag{21} $$
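Since the truncated Poisson probabilities in (21) do not depend on the current user state n, a single function of the next state suffices. A minimal sketch, computing the tail mass for the boundary state as one minus the partial sum:

```python
import math

def user_state_prob(n_next, N_bar, R, alpha):
    """Truncated-Poisson probability (21) that the next user state is n_next,
    for users of spatial density alpha within the coverage radius R."""
    mu = math.pi * R ** 2 * alpha               # mean number of users in range
    if n_next < N_bar:
        return math.exp(-mu) * mu ** n_next / math.factorial(n_next)
    # the tail mass for k >= N_bar is lumped into the boundary state N_bar
    return 1.0 - sum(math.exp(-mu) * mu ** k / math.factorial(k)
                     for k in range(N_bar))
```

By construction the probabilities over n′ ∈ {0, …, N̄} sum to one, so every row of the ψ^u matrix in (20) is a valid distribution.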
(21) Note that other spatial distributions and user state transition 496 processes can be applied without affecting the optimization 497 model. W L,E,N L L S is the transition matrix of (L, E, N ) 498 when the next location L is in subset L S. Similarly, 499 W L,E,N L L B LNC is the transition matrix of (L, E, N ) 500

when the next location L' is in subset L_NC or L_B, which is expressed as follows:

W_{L,E,N}|_{L' ∈ L_B ∪ L_NC} = [ 1_{(N+1)×1}  0_{(N+1)×N} ] ⊗ W_{L,E}|_{L' ∈ L_B ∪ L_NC}   (22)

where 1_{(N+1)×1} is an (N+1)×1 matrix of 1's.

Then, the transition matrix from the current composite state (L, E, N) to the next composite state (L', E', N') is given as follows:

W_{L,E,N}((L,E,N),(L',E',N') | A) = W_{L,E,N}|_{L' ∈ L_S} + W_{L,E,N}|_{L' ∈ L_B ∪ L_NC}.   (23)

5) Overall Transition Matrix: The transition matrix of the entire state space is denoted by W(S, S' | A), where the current composite state is S = (L, E, N, P) and the next composite state is S' = (L', E', N', P'), given action A taken by the energy gateway, as follows:

W(S, S' | A) = W_{L,E,N}((L,E,N),(L',E',N') | A) ⊗ W_P(P, P').   (24)

V. SOLVING THE MARKOV DECISION PROCESS OPTIMIZATION MODEL

Here, we first define an immediate utility function of the energy gateway. Then, we present the MDP model. Next, we define the threshold structure of the optimal policy obtained from the MDP model.

A. Immediate Utility Function

An immediate utility function u(S, A) is defined as the reward of the energy gateway in the current decision period, given the composite state S = (L, E, N, P). Without loss of generality, we adopt the following form of u(S, A), which takes different values for different locations L and actions A of the energy gateway:

u(S | A) = { u_B(S),   L ∈ L_B and A = 1
            u_S(S),   L ∈ L_S and A = 2
            u_0(S),   otherwise   (25)

where u_B(S) denotes the reward obtained when the energy gateway is at the location with a charger (i.e., L ∈ L_B) and the charging action A = 1 is taken. u_S(S) denotes the reward when the energy gateway is at the location with users and the energy transferring action A = 2 is taken. u_0(S) is the reward of the energy gateway being idle.
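As a side illustration, the piecewise utility in (25) can be written directly in code, following the forms of u_B, u_S, and u_0 defined next in (26)-(28). This is only a sketch: the numeric parameters (E_B, E_S, the holding-cost slope, and the per-user payment) are placeholders, not the paper's settings.

```python
# Sketch of the piecewise immediate utility u(S|A) in (25).
# A state is (location label in {"NC", "B", "S"}, energy E, users N, price P).
# Parameter values below are illustrative assumptions, not the paper's.

E_B = 1          # energy units received per charging action (assumed)
E_S = 1          # energy units transferred per transfer action (assumed)
HOLD = 0.01      # slope of a linear holding cost F(E) = HOLD * E (assumed)
PAYMENT = 1.0    # uniform payment per served user (assumed)

def F(E):
    """Linear holding cost of the stored energy."""
    return HOLD * E

def u(location, E, N, P, A):
    """Immediate utility u(S|A): pay to charge at a charger, earn at users, else idle."""
    if location == "B" and A == 1:        # u_B(S) = -E_B * P - F(E), as in (26)
        return -E_B * P - F(E)
    if location == "S" and A == 2:        # u_S(S) = R(N, E_S) - F(E), as in (27)
        return PAYMENT * N - F(E)
    return -F(E)                          # u_0(S) = -F(E), as in (28)
```

For example, u("S", 3, 2, 1.0, 2) evaluates to 2 x 1.0 - 0.03 = 1.97 under these placeholder numbers.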
u_B(S), u_S(S), and u_0(S) are defined as follows:

u_B(S) = -E_B P(L) - F(E)   (26)

where E_B is the amount of energy transferred from the charger to the energy gateway, P(L) ∈ {P_1, P_2, ..., P_M} is the current price state at the location L, and F(E) is the holding cost of the current energy state. This cost, for example, could be due to the compensation for the self-discharging effect [26], [27]. Thus

u_S(S) = R(N, E_S) - F(E)   (27)

where E_S denotes the amount of energy transmitted from the energy gateway to the users, and R(N, E_S) is the function indicating the payment from all N users at the current location. This function is defined as in (6) and (8). Thus

u_0(S) = -F(E)   (28)

where only the holding cost of energy is applied.

Note that the immediate utility function u_0(S) is used for the following cases. First, the energy gateway takes the idle action A = 0 regardless of the current location. Second, the charging action A = 1 is taken when the current location has no charger (i.e., L ∉ L_B). Third, the energy transferring action is taken when the energy gateway is not at the location with users (i.e., L ∉ L_S).

B. Solving the MDP Optimization Model

The objective of the MDP model is to obtain an optimal energy management policy for the energy gateway. A policy φ(A|S) is defined as a mapping of state S to action A to be taken by the energy gateway. The optimal policy, denoted by φ*(A|S), aims to maximize the overall utility of the energy gateway.

The following Bellman equation [28] is applied to obtain the optimal policy:

U(S) = max_{φ(A|S)} H(S | A)   (29)
φ*(A|S) = arg max_{φ(A|S)} H(S | A)   (30)
H(S | A) = u(S | A) + γ Σ_{S'} W(S, S' | A) U(S')   (31)

where S = (L, E, N, P) is the current state. The Bellman equation can be numerically solved by the value iteration algorithm [29]. H(·) denotes the overall utility of the energy gateway, including the immediate utility of the current state as well as that of all the possible future states. U(S) is the achieved optimal overall utility. φ*(A|S) is the optimal policy. γ ∈ [0, 1) is a discount factor of possible future states. u(S | A) and Σ_{S'} W(S, S' | A) U(S') are the current immediate utility and the expected future utility of the energy gateway, respectively. W(S, S' | A) is the transition probability from the current state S to the next state S', which can be obtained from the transition matrix given in (24). The complexity of solving the Bellman equation by the value iteration algorithm is O(|A||S|²) [30], where |A| is the number of actions and |S| is the total number of states.

C. Threshold Structure of MDP Solutions

Next, we introduce the concept of a threshold policy and prove the existence of the threshold policy in the optimal energy management policy obtained from solving the proposed MDP model.

1) Concept of Threshold Policy: The optimal policy φ*(A|S) of the MDP model is defined to be a threshold policy,

if the following condition holds:

φ*(A | Θ, S_{-Θ}) = { A_1,       min Θ ≤ Θ ≤ Θ_{thr,1}
                     A_i,       Θ_{thr,i-1} ≤ Θ ≤ Θ_{thr,i},  i ∈ {2, 3, ..., |A|-1}
                     A_{|A|},   Θ_{thr,|A|-1} ≤ Θ ≤ max Θ   (32)

where Θ is a state or a composite state variable, and S_{-Θ} denotes the composite state of the other states except Θ. φ*(A | Θ, S_{-Θ}) is the optimal action solved by the Bellman equation in (29)-(31), given the current state S = (Θ, S_{-Θ}). Θ_{thr,i} is called the ith threshold of state Θ. In other words, the action A is monotonic as the state Θ increases.

The existence of the threshold policy can contribute to solving the MDP model efficiently. For example, in a system with a very large number of states |S|, the value iteration algorithm is not viable due to the unmanageable complexity, as analyzed in Section V-B. As shown in the definition of the threshold policy (32), with the existence of the threshold policy, once all the thresholds Θ_{thr,i} are known, the actions A_1, ..., A_{|A|} to take on all the system states are already decided. Algorithms for directly deciding on the thresholds deserve future research. However, a few approaches have been proposed to estimate thresholds in MDP solutions, such as reinforcement learning [31] and approximation algorithms [32].

To prove that the optimal policy φ*(A|S) in (30) is a threshold policy, the concept of supermodularity/submodularity [33] is applied.

Definition 1: For x ∈ X ⊆ R and y ∈ Y ⊆ R, a function f(x, y) ∈ R is supermodular in (x, y) if f(x_1, y_1) - f(x_1, y_2) ≥ f(x_2, y_1) - f(x_2, y_2) for all x_1, x_2 ∈ X and y_1, y_2 ∈ Y with x_1 > x_2 and y_1 > y_2. Similarly, f(x, y) is submodular in (x, y) if f(x_1, y_1) - f(x_1, y_2) ≤ f(x_2, y_1) - f(x_2, y_2) for all x_1, x_2 ∈ X and y_1, y_2 ∈ Y with x_1 > x_2 and y_1 > y_2.

The supermodularity/submodularity of f(x, y) is a sufficient condition for the nondecreasing/nonincreasing monotonicity of y* = arg max_y f(x, y) in x [28], [33].
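Definition 1 can be checked numerically on a small grid. The sketch below is an illustrative aside, not part of the paper: it verifies that f(x, y) = -(y - x)² is supermodular and that its arg max over y is then nondecreasing in x, matching the monotonicity statement above.

```python
# Numerical check of Definition 1 on a finite grid: f is supermodular iff
# f(x1,y1) - f(x1,y2) >= f(x2,y1) - f(x2,y2) for all x1 > x2, y1 > y2.
import itertools

def is_supermodular(f, X, Y):
    for (x1, x2), (y1, y2) in itertools.product(
            itertools.combinations(sorted(X, reverse=True), 2),
            itertools.combinations(sorted(Y, reverse=True), 2)):
        if f(x1, y1) - f(x1, y2) < f(x2, y1) - f(x2, y2):
            return False
    return True

X = range(5)
Y = range(5)
f = lambda x, y: -(y - x) ** 2      # cross-difference is 2(x1-x2)(y1-y2) > 0

assert is_supermodular(f, X, Y)

# Supermodularity implies arg max_y f(x, y) is nondecreasing in x
# (ties broken toward the largest maximizer).
argmax = [max(Y, key=lambda y: (f(x, y), y)) for x in X]
assert argmax == sorted(argmax)
```

Here arg max_y -(y - x)² is simply y = x, which increases with x; flipping the sign of f makes it submodular and the check fails, as expected.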
Specifically, in the proposed MDP model and the Bellman equation given in (29)-(31), for a given state variable θ (e.g., the energy state E, the price state P, or the user state N), the fact that H(S | A) is supermodular/submodular in (θ, A) indicates that φ*(A | S) is nondecreasing/nonincreasing in θ.

2) Threshold Policy: First, when the energy gateway is at the location with a charger, a threshold policy exists with respect to the energy state E.

We first remove the action of energy transfer A = 2 (i.e., the energy gateway never transfers energy when it is at the location with a charger, i.e., for the current system state S = (L, E, N, P) with L ∈ L_B). The proof is direct: A = 2 is always dominated by the idle action A = 0 in this case, since the following condition holds:

H(S | A = 0) ≥ H(S | A = 2),   ∀S with L ∈ L_B.   (33)

Thus, we have the following theorem when the energy gateway is at the location with a charger.

Theorem 1: Given any user state N, price state P, and location state L ∈ L_B, the optimal action policy of the energy gateway is a threshold policy in the energy state E if the holding cost F(E) is a linear function in E. The action of the energy gateway is A = 1 if E ≤ E_{threshold}^{1→0}, and A = 0 otherwise.

The threshold policy is binary in that only the actions A ∈ {0, 1} will be taken. The intuition is that when the energy gateway has less energy in its battery, it is more likely to receive energy from the charger. The proof of Theorem 1 is in Appendix A.

Similarly, when the energy gateway is at the location with users, the charging action A = 1 is eliminated. We have the following theorem for the threshold policy with respect to the energy state E.

Theorem 2: Given any user state N, price state P, and location state L ∈ L_S, the optimal action policy of the energy gateway is a threshold policy in the energy state E, given that the holding cost F(E) is a linear function in E. The action of the energy gateway is A = 0 when E ≤ E_{threshold}^{0→2}, and A = 2 otherwise.

Again, the intuition is that when the energy gateway has more energy in its battery, it is more likely to transfer energy to the users. The proof of Theorem 2 is similar to that of Theorem 1 and is therefore omitted for brevity.

Finally, for the energy gateway at the location without any charger or users (i.e., the location state is in subset L_NC), the idle action A = 0 is always taken, and a threshold policy with respect to the energy state E exists trivially. Therefore, the existence of a threshold policy with respect to the energy state E in the optimal policy is completely proven.

In a similar spirit, when the energy gateway is at the location with a charger, a threshold policy with respect to the energy price of a particular charger P_i, i ∈ {1, 2, ..., M}, exists, as stated in the following theorem.

Theorem 3: Given any user state N, price state P_{-i} = (P_1, ..., P_{i-1}, P_{i+1}, ..., P_M) (i.e., all price components except the ith), and location state L ∈ L_B, the optimal action policy of the energy gateway is a threshold policy in the ith price state component P_i.

The intuition is that if the energy price is higher, the energy gateway is less likely to receive energy from the charger, since charging yields a smaller reward.

When the energy gateway is at the location with users, we have the following theorem for a threshold policy with respect to the user state N.

Theorem 4: Given any price state P, energy state E, and location state L ∈ L_S, the optimal action policy of the energy gateway is a threshold policy in the user state N.

The intuition is that when more users can receive energy from the energy gateway, the energy gateway is more likely to take the action to transfer energy due to the higher reward.
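To make the threshold structure concrete, the sketch below solves a toy one-location instance of the Bellman equation (29)-(31) by value iteration and inspects the resulting policy over the energy state. All numbers (battery size, prices, user arrival probability, holding-cost slope) are illustrative assumptions rather than the paper's settings, and the single-variable state is a deliberate simplification of the composite state (L, E, N, P).

```python
# Value iteration for a toy single-state-variable energy MDP (illustrative only):
# state e = stored energy, actions A=0 (idle) and A=1 (charge one unit).
# Each period a user arrives w.p. P_USER and buys one unit for price R_SELL;
# charging costs C_BUY per unit; holding cost is linear with slope H_HOLD.
import numpy as np

E_MAX, GAMMA = 5, 0.9
C_BUY, R_SELL, P_USER, H_HOLD = 0.6, 1.0, 0.5, 0.05
N_S, N_A = E_MAX + 1, 2

# Build transition matrices W[a] and immediate utilities u[e, a].
W = np.zeros((N_A, N_S, N_S))
u = np.zeros((N_S, N_A))
for e in range(N_S):
    for a in (0, 1):
        e_mid = min(e + a, E_MAX)            # energy after a possible charge
        u[e, a] = -H_HOLD * e - C_BUY * a
        if e_mid > 0:                        # a sale can happen this period
            u[e, a] += P_USER * R_SELL
            W[a, e, e_mid - 1] += P_USER
            W[a, e, e_mid] += 1.0 - P_USER
        else:                                # empty battery, nothing to sell
            W[a, e, e_mid] = 1.0

# Value iteration on U(s) = max_a [ u(s,a) + gamma * sum_s' W(s,s'|a) U(s') ].
U = np.zeros(N_S)
for _ in range(1000):
    H = u + GAMMA * np.einsum("ast,t->sa", W, U)
    U_new = H.max(axis=1)
    if np.max(np.abs(U_new - U)) < 1e-10:
        break
    U = U_new
policy = H.argmax(axis=1)    # 0/1 action per energy level
```

With these numbers the computed policy charges at low energy levels and idles once the battery is sufficiently full, i.e., it exhibits the binary threshold in the energy state described by Theorem 1.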
The proofs of Theorems 3 and 4 are similar to that of Theorem 1 and are therefore omitted for brevity.

VI. NUMERICAL RESULTS

A. System Settings

1) System Parameters: Unless otherwise stated, we use the following parameter settings to evaluate and compare the performance of different schemes.

There are three locations in the network: Location L = 1 has neither a charger nor a user, i.e., L = 1 is in subset

L_NC. Location L = 2 belongs to L_B, where the charger exists. At location L = 3, the energy gateway can transfer energy to users, i.e., L = 3 ∈ L_S. The transition matrix W_L of the location state L is given in (34); it indicates that the energy gateway has the probability of 0.29 to be with the charger and the probability of 0.69 to be with users. The battery of the energy gateway has a capacity of five units of energy, i.e., E = 5. The charger provides the energy charging service at three different prices, denoted by P = {0.1, 1.0, 5.0}. The price state changes among the three prices uniformly, i.e., W_P = [1/3]_{3×3}, as in (11). The spatial density of users is α per unit of area, and the energy transferring range is set as R = 10 m. The energy gateway receives one unit of energy from the charger and transfers one unit of energy to users at a time, i.e., E_B = E_S = 1. A fixed probability of successfully receiving energy from the charger is used. For the immediate utility function given in (25), we assume that the cost of holding energy is negligible, i.e., F(E) ≡ 0. The utility function of charging is expressed as u_B(S) = -E_B P. For transferring energy to users, we consider the case where the energy demands of all users are met, as in (8). We set the uniform payment r(E_d) ≡ 1.0. Therefore, in (25), u_S(S) = R(N, E_S) = 1.0 × N, where 1.0 indicates the payment from a user to the energy gateway. The Bellman equation is solved with discount factor γ.

2) Baseline Schemes and Evaluation Criteria: We compare the proposed MDP-based scheme with four baseline energy management schemes, as follows.

1) Greedy scheme (GRDY): The energy gateway always takes the action that maximizes the immediate utility function u(S | A) of the current decision period (i.e., a myopic strategy), regardless of all past and future system states.
2) Location-aware scheme (LOCA): The energy gateway always takes the charging (A = 1), transferring (A = 2), and idle (A = 0) actions at the locations with a charger (i.e., subset L_B), with users (subset L_S), and with neither a charger nor users (subset L_NC), respectively.
3) Random scheme (RND): The action taken by the energy gateway is randomly selected from A = {0, 1, 2}, with probability 1/3 for each action.
4) Location-aware random scheme (LRND): The energy gateway takes actions A = 0 and A = 1 when it is at a location in subset L_B, and takes actions A = 0 and A = 2 when it is at a location in subset L_S. Finally, it takes action A = 0 at a location in subset L_NC.

We assume that the energy gateway is initialized at any state S ∈ S with probability p_ent = 1/|S|. By adopting different energy management schemes, we evaluate the expected utility of the energy gateway, the energy charging (or transferring) rate, the average energy level, and the successful energy transferring rate. Here, the successful energy transferring rate is the probability of the states at which the energy gateway receives and stores enough energy to be transferred.

3) Threshold Policy: Fig. 3 shows that an optimal energy management policy obtained from the proposed MDP-based scheme is a threshold policy. In particular, the threshold policy with respect to the price state P is shown in Fig. 3(a)-(d). Fig. 3(a) and (c) shows the policies for the location state L = 1, i.e., the energy gateway is at the location with a charger. In this case, the action taken by the energy gateway changes from A = 1 (i.e., charging) to A = 0 (i.e., idle) as the price state P increases. For example, in Fig. 3(a), at the energy state E = 2, action A = 1 is taken when P = 1 as well as when P = 2, and A changes to 0 when P increases to 3. However, when the energy gateway is at the location with users, i.e., L = 2, as shown in Fig. 3(b) and (d), no threshold policy exists with respect to P, since the energy gateway cannot request and receive energy from the charger. Consequently, the actions are not affected by the price state P.

In Fig. 3(d) and (f), when the energy gateway is at the location with users, i.e., L = 2, the threshold policy exists with respect to N. As N increases, the action of the energy gateway changes from A = 0 to A = 2 (i.e., an energy transferring action). This is due to the fact that the energy gateway gains higher utility by transferring energy when more users can receive energy. By contrast, as shown in Fig. 3(c) and (e), where the location state is fixed as L = 1 (i.e., at the location with a charger), there is no threshold policy with respect to N, since the number of users does not affect the charging decision of the energy gateway.

The energy state E affects the action of the energy gateway when it is at the location with either the charger or the users, as shown in Fig. 3(a), (b), (e), and (f). When the energy gateway is at the location L = 1 where the charger exists, as shown in Fig. 3(a) and (e), the action changes from A = 1 to A = 0 as E increases. This is because the energy gateway tends to request and receive energy when its battery (energy) level is low [e.g., E ≤ 3 in Fig. 3(e)]. The energy gateway stops charging when its energy level is high enough [e.g., E > 3 in Fig. 3(e)] to avoid the cost of charging. By contrast, when the energy gateway is at the location with users (i.e., L = 2), the energy transferring action A = 2 is preferred as E becomes larger [e.g., E ≥ 2 in Fig. 3(b)]. Specifically, the energy gateway is more likely to transfer energy to the users when it has sufficient energy in its battery.

B. Maximum Energy Capacity of Mobile Energy Gateway: Impacts on Optimality

We evaluate different performance measures and compare the proposed MDP-based scheme with the other baseline schemes. The results are shown in Figs. 4 and 5 as the maximum capacity E of the energy gateway's battery varies.

Fig. 4(a) shows the expected utilities of the energy gateway by adopting different energy management schemes. The

proposed MDP-based scheme achieves optimal performance in terms of utility compared with all the baseline schemes. Although the computational complexity of solving the MDP-based scheme is O(|A||S|²), which is larger than the O(1) of the baseline schemes, the expected utility obtained in Fig. 4(a) increases significantly compared with the baseline schemes. As the maximum capacity E of the energy gateway's battery increases, the utilities obtained from the MDP-based and baseline schemes increase. This is because when E becomes large, the energy gateway can store more energy to be transferred to users and, thus, gain more utility.

Fig. 4(b) and (c) shows the energy charging (A = 1) rate from the chargers and the energy transferring (A = 2) rate to users, respectively. Fig. 4(b) and (c) highlights that the energy charging and transferring rates first increase and then become stable at a certain level as the maximum capacity E increases. In this case, when E is relatively small, the increased capacity E allows the energy gateway to receive and store more energy (i.e., taking action A = 1). Thus, the energy gateway has more opportunity to transfer energy (i.e., A = 2) to the users. Consequently, both the energy charging/transferring rates increase. However, as E continues to increase, the cost (i.e., negative utility) of charging u_B(S) prevents the energy gateway from charging and, thus, curtails the energy transfer. Therefore, both the curves of the energy charging/transferring rates plateau when E is large enough, i.e., when E ≥ 4 in Fig. 4(b) and (c).

Fig. 3. Threshold in actions for different (a) price state P and energy state E (when L = 1 and N = 2), (b) price state P and energy state E (when L = 2 and N = 2), (c) price state P and user state N (when L = 1 and E = 2), (d) price state P and user state N (when L = 2 and E = 2), (e) user state N and energy state E (when L = 1 and P = 1), and (f) user state N and energy state E (when L = 2 and P = 1).

Fig. 4. Impacts of maximum energy capacity E on (a) the expected utility, (b) the energy charging rate, and (c) the energy transferring rate.
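As a final illustration, the truncated Poisson user-state distribution of (21) can be evaluated under this section's transfer range R = 10 m. The density α and the cap N below are assumed placeholders, since the transcription does not preserve their numeric values.

```python
# Truncated Poisson user-state probabilities psi^u_{n,n'} from (21):
# the number of users inside the transfer range R is Poisson with mean
# pi * R^2 * alpha, and all mass for counts >= N is lumped into n' = N.
import math

R = 10.0          # energy transferring range in meters (as in the settings)
ALPHA = 0.005     # spatial user density per unit area -- assumed placeholder
N_MAX = 3         # maximum trackable number of users -- assumed placeholder

mean = math.pi * R * R * ALPHA   # expected user count within the range

def psi(n_next):
    """Probability that the next user state is n_next (independent of n)."""
    if n_next < N_MAX:
        return math.exp(-mean) * mean ** n_next / math.factorial(n_next)
    # n_next == N_MAX: truncated tail, 1 - P(count < N_MAX), equivalent to
    # the infinite sum over k >= N in (21).
    return 1.0 - sum(
        math.exp(-mean) * mean ** k / math.factorial(k) for k in range(N_MAX))

dist = [psi(n) for n in range(N_MAX + 1)]
```

By construction each row of the user-state transition matrix in (20) built from these values is a valid probability distribution: the entries are nonnegative and sum to one.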


More information

EE 550: Notes on Markov chains, Travel Times, and Opportunistic Routing

EE 550: Notes on Markov chains, Travel Times, and Opportunistic Routing EE 550: Notes on Markov chains, Travel Times, and Opportunistic Routing Michael J. Neely University of Southern California http://www-bcf.usc.edu/ mjneely 1 Abstract This collection of notes provides a

More information

Performance Analysis of a Threshold-Based Relay Selection Algorithm in Wireless Networks

Performance Analysis of a Threshold-Based Relay Selection Algorithm in Wireless Networks Communications and Networ, 2010, 2, 87-92 doi:10.4236/cn.2010.22014 Published Online May 2010 (http://www.scirp.org/journal/cn Performance Analysis of a Threshold-Based Relay Selection Algorithm in Wireless

More information

A Learning Theoretic Approach to Energy Harvesting Communication System Optimization

A Learning Theoretic Approach to Energy Harvesting Communication System Optimization 1 A Learning Theoretic Approach to Energy Harvesting Communication System Optimization Pol Blasco, Deniz Gündüz and Mischa Dohler CTTC, Barcelona, Spain Emails:{pol.blasco, mischa.dohler}@cttc.es Imperial

More information

Stability Analysis of Slotted Aloha with Opportunistic RF Energy Harvesting

Stability Analysis of Slotted Aloha with Opportunistic RF Energy Harvesting 1 Stability Analysis of Slotted Aloha with Opportunistic RF Energy Harvesting Abdelrahman M.Ibrahim, Ozgur Ercetin, and Tamer ElBatt arxiv:151.6954v2 [cs.ni] 27 Jul 215 Abstract Energy harvesting (EH)

More information

Continuous-Model Communication Complexity with Application in Distributed Resource Allocation in Wireless Ad hoc Networks

Continuous-Model Communication Complexity with Application in Distributed Resource Allocation in Wireless Ad hoc Networks Continuous-Model Communication Complexity with Application in Distributed Resource Allocation in Wireless Ad hoc Networks Husheng Li 1 and Huaiyu Dai 2 1 Department of Electrical Engineering and Computer

More information

250 (headphones list price) (speaker set s list price) 14 5 apply ( = 14 5-off-60 store coupons) 60 (shopping cart coupon) = 720.

250 (headphones list price) (speaker set s list price) 14 5 apply ( = 14 5-off-60 store coupons) 60 (shopping cart coupon) = 720. The Alibaba Global Mathematics Competition (Hangzhou 08) consists of 3 problems. Each consists of 3 questions: a, b, and c. This document includes answers for your reference. It is important to note that

More information

Reinforcement Learning. Introduction

Reinforcement Learning. Introduction Reinforcement Learning Introduction Reinforcement Learning Agent interacts and learns from a stochastic environment Science of sequential decision making Many faces of reinforcement learning Optimal control

More information

Call Completion Probability in Heterogeneous Networks with Energy Harvesting Base Stations

Call Completion Probability in Heterogeneous Networks with Energy Harvesting Base Stations Call Completion Probability in Heterogeneous Networks with Energy Harvesting Base Stations Craig Wang, Salman Durrani, Jing Guo and Xiangyun (Sean) Zhou Research School of Engineering, The Australian National

More information

An Adaptive Clustering Method for Model-free Reinforcement Learning

An Adaptive Clustering Method for Model-free Reinforcement Learning An Adaptive Clustering Method for Model-free Reinforcement Learning Andreas Matt and Georg Regensburger Institute of Mathematics University of Innsbruck, Austria {andreas.matt, georg.regensburger}@uibk.ac.at

More information

Simplex Algorithm for Countable-state Discounted Markov Decision Processes

Simplex Algorithm for Countable-state Discounted Markov Decision Processes Simplex Algorithm for Countable-state Discounted Markov Decision Processes Ilbin Lee Marina A. Epelman H. Edwin Romeijn Robert L. Smith November 16, 2014 Abstract We consider discounted Markov Decision

More information

Preference Elicitation for Sequential Decision Problems

Preference Elicitation for Sequential Decision Problems Preference Elicitation for Sequential Decision Problems Kevin Regan University of Toronto Introduction 2 Motivation Focus: Computational approaches to sequential decision making under uncertainty These

More information

Lecture 7: Wireless Power Transfer

Lecture 7: Wireless Power Transfer Advanced Topics on Wireless Ad Hoc Networks Lecture 7: Wireless Power Transfer Sotiris Nikoletseas Professor CEID - ETY Course 2017-2018 Sotiris Nikoletseas, Professor Wireless Power Transfer 1 / 61 Wireless

More information

Energy Harvesting Multiple Access Channel with Peak Temperature Constraints

Energy Harvesting Multiple Access Channel with Peak Temperature Constraints Energy Harvesting Multiple Access Channel with Peak Temperature Constraints Abdulrahman Baknina, Omur Ozel 2, and Sennur Ulukus Department of Electrical and Computer Engineering, University of Maryland,

More information

Artificial Intelligence & Sequential Decision Problems

Artificial Intelligence & Sequential Decision Problems Artificial Intelligence & Sequential Decision Problems (CIV6540 - Machine Learning for Civil Engineers) Professor: James-A. Goulet Département des génies civil, géologique et des mines Chapter 15 Goulet

More information

Optimum Repartition of Transport Capacities in the Logistic System using Dynamic Programming

Optimum Repartition of Transport Capacities in the Logistic System using Dynamic Programming Theoretical and Applied Economics Volume XVIII (011), No. 8(561), pp. 17-0 Optimum Repartition of Transport Capacities in the Logistic System using Dynamic Programming Gheorghe BĂŞANU Bucharest Academy

More information

Energy Efficient Transmission Strategies for Body Sensor Networks with Energy Harvesting

Energy Efficient Transmission Strategies for Body Sensor Networks with Energy Harvesting Energy Efficient Transmission Strategies for Body Sensor Networks with Energy Harvesting Alireza Seyedi Department of ECE, University of Rochester, Rochester, NY USA e-mail: alireza@ece.rochester.edu Biplab

More information

Exploiting Mobility in Cache-Assisted D2D Networks: Performance Analysis and Optimization

Exploiting Mobility in Cache-Assisted D2D Networks: Performance Analysis and Optimization 1 Exploiting Mobility in Cache-Assisted D2D Networks: Performance Analysis and Optimization Rui Wang, Jun Zhang, Senior Member, IEEE, S.H. Song, Member, IEEE, and arxiv:1806.04069v1 [cs.it] 11 Jun 2018

More information

Revenue Maximization in a Cloud Federation

Revenue Maximization in a Cloud Federation Revenue Maximization in a Cloud Federation Makhlouf Hadji and Djamal Zeghlache September 14th, 2015 IRT SystemX/ Telecom SudParis Makhlouf Hadji Outline of the presentation 01 Introduction 02 03 04 05

More information

Final Exam December 12, 2017

Final Exam December 12, 2017 Introduction to Artificial Intelligence CSE 473, Autumn 2017 Dieter Fox Final Exam December 12, 2017 Directions This exam has 7 problems with 111 points shown in the table below, and you have 110 minutes

More information

HDR - A Hysteresis-Driven Routing Algorithm for Energy Harvesting Tag Networks

HDR - A Hysteresis-Driven Routing Algorithm for Energy Harvesting Tag Networks HDR - A Hysteresis-Driven Routing Algorithm for Energy Harvesting Tag Networks Adrian Segall arxiv:1512.06997v1 [cs.ni] 22 Dec 2015 March 12, 2018 Abstract The work contains a first attempt to treat the

More information

Lecture 3: Markov Decision Processes

Lecture 3: Markov Decision Processes Lecture 3: Markov Decision Processes Joseph Modayil 1 Markov Processes 2 Markov Reward Processes 3 Markov Decision Processes 4 Extensions to MDPs Markov Processes Introduction Introduction to MDPs Markov

More information

On the Approximate Linear Programming Approach for Network Revenue Management Problems

On the Approximate Linear Programming Approach for Network Revenue Management Problems On the Approximate Linear Programming Approach for Network Revenue Management Problems Chaoxu Tong School of Operations Research and Information Engineering, Cornell University, Ithaca, New York 14853,

More information

MS&E338 Reinforcement Learning Lecture 1 - April 2, Introduction

MS&E338 Reinforcement Learning Lecture 1 - April 2, Introduction MS&E338 Reinforcement Learning Lecture 1 - April 2, 2018 Introduction Lecturer: Ben Van Roy Scribe: Gabriel Maher 1 Reinforcement Learning Introduction In reinforcement learning (RL) we consider an agent

More information

Channel Probing in Communication Systems: Myopic Policies Are Not Always Optimal

Channel Probing in Communication Systems: Myopic Policies Are Not Always Optimal Channel Probing in Communication Systems: Myopic Policies Are Not Always Optimal Matthew Johnston, Eytan Modiano Laboratory for Information and Decision Systems Massachusetts Institute of Technology Cambridge,

More information

Performance of Wireless-Powered Sensor Transmission Considering Energy Cost of Sensing

Performance of Wireless-Powered Sensor Transmission Considering Energy Cost of Sensing Performance of Wireless-Powered Sensor Transmission Considering Energy Cost of Sensing Wanchun Liu, Xiangyun Zhou, Salman Durrani, Hani Mehrpouyan, Steven D. Blostein Research School of Engineering, College

More information

Optimal Power Allocation for Cognitive Radio under Primary User s Outage Loss Constraint

Optimal Power Allocation for Cognitive Radio under Primary User s Outage Loss Constraint This full text paper was peer reviewed at the direction of IEEE Communications Society subject matter experts for publication in the IEEE ICC 29 proceedings Optimal Power Allocation for Cognitive Radio

More information

Planning in Markov Decision Processes

Planning in Markov Decision Processes Carnegie Mellon School of Computer Science Deep Reinforcement Learning and Control Planning in Markov Decision Processes Lecture 3, CMU 10703 Katerina Fragkiadaki Markov Decision Process (MDP) A Markov

More information

Energy minimization based Resource Scheduling for Strict Delay Constrained Wireless Communications

Energy minimization based Resource Scheduling for Strict Delay Constrained Wireless Communications Energy minimization based Resource Scheduling for Strict Delay Constrained Wireless Communications Ibrahim Fawaz 1,2, Philippe Ciblat 2, and Mireille Sarkiss 1 1 LIST, CEA, Communicating Systems Laboratory,

More information

Application-Level Scheduling with Deadline Constraints

Application-Level Scheduling with Deadline Constraints Application-Level Scheduling with Deadline Constraints 1 Huasen Wu, Xiaojun Lin, Xin Liu, and Youguang Zhang School of Electronic and Information Engineering, Beihang University, Beijing 100191, China

More information

ABSTRACT WIRELESS COMMUNICATIONS. criterion. Therefore, it is imperative to design advanced transmission schemes to

ABSTRACT WIRELESS COMMUNICATIONS. criterion. Therefore, it is imperative to design advanced transmission schemes to ABSTRACT Title of dissertation: DELAY MINIMIZATION IN ENERGY CONSTRAINED WIRELESS COMMUNICATIONS Jing Yang, Doctor of Philosophy, 2010 Dissertation directed by: Professor Şennur Ulukuş Department of Electrical

More information

Admission control schemes to provide class-level QoS in multiservice networks q

Admission control schemes to provide class-level QoS in multiservice networks q Computer Networks 35 (2001) 307±326 www.elsevier.com/locate/comnet Admission control schemes to provide class-level QoS in multiservice networks q Suresh Kalyanasundaram a,1, Edwin K.P. Chong b, Ness B.

More information

Open Loop Optimal Control of Base Station Activation for Green Networks

Open Loop Optimal Control of Base Station Activation for Green Networks Open Loop Optimal Control of Base Station Activation for Green etworks Sreenath Ramanath, Veeraruna Kavitha,2 and Eitan Altman IRIA, Sophia-Antipolis, France, 2 Universite d Avignon, Avignon, France Abstract

More information

Full-Duplex Cooperative Cognitive Radio Networks with Wireless Energy Harvesting

Full-Duplex Cooperative Cognitive Radio Networks with Wireless Energy Harvesting Full-Duplex Cooperative Cognitive Radio Networks with Wireless Energy Harvesting Rui Zhang, He Chen, Phee Lep Yeoh, Yonghui Li, and Branka Vucetic School of Electrical and Information Engineering, University

More information

Location Determination Technologies for Sensor Networks

Location Determination Technologies for Sensor Networks Location Determination Technologies for Sensor Networks Moustafa Youssef University of Maryland at College Park UMBC Talk March, 2007 Motivation Location is important: Determining the location of an event

More information

16.410/413 Principles of Autonomy and Decision Making

16.410/413 Principles of Autonomy and Decision Making 16.410/413 Principles of Autonomy and Decision Making Lecture 23: Markov Decision Processes Policy Iteration Emilio Frazzoli Aeronautics and Astronautics Massachusetts Institute of Technology December

More information

Distributed power allocation for D2D communications underlaying/overlaying OFDMA cellular networks

Distributed power allocation for D2D communications underlaying/overlaying OFDMA cellular networks Distributed power allocation for D2D communications underlaying/overlaying OFDMA cellular networks Marco Moretti, Andrea Abrardo Dipartimento di Ingegneria dell Informazione, University of Pisa, Italy

More information

Internet Monetization

Internet Monetization Internet Monetization March May, 2013 Discrete time Finite A decision process (MDP) is reward process with decisions. It models an environment in which all states are and time is divided into stages. Definition

More information

Distributed Joint Offloading Decision and Resource Allocation for Multi-User Mobile Edge Computing: A Game Theory Approach

Distributed Joint Offloading Decision and Resource Allocation for Multi-User Mobile Edge Computing: A Game Theory Approach Distributed Joint Offloading Decision and Resource Allocation for Multi-User Mobile Edge Computing: A Game Theory Approach Ning Li, Student Member, IEEE, Jose-Fernan Martinez-Ortega, Gregorio Rubio Abstract-

More information

MULTIPLE CHOICE QUESTIONS DECISION SCIENCE

MULTIPLE CHOICE QUESTIONS DECISION SCIENCE MULTIPLE CHOICE QUESTIONS DECISION SCIENCE 1. Decision Science approach is a. Multi-disciplinary b. Scientific c. Intuitive 2. For analyzing a problem, decision-makers should study a. Its qualitative aspects

More information

The Simplex and Policy Iteration Methods are Strongly Polynomial for the Markov Decision Problem with Fixed Discount

The Simplex and Policy Iteration Methods are Strongly Polynomial for the Markov Decision Problem with Fixed Discount The Simplex and Policy Iteration Methods are Strongly Polynomial for the Markov Decision Problem with Fixed Discount Yinyu Ye Department of Management Science and Engineering and Institute of Computational

More information

Balancing and Control of a Freely-Swinging Pendulum Using a Model-Free Reinforcement Learning Algorithm

Balancing and Control of a Freely-Swinging Pendulum Using a Model-Free Reinforcement Learning Algorithm Balancing and Control of a Freely-Swinging Pendulum Using a Model-Free Reinforcement Learning Algorithm Michail G. Lagoudakis Department of Computer Science Duke University Durham, NC 2778 mgl@cs.duke.edu

More information

Stochastic Content-Centric Multicast Scheduling for Cache-Enabled Heterogeneous Cellular Networks

Stochastic Content-Centric Multicast Scheduling for Cache-Enabled Heterogeneous Cellular Networks 1 Stochastic Content-Centric Multicast Scheduling for Cache-Enabled Heterogeneous Cellular Networks Bo Zhou, Ying Cui, Member, IEEE, and Meixia Tao, Senior Member, IEEE Abstract Caching at small base stations

More information

Sequential Decisions

Sequential Decisions Sequential Decisions A Basic Theorem of (Bayesian) Expected Utility Theory: If you can postpone a terminal decision in order to observe, cost free, an experiment whose outcome might change your terminal

More information

Data Gathering and Personalized Broadcasting in Radio Grids with Interferences

Data Gathering and Personalized Broadcasting in Radio Grids with Interferences Data Gathering and Personalized Broadcasting in Radio Grids with Interferences Jean-Claude Bermond a,b,, Bi Li b,a,c, Nicolas Nisse b,a, Hervé Rivano d, Min-Li Yu e a Univ. Nice Sophia Antipolis, CNRS,

More information

HUB NETWORK DESIGN MODEL IN A COMPETITIVE ENVIRONMENT WITH FLOW THRESHOLD

HUB NETWORK DESIGN MODEL IN A COMPETITIVE ENVIRONMENT WITH FLOW THRESHOLD Journal of the Operations Research Society of Japan 2005, Vol. 48, No. 2, 158-171 2005 The Operations Research Society of Japan HUB NETWORK DESIGN MODEL IN A COMPETITIVE ENVIRONMENT WITH FLOW THRESHOLD

More information

Assortment Optimization under the Multinomial Logit Model with Nested Consideration Sets

Assortment Optimization under the Multinomial Logit Model with Nested Consideration Sets Assortment Optimization under the Multinomial Logit Model with Nested Consideration Sets Jacob Feldman School of Operations Research and Information Engineering, Cornell University, Ithaca, New York 14853,

More information

Online Power Control Optimization for Wireless Transmission with Energy Harvesting and Storage

Online Power Control Optimization for Wireless Transmission with Energy Harvesting and Storage Online Power Control Optimization for Wireless Transmission with Energy Harvesting and Storage Fatemeh Amirnavaei, Student Member, IEEE and Min Dong, Senior Member, IEEE arxiv:606.046v2 [cs.it] 26 Feb

More information

Spectrum Sharing in RF-Powered Cognitive Radio Networks using Game Theory

Spectrum Sharing in RF-Powered Cognitive Radio Networks using Game Theory Spectrum Sharing in RF-owered Cognitive Radio Networks using Theory Yuanye Ma, He Henry Chen, Zihuai Lin, Branka Vucetic, Xu Li The University of Sydney, Sydney, Australia, Email: yuanye.ma, he.chen, zihuai.lin,

More information

Wireless Transmission with Energy Harvesting and Storage. Fatemeh Amirnavaei

Wireless Transmission with Energy Harvesting and Storage. Fatemeh Amirnavaei Wireless Transmission with Energy Harvesting and Storage by Fatemeh Amirnavaei A Thesis Submitted in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy in The Faculty of Engineering

More information

AN INFORMATION THEORY APPROACH TO WIRELESS SENSOR NETWORK DESIGN

AN INFORMATION THEORY APPROACH TO WIRELESS SENSOR NETWORK DESIGN AN INFORMATION THEORY APPROACH TO WIRELESS SENSOR NETWORK DESIGN A Thesis Presented to The Academic Faculty by Bryan Larish In Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy

More information

Online Scheduling for Energy Harvesting Broadcast Channels with Finite Battery

Online Scheduling for Energy Harvesting Broadcast Channels with Finite Battery Online Scheduling for Energy Harvesting Broadcast Channels with Finite Battery Abdulrahman Baknina Sennur Ulukus Department of Electrical and Computer Engineering University of Maryland, College Park,

More information

Final Exam December 12, 2017

Final Exam December 12, 2017 Introduction to Artificial Intelligence CSE 473, Autumn 2017 Dieter Fox Final Exam December 12, 2017 Directions This exam has 7 problems with 111 points shown in the table below, and you have 110 minutes

More information

Linear Programming in Matrix Form

Linear Programming in Matrix Form Linear Programming in Matrix Form Appendix B We first introduce matrix concepts in linear programming by developing a variation of the simplex method called the revised simplex method. This algorithm,

More information

Changes in the Spatial Distribution of Mobile Source Emissions due to the Interactions between Land-use and Regional Transportation Systems

Changes in the Spatial Distribution of Mobile Source Emissions due to the Interactions between Land-use and Regional Transportation Systems Changes in the Spatial Distribution of Mobile Source Emissions due to the Interactions between Land-use and Regional Transportation Systems A Framework for Analysis Urban Transportation Center University

More information

Today s Outline. Recap: MDPs. Bellman Equations. Q-Value Iteration. Bellman Backup 5/7/2012. CSE 473: Artificial Intelligence Reinforcement Learning

Today s Outline. Recap: MDPs. Bellman Equations. Q-Value Iteration. Bellman Backup 5/7/2012. CSE 473: Artificial Intelligence Reinforcement Learning CSE 473: Artificial Intelligence Reinforcement Learning Dan Weld Today s Outline Reinforcement Learning Q-value iteration Q-learning Exploration / exploitation Linear function approximation Many slides

More information

Today s s Lecture. Applicability of Neural Networks. Back-propagation. Review of Neural Networks. Lecture 20: Learning -4. Markov-Decision Processes

Today s s Lecture. Applicability of Neural Networks. Back-propagation. Review of Neural Networks. Lecture 20: Learning -4. Markov-Decision Processes Today s s Lecture Lecture 20: Learning -4 Review of Neural Networks Markov-Decision Processes Victor Lesser CMPSCI 683 Fall 2004 Reinforcement learning 2 Back-propagation Applicability of Neural Networks

More information

Optimal Energy Management Strategies in Wireless Data and Energy Cooperative Communications

Optimal Energy Management Strategies in Wireless Data and Energy Cooperative Communications 1 Optimal Energy Management Strategies in Wireless Data and Energy Cooperative Communications Jun Zhou, Xiaodai Dong, Senior Member, IEEE arxiv:1801.09166v1 [eess.sp] 28 Jan 2018 and Wu-Sheng Lu, Fellow,

More information

Supplementary Material to Resolving Policy Conflicts in Multi-Carrier Cellular Access

Supplementary Material to Resolving Policy Conflicts in Multi-Carrier Cellular Access Supplementary Material to Resolving Policy Conflicts in Multi-Carrier Cellular Access Proofs to the theoretical results in paper: Resolving Policy Conflicts in Multi-Carrier Cellular Access ZENGWEN YUAN,

More information

Chapter 5 A Modified Scheduling Algorithm for The FIP Fieldbus System

Chapter 5 A Modified Scheduling Algorithm for The FIP Fieldbus System Chapter 5 A Modified Scheduling Algorithm for The FIP Fieldbus System As we stated before FIP is one of the fieldbus systems, these systems usually consist of many control loops that communicate and interact

More information

Distributed Reinforcement Learning Based MAC Protocols for Autonomous Cognitive Secondary Users

Distributed Reinforcement Learning Based MAC Protocols for Autonomous Cognitive Secondary Users Distributed Reinforcement Learning Based MAC Protocols for Autonomous Cognitive Secondary Users Mario Bkassiny and Sudharman K. Jayaweera Dept. of Electrical and Computer Engineering University of New

More information

1 Markov decision processes

1 Markov decision processes 2.997 Decision-Making in Large-Scale Systems February 4 MI, Spring 2004 Handout #1 Lecture Note 1 1 Markov decision processes In this class we will study discrete-time stochastic systems. We can describe

More information

Optimal Sleeping Mechanism for Multiple Servers with MMPP-Based Bursty Traffic Arrival

Optimal Sleeping Mechanism for Multiple Servers with MMPP-Based Bursty Traffic Arrival 1 Optimal Sleeping Mechanism for Multiple Servers with MMPP-Based Bursty Traffic Arrival Zhiyuan Jiang, Bhaskar Krishnamachari, Sheng Zhou, arxiv:1711.07912v1 [cs.it] 21 Nov 2017 Zhisheng Niu, Fellow,

More information