Energy Efficient Predictive Resource Allocation for VoD and Real-time Services

Changyang She and Chenyang Yang

arXiv:1707.01673v1 [cs.IT] 6 Jul 2017

Abstract

This paper studies how to exploit predicted information to maximize the energy efficiency (EE) of a system supporting hybrid services. To obtain an EE upper bound of predictive resource allocation, we jointly optimize resource allocation for video on-demand (VoD) and real-time (RT) services to maximize EE by exploiting perfect future large-scale channel gains. We find that the EE-optimal predictive resource allocation is a two-timescale policy, which makes a resource usage plan at the beginning of the prediction window and allocates resources in each time slot. Analysis shows that if there is only VoD service, predicting the large-scale channel gains and the distribution of the small-scale channel gains is necessary to achieve the EE upper bound. If there is only RT service, future large-scale channel gains cannot help improve EE. However, if there are both VoD and RT services, predicting the large-scale channel gains of both kinds of users is helpful. A low-complexity heuristic policy is proposed, which is robust to prediction errors. Simulation results show that the optimal policy is superior to the relevant counterparts, and the heuristic policy can achieve higher EE than the optimal policy when the large-scale channel gains are inaccurate.

Index Terms

Energy efficiency, predictive resource allocation, VoD services, real-time services

I. INTRODUCTION

Energy efficiency (EE) is a key performance metric for the fifth generation (5G) cellular networks [2, 3]. Inspired by the finding in [4] that user mobility is highly predictable, improving EE by exploiting predicted information has drawn significant attention as smart phones become popular and big data analytics flourishes [5, 6]. With the predicted trajectory of a mobile user [7], EE can be boosted by sending more data to the user when it is close to a base station (BS).
A part of this work was presented in IEEE/CIC ICCC 2015 [1]. Changyang She and Chenyang Yang are with the School of Electronics and Information Engineering, Beihang University, Beijing 100191, China (email: {cyshe,cyyang}@buaa.edu.cn).

5G networks are expected to support diverse services with different quality-of-service (QoS) provision [8]. As shown in [9], mobile video accounted for more than half of the overall data traffic in 2014, and the percentage is anticipated to become 72% by 2019. To satisfy the users' experience of video on-demand (VoD) services, the video quality and playback interruption are two important metrics [10]. In future cellular networks, there still exist many real-time (RT) services such as video conference and voice over IP that require stringent QoS [8], which is characterized by a delay bound and a small delay bound violation probability for packets [11].

Three kinds of services have been considered in existing predictive resource allocations.

The first kind is RT services (see [12, 13] and references therein). To improve the admission level QoS, the cell-level mobility prediction is exploited, which has been widely studied in existing literature (e.g., [14] and the references in [6]). By predicting the future handoff time and the cell that an RT user will access, the bandwidth at the next BS was reserved for the user [12], and a call admission control scheme was proposed in [13]. These works aim to trade off the handoff call dropping rate against the new call blocking rate, and implicitly assume that fixed bandwidth is reserved for each user to ensure the QoS.

The second kind is VoD services [15-21]. To improve the packet level QoS, the trajectory or rate prediction is exploited, which has been investigated in [7, 18] and the references in [6]. With the trajectory prediction and the help of a radio map, the future large-scale channel gains can be predicted. Based on the predicted large-scale channel gain or data rate, resource allocation among future time slots was studied in [15, 16], either to minimize video degradation or to maximize EE.
In [18], the average rates at different locations measured in the past days were used as the average rate prediction with the help of the user trajectory, with which the QoS improvement of video streaming was demonstrated. In [17], a practical two-timescale resource allocation was proposed. In the first timescale, time resource allocation is optimized based on the rate prediction, while in the second timescale, subcarriers are allocated based on the small-scale channel gains. In [19], future data rate was allocated to minimize the usage of resources, where an iterative allocation algorithm was proposed to account for the uncertainty in predicting user locations and the number of users in a cell. Considering that future data rate cannot be predicted without error, a robust predictive resource allocation was proposed in [20], where the prediction error on the future rate is modelled as Gaussian distributed. A closed-form relation between prediction errors on data rate and probabilistic QoS guarantees was obtained in [21]. These studies assume that the future

data rate is predictable, but the small-scale channel fading is not considered. However, the data rate of a wireless link highly depends on small-scale channel fading, and simply ignoring such fading in predictive resource allocation may degrade the QoS.

The third kind is the delay tolerant services such as file downloading [15, 22, 23]. By exploiting perfect instantaneous data rate prediction, time resource allocation among multiple users was optimized in [15], either to maximize the total throughput over a prediction window or to maximize the minimal throughput. By exploiting future large-scale channel gains, a proportional fair scheduling policy was optimized in [22]. With both future large-scale channel gains and the average arrival rate of RT traffic, a predictive resource allocation policy was proposed in [23] to save energy at the BSs, where RT services are regarded as background traffic with higher priority, and resources are reserved for RT services to ensure the QoS.

In all previous studies, the predictive resource allocation is designed for only a single kind of service. Yet a real-world cellular network needs to support different kinds of services. There is no doubt that jointly allocating resources to different services can improve EE, and a policy that reserves resources for RT services is inevitably conservative. Besides, all existing predictive resource allocation policies except those proposed in [17, 23] operate in one timescale and are designed at the beginning of a prediction window. In practice, the large-scale channel gains are predictable in the timescale of seconds, but the small-scale channel gains are hard to predict beyond the channel coherence time, which is in the timescale of milliseconds. As a consequence, the one-timescale policies can neither fully use the radio resources nor guarantee the QoS in time-varying fading channels. While the policies in [17, 23] are implemented in two timescales, the policies in the different timescales are designed separately.
In this paper, we optimize predictive resource allocation jointly for hybrid services and for the policies in two timescales. We consider an orthogonal frequency division multiple access (OFDMA) network serving two types of users: one type requests VoD services (delay tolerant service can be regarded as a special case of VoD services), and the other requests RT services. We study EE-optimal resource allocation that exploits both the large-scale channel gains predictable within a window and the small-scale channel gains that can be estimated in each time slot. We further identify which type of channel information needs to be predicted, which is of practical interest since channel prediction inevitably consumes computing resources (e.g., for trajectory prediction) and storage resources (e.g., for the radio map). The major contributions of this work are summarized as follows,

• To obtain the EE upper bound achieved by predictive resource allocation and to show which kinds of channel information must be predicted to achieve the upper bound, resource allocation is jointly optimized for both services and for two timescales. At the beginning of a prediction window, the average transmit power and bandwidth are assigned to each user based on future large-scale channel gains and the small-scale channel distribution. At the start of each time slot, instantaneous transmit power is allocated to the subcarriers of each user according to the assigned resources and the available small-scale channel gains.

• Our analysis shows that predicting small-scale channel gains for VoD users cannot improve EE. When there are only RT services, predicting large-scale channel gains cannot help improve EE. When there are both VoD and RT services, predicting large-scale channel gains for both types of users is necessary to achieve the EE upper bound. This is because by optimizing the resource allocation plan for RT users, we can predict how many resources they will occupy, which is useful for making the resource allocation plan for users with VoD services.

• Simulation results show that joint resource allocation for the two kinds of services can improve EE significantly, and decoupling the resource allocation in the two timescales leads to considerable EE loss.

• To provide a viable scheme for practical use, a heuristic policy is proposed, which has low complexity and is robust to prediction errors. Simulation results show that the heuristic policy performs close to the optimal policy if the prediction of large-scale channel gains is error-free, and outperforms the optimal policy when the prediction has large errors.

The rest of the paper is organized as follows. In section II, we introduce the system model and the QoS requirements of both services.
In section III, we optimize predictive resource allocation with perfect large-scale channel gains, first for a single-cell scenario and then extended to the multi-cell scenario. In section IV, we show which kind of channel information needs to be predicted for each type of users. In section V, we propose a heuristic policy robust to prediction uncertainty. In section VI, we provide simulation results, and in section VII, we conclude the paper.

II. SYSTEM MODEL AND QOS REQUIREMENTS

Consider the scenario where multiple mobile users travel through an OFDMA network, some requesting VoD services and the others requesting RT services. For notational simplicity, we first consider a single-cell scenario, and then extend to the multi-cell scenario at the end of the next section.

A. Transmission and Channel Models

Consider a frequency-selective block fading channel. Time is discretized into frames, each with duration $T$, and time slots, each with duration $\tau$. The durations are defined according to the channel variation, i.e., the variation of the large-scale channel gain (caused by path loss and shadowing) and the small-scale channel gain (caused by fast fading due to user mobility). Assume that: (1) the large-scale channel gain remains constant within each frame and may vary among frames, and (2) the small-scale channel gain remains constant within each time slot and is independent and identically distributed (i.i.d.) among different time slots and subcarriers in each frame. In typical scenarios, the large-scale channel gain varies on the order of seconds. $\tau$ is the channel coherence time, which is on the order of milliseconds [24]. With the predicted user location along the trajectory [4] and a measured radio map, the large-scale channel gain is predictable [5], but the small-scale channel gain is hard to predict beyond the channel coherence time. For notational simplicity, we assume $T = N_S \tau$. In practice, $T \gg \tau$ [24].
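The block fading assumptions above can be illustrated with a minimal sketch. It assumes unit-mean exponentially distributed small-scale gains (the Rayleigh fading example the paper adopts in Section III); the function names and the nested-list layout are hypothetical choices for illustration, not part of the paper.

```python
import random


def frame_of_slot(slot_index, slots_per_frame):
    """Map a 0-based global slot index to its 0-based frame index (T = N_S * tau)."""
    return slot_index // slots_per_frame


def expand_large_scale_gains(frame_gains, slots_per_frame):
    """Repeat each per-frame large-scale gain over the N_S slots of that frame,
    reflecting that the large-scale gain is constant within a frame."""
    return [alpha for alpha in frame_gains for _ in range(slots_per_frame)]


def sample_small_scale_gains(num_frames, slots_per_frame, num_subcarriers, seed=0):
    """Draw i.i.d. small-scale gains g[i][j][k] for frame i, slot j, subcarrier k.

    Assumption: unit-mean exponential gains (Rayleigh fading power).
    """
    rng = random.Random(seed)
    return [
        [
            [rng.expovariate(1.0) for _ in range(num_subcarriers)]
            for _ in range(slots_per_frame)
        ]
        for _ in range(num_frames)
    ]
```

At the start of a prediction window only the per-frame large-scale gains would be available to a planner; the small-scale samples become known slot by slot.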
TABLE I: LIST OF SYMBOLS

Symbol | Description
$\tau$ | duration of each time slot
$T$ | duration of each frame
$N_S$ | number of time slots in each frame
$N_L$ | number of frames in a prediction window
$M_D$ | number of VoD users
$M_R$ | number of RT users
$g^m_{ijk}$ | small-scale channel gain for the mth user in the jth time slot of the ith frame on the kth subcarrier
$\alpha^m_i$ | large-scale channel gain for the mth user in the ith frame
$P_{\max}$ | maximal transmit power
$K_{\max}$ | maximal number of subcarriers
$p^m_{ijk}$ | transmit power allocated to the mth user in the jth time slot of the ith frame on the kth subcarrier
$K^m_i$ | number of subcarriers allocated to the mth user in the ith frame
$B$ | subcarrier spacing
$\sigma_0^2$ | variance of the additive Gaussian noise
$\rho$ | power amplifier efficiency
$\phi$ | gap between capacity and achievable rate with practical modulation and coding schemes
$s^m_{ij}$ | instantaneous channel capacity for the mth user in the jth time slot of the ith frame
$R^m_i$ | amount of data played for the mth VoD user in the ith frame
$S^m_i$ | amount of data that can be transmitted to the mth user during the ith frame
$a^m_{ij}$ | arrival rate for the mth RT user in the jth time slot of the ith frame
$\bar{s}^m_i$ | average service rate for the mth user in the ith frame
$Q_{\max}$ | buffer size of each VoD user
$b^m_{ij}$ | departure rate for the mth RT user in the jth time slot of the ith frame
$Q^m_i$ | queue length of the mth VoD user at the beginning of the ith frame
$D^m_{\max}$ | delay bound of the mth RT user
$\varepsilon^m_D$ | maximal delay violation probability of the mth RT user
$\theta^m$ | QoS exponent of the mth user
$E^m_B(\theta^m)$ | effective bandwidth of the mth user
$E^m_{C,i}(\theta^m)$ | effective capacity of the mth user in the ith frame
$E_i$ | energy consumption of the BS in the ith frame
$P_c$ | circuit power consumption on each subcarrier
$P_0$ | the fixed circuit power consumption

Denote the number of users that have accessed the BS at the beginning of a prediction

window as $M_D + M_R$ (see footnote 1), where $M_D$ and $M_R$ are the numbers of VoD and RT users, respectively. The prediction window includes $N_L$ successive frames. For the mth user, $\alpha^m_i$ is the large-scale channel gain in the ith frame, and $g^m_{ijk}$ is the small-scale channel gain on the kth subcarrier in the jth time slot of the ith frame. At the beginning of the prediction window, we assume that $\alpha^m_i$, $i = 1,\ldots,N_L$, are perfectly predicted by the BS, but $g^m_{ijk}$, $k = 1,\ldots,K_{\max}$, $j = 1,\ldots,N_S$, $i = 1,\ldots,N_L$, are unknown for $m = 1,\ldots,M_D + M_R$, where $K_{\max}$ is the total number of subcarriers. During the transmission procedure, $g^m_{ijk}$ is available at both the mth user and the BS after channel estimation in the jth time slot of the ith frame. A list of symbols is given in Table I.

The achievable instantaneous data rate for the mth user can be expressed as follows [25],

$$s^m_{ij} = B \sum_{k=1}^{K^m_i} \log_2\left(1 + \frac{\alpha^m_i}{\phi\sigma_0^2}\, p^m_{ijk}\, g^m_{ijk}\right) \ \text{bits/s}, \tag{1}$$

where $B$ is the subcarrier spacing, $p^m_{ijk}$ is the transmit power allocated to the mth user on the kth subcarrier in the jth time slot of the ith frame, $\phi > 1$ captures the gap between capacity and achievable rate with practical modulation and coding schemes, $\sigma_0^2$ is the variance of the additive Gaussian noise, and $K^m_i$ is the number of subcarriers assigned to the mth user in the ith frame.

B. QoS Requirement for VoD Services

Since the key factor that determines the experience of a user requesting VoD service is playback interruption, we consider the queue in the buffer at each user. We assume that the video segments to be played within the prediction window are available at the BS [16, 26, 27]. The queueing model for VoD services is shown in Fig. 1. $R^m_i$ is the amount of data played at the mth user in the ith frame, which is given when a certain quality level of the video is chosen by the user (e.g., high definition video). The amount of data that can be transmitted to the mth user during the ith frame is given by $S^m_i = \tau \sum_{j=1}^{N_S} s^m_{ij}$.
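As a minimal sketch of how (1) and the per-frame data amount are evaluated for one user, the following helper functions compute the instantaneous rate from the powers and small-scale gains on the assigned subcarriers. The function and parameter names are hypothetical; only the formulas come from the text.

```python
import math


def instantaneous_rate(bandwidth_hz, alpha, phi, noise_var, powers, gains):
    """Eq. (1): B * sum_k log2(1 + alpha * p_k * g_k / (phi * sigma0^2)) in bits/s.

    powers and gains hold one entry per subcarrier assigned to the user.
    """
    if len(powers) != len(gains):
        raise ValueError("need one power and one gain per assigned subcarrier")
    snr_scale = alpha / (phi * noise_var)
    return bandwidth_hz * sum(
        math.log2(1.0 + snr_scale * p * g) for p, g in zip(powers, gains)
    )


def data_per_frame(tau, slot_rates):
    """Amount of data (bits) deliverable in one frame: tau * sum_j s_ij."""
    return tau * sum(slot_rates)
```

For instance, with two subcarriers whose received SNRs are 1 and 3, the rate is $B(\log_2 2 + \log_2 4) = 3B$.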
Footnote 1: The users arriving at the cell during a prediction window will wait to be served in the next prediction window, where some of the newly arrived users will not be admitted by the BS if their QoS cannot be ensured.

Denote the duration of each video segment as $T_{\mathrm{seg}}$, which is determined by the video sources and does not depend on the mobility of users. For notational simplicity and without loss of generality, we set $T_{\mathrm{seg}} = T$. Then, there are $N_L$ video segments in a prediction window. Assume that the buffer size is larger than the size of $N_L$ video segments, which is reasonable

for smart phones since storage devices are cheap nowadays. This assumption will be removed in Section V, where we design a policy that is aware of limited buffer size.

[Fig. 1. Queueing model for VoD services: server, BS, data $S^m_i$ delivered to the buffer at the user, data $R^m_i$ played at the display.]

To guarantee the requested video quality, each video segment should be delivered to the user before it is played. Then, the QoS requirement of the VoD services can be reflected by the following constraint [16],

$$Q^m_0 + \sum_{i=1}^{l} S^m_i \geq \sum_{i=1}^{l+1} R^m_i, \quad l = 1,\ldots,N_L,\ m = 1,\ldots,M_D, \tag{2}$$

where $Q^m_0 = R^m_1$ is the initial queue length and $R^m_{N_L+1}$ is the number of bits in the first video segment to be played in the next prediction window. In other words, the first video segment to be played in a prediction window has been conveyed to the user in the previous prediction window. Hence, no interruption occurs between adjacent prediction windows.

Since the number of time slots in each frame is large in practice, by channel coding among time slots, the data rate in a frame can approach the average data rate [28]. From (1), the average data rate for the mth user in the ith frame can be expressed as follows,

$$\bar{s}^m_i = B \sum_{k=1}^{K^m_i} \mathbb{E}_h\left[\log_2\left(1 + \frac{\alpha^m_i}{\phi\sigma_0^2}\, p^m_{ijk}\, g^m_{ijk}\right)\right] \ \text{bits/s}, \tag{3}$$

where the average is taken over small-scale channel fading. Then, we have $S^m_i = T \bar{s}^m_i$, and since $Q^m_0 = R^m_1$, the constraint in (2) can be equivalently written as

$$\sum_{i=1}^{l} \bar{s}^m_i \geq \frac{1}{T} \sum_{i=2}^{l+1} R^m_i, \quad l = 1,\ldots,N_L,\ m = 1,\ldots,M_D. \tag{4}$$

Remark 1. For delay tolerant service such as file downloading, the user demand can be characterized as transmitting a file with size $R^m$ in $N_L$ frames. Then, the required data rate can also be formulated as $\sum_{i=1}^{N_L} \bar{s}^m_i \geq R^m$, which is similar to (4). Therefore, the delay tolerant service can also be included in our framework.
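Constraint (4) is a set of cumulative inequalities, one per frame, which can be checked directly for a candidate sequence of average rates. The sketch below is illustrative only; the function name and the list-based interface are assumptions, not something defined in the paper.

```python
from itertools import accumulate


def vod_constraint_satisfied(avg_rates, demands, frame_duration):
    """Check constraint (4) for one VoD user.

    avg_rates: average service rates for frames 1..N_L (bits/s).
    demands:   amounts of data R_1..R_{N_L+1} (bits); R_1 is assumed already
               buffered, so only R_2..R_{N_L+1} must be delivered in this window.
    """
    if len(demands) != len(avg_rates) + 1:
        raise ValueError("demands must have one more entry than avg_rates")
    cumulative_rate = accumulate(avg_rates)       # sum_{i=1}^{l} s_i
    cumulative_demand = accumulate(demands[1:])   # sum_{i=2}^{l+1} R_i
    return all(
        frame_duration * served >= needed
        for served, needed in zip(cumulative_rate, cumulative_demand)
    )
```

With demands of 2 bits per frame and $T = 1$, the rate sequence (3, 1, 2) satisfies the check while (1, 3, 2) fails in the first frame, although both deliver the same total. This is why data may be sent ahead of playback when the channel is good, but never behind it.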

C. QoS Requirement for Real-time Services

Different from VoD services, the data of RT services are randomly generated by users rather than stored in the server, hence the data from RT users cannot be transmitted in advance. After the data randomly arrive at the BS, they wait in the queue at the BS for transmission, but cannot wait too long in the buffer if the QoS is to be satisfied. The queueing model for the mth user requesting an RT service is shown in Fig. 2, where $a^m_{ij}$ represents the data arrival rate in the jth time slot of the ith frame. The queueing delay in the buffer of the BS should satisfy the statistical QoS requirement, characterized by a delay bound $D^m_{\max}$ and a delay violation probability $\varepsilon^m_D$. If the queueing delay in the mth queue exceeds $D^m_{\max}$ with probability less than $\varepsilon^m_D$, then the QoS requirement of the mth user can be satisfied. For example, for VoIP the upper bound on $\varepsilon^m_D$ is 2% while $D^m_{\max}$ is 50 ms for the radio access network [11].

[Fig. 2. Queueing model for the mth user requesting an RT service: arrivals $a^m_{ij}$ enter the buffer at the BS and are served at rate $s^m_{ij}$ over the wireless channel to the user. There are $M_R$ queues for the $M_R$ RT users.]

Effective bandwidth and effective capacity are widely applied tools in designing resource allocation with statistical QoS requirements. For an uncorrelated random arrival process $\{a^m_{ij}, i = 1,\ldots,N_L, j = 1,\ldots,N_S\}$, the effective bandwidth can be expressed as [29]

$$E^m_B(\theta^m) = \frac{1}{\theta^m \tau} \ln \mathbb{E}\left[\exp\left(\theta^m \tau a^m_{ij}\right)\right] \ \text{(bits/s)}, \tag{5}$$

where $\theta^m$ is the QoS exponent.

For the RT services with short delay requirement, the duration of each frame is much longer than the delay bound, i.e., $T \gg D^m_{\max}$. The small-scale channel gains are i.i.d. in different time slots, and the power allocated in the jth time slot only depends on $g^m_{ijk}$. Consequently, $s^m_{ij}$, $j = 1,\ldots,N_S$, are also i.i.d. Then, the effective capacity in the ith frame for the mth user can be expressed as [30]

$$E^m_{C,i}(\theta^m) = -\frac{1}{\theta^m \tau} \ln \mathbb{E}\left[\exp\left(-\theta^m \tau s^m_{ij}\right)\right] \ \text{(bits/s)}. \tag{6}$$

Denote the steady state delay for the mth user as $D^m$. Then, the required QoS exponent

$\theta^m$ to guarantee $(D^m_{\max}, \varepsilon^m_D)$ can be obtained from [31] as

$$\Pr\{D^m > D^m_{\max}\} \approx \Pr\{D^m > 0\} \exp\left[-\theta^m E^m_B(\theta^m) D^m_{\max}\right] \leq \exp\left[-\theta^m E^m_B(\theta^m) D^m_{\max}\right] = \varepsilon^m_D, \quad i = 1,\ldots,N_L,$$

where the approximation is accurate when the delay bound is much longer than the duration of each time slot, which is true for mobile users requesting typical RT services like video conference and VoIP [11]. To guarantee the required $\theta^m$, the following constraint should be satisfied [32]

$$E^m_{C,i}(\theta^m) \geq E^m_B(\theta^m), \quad m = M_D+1,\ldots,M_D+M_R,\ i = 1,\ldots,N_L. \tag{7}$$

D. Power Consumption Model and EE Definition

The total energy consumed at the BS serving $M_D + M_R$ users in the prediction window (i.e., in $N_L$ frames) can be modeled as [33-35]

$$E_i = \frac{1}{\rho} \sum_{m=1}^{M_D+M_R} \sum_{j=1}^{N_S} \sum_{k=1}^{K^m_i} \tau p^m_{ijk} + T P_c \sum_{m=1}^{M_D+M_R} K^m_i + T P_0, \tag{8}$$

where $E_i$ is the energy consumption in the ith frame, $\rho \in (0,1]$ is the power amplifier efficiency, $P_c$ is the circuit power consumed on each subcarrier for baseband processing such as channel estimation, and $P_0$ is the fixed circuit power consumption of the BS.

According to the bits per Joule metric in [36], the EE of a system is the ratio of the amount of data transmitted to the energy consumed during a certain period. For predictive resource allocation, the period is the prediction window. However, since only the large-scale channel gains are available at the beginning of the prediction window, both the amount of data to be transmitted and the energy to be consumed in the upcoming $N_L$ frames are random variables, which depend on the small-scale channel gains. As a result, we cannot optimize predictive resource allocation to maximize the EE metric in [36] directly. Since the number of time slots in each frame is large, i.e., $N_S$ is large, maximizing the above EE metric is equivalent to maximizing the ratio of the average amount of transmitted data to the average energy consumption, where the average is taken over the small-scale channel gains. Hence, we define the EE as follows,

$$\eta \triangleq \frac{\mathbb{E}_h\left(\tau \sum_{m=1}^{M_D} \sum_{i=1}^{N_L} \sum_{j=1}^{N_S} s^m_{ij}\right) + \mathbb{E}_h\left(\tau \sum_{m=M_D+1}^{M_D+M_R} \sum_{i=1}^{N_L} \sum_{j=1}^{N_S} b^m_{ij}\right)}{\mathbb{E}_h\left(\sum_{i=1}^{N_L} E_i\right)}. \tag{9}$$

For VoD services, the amount of data transmitted equals the amount of data that the BS needs to

transmit. Thus, $\mathbb{E}_h\left(\tau \sum_{m=1}^{M_D} \sum_{i=1}^{N_L} \sum_{j=1}^{N_S} s^m_{ij}\right) = \sum_{m=1}^{M_D} \sum_{i=1}^{N_L} T \bar{s}^m_i = \sum_{m=1}^{M_D} \sum_{i=2}^{N_L+1} R^m_i$, which is determined at the beginning of the prediction window by the requested video level and network status. For RT services, when the queues are in steady state, the average departure rates equal the average arrival rates [29]. Thus, $\mathbb{E}_h\left(\tau \sum_{m=M_D+1}^{M_D+M_R} \sum_{i=1}^{N_L} \sum_{j=1}^{N_S} b^m_{ij}\right) = \mathbb{E}_h\left(\tau \sum_{m=M_D+1}^{M_D+M_R} \sum_{i=1}^{N_L} \sum_{j=1}^{N_S} a^m_{ij}\right)$, which is determined by the arrival processes. Therefore, the numerator of (9) does not depend on the resource allocation policy, and maximizing the EE in (9) is equivalent to minimizing the average energy consumption. Substituting (8) without the last term into the denominator of (9), maximizing the EE is equivalent to minimizing the following expression,

$$\frac{1}{\rho}\, \mathbb{E}_h\left(\sum_{i=1}^{N_L} \sum_{m=1}^{M_D+M_R} \sum_{j=1}^{N_S} \sum_{k=1}^{K^m_i} \tau p^m_{ijk}\right) + T P_c \sum_{i=1}^{N_L} \sum_{m=1}^{M_D+M_R} K^m_i. \tag{10}$$

III. ENERGY EFFICIENT PREDICTIVE RESOURCE ALLOCATION

In this section, we optimize predictive resource allocation for OFDMA systems supporting both VoD and RT services to show the potential of the predictive policy in improving EE. To exploit future large-scale channel gains and current small-scale channel gains in a joint manner, we formulate a functional extremum problem and obtain the global optimal solution. We first consider the single-cell scenario, and then extend to the multi-cell scenario.

A. Problem Formulation

At the beginning of the prediction window (i.e., the 1st time slot of the 1st frame), we cannot optimize $p^m_{ijk}$ to minimize (10) since future small-scale channel gains are unknown. Yet we can optimize the average transmit power $\bar{P}^m_i \triangleq \mathbb{E}_h\left(\sum_{k=1}^{K^m_i} p^m_{ijk}\right)$ and the number of subcarriers (i.e., bandwidth) $K^m_i$ assigned to the mth user in the ith frame, when the future large-scale channel gains and the distribution of the small-scale channel are known. We refer to $\{\bar{P}^m_i, K^m_i, m = 1,\ldots,M_D+M_R, i = 1,\ldots,N_L\}$ as the resource allocation plan.
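As a worked step, expression (10) can be rewritten in terms of the resource allocation plan alone. The sketch below assumes that the sums in (10) run over all $N_L$ frames of the window, and writes $\bar{P}^m_i = \mathbb{E}_h\left(\sum_{k} p^m_{ijk}\right)$ for the average total power of user m in any slot of frame i. Because the small-scale gains are i.i.d. across the $N_S$ slots of a frame, this average is the same for every slot j, and $N_S \tau = T$:

```latex
\begin{align*}
\frac{1}{\rho}\,\mathbb{E}_h\!\Big(\sum_{i=1}^{N_L}\sum_{m=1}^{M_D+M_R}\sum_{j=1}^{N_S}\sum_{k=1}^{K_i^m}\tau p_{ijk}^m\Big)
  + T P_c \sum_{i=1}^{N_L}\sum_{m=1}^{M_D+M_R} K_i^m
&= \frac{\tau}{\rho}\sum_{i=1}^{N_L}\sum_{m=1}^{M_D+M_R}\sum_{j=1}^{N_S}\bar{P}_i^m
  + T P_c \sum_{i=1}^{N_L}\sum_{m=1}^{M_D+M_R} K_i^m \\
&= T \sum_{i=1}^{N_L}\sum_{m=1}^{M_D+M_R}\Big(\frac{1}{\rho}\bar{P}_i^m + P_c K_i^m\Big).
\end{align*}
```

Since $T$ is a constant, minimizing (10) amounts to minimizing a linear function of the plan.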
At the beginning of each time slot, we can optimize $p^m_{ijk}$ based on $g^m_{ijk}$, $k = 1,\ldots,K^m_i$, and $\{\bar{P}^m_i, K^m_i\}$, since the small-scale channel gains are available at the BS after channel estimation. We denote the power allocation policies for the VoD services and the RT services as $p^m_{ijk} = f_D(\bar{P}^m_i, K^m_i, g^m_{ijk})$, $m = 1,\ldots,M_D$, and $p^m_{ijk} = f_R(\bar{P}^m_i, K^m_i, g^m_{ijk})$, $m = M_D+1,\ldots,M_D+M_R$,

respectively, where $i = 1,\ldots,N_L$, $j = 1,\ldots,N_S$ and $k = 1,\ldots,K^m_i$. The forms of the functions $f_D(\cdot)$ and $f_R(\cdot)$ differ for different power allocation policies.

The optimization of the resource allocation plan and that of the power allocation policies are closely coupled. In what follows, we formulate the joint optimization problem for the two-timescale policy.

Substituting the power allocation policy for VoD service $p^m_{ijk} = f_D(\bar{P}^m_i, K^m_i, g^m_{ijk})$ into (3), the average service rate in the ith frame can be expressed as follows,

$$\bar{s}^m_i = K^m_i \int_0^\infty B \log_2\left[1 + \frac{\alpha^m_i}{\phi\sigma_0^2} f_D(\bar{P}^m_i, K^m_i, g)\, g\right] e^{-g}\, dg, \tag{11}$$

where $m = 1,\ldots,M_D$. Substituting the power allocation policy for the RT service $p^m_{ijk} = f_R(\bar{P}^m_i, K^m_i, g^m_{ijk})$ into (1) and then into (6), the effective capacity in the ith frame can be obtained as

$$E^m_{C,i}(\theta^m) = -\frac{K^m_i}{\theta^m \tau} \ln\left\{\int_0^\infty \left[1 + \frac{\alpha^m_i}{\phi\sigma_0^2} f_R(\bar{P}^m_i, K^m_i, g)\, g\right]^{-\beta^m} e^{-g}\, dg\right\} \ \text{(bits/s)}, \tag{12}$$

where $m = M_D+1,\ldots,M_D+M_R$ and $\beta^m \triangleq \theta^m \tau B / \ln 2$. (11) and (12) are obtained with Rayleigh fading (see footnote 2), where $g^m_{ijk}$ are exponentially distributed with mean 1.

Further considering (4) and (7), the optimal resource allocation plan and power allocation policies that minimize the average energy consumption under the QoS constraints for both VoD and RT services can be obtained by solving the following problem,

$$\begin{aligned}
\min_{\substack{f_D(\cdot),\, f_R(\cdot),\, \bar{P}^m_i,\, K^m_i, \\ i=1,\ldots,N_L,\ m=1,\ldots,M_D+M_R}} \quad & E_{\mathrm{ave}} = \sum_{i=1}^{N_L} \sum_{m=1}^{M_D+M_R} \left(\frac{1}{\rho} \bar{P}^m_i + P_c K^m_i\right) && \text{(13)} \\
\text{s.t.} \quad & \sum_{i=1}^{l} K^m_i \int_0^\infty B \log_2\left[1 + \frac{\alpha^m_i}{\phi\sigma_0^2} f_D(\bar{P}^m_i, K^m_i, g)\, g\right] e^{-g}\, dg \geq \frac{1}{T} \sum_{i=2}^{l+1} R^m_i, \\
& \qquad m = 1,\ldots,M_D,\ l = 1,\ldots,N_L, && \text{(13a)} \\
& -\frac{K^m_i}{\theta^m \tau} \ln\left\{\int_0^\infty \left[1 + \frac{\alpha^m_i}{\phi\sigma_0^2} f_R(\bar{P}^m_i, K^m_i, g)\, g\right]^{-\beta^m} e^{-g}\, dg\right\} \geq E^m_B(\theta^m), \\
& \qquad m = M_D+1,\ldots,M_D+M_R,\ i = 1,\ldots,N_L, && \text{(13b)} \\
& \sum_{m=1}^{M_D+M_R} \bar{P}^m_i \leq P_{\mathrm{ave}}, \quad i = 1,\ldots,N_L, && \text{(13c)} \\
& \sum_{m=1}^{M_D+M_R} K^m_i \leq K_{\max}, \quad i = 1,\ldots,N_L, && \text{(13d)} \\
& \bar{P}^m_i \geq 0,\ K^m_i \geq 0, \quad m = 1,\ldots,M_D+M_R,\ i = 1,\ldots,N_L, && \text{(13e)}
\end{aligned}$$

where the objective function in (13) is obtained by substituting $\bar{P}^m_i = \mathbb{E}_h\left(\sum_{k=1}^{K^m_i} p^m_{ijk}\right)$ into (10) and ignoring the constant $T = N_S \tau$, the constraints in (13a) and (13b) are obtained by substituting (11) and (12) into (4) and (7), respectively, and (13c) and (13d) are the constraints on the average transmit power and the total number of subcarriers. Since only channel statistics are known at the start of the window, this problem only optimizes the average power and bandwidth in each frame for each user. With constraint (13d), we can always allocate each subcarrier to only one user.

Footnote 2: We take Rayleigh fading as an example in this work, but the methodology can be extended to other channels.

The constraints in (13a) and (13b) depend on the forms of the functions $f_D(\cdot)$ and $f_R(\cdot)$. This indicates that the resource allocation planning depends on the power allocation policies. In other words, the optimal value of the objective function in (13) is a function of $f_D(\cdot)$ and $f_R(\cdot)$. We denote it as $E^*_{\mathrm{ave}}(f_D, f_R)$. The optimal power allocation policies can be obtained by minimizing $E^*_{\mathrm{ave}}(f_D, f_R)$, and are denoted as $f^*_D(\cdot)$ and $f^*_R(\cdot)$.

It is worth noting that finding the optimal form of a function is a functional extremum problem, which cannot be solved by standard convex optimization tools. In the next subsection, we first find the forms of the functions $f_D(\cdot)$ and $f_R(\cdot)$ that minimize $E^*_{\mathrm{ave}}(f_D, f_R)$. Then, the optimal resource allocation plan, $\{\bar{P}^m_i, K^m_i, m = 1,\ldots,M_D+M_R, i = 1,\ldots,N_L\}$, can be found from problem (13).

Remark 2. The term inside the sum on the left hand side of (13a) (i.e., $\bar{s}^m_i$ in (11)) is the average rate in each frame. In many existing works [15-21], this average rate is assumed known by prediction. As a natural result, the predictive resource allocation is either only in one timescale (i.e., only makes the plan) [15, 16, 18-21], or decoupled into independently designed policies in the two timescales [17].
However, it is clear from problem (13) that the future average rate depends on $\{\bar{P}^m_i, K^m_i\}$ and $f_D(\cdot)$ even when the system only supports VoD services.

B. Optimal Power Allocation Policies

A policy that maximizes the average service rate $\bar{s}^m_i$ (or effective capacity $E^m_{C,i}(\theta^m)$) with given average transmit power $\bar{P}^m_i$ and number of subcarriers $K^m_i$ (i.e., bandwidth) can minimize $\bar{P}^m_i$

with given $\bar{s}^m_i$ (or $E^m_{C,i}(\theta^m)$) and $K^m_i$ [24, 37]. Inspired by this fact, we first find the power allocation policy that maximizes $\bar{s}^m_i$ (or $E^m_{C,i}(\theta^m)$) with given $\bar{P}^m_i$ and $K^m_i$, and then prove that the policy is optimal for minimizing the average energy consumption, i.e., for minimizing (13).

1) Power allocation policy for VoD services: As shown in [24], the policy that maximizes $\bar{s}^m_i$ with given $\bar{P}^m_i$ and $K^m_i$ is the water-filling policy, which is

$$f^w_D\left(\frac{\bar{P}^m_i}{K^m_i}, g\right) = \begin{cases} \dfrac{\phi\sigma_0^2}{\alpha^m_i}\left(\dfrac{1}{\nu^m_i} - \dfrac{1}{g}\right), & g \geq \nu^m_i, \\ 0, & g < \nu^m_i, \end{cases} \tag{14}$$

where $m = 1,\ldots,M_D$, $i = 1,\ldots,N_L$, and the water level $\nu^m_i$ can be obtained from

$$\frac{\phi\sigma_0^2}{\alpha^m_i} \int_{\nu^m_i}^\infty \left(\frac{1}{\nu^m_i} - \frac{1}{g}\right) e^{-g}\, dg = \frac{\bar{P}^m_i}{K^m_i}. \tag{15}$$

Note that the form of the function in (14) does not depend on the value of $g$.

2) Power allocation policy for RT services: As shown in [37], the optimal power allocation that maximizes $E^m_{C,i}(\theta^m)$ with given $\bar{P}^m_i$ and $K^m_i$ also follows a water-filling structure, but the water level is time-varying and the instantaneous power allocated to each subcarrier depends on the small-scale channel gains on all the subcarriers assigned to the user. For mathematical tractability, we consider the independent power allocation policy (see footnote 3) that maximizes $E^m_{C,i}(\theta^m)$ with given $\bar{P}^m_i$ and $K^m_i$, which can be expressed as follows [37],

$$f^w_R\left(\frac{\bar{P}^m_i}{K^m_i}, g\right) = \begin{cases} \dfrac{\phi\sigma_0^2}{\alpha^m_i}\left[\dfrac{1}{(\nu^m_i)^{\frac{1}{\beta^m+1}}\, g^{\frac{\beta^m}{\beta^m+1}}} - \dfrac{1}{g}\right], & g \geq \nu^m_i, \\ 0, & g < \nu^m_i, \end{cases} \tag{16}$$

where $m = M_D+1,\ldots,M_D+M_R$, $i = 1,\ldots,N_L$, $\beta^m = \theta^m \tau B / \ln 2$, and the water level $\nu^m_i$ over the Rayleigh fading channel can be obtained from

$$\frac{\phi\sigma_0^2}{\alpha^m_i} \int_{\nu^m_i}^\infty \left[\frac{1}{(\nu^m_i)^{\frac{1}{\beta^m+1}}\, g^{\frac{\beta^m}{\beta^m+1}}} - \frac{1}{g}\right] e^{-g}\, dg = \frac{\bar{P}^m_i}{K^m_i}. \tag{17}$$

Footnote 3: Independent power allocation policy means that the instantaneous transmit power on a certain subcarrier only depends on the small-scale channel gain on this subcarrier and is independent of the small-scale channel gains on the other subcarriers. The policy is near optimal when $\theta^m$ is small [37]. For VoIP service, the delay requirement is not very stringent. For video conference service, the average arrival rate is high. For both RT services, $\theta^m$ is small, and hence the policy is near optimal. In the sequel, we refer to the optimal independent power allocation policy as the optimal power allocation policy for simplicity.
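To make the VoD water-filling policy concrete, the sketch below evaluates the per-subcarrier power in (14) and finds the water level from (15) numerically for unit-mean Rayleigh fading. The midpoint integration, the bisection on a logarithmic scale, and all function names and default parameters are illustrative assumptions, not the authors' implementation. The bisection relies on the left hand side of (15) being decreasing in the water level.

```python
import math


def waterfilling_power(nu, g, alpha, phi, noise_var):
    """Eq. (14): power on a subcarrier with small-scale gain g and water level nu."""
    if g < nu:
        return 0.0
    return phi * noise_var / alpha * (1.0 / nu - 1.0 / g)


def mean_power(nu, alpha, phi, noise_var, span=50.0, steps=5000):
    """Left hand side of Eq. (15), via midpoint integration over [nu, nu + span].

    Assumption: g is exponentially distributed with mean 1, so the density is exp(-g).
    """
    h = span / steps
    total = 0.0
    for n in range(steps):
        g = nu + (n + 0.5) * h
        total += waterfilling_power(nu, g, alpha, phi, noise_var) * math.exp(-g)
    return total * h


def water_level(power_per_subcarrier, alpha, phi, noise_var,
                lo=1e-6, hi=50.0, iters=50):
    """Solve Eq. (15) for nu by bisection on a logarithmic scale."""
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if mean_power(mid, alpha, phi, noise_var) > power_per_subcarrier:
            lo = mid  # average power too high: the water level must increase
        else:
            hi = mid
    return math.sqrt(lo * hi)
```

The RT policy in (16) and (17) could be handled the same way by swapping the integrand; that variant is not shown here.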

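3) Optimality of the power allocation policies: The following proposition indicates that (14) is the optimal power allocation policy for VoD services and (16) is the optimal independent power allocation policy for RT services that maximizes the EE.

Proposition 1. For any power allocation policies $f_D(\bar{P}^m_i, K^m_i, g)$ and $f_R(\bar{P}^m_i, K^m_i, g)$,

$$E^*_{\mathrm{ave}}(f^w_D, f^w_R) \leq E^*_{\mathrm{ave}}(f_D, f_R). \tag{18}$$

The proposition is derived based on the result in [24]: the water-filling policy $f^w_D\left(\bar{P}^m_i / K^m_i, g\right)$ can minimize $\bar{P}^m_i$ with given $K^m_i$ and average service rate $\bar{s}^m_i$. See the proof in Appendix A.

Proposition 1 indicates that $f^*_D(\bar{P}^m_i, K^m_i, g) = f^w_D\left(\bar{P}^m_i / K^m_i, g\right)$ and $f^*_R(\bar{P}^m_i, K^m_i, g) = f^w_R\left(\bar{P}^m_i / K^m_i, g\right)$.

C. Optimal Resource Allocation Planning

Substituting the optimal power allocation policies in (14) and (16) into (13a) and (13b), the optimal resource allocation plan can be obtained from the following problem,

$$\begin{aligned}
\min_{\substack{\bar{P}^m_i,\, K^m_i, \\ i=1,\ldots,N_L,\ m=1,\ldots,M_D+M_R}} \quad & \sum_{i=1}^{N_L} \sum_{m=1}^{M_D+M_R} \left(\frac{1}{\rho} \bar{P}^m_i + P_c K^m_i\right) && \text{(19)} \\
\text{s.t.} \quad & \sum_{i=1}^{l} K^m_i F_D\left(\frac{\bar{P}^m_i}{K^m_i}\right) \geq \frac{1}{T} \sum_{i=2}^{l+1} R^m_i, \quad m = 1,\ldots,M_D,\ l = 1,\ldots,N_L, && \text{(19a)} \\
& -\frac{K^m_i}{\theta^m \tau} \ln\left[F_R\left(\frac{\bar{P}^m_i}{K^m_i}\right)\right] \geq E^m_B(\theta^m), \quad m = M_D+1,\ldots,M_D+M_R,\ i = 1,\ldots,N_L, && \text{(19b)} \\
& \text{(13c), (13d) and (13e)},
\end{aligned}$$

where

$$F_D\left(\frac{\bar{P}^m_i}{K^m_i}\right) = \int_0^\infty B \log_2\left[1 + \frac{\alpha^m_i}{\phi\sigma_0^2} f^w_D\left(\frac{\bar{P}^m_i}{K^m_i}, g\right) g\right] e^{-g}\, dg, \tag{20}$$

$$F_R\left(\frac{\bar{P}^m_i}{K^m_i}\right) = \int_0^\infty \left[1 + \frac{\alpha^m_i}{\phi\sigma_0^2} f^w_R\left(\frac{\bar{P}^m_i}{K^m_i}, g\right) g\right]^{-\beta^m} e^{-g}\, dg. \tag{21}$$

The following two properties indicate that the feasible region of problem (19) is a convex set.

Property 1. The left hand side of (19a) is jointly concave in $\bar{P}^m_i$ and $K^m_i$, $i = 1,\ldots,l$.

The proof of Property 1 is shown in [1].

Property 2. The left hand side of (19b) is jointly concave in $\bar{P}^m_i$ and $K^m_i$.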

Proof: See Appendix B.

Since the objective function in (19) is linear, problem (19) is a convex program, whose global optimal solution can be found numerically by the interior-point method if the problem is feasible [38].

Problem (19) could be infeasible. When it is infeasible, the video quality of VoD services has to be reduced, i.e., the values of $\{R^m_i, m = 1,\ldots,M_D\}$ in constraints (19a) need to be reduced to make the problem feasible. To minimize the quality deterioration, the video quality of VoD services should be optimized. Such a problem has been studied in existing literature, e.g., [39, 40]. In this work, we only consider a simple method: if problem (19) is infeasible, then the system reduces the video quality of all VoD users to a lower level to make it feasible.

The optimal predictive resource allocation operates in two timescales:

• At the beginning of each prediction window, the BS makes the resource allocation plan for both VoD and RT users (i.e., assigns average transmit power and bandwidth for the forthcoming frames), with the knowledge of the large-scale channel gains, the small-scale channel distribution, the QoS requirements of all users, and the statistics of the RT arrival processes.

• At the beginning of each time slot during the transmission procedure, the BS allocates transmit power to the subcarriers of the VoD users and of the RT users according to the resource allocation plan and with the knowledge of the small-scale channel gains.

D. Extension to Multicell Scenario

Now we consider a scenario where the $M_D + M_R$ users are served by $N_B$ BSs. We assume that the BSs can share the future large-scale channel gains of all the users in a prediction window with each other, which does not need high capacity and low latency backhaul links. We assume that the inter-cell interference can be treated as noise.
It is not hard to show that Proposition 1 can be extended to the multi-cell scenario, and hence the power allocation policies in (14) and (16) are optimal for VoD services and RT services, respectively. Denote $\bar{P}_i^{m,n}$ and $K_i^{m,n}$ as the average transmit power and the number of subcarriers assigned to the mth user in the ith frame by its accessed BS (i.e., the nth BS). Denote $\mathcal{M}_i^n$ as the set of indices of the users that are served by the nth BS in the ith frame. The difference between the single-cell scenario and the multi-cell scenario lies in the constraints on average transmit power and total number of subcarriers. Specifically, the power and bandwidth assigned to the users that

access the same BS in the multi-cell scenario should satisfy the following constraints

$$\sum_{m \in \mathcal{M}_i^n} \bar{P}_i^{m,n} \le P_{\mathrm{ave}}, \quad n = 1,\dots,N_B,\ i = 1,\dots,N_L, \tag{22}$$

$$\sum_{m \in \mathcal{M}_i^n} K_i^{m,n} \le K_{\max}, \quad n = 1,\dots,N_B,\ i = 1,\dots,N_L. \tag{23}$$

The user association $\{\mathcal{M}_i^n, n = 1,\dots,N_B, i = 1,\dots,N_L\}$ and the resource allocation plan can be jointly optimized, but the resulting problem is a mixed-integer optimization problem, which is much more challenging than problem (19). To save energy, it is reasonable to assume that each user accesses the BS with the highest large-scale channel gain. Then, $\{\mathcal{M}_i^n, n = 1,\dots,N_B, i = 1,\dots,N_L\}$ are known by the BSs at the beginning of each prediction window since the trajectories of users are predictable. Similar to problem (19), finding the optimal resource allocation plan in the multi-cell scenario is also a convex program, which can be solved numerically by the interior-point method.

IV. IMPACTS OF PREDICTED INFORMATION OF DIFFERENT KINDS OF SERVICES ON EE

With the prediction of user trajectory and the assistance of a radio map, the large-scale channel gains are predictable. On the other hand, the small-scale channel gains (i.e., channel state information (CSI)) are hard to predict beyond the horizon of the channel coherence time [41]. This fact naturally leads to the following questions: (1) To maximize the EE of a system, do we really need to know the future CSI? (2) To maximize EE, for which kind of services are the future large-scale channel gains beneficial?

In this section, we strive to answer the questions by separately considering VoD users and RT users. For notational simplicity, we consider the single-cell scenario.

A. Predicted Information of VoD Users

To study whether or not the future CSI of VoD users is necessary for improving EE, we assume that there is no RT user, i.e., $M_R = 0$. If the future CSI is available at the BS at the beginning of each prediction window and $M_R = 0$, then minimizing (10) is equivalent to minimizing the following objective function,

$$\frac{1}{\rho} \sum_{m=1}^{M_D} \sum_{j=1}^{N_S} \sum_{k=1}^{K_i^m} \tau p_{ijk}^m + T P^c \sum_{m=1}^{M_D} K_i^m, \tag{24}$$

where the transmit powers on different subcarriers in the ith frame $\{p_{ijk}^m, k = 1,\dots,K_i^m\}$ depend on the small-scale channel gains $\{g_{ijk}^m, k = 1,\dots,K_i^m\}$. Denote the total transmit power for the mth user in the jth time slot of the ith frame as $P_{ij}^m = \sum_{k=1}^{K_i^m} p_{ijk}^m$. Since the fast fading in different time slots of each frame is i.i.d., if the number of time slots in each frame is large, then the time-average transmit power converges to the ensemble-average transmit power, i.e., $\frac{1}{N_S} \sum_{j=1}^{N_S} P_{ij}^m \to \bar{P}_i^m$ when $N_S \to \infty$. Further considering that $T = N_S \tau$, minimizing (24) is equivalent to minimizing the following expression,

$$\frac{1}{\rho} \sum_{m=1}^{M_D} \bar{P}_i^m + P^c \sum_{m=1}^{M_D} K_i^m, \tag{25}$$

which is the same as the objective function in (13) when $M_R = 0$.

The optimal policy that minimizes the energy consumption of VoD services with perfect future CSI can be obtained by minimizing (25) under constraints (13a), (13c), (13d) and (13e). Since the optimization problem is the same as problem (13), the optimal power allocation policy, the optimal average transmit power and number of subcarriers, and the minimal total energy consumption are the same for the two problems. This suggests the following observation.

Observation 1: To maximize the EE of a system, predicting the CSI of VoD users is not beneficial, but the prediction of their large-scale channel gains is necessary.

Essentially, this is because the optimal power allocation policy only depends on the distribution of the small-scale channels (e.g., Rayleigh fading as we considered) [24].

B. Predicted Information of Real-Time Users

To study whether or not the future large-scale and small-scale channel gains of RT users are necessary for improving EE, we assume that there is no VoD user, i.e., $M_D = 0$. For RT services, $\tau \ll D_{\max} \ll T$. If the future large-scale channel gains are not available at the BS at the beginning of the prediction window but only available at the beginning of each frame, then the BS can assign the average transmit power and number of subcarriers to each RT user at the beginning of each frame.
The resources allocated to each RT user in the ith frame can be obtained from the following problem,

$$\min_{\bar{P}_i^m, K_i^m} \ \sum_{m=1}^{M_R} \left( \frac{1}{\rho} \bar{P}_i^m + P^c K_i^m \right) \tag{26}$$

s.t. (13b), (13c), (13d) and (13e).

According to the expressions in (13b), (13c), (13d) and (13e), the resource allocation in the ith frame does not depend on the resource allocation in the other frames, and hence problem (13) can be decomposed into $N_L$ independent problems of the form (26). Knowing the large-scale channel gains in the future frames cannot help improve the QoS (i.e., the delay bound $D_{\max}$ and the delay bound violation probability $\varepsilon_D$) or the EE of a system with only RT services, since most of the data should be transmitted within one frame (except the data that arrive at the BS at the end of a frame). This gives rise to another observation as follows.

Observation 2: To maximize the EE of a system with only RT services, the future large-scale channel gains of RT users need not be known at the beginning of the prediction window.

Remark 3. For VoD services, the requested data can be pre-buffered at the user terminal before playback. Since the duration of a prediction window exceeds the duration of each frame $T$, the BS can choose the frames with high large-scale channel gains to transmit data in advance to save energy. By contrast, for RT services, $\tau \ll D_{\max} \ll T$. As a result, the EE can only be improved by adjusting resources among the time slots within $D_{\max}$, and the future large-scale channel gains of RT users cannot help improve the EE of a system serving only RT services. Nevertheless, with the prediction of large-scale channel gains of RT users and the proposed joint optimization, the network resource usage status available for serving VoD users becomes predictable, which is beneficial in improving the EE of a network with both VoD and RT services.

V. A LOW COMPLEXITY POLICY ROBUST TO PREDICTION ERRORS

Solving problem (13) has high computational complexity, which consumes extra energy that may counteract the EE gain from the joint optimization. Besides, large-scale channel gains can never be predicted error-free.
To provide a viable scheme for practical use, we propose a heuristic policy in this section, which has low complexity and is robust to prediction errors. Recall that the basic idea of improving EE with predictive resource allocation is to transmit more data to a delay-tolerant user under good channel conditions, and to transmit less or even no data to the user under bad channel conditions [5]. In order to develop a low-complexity policy, we decouple the design of improving EE and user experience. To increase EE, we find a ruler to judge whether the large-scale channel gain in a frame is high or low. To improve user

experience, we decide how many video segments should be transmitted in a frame considering the queueing status at the VoD user. For RT service, the resource allocation is non-predictive.

To find a ruler (i.e., a threshold) robust to prediction errors, we resort to classical statistical theory. As shown in [42], the median of a set of samples, defined as the 50th percentile that separates the half of the data samples with larger values from the half with smaller values, is insensitive to outliers. Compared to the mean value, another widely used statistic, the median is less sensitive to outliers. For the problem at hand, outliers are the large-scale channel gains with large prediction errors. Hence, we adopt the median of the predicted large-scale channel gains, denoted as $\alpha_{\mathrm{med}}^m$, as the threshold. Then, at the beginning of the prediction window, the BSs only need to predict the median $\alpha_{\mathrm{med}}^m$.

To avoid stalling and buffer overflow for the VoD users with very limited buffer sizes, the number of segments transmitted in each frame needs to be controlled. The number depends on the traffic load of the network, and on the buffer size and channel condition of each VoD user. Since we use the median as the threshold, on average the BS transmits data to a VoD user 50% of the time during streaming. Then, it is reasonable to transmit two segments to a user with a good channel, if there is still room in the buffer.

Denote the maximal buffer size as $Q_{\max}$, and the queue length of the mth user at the beginning of the ith frame as $Q_i^m$. At the beginning of the ith frame, the large-scale channel gain of the mth user, $\alpha_i^m$, can be estimated at its associated BS. Denote $\tilde{i}$ as the index of the last video segment that has been transmitted before the ith frame. Then, the indices of the segments to be transmitted are $\{\tilde{i}+1, \dots\}$.

If $\alpha_i^m \ge \alpha_{\mathrm{med}}^m$, then the mth user is in good channel condition. Two segments will be transmitted in the ith frame if the buffer has enough residual space, i.e., $Q_i^m + R_{\tilde{i}+1}^m + R_{\tilde{i}+2}^m - R_i^m \le Q_{\max}$.
Then, the required average service rate with the heuristic policy is $s_i^{m,\mathrm{heu}} = \frac{1}{T} (R_{\tilde{i}+1}^m + R_{\tilde{i}+2}^m)$. One segment will be transmitted if $Q_i^m + R_{\tilde{i}+1}^m + R_{\tilde{i}+2}^m - R_i^m > Q_{\max}$ but $Q_i^m + R_{\tilde{i}+1}^m - R_i^m \le Q_{\max}$, in which case $s_i^{m,\mathrm{heu}} = \frac{1}{T} R_{\tilde{i}+1}^m$. If $Q_i^m + R_{\tilde{i}+1}^m - R_i^m > Q_{\max}$, then no video segment will be transmitted, and hence $s_i^{m,\mathrm{heu}} = 0$.

If $\alpha_i^m < \alpha_{\mathrm{med}}^m$, then the mth user is in bad channel condition. No data will be transmitted in the ith frame if $\tilde{i} \ge i+1$ (i.e., the video segment to be played in the (i+1)th frame has been transmitted). If $\tilde{i} = i$, we set $s_i^{m,\mathrm{heu}} = \frac{1}{T} R_{i+1}^m$ to avoid playback interruption, which means that the video segment to be played in the next frame will be transmitted in the ith frame.

Given the required average service rate $s_i^{m,\mathrm{heu}}$ of VoD users, the resource allocation plan in the

ith frame can be optimized from the following problem,

$$\min_{\substack{\bar{P}_i^m, K_i^m, \\ m=1,\dots,M_D+M_R}} \ \sum_{m=1}^{M_D+M_R} \left( \frac{1}{\rho} \bar{P}_i^m + P^c K_i^m \right) \tag{27}$$

$$\text{s.t.} \quad K_i^m F_i^D\!\left( \frac{\bar{P}_i^m}{K_i^m} \right) \ge s_i^{m,\mathrm{heu}}, \quad m = 1,\dots,M_D, \tag{27a}$$

$$-\frac{K_i^m}{\theta^m \tau} \ln \left[ F_i^R\!\left( \frac{\bar{P}_i^m}{K_i^m} \right) \right] \ge E_B(\theta^m), \quad m = M_D+1,\dots,M_D+M_R, \tag{27b}$$

(13c), (13d) and (13e).

Except that the value of $s_i^{m,\mathrm{heu}}$ depends on $\alpha_{\mathrm{med}}^m$, problem (27) does not depend on future information. Because the average service rate constraint in (27a) is a special case of the effective capacity constraint in (27b) with $\theta^m \to 0$ [30], any existing low-complexity algorithms in [43] and [44] can be applied to find the solution of this problem.

The complexity of the heuristic policy is almost the same as that of the non-predictive joint resource allocation policy for VoD and RT users. This is because the only difference between the two policies lies in the service rate requirement in constraint (27a). Without predicted information, the video segment to be played in the (i+1)th frame should be transmitted in the ith frame, and hence the required average service rate in the ith frame is $\frac{1}{T} R_{i+1}^m$ rather than $s_i^{m,\mathrm{heu}}$.

The heuristic predictive resource allocation policy can be implemented in three timescales. At the beginning of the prediction window, the median of the large-scale channel gains is predicted. At the beginning of each frame, the BS assigns average transmit power and bandwidth for the frame with the estimated large-scale channel gains, the QoS requirements of all users, and the statistics of the RT traffic arrival processes. At the beginning of each time slot, the BS allocates transmit power to different subcarriers for the VoD users and RT users, respectively, according to the plan and the estimated small-scale channel gains.

A possible way to predict the median of large-scale channel gains in the prediction window is as follows. In each BS, we can pre-store the median of the large-scale channel gains of all possible locations in the cell, say by drive test or crowd-sourcing. For each VoD user, we only need to predict the cells that it will access in the prediction window.
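The segment-selection rule of the heuristic policy for a single VoD user, described earlier in this section, can be sketched in Python as follows. This is only a minimal sketch under stated assumptions: the function name `heuristic_service_rate`, the dictionary of segment sizes, and all numerical values are hypothetical, and the case where the last transmitted segment lags behind playback (which the text does not discuss) is treated like the case $\tilde{i} = i$.

```python
from statistics import median


def heuristic_service_rate(alpha_i, alpha_med, queue_len, seg_size,
                           i, last_sent, q_max, frame_T):
    """Return (required average service rate, segments to send) in frame i.

    seg_size maps a segment index k to its size R_k; last_sent is the index
    of the last segment transmitted before frame i. Missing segments are
    treated as size 0 (an assumption of this sketch).
    """
    def size(k):
        return seg_size.get(k, 0.0)

    played = size(i)  # segment consumed by playback during frame i
    first, second = size(last_sent + 1), size(last_sent + 2)
    if alpha_i >= alpha_med:  # good channel: try two segments, then one
        if queue_len + first + second - played <= q_max:
            return (first + second) / frame_T, 2
        if queue_len + first - played <= q_max:
            return first / frame_T, 1
        return 0.0, 0
    if last_sent >= i + 1:  # bad channel, next segment already delivered
        return 0.0, 0
    # Bad channel and the next segment is missing: send it to avoid a stall.
    # The text covers last_sent == i; handling last_sent < i identically is
    # an assumption of this sketch.
    return size(i + 1) / frame_T, 1


# Threshold: median of hypothetical predicted gains containing one outlier.
predicted_gains = [1.0, 2.0, 3.0, 4.0, 100.0]
alpha_med = median(predicted_gains)
segments = {k: 2.0 for k in range(1, 11)}  # hypothetical 2-Mbit segments
rate, n_seg = heuristic_service_rate(4.0, alpha_med, 2.0, segments,
                                     i=3, last_sent=3, q_max=5.0, frame_T=1.0)
```

In this toy setting the outlier of 100 leaves the median threshold at 3.0, whereas the mean of the same samples would be 22.0, which illustrates why the median is the more robust threshold. With the good-channel inputs above, two segments fit into the buffer and the required rate is 4.0 Mbits/s.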
Then, from the median of large-scale channel gains in each cell the user accesses, we can predict the median in the

window. Applying the heuristic policy does not require constructing and storing a fine-grained radio map or predicting an accurate user trajectory. As a result, the storage and computing resources can be reduced significantly.

VI. SIMULATION RESULTS

In this section, we evaluate the EE of the proposed optimal policy and heuristic policy. We consider both scenarios with perfect and imperfect prediction of large-scale channel gains.

A. Simulation Setup

For VoD service, we use scalable video coding in [45] (each segment includes one base layer and five enhancement layers) to evaluate the performance of different policies. The bit rate of each layer can be found in [46]. The average streaming rate of each VoD service is around 2 Mbits/s. For RT service, the packets of each user arrive at the buffer of the BS according to a Poisson process with average rate $\lambda_a = 500$ packets/s. The size of each packet follows an exponential distribution with average $1/\lambda_u = 4$ kbits/packet. Hence, the average data arrival rate of RT service is 2 Mbits/s.

All the users move along a road from point A at (0,0) to point B, as shown in Fig. 3.

Fig. 3. Scenario for simulation (start point A at (0,0), end point B, BSs at (250,100) and (750,100), user path along the road).

To save transmit power, each user accesses its nearest BS. The distance between BSs is 500 m, and the minimal distance between the BSs and the road is 100 m. The path loss model is $35.3 + 37.6 \log_{10} D_j^m$ dB, where $D_j^m$ is the distance in meters between the mth user and its accessed BS in the jth time slot. The circuit powers of different components in [33] were measured in 2012. The scaling law in [47] is further applied to predict $P^c$ and $P_0$ in 2020, which are used in our simulation. The EE is the ratio of the amount of transmitted data to the amount of energy consumed by the BSs to serve the VoD and RT services in the prediction window. The prediction window has duration $N_L T = 60$ s. The total simulation

time is 6000 s. All the simulation parameters are listed in Table III. This setup will be used in the following unless otherwise specified.

TABLE II
LIST OF SIMULATION PARAMETERS [33, 47]

Parameter                                        Symbol        Value
Maximal transmit power                           $P_{\max}$    40.0 W
Number of available subcarriers                  $K_{\max}$    512
Bandwidth of each subcarrier                     $B$           15 kHz
Power amplifier efficiency                       $\rho$        38.8 %
Circuit power consumption for one subcarrier     $P^c$         72 mW/MHz
Fixed circuit power consumption                  $P_0$         136 mW/MHz
Single-sided noise spectral density              $N_0$         -173 dBm/Hz
Duration of each frame and each time slot        $T$, $\tau$   1 s and 5 ms

We compare the optimal predictive resource allocation policy with three baseline policies (see footnote 4).

Non-predictive resource allocation (legend "Baseline 1"): This baseline is a simple extension of the policy in [43], where only RT services are considered. The video segments to be played in the ith frame are transmitted in the (i-1)th frame (i.e., $s_{i-1}^{m,n} = \frac{1}{T} R_i^m$). The resource allocation is obtained by solving problem (27) in multi-cell scenarios, where $K_i^m$ and $\bar{P}_i^m$ are replaced by $K_i^{m,n}$ and $\bar{P}_i^{m,n}$, respectively, and the constraints in (13c) and (13d) are replaced by those in (22) and (23). The gain of the optimal policy over Baseline 1 comes from predicting large-scale channel gains for both VoD and RT users.

Predictive resource allocation only with future large-scale channel gains for VoD users (legend "Baseline 2"): This is a simple extension of the policy in [5], where only VoD service is considered. The unknown distances between the BS and RT users are set as the radius of the cell in all the frames, and then the resource allocation for VoD and RT users is jointly optimized. By considering the worst case, the QoS of the RT users can be guaranteed no matter where they are located. The gain of the optimal policy over Baseline 2 comes from predicting large-scale channel gains for RT users.

Decoupled resource allocation in the two timescales (legend "Baseline 3"): This is extended from the two-timescale policy in [17], where only VoD services are considered.
Footnote 4: We do not compare with the robust policies in [20, 21] for two reasons. First, there is no simple extension of these policies to a system with both VoD and RT services. Second, in this paper we consider scalable video coding with multiple data layers, whereas [20, 21] consider video coding with a single data layer.

The extended

policy optimizes bandwidth allocation at the beginning of the prediction window (equivalent to allocating transmission time in [17]), where the total transmit power is equally allocated to all subcarriers in order to predict the future average rate. In each time slot, the instantaneous transmit power is allocated to subcarriers with (14) and (16), i.e., allocating transmit power to subcarriers with good channels (similar to subcarrier selection in [17]). The gain of the optimal policy over Baseline 3 comes from joint resource allocation in two timescales.

B. Perfect Prediction of Large-scale Channel Gains

The users move at the same constant velocity of 20 m/s, and the future large-scale channel gains are available at the beginning of the prediction window.

Fig. 4. EE achieved by different policies (Optimal, Heuristic, Baseline 1, Baseline 2, Baseline 3). (a) EE (Mbits/J) vs. number of VoD users $M_D$, with $M_D + M_R = 5$. (b) EE (Mbits/J) vs. streaming rate of the VoD user (Mbits), where $M_D = M_R = 1$.

The EE achieved by different policies is illustrated in Fig. 4. In Fig. 4(a), the total number of users is fixed as $M_D + M_R = 5$, and the numbers of different kinds of users vary. In Fig. 4(b), the total data rate required by all users is fixed as $E(R_i^m)/T + \lambda_a/\lambda_u = 10$ Mbits, where the arrival data rate of the RT user (or the streaming rate of the VoD user) varies. Simulation results show that when there is no VoD user, the EE achieved by the optimal policy and by Baseline 1 are the same. The results are consistent with the analysis in Section IV-B, i.e., the predicted information cannot help improve the EE of the system if there are only RT users. When there are both VoD and RT users, the EE achieved by the optimal policy could be 50% to 100% higher than the EE achieved by the baselines. The achieved EE of Baseline 2 is lower than Baseline 1