Delay QoS Provisioning and Optimal Resource Allocation for Wireless Networks

Size: px

Start display at page:

Download "Delay QoS Provisioning and Optimal Resource Allocation for Wireless Networks"

Dina Reynolds
6 years ago
Views:

Syracuse University SURFACE Dissertations - ALL SURFACE June 2017 Delay QoS Provisioning and Optimal Resource Allocation for Wireless Networks Yi Li Syracuse University Follow this and additional

1 Syracuse University SURFACE Dissertations - ALL SURFACE June 2017 Delay QoS Provisioning and Optimal Resource Allocation for Wireless Networks Yi Li Syracuse University Follow this and additional works at: Part of the Engineering Commons Recommended Citation Li, Yi, "Delay QoS Provisioning and Optimal Resource Allocation for Wireless Networks" (2017). Dissertations - ALL This Dissertation is brought to you for free and open access by the SURFACE at SURFACE. It has been accepted for inclusion in Dissertations - ALL by an authorized administrator of SURFACE. For more information, please contact surface@syr.edu.

2 Abstract Recent years have witnessed a significant growth in wireless communication and networking due to the exponential growth in mobile applications and smart devices, fueling unprecedented increase in both mobile data traffic and energy demand. A- mong such data traffic, real-time data transmissions in wireless systems require certain quality of service (QoS) constraints e.g., in terms of delay, buffer overflow or packet drop/loss probabilities, so that acceptable performance levels can be guaranteed for the end-users, especially in delay sensitive scenarios, such as live video transmission, interactive video (e.g., teleconferencing), and mobile online gaming. With this motivation, statistical queuing constraints are considered in this thesis, imposed as limitations on the decay rate of buffer overflow probabilities. In particular, the throughput and energy efficiency of different types of wireless network models are analyzed under QoS constraints, and optimal resource allocation algorithms are proposed to maximize the throughput or minimize the delay. In the first part of the thesis, the throughput and energy efficiency analysis for hybrid automatic repeat request (HARQ) protocols are conducted under QoS constraints. Approximations are employed for small QoS exponent values in order to obtain closed-form expressions for the throughput and energy efficiency metrics. Also, the impact of random arrivals, deadline constraints, outage probability and QoS constraints are studied. For the same system setting, the throughput of HARQ system is also analyzed using a recurrence approach, which provides more accurate results for any value of the QoS exponent. Similarly, random arrival models and deadline constraints are considered, and these results are further extended to the finite-blocklength coding regime.

3 Next, cooperative relay networks are considered under QoS constraints. Specifically, the throughput performance in the two-hop relay channel, two-way relay channel, and multi-source multi-destination relay networks is analyzed. Finite-blocklength codes are considered for the two-hop relay channel, and optimization over the error probabilities is investigated. For the multi-source multi-destination relay network model, the throughput for both cases of with and without CSI at the transmitter sides is studied. When there is perfect CSI at the transmitter, transmission rates can be varied according to instantaneous channel conditions. When CSI is not available at the transmitter side, transmissions are performed at fixed rates, and decoding failures lead to retransmission requests via an ARQ protocol. Following the analysis of cooperative networks, the performance of both halfduplex and full-duplex operations is studied for the two-way multiple input multiple output (MIMO) system under QoS constraints. In full-duplex mode, the selfinterference inflicted on the reception of a user due to simultaneous transmissions from the same user is taken into account. In this setting, the system throughput is formulated by considering the sum of the effective capacities of the users in both half-duplex and full-duplex modes. The low signal to noise ratio (SNR) regime is considered and the optimal transmission/power-allocation strategies are characterized by identifying the optimal input covariance matrices. Next, mode selection and resource allocation for device-to-device (D2D) cellular networks are studied. As the starting point, transmission mode selection and resource allocation are analyzed for a time-division multiplexed (TDM) cellular network with one cellular user, one base station, and a pair of D2D users under rate and QoS constraints. For a more complicated setting with multiple cellular and D2D users, two joint mode selection and resource allocation algorithms are proposed. In the first algorithm, the channel allocation problem is formulated as a maximum-weight matching problem, which can be solved by employing the Hungarian algorithm. In

4 the second algorithm, the problem is divided into three subproblems, namely user partition, power allocation and channel assignment, and a novel three-step method is proposed by combining the algorithms designed for the three subproblems. In the final part of the thesis, resource allocation algorithms are investigated for content delivery over wireless networks. Three different systems are considered. Initially, a caching algorithm is designed, which minimizes the average delay of a single-cell network. The proposed algorithm is applicable in settings with very general popularity models, with no assumptions on how file popularity varies among different users, and this algorithm is further extended to a more general setting, in which the system parameters and the distributions of channel fading change over time. Next, for D2D cellular networks operating under deadline constraints, a scheduling algorithm is designed, which manages mode selection, channel allocation and power maximization with acceptable complexity. This proposed scheduling algorithm is designed based on the convex delay cost method for a D2D cellular network with deadline constraints in an OFDMA setting. Power optimization algorithms are proposed for all possible modes, based on our utility definition. Finally, a two-step intercell interference (ICI)- aware scheduling algorithm is proposed for cloud radio access networks (C-RANs), which performs user grouping and resource allocation with the goal of minimizing delay violation probability. A novel user grouping algorithm is developed for the user grouping step, which controls the interference among the users in the same group, and the channel assignment problem is formulated as a maximum-weight matching problem in the second step, which can be solved using standard algorithms in graph theory.

5 delay qos provisioning and optimal resource allocation for wireless networks By Yi Li B.E., University of Science and Technology of China, Hefei, China, 2011 DISSERTATION Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Electrical and Computer Engineering Syracuse University June 2017

7 Acknowledgements First, I would like to thank my advisors, Prof. Mustafa Cenk Gursoy and Prof. Senem Velipasalar, for their help and guidance during my entire Ph.D. study. In this period, I have gained much experience in research, and completed many studies in the area of wireless communication. I would like to thank all my Ph.D. defense committee members Prof. Lixin Shen, Prof. Biao Chen, Prof. Yingbin Liang, Prof. Makan Fardad, and Prof. Pramod Varshney for carefully reading my thesis. Finally, I would like to thank all my labmates and friends for their support during these years at Syracuse. I had really wonderful time studying and working with them. vi

8 Table of Contents List of Figures xiii List of Tables xviii 1 Introduction Quality of Service (QoS) Requirements Literature Review Throughput and Energy Efficiency of Hybrid Automatic Repeat Request (HARQ) Protocols under QoS Constraints Throughput of Cooperative Relay Networks under QoS Constraints Throughput and Mode Selection in Two-way Multiple-Input Multiple-Output (MIMO) Systems under Queuing Constraints Mode Selection and Resource Allocation for Device-to-Device (D2D) Cellular Networks Delay-Aware Scheduling Algorithms for Content Delivery over Wireless Networks Wireless D2D Caching Networks Intercell Interference (ICI) Control in Cloud Radio Access Network (C-RAN) Outline and Main Contributions vii

9 1.4 Bibliographic Note Preliminaries of Statistical Queuing Constraints Statistical Queuing Constraints Throughput of Single-hop Channels under Statistical Queuing Constraints Effective Capacity Average Arrival Rates of Random Arrival Sources under Statistical Queuing Constraints Throughput of Two-hop Channels under Statistical Queuing Constraints 31 3 Throughput and Energy Efficiency of Hybrid-ARQ under Statistical Queuing Constraints with Low QoS Exponents Throughput of HARQ under Statistical Queuing Constraints System Model Effective Capacity of the HARQ-IR Scheme Numerical Results Energy Efficiency of Hybrid-ARQ under Statistical Queuing Constraints System Model and Preliminaries Energy Efficiency of HARQ-CC scheme with Fixed Outage Probability Comparison of the Energy Efficiency for Different Arrival Models Numerical Results Throughput of Hybrid-ARQ under Statistical Queuing Constraints Using Recurrence Approach Throughput of Hybrid-ARQ Chase Combining with ON-OFF Markov Arrivals under Statistical Queuing Constraints Throughput of HARQ-CC Scheme with Queuing Constraints. 77 viii

10 4.1.2 Numerical Results Throughput of HARQ-IR with Finite Blocklength Codes and QoS Constraints System Model and Preliminaries Throughput of HARQ-IR with Queuing Constraints and Finite Blocklength Codes Numerical Results Throughput of Cooperative Relay Networks under Statistical Queuing Constraints Throughput of Two-Hop Wireless Channels with Queuing Constraints and Finite Blocklength Codes System Model and Preliminaries Throughput of Two-hop Relay Channels With Finite Blocklength Codes Throughput Optimization for Two-hop Relay Systems Numerical Results Throughput of Two-Way Relay Systems under Queueing Constraints System Model Throughput in Two-Way Relay Systems Numerical Results Throughput of Multi-Source Multi-Destination Relay Networks with Queuing Constraints System Model Throughput of the Two-Source Two-Destination Relay Network With Variable Transmission Rates Throughput of the Two-Source Two-Destination Relay Network With Fixed Transmission Rates ix

11 6 Throughput and Mode Selection in Two-way MIMO Systems under Queuing Constraints System Model Half-Duplex Mode Full-Duplex Mode Throughput for Two-way MIMO systems System Throughput for Half-Duplex TDM Mode System Throughput for Half-Duplex FDM Mode System Throughput for Full-Duplex Mode Mode Selection Mode Selection in the Low-SNR Regime Mode Selection in the High-SNR Regime Mode Selection at Different Transmission Distances Mode Selection and Resource Allocation Algorithms for D2D Cellular Networks Mode Selection of Device-to-Device Communication in Cellular Networks under Statistical Queuing Constraints System Model Throughput of Cellular Network with D2D Users Resource Allocation Numerical Results Joint Mode Selection and Resource Allocation for D2D Communications under Queuing Constraints System Model and Transmission Modes Channel Allocation via Maximum-Weight Matching Approach Numerical Results x

12 7.3 A Joint Mode Selection and Resource Allocation Algorithm for D2D Communications via Vertex Coloring System Model and Assumptions Joint Mode Selection and Resource Allocation Algorithm Numerical Results Resource Allocation for Content Delivery over Wireless Cellular Networks A Delay-Aware Caching Algorithm for Wireless D2D Caching Networks System Model and Problem Formulation Caching Algorithm Extensions and Future Work Numerical Results Scheduling in D2D Underlaid Cellular Networks with Deadline Constraints System Model and Transmission Modes Scheduling with Convex Delay Cost Method Utility Maximization and Scheduling Algorithm Numerical Results Intercell Interference-Aware Scheduling for Delay Sensitive Applications in C-RAN System Model and Preliminaries ICI-Aware Scheduling Algorithm for C-RAN Numerical Results Conclusion Summary Future Research Directions xi

13 9.2.1 Popularity Estimation and Scheduling for D2D Caching Systems270 A 271 A.1 Proof of Theorem A.2 Proof of Theorem A.3 Proof of Theorem A.4 Proof of Proposition A.5 Proof of Theorem A.6 Proof of Theorem A.7 Proof of Theorem A.8 Proof of Theorem A.9 Proof of Theorem A.10 Proof of Theorem A.11 Proof of Theorem A.12 Proof of Theorem A.13 Proof of Theorem A.14 Proof of Theorem xii

14 List of Figures 3.1 System Model Effective capacity C e vs. transmission rate R at SNR = 6 db and θ = 0.01 for both ARQ and HARQ-IR var(log 2 (1 + SNRz)) and R2 σ 2 µ 3 1 vs. transmission rate R. SNR = 6 db and θ = Mean µ 1 and the variance σ 2 of the transmission time T vs. transmission rate R. SNR = 6 db and θ = Effective capacity C e of HARQ-IR vs. transmission rate R at SNR = 6 db for different θ values Effective capacity C e vs. 1 µ 1 at SNR = 6 db for different θ values Effective capacity C e vs. transmission rate R for different hard deadline constraints. SNR = 6 db and θ = Structure of a packet transmission period in queue model I Logarithmic buffer overflow probability vs. buffer overflow threshold Throughput vs. energy per bit E b N Minimum energy per bit E b N 0 min and wideband slope S 0 vs. outage probability ϵ Minimum energy per bit E b N 0 min and wideband slope S 0 vs. deadline constraint M xiii

15 3.13 Logarithmic buffer overflow probability vs. buffer overflow threshold for the ON-OFF discrete-time Markov source Throughput vs. energy per bit E b N 0 for ON-OFF discrete-time Markov source with fixed outage probability ε = Throughput vs. energy per bit E b N 0 for ON-OFF Markov fluid source with fixed outage probability ε = Throughput vs. energy per bit E b N 0 for ON-OFF Markov fluid source and MMPS with fixed outage probability ε = Logarithmic buffer overflow probability vs. buffer overflow threshold Throughput vs. deadline constraint M Throughput vs. outage probability ε Throughput vs. P ON for the ON-OFF discrete-time Markov source Throughput vs. P ON for the ON-OFF Markov fluid source and MMPS Outage probability vs. R Logarithmic overflow probability vs. buffer overflow threshold Throughput vs. fixed transmission rate R Throughput vs. blocklength l The two-hop relay system with buffer constraints The maximum throughput vs. relay location parameter The optimal τ vs. relay location parameter The maximum throughput vs. blocklength m The two-way relay system with buffer constraints Maximum arrival rates R 1 and R 2 vs. d Optimal power fraction ρ and time fraction τ vs. d Maximum arrival rate R 1 vs. (SNR 1, SNR r ) when SNR 2 = 8, θ = 0.005, and d = xiv

16 5.9 Maximum arrival rate R 1 vs. (SNR 2, SNR r ) when SNR 1 = 3 and θ = Maximum arrival rate R 2 vs. (θ 2, θ r ) when θ 1 = 0.05 and SNR = Throughput Region achieved with time-sharing between decoding orders {1, 2} and {2, 1}. SNR r = 4, SNR 1 = SNR 2 = 2, θ = 0.005, and d = The relay network system with buffer constraints Logarithmic buffer overflow probability vs. buffer overflow threshold Logarithmic buffer overflow probability vs. buffer overflow threshold The sum rate vs. relay location parameter The sum rate vs. relay location parameter The sum rate vs. time allocation parameter τ The sum rate vs. decoding parameter δ The sum rate vs. decoding parameter δ System throughput region R 1 vs. R Logarithmic buffer overflow probability vs. buffer overflow threshold Logarithmic buffer overflow probability vs. buffer overflow threshold ON state probabilities in the broadcast phase vs. ρ with τ = Region of feasible (ρ, τ) pairs for stability at the relay buffer The arrival rate R 1 as a function of (ρ, τ) Maximum sum arrival rate vs. ρ System model for two-way MIMO channel Sum throughput vs. SNR Sum throughput vs. SNR Sum throughput vs. transmission distance System model with queuing constraints (Dashed lines represent interference only links.) xv

17 7.2 Mode selection result when D 1 is placed at (0, 2.5), and D 2 is placed at (0, 2.5). Cellular mode, dedicated mode, uplink reuse mode and downlink reuse mode are denoted by I, II, III and IV, respectively Mode selection result when D 1 is placed at (2, 1), and D 2 is placed at (2, 1). Cellular mode, dedicated mode, uplink reuse mode and downlink reuse mode are denoted by I, II, III and IV, respectively Mode selection result when D 1 is placed at (3.5, 0.5), and D 2 is placed at (3.5, 0.5). Cellular mode, dedicated mode, uplink reuse mode and downlink reuse mode are denoted by I, II, III and IV, respectively Sum rate vs. angle α System model in the D2D cellular mode System model in the reuse mode (interference links are denoted by the dashed lines) System model in the dedicated mode Structure of the throughput matrix Structure of the throughput matrix for the D2D reuse mode Average throughput vs. N Average number of unserved D2D pairs vs. N System model Comparison of sum rate Comparison of the number of served users Comparison of time consumption Sum rate vs. N d Number of users served by the system vs. N d System model of a D2D cellular network with caches Average delay η vs. Zipf exponent β Average delay η vs. cache size µ xvi

18 8.4 Average delay η vs. the number of users N System model in the D2D cellular mode System model in the reuse mode System model in the dedicated mode Delay violation probability vs. α Average Delay vs. α Delay violation probability vs. α Average Delay vs. α System model of C-RAN with ICI Delay violation probability vs. interference control parameter γ Throughput vs. interference control parameter γ Delay violation probability vs. power control parameter α Throughput vs. power control parameter α xvii

19 List of Tables 5.1 Table of notations for Section Algorithm Algorithm Algorithm Algorithm Algorithm Algorithm Algorithm Algorithm Algorithm Algorithm Algorithm Algorithm Comparison between our algorithm and SFR xviii

20 Chapter 1 Introduction 1.1 Quality of Service (QoS) Requirements Recent years have witnessed significant growth in wireless communication and networking due to the exponential growth in mobile applications and smart devices, fueling unprecedented increase in both mobile data traffic and energy demand. Among such data traffic, real-time data transmissions in wireless systems require certain QoS constraints e.g., in terms of delay, buffer overflow or packet drop/loss probabilities, so that acceptable performance levels can be guaranteed for the end-users, especially in delay sensitive scenarios, such as live video transmission, interactive video (e.g., teleconferencing), and mobile online gaming. With this motivation, we consider statistical queuing constraints in this thesis, imposed as limitations on the decay rate of buffer overflow probabilities. In [1], effective bandwidth was introduced as a measure of the system throughput under such statistical queuing or QoS constraints. More specifically, effective bandwidth has been defined as the minimum constant transmission rate required to support time-varying arrivals while the buffer overflow probability decays exponentially with increasing overflow threshold. In [2], effective bandwidths of departure processes with time-varying service rates were investigated, 1

21 and the theory of effective bandwidth was employed to analyze the performance of high speed networks in [3]. Later, effective capacity was defined in [4] as a dual concept to characterize the maximum constant arrival rates that can be supported by time-varying wireless transmission rates again under statistical queuing constraints. 1.2 Literature Review Throughput and Energy Efficiency of Hybrid Automatic Repeat Request (HARQ) Protocols under QoS Constraints In wireless communications, higher throughput and better energy efficiency are two key considerations and have become critical performance metrics due to the exponential growth in mobile applications and smart devices, fueling unprecedented increase in both mobile data traffic and energy demand. More specifically, with this growth coupled with the availability of only limited battery power for mobile devices, rising energy costs and growing concerns on environmental impact, the analysis of the energy efficiency and green operation in wireless systems have become increasingly more important in recent years. At the same time, throughput and energy efficiency are not the only considerations. In a wireless propagation environment in which noise, fading, path loss, multipath propagation and Doppler frequency shift are being experienced, reliability is equally important with strong implications on energy efficiency. Due to the challenges in wireless systems, many advanced techniques have been developed to address these concerns, and automatic repeat request (ARQ) and forward error correction (FEC) are two types of widely used schemes applied in order to ensure reliable delivery of data in such challenging wireless channel conditions. While ARQ facilitates the retransmission of erroneously received data packets with 2

22 feedback from the receiver to the transmitter, FEC schemes enable the correction of transmission errors without retransmission by adding redundancy to the data. In order to provide better error correction performance and lower implementation cost, ARQ and FEC schemes are combined to develop hybrid ARQ (HARQ) [5]. HARQ protocols have the ability to increase the probability of successful transmission and adapt the transmission rate to time-varying channel conditions with limited channel side information (CSI) at the transmitter [6]. In HARQ with chase combining (HARQ-CC) and HARQ with incremental redundancy (HARQ-IR) schemes, the corrupted packets are not deleted but rather stored and combined in the next transmission period. In particular, better adaptation to channel conditions and higher throughput can be achieved by employing HARQ-IR. A detailed study on the performance of HARQ-CC and HARQ-IR protocols was provided in [7], in which the throughput was characterized following an outage probability analysis. The throughput of HARQ protocols was studied from an information-theoretic perspective and it was shown that the throughput of HARQ-IR could approach the ergodic capacity for large transmission rates with only limited CSI. More recently, performance of HARQ in Rayleigh block fading channels was investigated via a mutual information-based analysis in [8], and long-term average rates achieved with HARQ were characterized under constraints on the outage probability and the maximum number of HARQ rounds. A similar throughput analysis of HARQ schemes subject to an outage constraint was also conducted in [9]. And in [10], the tradeoff between energy efficiency and transmission delay in wireless multiuser systems employing HARQ-IR was s- tudied. The energy efficiency of HARQ protocols has been addressed recently. For instance, the energy efficiency of HARQ-CC and HARQ-IR schemes for delay insensitive systems was studied in [11], and the energy efficiency achievable by HARQ schemes with optimized code rate is studied in [12]. In the presence of QoS constraints, it is critical to evaluate the performance of 3

23 HARQ schemes since they involve retransmissions. With this motivation, the authors in [13] analyzed the impact of different power allocation schemes on energy per bit and effective transmission delay of HARQ-IR in a multiuser downlink channel. Moreover, the recent work in [14] mainly focused on the performance comparison between adaptive modulation and coding (AMC) and HARQ-IR in terms of energy efficiency under QoS constraints. More recently, the authors of [15] characterized the effective capacity of retransmission schemes through a recurrence relation approach Throughput of Cooperative Relay Networks under QoS Constraints In recent years, the traffic load of wireless networks has grown significantly, primarily due to the exponential increase in multimedia traffic driven by applications on smart mobile devices. Many wireless communication schemes have been proposed to boost the transmission speed, reduce latency, enhance reliability and improve energy efficiency. One such strategy is cooperative relaying, which can greatly enhance the performance for long distance transmissions among users and improve resource efficiency. Two-hop relay channel is a basic cooperative relay network structures. Several studies have applied the effective capacity analysis to two-hop wireless relay channels. In [16], the power allocation policies in relay networks under QoS constraints were analyzed. However, no buffer constraints were imposed at the relay node in this work. In [17], the effective capacity of two-hop relay channel was investigated under queueing constraints at both the source and relay nodes. In [16] and [17], it was assumed that the instantaneous transmission rates were given by the Shannon capacity achieved with channel codes with blocklengths growing without bound. In [18] and [19], the coding rate expressions in finite blocklength regimes were studied. This has led to interest in the analysis of performance attained with finite blocklength codes. 4

24 For instance, the effective capacity of single-hop channels was characterized in [20] under finite blocklength assumption. Recently, performance of relaying in the finite blocklength regime was studied in [21] and [22] without considering any queueing constraints. Multiple-access relay channel is a more general model, in which multiple users transmit to a destination node with the help of a single relay node. The throughput of multiple-access relay networks have been analyzed by several studies. In [23], the achievable rates of Gaussian orthogonal multi-access relay channels were investigated, which were also proved to have a max-flow min-cut interpretation. The throughput region of the same system model was also given in [24] with superposition block Markov encoding and multiple access encoding. Further analysis was also provided in [25], in which the optimal resource allocation strategy was studied to achieve the maximum sum rate. In [26], the system throughput region of a generalized multiple access relay network, which includes multiple transmitters, multiple relays, and a single destination, was studied. In all cases, with the help of relay nodes, the channel conditions effectively improve for long distance wireless communication, and performance enhancements are realized. A further generalization of multiple-access relay channels is to introduce multiple destination nodes. These models are referred to as multi-source multi-destination relay networks. Multi-source multi-destination relay network model can be seen as a combination of multiple-access, broadcast, and two-hop relay channels, and it can be used to address scenarios in which multiple pairs of users simultaneously communicate with the help of a relay node. A basic practical example of these models is cellular operation in which multiple mobile users within a cell communicate with each other through a base station, which essentially acts as a relay unit between the source and destination nodes 1. Such networks have been analyzed in several recent studies. In 1 Moreover, in LTE-Advanced cellular standards, relaying and coordinated multi point (CoMP) operation are introduced to provide enhanced coverage and capacity at cell edges, and multi-user 5

25 [27], the throughput of the amplify-and-forward multi-source multi-destination relay network was studied, when the relay was equipped with multiple antennas. Based on this work, the same authors studied the impact of imperfect CSI in [28], and proposed an antenna selection algorithm to improve the performance. In [29], the joint power optimization was investigated for the multi-source multi-destination relay network, and in [30], network coding was applied to this type of network, and the system performance was evaluated. Recently, effective capacity analysis has been applied to multiuser and cooperative relay systems. For cooperative relay systems, the authors in [16] studied efficient resource allocation strategies over wireless relay channels under statistical QoS constraints by employing effective capacity as the throughput metric. However, in this work, either no buffer was needed at the relay if amplify-and-forward (AF) strategy was employed or no relay buffer constraints were imposed when decode-and-forward (DF) was used. In [31], queueing analysis was conducted for a butterfly network when the arrivals were modeled as a two-state Markov-modulated fluid process, and network coding or classical routing was performed by the intermediate relay node. In this study, all links were assumed to be time-invariant. Therefore, static rather than fading channels were considered. For multi-user systems, in [32], the effective capacity region of the multiple-access fading channel under queueing constraints was analyzed, and this result was extended to characterize the throughput region of the multiple-access channel with Markov arrivals in [33]. The effective capacity region of the fading broadcast channel and optimal power allocation policies were studied in [34]. These multi-user studies addressed single-hop channels and did not consider cooperative schemes. relay models can be realized in these operation modes as well. 6

26 1.2.3 Throughput and Mode Selection in Two-way Multiple- Input Multiple-Output (MIMO) Systems under Queuing Constraints Recently, full-duplex two-way communication has attracted much attention due to its potential to significantly improve the spectrum efficiency by allowing two users to communicate simultaneously over a single channel. A survey of the key concepts, challenges and opportunities in full-duplex wireless communications was provided in [35]. In particular, self-interference cancelation is a critical concern in full-duplex wireless systems, which has been addressed by several recent studies. For instance, in [36], multiple schemes in full-duplex MIMO relays have been studied for loopback selfinterference cancelation, including natural isolation, time-domain cancelation, and spatial suppression. It is important to note that self-interference cancelation is generally imperfect and the residual interference is considered to be proportional to the transmitted signal of the same node. In addition to self-interference management, there are numerous works analyzing the performance in full-duplex two-way channels in various settings. Among different models, two-way MIMO systems have been highlighted as they can further boost the system throughput by employing multiple antennas for transmitting and receiving. In [37], the influence of spatial fading correlation was investigated in two-way MIMO systems, and strategies were proposed to reduce the sensitivity to spatial fading correlation. In [38], the impact of channel estimation error was studied for full-duplex two-way networks, and closed-form expressions were derived for the ergodic capacity. Also, Effective capacity has been studied extensively for various different wireless channel models in order to determine the system throughput under such statistical queuing constraints. For instance, in [39], the effective capacity of a one-directional point-to-point MIMO systems was investigated. 7

27 1.2.4 Mode Selection and Resource Allocation for Device-to- Device (D2D) Cellular Networks The concept of D2D communication underlaid with cellular networks has attracted much interest recently. D2D communication enables users to communicate directly without going through the base station, and reuse the same spectral resources with cellular users. In [40], the advantages of D2D communications were studied, and it was shown that it could greatly enhance the spectral efficiency and lower the latency. A comprehensive overview was provided in [41], where different modeling assumptions and key considerations in D2D communications were detailed. Mode selection and resource allocation are two key problems in D2D communication, which has attracted much interest. In mode selection, each D2D user has to decide whether to transmit directly using a dedicated spectrum or by sharing the spectrum with cellular users in the D2D mode, or transmit in the same way as cellular users via a D2D two-hop channel through the base station in the cellular mode (which is essentially a non-d2d mode). In resource allocation, the system has to assign a channel resource to each user, and users have to optimize their transmission power. The resource allocation problem in D2D cellular networks is rather complicated because D2D users can reuse (i.e., share) the same channel resources with cellular users and inflict interference to them. Due to this reusing mechanism, the number of possible solutions for channel assignment increases exponentially with the number of D2D users, and the power optimization problem becomes high-dimensional and nonconvex. Therefore, the analysis becomes even more complicated when mode selection and resource allocation problems are considered jointly for improved performance. In the literature, many studies have been conducted to address the mode selection and resource allocation problems for D2D cellular networks. In the literature, many studies have been conducted to address the mode selection and resource allocation problems for D2D cellular networks. For instance, the authors of [42] considered 8

28 the mode selection problem in a cell with one D2D pair and one cellular user. The channel allocation problem was solved using the Hungarian algorithm in [43], just for the uplink reuse mode. Later in [44], the resource allocation in the D2D cellular network was investigated, for the uplink and downlink reuse mode, and in [45], a resource allocation strategy was proposed, which also included the D2D dedicated mode. More recently, the joint mode selection and resource allocation in a general cellular network with multiple D2D pairs were addressed in [46]. In order to reduce the complexity in analysis, most of these studies were based on the instantaneous channel conditions. In such cases, the system may have to perform the mode selection and resource allocation very frequently, resulting in high computational load and significant cost. In order to achieve improved results with lower time consumption, several algorithms were proposed via game-theoretic approaches. For example, the resource allocation problem was considered in [47] via the reverse iterative combinatorial auction game, and the authors of [48] solved a similar problem using the coalitional game theory. Besides the game-theoretic techniques, vertex coloring is another method that can efficiently divide D2D users into groups in which interference constraints are satisfied. In [49], vertex coloring algorithm was used to group D2D users with the goal of avoiding interference. A similar approach was used in [49] and [50] to maximize the instantaneous sum rate while satisfying the instantaneous signal to interference plus noise ratio (SINR) constraints, and a frequency band assignment process was also included after dividing D2D users into groups in [51] Delay-Aware Scheduling Algorithms for Content Delivery over Wireless Networks Scheduling is one of the key design considerations in cellular networks. Scheduling algorithms allocate limited channel resources to large amount of users, while also 9

29 guaranteeing the performance requirements of the system. For delay sensitive applications, packet delay is an important criterion of the performance. Numerous studies have been conducted to design and analyze delay optimal scheduling algorithms. A series of works considered this problem from the perspective of minimizing the queue length at transmitters. In [52], the MaxWeight rule was proposed, and the throughput optimality was shown for MaxWeight type algorithms. These type of algorithms stabilize the queueing system, and minimize the queue length. However, small queue length can not always guarantee good delay performance. The second line of work considered minimizing the decay rate of the delay violation probability as the scale of the system (which is the numbers of the users and available channels) increases. In [53], the Delay Weighted Matching (DWM) algorithm was proposed, which can provide good delay performance. In order to reduce the complexity of the DWM algorithm, the authors of [54] proposed a hybrid algorithm that reduced the asymptotic complexity of the scheduling algorithm. Although this approach guarantees good delay performance, the asymptotic analysis of these algorithms are complicated, making them difficult to be extended for a more general system setting. In [53] and [54], only downlink transmission with binary rate was considered. Yet another line of studies considered constructing convex delay costs. In [55] and [56], convex delay cost approach was proposed and developed to deal with systems operating under deadline constraints. This type of algorithm minimizes the delay violation probability under heavy traffic assumption Wireless D2D Caching Networks Recently, many studies have been conducted to analyze caching strategies in wireless networks in order to satisfy the throughput, energy efficiency and latency requirements in next-generation 5G wireless systems. By storing parts of the popular files 10

30 at the base station and users devices, network traffic load can be managed/balanced effectively, and traffic delay can be greatly reduced. It has been pointed out that 60% of the content is cacheable in the network traffic [57], which can be transmitted and stored close to the users before receiving the requests. A brief overview of wireless caching was provided in [58], which introduced the key notions, challenges, and research topics in this area. In order to improve the performance effectively, the system needs to estimate and track the popularity of those cacheable contents, and predict the popularity variations, helping to guarantee that the most popular contents are cached and the outdated contents are removed. In [59], popularity matrix estimation algorithms were studied for wireless networks with proactive caching. Multiple caching strategies have been investigated in the literature, which improve the performance in different ways. When contents are cached at the base stations, the energy consumption, traffic load and delay of the backhaul can be reduced [60], and the base stations in different cells can cooperate to improve the spectral efficiency gain [61]. When contents are cached at the users devices, the base station can combine different files together and multicast to multiple users, and the users can decode their desired files using their cached files. A content distribution algorithm for this approach was given in [62], and the analysis of the coded multicasting gain was provided in [63]. In the literature, several studies have been performed to combine content caching with D2D wireless networks. In such cases a user can receive from its neighbors if these have cached the requested content. An overview on wireless D2D caching networks was provided in [64], in which the key results for different D2D caching strategies were presented. To design caching policies for the wireless D2D network, the authors of [65] proposed a caching policy that maximizes the probability that requests can be served via D2D communications. For a similar system setting, a caching policy that maximizes the average number of active D2D links was obtained 11

31 in [66]. Most of these works were based on stochastic geometry models, in which nodes/users were distributed randomly. However, these types of models mainly focus on the path loss, and do not fully address the effects of channel fading. Without the characterization of the channel fading, an accurate analysis on the throughput and delay is not viable. Moreover, many works only tackle a simple case in which users have identical popularity vectors Intercell Interference (ICI) Control in Cloud Radio Access Network (C-RAN) In order to satisfy the growing demands and provide the required QoS guarantees and high reliability in next-generation 5G wireless systems, several advanced techniques have been proposed, and C-RAN is one novel mobile network architecture that improves the performance of cellular networks. By centralizing the baseband processing resources of multiple cells in a virtualized baseband unit (BBU) pool, C-RAN can achieve cooperative processing among different cells and utilize the BBUs more efficiently [67] [68]. Remote radio heads (RRHs) and BBU are separated geographically and connected via optical fibers in the C-RAN architecture. BBU pool is shared between cells as a virtualized cluster which operates baseband processing. Compared with the conventional architectures in which BBUs of different cells are not shared, C-RAN can achieve information exchange and cooperative processing between cells more easily with low latency, and it has high adaptability to nonuniform traffic. A comprehensive survey on C-RAN and its implementation is provided in [69]. For most orthogonal frequency division multiple access (OFDMA)-based cellular networks, ICI is a significant interference source because of the frequency reuse a- mong multiple neighbouring cells. Many advanced methods have been studied to control ICI. For instance, the soft frequency reuse (SFR) scheme is proposed in [70] and [71], in which cell edge users transmit with high power in non-overlapping cell 12

32 edge bands allocated to adjacent cells, and center users use the cell center bands with limited transmission power. The authors in [72] further compared the performance of SFR with partial frequency reuse scheme. In these conventional ICI control schemes, cooperation between neighbouring cells are not considered, which limits their performances. In C-RAN, cooperative processing among the cells sharing the same BBU pool becomes easier and more efficient, which helps to improve ICI control. In [73], a resource allocation and RRH association algorithm was proposed for ICI coordination in a long term evolution (LTE) heterogeneous network setting with C-RAN architecture. However, optimization over multiple cells greatly increases the complexity, which causes problems in delay sensitive applications. In addition, packet delay is an important performance criterion for delay sensitive applications such as live video streaming and online gaming. In most of the related studies considering ICI control, the objectives are interference minimization, SINR maximization and throughput maximization, and hence delay minimization is not addressed. 1.3 Outline and Main Contributions In Chapter 2, we provide a detailed review on the formulation of our statistical queuing constraints, and describe the methods to characterize the throughput under queuing constraints for different channel and arrival models. More specifically, both single-hop and two-hop channels are considered with constant-rate arrivals at the source nodes, and random arrival models are considered for the single-hop channel. In Chapter 3, we conduct the throughput and energy efficiency analysis for HARQ protocols under QoS constraints. Approximations are employed for small QoS exponent values in order to obtain closed-form expressions for the throughput and energy efficiency metrics. 13

33 In Section 3.1, the throughput of HARQ-IR with fixed transmission rate is s- tudied in the presence of queuing constraints imposed as limitations on buffer overflow probabilities. In particular, tools from the theory of renewal processes and stochastic network calculus are employed to characterize the maximum arrival rates that can be supported by the wireless channel when HARQ-IR is adopted. Effective capacity formulation is employed as the throughput metric and a closed-form expression for the effective capacity of HARQ-IR is determined for small values of the QoS exponent. The impact of the fixed transmission rate, queuing constraints, and hard deadline limitations on the throughput is investigated and comparisons with regular ARQ operation are provided. In Section 3.2, energy efficiency of HARQ schemes with statistical queuing constraints is studied for both constant-rate and random Markov arrivals by characterizing the minimum energy per bit and wideband slope. In particular, two queuing models are considered. Specifically, when outage occurs, the transmitter keeps the packet, lowers its priority, and attempts to retransmit it later in the first queue model while the packet is discarded and removed from the buffer in the second queue model. For both models, energy efficiency is investigated when outage constraints, statistical queuing constraints and deadline constraints are imposed. The deadline constraint provides a limitation on the number of retransmissions or equivalently the number of HARQ rounds. Under these assumptions, closed-form expressions are obtained for the energy efficiency metrics of HARQ-CC, and comparisons among different arrival models are made. For instance, it is shown that stricter queuing constraints and more bursty sources degrade the energy efficiency by lowering the wideband slope. In Chapter 4, we conduct throughput analysis for HARQ protocols under QoS 14

34 constraints via the recurrence approach proposed in [15]. Also, deadline and outage constraints, random arrival models and finite blocklength codes are considered. In Section 4.1, we characterize the throughput of HARQ-CC for three typical Markov sources in the presence of statistical queuing constraints while satisfying a target outage probability. In most of the related works investigating the throughput of HARQ schemes under statistical queuing constraints, the occurrence of packet drops from the buffer due to deadline violation have generally not been explicitly addressed. From the perspective of the buffer, packets dropped/discarded from the buffer should contribute to the departure rate in the queuing analysis. However, when characterizing the throughput, the discarded packets should not be taken into account, since the receiver does not receive them. In this section, the impact of such packet drops is explicitly considered. Also, we identify the impact of source randomness on the throughput of HARQ-CC systems under statistical QoS constraints. In Section 4.2, we study the throughput of HARQ-IR with finite-blocklength codes, deadline limits, and statistical queuing constraints by employing the notions of effective capacity and effective bandwidth from stochastic network calculus. Two different arrival models, namely the constant-rate and ON-OFF discrete time Markov arrivals, are studied, and throughput characterizations are obtained for both arrival models. In prior works focusing on HARQ under queuing constraints, it was assumed that the instantaneous transmission rates were given by the Shannon capacity achieved with channel codes with blocklengths growing without bound. However, it is practically more appealing to address the performance with finite-blocklength codes and more explicitly take into account decoding error probabilities in the analysis of HARQ especially in the presence of deadline and queuing constraints. With this motivation, we leverage recent advances in the characterization of coding rates in the finite 15

35 blocklength regime [18], [19], and study the throughput of HARQ-IR schemes achieved with finite-blocklength codes subject to statistical queuing constraints at the transmitter buffer. In Chapter 5, the throughput of three cooperative relay network models are studied under QoS constraints. In Section 5.1, we characterize the throughput of the two-hop relay channel in the finite blocklength regime when statistical queueing constraints are imposed at both the source and relay. We first identify the throughput by determining the effective capacity of the two-hop relay system in the finite blocklength regime, and then establish several key properties of the throughput in terms of the target error rates. Based on these properties, we develop an efficient search algorithm to solve the throughput maximization problem and obtain the corresponding optimal parameter values. In Section 5.2, we extend the effective capacity analysis of one-directional twohop relay channel to a two-way relay channel setting. More specifically, we study the throughput of two-way relay channels in the presence of queueing constraints at both the source nodes and the relay node. Note that the two-way relay model has significant differences from that of the one-way relay considered in [17]. We consider half-duplex, decode-and-forward relaying. In this setting, our main goal is to identify the pair of maximum arrival rates at the sources while the statistical queuing constraints at the source nodes and relay are satisfied. In Section 5.3, the throughput of relay networks with multiple source-destination pairs under queuing constraints has been investigated for both variable-rate and fixed-rate schemes. When CSI is available at the transmitter side, transmitter- 16

36 s can adapt their transmission rates according to the channel conditions, and achieve the instantaneous channel capacities. In this case, the departure rates at each node have been characterized for different system parameters, which control the power allocation, time allocation and decoding order. In the other case of no CSI at the transmitters, a simple ARQ protocol with fixed rate transmission is used to provide reliable communication. Under this ARQ assumption, the instantaneous departure rates at each node can be modeled as an ON-OFF process, and the probabilities of ON and OFF states are identified. With the characterization of the arrival and departure rates at each buffer, stability conditions are identified and effective capacity analysis is conducted for both cases to determine the system throughput under statistical queuing constraints. In addition, for the variable-rate scheme, the concavity of the sum rate is shown for certain parameters, helping to improve the efficiency of parameter optimization. In Chapter 6, we extend the analysis conducted in [39] to two-way MIMO systems. We consider both half-duplex and full-duplex operations. In half-duplex mode, we can have time-division multiplexing or frequency-division multiplexing, which are essentially equivalent in terms of their performances. In full-duplex mode, we take into account the self-interference inflicted on the reception of a user due to simultaneous transmissions from the same user. In this setting, we initially formulate the system throughput by considering the sum of the effective capacities of the users in both half-duplex and full-duplex modes. Subsequently, we consider the low signal to noise ratio (SNR) regime and characterize the optimal transmission/power-allocation strategies by identifying the optimal input covariance matrices. Finally, via numerical results, we address mode selection by determining which mode yields higher through- 17

37 put under different SNR levels and distances. For fair comparison, we assume that the number of transmitting and receiving antennas are the same at each node in both half-duplex and full-duplex modes. In Chapter 7, we study the mode selection and resource allocation algorithms for D2D cellular networks. In Section 7.1, transmission mode selection and resource allocation in a timedivision multiplexed (TDM) cellular network with one cellular user, one base station, and a pair of D2D users is investigated under rate and queuing constraints. Using tools from stochastic network calculus, the system throughput under statistical queuing constraints is formulated, efficient resource allocation algorithms for all possible modes are proposed, and the influence of the positions of each node and the queuing constraints is analyzed. Scenarios and conditions for different modes to be optimal in the sense of maximizing the sum-throughput are identified. In Section 7.2, we propose a novel channel matching algorithm for joint mode s- election and channel allocation with the goal of maximizing the system throughput under statistical queuing constraints. Seven possible modes are considered, namely the D2D cellular mode, D2D dedicated mode, uplink dedicated mode, downlink dedicated mode, uplink reuse mode, downlink reuse mode, and D2D reuse mode. Using tools from stochastic network calculus, the throughput is characterized by determining the effective capacity. We formulate the channel allocation problem as a maximum-weight matching problem, which can be solved by employing the Hungarian algorithm. In Section 7.3, we propose a novel joint mode selection and channel resource allocation algorithm via the vertex coloring approach. In our analysis, we divide 18

38 the problem into three subproblems, namely user partition, power allocation and channel assignment. Different from prior works, the power allocation, mode s- election and channel assignment are considered after grouping D2D users via vertex coloring. Algorithms are designed for each subproblem, and we propose a novel three-step joint mode selection and resource allocation method by combining these algorithms designed for the three subproblems. We also incorporate the adaptation of the interference constraints in the grouping step when the given interference constraints are relatively loose, and fairness among the users in the same group is considered in the power allocation step. In Chapter 8, we investigate resource allocation algorithms for content delivery over wireless networks. In Section 8.1, we design a caching algorithm that minimizes the average delay of a single-cell cellular networks. We first provide a characterization of the average delay for both cellular and D2D modes, and then we propose a very efficient and robust algorithm to solve the delay minimization problem. Our algorithm is applicable in settings with very general popularity models, with no assumptions on how file popularity varies among different users, and we further extend our algorithm to a more general setting, in which the system parameters and the distributions of channel fading change over time. In Section 8.2, we propose, for D2D cellular networks operating under deadline constraints, a scheduling algorithm that manages mode selection, channel allocation and power maximization with acceptable complexity. Our scheduling algorithm is designed based on the convex delay cost method for a D2D cellular network with deadline constraints in an OFDMA setting. All seven 19

39 possible modes are included into our scheduling decisions, namely the D2D cellular mode, D2D dedicated mode, uplink dedicated mode, downlink dedicated mode, uplink reuse mode, downlink reuse mode, and D2D reuse mode. Power optimization algorithms are proposed for all possible modes, based on our utility definition. In Section 8.3, we propose a two-step ICI-aware scheduling algorithm for C- RAN, which performs user grouping and resource allocation with the goal of minimizing delay violation probability. A novel user grouping algorithm is developed for the user grouping step, which controls the interference among the users in the same group, and we formulate the channel assignment problem in the second step as a maximum-weight matching problem, which can be solved using standard algorithms in graph theory. The performance of our algorithm is verified via simulations, and we compare our algorithm with a conventional SFR algorithm. Also, the influence of the system parameters is investigated with the help of numerical results. 1.4 Bibliographic Note The results in Section 3.1 appeared in the journal paper: Y. Li, M. C. Gursoy and S. Velipasalar, On the Throughput of Hybrid- ARQ Under Statistical Queuing Constraints, in IEEE Trans. Veh. Technol., vol. 64, no. 6, pp , June The results in Section 3.2 appeared in the journal paper: Y. Li, G. Ozcan, M. C. Gursoy and S. Velipasalar, Energy Efficiency of Hybrid-ARQ Under Statistical Queuing Constraints, in IEEE Trans. Commun., vol. 64, no. 10, pp , Oct

40 and in the conference paper: Y. Li, G. Ozcan, M. C. Gursoy and S. Velipasalar, Energy efficiency of hybrid-arq systems under QoS constraints, in Proc. of the 48th Annual Conference on Information Sciences and Systems (CISS), Princeton, NJ, 2014, pp The results in Section 4.1 appeared in the conference paper: Y. Li, M. C. Gursoy and S. Velipasalar, Throughput of Hybrid-ARQ Chase Combining with ON-OFF Markov Arrivals under QoS Constraints, in Proc. of the IEEE Global Communications Conference (GLOBECOM), Washington, DC, 2016, pp The results in Section 4.2 will appear in the accepted conference paper: Y. Li, M. C. Gursoy and S. Velipasalar, Throughput of HARQ-IR with Finite Blocklength Codes and QoS Constraints, in Proc. of the IEEE International Symposium on Information Theory (ISIT), Aachen, The results in Section 5.1 appeared in the conference paper: Y. Li, M. C. Gursoy and S. Velipasalar, Throughput of two-hop wireless channels with queueing constraints and finite blocklength codes, in Proc. of the IEEE International Symposium on Information Theory (ISIT), Barcelona, 2016, pp The results in Section 5.2 appeared in the conference paper: Yi Li, D. Qiao, M. C. Gursoy and S. Velipasalar, On the throughput of two-way relay systems under queueing constraints, in Proc. of the IEEE Global Communications Conference (GLOBECOM), Atlanta, GA, 2013, pp

41 The results in Section 5.3 appeared in the journal paper: Y. Li, M. C. Gursoy and S. Velipasalar, On the Throughput of Multi- Source Multi-Destination Relay Networks With Queueing Constraints, in IEEE Trans. Wireless Commun., vol. 15, no. 8, pp , Aug and the conference paper: Y. Li, M. C. Gursoy and S. Velipasalar, On the throughput of ARQ over multiple-access relay fading channels with queueing constraints, in Proc. of the IEEE Wireless Communications and Networking Conference (WCNC), New Orleans, LA, 2015, pp The results in Chapter 6 appeared in the conference paper: Y. Li, M. C. Gursoy and S. Velipasalar, Throughput and mode selection in two-way MIMO systems under queuing constraints, in Proc. of the IEEE International Conference on Communications (ICC), London, 2015, pp The results in Section 7.1 appeared in the conference paper: Y. Li, M. C. Gursoy and S. Velipasalar, Device-to-device communication in cellular networks under statistical queueing constraints, in Proc. of the IEEE International Conference on Communications (ICC), Kuala Lumpur, 2016, pp The results in Section 7.2 appeared in the conference paper: Y. Li, M. C. Gursoy and S. Velipasalar, Joint mode selection and resource allocation for D2D communications under queueing constraints, in 22

42 Proc. of the IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), San Francisco, CA, 2016, pp The results in Section 7.3 are reported in the submitted paper: Y. Li, M. C. Gursoy and S. Velipasalar, Joint Mode Selection and Resource Allocation for D2D Communications via Vertex Coloring, in Proc. of the IEEE Global Communications Conference (GLOBECOM), Singapore, The results in Section 8.1 appeared in the conference paper: Y. Li, M. C. Gursoy and S. Velipasalar, A Delay-Aware Caching Algorithm for Wireless D2D Caching Networks, in Proc. of the IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), Atlanta, GA, The results in Section 8.2 appeared in the conference paper: Y. Li, M. C. Gursoy and S. Velipasalar, Scheduling in D2D Underlaid Cellular Networks with Deadline Constraints, in Proc. of the IEEE 84th Vehicular Technology Conference (VTC-Fall), Montreal, QC, 2016, pp The results in Section 8.3 will appear in the accepted conference paper: Y. Li, M. C. Gursoy and S. Velipasalar, Intercell Interference-Aware Scheduling for Delay Sensitive Applications in C-RAN, in Proc. of the IEEE 86th Vehicular Technology Conference (VTC-Fall), Toronto, ON,

43 Chapter 2 Preliminaries of Statistical Queuing Constraints 2.1 Statistical Queuing Constraints In this thesis, the statistical queuing constraints require the buffer overflow probability to decay exponentially fast, i.e.,[4] [74] Pr{Q q} ςe θq, (2.1) for sufficiently large q, where Q is the stationary queue length, q is the overflow threshold, ς = Pr{Q > 0} is the probability of non-empty buffer, and the nonnegative scalar θ is called the QoS exponent. More rigorously, QoS exponent θ is defined as [1] 1 log Pr{Q q} θ = lim. (2.2) q q 1 Throughout the text, logarithm expressed without a base, i.e., log( ), refers to the natural logarithm log e ( ). 24

44 Note that θ is a factor that controls the exponential decay rate of the buffer overflow probability. From (2.1), we notice that higher values of θ indicate stricter limitations on the buffer overflow probability, leading to more stringent QoS constraints whereas lower values of θ represent looser QoS requirements. Conversely, for a given buffer threshold q and overflow probability limit Pr{Q q} = δ, the desired value of θ can be determined as θ = 1 q log δ + 1 log ς. (2.3) q As q, the term 1 log ς in (2.3) vanishes, which leads to (2.2). q 2.2 Throughput of Single-hop Channels under S- tatistical Queuing Constraints At the buffer, the arrival rates a i (bits/block) and the departure rates c i (bits/block) form the arrival and departure processes, respectively, where i is the time index. According to the effective bandwidth and effective capacity formulations provided in [1] and [4], respectively, in the presence of queuing constraints with QoS exponent θ, the arrival process and departure process at the buffer should satisfy Λ a (θ) + Λ c ( θ) = 0, (2.4) where Λ p (θ) = lim t 1 t log e generating function (LMGF) of the random process p i. E{e θ t i=1 p i } is the asymptotic logarithmic moment 25

45 2.2.1 Effective Capacity When the arrival rate is constant i.e., a i = a for all i, it can be easily seen that Λ a (θ) = aθ. (2.5) Then, from (2.4), we have a = 1 θ Λ c( θ). (2.6) Indeed, the right-hand side of (2.6) is defined as the effective capacity of the wireless link [4] C E (θ, SNR) = 1 θ Λ c( θ), (2.7) characterizing the maximum constant arrival rate that can be supported by the timevarying wireless transmission rates while satisfying the statistical queuing constraint in (2.1). Therefore, under the constant-rate arrival assumption, the maximum average arrival rate is given by the effective capacity: r avg (θ, SNR) = E{a i } = a = C E (θ, SNR) = 1 θ Λ c( θ). (2.8) Average Arrival Rates of Random Arrival Sources under Statistical Queuing Constraints In effective capacity analysis, constant-rate arrivals are often assumed at the transmitter. On the other hand, randomly time-varying arrivals are frequent in real applications. For instance, the data traffic can be regarded as an ON-OFF process in voice communications (e.g., in VoIP) and variable bit-rate video traffic is statistical- 26

46 ly characterized as autoregressive, Markovian, or Markov-modulated processes [75]. With this motivation, the authors in [76] studied the impact of source burstiness on the energy efficiency under statistical queuing constraints, and they further developed energy-efficient power control policies in [77] considering Markov arrivals. When the arrival rate is not constant, the computation of the system throughput is more complicated. In general, we need to formulate the LMGF of the arrival process as a function of the average arrival rate, and obtain the throughput by solving (2.4). In this thesis, three random arrival sources are considered, which are the ON-OFF discrete Markov and Markov fluid sources, and ON-OFF Markov modulated Poisson sources (MMPS), in Chapters 3 and 4. After characterizing their LMGF of the arrival process, the average arrival rates can be obtained as functions of the effective capacities by solving (2.4) Average Arrival Rates of ON-OFF Discrete-Time Markov Source under Statistical Queuing Constraints ON-OFF discrete-time Markov source only has two states, namely, ON and OFF states. We define state 1 as the OFF state, in which the source keeps silent. When the source is in ON state, or equivalently state 2, the arrival rate is a i = r. The state transition probability matrix of this Markov source can be written as G = p 11 p 12 p 21 p 22, (2.9) where p 11 and p 22 denote the probabilities that the source remains in the same state (OFF and ON states, respectively) in the next time block, and p 12 and p 21 are the probabilities that source will transition to a different state in the next time block. Using the properties of Markov processes, we can express the probability of the ON 27

47 state as P ON = 1 p 11 2 p 11 p 22. (2.10) Then, the average arrival rate of this ON-OFF Markov source is 1 p 11 r avg = rp ON = r. (2.11) 2 p 11 p 22 From [78], the LMGFs of this arrival process is given by ( p 11 + p 22 e rθ + ) (p 11 + p 22 e Λ a (θ) = log rθ ) 2 4(p 11 + p 22 1)e rθ e. (2.12) 2 Plugging the characterizations in (2.12) and (2.7) into (2.4), we obtain the arrival rate in the ON state as r = 1 ( θ log e 2θC E p 11 e θc E 1 p 11 p 22 + p 22 e θc E ). (2.13) Further inserting this result into (2.11), we find the maximum average arrival rate as r avg = P ON θ ( log e 2θC E p 11 e θc E 1 p 11 p 22 + p 22 e θc E ), (2.14) where P ON is given in (2.10) Average Arrival Rates of ON-OFF Fluid Markov Source under Statistical Queuing Constraints Different from the discrete-time Markov source whose state does not change in a given time block and state transitions occur in discrete time steps, fluid Markov source may stay in a state over a continuous duration of time. In other words, the source can change its state at any time. Here, the definitions of ON and OFF states are the same as for the ON-OFF discrete-time source. The generating matrix of this 28

48 continuous-time Markov process is given by G = α β α β, (2.15) and the ON state probability is P ON = α α + β. (2.16) In this case, the average arrival rate is α r avg = rp ON = r α + β. (2.17) Using a similar approach as for the discrete-time Markov source, we can find the maximum average arrival rate for the ON-OFF fluid Markov source, under statistical queuing constraints. From [79], the LMGF of the arrival process of the ON-OFF Markov fluid source is given by Λ a (θ) = 1 2 ( θr α β + ) (θr α β) 2 + 4αθr. (2.18) Plugging (2.18) into (2.4), we can find that r = C E θc E + α + β θc E + α. (2.19) Inserting this result into (2.17), we determine the maximum average arrival rate as r avg = P ON C E θc E + α + β θc E + α, (2.20) where P ON is given in (2.16). 29

49 Average Arrival Rates of ON-OFF MMPS under Statistical Queuing Constraints The arrival rates of ON-OFF MMPS models are described as a Poisson process with intensity ρ in the ON state while there is no arrival in the OFF state. State transitions are governed by a continuous-time Markov chain as in the Markov fluid model. However, compared to the ON-OFF Markov fluid source analyzed in Section , MMPS can be seen to have a higher degree of burstiness since its arrival rate, rather than being a constant, is random in the ON state. Here, the expressions of the generating matrix and ON state probability are the same as in Section In this case, the average arrival rate is r avg = ρp ON = ρ α α + β. (2.21) From [79], the LMGF of the arrival process of the ON-OFF MMPS is given by Λ a (θ) = 1 2 [ (e θ 1)ρ (α + β) ] [(e θ 1)ρ (α + β)] 2 + 4αρ(e θ 1). (2.22) Plugging (2.22) into (2.4), we can find ρ = θ [θc E + α + β] (e θ 1) [θc E + α] C E. (2.23) Inserting (2.23) into (2.21), we find the maximum average arrival rate as θ r avg = P ON C E,Q1 e θ 1 θc E,Q1 + α + β θc E,Q1 + α, (2.24) where P ON is given in (2.16). 30

50 2.3 Throughput of Two-hop Channels under Statistical Queuing Constraints In a two-hop channel, source node S sends information to the receiver D with the help of the intermediate relay node R, and there is no direct link between S and D (which, for instance, holds, if these nodes are sufficiently far apart in distance). The system model of two-hop relay channels is shown in Figure 5.1 in Section 5.1. Both the source and the intermediate relay nodes operate under statistical queuing constraints with QoS exponents θ 1 and θ 2, resectively. In such cases, we assume a constant arrival rate at the source node, and we characterize the maximum constant arrival rate at the source node that can be supported by the two-hop link under queuing constraints as the throughput. From the theory of effective bandwidth and effective capacity [1], [2], [4], the queuing constraint can be satisfied at the source node when the constant arrival rate R satisfies R = Λ S,R( θ) θ (2.25) for some θ θ 1, where Λ S,R is the LMGF of the service (or equivalently transmission) rate at the source. The above arrival rate formulation considers only the queuing constraints at the source node. However, we need to address the constraints at the relay buffer as well. It was shown in [2] that the queuing constraint can be satisfied at the relay if we have Λ R (ˆθ) + Λ R,D ( ˆθ) = 0 (2.26) 31

51 for some ˆθ θ 2. Above, Λ R,D is the LMGF of the service rate at relay. In (2.26), Λ R is the LGMF of the arrival process to R and is formulated as [2, equation (18)] Rθ, 0 θ θ Λ R (θ) = R θ + Λ S,R (θ θ), θ > θ. (2.27) Hence, in order to satisfy the queuing constraints at both the source and relay nodes, the arrival rate R should satisfy (2.25) and (2.26) simultaneously, which implies that R should be the minimum of the rates obtained from (2.25) and (2.26). 32

52 Chapter 3 Throughput and Energy Efficiency of Hybrid-ARQ under Statistical Queuing Constraints with Low QoS Exponents HARQ is a high performance communication protocol, leading to effective use of the wireless channel and the resources with only limited feedback about the CSI to the transmitter. In this chapter, we analyze the throughput and energy efficiency of HARQ protocols in the presence of statistical queuing requirements when the QoS exponent θ is sufficiently small via Taylor expansion. In Section 3.1, the throughput of HARQ with fixed transmission rate is studied. In particular, tools from the theory of renewal processes and stochastic network calculus are employed to characterize the maximum arrival rates that can be supported by the wireless channel when HARQ is adopted. Effective capacity is employed as the throughput metric and a closed-form expression for the effective capacity of HARQ is determined for small values of the QoS exponent. The impact of the fixed trans- 33

53 mission rate, queuing constraints, and hard deadline limitations on the throughput is investigated. In Section 3.2, the energy efficiency of the HARQ chase combining scheme under outage, deadline, and statistical queuing constraints in the low-power and low-θ regimes is studied for both constant-rate and random Markov arrivals by characterizing the minimum energy per bit and wideband slope. In particular, two queuing models are considered. Specifically, when outage occurs, the transmitter keeps the packet, lowers its priority, and attempts to retransmit it later in the first queue model while the packet is discarded and removed from the buffer in the second queue model. For both models, energy efficiency is investigated when outage constraints, statistical queuing constraints and deadline constraints are imposed. The deadline constraint provides a limitation on the number of retransmissions or equivalently the number of HARQ rounds. Under these assumptions, closed-form expressions are obtained for the minimum energy per bit and wideband slope for HARQ-CC, and comparisons among different arrival models are made. For instance, it is shown that stricter queuing constraints and more bursty sources degrade the energy efficiency by lowering the wideband slope. In the numerical results, analytical characterizations are verified through simulations. Moreover, the impact of source variations/burstiness, deadline constraints, outage probability, queuing constraints on the energy efficiency is analyzed. 34

54 Figure 3.1: System Model 3.1 Throughput of HARQ under Statistical Queuing Constraints System Model Fading Channel We consider a point-to-point wireless link shown in Fig. 3.1 and assume that block fading is experienced in the channel. More specifically, in each block duration of T s seconds, fading is assumed to stay fixed and then change independently in the subsequent block. We assume that transmissions occur in blocks and one fading block is our basic time unit throughout the paper. In each block duration, transmitter sends l symbols to the receiver. In the i th block, the transmitter sends the l-dimensional signal vector x i with average energy E{ x i 2 } = le, and the received discrete time signal can be expressed as y i = h i x i + n i i = 1, 2,... (3.1) where h i is the channel fading coefficient in this time block, and n i denotes the noise vector with i.i.d. complex, circularly-symmetric Gaussian components with zeromean and variance N 0. Then, the instantaneous capacity in each fading block can be expressed as C i = T s B log 2 (1 + SNRz i ) bits/block (3.2) 35

55 where B is the system bandwidth, T s is the block duration, SNR = E{ x 2 } E{ n 2 } = le ln 0 = E N 0 represents the transmitted average signal-to-noise ratio, and z i = h i 2 denotes the magnitude-square of the fading coefficient HARQ-IR with Fixed-Rate Transmissions We assume that the transmitter sends information at the constant rate of R bits/block 1 and a Type-II HARQ-IR protocol is employed for reliable reception. In this scheme, the messages at the transmitter are encoded according to a certain codebook and the codewords are divided into a number of subblocks of the same length. During each fading block, only one subblock is sent to the receiver. At the receiver side, the transmitted message is decoded according to the current received subblock combined with the previously received subblocks related to the current transmitted message. In this case, information accumulates at the receiver side. According to informationtheoretical results [80], the receiver can decode the transmitted message at the end of the N th subblock without error only if R satisfies R < T s B N log 2 (1 + SNRz i ). (3.3) i=1 Hence, with the above characterization, we consider the maximum achievable rates of HARQ with an information-theoretic perspective as in [80]. Indeed, a coding strategy for HARQ-IR is described in detail in [80] for messages to be decoded successfully when (3.3) is satisfied. Hence, if (3.3) is satisfied, the receiver gets R bits of information, an ACK feedback signal is sent, and the first subblock of a new message is transmitted in the next interval. We assume that the decoder at the receiver has the ability to detect transmission errors reliably. Therefore, if R does not satisfy (3.3), 1 More accurately, we assume that the transmitter, after each successful transmission, attempts to send R bits within the next transmission duration. If the transmitted codeword is successfully decoded in the first fading block, the received data rate is R bits/block. If successful decoding occurs at the end of N th fading block, the received data rate is R/N bits/block. 36

56 receiver detects the error and sends NACK feedback to the transmitter, triggering the transmission of the next subblock of the same message in the subsequent transmission interval. We define the random transmission time T of a message as T = min { N : R < T s B } N log 2 (1 + SNRz i ). (3.4) i=1 Hence, T denotes the number of block-fading channel uses needed to successfully send a message. In our HARQ-IR model, if a renewal event occurs when the receiver decodes the transmitted message correctly. Therefore, T describes the interarrival time (in terms of number of fading blocks) between consequent renewal events. It is shown in [80] that the throughput of this HARQ-IR scheme is given by γ = R E{T } = R µ 1 (3.5) where µ 1 denotes the expected value of T. Additionally, it is proven in [80] that as R, the throughput approaches the ergodic capacity, i.e., lim γ = E{T sb log 2 (1 + SNRz i )} = E{C i }. (3.6) R Effective Capacity of the HARQ-IR Scheme We assume that the transmitter operates under queuing constraints imposed as limitations on the buffer overflow probability, which is introduced in Chapter 2. In [81] the notion of effective capacity was proposed to analyse the system throughput under such constraints, which provides the maximum constant arrival rates that can be supported while satisfying (2.1) in Chapter 2. From (2.8) in Chapter 2, the maximum arrival rate or equivalently the effective capacity under this queuing constraint is given by 37

57 1 C e = lim t θt log e E{e θs t } (3.7) 1 = lim t θt log e E{e θrn t } (3.8) where S t = i=t i=1 s[i] is the time-accumulated service process representing the total number of bits sent until time t. In our system setting, if we denote the number of successful message transmissions until time t by N t, then S t = RN t. Note that N t is the number of renewals made by time t and hence {N t } can be regarded as the renewal counting process with i.i.d. interarrival intervals. More explicitly, we can define N t as N t = max { k : } k T j < t j=1 (3.9) where {T j } is the i.i.d. sequence of durations of successful transmissions of consecutive messages. Note that T j is the number of fading blocks needed to successfully decode the j th message. Renewals occur when the receiver decodes a packet successfully and N t is number of renewal events (or equivalently the number of instances R bits of information has been successfully received) up until time t. Using the properties of renewal processes, we obtain the following closed-form expression of the effective capacity for small θ values in terms of the statistical averages of the random transmission time T. Theorem 1 For the HARQ-IR scheme with fixed rate transmissions, the effective capacity in (3.8) has the following first order expansion with respect to the QoS exponent θ around θ = 0: C e = R R2 σ 2 θ + o(θ), (3.10) µ µ 3 1

58 where R is the fixed transmission rate, µ 1 and σ 2 are the mean and variance of the random transmission time T, and θ is the QoS exponent. Note that µ 1 and σ 2 are also functions of R. Finally, o(θ) represents the terms that vanish faster than θ as θ 0, i.e., lim θ 0 o(θ) θ = 0. Proof : See Appendix A.1. To calculate effective capacity, we need the mean and variance of the transmission time. For the HARQ-IR scheme, the distribution of T is not available in closed-form and hence we resort to Monte-Carlo simulations to obtain µ 1 and σ 2. Remark 1 We note that if no queuing constraints are imposed and hence θ = 0, the effective capacity expression in (3.10) specializes to C e = R µ 1 and therefore we recover the throughput formulation obtained in [80]. Additionally, we notice in (3.10) that since R 0, µ 1 0, and σ 2 0, the introduction of the queuing constraints even with small QoS exponent θ leads to a loss in the throughput, which was quantified by the term R2 σ 2 θ. Finally, another observation is that while depending only on µ 2µ when θ = 0, the throughput starts also depending on the variance, σ 2, of the random transmission time in the presence of queuing requirements. Indeed, the larger the variance, the smaller the throughput becomes in the regime of small θ. Remark 2 By the Central Limit Theorem for renewal counting processes [82], if the inter-renewal intervals have finite variance σ 2, then we have the following convergence in distribution N t t µ 1 σµ t 1/2 N (0, 1) as t. (3.11) Hence, the distribution of N t tends to a Gaussian distribution with mean t µ 1 and variance σ2 t µ 3 1 for large t. Now, if we approximate the distribution of N t as f Nt (x) 1 exp 2π σ2 t µ 3 1 ( x t µ σ 2 t µ 3 1 ) 2 for large t, (3.12) 39

59 then plugging the parameters into the moment generating function of Gaussian distribution, we can obtain E{e θrn t } exp ( Rµ ) θt + R2 σ 2 θ 2 t 2µ 3 1 (3.13) which implies that 1 C e = lim t θt log e E{e θrn t } R R2 σ 2 θ. (3.14) µ 1 2µ 3 1 Remark 3 Theorem 1 can also be applied to Type-II HARQ-CC protocol. In HARQ- CC protocol, the transmitter just sends the same coded data in each retransmission, and the receiver uses maximum-ratio combining for decoding. Therefore, the packet can be decoded within N fading blocks only if R satisfies R T s B log 2 (1 + ) N SNRz i. (3.15) i=1 Since the number of successful message transmissions, N t, can still be regarded as a renewal counting process, and all moments of the transmission time T are also finite in Chase Combining, the characterization in (3.10) applies to this type of HARQ as well Hard Deadline Constraints Heretofore, we have not considered any restrictions on the random transmission time T. Hence, the number of block-fading channel uses needed to successfully send a message can be arbitrarily large especially if the transmission rate R is also large. Indeed, as will be evidenced in the numerical results, throughput improves as R increases but this comes at the cost of increased transmission time. On the other hand, practical systems can require hard deadline constraints for the messages and it 40

60 is of interest to have bounds on T. For instance, we can impose T T u, (3.16) and hence limit the number of HARQ rounds to send a message by T u. More specifically, if R > T s B T u i=1 log 2(1+SNRz i ) and hence the message is not correctly decoded at the end of the T th u transmission, the transmitter initiates the transmission of the new message. Detailed discussions about dealing with outage events is provided in Section 3.2. Here we consider a case, in which the transmitted packet is not removed from the buffer when an outage happens. The transmitter reduces its priority, and starts transmitting the packet with the highest priority in the next time block. This scenario corresponds to the queue model I with deadline constraints in Section 3.2. Under this situation, the system has to operate under queuing constraint and deadline constraint. We can easily see that the characterization in Theorem 1 applies in the presence of hard-deadline constraints as well, once we adopt the following approach. We define ˆT as the total duration of time that has taken to successfully send one message, including the periods of failed transmissions due to imposing the upper bound T u. In this case, the count starts from 0 after a successful decoding, and transmission time ˆT increases until the next successful decoding. Again, the renewal events happen only when the receiver can decode the packet successfully. Now, the probability that ˆT = n + kt u, i.e., the probability that the transmission of first k messages have ended in failure due to the deadline constraint and (k + 1) th message is successfully transmitted after n T u HARQ transmissions, can be expressed as Pr{ ˆT = n + kt u } = (Pr{T > T u }) k Pr{T = n} for n = 1, 2,..., T u and k = 0, 1, 2,... (3.17) 41

61 where T is as defined in (3.4). Under the upper bound constraint T u, the new interrenewal time between successful message transmissions is ˆT. Hence, only the statistical description of inter-renewal time changes and the throughput formulation in (3.10) still applies but now with µ 1 = E{ ˆT } and σ 2 = var( ˆT ). Note that the inter-renewal time ˆT can grow very fast on average with increasing rate R. This is due to the fact that the likelihood to complete the message transmission within T u intervals becomes small for large R. Hence, many message transmissions can fail before a successful transmission. More specifically, as R increases, Pr{T > T u } grows, increasing the probability of large values of ˆT and also increasing µ 1 = E{ ˆT }. This growth is faster than what would be experienced in the absence of hard-deadline constraints and it can lower the throughput significantly if R is larger than a threshold Numerical Results In this subsection, we provide our numerical results. In particular, we focus on the relationship between the transmission rate R and our throughput metric C e. In our results, we both compute the first-order expansion of the effective capacity given in (3.10) and also simulate the HARQ-IR transmissions and estimate the effective capacity by computing 1 log θt e E{e θrn t } for large t. More specifically, E{e θrn t }, the expected value and variance of the transmission time are determined via Monte- Carlo simulations. In the numerical analysis, we assume the fading coefficient h i has a circularly symmetric complex Gaussian distribution with zero mean and variance 1. Hence, we consider a Rayleigh fading environment. In Fig. 3.2, we plot the effective capacity C e as a function of the transmission rate R for Type-I HARQ, HARQ Chase Combining and HARQ-IR schemes. The throughput curves of HARQ-IR and HARQ Chase Combining are plotted both by computing the first-order expansion in (3.10) and also via simulation. We immediately 42

62 Effective capacity HARQ IR (equation(3.10)) HARQ IR (simulation) Type I HARQ HARQ chase combining (equation(3.10)) HARQ chase combining (simulation) Upper bound (equation(3.18)) R Figure 3.2: Effective capacity C e vs. transmission rate R at SNR = 6 db and θ = 0.01 for both ARQ and HARQ-IR Var{C[i]} R 2 σ 2 /µ R Figure 3.3: var(log 2 (1 + SNRz)) and R2 σ 2 µ 3 1 θ = vs. transmission rate R. SNR = 6 db and 43

63 notice that for both HARQ-IR and HARQ Chase Combining, the effective capacity approximation provided by the first-order expansion is very close to that obtained by simulation for θ = Hence, as predicted in Section 3.1.2, first-order expansion gives an accurate characterization of the throughput of HARQ-IR and HARQ Chase Combining. In the figure, we further observe that HARQ-IR significantly outperforms Type-I HARQ and HARQ Chase Combining. Throughput of Type-I HARQ initially increases and reaches its peak value at an optimal value R beyond which it starts to diminish. Hence, in Type-I HARQ, rates higher than the optimal R are leading to a large number of retransmissions and resulting in lower throughput. Similar behavior is shown by HARQ Chase Combining. On the other hand, the throughput of HARQ-IR interestingly improves with increasing R and approaches C e,perfect CSI = 1 θ log e E{e θc } = 1 θ log e E{e θt sb log 2 (1+SNRz) } (3.18) which is the effective capacity of a system in which the transmitter knows the channel fading coefficients perfectly and transmits the data at the time-varying rate of B log 2 (1 + SNRz) in each block. Note that this observation can be seen as the extension of (3.6) to the case with queuing constraints. Furthermore, it can be easily verified that the first-order expansion of C e,perfect CSI is given by C e,perfect CSI = E{T s B log 2 (1 + SNRz)} var(t s B log 2 (1 + SNRz)) θ + o(θ) (3.19) 2 where var(t s B log 2 (1+SNRz)) denotes the variance of T s B log 2 (1+SNRz). Comparing this expansion with (3.10) and noting the limiting result in (3.6) and the observation in Fig. 3.2, we expect that R2 σ 2 µ 3 1 which is verified numerically in Fig approaches var(t s B log 2 (1 + SNRz)) as R increases, The improvement in the throughput of HARQ-IR with increasing R comes at the 44

64 µ 1 σ Figure 3.4: Mean µ 1 and the variance σ 2 of the transmission time T vs. transmission rate R. SNR = 6 db and θ = R cost of increased transmission time. This is demonstrated in Fig. 3.4 which shows that both the mean µ 1 = E{T } and the variance σ 2 = var(t ) of the random transmission time T increases with increasing R. Two curves in Fig. 3.4 are obtained using simulation results. It is interesting to note that this increased transmission time in HARQ-IR does not have detrimental impact on the throughput under queuing constraints, which is a testament to the efficient utilization of the channel and resources by HARQ-IR. Indeed, it takes more time to send the data but proportionally a large amount of data is sent successfully with HARQ-IR over this extended period of time. Another observation in Fig. 3.4 is at the other end of the line. As R diminishes, µ 1 and σ 2 approach 1 and 0, respectively. This implies from (3.10) that C e R for very small R, explaining the linear growth of the effective capacity curve of HARQ-IR for small R values in Fig In Fig. 3.5, we plot the effective capacity vs. R curve for different values of the QoS exponent θ. We see that larger θ values (and hence stricter queuing constraints) expectedly lead to lower throughput. Equivalently, as θ increases, the same effective capacity is achieved by transmitting at higher rates R and hence by potentially ex- 45

65 2 Effective capacity θ=0 θ=0.1 θ= Figure 3.5: Effective capacity C e of HARQ-IR vs. transmission rate R at SNR = 6 db for different θ values. R Effective capacity θ=0 θ=0.2 θ= /µ 1 Figure 3.6: Effective capacity C e vs. 1 µ 1 at SNR = 6 db for different θ values. 46

66 Effective capacity T =1 u T u =5 T u =10 T = u Figure 3.7: Effective capacity C e vs. transmission rate R for different hard deadline constraints. SNR = 6 db and θ = 0.1 R periencing larger transmission time as depicted in Fig We note in Fig. 3.6 that especially for high effective capacities, when θ is increased, the same effective capacity is achieved at smaller values of 1 µ 1 = 1 E{T }. Finally, we address the impact of hard-deadline constraints in Fig We plot C e vs. R curves for different values of the upper bound T u on the transmission time T (or equivalently the number of HARQ rounds). Again, the expected value and variance of the transmission time are obtained via simulation, and effective capacities are calculated using our first-order approximation. We readily observe that when hard deadline constraints are imposed, there exists an optimal transmission rate R (T u ) at which the throughput is maximized and beyond which the throughput starts diminishing. The optimal R (T u ) and the achieved maximum throughput get larger for larger T u while the throughput monotonically increases with increasing R when no deadline constraints are imposed, i.e., when T u =. 47

67 3.2 Energy Efficiency of Hybrid-ARQ under Statistical Queuing Constraints System Model and Preliminaries In this subsection, we introduce our system model, preliminaries on the HARQ-CC scheme, and energy efficiency metrics in the low-snr regime. First, we describe our system and channel model. In order to enhance the reliability, the system employs HARQ-CC scheme with fixed transmission rate. A brief introduction on HARQ-CC is provided following the introduction on the system model. A detailed discussion regarding dealing with outage events is given in Section Finally, we introduce the two energy efficiency metrics, namely minimum energy per bit and wideband slope, in the low-snr regime System Model In this section, as depicted in Fig. 3.1, the same point-to-point wireless communication system is considered, in which data packets arriving from the source are initially stored in a buffer at the transmitter before being sent over a fading channel to a receiver. We assume a block flat-fading model in which the fading coefficients stay the same within one block, but change independently across blocks. Each fading block is assumed to have a duration of l symbols. Throughout this section, we use subscript i as the discrete time index. The received signal in the i th block can be written as y i = h i x i + n i i = 1, 2,... (3.20) Above, x i and y i are the transmitted and received signal vectors of length l, respectively, and h i denotes the channel fading coefficient in the i th block. Also, n i represents the noise vector with i.i.d. circularly-symmetric, zero-mean Gaussian com- 48

68 ponents, each with variance N 0 2. Then, the instantaneous capacity (bits/symbol) in the i th block is given by C i = log 2 (1 + SNRz i ), (3.21) where z i = h i 2 is the magnitude-square of the fading coefficient, SNR = E N 0 denotes the signal-to-noise ratio, and E = 1 l E{ x i 2 } is the average energy per transmitted symbol HARQ-CC To guarantee the reliability of the system, we assume that the system employs HARQ- CC scheme with fixed transmission rate R (bits/symbol), and the size of each packet is fixed at lr bits, where l is the number of symbols in each fading block. If the receiver decodes the received packet correctly, it sends an ACK feedback to the transmitter through an error-free feedback link, and a new packet will be sent in the next time block. If the receiver cannot decode the packet, a retransmission request is sent through the feedback link, and another codeword block of the same packet will be sent in the next time block. For simplicity, we assume an ideal ARQ protocol in our analysis, in which the transmitter gets the feedback immediately at the end of each time block without any delay. In this section, deadline constraint is incorporated to control the average packet delay. More specifically, the deadline constraint limits the the maximum number of successive retransmission attempts of a packet (or equivalently the number of HARQ rounds for a packet). We define the HARQ period as the duration of successive time blocks used to transmit a single packet. Then, the deadline constraint limits the 2 Our model considers block fading with independent fading coefficients across different blocks and also a white noise process. In practical scenarios in which fading is correlated, our model assumptions will be applicable if frame-level interleaving and deinterleaving are performed at the transmitter and receiver, respectively, potentially introducing more delay compared with symbol-level interleaving. Also, if non-white noise is experienced, a whitening filter can be employed at the receiver. 49

69 maximum duration of HARQ periods. In this section, we assume that the deadline constraint is M time blocks, and the packets that cannot be received correctly by the receiver in the first HARQ period become outdated or their transmission priority is lowered. Therefore, the retransmission of a packet continues until the receiver gets the packet without error or if the limit on the number of retransmissions is reached, and then the transmitter starts with another packet in the next HARQ period. The receiver starts combining the received signal from the beginning in each HARQ period. Whether the previous packet (which has experienced deadline violation) is kept in the buffer for transmission later or is discarded from the buffer depends on the queue models described in the next subsection. In the HARQ-CC scheme, the same coded data is transmitted in each retransmission. The receiver employs maximum-ratio combining and decodes the data packet error-free after the N th round only if R satisfies [7] R log 2 (1 + SNR ) N z i. (3.22) Outage happens when a packet cannot be received correctly within one HARQ period, and we denote the value of outage probability by ε. More specifically, the outage probability is expressed as Pr {log 2 ( 1 + SNR i=1 ) } M z i < R = ε. (3.23) i=1 Although the transmitter always sends information at a fixed rate, HARQ-CC has the ability to adapt the average transmission rate to the channel conditions without requiring perfect CSI at the transmitter. According to (3.22), if the data packet is successfully decoded in the N th round, the average transmission rate is bounded as ( 1 + ) ( N 1 SNR i=1 z i < R 1 log N N ) N SNR i=1 z i. For instance, when the 1 N log 2 channel conditions are favorable, the transmission of a single packet can be completed 50

70 Figure 3.8: Structure of a packet transmission period in queue model I within a few blocks, resulting in a relatively large average transmission rate, and vice versa if the channel conditions are poor. In [7], a detailed discussion about the throughput of different types of HARQ is provided Queue Models As we have mentioned in Section , the transmitter is equipped with a buffer that is used to store the packets before transmission to the receiver. In this section, we consider two typical queue models. 1. Queue Model I: Packets are removed from the buffer only when they are received by the receiver correctly. If a packet is not received by the receiver correctly within M successive time blocks, the transmitter reduces its priority, and starts transmitting the packet with the highest priority in the next time block. 2. Queue Model II: Packets are removed from the buffer when they are received by the receiver correctly or if the duration of its HARQ period reaches the upper limit M. In queue model I, there is no limitation on the overall number of time blocks used for a packet, and no packet is discarded. Instead, the packets are sorted according to 51

71 their priorities 3, and the transmitter transmits the packet with the highest priority in each HARQ period. The priority can be determined by the urgency of the data packet, and the priority level of a packet is reduced every time the deadline constraint is violated during its transmission. Once the priority of a packet is reduced, it can be transmitted again when it has the highest priority in the queue. In other words, a packet can occupy multiple HARQ periods in this model. In this case, the meaning of the deadline constraint is to control the average packet delay 4. By reducing the priority of an outdated packet, the packets waiting behind this outdated packet can have smaller packet delay. We define the packet transmission period as the duration of successive time blocks until one packet is removed from the buffer (either due to successful transmission, or for instance, in queue model II, due to deadline violation), and denote the packet transmission period by T. When a successful transmission of the packet occurs, we define V as the number of time blocks used to successfully transmit the packet within the last HARQ period. In queue model I, if there are multiple HARQ periods, transmission in the last HARQ period of a packet transmission period is successful, and all other HARQ periods in the same packet period have ended up with outages. Therefore, the duration of the final HARQ period in a packet transmission period can be represented by V, and we can have the relationship T = km +V, where k represents the number of failed HARQ periods or equivalently the number of outage events within the packet transmission period. Fig. 3.8 shows an example of a packet transmission period with T = 8, k = 2, M = 3, and V = 2 in queue model I. A detailed discussion about the distributions of T and V is provided in Section The packets with the same priority level are sorted according to their arrival order. 4 In this section, packet delay is defined as the duration a packet waits starting from its entrance to the buffer until its successful transmission. 52

72 Queue model I is suitable for applications, in which the receiver insists on getting every data packet it requests from the transmitter. For instance, in a cache-aided system, the receiver updates its cached data using the outdated data packets for future use [83]. In the second queue model, a packet transmission period only contains one HARQ period, and we have T M. Hence, there is always a departure from the buffer within one HARQ period due to either successful transmission or packet drop because of deadline violation. Queue model II is suitable for applications, in which outdated data is useless, such as live video transmission. In the buffer analysis, the instantaneous departure rate takes only two values. The departure rate is c i = R (bits/symbol) when a packet is removed from the queue in the i th time block, otherwise c i = 0. More specifically, in queue model I, c i = R when a packet is received by the receiver correctly; in queue model II, c i = R when a packet is received by the receiver correctly or a packet is discarded because of the deadline constraint. For the queue model I, the throughput is characterized by the maximum average arrival rate r avg that can be supported under queuing constraints, described by (2.1) in Chapter 2. For the queue model II, the throughput is given by the maximum average arrival rate r avg multiplied by (1 ε), because only (1 ε) of the packets are received by the receiver, and ε of the packets are discarded on average due to deadline violations Energy Efficiency Metrics As mentioned in the previous subsection, the system throughput is characterized by the average arrival rate r avg and (1 ε)r avg in the queue models I and II, respectively. 53

73 Moreover, we choose energy per bit 5, defined as SNR over the throughput r T H, i.e., E b N 0 = SNR r T H = SNR r avg(θ,snr) SNR (1 ε)r avg (θ,snr) in queue model I in queue model II, (3.24) as the metric for energy efficiency under statistical QoS constraints. For a given throughput requirement, the system with smaller energy per bit has better energy efficiency. In this section, we focus on the low-snr regime, in which the throughput r T H can be approximated as a linear function of the bit energy in db scale [84]: r T H = ( S 0 Eb E ) ( b Eb + o E ) b, (3.25) 10 log 10 2 N 0 db N 0 min,db N 0 N 0 min where E b N 0 db = 10 log 10 E b N 0, E b N 0 min,db is the minimum energy per bit in db scale, which is achieved as SNR and throughput approach 0 in our system setting, and S 0 10 log 10 2 is the slope of the throughput curve at E b N 0 min. Because of the linear behavior of the throughput curve in the low-snr regime, energy efficiency can be described by the minimum energy per bit and wideband slope, and we have the following characterizations: 1. The system with smaller E b N 0 min has better energy efficiency for sufficiently small SNR. 2. Among the systems with the same E b N 0 min, the system with higher S 0 value has better energy efficiency in the low-snr regime, because higher slope provides higher throughput increment as E b N 0 increases, or equivalently the same throughput is achieved at a smaller value of E b N 0. Therefore, the minimum energy per bit E b N 0 min and wideband slope S 0 are the key 5 In the literature, E b N 0 is also referred to as the signal-to-noise ratio per bit. In this section, we denote signal-to-noise ratio per symbol as SNR, and throughput is in bits/symbol. Therefore, SNR divided by the throughput provides the SNR per bit. In (3.24), we also assume that the circuit power consumption is negligible and the transmission power is the dominant factor. 54

74 metrics of energy efficiency in the low-snr regime. In queue model I, the minimum energy per bit is obtained from [84] E b N 0 min = lim SNR 0 SNR r avg (θ, SNR) = 1 ṙ avg (θ, 0) (3.26) where ṙ avg (θ, 0) denotes the first derivative of the system throughput r avg (θ, SNR) with respect to SNR at zero SNR. The wideband slope S 0 is given by S 0 = 2(ṙ avg(θ, 0)) 2 r avg (θ, 0) log e 2. (3.27) Above, r avg (θ, 0) denotes the second derivative of r avg (θ, SNR) 6 with respect to SNR at zero SNR. Correspondingly, the minimum energy per bit and wideband slope in queue model II are given by E b N 0 min = 1 (1 ε)ṙ avg (θ, 0) (3.28) and S 0 = (1 ε) 2(ṙ avg(θ, 0)) 2 r avg (θ, 0) log e 2, (3.29) respectively Energy Efficiency of HARQ-CC scheme with Fixed Outage Probability In this subsection, we study the energy efficiency of HARQ-CC scheme with fixed outage probability. Initially, we consider constant-rate arrivals, characterize throughput by employing the effective capacity formulation, and derive the minimum energy 6 In the remainder of this section, especially when θ is fixed and derivatives with respect to SNR are considered, we generally express average arrival rate and effective capacity only as a function of SNR explicitly as r avg (SNR) and C E (SNR), respectively, and suppress θ in order to avoid cumbersome expressions. 55

75 per bit and wideband slope. Subsequently, we incorporate random arrival models by considering discrete-time Markov, Markov fluid, and Markov modulated Poisson sources and determine the system throughput and analyze the energy efficiency again by determining the minimum energy per bit and wideband slope. Based on these results, a comparison of the energy efficiency with different arrival models is given in the next subsection Statistical Distribution of T Before obtaining the minimum energy per bit and wideband slope expressions for HARQ-CC, we first characterize the system throughput of HARQ-CC scheme subject to an outage constraint. Recall that an outage event happens if the receiver does not correctly decode the message within an HARQ period with a maximum duration of M time blocks. The formulation of the outage probability is given in (3.23). Correspondingly, the transmission rate that guarantees an outage probability of ϵ can be expressed as [9] for both queue model I and II, where F 1 M R = log 2 ( 1 + F 1 M (ε)snr) (3.30) is the inverse cumulative distribution function (CDF) of M i=1 z i. Specifically, for Rayleigh fading, 2 E{z} M i=1 z i follows a chisquare distribution with 2M degrees of freedom; for Nakagami-m fading, M i=1 z i follows a Gamma distribution with shape parameter M m and scale parameter E{z}/m. Hence, using the above rate expression and the formulation (3.10) in Section 3.1, we can express, for small θ, the throughput of the HARQ-CC scheme subject to an outage constraint ϵ as r avg (SNR) = log [ 2(1 + F 1 M (ε) SNR) log2 (1 + F 1 M (ε) SNR)] 2 σ 2 θ, (3.31) µ 2µ 3 for both queue model I and II. The only difference between the average arrival rates 56

76 in these two queue models lies in the expressions of µ and σ 2. In order to obtain the expressions of µ and σ 2, we first find the probability mass function (pmf) of T, which represents the duration of a packet period. Recall that in Section , we denote the duration of the last HARQ period in a packet transmission period by V, and we have characterized the relationship T = km + V in queue model I. We denote the values of the random variables T and V by t and v, respectively. In the rest of this section, we use subscript Q1 and Q2 to distinguish the notations (T, µ, σ, C E, E b N 0 min and S 0 ) for queue models I and II, respectively. Queue model I: The probability that the transmission of the first k packets have ended in failure due to the the deadline constraint M, and the (k + 1) th packet is successfully transmitted after v M time blocks is given as follows: Pr{T Q1 = km + v} = ε k Pr{V = v} (3.32) where ε is the outage probability. According to the condition given in (3.22), Pr{V = v} for v M can be expressed as Pr{V = v} = Pr{V v} Pr{V v 1} (3.33) { } { } v ) v 1 ) = Pr log 2 (1 + SNR z i R Pr log 2 (1 + SNR z i R (3.34) = Pr { v i=1 i=1 z i F 1 M (ε) } Pr { v 1 } z i F 1 M (ε) i=1 i=1 (3.35) = F v 1 ( F 1 M (ε)) F v ( F 1 M (ε)) (3.36) where F v is the CDF of v i=1 z i. Now, (3.32) can be expressed as Pr{T Q1 = km + v} = ε k ( F v 1 ( F 1 M (ε)) F v ( F 1 M (ε))). (3.37) 57

77 Queue model II: Recall that the value of T Q2 cannot exceed M in queue model II. T Q2 < M corresponds to successful transmission, and T Q2 = M corresponds to either successful transmission using M time blocks or an outage event due to deadline violation, which leads to packet being removed from the buffer and discarded. Therefore, we can express the pmf of T Q2 as ( Pr{V = t} = F t 1 F 1 M Pr{T Q2 = t} = (ε)) ( F t F 1 M (ε)), ( Pr{V = M} + ε = F M 1 F 1 M (ε)), t < M t = M (3.38) where V has the same pmf as in queue model I. Recall that V is defined only for successful transmission, and thus V = M in (3.38) represents that the packet is received successfully using M time blocks. Theorem 2 For queue model I, the expected value and variance of T are given by µ Q1 = 1 1 ε σ 2 Q1 = 1 1 ε M v=1 v=1 v Pr{V = v} + Mε 1 ε M v 2 Pr{V = v} + 2Mε (1 ε) 2 M v Pr{V = v} v=1 (3.39) + M 2 ε(1 + ε) (1 ε) 2 µ 2 Q1, (3.40) respectively. And for queue model II, the expected value and variance of T are given by M µ Q2 = t Pr{V = t} + Mε, (3.41) t=1 M σq2 2 = t 2 Pr{V = t} + M 2 ε µ 2 Q2, (3.42) t=1 respectively. In the above expressions, the pmf of random variable V is given by (3.36) 58

78 for both queue models I and II. Proof : See Appendix A Energy Efficiency of HARQ-CC with Constant-Rate Arrivals Note that the expressions of µ and σ 2 do not depend on SNR in both queue models I and II. In the following result, we characterize the energy efficiency in the low SNR regime for small θ. Theorem 3 For small QoS exponent θ, the minimum energy per bit and wideband slope of the HARQ-CC scheme with the outage constraint ϵ are given, respectively, by E b = µ Q1 log e 2 N 0 min Q1 F 1 M (ε), (3.43) 2µ Q1 log S 0 Q1 = e 2 σq1 2 θ + µ2 Q1 log e 2, (3.44) for queue model I, where µ Q1 and σq1 2 are given by (3.39) and (3.40), respectively. For queue model II, the minimum energy per bit and wideband slope are given, respectively, by E b = µ Q2 log e 2 N 0 min Q2 (1 ε)f 1 M (ε), (3.45) 2µ Q2 log S 0 Q2 = (1 ε) e 2 σq2 2 θ + µ2 Q2 log e 2, (3.46) respectively, where µ Q2 and σ 2 Q2 are given by (3.41) and (3.42). Proof : See Appendix A.3. We immediately notice that for both queue models I and II, the minimum energy per bit E b N 0 min does not depend on the QoS exponent θ, and hence is not affected by the presence of QoS constraints. On the other hand, via µ and F 1 M (ε), E b N 0 min is a function of the deadline constraint M and the outage limit ϵ. This dependence will 59

79 be explored in the numerical results. We further notice that the wideband slope S 0 diminishes with increasing θ. Hence, stricter QoS constraints lead to smaller slopes, increasing the energy per bit requirements at the same throughput level. Proposition 1 For the same SNR, θ and channel fading, queue models I and II lead to the same minimum energy per bit. On the other hand, the system operating with queue model II achieves a higher wideband slope. Proof : See Appendix A.4. A detailed discussion of the characterization in Proposition 1 is provided in the numerical results Energy Efficiency of HARQ-CC with ON-OFF Discrete-Time Markov Source As mentioned in Section 2.2.2, when the arrival rate a i is not constant, the computation of the throughput is more involved. Generally, we need to express the LMGFs of the random arrival processes and random departure processes (or equivalently random wireless transmissions), and then solve (2.4) in order to determine the maximum average arrival rate r avg that can be supported by the wireless transmissions under statistical queuing constraints. In these cases, derivation of the minimum bit energy and wideband slope only involves the first and second order derivatives of r avg evaluated at SNR = 0, which can be obtained easily by taking the derivatives of both sides of (2.4) and letting SNR 0. In this subsection, we analyze the energy efficiency of HARQ-CC with fixed outage probability when we have ON-OFF discrete-time Markov sources. A detailed study about the throughput of the ON-OFF discrete-time Markov source under statistical queuing constraints is provided in Section Since the departure and arrival processes at the transmitter are independent, for both queue 60

80 models I and II, the expressions of µ and σ 2 in (3.39), (3.40), (3.41) and (3.42) are still valid. Theorem 4 For small QoS exponent θ and ON-OFF discrete-time Markov source, the minimum energy per bit and wideband slope of the HARQ-CC scheme with the outage constraint ϵ are given, respectively, by E b = µ Q1 log e 2 N 0 min Q1 F 1 M (ε), (3.47) 2 log e 2 S 0 Q1 = σ 2 Q1 θ+µ2 Q1 log e 2 µ Q1 + θζ (3.48) for queue model I, where µ Q1 and σ 2 Q1 are given by (3.39) and (3.40), respectively, and ζ is defined as ζ = (1 p 22)(p 11 + p 22 ) (1 p 11 )(2 p 11 p 22 ). (3.49) For queue model II, the minimum energy per bit and wideband slope are given by E b = µ Q2 log e 2 N 0 min Q2 (1 ε)f 1 (3.50) (ε), M S 0 Q2 = 2(1 ε) log e 2, (3.51) σq2 2 θ+µ2 Q2 log e 2 µ Q2 + θζ respectively, where µ Q2 and σ 2 Q2 are given by (3.41) and (3.42). Proof : See Appendix A Energy Efficiency of HARQ-CC with ON-OFF Fluid Markov Source In this subsection, we consider the ON-OFF fluid Markov sources, which is studied in detail in Section Using a similar approach as for the discrete-time Markov source, we can find the minimum energy per bit and wideband slope for the ON-OFF fluid Markov source as in the following result. 61

81 Theorem 5 For small QoS exponent θ and ON-OFF fluid Markov source, the minimum energy per bit and wideband slope of the HARQ-CC scheme with the outage constraint ϵ are given, respectively, by E b = µ Q1 log e 2 N 0 min Q1 F 1 M (ε), (3.52) 2 log e 2 S 0 Q1 = σ 2 Q1 θ+µ2 Q1 log e 2 µ Q1 + 2θβ α(α+β) (3.53) for queue model I, where µ Q1 and σq1 2 are given by (3.39) and (3.40), respectively. For queue model II, the minimum energy per bit and wideband slope are given, respectively, by E b = µ Q2 log e 2 N 0 min Q2 (1 ε)f 1 (3.54) (ε), S 0 Q2 = M 2(1 ε) log e 2 σ 2 Q2 θ+µ2 Q2 log e 2 µ Q2 + 2θβ α(α+β) (3.55) respectively, where µ Q2 and σ 2 Q2 are given by (3.41) and (3.42). Proof : See Appendix A Energy Efficiency of HARQ-CC with ON-OFF MMPS In this subsection, we investigate the energy efficiency of ON-OFF MMPS models. The throughput of MMPS is investigated in Section , and the following result identifies the the minimum energy per bit and wideband slope for the ON-OFF MMPS models. Theorem 6 For small QoS exponent θ and ON-OFF MMPS, the minimum energy per bit and wideband slope of the HARQ-CC scheme with the outage constraint ϵ are 62

82 given, respectively, by E b = eθ 1 N 0 min Q1 θ S 0 Q1 = θ e θ 1 µ Q1 log e 2 F 1 M (ε), (3.56) 2 log e 2 (3.57) σ 2 Q1 θ+µ2 Q1 log e 2 µ Q1 + 2θβ α(α+β) for queue model I, where µ and σ 2 are given by (3.39) and (A.37), respectively. For queue model II, the minimum energy per bit and wideband slope are given, respectively, by E b = eθ 1 N 0 min Q2 θ S 0 Q2 = θ e θ 1 µ Q2 log e 2 (1 ε)f 1 M (ε), (3.58) 2(1 ε) log e 2 (3.59) σ 2 Q2 θ+µ2 Q2 log e 2 µ Q2 + 2θβ α(α+β) respectively, where µ Q2 and σ 2 Q2 are given by (3.41) and (3.42). A comparison of the results in Theorems 3 6 is provided in the following subsection. Proof : See Appendix A Comparison of the Energy Efficiency for Different Arrival Models In this subsection, we compare the results obtained in the previous subsection for constant-rate arrivals, ON-OFF discrete-time and fluid Markov sources, and ON- OFF MMPS. In the first part below, we compare the results between constant-rate arrivals and ON-OFF discrete-time Markov sources. In the second part, we provide a comparison among constant-rate arrivals, ON-OFF fluid Markov sources and MMPS. Our analysis shows that source burstiness makes it difficult to satisfy the queuing constraint, which leads to degraded energy efficiency. The key parameters that have 63

83 significant impact on the energy efficiency of these random arrival models are the QoS exponent θ, ON state probability P ON and the state transition parameters p 21 and β Comparison Between Constant Arrival and ON-OFF Discrete- Time Markov Source The results on the minimum energy per bit and wideband slope for constant-rate arrivals and ON-OFF discrete-time Markov sources are given in Theorems 3 and 4, respectively. Since the results for queue model II are very similar to the results for queue model I with the only difference being the additional factor (1 ε), the discussion in this subsection is applicable to both queue models. We first observe that source randomness does not have any influence on the minimum energy per bit, and the results of minimum energy per bit shown in Theorem 4 are the same as in the case of constant-rate arrivals. On the other hand, source burstiness has an impact on the wideband slope. Compared with the case of constantrate arrivals, there is an additional term θζ in the denominator, and this additional term is only related to the arrival process. Since both of p 11 and p 22 are between 0 and 1, it is easy to verify that θζ 0, which means that random arrivals always degrade the wideband slope and make the system less energy-efficient compared with constant-rate arrivals. For the ON-OFF discrete-time Markov source, source burstiness is described by P ON and p 21. P ON represents the probability that the source is in ON state, and p 21 denotes the probability that the source transitions from ON state to the OFF state. When P ON = 1, ON-OFF discrete-time Markov source becomes a constantrate source, and p 11 = 0, p 22 = 1. Under this situation, we have ζ = 0, and the results in Theorem 4 specialize to those obtained in the case of the constant-rate arrivals. 64

84 For any P ON in the open interval (0, 1), we can rewrite the expression of ζ as ζ = (1 P ON)(1 p 21 2P ON ) p 21 P ON, (3.60) by applying the facts p 22 = 1 p 21, p 12 = p 21P ON 1 P ON and p 11 = 1 p 12 = 1 (1+p 21)P ON 1 P ON to (3.49). It can be easily verified that ζ is an decreasing function of both P ON and p 21, which means that higher P ON and p 21 values improve the energy efficiency by increasing the wideband slope. Also, we notice that when θ = 0, the additional term is zero, and the parameters of the arrival process do not have any influence on the energy efficiency. When θ becomes larger, the influence of source burstiness becomes more significant. An intuitive description for this is provided in the numerical results subsection Comparison Among Constant Arrivals, ON-OFF Fluid Markov Source and MMPS The results on the minimum energy per bit and wideband slope for constant-rate arrivals, ON-OFF fluid Markov sources and MMPS are given in Theorems 3, 5 and 6, respectively. Similarly as in the previous subsection, our following remarks are applicable to both queue models. From the comparison between Theorems 3 and 5, we notice that burstiness/randomness of the ON-OFF fluid Markov sources does not affect the minimum energy per bit, and it only results in the addition of the positive term 2θβ α(α+β) in the denominator of the wideband slope expressions in (3.53) and (3.55). Therefore, constant-rate arrival sources have better energy efficiency, compared with ON-OFF fluid Markov sources. Similar to ON-OFF discrete-time Markov sources, the burstiness of the ON-OFF fluid Markov sources is described by P ON and β. When P ON = 1, arrival rates become constant, and this additional term vanishes. For any P ON in the open interval (0, 1), 65

85 we can rewrite the expression of the additional term as 2θβ α(α + β) = 2θ (1 P ON) 2, (3.61) P ON β by applying α = P ON β 1 P ON. It can be easily verified that this additional term is an decreasing function of both P ON and β, which means that higher P ON and β values improve the energy efficiency by increasing the wideband slope. As in the case of ON-OFF discrete-time Markov sources, we notice that as θ increases, the effect of source burstiness becomes more pronounced, while the parameters of the arrival process do not have any influence on the energy efficiency when θ = 0. When comparing the results of Theorems 5 and 6, we assume that these two kinds of Markovian sources share the same α and β values. From the comparison, we notice that Poisson arrival model only leads to an additional factor of θ e θ 1 in the expressions of the minimum energy per bit and wideband slope. Therefore, β has the same impact as in the case of ON-OFF fluid Markov sources. For θ 0, we have θ e θ 1 1, resulting in a larger minimum energy per bit and smaller wideband slope for the ON-OFF MMPS compared to those for the ON-OFF Markov fluid source. Since the factor θ is a decreasing function of θ, the performance gap grows further as the e θ 1 queuing constraint gets stricter. Moreover, as a stark contrast to the observations in Sections and , the minimum energy per bit depends on θ when ON-OFF MMPS arrival model is considered. Therefore, we can conclude that among these three arrival models, highest energy efficiency is achieved in the case of constant-rate arrivals while ON-OFF MMPS leads to the worst levels of energy efficiency. 66

86 0-0.1 Queue model I Queue model II -0.2 log Pr{Q q} q Figure 3.9: Logarithmic buffer overflow probability vs. buffer overflow threshold Numerical Results In this subsection, we present numerical results to illustrate the energy efficiency of HARQ-CC in the presence of QoS constraints. In the first part, numerical results for the constant-rate arrival model are provided to demonstrate the influence of the deadline constraint M and outage probability ε. In the second part, we concentrate on the impact of random arrivals and source burstiness. Via Monte Carlo simulation, we verify the analytical results provided in our theorems. A comparison between queue models I and II is also provided in the first subsection, verifying Proposition Constant-Rate Arrival Models In this part, we analyze the energy efficiency of the HARQ-CC scheme with fixed transmission rate and constant arrival rate, and we assume Rayleigh fading channel with exponentially distributed fading power having a mean value of E{z} = 1 within this subsection. First, we have performed Monte Carlo simulations to verify that the constant arrival rate, or equivalently the effective capacity, given by (3.10) in Section 3.1 satisfies 67

87 Throughput (bits/symbol) Queue model I, ǫ=0.1, M=11 Queue model II, ǫ=0.1, M=11 Queue model I, ǫ=0.2, M=3 Queue model II, ǫ=0.2, M=3 Simulation for model I, ǫ=0.1, M=11 Simulation for model II, ǫ=0.1, M=11 Simulation for model I, ǫ=0.2, M=3 Simulation for model II, ǫ=0.2, M= Energy per bit, E /N (db) b 0 Figure 3.10: Throughput vs. energy per bit E b N 0. the statistical queuing constraint in both queue models I and II 7. In Fig. 3.9, we set the queuing constraint as θ = 0.1, choose the outage probability as ε = 0.1, and we plot the logarithmic buffer overflow probabilities log Pr{Q q} as functions of the buffer overflow threshold q for both queue models I and II. For each queue model, we have repeated the simulations 100 times, in each of which the simulation is conducted over time blocks. In the simulation, the values of µ Q1 and σ 2 Q1 are computed using (3.39) and (3.40), respectively, and µ Q2 and σ 2 Q2 are computed using (3.41) and (3.42), respectively. From Fig. 3.9, we observe that the logarithmic buffer overflow probabilities decrease linearly even starting from relatively small q values, agreeing with the characterizations in (2.1) and (2.2) 8. We have estimated the slopes via linear regression, and the estimated slopes for queue models I and II are and 7 In queue model I, when a deadline violation occurs and the packet is not successfully sent within an HARQ period of M time blocks, we keep the packet in the buffer, clear the accumulated signal at the receiver, and initiate the transmission anew in the next HARQ period. Under these assumptions, since the sole goal of the simulations is to keep track of the queue length, there is no distinction regarding whether a new packet is transmitted in the next HARQ period or the same packet is repeated. On the other hand, in queue model II, we discard the outdated packet from the buffer after violating the deadline constraint, and start transmitting a new packet. 8 Note that if (2.1) holds, then log(pr{q q}) θq + log ς and hence the logarithmic overflow probability decays linearly with slope θ as threshold q increases. 68

88 0.103 respectively, which are indeed very close to the desired value 0.1. In Fig. 3.10, we plot the throughput, which is C E (θ, SNR) for queue model I, and (1 ε)c E (θ, SNR) for queue model II, as a function of the energy per bit E b N 0, under two different outage constraints ϵ and deadline constraints M. The results in Fig are also validated via Monte Carlo simulations using (2.8). To determine the LMGF of the departure process, we have conducted simulations over time blocks and repeated this for times. We notice that analytical and simulation results agree perfectly for both queue models I and II. Note that the throughput for both queue models cannot exceed E{log 2 (1 + SNRz)}, which is the Shannon capacity achieved in the absence of queuing constraints. Since this throughput upper bound is an increasing concave function of SNR, the minimum energy per bit is achieved as SNR approaches 0, i.e., lim SNR = log e 2 SNR 0. Since we set E{z} = 1, the E{log 2 (1+SNRz)} E{z} minimum energy per bit cannot be smaller than log e 2, which is equal to 1.59 db. Comparing the curves of these two queue models, we find that queue model II has better energy efficiency. According to our results in Proposition 1, queue model I and II should achieve the same minimum energy per bit, while queue model II achieves a higher wideband slope in the constant-rate arrival model. In order to verify Proposition 1, we have computed the E b N 0 min and S 0 for both queue models I and II in Figs and All E b N 0 min and S 0 values in Figs and 3.12 are verified via simulations. For each pair of (throughput vs. E b N 0 min and S 0, we have obtained 50 points of the throughput curves E b N 0 db ) in the low-snr regime (SNR 33 db) from simulations, and estimated E b N 0 min and S 0 via linear regression according to (3.25). The maximum errors of E b N 0 min and S 0 are % and 0.3%, respectively. From Figs and 3.12, we observe that queue model I and II have the same minimum energy per bit, and the wideband slopes of queue model II are slightly greater than the wideband slopes of queue model I. This observation agrees with Proposition 1, and an intuitive 69

89 Minimun energy per bit E /N (db) b 0 min Wideband slope S Queue model I, M=2 Queue model II, M=2 Queue model I, M=10 Queue model II, M= Outage probability ǫ 2 Queue model I, M=2 Queue model II, M=2 Queue model I, M=10 1 Queue model II, M= Outage probability ǫ Figure 3.11: Minimum energy per bit E b N 0 min and wideband slope S 0 vs. outage probability ϵ. Minimum energy per bit E b /N 0 min (db) Wideband slope S Queue model I, ǫ=0.05 Queue model II, ǫ=0.05 Queue model I, ǫ=0.2 Queue model II, ǫ= Deadline constraint M 2 Queue model I, ǫ=0.05 Queue model II, ǫ=0.05 Queue model I, ǫ=0.2 1 Queue model II, ǫ= Deadline constraint M Figure 3.12: Minimum energy per bit E b N 0 min and wideband slope S 0 vs. constraint M. deadline 70

90 explanation is given as follows for the minimum energy per bit. The throughput of queue model I and II are given by r T H Q1 = R/µ Q1 and r T H Q2 = (1 ε)r/µ Q2, respectively, when θ = 0. From (A.46) in Appendix A.4, we have r T H Q1 = r T H Q2, which means that the throughput curves of these two queue models are exactly the same. This implies that E b N 0 min Q1 = E b N 0 min Q2 and S 0 Q1 = S 0 Q2, when θ = 0. Since minimum energy per bit does not depend on the queuing constraints, E b N 0 min Q1 = E b N 0 min Q2 is valid for any θ value. In Fig. 3.11, we display the minimum energy per bit E b N 0 min and wideband slope S 0 as functions of the outage probability constraint ϵ for two different values of the deadline constraint M. It is observed from the figure that the minimum energy per bit first decreases with increasing ϵ and then starts increasing after a certain threshold point. When the outage probability is small, the fixed transmission rate R is small, which leads to small departure rates for both queue models I and II. On the other hand, when the outage probability is large, the average transmission rate is small for both queue models I and II, because the transmitter wastes a whole HARQ period when an outage happens. Also, we observe that the wideband slope always decreases with increasing ϵ. In Fig. 3.12, the minimum energy per bit and wideband slope are plotted as functions of the deadline constraint M for both queue models I and II. It is seen that both the minimum energy per bit and wideband slope decrease with increasing M. Hence, by reducing the minimum energy per bit, relaxed deadline constraints lead to improvements in energy efficiency in the vicinity of E b N 0 min Random Arrival Models In this part, we investigate the impact of source randomness/burstiness on the energy efficiency. Within this subsection, we assume a Nakagami-m fading channel with m = 2, and E{z} = 1. Unless mentioned explicitly, QoS exponent is set to θ =

91 Queue model I Queue model II log Pr{Q q} q Figure 3.13: Logarithmic buffer overflow probability vs. buffer overflow threshold for the ON-OFF discrete-time Markov source. For all fixed outage probability results, we fix ε = 0.1. In Fig. 3.13, we provide the results of buffer simulations for the ON-OFF discrete-time Markov source, similarly as depicted in Fig The arrival rates in the ON state are given by (2.13) in Section for both queue models. We set p 11 = 0.4, p 22 = 0.7, and θ = 0.1 for both queue models. All other parameters are the same as in Fig The estimated slopes of queue models I and II are and 0.102, respectively, which are again very close to the desired value 0.1. As we have mentioned in Section 3.2.3, since source burstiness has similar impacts on queue models I and II, we only consider queue model I in the following discussion on the impacts of source burstiness. Figs and 3.15 demonstrate the influence of source burstiness considering both ON-OFF discrete-time Markov and Markov fluid sources for queue model I. For the ON-OFF discrete-time Markov source, the source burstiness is described by P ON and p 21, and the source burstiness is described by P ON and β for the ON-OFF Markov fluid source. As discussed in Section 3.2.3, larger P ON, p 21 and β values improve the energy efficiency for both queue models I and II. In Fig. 3.14, when fix p 21 = 0.3, the slopes of the throughput curves increase as P ON increases from 0.1 to Also, 72

92 0.25 Throughput (bits/symbol) P ON =0.1, P 21 =0.3 P ON =0.3, P 21 =0.3 P ON =0.75, P 21 =0.3 P ON =0.3, P 21 =0.1 P ON =0.3, P 21 = Energy per bit, E /N (db) b 0 Figure 3.14: Throughput vs. energy per bit E b N 0 source with fixed outage probability ε = 0.1. for ON-OFF discrete-time Markov 0.25 Throughput (bits/symbol) P ON =0.1, β=0.3 P ON =0.3, β=0.3 P ON =0.9, β=0.3 P ON =0.3, β=0.1 P ON =0.3, β= Energy per bit, E /N (db) b 0 Figure 3.15: Throughput vs. energy per bit E b N 0 fixed outage probability ε = 0.1. for ON-OFF Markov fluid source with 73

93 Throughput (bits/symbol) Markov fluid source, θ=0.001 MMPS, θ=0.001 Markov fluid source, θ=0.2 MMPS, θ= Energy per bit, E /N (db) b 0 Figure 3.16: Throughput vs. energy per bit E b N 0 MMPS with fixed outage probability ε = 0.1 for ON-OFF Markov fluid source and when P ON = 0.3 is fixed, the wideband slope increases as p 21 increases from 0.1 to 0.9. Since source burstiness does not affect the minimum energy per bit for both ON- OFF discrete-time Markov and Markov fluid sources, we observe that the throughput curves in Fig converge to the same minimum energy per bit. Similarly in Fig. 3.15, for the ON-OFF Markov fluid source, we can observe that larger P ON and β values increase the slope of the throughput curve, and all throughput curves in Fig again approach the same minimum energy per bit. When the average arrival rate is fixed, the arrival rate in the ON state increases as P ON decreases because r avg = rp ON, and large arrival rates make it difficult to satisfy the queuing constraint. Hence, larger P ON improves the energy efficiency for both ON-OFF discrete-time Markov and Markov fluid sources. When P ON is fixed, higher p 21 and β values make the source transition between two states more frequently. For the same P ON, more frequent state transitions make the queuing constraints to be satisfied more easily, because the change from ON state to OFF state gives the source the chance to clear its buffer. As we have mentioned in Section 74

94 3.2.3 that as θ increases, the influence of source burstiness becomes more significant, and the parameters of the arrival process do not have any influence on the energy efficiency when θ = 0, for both ON-OFF discrete-time Markov and Markov fluid sources. Therefore, larger P ON, p 21 and β values improve the energy efficiency by helping the system to satisfy queuing constraints more effectively, and this impact becomes more striking when the queuing constraints become stricter. If the system is not restricted by the queuing constraint, then the source burstiness does not affect the energy efficiency for the ON-OFF discrete-time Markov and Markov fluid sources. Finally, in Fig. 3.16, we compare the performances of ON-OFF Markov fluid source and MMPS for queue model I. As mentioned in Section 3.2.3, compared to the the minimum energy per bit and wideband slope of the ON-OFF Markov fluid source, the corresponding results for MMPS are scaled by the factor eθ 1 θ respectively. When θ is close to 0, both eθ 1 θ and its reciprocal, and its reciprocal approach 1. For this reason, the throughput curves of ON-OFF Markov fluid source and MMPS stay very close to each other in both figures when θ = As θ increases, the factor eθ 1 θ grows, which leads to larger gap between the throughput curves of these two types of Markov sources. For instance, we can easily observe from Fig that there is a 0.44 db difference between the corresponding minimum energy per bit values when θ =

95 Chapter 4 Throughput of Hybrid-ARQ under Statistical Queuing Constraints Using Recurrence Approach In this chapter, throughput of HARQ under statistical queuing constraints is studied via recurrence approach. Compared with the low-θ approximation used in Chapter 3, recurrence approach is more accurate for any QoS exponent value. However, it is difficult to obtain closed-form expression via recurrence approach. Therefore, we cannot conduct a similar energy efficiency analysis as we have done in Section 3.2. In Section 4.1, throughput of HARQ-CC schemes is studied in the presence of Markovian data arrivals and statistical queuing constraints. In particular, two queuing models are considered. Specifically, when outage occurs, the transmitter keeps the packet, lowers its priority, and attempts to retransmit it later in the first queue model while the packet is discarded and removed from the buffer in the second queue model. The throughput is investigated when outage constraints, statistical queuing constraints and deadline constraints are imposed. The deadline constraint provides a limitation on the number of retransmissions. Under these assumptions, throughput 76

96 characterizations are obtained for HARQ-CC scheme with three types of Markovian sources, namely the ON-OFF discrete-time and fluid Markov sources and MMPS. In Section 4.2, throughput of HARQ-IR schemes with finite blocklength codes is studied for both constant-rate and ON-OFF discrete-time Markov arrivals under statistical queuing constraints and deadline limits. After analyzing the decoding error probability and outage probability, the distribution of transmission period is characterized, and the throughput expressions are obtained for both arrival models. The analytical results are verified via Monte Carlo simulations. 4.1 Throughput of Hybrid-ARQ Chase Combining with ON-OFF Markov Arrivals under Statistical Queuing Constraints In this section, we study the throughput of HARQ-CC protocol under both statistical queuing constraints and deadline constraints. The system model is the same with in Section 3.2, and the discussions about HARQ-CC scheme, deadline constraints, outage probability, two queue models and the throughput metrics in Section are valid in this section as well. Different from the analysis in Section 3.2, we characterize the throughput via a recurrence approach, which provides sufficiently accurate results for any QoS exponent value Throughput of HARQ-CC Scheme with Queuing Constraints In this subsection, we study the throughput of HARQ-CC scheme under queuing constraints. Initially, we consider constant-rate arrivals, characterize throughput by employing the effective capacity formulation. Subsequently, we incorporate random 77

97 arrival models by considering three types of Markovian sources and determine the system throughput using the results obtained in the constant-rate arrival model Throughput of HARQ-CC Scheme with Constant-rate Arrivals Recall that an outage event happens if the receiver does not correctly decode the message within an HARQ period with a maximum duration of M time blocks. The formulation of the outage probability is given in (3.23). Correspondingly, the transmission rate that guarantees an outage probability of ϵ can be expressed as [9] R = log 2 ( 1 + F 1 M (ε)snr) (4.1) for both queue models I and II, where F 1 M function (CDF) of M i=1 z i. In this section, we define P t,v,qj is the inverse cumulative distribution as the probability that the duration of an HARQ period is t, and the number of packets removed from the queue in this HARQ period is v, for queue model j. From the discussion in Section , we have 1 t M, and v only takes two values, 1 or 0. In queue model I, v = 0 when outage occurs. In such cases, we have t = M, because outage happens only after the transmitter s unsuccessful M transmission attempts. P t,1,q1 is the probability that a transmission period ends up with success in the t th time block. From the above discussion, we have 0 t < M P t,0,q1 = ε t = M (4.2) 78

98 and P t,1,q1 = Pr = Pr { log 2 (1+SNR { t i=1 } { } t t 1 z i ) R Pr log 2 (1+SNR z i ) R i=1 z i F 1 M (ε) } Pr { t 1 i=1 i=1 z i F 1 M (ε) } (4.3) (4.4) =F t 1 ( F 1 M (ε)) F t ( F 1 M (ε)) (4.5) where F t is the CDF of t i=1 z i. In (4.3), we use the fact that the probability that the receiver decodes the packet successfully in the t th time block is equal to the probability that the receiver decodes the packet within t time blocks minus the probability that the receiver decodes the packet within t 1 time blocks. In queue model II, v can only be 1, because a packet definitely leaves the queue at the end of each HARQ period due to either successful transmission or packet drop. Similar to the discussion in queue model I, t < M only corresponds to successful transmission, and t = M corresponds to two cases, in which the receiver gets the packet in the M th time block, or an outage happens and the packet is dropped. Therefore, we have ( F t 1 F 1 M P t,1,q2 = (ε)) ( F t F 1 M (ε)) t < M ( F M 1 F 1 M (ε)) t = M. (4.6) For the case of t = M, we use the fact the P t,1,q2 = F M 1 ( F 1 M (ε)) F M ( F 1 M (ε)) +ε, and F M ( F 1 M (ε)) = ε. Proposition 2 The throughput of HARQ-CC scheme with fixed outage probability ε, 79

99 deadline constraint M, and constant-rate arrivals is given by C E,Q1 for queue model I r th = (1 ε)c E,Q2 for queue model II, (4.7) where the effective capacity for queue model j is given by C E,Qj = 1 θ log ( max{ λ 1,Qj,, λ M,Qj } ). (4.8) {λ 1,Qj,, λ M,Qj } are the eigenvalues of the matrix A Qj expressed as a 1,Qj a 2,Qj a M 1,Qj a M,Qj A Qj = (4.9) where [ ( Fk 1 F 1 M (ε)) ( F k F 1 M (ε)) ] e θr 1 k M 1 [ ( a k,qj = FM 1 F 1 M (ε)) ε ] e θr + ε k = M and j = 1 ( F M 1 F 1 M (ε)) e θr k = M and j = 2. (4.10) Proposition 2 can be directly obtained by inserting our characterization of P t,v,qj into Theorem 1 in [15]. Since the authors of [15] did not consider packet drop, we need to redefine the variable ν in [15] as the number of packets leaving the queue in a transmission period, in order to apply their theorem to our queue models. In the Section 4.1.2, simulation results are provided to verify our effective capacity characterization. 80

100 Throughput of HARQ-CC with ON-OFF Discrete-Time Markov Source Since the departure and arrival processes at the transmitter are independent, for both queue models I and II, the effective capacity characterizations in Proposition 2 are still valid. Theorem 7 For the ON-OFF discrete-time Markov source with fixed outage probability ε and deadline constraint M, the throughput in queue model I is given by r th = P ON θ ( e 2θC E,Q1 ) p 11 e θc E,Q 1 log, (4.11) 1 p 11 p 22 + p 22 e θc E,Q 1 and the throughput in queue model II is given by r th = (1 ε) P ON θ ( e 2θC E,Q2 ) p 11 e θc E,Q 2 log, (4.12) 1 p 11 p 22 + p 22 e θc E,Q 2 where the effective capacities C E,Q1 and C E,Q2 are given in (4.8). According to our discussion in Section , the throughput of queue model I is given by r avg, and the throughput of queue model II is given by (1 ε)r avg. With the result in (2.14), we readily obtain the throughput expressions given in Theorem Throughput of HARQ-CC with ON-OFF Fluid Markov Source Theorem 8 For ON-OFF fluid Markov source with fixed outage probability ε and deadline constraint M, the throughput in queue model I is given by r th = P ON C E,Q1 θc E,Q1 + α + β θc E,Q1 + α, (4.13) 81

101 and the throughput in queue model II is given by r th = (1 ε)p ON C E,Q2 θc E,Q2 + α + β θc E,Q2 + α, (4.14) where the effective capacities C E,Q1 and C E,Q2 are given in (4.8). The throughput expressions in Theorem 8 can be determined directly using the maximum average arrival rate given in (2.20) Throughput of HARQ-CC with ON-OFF MMPS Theorem 9 For ON-OFF MMPS with fixed outage probability ε and deadline constraint M, the throughput in queue model I is given by θ r th = P ON C E,Q1 e θ 1 θc E,Q1 + α + β θc E,Q1 + α, (4.15) and the throughput in queue model II is given by θ r th = (1 ε)p ON C E,Q2 e θ 1 θc E,Q2 + α + β θc E,Q2 + α, (4.16) where the effective capacities C E,Q1 and C E,Q2 are given in (4.8). The throughput expressions in Theorem 9 can be obtained directly using the maximum average arrival rate given in (2.24) Numerical Results In this subsection, we further investigate the throughput of HARQ-CC with ON-OFF Markov arrivals in the presence of QoS constraints. Throughout this subsection, we assume Rayleigh fading channel with exponentially distributed fading power having a mean value of E{z} = 1. 82

102 log Pr{Q>q} Constant arrival, queue model I Constant arrival, queue model II -1.4 Markov source, queue model I Markov source, queue model II q Figure 4.1: Logarithmic buffer overflow probability vs. buffer overflow threshold. First, we present Monte Carlo simulation results to verify the characterizations in Proposition 2 and Theorem 7. In Fig. 4.1, we set the queuing constraint as θ = 0.2, choose the outage probability as ε = 0.1, and we plot the logarithmic buffer overflow probabilities log Pr{Q q} as functions of the buffer overflow threshold q for both queue models with constant-rate arrivals and ON-OFF discrete-time Markov arrivals. For each curve, we repeat the simulations for 100 times, and in each time the simulation is conducted over time blocks. For the constant-rate arrival model, the arrival rates are given by C E,Q1 and C E,Q2 described in (4.8), for queue models I and II, respectively. For the ON-OFF discrete-time Markov arrivals, we set p 11 = 0.4, p 22 = 0.7, and the arrival rates in the ON state are given by (2.13) for both queue models. From Fig. 4.1, we observe that the logarithmic buffer overflow probabilities decrease linearly even starting from relatively small q values, which agrees with the characterizations in (2.1) and (2.2) 1. We estimate the slopes via linear regression. The estimated slopes of these four curves are (constant arrival for queue model I), (constant arrival for queue model II), (Markov source for queue model I), and (Markov source for queue model II), and the maximum slope 1 Note that if (2.1) holds, we have log Pr{Q q} θq + log ς 83

103 Throughput (bits/symbol) θ=0.1, queue model I θ=0.1, queue model II θ=1, queue model I θ=1, queue model II Deadline constraint M Figure 4.2: Throughput vs. deadline constraint M. error of these four curves is 0.3%, which is very small, demonstrating the accurateness of the throughput characterizations. In Fig. 4.2, we set the outage probability as ε = 0.05, and plot the throughput under constant-rate arrivals as a function of the deadline constraint M. We can observe that there exists an optimal M that maximizes the throughput. When M is small, the deadline constraint is strict and the transmitter has to reduce the fixed rate R to satisfy the target outage probability. As M increases, the transmission rate R increases, and the throughput is improved. After the throughput reaches its maximum value, further increase in M results in reduced throughput. For large M and fixed outage probability, the system wastes more time when outage happens, which is not favorable in the presence of the queuing constraint. Also, we find that as θ increases, the throughput becomes smaller in order to satisfy a stricter queuing constraint. In Fig. 4.3, we set the deadline constraint as M = 3, and plot the throughput under constant-rate arrivals as a function of the outage probability ε. Similarly, we can observe that there exists an optimal ε that maximizes the throughput, and the explanation is similar to the influence of M in Fig When we compare the 84

104 Throughput (bits/symbol) θ=0.1, queue model I θ=0.1, queue model II θ=1, queue model I θ=1, queue model II Outage probability ǫ Figure 4.3: Throughput vs. outage probability ε. throughputs of queue models I and II under constant-rate arrivals in Figs. 4.2 and 4.3, we observe that queue model II always has higher throughput than queue model I. When there is no queuing constraint, the throughput of queue model I is r th = r avg = (1 ε)r, and the throughput of queue model II is r E{T } th = (1 ε)r avg = (1 ε) R [7] [15], E{T } where T is the duration of a transmission period. Therefore, these two queue models have the same throughput in the absence of queuing constraints. When the queuing constraint is imposed, queue model II has an advantage. Moreover, the throughput gap between these two queue models increases when we increase θ from 0.1 to 1. In Figs. 4.4 and 4.5, we fix M = 10, ε = 0.01 and θ = 0.1 to investigate the impact of source randomness on the throughput. Since source randomness has similar impact on queue models I and II, we only consider queue model I in the following discussion about the influence of source randomness. In Fig. 4.4, we plot the throughput as a function of P ON for the ON-OFF discrete-time Markov source. We can observe that the throughput is an increasing function of P ON. As P ON decreases, the arrival rate in ON state r needs to increase with a certain rate in order to keep r avg non-decreasing. However, with the same departure process, it is difficult to satisfy the queuing constraints and keep the throughput non-decreasing simultaneously. 85

105 0.45 Throughput (bits/symbol) p 21 =0.1 p 21 =0.2 p 21 =0.3 p 21 = P ON Figure 4.4: Throughput vs. P ON for the ON-OFF discrete-time Markov source Throughput (bits/symbol) β=0.1, ON-OFF fluid Markov source β=1, ON-OFF fluid Markov source β=0.1, MMPS β=1, MMPS P ON Figure 4.5: Throughput vs. P ON for the ON-OFF Markov fluid source and MMPS. 86

106 Therefore, the throughput decreases as P ON decreases. Similar explanation can be applied to the ON-OFF Markov fluid source and MMPS in Fig As P ON 1, the ON-OFF discrete-time and fluid Markov sources become constant-rate arrival sources, which implies that constant-rate arrivals have higher throughput. We also observe in Fig. 4.4 that higher p 21 values improve the throughput. For the same P ON, higher p 21 values make the source transitions from the ON state to the OFF state to occur more frequently, giving a better chance for the transmitter to shorten its queue length. Similarly, in Fig. 4.5, higher β values improve the throughput for both ON-OFF Markov fluid source and MMPS. Finally, in Fig. 4.5, we find that the ON-OFF Markov fluid source has higher throughput than MMPS. Comparing the results in Theorems 8 and 9, we note that there is an additional factor θ e θ 1 traightforward to show that lim θ 0 in the throughput expression of MMPS. It is s- θ e θ 1 = 1, and θ e θ 1 is an decreasing function of θ. These properties indicate that the throughput of MMPS improves as θ decreases, and it has the same throughput as the ON-OFF Markov fluid source when there is no queuing constraint. 4.2 Throughput of HARQ-IR with Finite Blocklength Codes and QoS Constraints System Model and Preliminaries As depicted in Fig. 3.1, the same point-to-point wireless communication system is considered in this section. It is assumed that arriving data packets are initially stored in a buffer at the transmitter, which operates under queuing constraints, before being sent to the receiver. Also, we assume a block flat-fading channel in which the fading coefficients stay constant within one block, but change independently across blocks. 87

107 Each fading block is assumed to have a duration of l symbols. We use subscript i as the index of the fading block. Under these assumptions, the received signal in the i th block can be written as y i = h i x i + n i i = 1, 2,... (4.17) Above, x i and y i are the transmitted and received signal vectors, respectively, and h i denotes the channel fading coefficient. The average transmission energy per symbol of the transmitted signal x i is given by E = E{ x i 2 }/l. Also, n i represents the noise vector with i.i.d. circularly-symmetric, zero-mean Gaussian components, each with variance N 0. Therefore, we can denote the signal-to-noise ratio at the transmitter as SNR = E N HARQ-IR and Deadline Constraints It is assumed that the system employs HARQ-IR scheme to guarantee the reliability of transmissions. The transmission rate is fixed at lr (bits/block) at the transmitter, where l is the number of symbols in each fading block, or equivalently the blocklength of each codeword, and R is the fixed rate in bits/symbol. Each packet is encoded into M codeword blocks (where M also represents the deadline constraint introduced below), and each block has a length of l symbols. In each time block, transmitter sends one codeword block to the receiver. If the receiver decodes the received packet correctly, it sends an ACK feedback to the transmitter through an error-free feedback link, and a new packet is sent in the next time block. If the receiver cannot decode the packet, a retransmission request is sent through the feedback link, and another codeword block of the same packet is sent in the next time block [5]. For simplicity, we assume an ideal ARQ protocol in our analysis, in which the transmitter gets the feedback immediately at the end of each time block without any delay. 88

108 Deadline constraint limits the maximum duration of a transmission period as M time blocks. We assume that an HARQ period lasts until either the receiver gets the packet without error or if the limit on the duration of the transmission period is reached, and then the transmitter starts with another packet in the next transmission period. An outage happens when the receiver does not receive the packet within one transmission period, or equivalently M decoding errors occur successively for a packet. The outdated packet is discarded in such a case. The duration of an HARQ period is denoted by the random variable T, where 1 T M, and the outage probability (or equivalently deadline violation probability) is represented by ε. In the HARQ-IR scheme, additional information is sent in each retransmission and the receiver combines all received code blocks in the same transmission period to decode the transmitted packets. Detailed introduction about HARQ-IR is provided in Section 3.1. Under the constant-rate arrival assumption, the throughput (in bits/symbol) is given by (1 ε) times the effective capacity (normalized by the blocklength l): r th = (1 ε)c E (θ, SNR)/l = 1 ε lθ Λ c( θ). (4.18) When the arrival rate is not constant, we need to formulate the LMGF of the arrival process as a function of the average arrival rate, and obtain the throughput by solving (2.4) Throughput of HARQ-IR with Queuing Constraints and Finite Blocklength Codes In this subsection, we study the throughput of HARQ-IR scheme with statistical queuing constraints, finite blocklength codes, and deadline limits. Initially, we consider constant-rate arrivals, and characterize the throughput by employing the effective 89

109 capacity formulation. Subsequently, we incorporate random arrival models by considering ON-OFF discrete-time Markov sources and determine the system throughput using the characterizations obtained in the constant-rate arrival model Outage Probability for HARQ-IR at Finite Blocklengths As noted before, with HARQ-IR, the received information is accumulated at the receiver. At the end of the m th trial in a transmission period, the receiver combines the m received codeword blocks to decode the packet, which is equivalent to decoding a codeword with m subblocks and each subblock has a length of l symbols from the perspective of achievable rate. According to the results in [85], the relationship between the fixed transmission rate and error probability is given by R = m log 2 (1 + SNRz i ) m i=1 i=1 (SNRz i + 2)SNRz i Q 1 (ν) log l(snrz i + 1) 2 2 e + log(ml) + o(1) l l (4.19) for the m th trial, where l is the blocklength, Q 1 ( ) represents the inverse Q-function, ν is the decoding error probability, and z i = h i 2 is the magnitude-square of the fading coefficient. From (4.19), we can express the decoding error probability for the m th trial or attempt of packet transmission as m ν m = Q i=1 log 2 (1 + SNRz i ) + log(ml) R l m (4.20) (2+SNRz log 2 e i )SNRz i i=1 l(snrz i +1) 2 for given channel fading z = (z 1,, z m ). Therefore, we can obtain the probability mass function (pmf) of T, which represents the duration of a transmission period, as 90

110 expressed in (4.21) 1 E z {ν 1 } for t = 1 Pr{T = t} = Pr{T t} Pr{T t 1} = E z {ν t 1 } E z {ν t } for 2 t M 1 E z {ν M 1 } for t = M (4.21) One assumption we have is that if the receiver cannot decode the packet correctly using all m received codeword blocks in the m th trial, then it cannot decode the packet in the previous m 1 trials. This is due to the fact that it is more difficult to have correct decoding with less information, and the receiver uses less codeword blocks in previous trials. Therefore, the probability that the receiver decodes a packet within t trials is given by Pr{T t} = 1 E z {ν t }. (4.22) In (4.21), for 2 t M 1, T = t indicates that only the t th trial has been successful, and the first t 1 trials of packet transmission has ended up with error. Having T = M indicates that the first M 1 trials have failed, and the result of the last attempt does not have any influence on the pmf, because the deadline constraint forces the transmission to cease after M trials. Recall that an outage (or deadline violation) event happens if the receiver experiences M decoding errors successively in a transmission period. The outage probability can be expressed as ε = E z {ν M }. (4.23) In Fig. 4.6, we set the blocklength as l = 100 and SNR as 0 db, and plot the outage probabilities as functions of the fixed transmission rate R for different deadline 91

111 M=3 M=5 M=8 Outage probability ǫ R (bits/symbol) Figure 4.6: Outage probability vs. R constraints M. As R increases, the decoding error probability ν increases, which leads to an increased outage probability. We also note that as M increases, the deadline constraints become looser, and we can expect lower outage probabilities. Finally, the fixed transmission rate R can be numerically determined using (4.23) and (4.20) once a target outage probability ε and deadline limit M are specified Throughput of HARQ-IR with Constant-rate Arrivals Proposition 3 The throughput of HARQ-IR scheme (in bits/symbol) with fixed transmission rate R (bits/symbol), deadline constraint M, QoS exponent θ, and constantrate arrivals is given by r th = (1 ε)c E /l (4.24) where the effective capacity C E (in bits/block) is given by C E = 1 θ log (max{ λ 1,, λ M }). (4.25) 92

112 Above, {λ 1,, λ M } are the eigenvalues of the matrix A given in (4.26) Pr{T = 1}e θlr Pr{T = 2}e θlr Pr{T = M 1}e θlr Pr{T = M}e θlr A = (4.26) Proposition 3 can be shown by applying Theorem 1 in [15] to our model. Since the authors in [15] did not consider packet drop, we need to redefine the variable ν in [15] as the number of packets leaving the queue in a transmission period, which is always equal to 1 in our model due to the packet drop mechanism. In the Section 4.2.3, simulation results are provided to verify our effective capacity characterization Throughput of HARQ-IR with ON-OFF Discrete-Time Markov Source Since the departure and arrival processes at the transmitter are independent, the effective capacity characterization in Proposition 3 is still valid. Theorem 10 For the ON-OFF discrete-time Markov source with fixed transmission rate R (bits/symbol), deadline constraint M, and QoS exponent θ, the throughput (in bits/symbol) is given by r th = 1 ε P ON l θ ( log e 2θC E p 11 e θc E 1 p 11 p 22 + p 22 e θc E ), (4.27) where the effective capacity C E is given in (4.25). Similar to the discussion in Section , the throughput is given by (1 ε)r avg. Using the maximum average arrival rate given in (2.14), we readily obtain the through- 93

113 put expressions given in Theorem 10. From (4.27), it is very easy to show that the throughput of HARQ-IR with ON- OFF discrete time Markov sources is an increasing function of the effective capacity C E by checking the first order derivative of r th with respect to C E. Therefore, the transmission parameters, such as R and ε, that maximize the throughput for the constant-rate arrival models also maximize the throughput for the ON-OFF discrete time Markov arrival model. The influence of the source burstiness was discussed in [86], in which it was shown that source burstiness degrades the energy efficiency under queuing constraints. Similar analysis can be applied to our scenario. The source is less bursty if it stays in the ON state for a longer period, resulting in smaller instantaneous arrival rates r for fixed average arrival rate r avg. In other words, for different sources with the same r avg, the one with less burstiness or equivalently smaller instantaneous arrival rate r is more favorable in terms of satisfying the queuing constraints Numerical Results In this subsection, we further investigate the throughput of HARQ-IR with finite blocklength codes in the presence of deadline and QoS constraints. Throughout this subsection, we assume Rayleigh fading channel with exponentially distributed fading power having a mean value of E{z} = 1, and SNR is set as 0 db. We first verify our characterizations in Proposition 3 and Theorem 10 via Monte Carlo simulations. The logarithmic buffer overflow probabilities log Pr{Q q} are plotted as functions of the buffer overflow threshold q for both constant-rate arrivals and ON-OFF discrete-time Markov arrivals in Fig For both arrival models, we set the QoS exponent as θ = 0.01, deadline constraint M = 5, fixed transmission rate R = 3 (bits/symbol), and blocklength l = 100. For each curve, we repeat the simulations 2000 times, and in each time the simulation is conducted over time blocks. For the constant-rate 94

114 Logarithmic buffer overflow probability log(pr{q q}) Constant-rate arrival ON-OFF Markov source Buffer overflow threshold q Figure 4.7: Logarithmic overflow probability vs. buffer overflow threshold. arrival model, the arrival rate is given by the effective capacity in (4.25). For the ON-OFF discrete-time Markov arrivals, we set p 11 = 0.3, p 22 = 0.7, and the arrival rates in the ON state are given by lr (bits/block) in (2.13). It is observed in Fig. 4.7 that the logarithmic buffer overflow probabilities decrease almost linearly when q is sufficiently large, which agrees with the characterizations in (2.1) and (2.2) 2. When q > 1100, we estimate the slopes of these two curves via linear regression, and the estimated slopes are and for the constant-rate and ON-OFF Markov arrival models, respectively. The slope errors are smaller than 1%, which is very small, demonstrating the accurateness of the throughput characterizations. In Fig. 4.8, we plot the throughput as a function of the fixed transmission rate R for different deadline constraints with constant arrival sources, and QoS exponent θ = 0.1. As shown in Fig. 4.8, there exists a unique optimal R value that maximizes the throughput. When R is small, the departure rate is also small, which limits the throughput. When R is too large, the outage probability ε is large, and most of the packets violate the deadline constraint and are discarded. Also, as M increases, the deadline constraints become looser, and we can achieve a higher maximum throughput 2 Note that if (2.1) holds, we have log Pr{Q q} θq + log ς 95

115 M=3 M=5 M=8 Throughput (bits/symbol) R (bits/symbol) Figure 4.8: Throughput vs. fixed transmission rate R. as seen in Fig Similar results were observed in [87] for small θ values without considering finite blocklength effects. In [87], it has been shown that the throughput of HARQ-IR is an increasing function of the fixed transmission rate R without deadline constraints. Therefore, we can have higher R values and low outage probabilities when M is large, which improves the maximum throughput. Fig. 4.9 shows the influence of the finite blocklength l for constant arrival models with QoS exponent θ = 0.1. In order to apply the approximation in (4.19), the blocklength l needs to be sufficiently large. In such a case, the throughput decreases as the block length l increases, because large l corresponds to slow fading 3, which is not favorable for delay sensitive systems with queueing constraints. In slow fading cases, strong attenuation would last for a longer time, leading to buffer overflows. Therefore, larger l value is expected to have a stronger impact on the throughput when the system has stricter queueing constraints. Indeed, this is observed in Fig. 4.9 where we see that the throughput curves associated with larger values of θ (indicating stricter queuing constraints) decrease faster with increasing l. 3 This is by our block-fading assumption in which each fading block consists of l symbols. 96

116 0.65 Throughput (bits/symbol) θ=0.001 θ=0.005 θ= Blocklength l Figure 4.9: Throughput vs. blocklength l. 97

117 Chapter 5 Throughput of Cooperative Relay Networks under Statistical Queuing Constraints In this chapter, we investigate the throughput of cooperative relay networks under statistical queuing constraints. Three types of cooperative relay networks are considered, namely two-hop relay channel, two-way relay channel and multi-source multi-destination relay network. In Section 5.1, throughput of two-hop wireless relay channels is studied in the finite blocklength regime. Half-duplex relay operation, in which the source node initially sends information to the intermediate relay node and the relay node subsequently forwards the messages to the destination, is considered. It is assumed that all messages are stored in buffers before being sent through the channel, and both the source node and the relay operate under statistical queueing constraints. After characterizing the transmission rates in the finite blocklength regime, the system throughput is formulated via queueing analysis. Subsequently, several properties of the throughput function in terms of system parameters are identified, and an efficient algorithm is 98

118 proposed to maximize the throughput. Interplay between throughput, queueing constraints, relay location, time allocation, and code blocklength is investigated through numerical results. In Section 5.2, throughput of two-way relaying under buffer constraints is studied. In the two-way relay system, source nodes initially send their messages to the relay in the multiple-access phase. Relay decodes and stores the messages from different sources in different buffers and subsequently broadcasts a superimposed signal. It is assumed that both source nodes and the relay operate in the presence of statistical queueing constraints. Under these assumptions, arrival rates that can be supported in this system are investigated through the LMGFs of the arrival and service processes. In particular, after identifying the service rates in the multiple-access and broadcast phases and addressing the stability conditions, characterizations of the maximum arrival rates are provided in terms of system resource allocation parameters, signalto-noise ratios, and quality-of-service exponents. Impact of different parameters on the performance is investigated through numerical results. In Section 5.3, the throughput of relay networks with multiple source-destination pairs under queueing constraints has been investigated for both variable-rate and fixed-rate schemes. When CSI is available at the transmitter side, transmitters can adapt their transmission rates according to the channel conditions, and achieve the instantaneous channel capacities. In this case, the departure rates at each node have been characterized for different system parameters, which control the power allocation, time allocation and decoding order. In the other case of no CSI at the transmitters, a simple ARQ protocol with fixed rate transmission is used to provide reliable communication. Under this ARQ assumption, the instantaneous departure rates at each node can be modeled as an ON-OFF process, and the probabilities of ON and OFF states are identified. With the characterization of the arrival and departure rates at each buffer, stability conditions are identified and effective capacity analysis 99

119 Figure 5.1: The two-hop relay system with buffer constraints. is conducted for both cases to determine the system throughput under statistical queueing constraints. In addition, for the variable-rate scheme, the concavity of the sum rate is shown for certain parameters, helping to improve the efficiency of parameter optimization. 5.1 Throughput of Two-Hop Wireless Channels with Queuing Constraints and Finite Blocklength Codes System Model and Preliminaries System Model The two-hop relay channel is shown in Fig In this model, source node S sends information to the receiver D with the help of the intermediate relay node R. We assume that there is no direct link between S and D (which, for instance, holds, if these nodes are sufficiently far apart in distance). All data packets are stored in buffers before being transmitted through wireless channels. Data arriving to the source node S is initially buffered before transmission to the relay. Similarly, the relay, upon receiving the signal from the source node and decoding the message, places the decoded data from the source in its own buffer before forwarding it to 100

120 the destination D. Both the source and the intermediate relay nodes operate under statistical queuing constraints. More specifically, buffer violation probabilities are constrained to decay exponentially for large buffer thresholds. Detailed discussion on the queuing constraints is provided in Chapter 2. Since we consider half-duplex relay operation, reception and transmission at the relay occur in non-overlapping intervals. We introduce the parameter τ (0, 1) as the fraction of time allocated to the initial phase, in which only the S R link is active. Then, the fraction of time allocated to the second phase is 1 τ, in which only the R D link is active. Next, we express the discrete-time input and output relationships in both phases. In the initial phase, the signal Y r received at the relay can be expressed as Y r [i] = h 1 [i]x[i] + n r [i] (5.1) where X denotes the signal transmitted from the source node, h 1 represents the fading coefficient of the S R link, and n r is the Gaussian noise at the relay. In the second phase, the received signal Y at receiver D is given by Y [i] = h 2 [i]x r [i] + n D [i] (5.2) where X r denotes the signal sent from the relay node, h 2 represents the fading coefficient of the R D link, and n D is the Gaussian noise at the receiver D. The inputs are subject to individual average energy constraints E{ X 2 } P s /B and E{ X r 2 } P r /B, where B is the bandwidth in the system. We assume that the fading coefficients h j, j = {1, 2} are jointly stationary and ergodic discretetime processes, and we denote the magnitude-square of the fading coefficients by z j [i] = h j [i] 2. Above, in the channel input-output relationships, the noise components are all zero-mean, circularly symmetric, complex Gaussian random variables 101

121 with variance E{ n k [i] 2 } = N k for k = {r, D}, and all noise samples are assumed to form an i.i.d. sequence. We denote the signal-to-noise ratios at the source and relay by SNR s = P s N rb and SNR r = P r N D B, respectively Coding Rate in Finite Blocklength Regimes In this section, we investigate the throughput achieved with finite blocklength coding. In the complex Gaussian noise channel with channel gain z, the coding rate (in bits per channel use) is approximated by [18] [19] r = log 2 (1 + SNRz) 1 m ( 1 ) 1 Q 1 (ϵ) log (SNRz + 1) 2 2 e + log m m + O(1) m (5.3) where m represents the coding blocklength, ϵ (0, 1) denotes the error probability, Q 1 ( ) is the inverse Gaussian Q-function, and O(1) denotes a constant term 1. Hence, the approximation becomes more accurate as m increases. Also, the coding rate is an monotonic increasing function of the target error probability ϵ. Moreover, as the blocklength m grows without bound, the coding rate becomes r = log 2 (1 + SNRz), which is the Shannon capacity. For our half-duplex two-hop relay system, every time block is divided into two for the two transmission phases. Therefore, the coding blocklength of the source and relay are given by τm and (1 τ)m, respectively. From (5.3), we can characterize the coding rates of the source and relay nodes as r 1 = log 2 (1 + SNR s z 1 ) + log(τm) τm 1 τm ( 1 1 (SNR s z 1 + 1) 2 ) Q 1 (ϵ 1 ) log 2 e (5.4) 1 Therefore, the term O(1) m vanishes fast with increasing blocklength m and is neglected in the remainder of the formulations and analysis 102

122 and r 2 = log 2 (1 + SNR r z 2 ) + ( ) log (1 τ)m (1 τ)m 1 (1 τ)m ( 1 1 (SNR r z 2 + 1) 2 ) Q 1 (ϵ 2 ) log 2 e (5.5) respectively, where ϵ 1 and ϵ 2 are the target error probabilities of the S R and R D transmissions, respectively Throughput of Two-hop Relay Channels With Finite Blocklength Codes In this subsection, we characterize the system throughput for the two-hop relay system operating under queuing constraints in the finite blocklength regime. In order to apply the effective capacity analysis, we have to guarantee that the stability conditions are satisfied, which require that the average arrival rate is smaller than the average departure rate at both the source and relay nodes. At the source node, the stability condition is guaranteed by (2.25) due to the fact that the constant arrival rate given by (2.25) is always smaller than the average transmission rate between the source and relay 2. At the relay node, the stability condition is expressed as (1 ϵ 1 )τe{r 1 } (1 ϵ 2 )(1 τ)e{r 2 } (5.6) where r 1 and r 2 are given by (5.4) and (5.5), respectively. In [17], the throughput of the two-hop relay channel is studied for both half- and full-duplex scenarios without considering finite blocklength restrictions. Hence, instantaneous transmission rates were given by the Shannon capacities in [17]. Through 2 It can be shown that the right-hand-side of (2.25) increases with decreasing θ and converges to the average departure rate as θ approaches zero [17]. 103

123 a similar analysis, we can extend this result to the finite blocklength regime. Theorem 11 For the half-duplex two-hop relay system with finite code blocklength m, the maximum arrival rate (in bits per channel use) at the source node is given by { min { R = min } 1 mθ 1 log E z1 {ϵ 1 + (1 ϵ 1 )e τθ 1mr 1 }, 1 mθ 2 log E{ϵ 2 + (1 ϵ 2 )e (1 τ)θ 2mr 2 } 1 ( mθ 1 log E{ϵ2 + (1 ϵ 2 )e (1 τ)θ 2mr 2 } + log E{ϵ 1 + (1 ϵ 1 )e τ(θ 2 θ 1 )mr 1 } ), } 1 mθ 1 log E z1 {ϵ 1 + (1 ϵ 1 )e τθ 1mr 1 } θ 2 θ 1 θ 2 > θ 1 (5.7) when the stability condition (5.6) is satisfied. Proof : See Appendix A Throughput Optimization for Two-hop Relay Systems In this subsection, we investigate the throughput maximization problem for our twohop relay system. We assume that both the source and relay nodes transmit at their maximum power level, the blocklength m is given, and the system maximizes its throughput by choosing the optimal τ, ϵ 1 and ϵ 2. This optimization problem can be formulated as Maximize τ,ϵ1,ϵ 2 R Subject to (1 ϵ 1 )τe{r 1 } (1 ϵ 2 )(1 τ)e{r 2 } (5.8) 0 < τ < 1 (5.9) 0 < ϵ 1 < 1 (5.10) 0 < ϵ 2 < 1 (5.11) 104

124 Compared to the rate optimization in the absence of finite blocklength code restrictions, our optimization problem has more complicated rate expressions and higher dimensionality. Moreover, since the stability region described by (5.8) is not convex, and the objective function R given in (5.7) is not a convex function of the target error probabilities, this optimization problem is in general non-convex. In this setting, we initially establish several key properties of the throughput R and determine the optimal error probabilities ϵ 1 and ϵ 2 under certain conditions. Subsequently, we propose an algorithm that can efficiently solve the optimization problem. Theorem 12 For a given τ value, the error probability that maximizes 1 θ 1 Λ S,R ( θ 1 ) is given by the solution of { 1 = E (1 ϵ } 1)θ 1 1 τm(1 log 2 (1 + SNR s z 1 ) ) Q 1 (ϵ 2 1 )e τθ 1mr 1 + e τθ 1mr 1, (5.12) and the error probability that maximizes 1 θ 2 Λ R,D ( θ 2 ) is given by the solution of { 1 =E (1 ϵ } 2)θ 2 1 (1 τ)m(1 log 2 (1 + SNR r z 2 ) ) Q 1 (ϵ 2 2 )e (1 τ)θ 2mr 2 + e (1 τ)θ 2mr 2. (5.13) Proof 1 It was shown in [20] that the unique optimal error probability that maximizes 1 Λ( θ) in a single-hop model is obtained by solving θ Λ( θ) = 0. (5.14) ϵ This property can be directly applied to our half-duplex two-hop system. After taking the derivatives of Λ S,R ( θ 1 ) and Λ R,D ( θ 2 ) with respect to ϵ 1 and ϵ 2 respectively, and plugging them into (5.14), we obtain (5.12) and (5.13). 105

125 Theorem 13 For a given τ value, assume that ϵ 1 and ϵ 2 are the solutions of (5.12) and (5.13), respectively. Then, the throughput given in (5.7) is maximized at (ϵ 1, ϵ 2) when θ 1 θ 2. Proof 2 When θ 1 θ 2, we have { R = min 1 Λ S,R ( θ 1 ), 1 } Λ R,D ( θ 2 ). (5.15) θ 1 θ 2 Since ϵ 1 and ϵ 2 maximize 1 θ 1 Λ S,R ( θ 1 ) and 1 θ 2 Λ R,D ( θ 2 ), respectively, Theorem 13 follows immediately. Theorem 14 For a given τ value, assume that ϵ 2 is the solution of (5.13), and ϵ 1 is the optimal probability that maximizes { ˆR = min 1 Λ S,R ( θ 1 ), 1 ( } Λ R,D ( θ 2 ) θ 1 θ 1 + Λ S,R (θ 2 θ 1 )). (5.16) ϵ2 =ϵ 2 Then, the throughput given in (5.7) is maximized at ( ϵ 1, ϵ 2) when θ 1 < θ 2. Proof 3 When θ 1 < θ 2, we have { R = min 1 Λ S,R ( θ 1 ), 1 ( } Λ R,D ( θ 2 ) + Λ S,R (θ 2 θ 1 )). (5.17) θ 1 θ 1 It can be readily shown that the throughput is a non-decreasing function of Λ R,D ( θ 2 ), and Λ R,D ( θ 2 ) achieves its maximum value at ϵ 2 = ϵ 2. Therefore, the throughput achieves its maximum value when we choose ϵ 2 = ϵ 2 and the characterization in the theorem follows. ) If 1 θ 1 Λ S,R ( θ 1 ) is always smaller than 1 θ 1 (Λ R,D ( θ 2 ) + Λ S,R (θ 2 θ 1 ), ϵ2 =ϵ 2 then apparently ϵ 1 = ϵ 1, where ϵ 1 is the solutions of (5.12). Otherwise, we need to search for the ϵ 1 that maximizes ˆR. 106

126 The results in Theorems 13 and 14 do not incorporate the stability condition. The following result gives a characterization when the optimal error probability pair determined by Theorem 13 does not satisfy the stability condition. Theorem 15 For a given τ value, if the error probability pair found using Theorem 13 does not satisfy the stability condition, in the case of θ 1 θ 2, then the optimal error probability pair, satisfying the stability condition, lies on the boundary of the stability region, described by (1 ϵ 1 )τe{r 1 } = (1 ϵ 2 )(1 τ)e{r 2 }. (5.18) Proof : See Appendix A.9. Using the above characterizations, we can optimize the throughput efficiently. We search for the optimal τ value in the region (0, 1) with the following steps. For a given τ value, we first check whether the error probabilities given by Theorems 13 and 14 satisfy the stability condition. If satisfied, then we only have to perform a one-dimensional optimization over τ (0, 1). If not, we determine the optimal error probability pair on the boundary of the stability region in the case of θ 1 θ 2, or search in the entire bounded stability region in the case of θ 1 < θ Numerical Results In this subsection, we provide our numerical results. We consider a simple scenario in which all three nodes are placed on a straight line. The distance between S and D has been normalized to 1, and d (0, 1) represents the distance between the source node and relay, which is shown in Fig We assume Rayleigh fading with path loss E{z 1 } = d 4 and E{z 2 } = (1 d) 4. Unless specified otherwise, we assume SNR s = SNR r = 6dB, and blocklength is m = 150. In Figs. 5.2 and 5.3, we plot the maximum throughput R and the optimal time 107

127 R (bits per channel use) θ 1 =θ 2 =0.1 θ 1 =0.1, θ 2 =0.07 θ 1 =0.05, θ 2 = d Figure 5.2: The maximum throughput vs. relay location parameter. allocation parameter τ as functions of the distance parameter d for three different queuing constraint settings, respectively. When the source and relay nodes have the same QoS exponent value, the optimal location of the relay node is the midpoint between S and D to balance the channel conditions in the S R and R D links. When the source node has a stricter queuing constraint, the relay moves closer to S to enhance the departure rate of the source node. Similarly, when the relay node has a larger QoS exponent value, the optimal location of the relay is closer to D to improve the channel conditions in the R D link. From (5.7) we know that the link with stricter queuing constraint and smaller departure rate has more influence on the throughput, so the optimal location of the relay is chosen with the goal of improving the link experiencing stricter queuing constraints. Similar mechanisms can be observed in Fig Comparing two dashed lines in Fig. 5.3, we find that the system allocates more time to the link with a more stringent queuing constraint. When θ 1 = θ 2, the system allocates more time to the link with the worse channel condition. Note that these results are obtained with the optimal values of ϵ 1 and ϵ 2. In Fig. 5.4, we place the relay at the midpoint and plot the maximum through- 108

128 θ 1 =θ 2 =0.1 θ 1 =0.1, θ 2 =0.07 θ 1 =0.05, θ 2 =0.1 Optimal τ d Figure 5.3: The optimal τ vs. relay location parameter. 2.4 R (bits per channel use) θ 1 =θ 2 =0.005 θ 1 =θ 2 = θ 1 =θ 2 = m Figure 5.4: The maximum throughput vs. blocklength m. put as a function of the blocklength m. When m is small, increasing m improves the performance because it increases the departure rates. When m grows beyond a threshold, the throughput starts decreasing, because large m corresponds to slow fading, which is not favorable for delay sensitive systems with queuing constraints. In slow fading cases, strong attenuation would last for a longer time, leading to buffer overflows. Therefore, large m value has a stronger influence on the throughput when 109

129 Figure 5.5: The two-way relay system with buffer constraints. the system has stricter queuing constraints. 5.2 Throughput of Two-Way Relay Systems under Queueing Constraints System Model The two-way relay communication link is depicted in Fig In this model, sources S 1 and S 2 wish to exchange information with each other with the help of the intermediate relay node R. We assume that there is no direct link between S 1 and S 2 (which, for instance, holds, if these nodes are sufficiently far apart in distance). Data arriving to sources S 1 and S 2 is initially buffered before transmission to the relay. Similarly, the relay, upon receiving the superimposed signals from the source nodes in the multiple-access phase and decoding the messages, places the decoded data from the sources in two different buffers before broadcasting the superimposed messages back to the source nodes. Both the source and the intermediate relay nodes operate under queuing limitations given in (2.1). The QoS exponents at the source nodes are denoted by θ sj for j = 1, 2. Similarly, the QoS exponents for the two buffers at the 110

130 relay are θ r,sj with j = 1, 2. In this section, we assume that the same asymptotic QoS constraints are imposed at the relay buffers, i.e., θ r,s1 = θ r,s2 = θ r. However, we also note that different QoS exponents at the relay buffers can easily be accommodated in the analysis as well. Since we consider half-duplex relay operation, reception and transmission at the relay occur in non-overlapping intervals. Next, we express the discrete-time input and output relationships in both multiple-access and broadcast phases. In the multipleaccess phase, the signal Y r received at the relay can be expressed as Y r [i] = g 1 [i]x 1 [i] + g 2 [i]x 2 [i] + n r [i] (5.19) where X j for j = 1, 2 denotes the signal transmitted from source node S j, and g j is the fading coefficient between the nodes S j and R. The decoded information from each source is stored in a separate buffer at the relay. In the broadcast phase, the signal Y j received at source node S j is given by Y j [i] = g j [i]x r [i] + n j [i], j = 1, 2 (5.20) where X r represents the signal sent from the relay node. The inputs are subject to individual average energy constraints E{ X j 2 } P j /B for j = 1, 2 and E{ X r 2 } P r /B, where B is the bandwidth in the system. Assuming that the symbol rate is B complex symbols per second, we can easily see that the symbol energy constraint of P k /B for k = 1, 2, r implies that the nodes have a power constraint of P k. We assume that the fading coefficients g j, j = {1, 2} are jointly stationary and ergodic discretetime processes, and we denote the magnitude-square of the fading coefficients by z j [i] = g j [i] 2. Above, in the channel input-output relationships, the noise component n k [i] are zero-mean, circularly symmetric, complex Gaussian random variables with variance E{ n k [i] 2 } = N k for k = {1, 2, r}. The additive Gaussian noise samples 111

131 {n j [i]} are assumed to form an i.i.d. sequence. We denote the signal-to-noise ratios as SNR k = P k N k B where k = 1, 2, r. Finally, we introduce two system parameters: We assume that the fraction of time allocated to the multiple-access phase, in which source nodes transmit to the relay, is τ (0, 1). Hence, broadcast phase occurs in the remaining fraction (1 τ) of the time. In the broadcast phase, fraction of power allocated to data transmission to node S 1 is denoted by ρ (0, 1). Therefore, data intended for S 2 is sent using (1 ρ) fraction of the relay power Throughput in Two-Way Relay Systems Instantaneous Transmission Rates We initially describe the instantaneous transmission or equivalently service rates at the source and relay nodes. Let us first consider the multiple-access phase in which S 1 and S 2 simultaneously transmit to the relay. Assume that the decoding order is fixed at the relay and is, for instance, given by {1, 2}, i.e., the information sent from user 1 is decoded first. Then, the maximum instantaneous achievable transmission rates at S 1 and S 2 are given, respectively, by ( ) R s1,r = B log SNR 1z 1, 1 + SNR 2 z 2 R s2,r = B log 2 (1 + SNR 2 z 2 ). (5.21) If, on the other hand, the decoding order is {2, 1}, we have R s1,r = B log 2 (1 + SNR 1 z 1 ), ( ) R s2,r = B log SNR 2z SNR 1 z 1 (5.22) In the broadcast phase, we assume that the relay transmits the superimposition 112

132 of the signals intended for the source nodes, i.e., we have X r = X r,s1 + X r,s2 (5.23) with energies E{ X r,s1 2 } ρ P r /B and E{ X r 2 } (1 ρ) P r /B. Above, X r,sj is the signal intended for S j. We assume that source nodes know the channel gains and their own signals forwarded from the relay, i.e., S 1 knows X r,s2 and S 2 knows X r,s1. Equipped with such knowledge, source nodes can eliminate the self-interference terms in the received signals in the broadcast phase and obtain Y j [i] = g j [i]x r,sj [i] + n j [i], j = 1, 2. (5.24) With these assumptions, the instantaneous service rates for the two buffers at the relay become R r,s1 = B log 2 (1 + ρsnr r z 1 ) R r,s2 = B log 2 (1 + (1 ρ)snr r z 2 ). (5.25) Stability Conditions In this subsection, we discuss the conditions required to ensure stability in the relay buffers, which experience random arrivals and random departures. In particular, we investigate the conditions imposed on the parameters τ and ρ. As mentioned in Section 5.1.2, the stability condition is guaranteed by (2.25) at the source node. Assume, without loss of generality, that the relay employs the decoding order {1, 2}. For the stability of the relay buffer storing data intended for S 2, average arrival rate should be smaller than the average departure rate, i.e., we need to satisfy τe {R s1,r} (1 τ)e {R r,s2 }, (5.26) 113

133 that is, ( )} τe {B log SNR 1z 1 (1 τ)e {B log 1 + SNR 2 z 2 (1 + (1 ρ)snr r z 2 )}. (5.27) 2 Similarly, for the stability of the relay buffer storing data intended for S 1, we should have τe {B log 2 (1 + SNR 2 z 2 )} (1 τ)e {B log 2 (1 + ρsnr r z 1 )}. (5.28) Hence, we need to identify (τ, ρ) pairs satisfying both (5.27) and (5.28). Note that as ρ increases, the right-hand-side (RHS) of (5.27) decreases, and hence τ must decrease to compensate the loss incurred by ρ. Indeed, when ρ = 1, we should have τ = 0. On the other hand, when ρ = 0, we have τ E{log 2 (1 + SNR r z 2 )} ( )} E {log SNR. (5.29) 1z 1 1+SNR 2 z 2 + E{log 2 (1 + SNR r z 2 )} At the same time, the RHS of (5.28) increases with increasing ρ, so τ must increase to satisfy (5.28). When ρ = 0, we need to set τ = 0. When ρ = 1, we have τ E{log 2 (1 + SNR r z 1 )} E{log 2 (1 + SNR 2 z 2 )} + E{log 2 (1 + SNR r z 1 )}. (5.30) From the above observations, we conclude that if we plot the maximum value of τ as a function of ρ, the τ(ρ) curve, which satisfies (5.27), is a decreasing curve with end point at τ(1) = 0. The τ(ρ) curve, which satisfies (5.28), is an increasing curve, starting at τ(0) = 0. Therefore, there exists a crossing point of the two curves, where both (5.27) and (5.28) are satisfied. Let (ρ 1, τ1 ) represent the intersection point. Now, the maximum value of τ, for which the queues are stable, is given by τ1. Finally, note that a similar discussion follows if the decoding order is {2, 1}. 114

134 Throughput Region under Statistical QoS Constraints Denote the region of (τ, ρ) pairs, for which the buffers are stable, by W 1. It is clear from the descriptions in the previous subsection that W 1 is well-defined and nonempty. The following definition characterizes the throughput region in two-way relay channels in the presence of statistical queuing constraints both at the source nodes and the relay. In order to simplify the analysis, we henceforth consider block-fading channels in which the fading stays constant over a certain duration and then changes independently. Under block-fading assumption, asymptotic LMGF expressions of service processes simplify to log E{e θ n i=1 c[i] } Λ C (θ) = lim = log E{e θc[i] }. (5.31) n n Following the analysis given in Section 2.3, the maximum arrival rates for given (τ, ρ) W 1 can be expressed as { min { R 1 = min 1 θ s1 log E z1 {e τθ s 1 R s1,r }, 1 θ r log E{e (1 τ)θ rr r,s2} } 1 θ s1 log E z1 {e τθ s 1 R s1,r }, 1 θ s1 ( log E{e (1 τ)θ r R r,s2 } + log E{e τ(θ r θ s1 )R s1,r } )} θ r θ s1 θ r > θ s1 (5.32) and { min { R 2 = min 1 θ s2 log E z2 {e τθs 2 Rs 2,r }, 1 θ r log E{e (1 τ)θrrr,s 1 } } 1 θ s2 log E z2 {e τθ s 2 R s2,r }, θ r θ s2 (5.33) 1 θ s2 ( log E{e (1 τ)θ r R r,s2 } + log E{e τ(θ r θ 2 )R S2,r } )} θ r > θ s2. The arrival rates in (5.32) and (5.33) can further be optimized over all (τ, ρ) W 1 as 115

135 2.5 2 Throughput R 1 with θ=0.005 R with θ= R 1 with θ=0.2 R 2 with θ= Figure 5.6: Maximum arrival rates R 1 and R 2 vs. d. d will be done in the numerical results below Numerical Results In this subsection, we provide numerical results. We consider a simple scenario in which the sources and relay are located on a straight line. The distance between the two sources S 1 and S 2 has been normalized to 1, and d (0, 1) is the distance between S 1 and the relay R. Therefore, distance between S 2 and R is (1 d). We assume that the fading magnitude-squares z 1 and z 2 are independent exponential random variables with means E{z 1 } = 1 d α and E{z 2 } = 1 (1 d) α. We set the path-loss exponent to α = 4. Unless specified otherwise, we assume θ s1 = θ s2 = θ r,s1 = θ r,s2 = θ and the decoding order at the relay is {1, 2}. In Fig. 5.6, we plot the maximum arrival rates R 1 and R 2 at S 1 and S 2, respectively, as a function of the distance d for two different QoS exponents when SNR 1 = SNR 2 = SNR r = 2. Fig. 5.7 provides the corresponding optimal (τ, ρ ) values as a function of d. We notice that if the relay is very close to one of the sources, the fraction of time allocated to the multiple-access phase, τ, diminishes due to the fact that downlink channel between the relay and the far-away source node becomes 116

136 1 0.9 Time/Power Fraction ρ with θ=0.005 τ with θ=0.005 ρ with θ=0.2 τ with θ= Figure 5.7: Optimal power fraction ρ and time fraction τ vs. d. d the bottleneck in the information exchange and more time needs to be allocated to relay broadcasting to avoid buffer overflows and/or instability. As a result, we see in Fig. 5.6 that both R 1 and R 2 start diminishing as d approaches 0 or 1. Another observation is that as d increases from 0.1 to 0.5, the relay approaches S 2 and hence the channel between S 2 and R improves, leading to larger values of R 2. Therefore, higher arrival rates can be supported at S 2. Interestingly, we notice in Fig. 5.7 that ρ approaches to 1 in this case, meaning that the relay allocates more energy to forwarding data to S 1, which is basically needed to support the higher arrival rates at S 2. Finally, we note in Fig. 5.6 that arrival rates are smaller under more stringent queueing constraints, i.e., when θ is increased from to 0.2. On the other hand, (τ, ρ ) remain rather robust as seen in Fig In the following numerical results, we set d = In Fig. 5.8, we plot maximum arrival rate R 1 as a function SNR parameters. Expectedly, as SNR 1 and/or SNR r increases, transmission/service rates from S 1 and R increase and higher arrival rates at S 1 can be supported. Fig. 5.9 plots R 1 now as a function of SNR r and SNR 2, the signal-to-noise ratio of source S 2 s transmissions. Note that since decoding order of {1, 2} is considered, transmissions from S 1 experience interference proportional to 117

137 R 1 with SNR 2 = R SNR SNR r Figure 5.8: Maximum arrival rate R 1 vs. (SNR 1, SNR r ) when SNR 2 = 8, θ = 0.005, and d = R 1 with SNR 1 =3 2 R SNR SNR r Figure 5.9: Maximum arrival rate R 1 vs. (SNR 2, SNR r ) when SNR 1 = 3 and θ =

138 R 2 with θ 1 = R θ θ r Figure 5.10: Maximum arrival rate R 2 vs. (θ 2, θ r ) when θ 1 = 0.05 and SNR = 2. SNR 2 as seen in the expression of R s1,r in (5.21). Therefore, due to this coupling in the multiple-access phase, we see in Fig. 5.9 that R 1 diminishes with increasing SNR 2. As before, increasing SNR r improves R 1. In Fig. 5.10, we plot R 2 as a function of the QoS exponents. Note that the higher the QoS exponents, the more stringent the buffer constraints are. Therefore, as demonstrated in the figure, increasing QoS exponents results in reduced arrival rates. Sources basically admit lower-rate arrivals to satisfy more strict buffer constraints. Until now, we have fixed the decoding order at {1, 2} at the relay in the multipleaccess phase. In general, varying the decoding order can enlarge the region of arrival rates. In Fig. 5.11, we plot the throughput region, i.e., region of arrival rates (R 1, R 2 ), achieved by time-sharing between decoding orders {1, 2} and {2, 1}. The boundary of the region is determined by optimizing the resource allocation parameters (τ, ρ). This figure is obtained when SNR r = 4, SNR 1 = SNR 2 = 2, θ = 0.005, and d = 0.5, hence the relay is midway between the sources. 119

139 R R 1 Figure 5.11: Throughput Region achieved with time-sharing between decoding orders {1, 2} and {2, 1}. SNR r = 4, SNR 1 = SNR 2 = 2, θ = 0.005, and d = 0.5. Figure 5.12: The relay network system with buffer constraints. 5.3 Throughput of Multi-Source Multi-Destination Relay Networks with Queuing Constraints System Model In this section, we consider a multi-source multi-destination relay network model with two pairs of sources and destinations, as depicted in Fig In this system, two sources S 1 and S 2 send information to their corresponding destinations D 1 and D 2 120

140 with the help of an intermediate relay node, and there is no direct link between the source nodes and their destinations. This assumption is accurate if the source and destination nodes are sufficiently far apart in distance. We assume that D j only needs the packets coming from source S j, where j = 1, 2. Each source node has a buffer, keeping the packets to be transmitted to the relay node. The arrival rates at source nodes S 1 and S 2 are assumed to be constant, and are denoted as R 1 and R 2 respectively. At the relay node, there are two buffers 3, one for keeping the decoded information coming from source S 1, and the other for the decoded data of S 2. In our setup, relay node performs decode-and-forward relaying and works in halfduplex mode, and hence it cannot transmit and receive at the same time. The entire transmission process can be divided into two phases, namely multiple-access phase and broadcast phase. In the multiple-access phase, both S 1 and S 2 transmit to the relay node simultaneously through a multiple-access channel. Relay node attempts to decode their messages by using certain decoding orders, and the decoded information bits are stored in their corresponding buffers at the relay. We assume that if fixedrate transmissions are employed, transmission fails if the rate is greater than the instantaneous capacity of the link for a given decoding strategy at the relay 4. The received discrete-time signal at the relay node can be expressed as Y r [i] = g 1 [i]x 1 [i] + g 2 [i]x 2 [i] + n r [i], (5.34) where X j for j = 1, 2 represents the transmitted signal from source node S j, g j is the fading coefficient of the S j R link, and n r is the additive Gaussian noise at the relay. 3 In practice, only one physical buffer is sufficient at the relay node to store the received packets from S 1 and S 2. In the analysis, we essentially decompose this physical buffer into two equivalent virtual buffers, in each of which data for only one destination is stored and first-in first-out policy is employed. 4 It is assumed that errors are detected reliably at the receivers, and when the system employs ARQ protocol, ACK and retransmission request (RQ) packets are assumed to be received with no errors. 121

141 In the broadcast phase, relay node forwards information bits to their destinations through a broadcast channel. The received signal at D j is Y j [i] = h j [i]x r [i] + n j [i], j = 1, 2 (5.35) where X r stands for the transmitted signal from R 5, n j is the additive Gaussian noise at D j, and h j represents the channel fading coefficient of the R D j link. Magnitudesquares of the fading coefficients in both phases are denoted by z j [i] = g j [i] 2 and ω j [i] = h j [i] 2, for j = 1, 2. In our analysis, we consider block fading and assume that fading coefficients stay constant in one time block, and change independently from block to block. While our analysis is general and applicable to any fading distribution with finite variances, we assume Rayleigh fading in all channels in our numerical analysis. The transmitted signals are subject to energy constraints given by E{ X j 2 } P j /B for j = 1, 2 and E{ X r 2 } P r /B, where B is the system bandwidth and P k for k = 1, 2, r is the transmit power constraint for the corresponding node. The additive noise terms n k [i] for k = 1, 2, r are independent, zero-mean, circularly symmetric, complex Gaussian random variables with variances E{ n k [i] 2 } = N 0. Then, signalto-noise ratios are defined as SNR k = P k N 0 B (5.36) where k = 1, 2, r. Finally, there are three important system parameters: τ, ρ and δ. τ (0, 1) denotes the fraction of time allocated to the multiple-access phase, and hence the fraction of time allocated to the broadcast phase is 1 τ. ρ (0, 1) represents the fraction of power allocated by the relay to the transmission of the message intended for D 1, and therefore the fraction of power allocated to the transmission to D 2 is 5 The signal transmitted from the relay can be written as X r = X r1 + X r2, and hence is a combination of X r1 and X r2, which are the signals intended for D 1 and D 2, respectively. 122

142 1 ρ. In the multiple-access phase, relay node decodes the received signal using different decoding orders, and the fraction of time allocated to decoding order {1, 2} and {2, 1} at the relay node are denoted by δ and 1 δ, respectively. This time sharing strategy between different decoding orders is used only for the case of variable-rate transmissions, performed when CSI is available at all transmitters. For fixed-rate transmission schemes, decoding order is part of the decoding strategy, which is fixed for each node. In this section, system throughput is characterized by the pair of maximum constant arrival rates R 1 and R 2 that can be supported by the relay network with two pairs of source-destination nodes in the presence of statistical queuing constraints. Detailed discussion about the arrival rates for two-hop channels in the presence of statistical queuing constraints is given in Chapter 2. Finally, we provide a list of notations together with their descriptions in Table Throughput of the Two-Source Two-Destination Relay Network With Variable Transmission Rates In this subsection, we study the throughput of the two-source two-destination relay network with variable-rate transmissions. Under the assumption that CSI is available at each transmitter, transmitters adapt their transmission rate to the instantaneous channel conditions, and the departure rates at each buffer are given by the corresponding instantaneous channel capacities. To perform an effective capacity analysis at each node with a buffer, we have to first identify the instantaneous transmission rates as functions of the fading coefficients. 123

143 Table 5.1: Table of notations for Section 5.3 Notation Definition Y j Received signal at relay R (for j = r) or destination D j (for j = 1, 2). X j Transmitted signal from relay R (for j = r) or source S j (for j = 1, 2). g j Fading coefficient of the S j R link. z j Magnitude-square of the fading coefficient g j. h j Fading coefficient of the R D j link. ω j Magnitude-square of the fading coefficient h j. n j Additive Gaussian noise at the relay R( for j = r) or destination D j (for j = 1, 2) with variance N 0. SNR j Signal-to-noise ratio of relay R (for j = r) or source S j (for j = 1, 2). θ j QoS exponent associated with the buffer constraint at relay R (for j = r) or source S j (for j = 1, 2). Λ(θ) LMGF of a departure or arrival process as a function of the QoS exponent θ. τ The fraction of time allocated to the multiple-access phase. ρ The fraction of power allocated by the relay to the transmission of the message intended for D 1. δ The fraction of time allocated to decoding order {1, 2} at relay R. R j The maximum constant arrival rate at source S j that can be supported under queuing constraints. R A,B The instantaneous channel capacity of link A B. r A,B The fixed transmission rate of link A B in the fixed rate scheme. 124

144 Instantaneous Transmission Rates in Multiple User Relay Networks We initially describe the instantaneous transmission rates of four links. Let us first consider the multiple-access phase in which links S 1 R and S 2 R are active simultaneously. When the decoding order at the relay is given by {1, 2}, i.e., the information sent from node S 1 is decoded first, and the information sent from node S 2 is decoded after interference cancelation, then the maximum instantaneous achievable rates at S 1 and S 2 are given, respectively, by [32] ) R S1,R{1,2} = B log 2 (1 + SNR 1z 1 1+SNR 2 z 2, R S2,R{1,2} = B log 2 (1 + SNR 2 z 2 ). (5.37) If the decoding order at the relay node is {2, 1}, then we have R S1,R{2,1} = B log 2 (1 + SNR 1 z 1 ), ) R S2,R{2,1} = B log 2 (1 + SNR 2z 2 1+SNR 1 z 1. (5.38) If we perform time-sharing between two decoding orders with parameter δ, then the rates of links S 1 R and S 2 R are characterized by (5.37) in δ fraction of the time, and the rates are characterized by (5.38) rest of the time. Overall, the transmission rates between the source nodes and the relay node can be expressed as R Sj,R = δr Sj,R{1,2} + (1 δ)r Sj,R{2,1}, (5.39) for j = 1, 2. In the broadcast phase, relay node forwards packets to their corresponding destinations. In this phase, only links R D 1 and R D 2 are active. When the channel conditions are available at the relay node and destinations, the decoding order are 125

145 decided by the relationship between ω 1 and ω 2, and the instantaneous transmission rates are given by [88] [34] R R,D1 = B log 2 (1 + R R,D2 = B log 2 (1 + ) ρsnr rω 1 1+(1 ρ)snr rω 1 1{ω 1 <ω 2, } ) (5.40) (1 ρ)snr r ω 2 1+ρSNR rω 2 1{ω 2 <ω 1 } where 1{ } is indicator function Stability Conditions With the expressions of the instantaneous rates for both the multiple-access channel and broadcast channel described above, we can characterize the stability region in the ρ τ δ space. Stability at the source buffers is ensured by requiring the arrival rates to satisfy (2.25), which actually leads to compliance with the stricter condition that the tail distribution of the buffer length decays exponentially fast. The stability conditions at the relay node requires the average arrival rate to be less than or equal to the average departure rate at each buffer in the relay. Hence, the stability conditions can be formulated as τ ( δe{r S1,R{1,2}} + (1 δ)e{r S1,R{2,1}} ) (1 τ)e{r R,D1 }, τ ( δe{r S2,R{1,2}} + (1 δ)e{r S2,R{2,1}} ) (1 τ)e{r R,D2 }. (5.41) Plugging (5.37), (5.38), and (5.40) into (5.41), we obtain 126

146 ( )} ρsnr (1 τ)e {B log rω 1 1+(1 ρ)snr r ω 1 1{ω 1 <ω 2 } ( ) τ (δe{b log SNR 1z 1 1+SNR 2 z 2 )} + (1 δ)e{b log 2 (1 + SNR 1 z 1 )}, ( )} (1 ρ)snr (1 τ)e {B log r ω 2 1+ρSNR r ω 2 1{ω 2 <ω 1 } ( ) τ (δe{b log 2 (1 + SNR 2 z 2 )} + (1 δ)e{b log SNR 2z 2 1+SNR 1 z 1 )}. (5.42) All feasible (ρ,τ,δ)-tuples satisfying the inequalities in (5.42) form the stability region in the ρ τ δ space. Hence, we formally define the the stability region Ξ in the ρ τ δ space as Ξ = {(ρ, τ, δ) ρ, τ, and δ that satisfy (5.42)}. (5.43) For a certain time sharing scheme at the relay node with fixed δ, since τ is the time fraction allocated to the multiple-access phase, lower τ value is more likely to satisfy the stability condition, and the two inequalities in (5.42) provide two upper bounds on τ as functions of ρ. The power allocation parameter ρ has a different influence on these two phases. With more power allocated to transmission to D i in the broadcast phase, the corresponding buffer in the relay can support a higher τ value while satisfying the stability constraint Throughput Region under Statistical Queuing Constraints As noted before, for a certain parameter setting, the system throughput is defined as the pair of constant arrival rates R 1 and R 2, which can be supported by two-hop links S 1 D 1 and S 2 D 2, respectively, under queuing constraints. Since stability is a prerequisite for effective capacity analysis, our system throughput is only defined with parameter values included in the stability region. For those parameter settings 127

147 outside the stability region, at least one of the queuing constraints cannot be satisfied, and the system throughput is set to zero. Using the results in the previous subsection, to comply with queuing constraints at all nodes, R j for j = 1, 2 has to satisfy (2.25) and (2.26) simultaneously, which leads to the following characterization of the system throughput. min R j = min Theorem 16 For any parameter setting {τ, ρ, δ} that satisfies the stability conditions, the maximum constant arrival rate R j, which can be supported at source node S j for j = 1, 2 in the presence of all queuing constraints, is given by { 1θj log(e{e θjτrsj,r }), 1θr log(e{e θ r(1 τ)rr,dj }) } θ r θ j { 1θj log(e{e θjτrsj,r }), 1θj ( log(e{e θ r(1 τ)r R,Dj }) + log(e{e (θ r θ j )τr Sj,R })) } θ r > θ j, (5.44) Proof 4 See Appendix A.10. Following this characterization, some properties of the system throughput are shown in the next subsection Properties of the System Throughput under Queuing Constraints In the previous subsection, we have characterized the throughput of the two-source two-destination relay network. Based on (5.44), we next analyze the behavior of the throughput in the parameter space, and establish several convexity properties, which can lead to simplifications in parameter optimization. Theorem 17 In the stability region, for a given τ ρ pair, the maximum arrival rates R 1, R 2 and the sum rate R 1 + R 2 are concave over the time sharing parameter δ between different decoding orders at the relay. 128

148 Proof : See Appendix A.11. Theorem 17 indicates that there exists a globally optimal time sharing parameter for the two possible decoding orders at the relay, which can be determined via convex optimization methods. Similarly, the system throughput functions are also concave functions of τ, which is the parameter for time allocation between the multiple-access and broadcast phases. Theorem 18 In the stability region, for given power allocation parameter ρ and timesharing parameter δ, the maximum arrival rates R 1, R 2 and the sum rate R 1 + R 2 are concave over the time allocation parameter τ. Proof : See Appendix A.12. Using these results, we can maximize the system throughput over δ and τ under stability constraints by employing efficient convex optimization methods Throughput of Multi-Source Multi-Destination Networks Our analysis in this subsection has primarily considered a two-source two-destination relay network. However, using similar techniques and approach, we can extend the analysis to multi-source multi-destination networks. For instance, let us consider a multiple-user model in which N sources send information to their corresponding destinations with the help of a relay node. The magnitude-squares of the fading coefficients of links S j R and R D j are represented by z j and ω j, respectively. Compared with the two-user model, adding more users only increases the dimension of the parameter space while the analytical methods and results essentially remain the same. In this multi-user setting, system parameters ρ = (ρ 1, ρ 2,, ρ N ) and δ = (δ 1, δ 2,, δ N! ) become vectors, while the time allocation parameter τ is still a scalar. In the multiple-access phase, we denote the k th decoding order at the relay as π k = {k 1, k 2,, k N }, which is a permutation of {1, 2,, N}. With this decoding 129

149 order, the instantaneous rate of the S ki R link is characterized by R Ski,R,π k = B log 2 (1 + SNR ki z ki 1 + N j=i+1 SNR k j z kj ). (5.45) Given a time sharing vector δ = (δ 1, δ 2,, δ N! ), the rate of the S j R link is given by N! R Sj,R = δ k R Sj,R,π k, (5.46) k=1 for j = 1, 2,, N. For the broadcast channel, the instantaneous rate is given by ρ j SNR r ω j R R,Dj = B log 2 ( N l=1,l j ρ lsnr r ω j 1{ω j < ω l } ), (5.47) for j = 1, 2,, N. Similarly, the stability region in the parameter space is defined as Ξ = { (τ,ρ 1,, ρ N, δ 1,, δ N! ) τ, ρ and δ that satisfy τe{r Sj,R} (1 τ)e{r R,Dj }, N ρ i = 1 and i=1 N! i=1 } δ i = 1, for all j = 1, 2,, N. (5.48) In this multiple-user setting, the dimension of the parameter space becomes much higher than that in the two-user model. For a set of parameters that guarantee the stability conditions, the throughput of the S j D j link under queuing constraints satisfies (2.25) and (2.26) simultaneously, and hence is given by (5.44), for j = 1, 2,, N, with the instantaneous rate expressions provided above. 130

150 Numerical Results In this subsection, numerical results are provided to further analyze the throughput of the two-source two-destination relay network with variable transmission rates. Our numerical results are based on (5.44). In order to verify our analysis, we have conducted Monte Carlo simulations in which we have generated arrivals to the buffer at constant rates determined by our theoretical characterization in (5.44) and also generated random (Rayleigh) fading coefficients to simulate the wireless channel and random transmission rates. We have tracked the buffer occupancy and overflows for different threshold levels. We plot the simulated logarithmic buffer overflow probabilities as functions of the overflow threshold q max in Figs and In each simulation, we generate time blocks to estimate the buffer overflow probability, and repeat each simulation 1000 times to evaluate the averages. We set the queuing constraints as θ 1 = θ 2 = θ r = 0.1, and the constant arrival rates at nodes S 1 and S 2 are determined from (5.44). In both figures, E{z j } = E{ω j } = 1, τ = ρ = δ = 0.5, SNR 1 = SNR 2 = 10dB. In Fig. 5.13, we set SNR r = 30dB. Note that log Pr{Q q max } log γ θq max, the slope of the logarithmic overflow probability is expected to be proportional to θ. Although large q max is required, our simulation results show that log Pr{Q q max } can be approximated as a linear function of q max starting from relatively small q max. In Fig. 5.13, the slopes of the logarithmic overflow probabilities at buffers in S 1 and S 2 are and 0.099, respectively. This implies that simulation results demonstrate perfect agreement with the analysis and the arrival rates given by (5.44) fit the queuing constraints at S 1 and S 2 exactly. We also observe that the logarithmic overflow probabilities of the two relay buffers decay faster with steeper slopes than our requirement of θ r = 0.1. In this specific example, due to relay having a relatively large transmit power, the system performance is mainly decided by the multiple-access phase, which is the bottleneck of the system. Although the relay can potentially 131

151 0-1 log Pr{Q>q max } Buffer at S 1-5 Buffer at S 2 Relay buffer for S 1 Relay buffer for S q (bit) max Figure 5.13: Logarithmic buffer overflow probability vs. buffer overflow threshold. support higher R 1 and R 2, this is not allowed by the multiple-access phase. As we reduce the transmission power of the relay node, the system bottleneck shifts to the broadcast phase and the situation is reversed. In Fig. 5.14, we reduce SNR r to 27.5dB. Now, the arrival rates given by (5.44) fit the queuing constraints at the relay exactly, and the corresponding slopes for the two relay buffers are and 0.097, respectively. On the other hand, the decays of the overflow probabilities at the source nodes are faster, meaning that sources can potentially support higher arrival rates but this leads to the violation of the overflow constraints at the relay buffers and is therefore not allowed. Overall, these simulation results, while confirming the analysis, also interestingly unveil the critical interactions between the queues and buffer constraints. For the rest numerical results in this subsection, we consider Rayleigh fading and we set SNR 1 = SNR 2 = 3 db and SNR r = 6 db. Fig shows the influence of the position of the relay node for different θ values. We assume a symmetric model, in which θ 1 = θ 2 = θ r, and Dist S1,R = Dist S2,R and Dist R,D1 = Dist R,D2, where Dist A,B stands for the distance between A and B. The overall distance D = 132

152 log Pr{Q>q max } Buffer at S 1 Buffer at S 2 Relay buffer for S 1 Relay buffer for S q (bit) max Figure 5.14: Logarithmic buffer overflow probability vs. buffer overflow threshold. Dist S1,R + Dist R,D1 = Dist S2,R + Dist R,D2 = 2, and the position parameter d = Dist S1,R = Dist S 2,R D D. Obviously, d [0, 1], and the smaller value of d indicates that relay is closer to the source. Path loss as a function of distance is incorporated into the statistics of fading powers, and hence, we have E{z j } = ( 1 D d )4 and E{ω j } = ( 1 D(1 d) )4 for j = 1, 2. In the figure, we see that the maximum sum rate R 1 + R 2 is achieved when d is close to 0.5, which means that it is better to place the relay in the middle between the source and destination in this symmetric setting. When the relay is close to the source nodes, the channels between the relay and destinations deteriorate and the overall throughput is limited by the broadcast links. Similarly, the multiple-access links become the bottleneck when d is close to 1. Also, we observe that the system throughput decreases when θ increases due to tighter queuing constraints. This occurs because when θ is small, the effective capacity is closer to the Shannon capacity, and as θ increases, effective capacity diminishes and approaches the zero-outage capacity (which is, for instance, zero in Rayleigh fading). In Fig. 5.16, we consider an asymmetric scenario in terms of QoS exponents, and again plot sum rate vs. relay location parameter d. We fix ρ = δ = 0.5 and 133

153 θ 1 =θ 2 =θ r =1 θ 1 =θ 2 =θ r =2 0.9 Sum rate (bit/s) d Figure 5.15: The sum rate vs. relay location parameter θ 1 =θ 2 =0.1, θ r =5 θ 1 =θ 2 =5, θ r =0.1 Sum rate (bit/s) d Figure 5.16: The sum rate vs. relay location parameter. 134

154 determine the optimal value of τ for each given d. When θ r = 5, θ 1 = θ 2 = 0.1, the maximum sum rate is achieved at d = In this case, relay should be placed closer to the destinations to support more stringent queuing constraints at the relay. On the other hand, when θ 1 = θ 2 = 5, θ r = 0.1, the optimal position for the relay is at d = Hence, the relay needs to be closer to the source nodes to support their stricter queuing constraints. These observations indicate the sensitivity of optimal relay placement to different QoS requirements. Figs and 5.18 demonstrate the concavity 6 of the sum rate with respect to τ and δ, respectively, when the parameter values are in the stability region. In these two figures, θ 1 = θ 2 = θ r = 1, and E{z j } = E{ω j } = 1. In Fig. 5.17, the sum rate curves first increase with τ, and then decrease very fast after reaching the maximum sum rate. As τ exceeds a threshold, the sum rates drop to 0, because stability conditions are violated beyond this threshold. In Fig. 5.18, the sum rate curves are concave with respect to the decoding parameter δ, and the optimal δ values which maximize the sum rate are all close to 0.5. In this case, relay allocates time to two decoding orders equally. However, note that these results are again for a symmetric scenario in which all QoS exponents are the same. In Fig. 5.19, we address a heterogeneous setting in terms of QoS exponents. For instance, when θ 1 = θ r = 1 and θ 2 = 0.1, the optimal value of δ is 1. Hence, sum rate is maximized when the decoding order at the relay is always fixed as {1, 2}, i.e., relay initially decodes data arriving from source S 1 in the presence of interfering signal of S 2. The underlying reason for this result is the following. Source S 1 operates under stricter QoS constraints with respect to S 2 and consequently can support smaller arrival rates and needs, in turn, smaller transmission rates which can be sustained even in the presence of interference. If the roles are switched (i.e., if we have θ 2 = θ r = 1 and θ 1 = 0.1), then the optimal value of δ is zero. If the QoS exponents are more comparable (e.g., θ 1 = 1 and θ 2 = 0.5 or 6 These concavity results can simplify the search for the optimal parameter setting with the use of convex optimization tools. 135

155 ρ=0.5,δ=0.2 ρ=0.5,δ= Sum rate (bit/s) τ Figure 5.17: The sum rate vs. time allocation parameter τ. θ 1 = 0.5 and θ 2 = 1), we notice that optimal values of δ start to slightly deviate from the two extremes of 0 and 1. Fig shows the throughput regions of the two-source two-destination relay network under different queuing constraints. The boundary of the throughput region is obtained by searching over the three-dimensional parameter space. When R 1 achieves its maximum value, δ is close to 0, and ρ is slightly greater than 0.5, because decoding order {2, 1} and more power in the R D 1 link can help S 1 D 1 link to support higher arrival rates. Similar results are also obtained for the maximum value of the arrival rate R Throughput of the Two-Source Two-Destination Relay Network With Fixed Transmission Rates In practice, CSI may not be available at the transmitters. In such cases, the instantaneous departure rates from each buffer will be different. In this subsection, we investigate the system throughput when the transmitters do not have CSI and 136

156 Sum rate (bit/s) ρ=0.5, τ=0.4 ρ=0.5,τ=0.39 ρ=0.5,τ= δ Figure 5.18: The sum rate vs. decoding parameter δ Sum rate (bit/s) θ =θ r =1, θ 2 =0.1 θ 2 =θ r =1, θ 1 =0.1 θ 1 =θ r =1, θ 2 =0.5 θ 2 =θ r =1, θ 1 = δ Figure 5.19: The sum rate vs. decoding parameter δ. 137

157 θ 1 =θ 2 =θ r =1 θ 1 =θ 2 =θ r =2 0.5 R 2 (bit/s) R 1 (bit/s) Figure 5.20: System throughput region R 1 vs. R 2. transmit at fixed rates. We further assume that an ARQ protocol is employed and retransmissions are requested in case of communication failures. [89] In the ARQ protocol, if the receiver decodes the packet, an ACK feedback is sent to the transmitter, otherwise the receiver asks for the retransmission of the same packet until the receiver gets the packet correctly. Here, the feedback signals are assumed to be transmitted without error and delay. In other words, the transmitter gets the error free feedback signal immediately after it completes the transmission of the corresponding packet. In this model, ARQ scheme guarantees the reliability, and the packets are kept in the buffer until the receiver decodes it correctly. With this ARQ assumption, the instantaneous departure rate at a buffer is equal to the fixed transmission rate if the receiver decodes the packet correctly, and it is 0 if the transmission fails. In order to determine the asymptotic LMGFs Λ Sj,R, Λ R and Λ R,Dj, we have to first identify the success and failure probabilities of these fixed-rate transmissions. 138

158 State Probabilities in the Multiple Access Phase As noted before, source node S j transmits in the multiple-access phase with fixed rate r Sj,R for j = 1, 2. In the broadcast phase, relay node transmits to destination D j with fixed rate r R,Dj, for j = 1, 2. Since all transmitters are using the ARQ protocol, all links can be regarded to be in either ON or OFF state at a given time. The link is in the ON state if the fixed transmission rate is less than the instantaneous channel capacity, and the receiver can decode the packet correctly. Otherwise, failure occurs and the link is in the OFF state in which the transmission rate is effectively zero. In the multiple-access phase, the channel capacity is related to the decoding s- trategy of the relay, which is described as follows: 1. Relay tries to decode the first packet while treating the interference as noise. Without loss of generality, we assume that the relay always starts with the packets sent by S 1. (a) If the receiver decodes the packet correctly, then it moves to the interference cancelation step (i.e., Step 2 below). (b) If the receiver cannot decode the packet from S 1, it tries to decode the packet from S 2. (c) If the receiver decodes it correctly, then it moves to the interference cancelation step. Otherwise, it asks retransmission from both transmitters, and decoding process ends. 2. The receiver performs interference cancelation by subtracting the decoded message from the received signal. 3. The receiver attempts to decode the remaining packet after interference cancelation. If it cannot decode the packet, retransmission is required from the corresponding transmitter. 139

159 Later in our analysis, we show that it does not make any difference if the relay starts with the packets sent by S 2. In the multiple-access phase, according to the states of the links S 1 R and S 2 R, we identify four possible cases: Case 1: In this case, the relay node cannot decode any of the received messages. Relay node attempts to decode the message from S 1 first, while treating the signal from S 2 as noise. Following unsuccessful decoding, relay tries to decode the message from S 2 while treating the interference as noise, and cannot succeed either. Hence, we in this scenario have ) r S1,R > τb log 2 (1 + SNR 1z 1 1+SNR 2 z 2 ) r S2,R > τb log 2 (1. (5.49) + SNR 2z 2 1+SNR 1 z 1 (5.49) can be transformed into the following bounds on fading magnitude-squares z 1 and z 2 : 1 z 1 > SNR (SNR 1 2 z 2 / ( 1 z 1 < SNR 1 2 r S 1,R z 2 > 0 z 2 < (2 r S 2,R τb 1 ) (2 r S 2,R τb 1 ) 1 ) τb 1 (1 + SNR 2 z 2 ) ) 2 r S 1,R τb / { [( ) ) ]} SNR 2 2 r S 1,R τb 1 (2 r S 2,R τb 1 1, if ) ) (2 r S 1,R τb 1 (2 r S 2,R τb 1 < 1. (5.50) (5.50) defines a region on the first quadrant of (z 1, z 2 ) plane, which we denote by Ψ 1. Therefore, the probability of Case 1 is given by 140

160 P M,1 = Ψ 1 p z1,z 2 (z 1, z 2 )dz 1 dz 2, (5.51) where p z1,z 2 (z 1, z 2 ) is the joint probability density function (pdf) of z 1 and z 2. For instance, if we consider independent Rayleigh fading, then joint pdf is given by p z1,z 2 (z 1, z 2 ) = 1 ( exp z 1 z ) 2, (5.52) z 1 z 2 z 1 z 2 where z j represents the expected value of z j for j = 1, 2. In this case, since the relay can decode none of them, switching the decoding order will not make a difference. Case 2: In this case, the relay can decode the message from S 1 in the presence of interference from S 2, but the message from S 2 cannot be decoded successfully even after interference cancelation. This scenario can be expressed by the following two inequalities: ) r S1,R τb log 2 (1 + SNR 1z 1 1+SNR 2 z 2, (5.53) r S2,R > τb log 2 (1 + SNR 2 z 2 ) which can further be expressed as z 1 z 2 < ) (2 r S 1,R τb 1 (1 + SNR 2 z 2 )/SNR 1 ). (5.54) (2 r S 2,R τb 1 /SNR 2 (5.54) defines the region Ψ 2 on the first quadrant of (z 1, z 2 ) plane, and the probability of Case 2 is given by 141

161 P M,2 = Ψ 2 p z1,z 2 (z 1, z 2 )dz 1 dz 2. (5.55) Notice that since the relay cannot decode the message from S 2 even after interference cancelation, changing the decoding order would not help. Case 3: This is the symmetric version of Case 2 with the roles of S 1 and S 2 interchanged. Hence, the relay can decode the message from S 2 with interference, but not the message from S 1. The probability of this case is given by P M,3 = p z1,z 2 (z 1, z 2 )dz 1 dz 2, (5.56) Ψ 3 where Ψ 3 is the region in the first quadrant of (z 1, z 2 ) plane described by z 2 z 1 < ) (2 r S 2,R τb 1 (1 + SNR 1 z 1 )/SNR 2 ) (2 r S 1,R τb 1 /SNR 1. (5.57) Case 4: In this case, the relay can decode both messages from two source nodes. Although the description of this case is more involved, we can fortunately express the probability of this case as 3 P M,4 = 1 P M,i. (5.58) Note now that the ON state probability of the S j R link is given by i=1 P j = P M,j+1 + P M,4 for j = 1, 2. (5.59) 142

162 State Probabilities in Broadcast Phase In the broadcast phase, the decoding strategy of the destination node D j for j = 1, 2 is described as follows: 1. D j attempts to decode its own packet first while treating the interference as noise. (a) If the receiver decodes correctly, then the decoding process ends. (b) If the receiver cannot decode its own packet first, it tries to decode the packet intended for the other destination first. (c) If the receiver decodes the other packet correctly, then it moves to the interference cancelation step. Otherwise, it asks for a retransmission from the relay node, and decoding process ceases. 2. The receiver performs interference cancelation by subtracting the decoded message from the received signal. 3. The receiver tries to decode its own packet after interference cancelation. If it still cannot decode the packet, retransmission is required from the relay. There are two possibilities for link R D 1 being in the ON state. D 1 may decode its message while treating interference as noise, or it may decode the message for D 2 first, and then decode its own message after interference cancelation. These are described by the following conditions: 143

163 ) SNR r ρω 1 r R,D1 (1 τ)b log 2 ( SNR r (1 ρ)ω 1 or ) r R,D1 > (1 τ)b log 2 (1 + SNRrρω 1 1+SNR r (1 ρ)ω 1 r R,D1 (1 τ)b log 2 (1 + SNR r ρω 1 ) r R,D2 (1 τ)b log 2 (1 + SNR ) r(1 ρ)ω 1 1+SNR rρω 1 (5.60) (5.61) where ω 1 = h 1 2. We first define ( a 1 = 2 ( a 2 = 2 ( a 3 = 2 r R,D1 ) / { (1 τ)b 1 r R,D1 r R,D2 ) / { (1 τ)b 1 r R,D1 SNR r [1 (1 ρ)2 (1 τ)b ]} (5.62) ) (1 τ)b 1 /(SNR r ρ) (5.63) r R,D2 ]} SNR r [1 ρ2 (1 τ)b. (5.64) Using the conditions in (5.60) and (5.61), we can express the ON probability of the R D 1 link as 0, a 1 < 0 and a 3 < 0 P 3 = a 1 p ω1 (ω 1 )dω 1, (a 1 > 0 and a 3 < 0) or (a 3 > a 1 > 0) (5.65) p max{a 2,a 3 } ω 1 (ω 1 )dω 1, otherwise, where, for instance, in Rayleigh fading, p ω1 (ω 1 ) = 1 ω 1 exp( ω 1 /ω 1 ) is the pdf of ω 1, and ω 1 is the expected value of ω 1. A similar analysis can be applied to obtain the ON state probability of the link R D 2 as 144

164 0, b 1 < 0 and b 3 < 0 P 4 = b 1 p ω2 (ω 2 )dω 2, (b 1 > 0 and b 3 < 0) or (b 3 > b 1 > 0) (5.66) p max{b 2,b 3 } ω 2 (ω 2 )dω 2, otherwise, where, for instance, if again Rayleigh fading is considered, p ω2 (ω 2 ) = 1 ω 2 exp( ω 2 /ω 2 ) is the pdf of ω 2, ω 2 is the expected value of ω 2, and parameters b j for j = 1, 2, 3 are defined as ( b 1 = 2 ( b 2 = 2 ( b 3 = 2 r R,D2 ) / { (1 τ)b 1 r R,D2 r R,D1 ) / { (1 τ)b 1 r R,D2 SNR r [1 ρ2 (1 τ)b ]} (5.67) ) (1 τ)b 1 /(SNR r (1 ρ)) (5.68) r R,D1 ]} SNR r [1 (1 ρ)2 (1 τ)b. (5.69) From the view of ARQ, outage happens when the receiver cannot decode the received signal, thus 1 P j can be regarded as outage probabilities of their corresponding links Stability Conditions Similar to the variable-rate case, stability at the source buffers is ensured by requiring the arrival rates to satisfy (2.25). The stability at the relay buffer requires the average arrival rate to be smaller than the average departure rate. This can be ensured by choosing the parameters (ρ, τ) accordingly. Now, our parameter space is just a two dimensional plane, and we can describe the feasible set of (ρ, τ) for stability as Ξ = {(ρ, τ) r S1,RP 1 r R,D1 P 3 and r S2,RP 2 r R,D2 P 4 }. (5.70) 145

165 It can be easily seen that both average arrival rates r S1,RP 1 and r S2,RP 2 are monotonic increasing functions of τ, because allocating more time to the multiple-access phase is beneficial to links S 1 R and S 2 R. For the same reason, average departure rates r R,D1 P 3 and r R,D2 P 4 are decreasing functions of τ. Therefore, for given ρ, conditions in (5.70) provide two upper bound curves on τ. Then, feasible set for stability is the region under these two upper bounds on the (ρ, τ) plane Throughput Region under Statistical Queuing Constraints Similar to the variable-rate case, the system throughput is only defined for the feasible parameter setting, which guarantees the stability. For those parameter values outside the stability region, the system throughput is set to 0. For a given feasible (ρ, τ) pair and given fixed transmission rates, we next formulate the maximum constant arrival rates R 1 and R 2 at the source nodes under statistical queuing constraints parameterized by QoS exponents θ 1, θ 2 and θ r. For the described ON-OFF link model with independent fading coefficients, the asymptotic LMGFs can be simplified as ( ) Λ Sj,R(θ) = log e θr S j,r P j + e θ0 (1 P j ) (5.71) ) = log (e θr S j,r P j + 1 P j (5.72) ( ) Λ R,Dj (θ) = log e θr R,D j Pj+2 + e θ0 (1 P j+2 ) (5.73) ) = log (e θr R,D j Pj P j+2 j = 1, 2, (5.74) noting that the transmission rates are either equal to the fixed rates r Sj,R from the sources and r R,Dj from the relay if transmissions are successful and the corresponding links are in the ON state with probabilities P j and P j+2 for j = 1, 2, and are zero in case of failures. Recall that in order to satisfy the queuing constraints at both the sources and the relay, the arrival rates at the two source nodes should satisfy (2.25) and (2.26) simultaneously. Then, using (5.72) and (5.74) and considering (2.25) and 146

166 (2.26), we can characterize the maximum constant arrival rates as { min 1 θ 1 log ( ) e θ 1r S1,R P P 1, 1 θ r log ( ) } e θ rr R,D1 P P 3 θ r θ 1 { R 1 = min 1 θ 1 log ( ) e θ 1r S1,R P P 1, ( ( ) 1 θ 1 log e θ r r R,D1 P P 3 (5.75) + log ( ) ) } e (θ r θ 1 )r S1,R P P 1 θ r > θ 1 { min 1 θ 2 log ( ) e θ 2r S2,R P P 2, 1 θ r log ( ) } e θ rr R,D2 P P 4 θ r θ 2 { R 2 = min 1 θ 2 log ( ) e θ 2r S2,R P P 2, ( ( ) 1 θ 2 log e θ r r R,D2 P P 4 (5.76) + log ( e (θr θ 2)r S2,R P P 2 ) ) } θ r > θ 2. Searching over the stability region Ξ, the arrival rates R 1, R 2 and their sum rate can be further optimized over ρ and τ, which will be numerically evaluated in the next subsection Numerical Results In this subsection, numerical results for the two-source two-destination relay network with fixed transmission rates are provided. First, we verify our analysis through Monte Carlo simulations. In each simulation, we generate time blocks to estimate the buffer overflow probability, and repeat each simulation 500 times to evaluate the averages. We set the queuing constraints as θ 1 = θ 2 = θ r = 0.1, and the constant arrival rates at nodes S 1 and S 2 are chosen according to (5.75) and (5.76), respectively. We further assume that r S1,R = r S2,R = r R,D1 = r R,D2 = 0.3 bit/s, z 1 = z 2 = ω 1 = ω 2 = 2, SNR 1 = 6.02 db, SNR 2 = 4.77 db, SNR r = 7.78 db, τ = 0.39, 147

167 ρ = 0.7. We plot the logarithmic buffer overflow probabilities as functions of the overflow thresholds in Fig In this specific example, the system throughput is mainly decided by the multiple-access channel, and the overflow probabilities at the buffers of the two source nodes almost exactly meet the queuing constraints. The simulated slopes of the logarithmic overflow probabilities at S 1 and S 2 are and 0.101, respectively. The overflow probabilities at the relay buffers diminish with steeper slopes than required and hence satisfy even stricter queuing constraints. Among the two relay buffers, we note that the overflow probability in the buffer keeping the data from S 1 decays much more faster. This is because we set ρ = 0.7, meaning that the R D 1 link gets more power than the R D 2 link. Fig demonstrates the case in which the performance bottleneck is in the broadcast phase. In Fig. 5.22, we again set the queuing constraints as θ 1 = θ 2 = θ r = 0.1. Other system parameters are given as r S1,R = r S2,R = bit/s, r R,D1 = r R,D2 = bit/s, z 1 = z 2 = ω 1 = ω 2 = 2, SNR 1 = 4.77dB, SNR 2 = 4.77dB, SNR r = 10dB, τ = 0.45, ρ = In this case, the overflow probabilities at the buffers of the two source nodes decrease faster than the imposed queuing constraints, and the overflow probabilities at the two relay buffers almost exactly meet the QoS requirements. Specifically, the simulated slopes of the logarithmic overflow probabilities at the two relay buffers are and 0.098, respectively. The rest numerical results are obtained for the following parameter values: r S1,R = r S2,R = r R,D1 = r R,D2 = 0.3 bit/s, z 1 = z 2 = ω 1 = ω 2 = 2, SNR 1 = 6.02 db, SNR 2 = 4.77dB, SNR r = 7.78dB, θ 1 = θ 2 = 1 and θ r = 3. Fig shows the influence of the power allocation parameter ρ on the successful transmission probabilities P 3 and P 4 in the broadcast phase. We observe that as ρ increases from 0 to 0.5 and hence a larger fraction of the power is allocated to the transmission of the message to D 1, P 3 grows dramatically while P 4 diminishes by a relatively small amount. This indicates that the sum arrival rate increases initially with increasing ρ. We also notice that both 148

168 Buffer at S 1 Buffer at S 2 Relay buffer for S 1 log Pr{Q>q max } Relay buffer for S q (bit) max Figure 5.21: Logarithmic buffer overflow probability vs. buffer overflow threshold. 0 Buff at S 1 Buff at S 2-2 Relay buff for S 1 Relay buff for S 2 log Pr{Q>q max } q (bit) max Figure 5.22: Logarithmic buffer overflow probability vs. buffer overflow threshold. 149

169 P ON P 3 with τ=0.4 P 4 with τ= ρ Figure 5.23: ON state probabilities in the broadcast phase vs. ρ with τ = 0.4. P 3 and P 4 decrease slightly at around ρ = 0.5 due to the increased interference caused by the joint transmission of messages at similar power levels in the broadcast phase. Similarly, the boundary of region of feasible (ρ, τ) pairs for stability at the relay buffer shown in Fig has a local minimum τ value at ρ close to 0.5. Additionally, we see in the figure that the boundary, which essentially bounds τ from above, can be regarded as the intersection of two upper bounds on τ as discussed in Section Fig shows the arrival rate R 1 as a function of (ρ, τ). Outside the feasible region, rate is set to zero. We note that as τ increases, R 1 initially increases and then decreases within the feasible region. From (5.75), we know that R 1 is characterized as the point-wise minimum of two functions, one being an increasing function of τ, while the other being a decreasing function. Therefore, there exists an optimal τ that maximizes R 1 for given ρ. We also observe that the maximum of R 1 over all (ρ, τ) is achieved when τ = 0.39 and ρ = 0.7. Note that with this relatively large ρ value, the S 1 R D 1 link can in general support higher arrival rates because the relay allocates more power for the transmission of the message coming from S 1. Similar numerical results can be obtained for R 2. Expectedly, R 2 has higher values when ρ <

170 τ Stability Region Figure 5.24: Region of feasible (ρ, τ) pairs for stability at the relay buffer. ρ R 1 (bit/s) τ ρ Figure 5.25: The arrival rate R 1 as a function of (ρ, τ). 151

171 Sum rate (bit/s) ρ Figure 5.26: Maximum sum arrival rate vs. ρ. Finally, we consider the maximum sum arrival rate. Fig plots the maximum sum arrival rate max{r 1 +R 2 } as a function of ρ. As ρ approaches 0, the performance of the S 1 R D 1 link is limited by the low transmission power of the relay, leading to the adoption of a very small τ value as seen in Fig Small value of τ lowers the throughput of the S 2 R D 2 link as well. Hence R 2 is also small. Similar concerns arise as ρ approaches 1. Hence, allocating the power almost exclusively for the transmission of one message is not an efficient strategy in terms of maximizing the sum rate. Indeed, the sum arrival rate is maximized when ρ = However, it is interesting to note that equal allocation (i.e., having ρ = 0.5) is not the optimal strategy either, because the sum rate has a local minimum point around ρ = 0.5 again as a reflection of increased interference. 152

172 Chapter 6 Throughput and Mode Selection in Two-way MIMO Systems under Queuing Constraints In this chapter, the throughput of and mode selection between half-duplex and fullduplex modes are studied in two-way MIMO systems operating under statistical queuing constraints. In particular, the effective capacity of these systems is determined in order to identify the throughput under constraints on the buffer overflow probability. In the low SNR regime, the optimal input covariance matrices that achieve the minimum energy per bit of the system are investigated. Full-duplex mode is found to have better performance at low SNRs and short distances, while half-duplex mode outperforms full-duplex operation at high SNR levels and long distances. 6.1 System Model We consider a two-way MIMO system shown in Fig. 6.1, in which two users A and B want to exchange information with each other. Both of them are equipped with multiple antennas. N A and N B are the numbers of antennas at users A and B, 153

173 Figure 6.1: System model for two-way MIMO channel respectively. We assume a discrete-time system model with block flat-fading. In each time block, the fading coefficients stay fixed, and change independently across blocks. Also, we assume that there is a buffer at each node to store the arriving packets, and these two users have to satisfy the statistical queuing constraints described in Chapter 2. Packets will be cleared from the buffer only after the corresponding receiver successfully decodes them. Perfect CSI is assumed to be available at both nodes, so that transmitters can adapt their transmission rates to the channel conditions. In this section, we investigate the system throughput in both half-duplex and full-duplex modes Half-Duplex Mode In the half-duplex mode, we consider both time division multiplexing (TDM) and frequency division multiplexing (FDM). In both cases, there is no self interference, and the input-output relationships are given by ȳ A = H BA x B + n A (6.1) ȳ B = H AB x A + n B, (6.2) 154

174 where x j, ȳ j and n j are the received signal vector, transmitted signal vector and Gaussian noise vector of node j, for j = A, B, and H AB and H BA are the channel matrices for the links A B and B A, respectively, whose components are the channel fading coefficients between the corresponding transmitting and receiving antenna pairs. The average energy of the transmitted signal is E{ x j 2 } = P j B j, where B j is the bandwidth allocated to the j th link. The noise vectors are assumed to be zero-mean Gaussian random vectors with covariance matrices given by E{ n j n j } = σ2 ni. At this point, we can define the system SNR for link A B as SNR A = E{ x A 2 } E{ n B 2 } = we have SNR B = P A N B B A. Similarly, σn 2 P B N A B B. In the special case of Rayleigh fading, the components of σn 2 the channel fading matrices are assumed to follow zero-mean circularly symmetric Gaussian distribution with variance σ 2 h, which we denote by CN (0, σ2 h ). Note that n B and x B are N B 1 dimensional vectors, n A and x A are N A 1 dimensional vectors, H AB is an N B N A dimensional matrix, and H BA is an N A N B dimensional matrix. In the half-duplex TDM mode, two users cannot transmit and receive at the same time. In this case, we denote the fraction of time allocated to link A B as τ, where τ [0, 1]. Therefore, the fraction of time allocated to link B A is 1 τ. In the half-duple FDM mode, the system divides the frequency band into two subbands. We denote the fraction of bandwidth allocated to link A B as ρ. Hence, the bandwidth allocated to the transmission of node A is B A = ρb, where B = B A + B B is the total bandwidth of the system, and B B = (1 ρ)b is the bandwidth allocated to the transmission of node B. Since the numbers of antennas are fixed at A and B for all transmission modes, including the full-duplex mode, each antenna should transmit and receive at the same time in half-duplex FDM mode and full-duplex mode to achieve the best performance. For half-duplex FDM mode, self-interference is negligible since transmitting and receiving are being performed in different frequency bands. Self-interference cancellation in full-duplex mode will be discussed in the next subsection. 155

175 6.1.2 Full-Duplex Mode In the full-duplex mode, two nodes A and B transmit and receive at the same time using a common frequency band, and there is self-interference which originates from the transmitting antennas at the same node. The discrete-time input-output relationships in the full-duplex mode can be expressed as ȳ A = H BA x B + γ A H AA x A + n A (6.3) ȳ B = H AB x A + γ B H BB x B + n B (6.4) where γ j and H jj represent the self-interference cancelation parameter and selfinterference channel matrix, respectively for node j, where j = A, B. The rest of the notation has the same description and SNR j is defined in the same way as in halfduplex mode. The self-interference components, γ j H jj x j, do not necessarily follow a Gaussian distribution. However, since Gaussian distribution leads to lower bounds on the channel capacities, we consider these components as zero-mean circularlysymmetrical complex Gaussian distributed, leading to a worst-case analysis. To have a fair comparison of the full-duplex and half-duplex modes, we assume that the same number of antennas are employed at both nodes for transmission and reception in both modes. This requires each antenna to transmit and receive simultaneously in full-duplex mode. Feasibility of this was demonstrated by a practical design that has been proposed in [90], guaranteeing over 40 db channel isolation between the transmitter and receiver circuity. With this assumption, we have two types of self-interference. The first type occurs between two antennas at the same node. For this type, interference signals are received from the transmitting antennas, and the channel fading coefficients are described by the off-diagonal components of H AA and H BB, which are assumed to have zero-mean and variance σs1. 2 At the receiver side, self-interference cancellation is described and quantified with the parameter γ j 156

176 at node j. From (6.3) and (6.4), we note that γ = 1 represents no self-interference cancelation scheme, while γ = 0 represents perfect cancelation. The second type of self-interference occurs on a single antenna, and the main component of the interference comes from its own transmitting circuit. By applying the techniques in [90], this self-interference can be controlled by introducing isolation between the transmitting and receiving circuity sharing the same antenna. If we denote the cancelation parameter of this type of self-interference by β j at node j, then the diagonal components of H jj can be assumed to have zero-mean and variance (σ s2 β j /γ j ) 2 to satisfy (6.3) and (6.4). 6.2 Throughput for Two-way MIMO systems In this section, we formulate the system throughput under queuing constraints for both half-duplex and full-duplex modes. Apparently, the covariance matrices of the transmitted signals have significant influence on the system throughput. We define the normalized input covariance matrix of x j as K xj = E{x jx j } P j /B j, for j = A, B. (6.5) Throughout this section, we mainly discuss the effective capacity for given input covariance matrices. The maximum throughput can be obtained through optimization over K xa and K xb pairs System Throughput for Half-Duplex TDM Mode The transmission in the half-duplex TDM two-way MIMO systems can be divided into two phases. In the first phase, node A sends information to node B, and node B keeps silent. This phase occupies τ fraction of the time. In this phase, the instantaneous 157

177 rate of user A can be expressed as [ r A = B log 2 det I + P A H Bσn 2 AB K xa H AB [ = B log 2 det I + N B SNR A H AB K xa H AB ] (6.6) ], (6.7) for given K xa. In the next phase, users A and B exchange their roles, and only node B transmits. This phase occupies 1 τ fraction of the time, and the instantaneous rate of user B is given by [ r B = B log 2 det I + P B H Bσn 2 BA K xb H BA [ = B log 2 det I + N A SNR B H BA K xb H BA ] (6.8) ], (6.9) for given K xb. By plugging (6.7) and (6.9) into (2.7), the effective capacities of links A B and B A are given by C A = 1 θ A log e ( E { e τθ A r A }), and (6.10) C B = 1 θ B log e ( E { e (1 τ)θ B r B }), (6.11) respectively, given the covariance matrices K xa and K xb. Theorem 19 For given input covariance matrices, the sum throughput C A + C B is a concave function of the time-fraction parameter τ. Proof 5 By taking the second derivative of C A with respect to τ, we get 2 C A τ 2 θ A = (E {e τθ Ar A}) 2 {E { } } { } ( { }) e τθ Ar A E r 2A e τθ Ar A E ra e τθ Ar A 2. (6.12) 158

178 Applying Cauchy-Schwarz inequality, we have E { e τθ Ar A } E { r 2 A e τθ Ar A } ( E { ra e τθ Ar A }) 2. (6.13) From (6.13) which implies that the second derivative in (6.12) is non-positive, we determine that the effective capacity of the A B link is a concave function of τ. Similarly, we can prove that C B is also a concave function of τ, leading to the desired result that the sum throughput C A + C B is concave. With this characterization, the optimal τ value which maximizes the sum throughput can be obtained via convex optimization algorithms System Throughput for Half-Duplex FDM Mode In the half-duplex FDM mode, both users can transmit and receive simultaneously, using different frequency bands. The instantaneous transmission rates of users A and B are given by [ r A = ρb log 2 det I + P A H ρbσn 2 AB K xa H AB [ = ρb log 2 det I + N B SNR A H AB K xa H AB ] (6.14) ], (6.15) and [ r B = (1 ρ)b log 2 det I + P B H (1 ρ)bσn 2 BA K xb H BA [ = (1 ρ)b log 2 det I + N A SNR B H BA K xb H BA ] ], (6.16) respectively. Comparing (6.15) and (6.16) with (6.7) and (6.9), we have the following observation. 159

179 Remark 1 For a half-duplex FDM two-way MIMO system with system parameter ρ, there exists a half-duplex TDM system with parameter τ = ρ, which can achieve the same system throughput with the same average power consumption. We use subscripts TDM and FDM to distinguish the corresponding quantities. For user A, we set SNR A,TDM = SNR A,FDM. Since we have τ = ρ, by comparing (6.7) and (6.15), apparently we should have r A,TDM = r A,FDM, which also implies that we have C A,TDM = C A,FDM. From SNR A,TDM = SNR A,FDM, we can get P A,TDM = P A,FDM /ρ. Then, the average power of TDM system can be expressed as P A,TDM = τp A,TDM = τ ρ P A,FDM = P A,FDM = P A,FDM. Now, we have shown that for user A, we can achieve the same throughput using the same average power in both TDM and FDM systems, when we set SNR A,TDM = SNR A,FDM and τ = ρ. Similarly, we can show the same result for user B, when we set SNR B,TDM = SNR B,FDM and τ = ρ. Therefore, TDM and FDM systems have the same performance, and we can only address one of them in the half-duplex mode. Hence, we subsequently consider the TDM system to represent the performance of the half-duplex mode System Throughput for Full-Duplex Mode In full-duplex mode, since the two users transmit and receive simultaneously in the same frequency band, we have additional self-interference terms as seen in (6.3) and (6.4). As mentioned in Section 6.1.2, there are two types of self-interference, experienced due to transmissions from the other antennas at the same node and from the transmitting circuitry of the same antenna, respectively. These interference terms are characterized by the off-diagonal and diagonal elements of the self-interference channel matrices, respectively. 160

180 We define the overall interference covariance matrix K zj as ( ) 2 γj K zj = H jje{x j x j σ }H jj + 1 E{ n n σn 2 j n j } (6.17) =N j γj 2 SNR jh jj K xj H jj + I, (6.18) for j = A, B. Then, the instantaneous transmission rates for A and B can be written, respectively, as [ r A = B log 2 det I + N B SNR A H AB K xa H AB K 1 zb ], (6.19) and [ ] r B = B log 2 det I + N A SNR B H BA K xb H BA K 1 za, (6.20) where K za and K zb are given by (6.18). Inserting (6.19) and (6.20) into (2.7), the effective capacities of the A B and B A links are given by C A = 1 ( { }) log θ e E e θ A r A, A and (6.21) C B = 1 ( { }) log θ e E e θ B r B, B (6.22) respectively. 6.3 Mode Selection In this subsection, we investigate the mode selection protocol in two-way MIMO systems with the goal of maximizing the sum throughput C sum = C A + C B. At the beginning of the transmission, two users evaluate the sum throughput of halfduplex and full-duplex modes, and choose the one with the higher sum throughput. 161

181 In Section 6.2, we have mentioned that the throughput depends on the input signal covariance matrices, K xa and K xb. In general, it is not easy to determine the optimal covariance matrices that maximize the sum throughput, but we can identify them in the low-snr regime Mode Selection in the Low-SNR Regime In the low-snr regime, minimum energy per bit is a widely used performance metric [84], which characterizes the minimum energy required to send one bit of information reliably. In our system setting, the minimum energy per bit is achieved when SNR approaches zero, and is given by E b N 0 min = lim SNR 0 SNR C E (SNR) = 1 Ċ E (0), (6.23) where ĊE(0) denotes the first derivative of the the effective capacity function C E (SNR) with respect to SNR at SNR = 0. Now, our purpose is to find the optimal covariance matrices that achieves the smallest E b N 0 min for both half-duplex and full-duplex modes. Remark 2 In the low-snr regime, for half-duplex two-way MIMO systems, the optimal input covariance matrices that achieve the smallest E b N 0 min are given by K xa = u A u A and K x,b = u B u B (6.24) where u A is the eigenvector corresponding to the maximum eigenvalue of H AB H AB, and u B is the eigenvector corresponding to the maximum eigenvalue of H BA H BA. It has been proved in [39] that the optimal input covariance matrix that minimize the minimum energy per bit for a point to point MIMO channel is K x = uu, (6.25) 162

182 where u is the eigenvector corresponding to the maximum eigenvalue of H H, and H is the channel fading matrix. For our half-duplex two-way MIMO systems, transmissions over A B and B A links are separated, and minimizing the overall bit energy is equivalent to minimizing the individual bit energies of users A and B. Applying (6.25) to links A B and B A, we immediately have the observation in Remark 2. In full-duplex mode, transmission over these two links are now interacting due to self-interference. In this case, we can identify the optimal solution, which minimizes the bit energies separately for users A and B, via an iterative procedure. First, we have the following characterization. Theorem 20 In the low-snr regime for full-duplex two-way MIMO systems, for given K xb, the optimal input covariance matrix K xa of user A is that achieves the smallest E b N 0 min K xa = Ψ A Ψ A, (6.26) where Ψ A is the eigenvector corresponding to the maximum eigenvalue of H AB K 1 zb H AB. Similarly for user B, for given K xa, the optimal K xb is K xb = Ψ B Ψ B, (6.27) where Ψ B is the eigenvector corresponding to the maximum eigenvalue of H BA K 1 za H BA. Proof : See Appendix A.13. Using Theorem 20, we can determine the optimal covariance matrices through an iterative process. Initially, we set K xa = 1 N A I and K xb = 1 N B I. In each iteration, we update K xa and K xb using (6.26) and (6.27), until convergence or the number of iterations reaches a limit. 163

183 6 5 Sum throughput Half duplex without K x optimization Half duplex with optimized K x Full duplex without K x optimization Full duplex with optimized K x SNR Figure 6.2: Sum throughput vs. SNR Fig. 6.2 shows the low SNR performance for half-duplex and full-duplex modes with different input covariance matrices. For all numerical results in the paper, we assume Rayleigh fading. All fading coefficients between the corresponding transmitting and receiving antenna pairs follow zero-mean circularly symmetrical complex Gaussian distribution. In this figure, we set SNR A = SNR B = SNR, N A = N B = 3, θ A = θ B = 1, γ A = γ B = 0.1, σ n = 0.33, σ h = 0.7, σ s1 = 10, σ s2 β A = σ s2 β B = For the curves with no covariance matrix optimization, we set K xa = 1 N A I and K xb = 1 N B I. Fig. 6.2 shows that full-duplex mode has better performance at low SNR values, if self-interference is under control, because it allows two users to utilize the channel simultaneously. For both half-duplex and full-duplex modes, the sum throughput with optimized covariance matrices are greater. This is due to the facts that the direction with the best channel gain is selected and the influence of selfinterference is reduced. Since the covariance matrix optimization is done for low SNR levels, when we have relatively higher SNR values, the advantage of the optimization is diminished in the half-duplex case. Additionally, in our numerical analysis, we have observed that our iterative algorithm converges 99.47% of the time, and the average number of iterations needed to 164

184 achieve convergence is 13.77, when we set the criterion as tr(e A e A )+tr(e Be B ) 10 10, where e j = K i x j K i 1 x j and K i x j is the covariance matrix after the i th iteration, for j = A, B Mode Selection in the High-SNR Regime Apparently, the sum throughput of the half-duplex mode grows without bound when both SNR A and SNR B are increased. However, the situation is different for the fullduplex mode. Proposition 4 In the high-snr regime, when the power ratio P A /P B is kept fixed, the sum throughput approaches a constant, which only depends on the power ratio. Proof 6 If we keep P A PB rate for the A B link as SNR A becomes lim r A = SNR A = η constant, plugging (6.18) into (6.19), the instantaneous lim SNR A = lim SNR A = B log 2 det [ B log 2 det B log 2 det I + N B SNR A H AB K xa H AB K 1 zb [ I+ η ] H γb 2 AB K xa H AB (H BBK xb H BB ) 1 [ I + η ] H γb 2 AB K xa H AB (H BBK xb H BB ) 1, ] which approaches a constant for large SNR A and SNR B values. A similar result can be found for r B under the high SNR assumption. Then the sum throughput C A + C B approaches a constant that only depends on η. Proposition 4 implies at high SNRs that half-duplex mode has better performance than the full-duplex mode. When the transmission power increases, self-interference also increases proportionally, and becomes difficult to control. Or equivalently, when the noise power vanishes, the system will primarily be self-interference limited. In this case, the advantage of half-duplex operation, which avoids self-interference, becomes 165

185 Sum throughput Half duplex Full duplex with γ=0.10 Full duplex with γ=0.07 Full duplex with γ= SNR Figure 6.3: Sum throughput vs. SNR significantly beneficial. Fig. 6.3 shows the sum throughput of the full-duplex mode with different self-interference cancelation parameters and of the half-duplex mode over a relative wide SNR range. We set γ A = γ B = γ, K xa = 1 N A I, K xb = 1 N B I, and all other parameter settings are the same as in Fig Although full-duplex mode has better performance at low SNRs, half-duplex operation starts outperforming when the SNR level is sufficiently high, and the gap increases as SNR increases. When γ decreases, we have better self-interference cancelation in the full-duplex mode, and the performance improves. If we have perfect interference cancelation, full-duplex performance should always be better than that of the half-duplex mode Mode Selection at Different Transmission Distances In addition to the SNR level, the transmission distance also has an impact on mode selection. The distance between the two users affects the distribution of H AB and H BA. Assuming that the transmission distance is d, we set σh 2 = 1. In Fig. 6.4, we d 4 plot the sum throughput of half-duplex and full-duplex modes as a function of d for different queuing constraints. We set θ A = θ B = θ, γ A = γ B = 0.8, SNR A = SNR B = 7, and other parameter settings are the same as in Fig When the distance is

186 Half duplex with θ=0.1 Full duplex with θ=0.1 Half duplex with θ=8 Full duplex with θ=8 Sum throughput Distance d Figure 6.4: Sum throughput vs. transmission distance small, full-duplex mode has better performance. As the distance increases, channel conditions become worse, and the received signal power diminishes, but the selfinterference is still unchanged. As a result, the SINR in full-duplex mode decreases very fast, and the performance drops below that of half-duplex mode. Therefore, self-interference control is a critical problem in the full-duplex mode in long-distance transmissions. Also, Fig. 6.4 shows that increasing the values of the QoS exponent θ lowers the system throughput. 167

187 Chapter 7 Mode Selection and Resource Allocation Algorithms for D2D Cellular Networks In this chapter, we study the mode selection and resource allocation algorithms for D2D cellular networks. D2D communication underlaid with cellular networks is a new paradigm, proposed to enhance the performance of cellular networks. By allowing a pair of D2D users to communicate directly and share the same spectral resources with the cellular users, D2D communication can achieve higher spectral efficiency, improve the energy efficiency, and lower the traffic delay. In Section 7.1, transmission mode selection and resource allocation in a TDM cellular network with one cellular user, one base station, and a pair of D2D users is investigated under rate and queueing constraints. In particular, four possible modes are considered, namely the cellular mode, dedicated mode, uplink reuse mode, and downlink reuse mode. Using tools from stochastic network calculus, the system throughput under statistical queueing constraints is formulated, efficient resource allocation algorithms for all possible modes are proposed, and the influence of the 168

188 positions of each node and the queueing constraints is analyzed via numerical results. Scenarios and conditions for different modes to be optimal in the sense of maximizing the sum-throughput are identified. In Section 7.2, we propose a novel channel matching algorithm for joint mode selection and channel allocation with the goal of maximizing the system throughput under statistical queueing constraints. Seven possible modes are considered, namely the D2D cellular mode, D2D dedicated mode, uplink dedicated mode, downlink dedicated mode, uplink reuse mode, downlink reuse mode, and D2D reuse mode. The throughput is characterized by determining the effective capacity. We formulate the channel allocation problem as a maximum-weight matching problem, which can be solved by employing the Hungarian algorithm. Via simulation results, we verify the performance improvements achieved by our proposed matching algorithm. In Section 7.3, we propose a novel joint mode selection and channel resource allocation algorithm via the vertex coloring approach. We decompose the problem into three subproblems and design algorithms for each of them. In the first step, we divide the users into groups using a vertex coloring algorithm. In the second step, we solve the power optimization problem using the interior-point method for each group and conduct mode selection between the cellular mode and D2D mode for D2D users, and we assign channel resources to these groups in the final step. Numerical results show that our algorithm achieves higher sum rate and serves more users with relatively small time consumption compared with other algorithms. Also, the influence of system parameters and the tradeoff between sum rate and the number of served users are studied through simulation results. 169

189 Figure 7.1: System model with queuing constraints (Dashed lines represent interference only links.) 7.1 Mode Selection of Device-to-Device Communication in Cellular Networks under Statistical Queuing Constraints System Model As mentioned above, we study mode selection and resource allocation in a cellular network with D2D users with the goal of maximizing the overall sum rate of the network under queuing constraints. For simplicity, we consider a typical model shown in Fig. 7.1, in which there is only one cellular user (CU), one base station (BS), and one pair of D2D users denoted by D 1 and D 2, respectively. We assume the transmission between D2D users is one-way, and D 1 is the transmitter and D 2 is the receiver. Therefore, there are overall three transmitters in this network, namely BS, CU and D 1, and their maximum transmission powers are denoted by P b, P c and P dmax. There are three main communication links corresponding to the three transmitters. In the uplink, the cellular user CU sends information to the base 170

190 station; in the downlink, the base station sends information to the cellular user; and in the D2D link, D 1 transmits to D 2. Before the transmitters send their packets to the corresponding receivers, the packets are stored in buffers. All transmitters are operating under statistical queuing constraints imposed as limitations on buffer overflow probabilities. We consider a block-fading model. The fading coefficients and their magnitudesquares in different communication and interference links are represented by h i and z i = h i 2, respectively, as depicted in Fig. 7.1 for i = 1, 2,, 7. The overall bandwidth of this system is B, and the time in each transmission period is divided into several phases and allocated to different links. The fractions of time allocated to each link are given by the elements of the time allocation vector τ. For different modes, the dimension of τ is different. In this section, we consider four modes for this network, namely the cellular mode, the dedicated mode, the uplink reuse mode, and the downlink reuse mode. Cellular Mode: In this mode, the transmission is divided into 4 phases. In the first phase, only the uplink is active, in which CU sends information to BS; in the second phase, only the downlink is active, in which BS sends information to CU; in the third phase, D 1 sends information to the base station (only D 1 BS link is active), and the base station decodes and forwards the information to D 2 in the last phase (only BS D 2 link is active). In the cellular mode, the base station works as a relay node between D 1 and D 2, and hence these two D2D users essentially communicate just like the cellular users. We denote the fractions of time allocated to the cellular uplink and downlink as τ 1 and τ 2, respectively, and we denote the overall fraction of time allocated to the two D2D links as τ 3. The time allocation between the two D2D links is discussed in Section Now, we have τ 1 = 3 j=1 τ j = 1. For simplicity, we assume that there are two separate buffers at the base station. One buffer is for the packets that will be sent to CU, and the other one stores the packets 171

191 that has arrived from D 1 and will be sent to D 2. These two buffers operate under different queuing constraints. Since there is only one pair of transmitter and receiver being active in each phase, there is no interference in the cellular mode. In each time block, the received signal at each receiver has the form y = h i x + n i, (7.1) where x is the transmitted signal, h i is the corresponding channel fading coefficient, and n i is the additive noise component. At different receiver nodes, the noise terms are assumed to be independent, zero-mean, circularly symmetric, complex Gaussian random variables with variances E{ n i 2 } = σ0. 2 Dedicated Mode: In this mode, the transmission is divided into three phases, which are the uplink phase, downlink phase and D2D phase, and the fractions of time allocated to each phase are given by τ = (τ 1, τ 2, τ 3 ). Compared to the cellular mode, the only difference is that D 1 sends information directly to D 2 without the participation of the base station in the third phase. Similar to the cellular mode, there is no interference, and the received signals have the form in (7.1). Reuse Modes: In the uplink and downlink reuse modes, the D2D link reuses the channel resource with uplink and downlink, respectively. Therefore, the transmission is divided into two phases in these two modes. In the uplink reuse mode, the first phase is allocated to the downlink, and the second phase is allocated to the uplink and D2D link. Since D 1 and CU transmit in the second phase simultaneously, D 2 and BS experience interference. In the uplink channel, the received signals follow the form y = h i x + h inter x inter + n i, (7.2) where x is the desired signal, h i is the fading coefficient of the channel between this 172

192 receiver and its corresponding transmitter, x inter is the interference signal, h inter is the fading coefficient of the interfering link, and n i is the Gaussian noise. Since there is no interference in the downlink phase, the received signal at CU node follows the form in (7.1). In the downlink reuse mode, the first phase is allocated to the uplink, and the second phase is allocated to the downlink and D2D link. Similarly, the received signals at D 2 and CU follow (7.2) in the downlink reuse mode, and the received signal at the base station in the uplink channel follows (7.1). In order to guarantee certain performance levels, we impose throughput/rate constraints for all users in all 4 modes. The minimum throughput required in the uplink, downlink and D2D link are denoted by R Cmin, R Bmin and R Dmin, respectively. If one of the rate constraints cannot be satisfied, then the corresponding mode is not used Throughput of Cellular Network with D2D Users In the previous subsection, we introduce the formulation of the system throughput under queuing constraints. In order to determine the effective capacities, we have to characterize the instantaneous transmission rates of each communication link. In this subsection, we formulate the overall system throughput of the cellular network in each mode, under given resource allocation strategies, and efficient resource allocation algorithms are provided in Section Throughput in the Cellular Mode In the cellular mode, CU BS, BS CU, D 1 BS and BS D 2 links are active. The fractions of time allocated to CU BS, BS CU and D 1 BS D 2 links are given by τ 1, τ 2 and τ 3, respectively. In this mode, all transmitters transmit with their maximum power since there is no interference. The instantaneous transmission rate 173

193 of the uplink channel is ( r C,B = B log P ) c z Bσ (7.3) Plugging (7.3) into (2.7), we obtain the effective capacity of the CU BS link as R C = 1 θ C log E { e τ 1θ C r C,B }. (7.4) For the downlink channel, the instantaneous rate is ( r B,C = B log P ) b z Bσ0 2 4, (7.5) and the corresponding effective capacity is R B = 1 θ BC log E { e τ 2θ BC r B,C }. (7.6) In the cellular mode, the D2D link is a two-hop channel with two queues in tandem. More specifically, D 1 first sends information to BS in the third phase, and then BS forwards the information to D 2 in the last phase. The instantaneous transmission rates of D 1 BS and BS D 2 links are given, respectively, by ( r D,B = B log P ) dmax z Bσ0 2 2, (7.7) ( r B,D = B log P ) b z Bσ (7.8) Define ˆτ as the fraction of time allocated to the third phase. Then the fraction of time allocated to the last phase is τ 3 ˆτ. In this two-hop case, the arrival rate at node D 1 has to satisfy the QoS constraints at D 1 and BS, simultaneously. The effective capacity analysis of this half-duplex two-hop channel is much more involved, 174

194 and a detailed analysis is given in [17, Section III-B]. Through a similar process, we express the effective capacity of the D 1 BS D 2 link as R D = 1 θ D log E{e ˆτθ Dr D,B }, (7.9) where ˆτ = min{τ 0, τ }, and τ 0 is the solution to τ 0 E{r D,B } = (τ 3 τ 0 )E{r B,D }, (7.10) and τ is the solution to 1 θ D log E{e τ θ D r D,B } = 1 θ BD log E{e (τ 3 τ )θ BD r B,D } (7.11) when θ D θ BD, or 1 log E{e τ θ D r D,B } = 1 ( ) log E{e (τ 3 τ )θ BD r B,D } + log E{e τ (θ BD θ D )r D,B } θ D θ D (7.12) when θ D < θ BD Throughput in the Dedicated Mode In the dedicated mode, CU BS, BS CU, and D 1 D 2 links are active, and the fraction of time allocated to these links are given by τ = (τ 1, τ 2, τ 3 ). In this mode, D 1 transmits to D 2 directly without the help of the base station. Since there is no interference, all transmitters transmit using their maximum power. Similar to the analysis above, the instantaneous transmission rates of the uplink, downlink are also given by (7.3) and (7.5), and their effective capacities still follow (7.4) and (7.6). For 175

195 the direct D2D link, the instantaneous transmission rate is ( r D1,D2 = B log P ) dmax z 5. (7.13) Bσ 2 0 Plugging (7.13) into (2.7), we can get its corresponding effective capacity expressed as R D = 1 θ D log E { e τ 3θ D r D1,D2 }. (7.14) Throughput in Reuse Modes In the two reuse modes, only one cellular link is active in the first phase, and the other cellular link share the channel resource with the D2D direct link in the second phase. We denote the fraction of time allocated to the first phase as τ 1. Then the fraction of time left for the second phase is 1 τ 1. In reuse modes, we assume that the base station and cellular user transmit with their maximum power, due to cellular links being assumed to have higher priority over the D2D link. Assume that the transmission power of D 1 is given by P d. Then the instantaneous transmission rates for the downlink, uplink and D2D link are given by ( r B,C = B log P ) b z Bσ0 2 4 (7.15) ) P c /Bσ0 r C,B = B log 2 (1 2 + z 1 + (P d /Bσ0)z 2 1 (7.16) 2 ) P d /Bσ0 r D1,D2 = B log 2 (1 2 + z 1 + (P c /Bσ0)z 2 5, (7.17) 6 176

196 in the uplink reuse mode, and their corresponding effective capacities are given, respectively, by R B = 1 θ BC log E { e τ 1θ BC r B,C } R C = 1 θ C log E { e (1 τ 1)θ C r C,B } (7.18) (7.19) R D = 1 θ D log E { e (1 τ 1)θ D r D1,D2 }. (7.20) Similar characterizations can be obtained for the downlink reuse mode, and the optimization over P d in reuse modes is discussed in the next subsection Resource Allocation In the previous subsection, we have formulated the throughput under statistical queuing constraints for all 4 modes. In this subsection, we investigate efficient resource allocation strategies with the goal of maximizing the sum throughput. In the cellular and dedicated modes, only the time allocation vector τ is optimized, and in the two reuse modes, both τ 1 and P d are optimized Cellular Mode In cellular mode, the throughput maximization problem is formulated as Maximize τ R sum = R C + R B + R D (7.21) Subject to τ 1 = 1 (7.22) R j R j min, for j = C, B, D (7.23) where R Cmin, R Dmin and R Dmin are the minimum rates required for the uplink, downlink and D2D link, respectively. This optimization problem can be solved through auction game approach. Auction game has been studied in the game theory literature, 177

197 Resource allocation for the cellular mode Table 7.1: Algorithm Initialization: Initialize τ 1 and τ 2 by solving R C = R Cmin and R B = R Bmin. ˆτ is given by the solution of 1 θ D log E{e ˆτθ Dr D,B } = R Dmin. With this ˆτ value, solve τ 3 from (7.10), and denote this solution by τ 3,1. If θ D θ BD, solve τ 3 from (7.11), otherwise solve τ 3 from (7.12), and denote this solution by τ 3,2. Set τ 3 = max{τ 3,1, τ 3,2 }. Check if τ 1 > 1, then end the allocation process. 2. Auction: Divide the remaining time resource equally into N parts, and set τ = (1 τ 1 τ 2 τ 3 )/N. Set the bids of the uplink, downlink and D2D link as Bid C = R C (τ 1 + τ) R C (τ 1 ), Bid B = R B (τ 2 + τ) R B (τ 2 ) and Bid D = R D (τ 3 + τ) R D (τ 3 ), respectively. For i = 1 : N (a) Select the link with the highest bid as the winner of the i th round, and increase its time allocation parameter by τ. (b) Update the winner s bid with its new time allocation parameter. end and it has been used to design algorithms for resource allocation in D2D models. For example, in [47], spectrum allocation problem was solved through an iterative combinatorial auction approach. In our setting, we build our auction game as follows: The time resource is divided into small time slots (of the same duration), which are regarded as resource units. In each round, three bidders, which are uplink, downlink and D2D link, compete for a single resource unit. Bidders bid according to the throughput increment they gain with an additional time slot, and the resource unit will be allocated to the link giving the highest bid. The detailed algorithm is provided in Table 7.1. The initialization step guarantees the rate constraints. If the rate constraints cannot be satisfied, we have τ 1 > 1. It is immediately seen that this iterative algorithm is completed within finite time, and the system only needs to evaluate 178

198 effective capacity N + 3 times, where N is the number of resource units. Therefore, the complexity of this algorithm does not depend on the number of bidders. The performance of this auction algorithm is evaluated via numerical results. We generate the location of the nodes randomly, and compare the optimal throughput values obtained from Algorithm 7.1 and exhaustive search. The normalized error is defined as ε = R sum,auction R sum,search R sum,search. Over 10 5 rounds of testing, the average normalized error is In each time, N is selected as the smallest integer that makes the step τ Dedicated Mode In the dedicated mode, the optimization problem has the same formulation as in the cellular mode. Different from the cellular mode case, the throughput maximization problem can be solved by convex optimization algorithms. Theorem 21 In the dedicated mode, the sum rate R sum = R C +R B +R D is concave with respect to the time allocation vector τ. Proof : See Appendix A.14 Also, the constraints define a convex region in the τ 1 τ 2 τ 3 space. Therefore, the throughput maximization problem in the dedicated mode is a concave maximization problem, which can be solved efficiently by convex optimization algorithms. 179

199 Reuse Modes In the two reuse modes, we need to optimize the throughput over τ 1 and the D2D transmission power P d, and the optimization problem is formulated as Maximize τ1,p d R sum = R C + R B + R D (7.24) Subject to 0 P d P dmax (7.25) R j R j min, for j = C, B, D (7.26) Due to the interference between the D2D and cellular users, the objective function is not concave. To solve this problem, we search for the optimal τ 1 from 0 to 1. For a given value of τ 1, we determine the optimal P d through the successive convex approximation (SCA) algorithm. If the rate constraints cannot be all satisfied, the sum rate is set to 0. The idea of SCA was proposed in [91]. In this approach, the nonconvex optimization problem is transformed into a series of convex optimization problems. More specifically in our model, we replace the sum throughput by a series of concave functions of P d, which are denoted as {U l } for l = 1, 2,. In the uplink reuse mode, since R B does not depend on P d, we only need to consider the maximization of R = R C + R D, and we construct {U l } as where P l 1 d U l = R D (P d ) + R C (P l 1 d ) + (P d P l 1 d ) R C P d Pd =P l 1 d is the optimal P d value in the (l 1) th iteration, and the derivative of R C with respect to P d is given by R C P d { 1 τ 1 = E E{e (1 τ 1)θ C r C,B} e (1 τ 1)θ C r C,B BP c z 1 z 2 (P d z 2 + Bσ 2 0)(P d z 2 + P c z 1 + Bσ 2 0) log e 2 (7.27) }. 180

200 Table 7.2: Algorithm 7.2 SCA algorithm for power allocation in the uplink reuse mode 1. If R B < R Bmin, set R sum = 0, end process. 2. Find the lower bound of P d by solving R D = R Dmin, and denote this lower bound by P dl. Find an upper bound of P d by solving R C = R Cmin, and denote this upper bound by ˆP du. Set P du = min{ ˆP du, P dmax } as the upper bound of P d. If P dl > P du, set R sum = 0, end process. 3. Set l = 1 and select an initial value P 0 d between P dl and P du. Repeat (a) Maximize U l with the constraint P dl P d P du. (b) Update Pd l, and increase l by 1. Until convergence The optimal P d value is given by the Pd l in the last iteration, and the maximum sum rate is given by R sum = U opt + R B, where U opt is the optimal U l value in the last iteration. For a given τ 1, our SCA algorithm is described in Table 7.2, and the overall resource allocation algorithm for the uplink reuse mode is given in Table 7.3. The optimization algorithm for the downlink reuse mode can be obtained through a similar process Mode Selection In previous subsections, we have introduced the throughput formulation and resource allocation algorithms for all 4 possible modes. In its overall operation, the system runs the throughput maximization algorithms for these 4 modes, and selects the one with the highest optimal sum throughput. Since effective capacity is a long-term throughput criterion, the system does not need to conduct mode selection frequently. Our resource allocation algorithms can also be extended to the case of multiple cellular users without much difficulty. Since the number of cellular users only changes 181

201 Table 7.3: Algorithm 7.3 Resource allocation in the uplink reuse mode 1. Set a step τ for the searching algorithm, and set the optimal τ 1, P d and R sum values as τ 1 = 0, P d = 0 and R sum = For τ 1 = 0 : τ : 1 (a) Maximize the sum rate using the SCA algorithm. Denote the maximum sum rate and optimal P d as ˆR sum and ˆP d, respectively. (b) If ˆR sum > R sum, then update R sum = ˆR sum, P d = ˆP d and τ 1 = τ 1. end the dimensionality of the optimization problem, the algorithms given in the previous subsection can still be employed Numerical Results In this subsection, we further analyze the mode selection for the D2D users in the considered cellular network through numerical results. In all numerical results, we consider Rayleigh fading with path loss E{z} = d 4, where d is the distance between the transmitter and receiver, and z is the magnitude square of the corresponding fading coefficient. The position of the base station is fixed at the origin of the coordinate plane, and we set P c = P dmax = 500, P b = For the rate constraints, we set R Cmin = R Dmin = bit/s and R Bmin = bit/s 1. Also, we set all θ values to 1. In the figures, we use dark blue points to indicate the positions of the base station, D 1 and D 2. The color of each pixel represents the selected mode if the cellular user is in that position, and we denote the cellular mode, dedicated mode, uplink reuse mode and downlink reuse mode by I, II, III and IV, respectively in the color bars. 1 The values of R Cmin, R Bmin and R Dmin are selected as the throughput of each link in the dedicated mode, when the distance between the transmitter and receiver pairs is 4, and τ 1 = τ 2 = τ 3 =

202 4 IV 3 2 III 1 0 II -1-2 I Figure 7.2: Mode selection result when D 1 is placed at (0, 2.5), and D 2 is placed at (0, 2.5). Cellular mode, dedicated mode, uplink reuse mode and downlink reuse mode are denoted by I, II, III and IV, respectively. 4 IV 3 2 III 1 0 II -1-2 I Figure 7.3: Mode selection result when D 1 is placed at (2, 1), and D 2 is placed at (2, 1). Cellular mode, dedicated mode, uplink reuse mode and downlink reuse mode are denoted by I, II, III and IV, respectively. 183

203 4 IV 3 2 III 1 0 II -1-2 I Figure 7.4: Mode selection result when D 1 is placed at (3.5, 0.5), and D 2 is placed at (3.5, 0.5). Cellular mode, dedicated mode, uplink reuse mode and downlink reuse mode are denoted by I, II, III and IV, respectively Sum rate (bit/s) Cellular mode Dedicated mode Uplink reuse mode Downlink reuse mode α ( ) Figure 7.5: Sum rate vs. angle α 184

204 In Fig. 7.2, we place D 1 and D 2 at (0, 2.5) and (0, 2.5) respectively. In this situation, only cellular mode can be selected, because the relatively large distance between D 1 and D 2 makes it difficult for the rate constraint of the D2D users to be satisfied. Cellular mode is preferred when the D 1 BS link and BS D 2 link are much stronger than direct link D 1 D 2. The dark blue regions on the corners represent the region outside the range of the cell. In Fig. 7.3, D 1 and D 2 get closer to each other, and both of them are shifted to the right. Since the direct link becomes stronger, cellular mode is no longer preferred. In this situation, most of the region prefers dedicated mode. Dedicated mode is preferred when the D2D direct link and interference links are strong. There is a small region on the lower left, in which uplink reuse mode is selected. In this region, the interference link CU D 2 becomes weaker, which results in the uplink reuse mode to have higher throughput than the dedicated mode. When CU further moves towards to the bottom left corner, the uplink becomes too weak to satisfy the rate constraint, and dedicated mode is selected. In Fig. 7.4, D 1 and D 2 are very close to each other, and both of them are far away from the base station. In this situation, a large area on the left side prefers uplink reuse mode, because the D2D direct link and uplink are stronger than the interference links. Reuse modes are preferred when the direct links are all much stronger than the interference links. When the cellular user is close to the D2D users, there is still a large region, where dedicated mode is preferred. Since P b is much larger than P c and P dmax, the interference arising from the base station is very strong, which makes downlink reuse mode unfavorable. We let the cellular user move around the circle x 2 + y 2 = 4, and we denote the angle between line CU BS and the positive direction of the x axis as α. We plot the sum rates of these 4 modes as functions of the angle α in Fig When α = 190, the distance between CU and D 2 is maximal, and the uplink reuse mode achieves the maximum sum rate. When α = 170, the distance between 185

205 CU and D 1 is maximal, and the downlink reuse mode reaches its own maximum sum rate. Moreover, the downlink reuse mode provides the highest sum rate when 60 < α < 90, verifying the observation in Fig Around the upper right corner of Fig. 7.4, there is also a small region that prefers downlink reuse mode. In this region, the received SINR in downlink is higher than that in uplink, because P b is much greater than P c, and the interference link CU D 1 is also relatively weak. 7.2 Joint Mode Selection and Resource Allocation for D2D Communications under Queuing Constraints System Model and Transmission Modes System Model We consider a cellular network with one base station (BS), N 1 cellular users {CU 1, CU 2,, CU N1 } and N 2 D2D pairs {(DT 1, DR 1 ), (DT 2, DR 2 ),, (DT N2, DR N2 )}. We assume that the D2D transmission is one-way, in which DT i and DR i represent the transmitter and receiver of the i th D2D pair, respectively. Each cellular user transmits to the base station through an uplink channel, and receives data from the base station via a downlink channel. For all transmitters, the data packets are stored in buffers before being sent to the corresponding receiver. We assume that all transmitters operate under statistical queuing constraints, which require the buffer overflow probability to decay exponential fast with QoS exponent θ. The QoS exponent at CU i and DT i are denoted by θ Ci and θ Di, respectively. For simplicity, we assume that there are N 1 + N 2 separate buffers at the base station. N 1 of them are used for the downlink transmission, and the QoS exponent of the BS CU i link is θ BCi. The 186

206 remaining N 2 buffers are for the data of the D2D users operating in the cellular mode, in which the base station acts as a relay. The QoS exponent for the base-station buffer storing the data to be sent via the BS DR i link is θ BDi. Since not all D2D users operate in the cellular mode, only a portion of the buffers are used at the base station. The channel fading is assumed to be block fading, in which the fading coefficients h stay constant in one time block and change independently across blocks. In Figs. 7.6, 7.7 and 7.8, the magnitude-square of the fading coefficients are denoted by z = h 2. At each receiver, the background noise is assumed to follow an independent complex Gaussian distribution with zero mean and variance σ 2, i.e., n CN (0, σ 2 ). There are N available orthogonal channels for this cellular network, each of them having a bandwidth of B. In the resource allocation step, these orthogonal channels are allocated to the transmission links, and those links to which channels are not assigned are not activated for communication. There are five assumptions regarding the channel allocation: 1. A D2D pair operating in the cellular mode occupies only one channel, and cannot share its channel with other users. 2. Each cellular uplink or downlink is allocated a single orthogonal channel, and channels cannot be shared by different cellular links. 3. Each D2D direct link, cellular uplink and downlink can occupy at most one channel. 4. Each orthogonal channel can be occupied by two transmission links at most. 5. Cellular links have higher priority to be assigned a channel than D2D links. In this section, we mainly consider the case in which N 2N 1, so that we have sufficient number of channels to guarantee the performance requirements of all cellular 187

207 Figure 7.6: System model in the D2D cellular mode users. The transmission power of a cellular user is set at P c, and the maximum transmission power of D2D transmitters is P dmax. When acting as a transmitter, the base station transmits with power P b in each channel. Therefore, the overall transmission power of the base station depends on the number of cellular users and the number of D2D pairs operating in the cellular mode. The primary objective in this section is to maximize the overall system throughput under statistical queuing constraints by using our joint mode selection and channel allocation algorithm. In the following subsection, we introduce all possible transmission modes and describe the relationship between mode selection and channel allocation Transmission Modes and Instantaneous Transmission Rate In this section, our mode selection is done through channel allocation. Depending on how the system uses each channel, we can determine the transmission mode for each user. For instance, if a channel is allocated to a cellular uplink and a D2D direct link, then the corresponding D2D pair and cellular user are in the uplink reuse mode. According to our channel allocation assumptions, there are 7 possible modes for the users in each channel, namely D2D cellular mode, uplink reuse mode, downlink reuse mode, D2D reuse mode, uplink dedicated mode, downlink dedicated mode, and D2D dedicated mode. All of these modes can be summarized into 3 categories. D2D Cellular Mode The first category is D2D cellular mode, and the model is shown in Fig

208 In this mode, the channel is occupied by a D2D pair, transmitting with the help of the base station. In this mode, each transmission period is equally divided into two phases. The first phase is allocated to the DT i BS link, in which the D2D transmitter sends information to the base station, and the second phase is allocated to the BS DR i link, in which the base station forwards the information to the D2D receiver. Denote the fraction of time allocated to the first phase as τ, and the fraction of time allocated to the second phase as 1 τ. In each phase, there is only one transmitter and one receiver, and the received signal at each transmitter follows the form y = hx + n, (7.28) where x is the transmitted signal, n is the additive Gaussian noise component, h is the corresponding channel fading coefficient. We can express the instantaneous rates of DT i BS link and BS DR i link as ( r Di,BS(τ) = τb log P ) dmax Bσ z 2 1 (7.29) and ( r BS,Di (τ) = (1 τ)b log P ) b Bσ z 2 2, (7.30) respectively. In this two-hop model, to guarantee the stability, the average arrival rate should be less than or equal to the average departure rate from the buffer at the base station, which can be expressed as E{r Di,BS(τ)} E{r BS,Di (τ)}. Suppose that τ 0 is the solution of E{r Di,BS(τ)} = E{r BS,Di (τ)}, then the effective capacity of the two-hop 189

209 Figure 7.7: System model in the reuse mode (interference links are denoted by the dashed lines) channel DT i BS DR i is given by R Di = 1 θ Di log E{e θ D i r Di,BS(ˆτ) } (7.31) where ˆτ = min{τ 0, τ }, and τ is the solution to 1 θ Di log E{e θ D i r Di,BS(τ) } = 1 θ BDi log E{e θ BD i r BS,Di (τ) } (7.32) when θ BDi θ Di, or 1 θ Di [ log E{e θ BDi r BS,Di (τ) } + log E{e (θ BD i θ Di )r Di,BS(τ) } ] = 1 θ Di log E{e θ D i r Di,BS(τ) } (7.33) when θ BDi > θ Di, which comes from the results in [17]. Reuse Mode The second category is the reuse mode, and the system model is shown in Fig In this model, two transmitter-receiver pairs share the same channel, and they inflict interference on each other. In Fig. 7.7, two interference links (z 3 and z 4 links) are 190

210 denoted by the dashed line. According to the types of users sharing the channel, reuse mode includes uplink reuse mode, downlink reuse mode, and D2D reuse mode. In the uplink reuse mode, a cellular uplink shares the channel with a D2D direct link; in the downlink reuse mode, a cellular downlink shares the channel with a D2D direct link; in D2D reuse mode, two pairs of D2D users transmit in the same channel. The received signal at each receiver follows the form y = hx + h inter x inter + n, (7.34) where x is the desired signal, h is the fading coefficient of the channel between this receiver and its corresponding transmitter, x inter is the interference signal, h inter is the fading coefficient of the interfering link, and n is the Gaussian noise. Treating the interference as noise, we can express the instantaneous transmission rates in these two links as r 1 = B log 2 (1 + ) P 1 z Bσ P 2 z 3 (7.35) and r 2 = B log 2 (1 + ) P 2 z Bσ 2 2, (7.36) + P 1 z 4 respectively. In (7.35) and (7.36), P 1 and P 2 denote the transmission powers of Tx 1 and Tx 2, respectively. For the base station and cellular users, the transmission powers are fixed at P b and P c, respectively. For the D2D users, the transmission power is determined by the average SINR constraints. To control the interference, the users operating in reuse modes have to satisfy the average SINR constraints, which set lower bounds on the SINR values at the receivers. If two transmission links cannot satisfy the average SINR constraints, then they are not allowed to share the same channel, i.e., the reuse mode will not be permitted for these links. In the uplink reuse mode, we suppose that Tx 1 is the cellular user CU i, Rx 1 is 191

211 the base station, Tx 2 is the D2D user DT j, and Rx 2 is the D2D user DR j. Then, the SINR constraints can be expressed as { } P E c Bσ 2 +P dj z 3 z 1 γ c E { Pdj Bσ 2 +P C z 4 z 2 } γ d (7.37) P dj P dmax where P dj is the transmission power of DT j, γ c and γ d are the SINR thresholds of cellular and D2D users, respectively. Inequality group (7.37) provides us upper and lower bounds on P dj, hence describing an interval of P dj values that satisfy the average SINR constraints. Similar formulations and characterizations can be obtained for the downlink reuse mode. In the D2D reuse mode, we have to determine the transmission power for the two D2D transmitters. Suppose that Tx 1 is the D2D user DT i, Rx 1 is the D2D user DR i, Tx 2 is the D2D user DT j, and Rx 2 is the D2D user DR j. Then, the SINR constraints can be expressed as { E { E P di Bσ 2 +P dj z 3 z 1 } γ d P dj Bσ 2 +P di z 4 z 2 } γ d (7.38) P di P dmax P dj P dmax where P di and P dj are the transmission powers of DT i and DT j, respectively. Inequality group (7.38) provides a region on the P di P dj plane, and each (P di, P dj ) pair in this region satisfies the average SINR constraints. For the uplink, downlink and D2D reuse modes, the optimal transmission power of the D2D transmitter has to be identified by searching over the region determined by 192

212 Figure 7.8: System model in the dedicated mode the average SINR constraints. Inserting the rate expressions given in (7.35) and (7.36) with the optimal transmission power into (2.7), we can get the effective capacities of these two links. Dedicated Mode The third category is the dedicated mode, which is depicted in Fig In this model, one transmitter-receiver pair occupies a channel without sharing it with others. Depending on the type of the transmission link occupying the channel, this category includes uplink dedicated mode, downlink dedicated mode, and D2D dedicated mode, in which the channel is occupied by a cellular uplink, a cellular downlink, and a direct D2D link, respectively. Since there is no interference, the received signal at the receiver also follows the form given in (7.28), and the instantaneous rate can be expressed as ( r = B log P ) Bσ z, (7.39) 2 where P is the transmission power. In this mode, the transmission powers of cellular users, base station and D2D transmitters are fixed at P c, P b and P dmax, respectively. For the uplink, downlink and D2D dedicated modes, inserting the instantaneous rates given by (7.39) into (2.7), we obtain the effective capacity for the corresponding transmission link. 193

213 7.2.2 Channel Allocation via Maximum-Weight Matching Approach From the above analysis, we can determine that the essence of mode selection is channel allocation. To identify the optimal modes for all users, we only need to come up with a channel matching rule that maximizes the overall throughput. This problem can be modeled as a maximum-weight matching problem. In our setting, we seek to match the transmission links to the channels. The weight/gain of matching a transmission link with a channel is given by the effective capacity of that link. There are four main challenges to solve this maximum-weight matching problem as described below: 1. This problem involves both one-to-one matching and two-to-one matching. This is due to the fact that there are three kinds of reuse modes, in which two transmission links are matched to a single channel. 2. The weights are not independent of each other in the case of two-to-one matching. Due to the interference in three reuse modes, the throughput of each link also depends on the other link with which it shares the channel. 3. Even for the case of one-to-one matching, the weights are not constant. This occurs when a D2D pair uses the channel exclusively. The D2D pair has two choices, which are the D2D cellular mode and D2D dedicated mode. The link throughput varies depending on which mode the user is in. 4. Cellular links have higher priority than D2D links. In conventional matching problems, there are no priorities. 194

214 To solve these problems, we propose a new channel matching algorithm, which transforms the original problem into a one-to-one maximum-weight matching problem. In the following subsections, we introduce our new algorithm step by step. Before the matching process, we enumerate the N 1 cellular users, N 2 D2D users and N channels to distinguish them Channel Allocation for Cellular links The first step is to assign the cellular uplink and downlink to the channels. Since cellular links have higher priority, we can give each of them a channel at the very beginning. In this step, we allocate the 1 st to N th 1 channel to the cellular uplink, and allocate the (N 1 + 1) th to 2N th 1 channel to the cellular downlink. After this step, there are N 2N 1 free channels left. Now the problem becomes matching N 2 D2D pairs to the N channels, which is a one-to-one matching problem if the D2D reuse mode is excluded. If a D2D pair gets a channel assigned also to a cellular link, then they operate in the corresponding reuse mode; if a D2D pair gets a free channel, it can choose from the D2D reuse mode, D2D dedicated mode and D2D cellular mode. Now, the first and last challenges above are overcome. In the next step, we construct the throughput matrix Constructing the Throughput Matrix The structure of the (N 2 + 2N 1 ) N throughput matrix is shown in Fig Each row corresponds to a D2D pair, and each column corresponds to a channel. Each element of this matrix equals to the channel throughput if the corresponding channel and D2D pair are matched together. To include the uplink and downlink dedicated modes, we introduce 2N 1 dummy D2D pairs with enumerations varying from N to N 2 + 2N 1, and assume that they do not perform transmissions. If a dummy D2D pair is matched to the channel with a cellular uplink/downlink, then the cellular link 195

215 Figure 7.9: Structure of the throughput matrix works in uplink/downlink dedicated mode. The matrix can be divided into 6 regions, which correspond to the 6 modes. Suppose R i,j denotes the element of the throughput matrix in the i th row and j th column, then R i,j is given by the maximum throughput that can be achieved in channel j if we match D2D pair i to it. Here, the throughput of a channel is the sum effective capacity of the transmission links in this channel. For the reuse modes, we have to search for the optimal transmission power for the D2D transmitters. If a D2D pair gets a free channel, then it compares the effective capacities of D2D dedicated and D2D cellular modes, and chooses the one giving a higher throughput. Since we do not want to match a dummy D2D pair to a free channel, we set the elements in the corresponding region (in the lower right corner) to negative infinity. Since the D2D reuse mode involves two-to-one matching, we exclude the D2D reuse mode in this step, and consider it in the next step. At the end of this step, we have successfully transformed the original problem to a conventional maximum-weight matching problem, and all matching weights are independent from each other. Hence, the second and third challenges are addressed. Assume that ζ i,j is a parameter that indicates whether the i th row and j th column are matched. If they are matched, then ζ i,j = 1, otherwise ζ i,j = 0. Then, the matching 196

216 problem can be formulated as max ζ i,j Subject to R i,j ζ i,j (7.40) i j ζ i,j 1, j (7.41) i ζ i,j 1, i (7.42) j This problem can be solved by the Hungarian algorithm (Kuhn-Munkres algorithm) [92]. After applying the Hungarian algorithm, we get a result for both mode selection and channel allocation without considering the D2D reuse mode. We pick out all D2D pairs working in the D2D dedicated mode, enumerate them from 1 to M 1, and also enumerate the D2D pairs without any channel assignments from 1 to M 2, where M 1 is the number of the D2D pairs in D2D dedicated mode, and M 2 is the number of D2D pairs which were not assigned channels. In the next step, we conduct the matching just for the D2D reuse mode Channel Allocation for the D2D Reuse Mode In this last step, we just seek to match the D2D pairs in D2D dedicated mode with the D2D pairs without any channel assignments. Similar to the previous step, we form another (M 1 + M 2 ) M 1 throughput matrix, which is shown in Fig In this matching step, those D2D pairs in the D2D dedicated mode choose whether to share their channel with another D2D pair. In order to give the right to the D2D users in the D2D dedicated mode to refuse sharing channel with others for the purposes of throughput maximization, we have to include D2D dedicated mode by introducing M 1 dummy D2D pairs with enumerations from M to M 2 + M 1. After applying the Hungarian algorithm to this throughput matrix, the channel allocation and mode selection are accomplished. 197

217 Figure 7.10: Structure of the throughput matrix for the D2D reuse mode Numerical Results In this subsection, we further investigate the performance of our proposed channel matching algorithm via numerical results. In the simulation, we set the overall channel number as N = 6, the number of cellular users N 1 = 2, all θ C = θ D = 1, θ BD = θ BC = 2, P c = P dmax = 500, P b = 600. For each cellular uplink, the SINR threshold is γ c = 1.95; for each downlink, the SINR threshold is γ c = 2.34; and for each D2D link, the SINR threshold is γ d = We assume Rayleigh fading with path loss E{z} = d 4, where d is the distance between the corresponding transmitter and receiver. In the numerical results, we choose random channel matching as the benchmark, and compare its performance with our improved channel matching algorithm. In the random matching algorithm, we randomly allocate the channel to the D2D users after assigning 2N 1 channels to the cellular links, and repeat N 3 2 times and pick the best matching results. Hence, in the results of random matching, there is still an attempt to optimize the performance by finite attempts but without using any particular structure. 198

218 Proposed matching algorithm Random matching Average throughput N 2 Figure 7.11: Average throughput vs. N 2 Average number of unserved D2D pairs Proposed matching algorithm Random matching N 2 Figure 7.12: Average number of unserved D2D pairs vs. N 2 199

219 In Figs and 7.12, we plot the system throughput and the number of unserved D2D pairs as functions of the number of D2D pairs N 2. Since the results vary as we move the position of users for each N 2 value, we take the average over 100 different systems, which are generated randomly. In Fig. 7.11, we find that the system throughput of our algorithm is higher, especially for high N 2 values. As the number of D2D pairs increases, the random allocation algorithm has less chance to obtain the best result via random searching. Therefore, the throughput of random allocation algorithm reaches the limit when N 2 > 8. On the contrary, the throughput of our algorithm keeps increasing fast as we increase N 2 to very high values. This is because our algorithm can more effectively match the D2D pairs to the channels and efficiently identify their modes when the number of D2D pairs is large. In addition, Fig shows that our algorithm can serve more users compared to the random allocation algorithm. 7.3 A Joint Mode Selection and Resource Allocation Algorithm for D2D Communications via Vertex Coloring System Model and Assumptions In this section, as shown in Fig. 7.13, we consider a D2D underlaid cellular network, which has one base station (BS), N c cellular users {CU 1, CU 2,, CU Nc } and N d D2D pairs {(DT 1, DR 1 ), (DT 2, DR 2 ),, (DT Nd, DR Nd )}. We assume that the D2D transmission is one-way, in which DT i and DR i represent the transmitter and receiver of the i th D2D pair, respectively. Each D2D pair can choose between the cellular mode and D2D mode. In D2D mode, D2D users transmit through D2D direct links, while in cellular mode, they transmit via D2D two-hop links through 200

220 Figure 7.13: System model the base station. Each cellular user transmits to the base station through an uplink channel, and receives data from the base station via a downlink channel. Hence, there are overall N c uplinks, N c downlinks and N d D2D links. The maximum transmission power of a cellular user and D2D transmitter are set at P c and P d, respectively. When acting as a transmitter, the maximum transmission power of the base station is P b in each channel. Therefore, the overall transmission power of the base station depends on the number of cellular users and the number of D2D pairs operating in the cellular mode. There are N available orthogonal channels for this cellular network, each of them having a bandwidth of B. For simplicity, there are four assumptions regarding the channel allocation, which were also made in many related works such as [43] and [93]: 1. A D2D pair operating in the cellular mode cannot share its channel with other users. 201

221 2. Each cellular link, including both uplink and downlink, is allocated a single orthogonal channel, and channels cannot be shared by different cellular links. 3. It is necessary for a pair of direct links to satisfy the pair-wise interference constraints given below in (7.43) in Section to reuse the same channel. 4. Each link, including D2D direct link, D2D two-hop link, cellular uplink and downlink, can operate in one channel at most. 5. The base station has the knowledge of the distributions of all channel fading coefficients, i.e., has statistical channel side information. The first assumption helps to protect the performance of those D2D users that select the cellular mode. In general, D2D users that select the cellular mode usually have weak connections to their corresponding receivers, i.e., the distances between D2D transmitter, D2D receiver and the base station are relatively large. Therefore, assigning these D2D two-hop channels dedicated transmission resources provides a certain level of QoS guarantee. The second assumption guarantees the performance of cellular users, which have higher priorities than D2D users. The third assumption controls the interference among the users that reuse the same transmission resource. The last assumption implies that our resource allocation algorithm is performed at the base station, and our algorithm only requires the knowledge of the fading distributions, i.e., statistical channel side information. In general, fading distributions depend on the environment and distance between the transmitter and receiver. If a certain fading model is considered, such as Rayleigh, Rician or Nakagami-m fading, then the fading distributions are mainly determined by the location of the users. According to these assumptions, a channel can be assigned to a single D2D link, a single cellular link, a group of D2D direct links or a group of D2D direct links together with a cellular link. For the last two cases, the users transmitting in the same channel cause interference to each other. Note that in some systems, cellular downlinks do 202

222 not share transmission resources with D2D users. In such cases, we just need to first assign transmission resources to those cellular downlinks before applying our algorithm. In this section, we assume that 2N c N 2N c + N d, which implies that we have sufficient number of channels to guarantee the performance requirements of all cellular users. However, having dedicated channels for all D2D users is not feasible in such a situation, and reusing/sharing of channel resources has to be considered in order to serve as many D2D users as possible. The channels are assumed to experience ergodic fading, and the fading coefficients are denoted by h. Fading coefficients in different frequency bands are assumed to be i.i.d.. In the following analysis, the magnitude-squares of the fading coefficients are denoted by z = h 2. At each receiver, the background noise is assumed to follow an independent complex Gaussian distribution with zero mean and variance σ 2, i.e., n CN (0, σ 2 ). Therefore, the SNR of each transmitter can be defined as SNR = where P represents the transmission power. P, Bσ 2 In this section, we consider mode selection, power optimization and channel allocation jointly to maximize the throughput as well as the number of users served in the network. In the next subsection, we introduce our algorithm step by step Joint Mode Selection and Resource Allocation Algorithm In this subsection, we introduce our three-step joint mode selection and resource allocation algorithm in detail. In the first step, we divide the transmission links into groups via the vertex coloring method. In the second step, we conduct power optimization for each group, and perform mode selection between D2D mode and cellular mode for those D2D links which form groups. In the last step, we assign channels to those groups. Before applying the algorithm, we enumerate cellular uplinks from 1 to N c, cellular 203

223 downlinks from N c + 1 to 2N c, and D2D direct links from 2N c + 1 to 2N c + N d. D2D two-hop links are only considered in the mode selection part in the second step. With the given link indices, we can denote the magnitude-square of the fading coefficient between the transmitter of link i and the receiver of link j by z i,j, and we can represent the expected values of z collectively in a channel fading matrix Z. Two main objectives of our algorithm are to maximize the sum rate and to maximize the number of users served in the network. Most of the time, these two goals cannot be achieved simultaneously because of the presence of interference. In the following discussion, we illustrate how to balance these two goals via parameter selection Partition via Vertex Coloring Method The first step of our algorithm is transmission link partition. The partition algorithm divides transmission links into small groups, greatly reducing the dimensionality of the power optimization problem in the second step. According to our channel assignment assumptions, multiple cellular links cannot be in the same group, and any two links in the same group have to satisfy the pair-wise interference constraints given by P imax z ii /(P jmax z ji ) γ P jmax z jj /(P imax z ij ) γ (7.43) where P imax and P jmax are the maximum transmission powers over links i and j respectively, z represents the expected value of z, and γ is the interference threshold. These pair-wise interference constraints provide QoS guarantees for both cellular and D2D users from the perspective of interference control. The key steps of our partition algorithm are to construct a graph while regarding these 2N c + N d transmission links as vertices, and to perform the partition using the minimum vertex coloring algorithms from graph theory. Note that these algorithms 204

224 Table 7.4: Algorithm 7.4 Partition Algorithm Input: interference threshold γ, channel fading matrix Z. Output: partition Π = π 1, π 2,, π ng. For i = 1 : 2N c + Nd γ i = γ; End Generate a random permutation of integers from 1 to 2N c + N d, and denote it by A 1 ; For each i A 1 Generate a random permutation of integers from i + 1 to 2N c + N d, and denote it by A 2 ; For each j A 2 If both links i and j are smaller than 2N c Create an edge between vertices i and j; Elseif links i and j cannot satisfy { P imax z ii /(P jmax z ji ) γ i P jmax z jj /(P imax z ij ) γ j Create an edge between vertices i and j; Else Increase both γ i and γ j by γ; End End End Apply the Welsh-Powell algorithm to get the partition Π; divide all vertices into minimum number of groups such that any two vertices in the same group are not connected. Therefore, we construct the graph by checking each pair of vertices, and connect them if they cannot be in the same group. A detailed description of our partition algorithm is given in Table 7.4. The output of this algorithm is a partition with size n g, and each element of the partition is a set of vertices that form a group. In order to further control the interference and number of users in a group, we gradually increase the γ values of each link. As we can see in the algorithm, all threshold values are set at γ initially. Each time we find a pair of links that can be in the same group, we increase the thresholds of these two links by 205

225 γ. This mechanism can effectively limit the received interference at each receiver, and balance the size of each group. Also due to this mechanism, two links may have a higher chance to be in the same group if we check them earlier. In order to let the vertices to have equal chances to connect with each other, we use random orders to choose link pairs in the double for-loop. In the last step of the algorithm, we use the Welsh-Powell algorithm [94] to solve the vertex coloring problem. Welsh-Powell algorithm is a very fast algorithm that can provide good results effectively. In this step, γ and γ are the parameters to control the tradeoff between sum rate and number of users served by the system. For large values of γ and γ, the interference is well controlled, but the system serves potentially small number of users. On the other hand, for small γ and γ values, more users can reuse the same channel resource, but the interference may lower the sum rate. In practice, the partition algorithm can potentially provide us a partition with size smaller than the number of channels, which means that some of the channels will not be utilized, because each user group π i is assigned a channel in the third step of our algorithm. In order to avoid this situation, we need to further improve our partition algorithm using a γ-adjusting algorithm described in Table 7.5. In Algorithm 7.5, we find a threshold ˆγ that makes the partition size n g = N through bisection search. Notice that the threshold value that can achieve n g = N is not unique, and the time consumption of this adjusting algorithm is very small. After obtaining the partition, we conduct power optimization and mode selection in the second step Power Optimization and Mode Selection In the second step, we do power optimization for each group. If a group just contains a single D2D direct link, then we perform mode selection for this D2D pair. 206

226 Table 7.5: Algorithm 7.5 γ Adjusting Algorithm Input: interference threshold γ, channel fading matrix Z. Output: partition Π = π 1, π 2,, π ng. Run Algorithm 7.4 with threshold γ; If n g N End process; End Set ˆγ = γ; While n g < N ˆγ = 2ˆγ; Run Algorithm 7.4 with threshold ˆγ; End Set the upper bound γ u = ˆγ, lower bound γ l = ˆγ/2, and ˆγ = (γ u + γ l )/2; While n g N Run Algorithm 7.4 with threshold ˆγ; If n g > N γ u = ˆγ; Elseif n g < N γ l = ˆγ; End ˆγ = (γ u + γ l )/2; End 207

227 Power Optimization If a group only contains one direct link, then the transmitter transmits with its maximum power. For the groups containing multiple transmission links, a general expression of the objective function for the power optimization problem in group π i is Obj(P i ) = k ω k CRT k (P i ), (7.44) where P i represents the power vector which consists of the transmission powers of the transmitters in group π i, the function CRT can be defined based on the criteria selected in the optimization problem, such as the maximization of the sum rate, energy efficiency, or minimum rate, and ω k is the corresponding weight of CRT k which indicates the significance of CRT k. The formulation given in (7.44) can provide QoS and fairness guarantees. For instance, by choosing the energy efficiency as a criterion, a certain energy efficiency performance can be achieved; or by choosing the minimum rate as a criterion, the minimum rate performance of each users can be guaranteed. In this section, we consider both the sum rate and minimum rate as the criteria, and formulate our power optimization problem for group π i as Maximize Pi j π i E{R j (P i )} + µ min j π i { E{Rj (P i )} } (7.45) Subject to 0 P j P jmax, for j π i (7.46) where P jmax represents the maximum transmission power in link j. The evaluation of the average transmission rate E{R j (P i )} is discussed in Remark 3 below. In this problem, µ is the weight parameter for the minimum rate. For small µ values, the objective function is mainly determined by the sum rate component, which may sacrifice the rates of some users. On the other hand, for large µ values, the objective function is mainly dominated by the minimum rate component, which may limit the 208

228 sum rate. This optimization problem can be transformed into Maximize Pi,r j π i E{R j (P i )} + µr (7.47) Subject to 0 P j P jmax, for j π i (7.48) E{R j (P i )} r, for j π i (7.49) for which suboptimal solutions can be obtained via the interior-point method [95]. In order to improve the performance, we need to repeat the algorithm several times with randomly selected initial points. Remark 3 In order to determine the average rate of a user accurately and efficiently,we perform numerical integration. To evaluate this high-dimensional integral, we transform it into two single integrals for certain specific fading models: ( )} SNR j z jj E{R j (P i )} =E {B log (7.50) k π i,k j SNR kz kj )} =E {B log 2 (1 + k πi SNRk z kj E {B log 2 ( 1 + )} SNR k z kj. (7.51) k π i,k j In Rayleigh fading, SNR z follows an exponential distribution with probability density function (pdf) f(x) = 1 SNR z e x/(snr z). (7.52) According to the results in [96], the summation of independent exponentially distributed random variables S M = M k=1 X k, where X k exp(λ k ), has a pdf given by f SM (s) = M i=1 M j=1 λ j M j=1,j i (λ j λ i ) e sλ i. (7.53) 209

229 Using this characterization, the sum terms k π i SNR k z kj and k π i,k j SNR kz kj in (7.51) can be regarded as two random variables, and the average rate can be evaluated using two single integrals. Similar approach can be applied to some other fading models as well. Mode Selection If a group just contains a single D2D direct link, then this D2D pair can choose between D2D mode and cellular mode. In cellular mode, D2D users communicate through the base station, and each time block is divided into two phases. In the first phase, the D2D transmitter sends packets to the base station, and base station forwards the packets to the corresponding D2D receiver in the second phase. We assume that the base station decodes and stores the received packets from D2D transmitters in a buffer, and the buffer empty probability is negligible. Let τ i denote the fraction of time allocated to link DT i BS. If users DT i and DR i are in cellular mode, then the fraction of time allocated to BS DR i link is 1 τ i. Since the throughput of the two-hop link DT i BS DR i is min{τ i E{R DTi BS}, (1 τ i )E{R BS DRi }}, the optimal τ i value is given by τ i = E{R BS DRi } E{R DTi BS} + E{R BS DRi }, (7.54) which leads to τ i E{R DTi BS} = (1 τ i )E{R BS DRi }. Above in (7.54), the instantaneous rates of links DT i BS and BS DR i are formulated as ( R DTi BS = B log P ) d Bσ z 2 DT i BS ( R BS DRi = B log P ) b Bσ z 2 BS DR i (7.55) (7.56) where the subscript of the fading power z denotes the link to which the fading power 210

230 is associated. Then, the average transmission rate of the i th D2D pair in cellular mode is E{R DTi BS DR i } = τ i E{R DTi BS} (7.57) = E{R BS DR i }E{R DTi BS} E{R DTi BS} + E{R BS DRi }. (7.58) In D2D mode, the average transmission rate of link DT i DR i is ( E{R DTi DR i } = E {B log P )} d Bσ z 2 jj (7.59) where j = 2N c + i is the index of the i th D2D direct link. We compare the average rates in these two modes, and select the one with the higher average rate Channel Assignment In the first step, we divide the transmission links into n g groups, and the optimal transmission power and transmission mode of each user are obtained in the second step. In this third step discussed in this subsection, we allocate channel resources to each group. We first allocate a channel to each group containing a cellular link, to guarantee that each cellular link is provided a channel. Following this step, there are N 2N c channels left for the remaining D2D users. Given these channels, we can choose to maximize the sum rate or maximize the total number of users served by the system. If we choose to maximize the sum rate, then we need to pick N 2N c groups with the highest group sum rates from the remaining n g 2N c groups, and assign each of them a channel. If we choose to maximize the number of users served by the system, then we need to select N 2N c groups with the largest group sizes, and assign each of them a channel. 211

231 Table 7.6: Algorithm 7.6 Joint Mode Selection and Resource Allocation Algorithm Run Algorithm 7.5 for a given γ value to obtain a partition with size N g greater or equal to the number of channels N; For i = 1 : n g Run the power optimization algorithm for the i th group; If the i th group only contains one D2D link Run the mode selection algorithm for this D2D link; End End Run the channel assignment algorithm to assign channel resources to these groups Summary Our joint mode selection and resource allocation algorithm is described in Table 7.6. Via the vertex coloring algorithm, we can quickly divide users into small groups, which greatly lowers the dimensionality of the power optimization problem in the second step and reduces the time consumption. From numerical results, we notice that the majority of the time is spent on solving the power optimization problems in the second step. Therefore, finding a faster algorithm instead of the interior-point method for the power optimization problem is the key to further reduce the time consumption of our algorithm, and we leave a detailed study of this problem as our future work Numerical Results In this subsection, we further investigate the performance and parameter selection of our joint mode selection and resource allocation algorithm via simulations. For our algorithm, we set the initial threshold of the interference constraints as γ = 250, and γ {50, 125, 250, 1250, 2500}. In the power allocation step, we set the weight for the minimum rate objective as µ = 0.2 Size(π i ), where size(π i ) represents the number of links in the i th group. In the channel assignment step, we choose to maximize the 212

232 Sum rate (bits/s) I II III IV V VI Algorithm Figure 7.14: Comparison of sum rate 22 Number of served users I II III IV V VI Algorithm Figure 7.15: Comparison of the number of served users 213

233 Time consumption (s) I II III IV V VI Algorithm Figure 7.16: Comparison of time consumption Sum rate (bits/s) No channel reuse Vertex coloring with γ=50 Vertex coloring with γ=125 Vertex coloring with γ=250 Vertex coloring with γ=1250 Vertex coloring with γ= N d Figure 7.17: Sum rate vs. N d 214

234 22 21 Number of served users No channel reuse Vertex coloring with γ=50 Vertex coloring with γ=125 Vertex coloring with γ=250 Vertex coloring with γ=1250 Vertex coloring with γ= N d Figure 7.18: Number of users served by the system vs. N d sum rate. The number of channels is N = 25, and the number of cellular users is N c = 10. The maximum SNRs are P b Bσ 2 = db, P c Bσ 2 = P d Bσ 2 = db. We consider Rayleigh fading with path loss E{z} = d 4, where d is the transmission distance, and users are randomly placed in the cell. We repeat each simulation 200 times, and each point in the numerical plots is averaged over 200 randomly generated systems. In Figs , we compare the performance of our algorithm with the coalitional game method proposed in [48]. In the coalitional game, each user forms a coalition at the beginning, and link i prefers to join coalition j if the sum objective function increases by moving link i to coalition j. In Figs , algorithms I, II, III, IV, V and VI represent the coalitional game algorithm and the vertex coloring algorithms with γ = 50, γ = 125, γ = 250, γ = 1250 and γ = 2500, respectively, and the number of D2D pairs is fixed as N d = 15. From the results, we can see that our vertex coloring algorithm provides higher sum rates and serves more users than the coalitional game algorithm. As γ increases, the sum rate increases due to less interference, but the number of users being served decreases due to stricter interference constraints, as noted in Section Also, we notice that as γ increases, 215

235 the time consumption of our algorithm reduces, and when γ = 2500, our algorithm is much faster than the coalitional game algorithm 2. For larger values of γ, the maximum group size is small, reducing the dimensionality and time consumption of the power allocation problems in the second step. In Figs and 7.18, we plot the sum rate and number of served users as functions of the number of D2D pairs N d. In these two figures, we consider the results without channel reuse as the benchmark, in which all users transmit with their maximum power and only 25 links (10 cellular uplinks, 10 cellular downlinks and 5 D2D links) with the highest rates are allocated dedicated channels. We can see that as N d increases, the advantages of allowing channel reuse become more obvious. The sum rate and number of served users increase much faster when channel reuse is allowed. Similarly, larger γ improves the sum rate while sacrificing the number of served users. In summary, our algorithm has high performance and low time consumption. When the sum rate is more important, we can choose relatively high values for γ and γ, and assign channels to the groups with higher group rates. In this case, the time consumption can also be reduced. However, if the values of γ and γ are too large, then our algorithm leads to the cases in which channel reuse is not allowed. On the other hand, we can choose relatively low values for γ and γ, and assign channels to the groups with more users if we choose to maximize the number of served users. However, if the values of γ and γ are too small, then the interference limits the transmission rate of each user, and the service quality may degrade. Therefore, avoiding such extreme values and optimizing parameter selection are preferred. 2 These time consumption measurements are obtained for codes in Matlab 2015b running on a 2.40GHz Intel i7-4700mq CPU. 216

236 Chapter 8 Resource Allocation for Content Delivery over Wireless Cellular Networks In this chapter, we focus on the delay performance of content delivery over wireless cellular networks. Three types of network models are considered, including D2D caching network, D2D cellular network, and C-RAN. By storing parts of the popular files at the mobile users, users can locate some of their requested files in their own caches or the caches at their neighbors. In the latter case, when a user receives files from its neighbors, D2D communication is enabled. D2D communication underlaid with cellular networks is also a new paradigm for the upcoming 5G wireless systems. In Section 8.1, we propose a very efficient caching algorithm for D2D-enabled cellular networks to minimize the average transmission delay. Instead of searching over all possible solutions, our algorithm finds out the best <file,user> pairs, which provide the best delay improvement in each loop to form a caching policy with very low transmission delay and high throughput. This algorithm is also extended to address a more general scenario, in which the distributions of fading 217

237 coefficients and values of system parameters potentially change over time. In Section 8.2, we develop a scheduling algorithm for D2D cellular networks with deadline constraints via the convex delay cost approach. At the beginning of each time slot, the algorithm allocates all available channels to the users, and each user can choose to transmit in different modes. After characterizing the transmission rates and defining the utility for each possible scheduling decision, we propose power optimization algorithms to maximize the utility for each type of decision. Our scheduling algorithm allocates each channel according to the decision that provides the maximum utility value, and it manages mode selection, channel allocation and power optimization. Via simulation results, we discuss the parameter selection for our algorithm and verify the performance improvements by allowing D2D users to share channels with other users. In Section 8.3, we formulate the utility of each user using a convex delay cost function, and design a two-step scheduling algorithm with good delay performance for the C-RAN architecture. C-RAN architecture is a new mobile network architecture that enables cooperative baseband processing and information sharing among multiple cells and achieves high adaptability to nonuniform traffic by centralizing the baseband processing resources in a virtualized BBU pool. In the first step, all users in multiple cells are grouped into small user groups, according to their interference levels and estimated utilities. In the second step, channels are matched to the user groups to maximize the system utility. The performance of our algorithm is further studied via simulations, and the advantages of C-RAN architecture is verified. 218

238 Figure 8.1: System model of a D2D cellular network with caches 8.1 A Delay-Aware Caching Algorithm for Wireless D2D Caching Networks System Model and Problem Formulation System Model and Channel Allocation As shown in Fig. 8.1, we consider a cellular network with one base station (BS), in which a library with M files (F 1, F 2,, F M ) is stored, and we assume that the size of each file is fixed to F bits 1. There are N users (U 1, U 2,, U N ) in the network who seek to get the content files from the library. Each user is equipped with a cache of size µf bits, and therefore can store µ content files. The caching state is described by an N M matrix Φ, whose (i, j)-th component has a value of ϕ i,j = 1 if file F j is 1 In the literature, it is noted that the base station may only store a portion of the library contents, and needs to acquire the remaining files from the content server [58]. Since we focus on the wireless transmission delay, we do not explicitly address the link between the base station and content server. Also, the content files may not have the same size in practice, but we can further divide them into sub-files with equal size. 219

239 cached at user U i, and ϕ i,j = 0 when the user U i does not have file F j in its cache. In general, users request files with different probabilities, which are characterized by an N M popularity matrix P, in which the entry on the i th row and j th column, P i,j, represents the probability of user U i requesting file F j. Each row of the popularity matrix corresponds to a popularity vector of a user. Although the popularity matrix may change over time in practice, we can assume that the popularity stays constant within a certain period, and our caching algorithm needs to be repeated when the popularity matrix is updated. In the literature, Zipf distribution is generally considered as a good statistical model for the popularity. The pmf of this distribution is given by P i,j = f β i,j M, (8.1) k=1 k β where f i,j is the popularity index that user U i gives to file F j, and β 0 is the Zipf exponent. Each user enumerates the files with popularity index from 1 to M, where the most popular file gets index 1, and the least popular file gets index M. As the Zipf exponent β increases, the difference in the popularity of different files increases, while all files have the same popularity when β 0. Although we use Zipf distribution for our numerical results in Section 8.1.4, our proposed algorithm works for any type of popularity model. At each user, the generated requests are buffered in a queue before getting served, and it is assumed that these request queues are not empty at any time. In a D2D-enabled wireless network, users can choose to transmit in cellular mode or D2D mode. In the cellular mode, users request and receive information from the base station, while in the D2D mode, a user requests and receives information from another user through a D2D direct link. In our model, the users first check their local cache when a file is requested. If the user does not have the corresponding file 220

240 in its own cache, it sends a request to the base station. We assume that the base station has knowledge of all fading distributions (i.e., only has statistical information regarding the channels) and the cached files at each user. After receiving the request, the base station identifies the source node from which the file request can be served and allocates channel resources to the corresponding user. Therefore, the result of mode selection is determined by the result of source selection. If the source node is another user, then the requested file is sent over the direct D2D link and hence the communication is in D2D mode, otherwise the receiving user works in cellular mode and receives files from the base station. In source selection, among all the nodes (including the base station) who have the requested file, the node with the lowest average transmission delay to the receiver is selected as the transmitter. In this section, we consider an OFDMA system with N c orthogonal channels, and the bandwidth of each channel is B. We assume that the background noise samples follow i.i.d. circularly-symmetric complex Gaussian distribution with zero mean and variance σ 2 at all receivers in all frequency bands, and the fading coefficients of the same transmission link are i.i.d. in different frequency bands. The fading coefficients are assumed to stay constant within one time block of duration T 0, and change across different time blocks. We summarize the resource allocation assumptions for the discussions in Sections and as follows: 1. Each channel can be used for the transmission of one requested file at most, and the transmission of a file cannot occupy multiple channels. 2. D2D transmitters are not allowed to transmit to multiple receivers simultaneously. In other words, the file requests whose best source node is a user who is already transmitting cannot be assigned a channel resource by the base station. 3. The probability of a channel being allocated to a request generated by user i is ˆp i. 221

241 4. After a request is served, the corresponding transmitter keeps silent in the remaining time block, and the base station allocates the channel resource to other requests at the beginning of the following time block. 5. If the i th user is selected as a D2D transmitter, its maximum transmission power is P i. 6. The base station can serve multiple requests simultaneously using different channels, and its maximum transmission power is P b for each request. The first four assumptions describe a class of simple scheduling algorithms, in which only point to point transmission without spectrum reusing is considered. At the beginning of each time block, base station assigns available channels to the requests, and each transmission link gets one channel at most, and uses the assigned channel exclusively. The transmitter transmits until the request is served and then releases the channel resource. The behavior of the scheduling algorithm is described by a set of probabilities ˆp i defined in the third assumption. Although our delay characterizations in Sections and are only valid for this type of scheduling algorithms, we further extend our results for more complicated scheduling algorithms in Section In that case, only the last two assumptions are required, which describe the maximum power constraints. With a more complicated scheduling algorithm, we can only estimate the average delay of each request at each user through simulation or learning methods. A detailed discussion is provided in Section In this section, only the distributions of the fading coefficients are required at the base station, which mainly depend on the environment and the location of each user. A centralized computation scheme is used, and the base station sends the results of caching and scheduling algorithms to the users through additional control channels. Since the base station knows the distributions of all fading coefficients and the cached files at each user, the average delay between each user pair and the best source node 222

242 for each request can be obtained at the beginning and stored in tables at the base station. A detailed discussion on delay calculation and the determination of the best source node is provided in the next subsection Transmission Delay In this section, we use the transmission delay, which is defined as the number of time blocks used to transmit a content file, as the performance metric. From the above discussion, the instantaneous channel capacity a transmission link in the k th time block is ( C[k] = B log P ) t Bσ z 2 k bits/s (8.2) where P t is the transmission power, and z i is the magnitude square of the corresponding fading coefficient in the k th time block. In order to maximize the transmission rate, all transmitters transmit at the maximum power level. Therefore, P t = P b P i if the transmitter is the base station if the transmitter is the i th user, (8.3) and the duration to send a file is { T = min t : F } t T 0 C[k] k=1 (8.4) where F is the size of each file, T 0 is the duration of each block, and C[k] is the instantaneous channel capacity in the k th time block. When the fading distribution is available, the average transmission delay of the link U i U j, which is denoted by E{T i,j }, can be obtained through numerical methods or Monte-Carlo simulations. These average delay values can be stored in an N N symmetric matrix T avg, whose 223

243 component on the i th row and j th column is given by E{T i,j } when i j, and the diagonal element T i,i is the average delay between U i and the base station. According to our channel assumptions, the average delays of a transmission link are the same in every channel. Therefore, we only need to analyze the performance in a single channel. The best source node of the request, which is generated by user U i requesting file F j, is the node which has file F j and the smallest average transmission delay to U i, and this minimum average delay is denoted by D 2 i,j. The best source of each possible request can be stored in an N M table S, in which each row corresponds to a user who generates the request, and each column corresponds to a file being requested. Also, these D i,j values can be collected in an N M matrix D. Using the above results, the average transmission delay of the requests generated by user U i can be obtained as D i = M P i,j D i,j, (8.5) j=1 where P i,j is the (i, j)-th component of the popularity matrix P Problem Formulation In the previous subsection, we have determined and expressed the average delay. In this subsection, we formulate and discuss our caching problem. In this section, our goal is to minimize the weighted sum of the average delays of the users, which is expressed as η = N ω i D i = i=1 N ω i M i=1 j=1 P i,j D i,j (8.6) 2 If U i has cached F j, then the best source node is U i itself, and D i,j =

244 where ω i [0, 1] is the weight for user U i. We assume that the values of the weights are predetermined. In practice, ω values can be determined according to the priorities of users, so that users with higher priority have higher weights. Our caching problem is formulated as P1: Minimize Φ η (8.7) Subject to M ϕ i,j = µ (8.8) j=1 ϕ i,j {0, 1} (8.9) where Φ is the caching result indicator matrix. The constraint in (8.8) arises due to the maximum cache size. It is obvious that the optimal caching policy must use all caching space. In a special case, if we choose ω i = ˆp i, where ˆp i is the probability that a channel is allocated to user U i, then η expresses the average delay of the system. In this situation, the throughput of the system can be expressed as R = N c F η. (8.10) Therefore, in this special case, minimizing η is equivalent to maximizing the throughput of the system Caching Algorithm In this subsection, we propose our caching algorithm that solves problem P1. Note that the objective in problem P1 is not convex, and the solution space is a discrete set with size ( M! (M µ)! µ! )N. Therefore, the globally optimal solution can only be obtained via exhaustive search. In this section, we propose an efficient algorithm to determine a caching policy with delay performance close to the optimal solution. At the end of 225

245 Table 8.1: Algorithm 8.1 Find the delay improvement for a <file,user> pair Input : user index i, file index j, caching indicator ϕ i,j, weight vector ω = (ω 1,, ω N ), popularity matrix P, source table S, delay matrices T avg and D. Output : delay improvement g i,j, updated source table Ŝ, updated optimal delay matrix ˆD. Initialization : Ŝ = S and ˆD = D If ϕ i,j = 1 g i,j = 0, end process. Else g i,j = ω i P i,j D i,j and update Ŝi,j U i, ˆDi,j = 0. End For k = 1 : N If D k,j > T i,k and i k g i,j = g i,j + ω k P k,j (D k,j T i,k ) update ˆD k,j = T i,k and Ŝk,j U i End End this section, we show that our algorithm has the potential to be extended to more complicated scenarios Caching Algorithm Our algorithm is a greedy algorithm, which searches over a subset of the solution space with smaller size. At the beginning, we assume that all caches are empty, and every user has to operate in cellular mode, in which they only receive files from the base station. Then, in each step, we find the best <file,user> pair, which provides the maximum delay improvement (or equivalently reduction in delay) if the selected file is stored in the cache of the corresponding user. This process needs to be repeated Nµ times, in order to fill all cache space, and the final caching policy is obtained. In Table 8.1, we describe Algorithm 8.1 in detail, which calculates the delay improvement and determines the updated S and D matrices accordingly when we cache file F j at user U i. First, we check if F j has already been cached at U i. If so, we end 226

246 Table 8.2: Algorithm 8.2 Find the optimal <file,user> pair to be added in the updated caching result, leading to maximum delay improvement Input : weight vector ω = (ω 1,, ω N ), popularity matrix P, caching indicator matrix Φ, source table S, delay matrices T avg and D. Output : new source table S, new optimal delay matrix D, and new caching indicator matrix Φ. Initialization : set optimal delay improvement g = 0, and set the corresponding S = S, D = D. For i = 1 : N If M j=1 ϕ i,j < µ For j = 1 : M run Algorithm 8.1 for < U i, F j >, to obtain the gain g i,j and the corresponding Ŝ and ˆD. IF g i,j > g update g = g i,j, S = Ŝ, D = ˆD, ĩ = i, and j = j. End End End End update ϕĩ, j = 1, S = S and D = D. the process, and return the delay improvement g i,j = 0; if not, we set g i,j = ω i P i,j D i,j because that is the reduction in η at user U i if it adds F j to its cache. Then, we need to sum up all reductions at each user. At user U k, if D k,j > T i,k, then D2D link U i U k has the lowest average delay for U k to receive F j and the reduction at U k is ω k P k,j (D k,j T i,k ); if not, then caching F j at U i does not help to improve the delay performance at U k. Based on Algorithm 8.1, Algorithm 8.2 described in Table 8.2 helps to find the optimal <file,user> pair to be added to the updated caching result, which leads to the maximum delay reduction. In Algorithm 8.2, ĩ and j record the optimal user index and file index, respectively. g tracks the maximum delay improvement, and S and D record the new source table and minimum delay matrix, respectively, after 227

247 Table 8.3: Algorithm 8.3 Caching Algorithm Input : weight vector ω = (ω 1,, ω N ), popularity matrix P, and delay matrix T avg. Output : caching indicator matrix Φ, source table S. Initialization : for all requests, S i,j BS, D i,j = T i,i. Set all ϕ i,j = 0. For loop = 1 : Nµ run Algorithm 8.2 to cache a file and update the result. End caching F j at U ĩ. We search over all NM possible <file,user> combinations, find their delay improvements and update g, ĩ, j, S and D accordingly. At user U i, we check if there is empty space in its cache. If its cache is full, we directly jump to the next user U i+1. For each <file,user> pair, we run Algorithm 8.1 to calculate the corresponding delay improvement, and compare it with g. If a <file,user> pair exceeds the maximum delay improvement up to that point, we perform the update accordingly. Every time we run Algorithm 8.2, we cache one more file at a user. Therefore, we need to run Algorithm 8.2 Nµ times to obtain the final caching result, and this process is described in Algorithm 8.3 in Table 8.3. For our proposed caching algorithm, we initially have all caches empty, and all users work in cellular mode, in which they only receive files from the base station at first. We assume that the system has calculated the average delay between every two nodes, and stored the delay matrix T avg at the base station. Then, base station runs Algorithm 8.2 Nµ times, and in each time we cache one more file and update the caching indicator Φ, source table S, and minimum delay matrix D accordingly. Finally, the base station sends the caching files to the users when the traffic load is low. 228

248 Complexity Analysis In the l th iteration, Algorithm 8.2 searches over NM (l 1) possible <file,user> pairs, where the term l 1 corresponds to the l 1 <file,user> pairs that have been selected in previous iterations. Therefore, the size of the search space of our algorithm is l=nµ l=1 NM (l 1) = N 2 Mµ 1N 2 µ Nµ, which is much smaller than the 2 2 size of the entire solution space ( M! (M µ)! µ! )N. In order to test the performance of our algorithm, we compare our algorithm with the brute-force exhaustive search algorithm. We apply both algorithms to a system, in which there are 5 users, 10 files in the library and each user can cache 2 files. These two algorithms obtain the same caching result, however the time consumption of the exhaustive search algorithm is seconds, while our algorithm only takes only seconds Extensions and Future Work In this subsection, we consider a more general case, in which the delay matrices T avg and D, weight vector ω = (ω 1,, ω N ), popularity matrix P and transmission powers P i change over time. For simplicity, we assume that all these parameters stay constant within one update cycle, and we use κ as the index of cycles. The duration of the κ th cycle, denoted by τ κ, depends on how fast the parameters vary. Then, we can formulate our caching problem in the κ th cycle as P2: Minimize Φ κ Subject to N i=1 ω κ i M j=1 P κ i,jd κ i,j (8.11) M ϕ κ i,j = µ (8.12) j=1 M ϕ κ i,j ϕ κ 1 2ξi κ (8.13) j=1 i,j ϕ κ i,j {0, 1}. (8.14) 229

249 If we define the weight of user i as ω κ i = E{NPK κ i /NPK κ }, where NPK κ i and NPK κ represent the number of received packets in the κ th cycle at user i and at all users, respectively, then the objective function in the optimization problem P2 represents the expected packet delay in the κ th cycle. The transmission power P κ i is determined according to the battery budget of user i. Due to the changes in transmission powers and the distributions of channel fading, the delay matrices T avg κ and D κ also vary over time. Compared with P1, P2 includes an additional constraint given by (8.13). In (8.13), ξ κ i is the upper bound of the number of cache files that will be replaced in the current update cycle. Due to requirements regarding energy efficiency and current traffic load, each user may be able to update only a few cache contents. The solution of P2 is described below: 1. At the beginning of the κ th cycle, the system estimates the delay matrix T avg κ 1, weight vector ω κ 1, and popularity matrix P κ 1 according to the samples obtained in the previous cycle. The base station receives the transmission powers P κ i from the users, determine the cycle period τ κ and the upper bound ξ κ i, and then predicts T avg κ, ω κ and P κ. 2. Algorithm 8.2 is repeated Nµ times to determine the caching result in the κ th cycle. 3. At the end of each iteration in the second step, it is checked if the constraint in (8.13) is satisfied with equality at any one of the users. If this constraint is satisfied with equality at a user, then no more cache updating is allowed for this user, meaning that this user can only choose from the files that are already stored in its cache in the remaining iterations. After this process, the base station sends the cache contents to each user, and conduct regular transmission after updating the cache files at each user. 230

250 As we have mentioned in Section 8.1.1, this improved algorithm does not require the first 4 resource allocation assumptions described in Section , and works for any resource allocation algorithm, since the delay matrices T avg and D need to be evaluated via estimation or learning methods. Also, we note that this method requires estimation algorithms in the first step. Due to the page limitations, we leave a detailed study of this problem as our future work Numerical Results In this subsection, we investigate the performance of our proposed algorithm via numerical results. Since the estimation and resource allocation components required for the extended algorithm in Section are beyond our scope, we only consider Algorithm 8.3 and its corresponding system model in this subsection. In the numerical results, the location of each user is randomly generated within a circular cell with the base station placed at the center. Each point in the figures is obtained by taking average over 500 randomly generated systems. The popularity matrix is generated according to the Zipf distribution. When the users have identical popularity, they give the same popularity index to a file, which leads to identical rows in the popularity matrix P. When the users have independent popularity, each user gives popularity indices to the files independently. In other words, identical popularity indicates that all users have the same preference, while independent popularity indicates that each user has an independent preference. The number of files in the library is M = 100, and the size of each file is 11.3 bits. We assume Rayleigh fading with path loss E{z} = d 4, where d represents the distance between the transmitter and the receiver. The transmission powers are set as P b = 23 db and P u = 20 db, and we choose the weights as ω i = ˆp i so that η represents the average system delay. In the numerical results, we compare the performance of our proposed algorithm with a naive algorithm, in which each user just caches the most popular µ files. This 231

251 Proposed algorithm with independent popularity Naive algorithm with independent popularity Proposed algorithm with identical popularity Naive algorithm with identical popularity Average delay η β Figure 8.2: Average delay η vs. Zipf exponent β naive algorithm is efficient when the base station does not have the knowledge of the channel fading statistics and the cached files at each user. In this circumstance, the users just cache files according to their own preference. In the case of naive algorithm with identical popularity, every user caches the same files, and they get the files they do not have via cellular downlink from the base station. Therefore, the gap between the two curves using naive algorithm in Figs (which will be discussed in detail next) demonstrates the benefit of enabling D2D communications. By allowing D2D transmission, the users far away from the base station can get files from their neighbors, which helps to significantly reduce the delay. In Fig. 8.2, we set N = 25, µ = 30 and plot the average delay η as a function of the Zipf exponent β. As β increases, the popularity difference increases. When β = 0, the users request all files with equal probability; when β +, each user only requests its most favorite file. Therefore, we only need to concentrate on the delay performance of fewer popular files as β increases, and it becomes easier to achieve better delay performance with limited caching space. That is the reason for having monotonically decreasing curves in Fig Another observation is that our 232

252 2.5 2 Proposed algorithm with independent popularity Naive algorithm with independent popularity Proposed algorithm with identical popularity Naive algorithm with identical popularity Average delay η Cache size µ Figure 8.3: Average delay η vs. cache size µ algorithm is more robust to the popularity setting. Compared to the curves using the naive algorithm, identical popularity model only slightly raises the delay of our algorithm. If a node can get a popular file from its near neighbor, then caching some less popular files might give better delay improvement. Therefore, our algorithm can enable D2D transmission even in an identical popularity model, which guarantees the robustness. In Fig. 8.3, we select β = 0.1, N = 25 and plot the average delay as a function of the caching size µ. When µ is small, the delay difference between different algorithms and different popularity settings is small. In such a situation, both algorithms cache the most popular files. As µ increases, the difference in performance increases. As we have mentioned in Algorithm 8.2, our algorithm searches for the optimal <file,user> pair that provides the maximum delay improvement, and this mechanism guarantees a very sharp decrease at the beginning. After exceeding a threshold, further increasing the caching size reduces the performance difference, because the system gets enough caching size to cache most of the popular files. Overall, Fig. 8.3 shows that our algorithm can achieve better delay performance with limited caching size. 233

253 Average delay η Proposed algorithm with independent popularity Naive algorithm with independent popularity Proposed algorithm with identical popularity Naive algorithm with identical popularity N Figure 8.4: Average delay η vs. the number of users N In Fig. 8.4, we select β = 0.1, µ = 30 and plot the average delay as a function of the number of users N. For the curve using the naive algorithm with identical popularity model, having more users does not affect the average delay because each user works in cellular mode and receives the files from the base station. For other curves, increased number of users enables more chances for D2D communication, and as a result the average delay decreases. Compared with the naive algorithm, our algorithm can achieve better performance when the number of users is large. 8.2 Scheduling in D2D Underlaid Cellular Networks with Deadline Constraints System Model and Transmission Modes System Model We consider an OFDMA cellular network with N available orthogonal channels, one base station (BS), N c cellular users {CU 1, CU 2,, CU Nc } and N d D2D pairs 234

254 {(DT 1, DR 1 ), (DT 2, DR 2 ),, (DT Nd, DR Nd )}. Each channel is assumed to have a bandwidth of B. Each cellular user transmits to the base station through an uplink channel, and receives data from the base station via a downlink channel. D2D transmission is assumed to be one-way between a D2D pair, in which DT i, the transmitter of the i th D2D pair, sends packets to its corresponding receiver DR i. The maximum transmission power of cellular users and D2D transmitters are set at ˆP c and ˆP d, respectively. When acting as a transmitter, the maximum transmission power of base station is ˆP b in each channel. We assume that the time is slotted, and each time slot has a duration of T. At the beginning of each time slot, the system runs a scheduling algorithm to allocate its channels to the users. Those users to which channels are not assigned are not activated for communication until they get an available channel in another time slot. In a cellular network with D2D users, D2D users can choose to transmit trough a direct link DT i DR i or a two-hop link DT i BS DR i. When it transmits through the base station, a D2D transmitter first sends packets to the base station, and then the base station decodes and forwards the packets to the corresponding D2D receiver. When a pair of D2D users chooses to communicate through the direct link, they can also decide whether to reuse the same channel with an uplink, a downlink or another D2D direct link. In this section, mode selection is performed by the scheduling algorithm, and the mode of each active user is determined by the channel allocation results. There are 4 assumptions for the channel assignment: 1. Each active link can occupy only one channel, and a D2D two-hop link is also regarded as one transmission link. 2. Each channel can be occupied by two transmission links at most. 3. A two-hop link cannot share its channel with other transmission links. 4. Cellular uplinks and downlinks can only share their channels with D2D direct 235

255 links. According to our assumptions, there are overall 7 different modes, namely D2D cellular mode, uplink reuse mode, downlink reuse mode, D2D reuse mode, uplink dedicated mode, downlink dedicated mode, and D2D dedicated mode. A detailed discussion about these possible modes is given in Section The channel fading is assumed to be block fading, in which the fading coefficients denoted by h stay constant in one time block and change independently across blocks. In Figs. 8.5, 8.6 and 8.7, the magnitude-square of the fading coefficients are denoted by z = h 2. At each receiver, the background noise is assumed to follow an independent complex Gaussian distribution with zero mean and variance σ 2, i.e., n CN (0, σ 2 ). For all transmitters, the data packets are stored in buffers before being sent to the corresponding receiver. For simplicity, we assume that new packets arrive at the beginning of each time slot, and the size of each packet is assumed to be fixed at I p bits. We further assume that there are totally 2N c + N d + 1 buffers in the system, operating in a first-in first-out (FIFO) manner. At each cellular user and D2D transmitter, there is a buffer storing the packets for cellular uplinks and D2D links, respectively. Although there might be only one physical buffer at the base station in reality, we can decompose it into N c + 1 virtual buffers, in which N c of them correspond to cellular downlinks, and there is one special buffer corresponding to the D2D cellular mode. We enumerate all these buffers from 1 to 2N c + N d, except the one corresponding to the D2D cellular mode. The system operates under deadline constraints, and the delay upper bound of the i th buffer is set to be D i. In the following subsection, we introduce all possible transmission modes and describe the relationship between mode selection and channel allocation. 236

256 Figure 8.5: System model in the D2D cellular mode Transmission Modes and Instantaneous Transmission Rate In this section our mode selection is done through scheduling. Depending on how the system uses each channel, we can determine the transmission mode for each user. According to our channel allocation assumptions, there are 7 possible modes, which can be further summarized into 3 categories. D2D Cellular Mode The first category is D2D cellular mode, and the model is shown in Fig In this mode, the channel is occupied by a D2D pair, transmitting with the help of the base station, and the packet arrival rate at DT i is represented by R Di. For simplicity, we assume that the whole time slot is divided into two sub-slots with duration τt and (1 τ)t, respectively 3. In the first sub-slot, D2D transmitter DT i sends packets to the base station, and base station stores the received packets in the special buffer corresponding to the D2D cellular mode. In the second sub-slot, base station forwards all the packets received in the first sub-slot to their destination DR 4 i. In each sub-slot, there is only one transmitter and one receiver, and the received signal at each receiver follows the form y = hx + n, (8.15) 3 In this section, we assume that the time cost for signal processing at the base station is negligible in cellular mode. 4 No deadline constraints are imposed on this special buffer at the base station, because all packets sent in the first sub-slot arrive at their destination in the second sub-slot 237

257 Figure 8.6: System model in the reuse mode where x is the transmitted signal, n is the additive Gaussian noise component, h is the corresponding channel fading coefficient. We can express the packet transmission rates (in packets/slot) of DT i BS link and BS DR i link as r Di,BS(τ, P d ) = τ T B ( log I P ) d p Bσ z 2 1 (8.16) and r BS,Di (τ, P b ) = (1 τ) T B ( log I P ) b p Bσ z 2 2, (8.17) respectively, where I p is the packet size, P d and P b are the transmission powers of DT i and base station respectively, and represents the floor function. The parameter τ is determined from r Di,BS = r BS,Di. Reuse Mode The second category is the reuse mode, and the system model is shown in Fig In this model, two transmitter-receiver pairs share the same channel, and they inflict interference on each other. According to the types of users sharing the channel, reuse mode includes uplink reuse mode, downlink reuse mode, and D2D reuse mode. In 238

258 Figure 8.7: System model in the dedicated mode the uplink reuse mode, a cellular uplink shares the channel with a D2D direct link; in the downlink reuse mode, a cellular downlink shares the channel with a D2D direct link; in D2D reuse mode, two pairs of D2D users transmit in the same channel. The received signal at each receiver follows the form y = hx + h inter x inter + n, (8.18) where x is the desired signal, h is the fading coefficient of the channel between this receiver and its corresponding transmitter, x inter is the interference signal, h inter is the fading coefficient of the interfering link, and n is the Gaussian noise. Treating the interference as noise, we can express the instantaneous packet transmission rates in these two links as r 1 (P 1, P 2 ) = ( T B log I p P 1 z 1 Bσ 2 + P 2 z 21 ) (8.19) and r 2 (P 1, P 2 ) = ( T B log I p P 2 z 2 Bσ 2 + P 1 z 12 ), (8.20) respectively. In (8.19) and (8.20), P 1 and P 2 denote the transmission powers of Tx 1 and Tx 2, respectively. Dedicated Mode The third category is the dedicated mode, which is depicted in Fig In this model, one transmitter-receiver pair occupies a channel without sharing it with others, and 239

259 the packet arrival rate is represented by R. Depending on the type of the transmission link occupying the channel, this category includes uplink dedicated mode, downlink dedicated mode, and D2D dedicated mode. Since there is no interference, the received signal at the receiver also follows the form given in (8.15), and the instantaneous rate can be expressed as r(p ) = ( T B log I P ) p Bσ z, (8.21) 2 where P is the transmission power Scheduling with Convex Delay Cost Method Convex Delay Cost Function The convex delay cost approach was proposed in [55], where a monotonic increasing convex cost function was employed for the packet delay. In our analysis, we use the same convex cost function proposed in [56]. More specifically, for the j th packet in the i th buffer with delay d i,j, the delay cost is given by C i,j (d i,j ) = ( di,j D i ) α, (8.22) where D i is the corresponding delay threshold for the packets in the i th buffer, and α 0 is a relaxation parameter. As α increases, the cost grows faster when the delay increases beyond the threshold. At the beginning of each slot, the delays for newly arrived packets are set as 0. 5 At the end of each slot, packet delay for all packets that have not been sent increases by 1. If we denote the length of the packet queue in the i th buffer as l i, then the 5 When α = 0, the cost of a new packet is defined as 1 240

260 overall cost of the entire cellular network can be expressed as C = 2N c+n d i=1 l i j= Scheduling Decisions and Utility C i,j (d i,j ). (8.23) The decision of a single channel assignment can be denoted by a set of active links that occupy this channel. Therefore, for the 3 dedicated modes, the corresponding decisions only contain a single direct link; for the D2D cellular mode, the decision contains a two-hop channel; for the 3 reuse modes, the decisions contain two direct links sharing the same channel. For each channel, there are overall N DC = 2N d N c + 2(N d + N c ) + N d (N d 1)/2 different possible decisions at most, and we enumerate all these decisions from 1 to N DC. For each channel, we select the optimal decision from all possible candidates, that minimizes the overall cost. If the system does not make any decision, then the overall cost would become C 0 = 2N c +N d i=1 l i j=1 C i,j (d i,j + 1), (8.24) at the end of the current slot. If the k th decision is selected, then the overall cost would be C k = 2N c+n d i=1 l i j=µ i,k +1 C i,j (d i,j + 1), (8.25) where µ i,k is the instantaneous departure rate (in packets/slot) from the i th buffer. We define the utility of the k th decision as U k = C 0 C k = 2N c +N d i=1 µi,k C i,j (d i,j + 1). (8.26) j=1 241

261 Since the decision with the highest utility can minimize the overall cost, the scheduling algorithm allocates each channel to the transmission link(s) giving the highest utility. In the next subsection, we discuss the utility maximization for each type of decision Utility Maximization and Scheduling Algorithm In previous subsections, we have characterized the instantaneous transmission rates for all possible modes, and formulated the cost function and utility. In this subsection, we first provide utility maximization algorithms for all possible modes, and propose our scheduling algorithm. In our scheduling algorithm, utility maximization is the same as power optimization, in which we find the optimal transmission powers that maximize the utility for a scheduling decision Utility Maximization in Dedicated Modes For uplink, downlink and D2D dedicated mode, there is only one direct transmission link occupying the channel, and the instantaneous transmission rate r(p ) is given by (8.21). In order to maximize the utility, the transmitter just transmits with its maximum power. Suppose that Fig. 8.7 describes the k th decision, in which the channel is occupied by a direct transmission link with maximum power P max. The instantaneous departure rate is µ i,k = min {r(p max ), l i }. (8.27) Then the maximum utility is given by µ i,k U k = C i,j (d i,j + 1). (8.28) j=1 242

262 Utility Maximization in D2D Cellular Mode In D2D cellular mode, a pair of D2D users transmit through a two-hop channel, and the base station works as the relay. In order to maximize the utility, the D2D transmitter and base station transmit with their maximum power. Assume that Fig. 8.5 describes the k th decision. By setting P d = ˆP d and P b = ˆP b, the instantaneous transmission rates r Di,BS and r BS,Di are given by (8.16) and (8.17), respectively. The optimal τ value is given by τ = log 2 (1 + ˆP b Bσ 2 z 2 ) log 2 (1 + ˆP d Bσ 2 z 1 ) + log 2 (1 + ˆP b Bσ 2 z 2 ), (8.29) which arises from r Di,BS = r BS,Di. Inserting the optimal τ value back into (8.16), we get the instantaneous departure rate as µ i,k = min {r Di,BS(τ, ˆP } d ), l i, (8.30) and the maximum utility is also given by (8.28) Utility Maximization in Reuse Modes In uplink, downlink and D2D reuse modes, two direct links transmit simultaneously in the same channel. Suppose that Fig. 8.6 describes the k th decision, in which the channel is occupied by two direct links, connecting with the i th 1 and i th 2 buffer, respectively. The instantaneous transmission rates r 1 and r 2 are given by (8.19) and (8.20), respectively, and the departure rates of these two buffers are given by µ is,k = min {r s, l is }, (8.31) 243

263 for s = 1, 2. The utility is given by U k = µi 1,k j=1 C i1,j(d i1,j + 1) + µi 2,k j=1 Then the optimization problem can be formulated as C i2,j(d i2,j + 1). (8.32) Maximize P1,P 2 U k Subject to 0 P 1 P 1max (8.33) 0 P 2 P 2max, (8.34) which is not a convex optimization problem. In addition, derivative-based algorithms are not applicable, due to the floor function in (8.19) and (8.20). Since the transmission rates r 1 and r 2 only take integer values, the optimization problem can be solved by an efficient search algorithm. If we fix the value of r 1, then the maximum utility is achieved when r 2 is maximized. Therefore, we can search over all possible r 1 values, and for any given r 1, the optimal utility is given by the power values P 1 and P 2 which maximize r 2. When we fix r 1, we can get P 1 = (2 r 1 Ip T B 1 ) (P 2 z 21 + Bσ 2 )/z 1 (8.35) and r 2 = T B log I p ( Bσ 2 + z 12 z 1 2 r 1 Ip T B 1 P 2 z 2 ) (8.36) (P 2 z 21 + Bσ 2 ) from (8.19) and (8.20). We can easily verify that r 2 given by (8.36) is an increasing function of P 2. Therefore, r 2 is maximized when P 2 achieves its upper bound. From (8.33) and (8.35), we 244

264 Utility Maximization for Reuse Mode Table 8.4: Algorithm Initialization: (a) Set r 2 = 1 and P 1 = P 1max. Solve P 2 from 8.20, find r 1max and µ i1,k max using (8.19) and (8.31), respectively. Set the optimal utility value Uk as 1. (b) If µ i1,k max = 0,, and then end the utility maximization process. 2. Searching: For r 1 = 1 : µ i1,k max (a) Compute the transmission power P 1 and P 2 using (8.35) and (8.38), respectively. Then find r 2 and µ i2,k from (8.20) and (8.31), respectively. (b) Compute the utility U k using (8.32). If U k > Uk, update the optimal utility value as Uk = U k. end get one upper bound on P 2 expressed as P 2 z 1 P ( 1max ) Bσ2. (8.37) z 21 2 r 1 I p T B 1 z 21 Combining this result with (8.34), the optimal P 2 is given by P2 = min P 2max, z 1 P ( 1max z 21 2 r 1 Ip T B 1 ) Bσ2. (8.38) z 21 The overall utility optimization algorithm is given in Table Scheduling Algorithm After providing the utility maximization algorithms for all possible modes, we now propose our scheduling algorithm. At the beginning of each slot, we schedule the channels one by one. For each channel, we compute the maximum utility values that each decision can achieve, and assign the channel according to the decision that gives the maximum utility value. The algorithm is described in Table 8.5. In order to have 245

265 Table 8.5: Algorithm 8.5 Scheduling Algorithm Randomly enumerate the channels from 1 to N. For n = 1 : N end 1. Using the fading coefficients in the n th channel, compute the maximum utility values for all decisions in the possible decision set. In the first loop, there are N DC possible decisions. 2. Assign the n th channel according to the decision that gives the maximum utility. 3. Remove all decisions that contain the links in the selected decision from the possible decision set. Therefore, the links in the selected decision would not be selected by other channels. lower complexity and high performance, we randomly enumerate the channels so that the order of channels would not affect our algorithm. Assume the number of users is N u = N c + N d, then it is easy to show that N DC 16N 2 u+40n u+1 24, where equality is achieved at N d = 4Nu 1 6. Therefore, the complexity of this algorithm is o(nn DC ), which is also equal to o (NN 2 u)) Numerical Results In this subsection we further investigate the performance of our scheduling algorithm via numerical results. In our numerical results, we use Monte Carlo simulations to obtain the delay violation probability, average delay, and average throughput. We consider Rayleigh fading with path loss E{z} = d 4, where d is the transmission distance, and we fix N = N c = N d = 3, ˆP c = ˆP d = db, ˆP b = 30 db, σ 2 = 0 db. For all users, we select the delay thresholds as 30 time slots. We use constant arrival rate in the simulations, and the arrival rates are set as R = ρne{r Direct }/(2N c + N d ), where r Direct is the transmission rate of the corresponding direct link, and ρ is the intensity parameter. When generating the position of each node, we fix the base station at the origin of the coordinate axes, and place the D2D and cellular users randomly in the cell with coverage radius equal to 3. In order to eliminate the influence of the random positions, we generate 100 systems randomly, in which all 246

266 ρ=0.95 ρ=1.00 ρ=1.20 Delay violation Probability α Figure 8.8: Delay violation probability vs. α nodes are uniformly distributed, and each point we plot in our figures is obtained by averaging the results from these 100 systems. Each simulation is conducted over time slots. In Figs. 8.8 and 8.9, we plot the delay violation probability and average delay as functions of the relaxation parameter α. From (8.22), we can see that as α increases, the importance of the maximum packet delay grows. When α = 0, the utility just depends on the transmission rate, and our scheduling algorithm becomes a greedy algorithm that maximizes the instantaneous transmission rate. When α is sufficiently large, the utility is mainly decided by the maximum packet delay of each buffer, and our algorithm would allocate the channel to the user(s) with the largest delay. From Figs. 8.8 and 8.9, we can see that α = 1 gives the best performance with very small delay and high throughput. The scheduling algorithm has degraded performance when the balance between the importance of instantaneous transmission rate and packet delay is broken. When α = 0, the algorithm does not use any delay information; when α is too large, the algorithm may often allocate the channels to the 247

267 ρ=0.95 ρ=1.00 ρ=1.20 Average delay (slot) α Figure 8.9: Average Delay vs. α users with large packet delay even when their channel conditions are not favorable, which lowers the throughput and increases the average delay. Figs and 8.11 show the advantage of using reuse modes. If we do not have reuse modes, the packet delay increases significantly. In order to satisfy the delay constraints and stabilize the system, the systems without reuse modes need to reduce their arrival rates, sacrificing the throughput. Therefore, by allowing D2D users to share channels with other users, D2D communication can have much better performance when deadline constraints are applied. 248

268 Delay violation probability ρ=1.4 ρ=1.4, without reuse modes ρ=1.5 ρ=1.5, without reuse modes α Figure 8.10: Delay violation probability vs. α Average delay (slot) ρ=1.4 ρ=1.4, without reuse modes ρ=1.5 ρ=1.5, without reuse modes α Figure 8.11: Average Delay vs. α 249

269 Figure 8.12: System model of C-RAN with ICI 8.3 Intercell Interference-Aware Scheduling for Delay Sensitive Applications in C-RAN System Model and Preliminaries System Model In this section, the uplink transmission in a C-RAN within an OFDMA setting is considered as shown in Fig There are N c cells in this network, and each cell is served by a base station with one RRH. RRHs are connected to a centralized BBU pool with multiple BBUs working cooperatively. All cells reuse N ch frequency bands/channels, and each channel has a bandwidth of B. The total number of mobile users in this network is fixed at N u, and users are assumed to be associated with their nearest RRHs. Each user is equipped with a buffer storing the arriving packets before sending them through the wireless uplink channels, and the size of each packet is assumed to be I p bits. All buffers are assumed to operate in a FIFO manner. The system is assumed to operate under delay constraints, and target delay of packets sent by the i th user is denoted by D i (time frames). Block fading is assumed in this section, in which the fading coefficients stay constant within one time frame with a 250

270 duration of T, and change across frames. Also it is assumed that the distributions of the fading coefficients are identical in different channels. At the beginning of each time frame, BBU pool allocates channel resources to the users using a scheduling algorithm. It is assumed that users keep silent until they get channel resources from the BBU pool, and the channel resources are returned back at the end of each time frame. There are 4 assumptions for the channel assignment: 1. The number of users is much greater than the number of available channels, N u N ch. In such a case, each user transmits using one channel at most. 2. Only the users that can satisfy the pair-wise interference constraints given in (8.47) can reuse the same channel resource. 3. Users associated with the same RRH cannot reuse the same channel resource. 4. The BBU pool is assumed to have perfect CSI, and it is also assumed to keep track of the buffer status (including the queue length and packet delay information) of each user. The first assumption addresses a heavy load scenario, in which all channels are reused by multiple users and ICI becomes a significant problem. In such a case, the assumption that each user transmits using one channel at most helps to reduce ICI caused by excessive frequency reuse. The second assumption limits the interference, and the third assumption guarantees that all interference comes from neighbouring cells. The last assumption guarantees that the BBU pool has enough information to conduct our scheduling algorithm. CSI is estimated at RRHs and sent to the BBU pool via optical fiber links. Information of the arrival rates at all users is also sent to the BBU pool via special feedback channels 6, and the BBU pool can track the queue status at each user. 6 We assume ideal feedback without delay and error. 251

271 Define Ψ j (t) as the set of users that use the j th channel in the t th time frame, and ξ i,j (t) as the indicator function that indicates whether the j th channel is assigned to the i th user in the t th time frame. In other words, ξ i,j (t) = 1 if i Ψ j (t), otherwise ξ i,j (t) = 0. According to our first channel assignment assumption, we have N ch j=1 ξ i,j(t) 1. Then for the t th time frame, the received signal corresponding to user i at its associated base station can be expressed as y i = h j i x i + k Ψ j (t),k i h j k,i x k + n j i (8.39) if ξ i,j (t) = 1. Above, x i represents the transmitted signal of user i, h j i denotes the fading coefficient of the channel between user i and its corresponding RRH, h j k,i denotes the fading coefficient of the interference channel between user k and the RRH associated with user i, and n j i is the background noise at the base station associated with user i which is assumed to follow an independent complex Gaussian distribution with zero mean and variance σ 2, i.e., n j i CN (0, σ2 ). The transmission rate of user i in the t th time frame is given by ) P i z r i (t) = T B log 2 (1 j i + Bσ 2 + k Ψ j (t),k i P kz j k,i bits/frame (8.40) where j is the index of the channel that is assigned to user i, P i represents the transmission power of user i, T is the duration of each time frame, B is the bandwidth of each channel, z j i = hj i 2, and z j k,i = hj k,i Convex Delay Cost and Utility In the convex delay cost approach, the cost function of a packet is formulated as an increasing convex function of its delay [55]. The high performance of this approach was shown in [56] for a single cell model without any interference. In our previous work [97], we designed a scheduling algorithm using the convex cost function provided in 252

272 [56] for a D2D communication setting, and verified via simulations that this approach has very good delay performance. Here, we define the cost of the j th packet in the buffer at user i as C j,i = d j,i D i, (8.41) where d j,i is the current delay of this packet, and D i is the target delay of user i. At user i, the number of packets that can be transmitted in the current time frame is µ i = min {l i, r i /I p }, (8.42) where l i is the number of packets waiting in the buffer at user i, I p is the size of each packet, and represents the floor function. The utility of user i is defined as µ i U i = C j,i, (8.43) j=1 and the utility of the system is defined as N u N u µ i U = U i = C j,i. (8.44) i=1 i=1 j=1 The utility given in (8.44) represents the total cost of the packets that can be transmitted to the base station in the current time frame. At the beginning of each time frame, the BBU pool runs a scheduling algorithm for channel assignment to maximize the utility. In the next subsection, a detailed discussion on our scheduling algorithm is provided. 253

273 8.3.2 ICI-Aware Scheduling Algorithm for C-RAN In this subsection, we introduce our scheduling algorithm. In each time frame, our scheduling algorithm assign channels to the users in a way that maximizes the utility given in (8.44). Since we consider a C-RAN architecture, the BBU pool has the knowledge of all fading distributions and cost functions of each packets, and it can allocate channel resources to all users in different cells together. Our scheduling algorithm can be divided into two steps, namely the user grouping step and channel matching step. In the first step, we divide all users into small groups such that the users in the same group reuse the same channel. In the second step, we match the channels to the user groups to maximize the utility User Grouping In the first step of our algorithm, we divide all users into small groups, and each group will be assigned a channel resource in the next step. Before channel assignment, we cannot compute the instantaneous transmission rates because the sets Ψ 1, Ψ 2,, Ψ Nch have not been determined yet. Therefore, we use a rate estimator ˆr i = 1 m t 1 τ=t m r i (τ) (8.45) instead. This rate estimator is essentially the average rate over the most recent m time frames. Plugging (8.45) into (8.42) and (8.43), we obtain the utility estimator of user i as ˆµ i Û i = C j,i = j=1 min{l i, ˆr i /I p } j=1 C j,i. (8.46) In order to control ICI, we assume that any two users (i 1 and i 2 ) reusing the same 254

274 Table 8.6: Algorithm 8.6 Input: γ, transmission power and utility estimator of each user, the fading coefficients. Output: User groups GP 1, GP 2,, GP Ng. Collect the utility estimators Ûi into a vector V = [Û1, Û2,, ÛN u ]. Set k = 1 While max(v) 0 Set V = V and GP k = While max(v ) 0 i = arg max(v ) Add user i into GP k. Set V(i) = 1 and V (i) = 1. For j from 1 to N u Set V (j) = 1 if user i and j cannot satisfy the interference constraints given in (8.47) or they are associated to the same RRH. End End k = k + 1 End channel resource have to satisfy the pairwise interference/sinr constraints given by { } P E i1 z i1 Bσ 2 +P i2 z i2 γe,i 1 { } P E i2 z i2 Bσ 2 +P i1 z i1 γe,i 2 { } Pi1 z i1 Bσ 2 { } Pi2 z i2, (8.47) Bσ 2 where the parameter γ is between 0 and 1. Since the distributions of the fading coefficients are identical in different channels, the expected values of the SINRs and SNRs in (8.47) do not depend on the channel assignment result. The details of our user grouping algorithm is given in Table 8.6, and we denote the number of the output user groups as N g. At the beginning, we set group GP k as an empty set. Each time, we select the user with the maximum utility estimator and include it into GP k. After adding a user into a group, we kick out the users that cannot reuse the same channel resource with this selected user by setting V (j) = 1, which can be processed in parallel at 255

275 the BBU pool. Our grouping algorithm aims to collect the users with high utility estimators together, which helps to serve these users with less channel resources. Note that the number of groups N g might be smaller than the number of channels N ch. In such cases, some of the channels cannot be assigned to users, and we need to break those groups with large sizes into several small groups so that N g = N ch. To divide a big group into two small groups, we select half of the users with smaller utility estimator values within the large group, and let them form a new small group Channel Matching In the second step, we assign channels to the user groups via the maximum-weight matching approach. In this step, we find a matching between user groups and channels that maximizes the system utility given in (8.44). Let us define η i,j as the indicator of the channel assignment result, i.e., η i,j = 1 if channel j is assigned to GP i, and η i,j = 0 if channel j is not matched to GP i. Then the matching problem can be formulated as Maximize ηi,j U Subject to η i,j {0, 1} N ch η i,j 1 j=1 N g i=1 η i,j = 1. In graph theory, the maximum-weight matching problem can be solved by the Hungarian algorithm (Kuhn-Munkres algorithm) [92]. To use the Hungarian algorithm, we have to first construct the utility matrix U, in which each row corresponds to a user group and each column corresponds to a channel. The element of this matrix U i,j is the sum utility of the users in GP i if the j th channel is assigned to that 256

276 group. The elements of the utility matrix can be computed in parallel at the BBU pool. After constructing the utility matrix, the Hungarian algorithm is applied, and channels are assigned to the users Summary and Complexity Analysis In summary, we propose a two-step scheduling algorithm with good delay performance for a multi-cell C-RAN model. In the first step, we group the users to control the ICI and aim to collect the users with high utility estimator values into smaller number of groups. In the second step, we formulate the channel allocation problem as a maximum-weight matching problem, and assign the channel resources to the user groups using the Hungarian algorithm. Although our algorithm only considers an uplink scenario, it can also be easily adapted to a downlink scenario. Since we consider a C-RAN model, our algorithm is performed considering users in multiple cells, and parallel processing can be performed in some parts of our algorithm at the BBU pool to reduce time consumption. Compared with conventional resource allocation algorithms, in which cooperative processing among multiple cells is not considered, our algorithm has a significant potential to achieve better performance. Assume that the number of processers at BBU pool is Θ(N c ), then the time complexity of the user grouping step is O(Nu/N 2 c ). In the matching step, the time consumption for constructing the utility matrix is O(N g N ch /N c ), and the time consumption of the Hungarian algorithm is O(max{N g, N ch } 3 ). To further accelerate this process, we can replace the Hungarian algorithm with some heuristic algorithms with time complexity of O(min{N g, N ch }). As an example, in each iteration, we can select the maximum element in the utility matrix, and match its corresponding group and channel together. The overall time consumption of this algorithm depends on the relationship among N u, N c, N g and N ch. 257

277 ρ=0.6 ρ=0.7 Delay violation probability γ Figure 8.13: Delay violation probability vs. interference control parameter γ Numerical Results In this subsection, we further study the performance of our algorithm and the influence of parameters via simulations. In our simulations, we consider a C-RAN with 3 adjacent cells, each with a radius of 2. The coordinates of the RRHs of these three cells are ( 2, 0), (0, 2) and (2, 0), respectively. In each cell, there are 5 randomly placed users, and each one has the maximum transmission power P max Bσ 2 = 13 db. The number of available channels is N ch = 5. We assume Rayleigh fading with path loss E{z} = s 4, where s represents the distance between the transmitter and the receiver. Each point on the curves is determined by taking the average over the results of 500 systems with randomly placed users, and the performance result of each system is evaluated over time frames. In Figs and 8.14, we study the influence of the interference control parameter γ, which is used in the pairwise interference constraints expressed in (8.47). The arrival rate at user i is set as λ i = ρe{t B log 2 (1 + P i z i /Bσ 2 )}, where the parameter ρ is the arrival intensity. The target delay is 25 time frames for all users, and all 258

278 Average throughput (packets/user/frame) ρ=0.6 ρ= γ Figure 8.14: Throughput vs. interference control parameter γ users transmit at their maximum power level. When γ is small, the ICI is not well controlled and the average transmission rate is not maximized. As γ increases, the system achieves lower delay violation probability and higher throughput due to better ICI management. However, when γ is too large, the interference constraints become too strict, which leads to less frequency reuse. In such cases, the throughput becomes smaller and the delay violation probability increases. In Figs and 8.16, we analyze the influence of power control on our algorithm. In several conventional ICI control algorithms such as SFR, cell center users transmit with small power to reduce the interference they cause to the cell edge users. In these two figures, the transmission power of user i is selected as P i = P max (s i /R cell ) α, where s i is the distance between the user and its corresponding RRH, and R cell is the radius of the cell. As α increases, cell center users are restricted to transmit with smaller power. Also, all arrival rates are set as λ = 1.5E{T B log 2 (1+P max z edge /Bσ 2 )}, where E{T B log 2 (1+P max z edge /Bσ 2 )} is the average transmission rate of a user at the edge of its associated cell. In Figs and 8.16, we notice that as α increases, both delay and throughput performances become worse. Our algorithm control the interference in the 259

279 Delay violation probability α Figure 8.15: Delay violation probability vs. power control parameter α 16 Average throughput (packets/user/frame) α Figure 8.16: Throughput vs. power control parameter α 260

280 Table 8.7: Comparison between our algorithm and SFR user grouping step. Users that cannot satisfy the pairwise interference constraints are not allowed to reuse the same channel resource. Further decrease in the transmission power of the cell center users reduces their transmission rates, making it more difficult to stabilize the system. Finally, we compare our algorithm with the conventional SFR scheme introduced in [72]. The arrival rates are set in the same way as in Figs and 8.14, and the target delay is 40 for all users. In our algorithm, all users transmit with maximum power. In the SFR scheme, users transmit with full power in the edge bands and they use 70% of their maximum power in the center bands. Channel assignment is conducted at the BBU of each cell individually to maximize the sum utility of the users in that cell. The results are provided in Table 8.7. As the arrival intensity increases, the advantage of our algorithm becomes obvious in terms of the average delay. With the C-RAN architecture, cooperative processing over multiple cells enhances the delay performance significantly. 261

281 Chapter 9 Conclusion 9.1 Summary In this thesis, we have studied the delay QoS provisioning and optimal resource allocation for wireless networks. The contributions of this thesis are summarized below. In Chapter 3, we have analyzed the throughput and energy efficiency of HARQ protocols in the presence of statistical queuing requirements when the QoS exponent θ is sufficiently small via Taylor expansion. In Section 3.1, we have investigated the throughput of HARQ-IR in the presence of queuing constraints imposed as limitations on buffer overflow probabilities. Using the statistical properties of the renewal counting process, we have identified the first-order expansion of the effective capacity of HARQ-IR in terms of the QoS exponent θ. We have taken into account hard deadline constraints by imposing an upper bound on the number of HARQ rounds to send a message. We have discussed that the main result on the first-order expansion of the effective capacity holds in the presence of deadline constraints with a modified description of the transmission time. Through numerical results, we have demonstrated that increasing the transmission rate R improves the throughput 262

282 monotonically in HARQ-IR and makes it approach the throughput of a system with perfect CSI at the transmitter, while it initially improves and then lowers the throughput of Type-I HARQ and HARQ Chase Combining protocols. We have also observed that increased throughput with larger R comes at the expense of longer transmission time or equivalently larger number of HARQ-IR rounds. We have shown that the throughput degrades when stricter queuing constraints or hard-deadline constraints are imposed. In particular, we have demonstrated that monotonic growth in the throughput with increasing R is not experienced in the presence of deadline limitations. In Section 3.2, we have analyzed the energy efficiency of the HARQ-CC scheme under outage, deadline, and statistical queuing constraints in the low-power and low-θ regimes by employing the notions of effective capacity and effective bandwidth from the stochastic network calculus while considering both constant-rate and random data arrivals to the buffer. Two queue models are considered. When outage happens, the transmitter discards the packet in the second queue model, while it transmits the same packet later in the first queue model. First, we have determined the minimum energy per bit and wideband slope achieved with HARQ-CC for fixed outage probability and both constantrate and Markov source models. Also, we have provided comparisons among different random arrival models. Analyzing the results, we have concluded that source burstiness does not affect the minimum energy per bit when ON-OFF discrete time and Markov fluid sources are considered. On the other hand, due to the Poisson arrivals and the resulting higher level of burstiness, MMPS is shown to have worse energy efficiency compared to the ON-OFF Markov fluid source. Moreover, among the considered arrival models, MMPS is the only source for which the minimum energy per bit depends on the QoS exponent θ and grows with stricter QoS constraints. In contrast to the characterizations 263

283 regarding the minimum energy per bit, we have shown that wideband slope in all cases varies with the QoS exponent θ and source statistics. The impact of source burstiness is clearly identified with additional terms introduced in the denominators of the wideband slope expressions. In Chapter 4, throughput of HARQ under statistical queuing constraints has been studied via the recurrence approach proposed in [15]. Compared with the low-θ approximation used in Chapter 3, recurrence approach is accurate for any QoS exponent value. In Section 4.1, we have analyzed the throughput of the HARQ-CC scheme under outage, deadline, and statistical queuing constraints by employing the notions of effective capacity and effective bandwidth from stochastic network calculus, while considering both constant-rate and random data arrivals to the buffer. Two typical queue models are considered. First, we have determined the throughput for the constant-rate arrival model. Then, with the effective capacity expression we obtained for the constant-rate arrival model, we have further formulated the throughput for the ON-OFF discrete-time and fluid Markov sources and MMPS. We have verified our analytical characterizations via Monte Carlo simulations. Finally, with the help of numerical results, we have compared the throughput values for the two considered queue models, and further investigated the impact of the deadline constraints, outage probability, queuing constraints and source randomness on the throughput. In Section 4.2, we have studied the throughput of HARQ-IR with finite blocklength codes, deadline limits, and statistical queuing constraints by employing the notions of effective capacity and effective bandwidth from stochastic network calculus. Two different arrival models, namely the constant-rate and ON-OFF discrete time Markov arrivals, have been studied, and throughput characteriza- 264

284 tions have been obtained for both arrival models. We have first characterized the distribution of the duration of the transmission period and outage probability, and determined the effective capacity using the results from the recurrence relation approach. Subsequently, we obtained the throughput expressions for both constant-rate and ON-OFF discrete time Markov arrival models. Our characterizations have been verified via Monte Carlo simulations. Finally, we have further investigated the impact of the deadline constraints, fixed transmission rate, queuing constraints and blocklength via numerical results. In Chapter 5, we have investigated the throughput of cooperative relay networks under statistical queuing constraints. Three types of cooperative relay networks are considered, namely two-hop relay channel, two-way relay channel and multi-source multi-destination relay network. In Section 5.1, we have investigated the throughput of the buffer-constrainted two-hop relay channel in the finite blocklength regime. We have initially characterized the system throughput through effective capacity analysis. Subsequently, we have formulated the throughput maximization problem, and investigated the properties of the optimal error probabilities for given time allocation parameter τ. Based on these properties, we have proposed an search algorithm in order to determine the optimal parameter setting more efficiently compared to directly searching in the three-dimensional bounded τ ϵ 1 ϵ 2 space. Finally, we have provided numerical results and investigated the impact of the source-relay distances, QoS exponents, and the blocklength on the throughput. In Section 5.2, we have investigated the throughput of two-way relaying under queueing constraints at both the source nodes and the relay. We have initially identified the instantaneous transmission/service rates in the multiple-access and broadcast phases of two-way relaying, and considered the stability condi- 265

285 tions and their impact on the system parameters (τ, ρ). Subsequently, we have defined the throughput region and provided characterizations of the maximum arrival rates, that can be supported by the sources, in terms of the QoS exponents and resource allocation parameters τ and ρ. We have provided numerical results and investigated the impact of the source-relay distances, signal-to-noise parameters, QoS exponents, and time-sharing between different decoding orders on the throughput. In Section 5.3, we have studied the throughput of multi-source multi-destination relay networks under statistical queueing constraints, for both cases of with and without CSI at the transmitter sides. When there is perfect CSI at the transmitter, transmission rates can be varied according to the instantaneous channel conditions. We have characterized the instantaneous channel capacities in different phases as functions of the system parameters τ, ρ and δ. When CSI is not available at the transmitter side, transmissions are performed at fixed rates, and decoding failures lead to retransmission requests via an ARQ protocol. We have modeled the links to be in ON or OFF states depending on the reliability of the reception. We have determined the probabilities of these states. Following these characterizations, we have described, for both perfect and no CSI cases, the stability conditions, and defined the feasible region of the transmission parameters. Finally, we have characterized the arrival rates under queueing constraints at the source and relay nodes as a function of the QoS exponents, channel fading and system parameters for both cases. In addition, the concavity of the throughput function is shown with respect to the system parameters δ and τ for the variable-rate scheme. We have verified the theoretical results via Monte Carlo simulations. Numerically, for the variable-rate model, we have investigated the optimal position of the relay node. Also, the throughput region is obtained via searching over the three dimensional parameter space. 266

286 In Chapter 6, we have investigated the mode selection between half-duplex and full-duplex modes in two-way MIMO systems operating under queuing constraints. To have a fair comparison, each antenna in full-duplex mode is assumed to transmit and receive at the same time so that it can have the same number of transmitting antennas as in half-duplex mode. We have characterized the system throughput for both half-duplex and full-duplex modes with given input covariance matrices. In the low-snr regime, we have proposed an iterative algorithm to find the optimal input covariance matrices, which achieves the smallest E b N 0 min of the system. In the numerical results, we have found that full-duplex mode has better performance at low SNRs and short distances because two users can transmit simultaneously and make more efficient use of the resources. On the other hand, half-duplex mode has better performance in the high-snr regime and at long distances. In Chapter 7, we have studied the mode selection and resource allocation algorithms for D2D cellular networks. In Section 7.1, we have studied the mode selection and resource allocation in a TDM cellular network with one cellular user and one pair of D2D users operating under queueing constraints. For all four possible modes, namely the cellular mode, dedicated mode, uplink reuse mode, and downlink reuse mode, we have first formulated the system throughput using the effective capacity, and proposed efficient throughput maximization algorithms. Via numerical results, we have analyzed the influence of the positions of each node. In Section 7.2, we have proposed, for D2D cellular networks, a joint mode selection and channel allocation algorithm that maximizes the system throughput under statistical queueing limitations and average SINR constraints. First, we have characterized the instantaneous rate and effective capacity for all possible modes, namely the D2D cellular mode, D2D dedicated mode, uplink dedicated mode, downlink dedicated mode, uplink reuse mode, downlink reuse mode, 267

287 and D2D reuse mode. Then, we have proposed our channel matching algorithm, which selects modes for each user and allocates the channels simultaneously. Finally, we have further studied the performance of our algorithm by comparing its throughput and the numbers of unserved users with the random allocation algorithm through simulation results. In the results, we have demonstrated that our new algorithm can achieve higher throughput and serve more users. In Section 7.3, we have proposed a joint mode selection and resource allocation algorithm for D2D underlaid cellular networks. We have decomposed the problem into three subproblems, and designed algorithms for each subproblem. In the first step, we divide the transmission links into small groups using vertex coloring algorithm. In the second step, we solve the power optimization problem using the interior-point method for each group and conduct mode selection for those D2D links which form a group, and we assign channel resources in the final step. Via simulation results, we have compared the performance of our algorithm with that of the coalitional game method, and have shown that our algorithm achieves higher sum rate and serves more users with relatively small time consumption. Also, the influence of the interference threshold step size γ is studied through numerical results, and the tradeoff between sum rate and the number of served users is identified. In Chapter 8, we have studied the delay performance of content delivery over wireless cellular networks. Three types of network models are considered, including D2D caching network, D2D cellular network, and C-RAN. In Section 8.1, we have proposed a caching algorithm for D2D cellular networks, which minimizes the weighted average delay. First, we have characterized the popularity model and average transmission delay of a request. Then, we have formulated the delay minimization problem and developed our algorithm which 268

288 can solve the weighted average delay minimization problem efficiently. We have also extended our algorithm for a more general scenario, in which the distributions of fading coefficients and system parameters change over time. Finally, we have further investigated the performance of our algorithm by comparing it with a naive algorithm which simply caches the most popular files at each user. By applying both algorithms to two different popularity models, we have shown that our algorithm is more robust to variations in the popularity models, and can achieve better performance, because the proposed algorithm can more effectively take advantage of D2D communications. Also, the influence of the popularity parameter, caching size and number of users is studied via numerical results. In Section 8.2, we have proposed a scheduling algorithm for D2D cellular networks with deadline constraints. First, we have characterized the instantaneous packet rates for all possible modes, namely the D2D cellular mode, D2D dedicated mode, uplink dedicated mode, downlink dedicated mode, uplink reuse mode, downlink reuse mode, and D2D reuse mode. Then, we have formulated the cost function and defined utility for all possible decisions. Each scheduling decision can be regarded as a result of joint mode selection and channel allocation. For each type of decision, we have provided a power allocation algorithm to maximize the utility. Our algorithm allocates each channel according to the decision that provides the maximum utility value. Finally, we have further studied the performance of our algorithm through numerical results. In the results, we have provided characterizations on the optimal value of the system parameter α, and verified the advantages of having D2D users sharing the channel with other users. In Section 8.3, we have proposed an ICI-aware scheduling algorithm for the 269

289 C-RAN architecture that minimizes the sum delay cost of the system. The procedure is divided into two steps, namely the user grouping step and the channel matching step. In the user grouping step, we have designed a grouping algorithm that partitions all users in the network into small groups by checking their pairwise interference levels. In order to serve those users with high utility values with less channel resources, our grouping algorithm aims to collect users with high utility estimator values into small number of groups. In the channel matching step, we have formulated the channel assignment problem as a maximum-weight matching problem, which can be solved using the Hungarian algorithm. In the second step, user groups are matched to the available channel resources with goal of maximizing the system utility. Finally, we have studied the impact of the interference threshold and power control parameter via simulations, and compared our algorithm to the conventional SFR scheme. With the advantages of cooperative processing and information sharing over multiple cells, it has been verified that our algorithm designed for C-RAN can achieve higher throughput and lower delay. 9.2 Future Research Directions Popularity Estimation and Scheduling for D2D Caching Systems In Section 8.1, we assume that the base station has perfect knowledge about the popularity matrix. However, it is very difficult for the base station to predict the popularity of each content for each user. A reasonable popularity estimation and tracking algorithm should be designed, in order to apply our proposed caching algorithm. Also, it is of interest to combine the estimation and caching algorithms with the scheduling algorithm proposed in Section 8.2, to make our work more complete. 270

290 Appendix A A.1 Proof of Theorem 1 The proof of Theorem 1 is based on the Taylor expansion of the cumulant generating function, which expresses C e as a polynomial function of θ. After deriving the zeroth and first order coefficients of this polynomial, a closed-form expression for the firstorder expansion is obtained for small θ. Before finding the polynomial approximation, we need to show that the moments of the random transmission time T are finite. Let us denote the j th moment of the random transmission time T by µ j = E{T j }. (A.1) The following characterization shows that T has finite support for any fixed transmission rate and therefore we have µ j < for all 1 j <. Lemma 1 If the expected value of the instantaneous capacity is strictly greater than zero, then for any fixed transmission rate R, the random transmission time T of HARQ-IR has finite support. Hence, all of its moments are finite. Proof 7 Since {z i } is a sequence of i.i.d. random variables, by the strong law of large numbers [98, Section 7.4], we have 1 n n i=1 T sb log 2 (1 + SNRz i ) converge to 271

291 E{T s B log 2 (1 + SNRz i )} = E{C i } almost surely, i.e., we have ( Pr lim n 1 n n i=1 ) T s B log 2 (1 + SNRz i ) = E{C i } = 1. (A.2) This almost sure convergence implies that with probability one, for any given ε > 0, there exists a positive integer n 1 such that for all n n 1 1 n n T s B log 2 (1 + SNRz i ) E{C i } ε. i=1 (A.3) or equivalently E{C i } ε 1 n n T s B log 2 (1 + SNRz i ) E{C i } + ε. i=1 (A.4) Hence, under the assumption that E{C i } > 0, we have the following lower bound with probability one for some 0 < ε < E{C i }: 1 n n T s B log 2 (1 + SNRz i ) E{C i } ε > 0. i=1 (A.5) Next, we consider a bound on R. For a fixed transmission rate R, we have n R lim n n = 0. (A.6) Therefore, for any ε 2 > 0, there exists an integer n 2 n 1 such that for all n n 2, we have R n ε 2. (A.7) Choosing ε 2 = E{C i } ε and using the bound in (A.5), we have for all n n 2 that R n 1 n n T s B log 2 (1 + SNRz i ) i=1 (A.8) 272

292 or equivalently R n T s B log 2 (1 + SNRz i ) i=1 (A.9) with probability one for all n n 2. According to the condition of successful decoding in equation (3.4), (A.8) implies that the random transmission time T for reliably sending R bits is upper bounded by n 2 with probability one, i.e., Pr(T n 2 ) = 1. Hence, for any given fixed transmission rate R, T has finite support as claimed in the lemma. Hence, the moments µ j = E{T j } n j 2 < are finite for all 1 j <. For HARQ Chase Combining, we can also show that all the moments of T are finite. Similar approach can be applied to chase combining protocol. Lemma 2 If the expected value of z i is strictly greater than zero, then for any fixed transmission rate R, the random transmission time T of HARQ Chase Combining has finite support. Hence, all of its moments are finite. Proof 8 By the strong law of large numbers [98, Section 7.4], we have 1 n n i=1 z i converge to E{z i } almost surely, i.e., we have ( Pr lim n 1 n n i=1 ) z i = E{z i } = 1. (A.10) Similar to the proof of Lemma 1, for any given E{z i } > ε > 0, there exists a positive integer n 1 such that for all n n 1, we can have 1 n n z i E{z i } ε > 0. i=1 (A.11) For a fixed rate R, (2 R TsB 1 ) /SNR is also a constant. Then we have lim n (2 R TsB 1 ) /(n SNR) = 0. (A.12) 273

293 Therefore, for any ε 2 > 0, there exists an integer n 2 n 1 such that for all n n 2, we have (2 R TsB 1 ) /(n SNR) ε 2. (A.13) Choosing ε 2 = E{z i } ε and using the bound in (A.11), we have for all n n 2 that (2 R TsB 1 ) /(n SNR) 1 n n i=1 z i (A.14) or equivalently n R T s B log 2 (1 + SNR z i ) i=1 (A.15) with probability one for all n n 2. Similar as in the HARQ-IR case, we have shown that for HARQ Chase Combining the transmission time T also has finite support, and hence all the moments of T are finite. Having shown the finiteness of all moments of T, the rest proof is the same for both HARQ-IR and HARQ Chase Combining. We next consider the cumulant generating function of N t, which is the logarithm of the moment generating function of N t, i.e., g(z) = log E{e znt }. (A.16) According to the theory of cumulant generating function, it can be expressed as g(z) = j=1 κ j (t) zj j! (A.17) where κ j (t) is the j th order cumulant of N t. Examining (3.8), we notice that effective 274

294 capacity is proportional to the cumulant generating function of N t and we can write 1 θt log e E{e θrn t } = 1 θt = j=1 j=1 κ j (t) t κ j (t) ( θr)j j! ( 1) j R j θ j 1. j! (A.18) (A.19) Applying the theory of cumulant generating function to our problem, the effective capacity can be expressed as C e = lim t 1 = lim t = j=1 θt log e E{e θrn t } κ j (t) ( 1) j R j θ j 1 t j! j=1 ) κ j (t) ( 1) j+1 R j θ j 1. t j! ( lim t (A.20) (A.21) (A.22) (by moving the limit inside the summation) It has been proven in [99] that if the moments of T are finite, then the j th cumulant of N t can be written as κ j (t) = a j t + b j + o(1) (A.23) for some constants a j and b j which depend on the moments of T. From this result, we conclude that κ j (t) lim t t = a j (A.24) 275

295 and hence C e = j=1 a j ( 1) j+1 R j j! θ j 1. (A.25) Furthermore, it has been shown in [99] and [100] that a 1 = 1 µ 1 and a 2 = µ 2 µ 2 1 µ 3 1 = σ2 µ 3 1 (A.26) where µ 1 = E{T } and µ 2 = E{T 2 } are the first and second moments of T and σ 2 is the variance of T. Plugging in these values into (A.25), we readily obtain C e = R R2 σ 2 θ + o(θ) µ 1 2µ 3 1 (A.27) where o(θ) denote the terms which decay faster than θ, i.e., lim θ 0 o(θ) θ the desired characterization in Theorem 1 is proved. = 0. Hence, 276

296 A.2 Proof of Theorem 2 Proof 9 For queue model I, the distribution of T Q1 is given by (3.32). Then, the expected value E{T Q1 } = µ Q1 can be found as µ Q1 = = = = t Pr{T Q1 = t} t=1 M v=1 k=0 (km + v) Pr{T Q1 = km + v} ( M ) (km + v)ε k Pr{V = v} v=1 k=0 ( ) M v Pr{V =v} ε k +M Pr{V =v} kε k v=1 v=1 k=0 k=0 = 1 M v Pr{V =v}+ Mε M Pr{V =v} 1 ε (1 ε) 2 = 1 1 ε M v=1 v=1 (A.28) (A.29) (A.30) (A.31) (A.32) v Pr{V = v} + Mε 1 ε. (A.33) Above, in (A.29), we replace t by km + v and sum over both k and v in order to more explicitly address possible violations of the maximum retransmission limit before successful packet transmission. Noting that k=0 εk = 1 1 ε (A.31) can be simplified to (A.32). and k=0 kεk = ε (1 ε) 2, Notice that M v=1 Pr{V = v} = Pr{V M} represents the probability that the transmission has been completed before violating the deadline constraint M, and hence is equal to 1 ε. Applying this fact to (A.32), we obtain (3.39). Similarly, the variance of T Q1 is given by σ 2 Q1 = E{T 2 Q1} µ 2 Q1 (A.34) 277

297 where E{T 2 Q1} = = t 2 Pr{T Q1 = t} t=1 ( M ) ε k (km + v) 2 Pr{V = v} v=1 = 1 1 ε k=0 M v 2 Pr{V = v} + 2Mε (1 ε) 2 v=1 M v=1 (A.35) (A.36) v Pr{V = v} + M 2 ε(1 + ε) (1 ε) 2. (A.37) Akin to the steps applied from (A.28) to (A.33), we again sum over km +v in (A.36), and then compute several summation terms with respect to k. Subsequently, using the fact that M v=1 Pr{V = v} = 1 ε, we obtain (A.37). Similarly, the distribution of T Q2 is given by (3.38) for queue model II, and we can directly obtain that µ Q2 = M t Pr{T Q2 = t} = t=1 M t Pr{V = t} + Mε, t=1 (A.38) and σq2 2 = E{TQ2} 2 µ 2 Q2 M = t 2 Pr{T Q2 = t} µ 2 Q2 = t=1 M t 2 Pr{V = t} + M 2 ε µ 2 Q2. t=1 (A.39) (A.40) (A.41) A.3 Proof of Theorem 3 Proof 10 In order to derive the minimum energy per bit and wideband slope expressions, we need to obtain the first and second derivatives of the throughput r avg (SNR) with respect to SNR at zero SNR. For the constant-rate arrival model, r avg or equivalently the effective capacity is given by (3.10). In this regard, for both queue models 278

298 I and II, the first and second derivatives of r avg (SNR) with respect to SNR, are given, respectively, by ṙ avg (SNR) = (1 + F 1 F 1 M (ε) M (ε)snr)µ log e 2 F ( 1 M (ε)θσ2 log e (1 + F 1 M (ε)snr)) 2 (1 + F 1 M (ε)snr)µ3 (log e 2) 2, (A.42) ( F 1 M r avg (SNR) = (ε)) 2 ( θσ 2 F 1 (1 + F 1 M (ε)snr)2 µ 3 (log e 2) M (ε)) 2 2 (1 + F 1 M (ε)snr)2 µ log e 2 ( F 1 M + (ε)) 2 θσ 2 log e (1 + F 1 M (ε)snr) (1 + F 1. (A.43) M (ε)snr)2 µ 3 (log e 2) 2 For queue model I, the corresponding µ and σ values are given by (3.39) and (3.40) respectively, and for queue model II, the corresponding µ and σ values are given by (3.41) and (3.42) respectively. Then, taking the limit as SNR 0 results in the following expressions: and r avg (0) = ṙ avg (0) = F 1 M (ε) µ log e 2, (A.44) ( F 1 M (ε)) 2 (θσ 2 + µ 2 log e 2) µ 3 (log e 2) 2. (A.45) Inserting the expressions in (A.44) and (A.45) into (3.26), (3.27), (3.28) and (3.29), the minimum bit energy and wideband slope for both queue models I and II are readily obtained. A.4 Proof of Proposition 1 Proof 11 Comparing (3.39) and (3.41), we obtain that µ Q1 = µ Q2 1 ε (A.46) 279

299 by replacing the summation index t with v in (3.41). Inserting (A.46) into (3.43) and comparing with (3.45), we obtain E b N 0 min Q1 = E b N 0 min Q2. Then, we rewrite (3.40) as σq1 2 = 1 M v 2 Pr{V =v}+ 2Mε M v Pr{V =v} + M 2 ε(1 + ε) µ 2 1 ε (1 ε) 2 (1 ε) 2 Q1 (A.47) v=1 v=1 ( M ) 1 = v 2 Pr{V =v}+m 2 ε(1+ε) µ 2 (1 ε) 2 Q2 v=1 ( ) 2Mε M ε M + v Pr{V =v} v 2 Pr{V =v} (A.48) (1 ε) 2 (1 ε) 2 v=1 v=1 ( 1 2Mε M ε M ) = (1 ε) 2 σ2 Q2 + v Pr{V = v} v 2 Pr{V = v}. (1 ε) 2 (1 ε) 2 v=1 v=1 (A.49) From (A.47) to (A.48), we use the fact that 1 1 ε = 1 ε to break 1 (1 ε) 2 (1 ε) 2 v} into two terms. From (A.48) to (A.49), we apply the expression of σ 2 Q2 Also, for the last two terms in (A.49), we have 1 ε M v=1 v2 Pr{V = in (3.42). 2Mε (1 ε) 2 2Mε (1 ε) 2 = Mε (1 ε) 2 0 M ε M v Pr{V =v} v 2 Pr{V =v} (1 ε) 2 v=1 v=1 v=1 M v Pr{V =v} ε M vm Pr{V =v} (1 ε) 2 M v Pr{V = v} v=1 v=1 (A.50) (A.51) (A.52) (A.53) From (A.50) to (A.51), we use v 2 vm, because we only consider the summation of v from 1 to M. Using the above results, we get σ 2 Q1 1 (1 ε) 2 σ2 Q2 (A.54) 280

300 from (A.49). Applying (A.46) and (A.54) to (3.44) and (3.46), we conclude that S 0 Q2 S 0 Q1. A.5 Proof of Theorem 4 Proof 12 Plugging the effective capacity formulation obtained in Theorem 1 into (2.13), and then taking the derivative with respect to SNR and evaluating as SNR 0, we have / [ p22 ṙ(0) =ĊE(0) 2 + p ] 22(p 11 + p 22 ) 2(p 11 + p 22 1) 2(2 p 11 p 22 ) =ĊE(0)/P ON. (A.55) (A.56) In determining (A.55), we have used the fact that lim SNR 0 r(snr) = 0 and lim SNR 0 C E (SNR) = 0. Note that when the transmit power approaches 0, the departure rate should also go to 0, which in turn makes the effective capacity approach 0. To satisfy the queuing constraints, the arrival rate r in the ON state should also diminish to 0. In the proof of Theorem 3, we have shown that Ċ E (0) = F 1 M (ε) µ log e 2 order derivative of the throughput evaluated as SNR goes to 0 as. Therefore, we can have the first ṙ avg (0) = ṙ(0)p ON = ĊE(0) = F 1 M (ε) µ log e 2. (A.57) Similarly, by taking the second order derivatives of the arrival rate r with respect to SNR and evaluating as SNR 0, we obtain r(0) = θċe(0) 2 + C E (0) ĊE(0) 2 θ(ζ + 1) P ON (A.58) 281

301 where ζ is defined in (3.49). In the proof of Theorem 3, we show that CE (0) = F 1 M (ε)2 (θσ 2 +µ 2 log e 2) µ 3 (log e 2) 2. Therefore, we can find r avg (0) = r(0)p ON = F 1 M (ε)2 ( θζ θσ2 +µ 2 log e 2 µ ) (µ log e 2) 2. (A.59) Inserting the results in (A.57) and (A.59) into (3.26) and (3.27), and replacing µ and σ 2 by µ Q1 and σ 2 Q1 respectively, we get the desired results for queue model I in Theorem 4. Similarly, inserting the results in (A.57) and (A.59) into (3.28) and (3.29), and replacing µ and σ 2 by µ Q2 and σ 2 Q2 respectively, we obtain the desired results for queue model II. A.6 Proof of Theorem 5 Proof 13 The proof is similar to the proof of Theorem 4. Plugging (2.18) into (2.4), taking the first and second order derivatives and evaluating as SNR 0, we get ṙ(0) = ĊE(0)/P ON. (A.60) From ṙ(0), we get ṙ avg (0) as ṙ avg (0) = ṙ(0)p ON = ĊE(0) = F 1 M (ε) µ log e 2. (A.61) Furthermore, we have α r avg (0) = r(0) α + β = C 2β E (0) Ċ2 E(0)θ α(α + β) ( F 1 M = (ε) ) 2 ( θσ 2 + µ 2 log e 2 µ log e 2 µ (A.62) (A.63) + 2θβ ). (A.64) α(α + β) 282

302 Inserting the results in (A.61) and (A.64) into (3.26) and (3.27), and replacing µ and σ 2 by µ Q1 and σ 2 Q1 respectively, we obtain the desired results for queue model I in Theorem 5. Similarly, inserting the results in (A.61) and (A.64) into (3.28) and (3.29), and replacing µ and σ 2 by µ Q2 and σ 2 Q2 respectively, we get the desired results for queue model II. A.7 Proof of Theorem 6 Proof 14 Using the characterization in (2.24), and taking the first and second order derivatives and evaluating as SNR 0, we get ṙ avg (0) = P ON θα(α + β) α 2 (e θ 1)ĊE(0) = θ e θ 1ĊE(0), (A.65) and r avg (0) = θ e θ 1 C 2βθ 2 E (0) 1)Ċ2 (α + β)(e θ E(0) (A.66) where ĊE(0) = F 1 M (ε) µ log e 2 and C E (0) = F 1 M (ε)2 (θσ 2 +µ 2 log e 2) µ 3 (log e 2) 2. Inserting the results in (A.65) and (A.66) into (3.26) and (3.27), and replacing µ and σ 2 by µ Q1 and σ 2 Q1 respectively, we obtain the desired results for queue model I in Theorem 6. Similarly, inserting the results in (A.65) and (A.66) into (3.28) and (3.29), and replacing µ and σ 2 by µ Q2 and σ 2 Q2 respectively, we get the desired results for queue model II. 283

303 A.8 Proof of Theorem 11 Proof 15 In [17], the throughput of the half-duplex two-hop relay system under queuing constraints is given by { min R = min } 1 θ 1 Λ S,R ( θ 1 ), 1 θ 2 Λ R,D ( θ 2 ) { 1θ1 Λ S,R ( θ 1 ), 1θ1 ( Λ R,D ( θ 2 ) + Λ S,R (θ 2 θ 1 ) θ 2 θ 1 ) } (A.67) θ 2 > θ 1 when the stability condition is satisfied. In our finite blocklength regime, the instantaneous rate of the uplink is equal to 0 or τmr 1 bits per block with probabilities ϵ 1 and 1 ϵ 1, respectively. Therefore, we can write the LMGF of the S R link as Λ S,R (θ) = log E z1 {ϵ 1 e τθ 1m0 + (1 ϵ 1 )e τθ 1mr 1 } = log E z1 {ϵ 1 + (1 ϵ 1 )e τθ 1mr 1 }. (A.68) Similarly, the LMGF of the R D link can be simplified as Λ R,D (θ) = log E{ϵ 2 + (1 ϵ 2 )e (1 τ)θ 2mr 2 }. (A.69) Plugging (A.68) and (A.69) into (A.67), and normalizing over the blocklength m (to change the unit from bits per block to bits per channel use), we obtain the throughput in the finite blocklength regime as in (5.7). A.9 Proof of Theorem 15 Proof 16 On the ϵ 1 ϵ 2 plane, for an arbitrary point (ˆϵ 1, ˆϵ 2 ) inside the stability region, the line segment (ˆϵ 1, ˆϵ 2 ) (ϵ 1, ϵ 2) has an intersection point with the boundary of the stability region because (ϵ 1, ϵ 2) given in Theorem 13 is assumed to be outside the stability region, and we denote the coordinates of this intersection point as 284

304 (ˆϵ 1, ˆϵ 2). In [20], it was shown that as ϵ increases from 0 to 1, Λ( θ) first increases and then decreases after achieving its maximum value in the single-hop model. This property can be applied directly to Λ S,R ( θ 1 ) and Λ R,D ( θ 2 ) in our halfduplex two-hop system. Since ˆϵ 1 is between ˆϵ 1 and ϵ 1, and ˆϵ 2 is between ˆϵ 2 and ϵ 2, we have Λ S,R ( θ 1 ) ˆϵ Λ S,R ( θ 1 ) and Λ 1 ˆϵ1 R,D ( θ 2 ) ˆϵ Λ R,D ( θ 2 ). In 2 ˆϵ2 the proof of Theorem 13, we find that the throughput is a non-decreasing function of both Λ S,R ( θ 1 ) and Λ R,D ( θ 2 ). Therefore, the error probability pair (ˆϵ 1, ˆϵ 2) gives the same or higher throughput, compared to (ˆϵ 1, ˆϵ 2 ). This implies that for any error probability pair inside the stability region, there exists a point on the boundary of the stability region that achieves the same or higher throughput, when (ϵ 1, ϵ 2) is outside the stability region. Therefore, the maximum throughput is achieved on the boundary of the stability region. A.10 Proof of Theorem 16 Proof 17 We know that both S 1 D 1 and S 2 D 2 links are restricted by two queuing constraints, one at the corresponding source node, and the other one at the relay node. We consider these two constraints separately, and then combine the results. First, we only consider the constraints at the source nodes. According to (2.25), the maximum arrival rate that can be supported under queuing constraints at a source node is given by R j = Λ S j,r( θ j ) θ j, (A.70) 285

305 for j = 1, 2. Similarly, when we only consider the queuing constraint at the relay node, the maximum arrival rates should satisfy 1 θ r Λ R,Dj ( θ r ) θ r θ j R j = ( 1 ΛR,Dj θ j ( θ r ) + Λ Sj,R(θ r θ j ) ) θ r > θ j, (A.71) which is obtained from (2.26) and (2.27). Combining these results, the overall maximum arrival rates that can be supported by the system should be the minimum of (A.70) and (A.71), i.e., { } min 1 θ j Λ Sj,R( θ j ), 1 θ r Λ R,Dj ( θ r ) { R j = min 1 θ j Λ Sj,R( θ j ), θ r θ j (A.72) 1 θ j ( ΛR,Dj ( θ r ) + Λ Sj,R(θ r θ j ) )} θ r > θ j, for j = 1, 2. Using the definition of LMGF, (A.72) can be expressed in terms of the instantaneous rates, which is given by (5.44). A.11 Proof of Theorem 17 Proof 18 Depending on the relationship between θ j and θ r for j = 1, 2, there are two possible cases identified by (5.44). Case 1 : θ r θ j. In this case, the throughput R j is given by R j = min {R j,1, R j,2 } (A.73) 286

306 where R j,1 and R j,2 are defined as R j,1 = R j,2 = ( { }) 1 θ j log E e θ jτ(δr Sj,R{1,2}+(1 δ)r Sj,R{2,1}), ( }) 1 θ r log E {e θ r(1 τ)r R,Dj. (A.74) By taking the second order derivative with respect to δ, we can easily show the concavity of R j,1 and R j,2. The second order derivative of R j,1 is given by 2 R j,1 δ 2 θ j τ 2 = ( E{e θ jτ(δr Sj,R{1,2}+(1 δ)r Sj,R{2,1}) } { ) 2 E { (R Sj,R{1,2} R Sj,R{2,1}) 2 e θ jτ(δr Sj,R{1,2}+(1 δ)r Sj,R{2,1}) } E{e θ jτ(δr Sj,R{1,2}+(1 δ)r Sj,R{2,1}) } ( E { (R Sj,R{1,2} R Sj,R{2,1})e θ jτ(δr Sj,R{1,2}+(1 δ)r Sj,R{2,1}) }) 2 }. (A.75) According to the Cauchy-Schwarz inequality, two random variables U and V should satisfy E 2 {UV } E{U 2 }E{V 2 }. Assuming that U = e 1 2 θ jτ(δr Sj,R{1,2}+(1 δ)r Sj,R{2,1}), (A.76) and V = (R Sj,R{1,2} R Sj,R{2,1})U, (A.77) we can easily determine that the part inside the large curly brackets in (A.75) can be written as E{V 2 }E{U 2 } E 2 {UV } and hence is nonnegative. Then, we can readily determine that 2 R j,1 δ 2 0, which indicates that R j,1 is a concave function of δ. From (A.74), we notice that the expression of R j,2 does not contain δ. In other words, R j,2 is a constant function in terms of δ, and 2 R j,2 δ 2 = 0. Hence, we can still regard R j,2 as a concave function of δ. Since the pointwise minimum of concave functions is concave [95], the concavity 287

307 of R 1 and R 2 with respect to the time sharing parameter δ follows immediately when θ r θ j. Case 2 : θ r > θ j. In this case, the throughput R j is given by R j = min {R j,1, R j,3 } (A.78) where R j,3 is defined as R j,3 = 1 ( ) log(e{e θ r(1 τ)r R,Dj (θ }) + log(e{e r θ j )τ(δr Sj,R{1,2}+(1 δ)r Sj,R{2,1}) }). θ j (A.79) We have already shown the concavity of R j,1 in the previous case, and we can show the concavity of R j,3 following the same approach. The second order derivative of R j,3 is given by 2 R j,3 δ 2 = { (θ r θ j ) 2 τ 2 θ j (E{e (θ r θ j )τ(δr Sj,R{1,2}+(1 δ)r Sj,R{2,1}) } ) 2 E { (R Sj,R{1,2} R Sj,R{2,1}) 2 e (θr θ j)τ(δr Sj,R{1,2}+(1 δ)r Sj,R{2,1}) } E{e (θr θ j)τ(δr Sj,R{1,2}+(1 δ)r Sj,R{2,1}) } ( E { (R Sj,R{1,2} R Sj,R{2,1})e (θ r θ j )τ(δr Sj,R{1,2}+(1 δ)r Sj,R{2,1}) }) 2 }. (A.80) Again using the Cauchy-Schwarz inequality, we have 2 R j,3 δ 2 0, and the concavity follows. Since R j is the pointwise minimum of R j,1 and R j,3, R j is a concave function of δ. Now, we have shown in both cases that R 1 and R 2 are concave functions of δ. Finally, since the sum of two concave functions is also a concave function, the 288

308 sum rate is concave as well. A.12 Proof of Theorem 18 Proof 19 Similar to the proof of Theorem 17, Theorem 18 can be proved easily by evaluating the derivatives with respect to τ. The second order derivatives of R j,1, R j,2 and R j,3 with respect to τ are given, respectively, by 2 R j,1 τ 2 = θ j ( ) 2 {E{R 2 S j,re θ jτr Sj,R }E{e θ jτr Sj,R } E{e θ jτr Sj,R } ( ) 2 } E{R Sj,Re θ jτr Sj,R } (A.81) 2 R j,2 τ 2 = θ r (E{e θ r(1 τ)r R,Dj } ) 2 ) 2 } {E{R 2R,Dj e θ r(1 τ)r R,Dj }E{e θ r (1 τ)r R,Dj } (E{R R,Dj e θ r(1 τ)r R,Dj } (A.82) 2 R j,3 τ 2 = θ 2 r (θ j E{e θ r(1 τ)r R,Dj } ) 2 ) 2 } {E{R 2R,Dj e θ r(1 τ)r R,Dj }E{e θ r (1 τ)r R,Dj } (E{R R,Dj e θ r(1 τ)r R,Dj } (θ r θ j ) 2 ) 2 {E{R θ j (E{e 2 S j,re θ jτr Sj,R }E{e θ jτr Sj,R } θ jτr Sj,R } ( ) 2 } E{R Sj,Re θ jτr Sj,R }. (A.83) Using the Cauchy-Schwarz inequality and concavity-preserving property of pointwise minimum, the concavity of R 1, R 2 and the sum rate follow readily. 289

309 A.13 Proof of Theorem 20 Proof 20 We first consider user A and assume K xb is given. From (6.19), the instantaneous rate of A can be written as r A (SNR A ) = B log e 2 i ( ) log e 1 + N B SNR A λ i (H AB K xa H AB K 1 zb ), (A.84) where λ i ( ) denotes the i th eigenvalue of the matrix. Taking the derivative with respect to SNR A, we get ṙ A (SNR A ) = B N B λ i (H AB K xa H AB K 1 zb ) log e 2 i 1 + N B SNR A λ i (H AB K xa H AB K 1 zb ). (A.85) From (6.21), the first derivative of C A with respect to SNR A, evaluated at SNR A = 0 is given by Ċ A (0) = E {ṙ A (0)} = BN { } B log e 2 E tr(h AB K xa H AB K 1 zb ) (A.86) where tr( ) denotes the trace of the matrix. Note that both of K xa and K 1 zb are positive definite Hermitian matrices and therefore we can perform eigenvalue decomposition, and express them as K xa = V A Λ A V A = N A i=1 λ A,iv A,i v A,i and K 1 zb = V zb Λ zb V zb, where V A and V zb are unitary matrices, Λ A and Λ zb real diagonal matrices, v A,i is the i th column of V A, and λ A,i is the i th eigenvalue of K xa corresponding 290

310 to the eigenvector v A,i. Plugging this decomposition into (A.86), we get Ċ A (0) = BN B log e 2 E { tr(λ 1/2 zb V zb H ABK xa H AB V zbλ 1/2 zb ) } = BN B log e 2 = BN B log e 2 N A i=1 N A i=1 { } λ A,i E tr(λ 1/2 zb V zb H ABv A,i v A,i H AB V zbλ 1/2 zb ) { } λ A,i E tr(v A,i H AB K 1 zb H ABv A,i ) BN B log e 2 E { λ max (H AB K 1 zb H AB) } (A.87) (A.88) where λ max ( ) denotes the maximum eigenvalue of matrix. The equality is achieved when K xa = Ψ A Ψ A, where Ψ A is the eigenvector corresponding to the maximum eigenvalue of H AB K 1 zb H AB. Following the same approach, we can shown a similar result for K xb, and hence prove the theorem. A.14 Proof of Theorem 21 Proof 21 The second order derivative with respect to τ 1 is 2 R sum τ 2 1 ] θ C = E{e τ 1θ C r C,B} [E{r 2C,B e τ 1θ C r C,B }E{e τ 1θ C r C,B } E{r 2 C,B e τ 1θ C r C,B } 2. (A.89) Applying the Cauchy-Schwarz inequality, we can determine that 2 R sum τ 2 1 Through a similar approach, we can also determine that 2 R sum τ 2 2 is negative. 0 and 2 R sum τ Because τ 1 only appears in R C, τ 2 only appears in R B, and τ 3 only appears in R D, the Hessian matrix is diagonal, and can be expressed as ( ) 2 R sum H = Diag, 2 R sum, 2 R sum. (A.90) τ1 2 τ2 2 τ3 2 It is readily noted that H 0, and the concavity is shown. 291

311 Bibliography [1] C.-S. Chang, Stability, queue length, and delay of deterministic and stochastic queueing networks, IEEE Trans. Automat. Contr., vol. 39, no. 5, pp , [2] C.-S. Chang and T. Zajic, Effective bandwidths of departure processes from queues with time varying capacities, in IEEE INFOCOM, 1995, pp vol.3. [3] C.-S. Chang and J. Thomas, Effective bandwidth in high-speed digital networks, IEEE J. Select. Areas Commun., vol. 13, no. 6, pp , [4] D. Wu and R. Negi, Effective capacity: a wireless link model for support of quality of service, IEEE Trans. Wireless Commun., vol. 2, no. 4, pp , [5] S. Wicker, Error Control Systems for Digital Communication and Storage. Prentice Hall, [6] S. B. Wicker, Error Control Systems for Digital Communication and Storage. Prentice Hall Englewood Cliffs, 1995, vol. 1. [7] G. Caire and D. Tuninetti, The throughput of hybrid-arq protocols for the Gaussian collision channel, IEEE Trans. Inform. Theory, vol. 47, no. 5, pp , Jul

312 [8] P. Wu and N. Jindal, Performance of hybrid-arq in block-fading channels: A fixed outage probability analysis, IEEE Trans. Commun., Apr [9], Performance of Hybrid-ARQ in block-fading channels: A fixed outage probability analysis, IEEE Trans. Commun., vol. 58, no. 4, pp , Apr [10] J. Choi, J. Ha, and H. Jeon, On the energy delay tradeoff of HARQ-IR in wireless multiuser systems, IEEE Trans. Commun., Aug [11] I. Stanojev, O. Simeone, Y. Bar-Ness, and D. H. Kim, Energy efficiency of noncollaborative and collaborative Hybrid-ARQ protocols, IEEE Trans. Wireless Commun., vol. 8, no. 1, pp , Jan [12] F. Rosas, R. D. Souza, M. E. Pellenz, C. Oberli, G. Brante, M. Verhelst, and S. Pollin, Optimizing the code rate of energy-constrained wireless communications with harq, IEEE Trans. Wireless Commun., vol. 15, no. 1, pp , Jan [13] J. Choi, J. Ha, and H. Jeon, On the energy delay tradeoff of HARQ-IR in wireless multiuser systems, IEEE Trans. Commun., vol. 61, no. 8, pp , Aug [14] J. Choi and J. Ha, On the energy efficiency of AMC and HARQ-IR with QoS constraints, IEEE Trans. Veh. Technol., vol. 62, no. 7, pp , Sept [15] P. Larsson, J. Gross, H. Al-Zubaidy, L. K. Rasmussen, and M. Skoglund, Effective capacity of retransmission schemes: A recurrence relation approach, IEEE Trans. Commun., vol. 64, no. 11, pp , Nov

313 [16] J. Tang and X. Zhang, Cross-layer resource allocation over wireless relay networks for quality of service provisioning, IEEE J. Select. Areas Commun., vol. 25, no. 4, pp , [17] D. Qiao, M. Gursoy, and S. Velipasalar, Effective capacity of two-hop wireless communication systems, IEEE Trans. Inform. Theory, vol. 59, no. 2, pp , Sep [18] Y. Polyanskiy, H. Poor, and S. Verdu, Channel coding rate in the finite blocklength regime, IEEE Trans. Inform. Theory, Apr [19] V. Tan and M. Tomamichel, The third-order term in the normal approximation for the awgn channel, IEEE Trans. Inform. Theory, vol. 61, no. 5, pp , Mar [20] M. C. Gursoy, Throughput analysis of buffer-constrained wireless systems in the finite blocklength regime, EURASIP Journal on Wireless Communications and Networking, vol. 2013, no. 1, pp. 1 13, Dec [21] Y. Hu, J. Gross, and A. Schmeink, On the performance advantage of relaying under the finite blocklength regime, IEEE Commun. Lett., vol. 19, no. 5, pp , May [22], On the capacity of relaying with finite blocklength, IEEE Trans. Veh. Technol., vol. 65, no. 3, pp , Mar [23] J. Cui, The capacity of Gaussian orthogonal multiple-access relay channel, IEEE Commun. Lett., vol. 15, no. 4, pp , Apr [24] M. Osmani-Bojd, A. Sahebalam, and G. Hodtani, Capacity region of multiple access relay channels with orthogonal components, in ICTC, Sep. 2011, pp

314 [25] L. Sankar, Y. Liang, N. B. Mandayam, and H. Poor, Fading multiple access relay channels: Achievable rates and opportunistic scheduling, IEEE Trans. Inform. Theory, vol. 57, no. 4, pp , Apr [26] S. Salehkalaibar, L. Ghabeli, and M. Aref, Achievable rate region for multipleaccess-relay-networks, IET Commun., vol. 4, no. 15, pp , Oct [27] Y. Liu and A. Petropulu, On the sum-rate of amplify-and-forward relay networks with multiple source-destination pairs, IEEE Trans. Wireless Commun., vol. 10, no. 11, pp , Nov [28], QoS guarantees in AF relay networks with multiple source-destination pairs in the presence of imperfect CSI, IEEE Trans. Wireless Commun., vol. 12, no. 9, pp , Sep [29] F. Chen, W. Su, S. Batalama, and J. Matyjas, Joint power optimization for multi-source multi-destination relay networks, IEEE Trans. Signal Processing, vol. 59, no. 5, pp , May [30] C. Luo, S. McClean, G. Parr, P. Ren, and Y. Gong, Multiple-source multipledestination relay channels with network coding, IET Communications, vol. 7, no. 17, pp , Nov [31] P. Parag and J. Chamberland, Queueing analysis of a butterfly network for comparing network coding to classical routing, IEEE Trans. Inform. Theory, vol. 56, no. 4, pp , Mar [32] D. Qiao, M. Gursoy, and S. Velipasalar, On the achievable throughput region of multiple-access fading channels with QoS constraints, in IEEE ICC, May 2010, pp

315 [33] M. Ozmen and M. Gursoy, Throughput regions of multiple-access fading channels with Markov arrivals and QoS constraints, IEEE Wireless Commun. Lett., vol. 2, no. 5, pp , Oct [34] D. Qiao, M. Gursoy, and S. Velipasalar, Achievable throughput regions of fading broadcast and interference channels under QoS constraints, IEEE Trans. Commun., vol. 61, no. 9, pp , Sep [35] A. Sabharwal, P. Schniter, D. Guo, D. W. Bliss, S. Rangarajan, and R. Wichman, In-band full-duplex wireless: Challenges and opportunities, IEEE J. Select. Areas Commun., vol. 32, no. 9, pp , Sep [36] T. Riihonen, S. Werner, and R. Wichman, Mitigation of loopback selfinterference in full-duplex mimo relays, IEEE Trans. Signal Process, vol. 59, no. 12, pp , Dec [37] H. Ju, X. Shang, H. Poor, and D. Hong, Bi-directional use of spatial resources and effects of spatial correlation, IEEE Trans. Wireless Commun., vol. 10, no. 10, pp , Oct [38] D. Kim, H. Ju, S. Park, and D. Hong, Effects of channel estimation error on full-duplex two-way networks, IEEE Trans. Veh. Technol., vol. 62, no. 9, pp , Nov [39] M. Gursoy, MIMO wireless communications under statistical queueing constraints, IEEE Trans. Inform. Theory, vol. 57, no. 9, pp , Sept [40] B. Kaufman and B. Aazhang, Cellular networks with an overlaid device to device network, in Asilomar Conference on Signals, Systems and Computers, Oct. 2008, pp

316 [41] A. Asadi, Q. Wang, and V. Mancuso, A survey on device-to-device communication in cellular networks, IEEE Communications Surveys Tutorials, vol. 16, no. 4, pp , Fourthquarter [42] K. Doppler, C.-H. Yu, C. Ribeiro, and P. Janis, Mode selection for device-todevice communication underlaying an LTE-advanced network, in IEEE Wireless Communications and Networking Conference (WCNC), Apr. 2010, pp [43] D. Feng, L. Lu, Y. Yuan-Wu, G. Y. Li, G. Feng, and S. Li, Device-todevice communications underlaying cellular networks, IEEE Trans. Commun., vol. 61, no. 8, pp , Jul [44] J. Han, Q. Cui, C. Yang, and X. Tao, Bipartite matching approach to optimal resource allocation in device to device underlaying cellular network, Electronics Letters, vol. 50, no. 3, pp , Jan [45] Y. Xu, R. Yin, T. Han, and G. Yu, Dynamic resource allocation for device-todevice communication underlaying cellular networks, International Journal of Communication Systems, vol. 27, no. 10, pp , Oct [46] G. Yu, L. Xu, D. Feng, R. Yin, G. Li, and Y. Jiang, Joint mode selection and resource allocation for device-to-device communications, IEEE Trans. Commun., vol. 62, no. 11, pp , Nov [47] C. Xu, L. Song, Z. Han, D. Li, and B. Jiao, Resource allocation using a reverse iterative combinatorial auction for device-to-device underlay cellular networks, in 2012 IEEE Global Communications Conference (GLOBECOM), Dec. 2012, pp [48] Y. Li, D. Jin, J. Yuan, and Z. Han, Coalitional games for resource allocation in the device-to-device uplink underlaying cellular networks, IEEE Trans. Wireless Commun., vol. 13, no. 7, pp , Jul

317 [49] D. Tsolkas, E. Liotou, N. Passas, and L. Merakos, A graph-coloring secondary resource allocation for D2D communications in LTE networks, in IEEE 17th International Workshop on Computer Aided Modeling and Design of Communication Links and Networks (CAMAD), Sep. 2012, pp [50] C. Lee, S. M. Oh, and J. S. Shin, Resource allocation for device-to-device communications based on graph-coloring, in International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS), Nov. 2015, pp [51] M. Hajiaghayi, C. Wijting, C. Ribeiro, and M. T. Hajiaghayi, Efficient and practical resource block allocation for LTE-based D2D network via graph coloring, Wireless networks, vol. 20, no. 4, pp , [52] M. Andrews, K. Kumaran, K. Ramanan, A. Stolyar, R. Vijayakumar, and P. Whiting, Scheduling in a queuing system with asynchronously varying service rates, Probability in the Engineering and Informational Sciences, vol. 18, no. 02, pp , Apr [53] M. Sharma and X. Lin, OFDM downlink scheduling for delay-optimality: Many-channel many-source asymptotics with general arrival processes, in Information Theory and Applications Workshop (ITA). IEEE, Feb. 2011, pp [54] B. Ji, G. Gupta, X. Lin, and N. Shroff, Low-complexity scheduling policies for achieving throughput and asymptotic delay optimality in multichannel wireless networks, IEEE/ACM Trans. on Networks, vol. 22, no. 6, pp , Dec [55] J. A. Van Mieghem, Dynamic scheduling with convex delay costs: The generalized cµ rule, The Annals of Applied Probability, pp , Aug

318 [56], Due-date scheduling: Asymptotic optimality of generalized longest queue and generalized largest delay rules, Operations Research, vol. 51, no. 1, pp , Feb [57] Cisco visual networking index: Global mobile data traffic forecast update [Online]. Available: collateral/ns341/ns525/ns537/ns705/ns827/white paper c html [58] G. Paschos, E. Bastug, I. Land, G. Caire, and M. Debbah, Wireless caching: Technical misconceptions and business barriers, IEEE Commun. Mag., vol. 54, no. 8, pp , Aug [59] E. Bastug, M. Bennis, and M. Debbah, Living on the edge: The role of proactive caching in 5G wireless networks, IEEE Commun. Mag., vol. 52, no. 8, pp , [60] Z. Zhao, M. Peng, Z. Ding, W. Wang, and H. V. Poor, Cluster content caching: An energy-efficient approach to improve quality of service in cloud radio access networks, IEEE J. Select. Areas Commun., vol. 34, no. 5, pp , May [61] W. Han, A. Liu, and V. K. N. Lau, Phy-caching in 5G wireless networks: Design and analysis, IEEE Commun. Mag., vol. 54, no. 8, pp , Aug [62] M. Ji, A. M. Tulino, J. Llorca, and G. Caire, On the average performance of caching and coded multicasting with random demands, in International Symposium on Wireless Communications Systems (ISWCS). IEEE, 2014, pp

319 [63] M. A. Maddah-Ali and U. Niesen, Coding for caching: Fundamental limits and practical challenges, IEEE Commun. Mag., vol. 54, no. 8, pp , Aug [64] M. Ji, G. Caire, and A. F. Molisch, Wireless device-to-device caching networks: Basic principles and system performance, IEEE J. Select. Areas Commun., vol. 34, no. 1, pp , Jan [65] H. J. Kang, K. Y. Park, K. Cho, and C. G. Kang, Mobile caching policies for device-to-device (D2D) content delivery networking, in IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), Apr. 2014, pp [66] N. Golrezaei, A. G. Dimakis, and A. F. Molisch, Wireless device-to-device communications with distributed caching, in IEEE International Symposium on Information Theory Proceedings (ISIT). IEEE, 2012, pp [67] Y. Lin, L. Shao, Z. Zhu, Q. Wang, and R. K. Sabhikhi, Wireless network cloud: Architecture and system requirements, IBM Journal of Research and Development, vol. 54, no. 1, pp. 4:1 4:12, Jan [68] C. Mobile, C-RAN: The road towards green RAN, White Paper, ver, vol. 2, Oct [69] A. Checko, H. L. Christiansen, Y. Yan, L. Scolari, G. Kardaras, M. S. Berger, and L. Dittmann, Cloud RAN for mobile networks - A technology overview, IEEE Communications Surveys Tutorials, vol. 17, no. 1, pp , Firstquarter [70] A. Simonsson, Frequency reuse and intercell interference co-ordination in E- UTRA, in IEEE 65th Vehicular Technology Conference (VTC) Spring, Apr. 2007, pp

320 [71] M. C. Necker, Local interference coordination in cellular OFDMA networks, in IEEE 66th Vehicular Technology Conference (VTC) Fall, Sep. 2007, pp [72] Y. Xiang, J. Luo, and C. Hartmann, Inter-cell interference mitigation through flexible resource reuse in OFDMA based communication networks, in European wireless, vol. 2007, Apr. 2007, pp [73] S. Deb, P. Monogioudis, J. Miernik, and J. P. Seymour, Algorithms for enhanced inter-cell interference coordination (eicic) in LTE HetNets, IEEE/ACM Trans. Networking, vol. 22, no. 1, pp , Feb [74] C.-S. Chang and T. Zajic, Effective bandwidths of departure processes from queues with time varying capacities, in INFOCOM 95. Fourteenth Annual Joint Conference of the IEEE Computer and Communications Societies. Bringing Information to People. Proceedings. IEEE, Apr. 1995, pp vol.3. [75] S. Tanwir and H. Perros, A survey of VBR video traffic models, IEEE Commun. Surveys & Tutorials, vol. 15, no. 4, pp , [76] M. Ozmen and M. Gursoy, Impact of channel and source variations on the energy efficiency under QoS constraints, in IEEE International Symposium on Information Theory Proceedings (ISIT), Jul. 2012, pp [77] M. Ozmen and M. C. Gursoy, Wireless throughput and energy efficiency with random arrivals and statistical queuing constraints, IEEE Trans. Inform. Theory, vol. 62, no. 3, pp , Mar [78] C. Chang, Performance Guarantees in Communication Networks, ser. Performance Guarantees in Communication Networks. Springer London, [Online]. Available: 301

321 [79] G. Kesidis, J. Walrand, and C.-S. Chang, Effective bandwidths for multiclass Markov fluids and other ATM sources, IEEE/ACM Trans. on Networking, vol. 1, no. 4, pp , Aug [80] G. Caire and D. Tuninetti, The throughput of hybrid-arq protocols for the Gaussian collision channel, IEEE Trans. Inform. Theory, Jul [81] D. Wu and R. Negi, Effective capacity: a wireless link model for support of quality of service, IEEE Trans. Wireless Commun., vol. 2, no. 4, pp , Jul [82] R. Serfozo, Basics of Applied Stochastic Processes. Springer, [83] G. Paschos, E. Bastug, I. Land, G. Caire, and M. Debbah, Wireless caching: technical misconceptions and business barriers, IEEE Commun. Mag., vol. 54, no. 8, pp , Aug [84] S. Verdu, Spectral efficiency in the wideband regime, IEEE Trans. Inform. Theory, vol. 48, no. 6, pp , Sep [Online]. Available: dx.doi.org/ /tit [85] Y. Polyanskiy, H. V. Poor, and S. Verdú, Dispersion of Gaussian channels, in IEEE International Symposium on Information Theory (ISIT). IEEE, Aug. 2009, pp [86] Y. Li, G. Ozcan, M. C. Gursoy, and S. Velipasalar, Energy efficiency of hybrid- ARQ under statistical queuing constraints, IEEE Trans. Commun., vol. 64, no. 10, pp , Oct [87] Y. Li, M. Gursoy, and S. Velipasalar, On the throughput of Hybrid-ARQ under statistical queuing constraints, IEEE Trans. Veh. Technol., vol. 64, no. 6, pp , Jun

322 [88] L. Li and A. J. Goldsmith, Capacity and optimal resource allocation for fading broadcast channels-part I: Ergodic capacity, IEEE Trans. Inform. Theory, vol. 47, no. 3, pp , Mar [89] D. Qiao, M. Gursoy, and S. Velipasalar, The impact of QoS constraints on the energy efficiency of fixed-rate wireless transmissions, IEEE Trans. Wireless Commun., vol. 8, no. 12, pp , Dec [90] M. Knox, Single antenna full duplex communications using a common carrier, in IEEE 13th Annual Wireless and Microwave Technology Conference (WAMICON), Apr. 2012, pp [91] J. Papandriopoulos and J. Evans, Low-complexity distributed algorithms for spectrum balancing in multi-user DSL networks, in 2006 IEEE International Conference on Communications (ICC), vol. 7, Jun. 2006, pp [92] H. W. Kuhn, The hungarian method for the assignment problem, Naval research logistics quarterly, vol. 2, no. 1-2, pp , Mar [93] Y. Li, M. C. Gursoy, and S. Velipasalar, Joint mode selection and resource allocation for D2D communications under queueing constraints, in 2016 IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), Apr. 2016, pp [94] D. J. Welsh and M. B. Powell, An upper bound for the chromatic number of a graph and its application to timetabling problems, The Computer Journal, vol. 10, no. 1, pp , Jan [95] S. P. Boyd and L. Vandenberghe, Convex Optimization. Cambridge university press,

323 [96] M. Akkouchi, On the convolution of exponential distributions, J. Chungcheong Math. Soc, vol. 21, no. 4, pp , [97] Y. Li, M. C. Gursoy, and S. Velipasalar, Scheduling in D2D underlaid cellular networks with deadline constraints, in IEEE Vehicular Technology Conference (VTC) Fall, Sep [98] G. Grimmett and D. Stirzaker, Probability and Random Processes. Oxford University Press, [99] W. L. Smith, On the cumulants of renewal processes, Biometrika, vol. 46, no. 1/2, pp. pp. 1 29, [Online]. Available: [100] W. Feller, Fluctuation theory of recurrent events, Transactions of the American Mathematical Society, vol. 67, no. 1, pp. pp , [Online]. Available: 304

Vita Yi Li received his B.S. degree in Electrical Engineering and Information Science in 2011 from University of Science and Technology of China, Hefei, China. He ﬁnished the Ph.D.

324 Vita Yi Li received his B.S. degree in Electrical Engineering and Information Science in 2011 from University of Science and Technology of China, Hefei, China. He ﬁnished the Ph.D. degree in the Department of Electrical Engineering and Computer Science, Syracuse University, in During his Ph.D. study, he has 3 ﬁrst author journal publications and 13 ﬁrst author conference publications. His research interests are in the ﬁelds of wireless communications, cooperative communication systems and device-to-device cellular networks. 305

WIRELESS COMMUNICATIONS AND COGNITIVE RADIO TRANSMISSIONS UNDER QUALITY OF SERVICE CONSTRAINTS AND CHANNEL UNCERTAINTY

WIRELESS COMMUNICATIONS AND COGNITIVE RADIO TRANSMISSIONS UNDER QUALITY OF SERVICE CONSTRAINTS AND CHANNEL UNCERTAINTY University of Nebraska - Lincoln DigitalCommons@University of Nebraska - Lincoln Theses, Dissertations, and Student Research from Electrical & Computer Engineering Electrical & Computer Engineering, Department