Capacity of Block Rayleigh Fading Channels Without CSI

Mainak Chowdhury and Andrea Goldsmith, Fellow, IEEE
Department of Electrical Engineering, Stanford University, USA
Email: mainakch@stanford.edu, andrea@wsl.stanford.edu

Abstract—A system with a single antenna at the transmitter and receiver and no channel state information at either is considered. The channel experiences block Rayleigh fading with a coherence time of T symbol times, and the fading statistics are assumed to be known perfectly. The system operates with a finite average transmit power. It is shown that the capacity-achieving input distribution in the T-dimensional space is the product of the distribution of an isotropically distributed unit vector and a distribution on the norm which is discrete with a finite number of points in its support. Numerical evaluations of this distribution and the associated capacity for a channel with fading and Gaussian noise and a coherence time T = 2 are presented for representative SNRs. It is also shown numerically that an implicit channel estimation is performed by the capacity-optimal scheme.

Index Terms—Block fading channels, no CSI, capacity-achieving input distribution, noncoherent communications

I. INTRODUCTION

Channel estimation and the subsequent use of the channel estimates for data transmission lie at the basis of many wireless communication systems in use today. In this work, we explore an alternative paradigm in which channel estimation and data transmission are not performed one after the other but rather jointly, with the end goal of maximizing the data rate. This maximum data rate equals the channel's Shannon capacity under the assumption that only the channel statistics are known at the transmitter and the receiver.

Capacity results in this setting are few and far between. However, there is a rich history of work investigating many special cases. One such example is the finite state Markov channel, whose capacity under no CSI was studied in [1], [2]. In these works, the Markov property of the channel was used both to compute good bounds on the capacity as well as exact capacities for some special classes of channels. The capacity of i.i.d. as well as block fading channels with no CSI (but with perfect knowledge of the channel statistics) has also been extensively investigated. In particular, the capacity-achieving distribution for i.i.d. Rayleigh and Ricean fading channels without CSI was derived in [3], [4]. Based on a characterization of the Karush-Kuhn-Tucker (KKT) conditions associated with the convex optimization problem of maximizing the mutual information of these channels, the authors established that the optimal capacity-achieving input distribution is discrete with a finite number of mass points in the norm.

A series of fundamental contributions were made starting in the early 2000s on the capacity of block fading channels without CSI at the transmitter or the receiver, also called noncoherent channels. One such contribution is the notion of unitarily invariant codes proposed in [5] and [6] for noncoherent MIMO channels. In the asymptotically large SNR regime, the capacity-achieving schemes depend only on the fading distribution and perform space-time coding over the Grassmann manifold associated with the channel matrix. Multiuser counterparts of these ideas can be found in [7], [8].
Results about optimal random codes for general block fading channels in the low-to-moderate SNR regime are harder to come by, since the codes depend not only on the fading distribution but also on the noise distribution. One example of such a work is [9]. In this work, the authors established that the probability distribution of the error-exponent-optimal random block code for a SISO channel is supported on a finite number of discrete mass points in the norm of the block code.

In our work we investigate the capacity and the capacity-achieving distribution of a block fading model with a coherence time T > 1 at any SNR. Our analysis determines that, similar to known results for T = 1 and for the error-exponent-optimal distribution for T > 1, the capacity for T > 1 is achieved by a distribution on the norm ‖x‖ which is supported on a finite number of mass points. Based on this observation, we present numerical results for the capacity and the capacity-achieving distributions for channels with fading and Gaussian noise and a coherence time of T. We find, based on the capacity-achieving distribution for T = 2, that sequential channel estimation using pilot symbols and subsequent data transmission achieves strictly lower data rates than the capacity. We also examine the mutual information between the channel output and the channel state under the capacity-achieving distribution. We show that this mutual information is non-zero and increases with SNR, which indicates that some form of implicit channel estimation is inherent in optimal decoding.

These results are relevant to signal design in many existing or emerging wireless systems where, on the one hand, the effects of an imprecise channel estimate on achievable data rates are poorly understood and, on the other hand, precise channel state information may be expensive to acquire. In such cases, joint channel estimation and data transmission, or simply noncoherent transmission, may be better than separate channel estimation and transmission. A line of work exploring the cost of separate channel estimation is [10]. Specifically, this work explores the utility of channel state information under schemes which involve separate channel estimation and transmission (henceforth referred to as partially-coherent schemes).

This work assumes a certain channel estimation overhead and makes precise various aspects of the optimal learning overhead needed to achieve good performance. A surprising outcome of this line of work is that the overhead needed to achieve good rates (as measured by the capacity) is often not that large. Our results suggest, in addition, that even when the coherence times are small, the channel output when data symbols are transmitted already contains information about the channel state. This suggests that a form of joint channel estimation and data transmission might achieve better performance in practice than the commonly used pilot-based channel estimation.

The rest of the paper is organized as follows. We present the system model in Section II, describe some properties of the output distribution in Section III, and characterize the structure of the optimal capacity-achieving input distribution in Section IV. Based on a numerical optimization of these expressions for our channel model, we present the capacity and the capacity-achieving distribution as a function of the SNR in Section V-A. We discuss the implications of our results relative to pilot-based channel estimation in Section V-B and finally present our concluding thoughts in Section VI.

II. SYSTEM MODEL

We consider a single-antenna transmitter and a single-antenna receiver. The system across a single block of T symbol times may be represented as

y = hx + ν,   (1)

with y, ν ∈ R^T, h ∈ R, and x ∈ R^T. Each ν_i ~ N(0, σ²), and h ~ N(0, 1). We restrict attention to real-valued channel coefficients for simplicity of the exposition and of the numerical optimization. Extensions to the complex domain follow very similar lines and are presented in the extended version of this work [11]. We use capital letters to refer to a random variable and lowercase letters to refer to a realization. We use p_Y(·) to refer to the density function of the continuous random variable Y, and µ_X(·) to refer to the probability measure on the random variable X.

We assume a block fading model with a coherence time T. We assume no instantaneous CSI at the transmitter or the receiver, an average transmit power of 1, and that the receiver does not know the instantaneous channel realization at the beginning of each new channel block. We consider coding across blocks and seek to understand the optimal signaling strategies to achieve capacity. Note that since this channel can be thought of as a memoryless system with conditional density p_{Y|x}(·), the fundamental limit on the achievable rates of this system is attained by a distribution on the T-dimensional space of all possible inputs over T time slots (i.e., a space-time random code).

Fig. 1: The system model.

The channel in Fig. 1 is completely specified by the conditional density p_{Y|x}(·), which is specified as follows: if x ∈ R^T is the channel input and y ∈ R^T is the T-dimensional output of the channel, then, given x, y is distributed as y ~ N(0, Σ_x), where the (q, r)-th entry of the matrix Σ_x is given by

Σ_x(q, r) = x_q x_r + σ² I(q = r),

with the indicator function I equal to 1 if the condition is satisfied and zero otherwise.
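For concreteness, the model in (1) and the induced conditional covariance Σ_x are straightforward to simulate. The sketch below is illustrative only; the parameter values and helper names (sample_output, covariance) are our own choices, not the authors'.

```python
import numpy as np

def sample_output(x, sigma2, rng):
    """One channel use: y = h*x + nu with h ~ N(0, 1) and nu_i ~ N(0, sigma2)."""
    h = rng.standard_normal()
    nu = np.sqrt(sigma2) * rng.standard_normal(x.shape[0])
    return h * x + nu

def covariance(x, sigma2):
    """Sigma_x with entries Sigma_x(q, r) = x_q * x_r + sigma2 * 1{q == r}."""
    return np.outer(x, x) + sigma2 * np.eye(x.shape[0])

rng = np.random.default_rng(0)
T, sigma2 = 2, 0.5            # coherence time and noise variance (SNR = 1 / sigma2)
x = np.array([1.0, 0.7])      # one candidate input block with ||x||^2 <= T
print(sample_output(x, sigma2, rng))
print(covariance(x, sigma2))
```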
III. PROPERTIES OF THE COVARIANCE MATRIX Σ_x

In this section we point out some properties of the covariance matrix Σ_x, in addition to the ones listed in Sections II-C and IV of [12]. These properties are useful in understanding the nature of the optimal input distributions and are also used in establishing the results in Lemma 1. More specifically, the positive definiteness of the matrix Σ_x at all points in the domain is used to establish the existence of the linear transformation that yields the contradiction. Proofs of the identities listed below are included in the extended version of this work [11].

(a) Σ_x has T − 1 eigenvalues with value σ² and a single eigenvalue with value ‖x‖² + σ².
(b) The (unnormalized) i-th eigenvector corresponding to the first T − 1 eigenvalues σ² is along (−x_{i+1}/x_1, e_i), where e_i is the unit row vector of length T − 1 with its only nonzero entry (unity) at position i. The T-th eigenvector is along (x_1/x_T, ..., x_{T−1}/x_T, 1).
(c) Σ_x is positive definite.
(d) The determinant of Σ_x is a function of ‖x‖ only.
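Properties (a), (c), and (d) are easy to confirm numerically for a random input block; the short check below (with an arbitrary T and noise variance of our choosing, not taken from the paper) is a sanity check rather than a proof.

```python
import numpy as np

T, sigma2 = 4, 0.3
rng = np.random.default_rng(1)
x = rng.standard_normal(T)
Sigma = np.outer(x, x) + sigma2 * np.eye(T)

eigvals = np.sort(np.linalg.eigvalsh(Sigma))
# (a) T - 1 eigenvalues equal sigma^2, one equals ||x||^2 + sigma^2
assert np.allclose(eigvals[:-1], sigma2)
assert np.isclose(eigvals[-1], x @ x + sigma2)
# (c) positive definiteness
assert np.all(eigvals > 0)
# (d) the determinant depends on x only through ||x||
assert np.isclose(np.linalg.det(Sigma), sigma2 ** (T - 1) * (x @ x + sigma2))
print("properties (a), (c), (d) verified numerically")
```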

IV. CHARACTERIZING THE CAPACITY-ACHIEVING DISTRIBUTION

The problem of maximizing the mutual information for the channel described in Fig. 1 can be written as

sup_{µ_X(·)} I(Y; X)  subject to  (1/T) ∫ ‖x‖² dµ_X(x) ≤ 1,   (2)

or equivalently as

inf_{µ_X(·)} −I(Y; X)  subject to  (1/T) ∫ ‖x‖² dµ_X(x) ≤ 1.   (3)

The above optimization is performed over all distributions µ_X(·). I(Y; X) is the mutual information between Y and X and can be expressed as

I(Y; X) = −E_Y[log p_Y(Y)] − E_X[h(Y | X = x)],

where h(Y | X = x) is the differential entropy of Y given a fixed value x. The first expectation is performed with respect to the distribution induced on Y by the distribution µ_X(·), i.e., p_Y(·) = ∫ p_{Y|x}(·) dµ_X(x).

Many structural properties of the capacity-achieving distribution have been derived in [12]. According to this work, the capacity-achieving distribution is the product of the distribution associated with an isotropically distributed unit vector in the T-dimensional space and a distribution on the norm r = ‖x‖. The rest of the discussion in this section focuses on finding the optimal distribution associated with the norm r = ‖x‖.

We observe that the objective function in (3) is convex in µ_X(·). It can also be shown that the limit point of any sequence {µ_X^{(n)}(·)} of measures lying in S = {µ(·) : ∫ ‖x‖² dµ(x) ≤ T} also lies in S. Thus the infimum in (3) is attained by an optimal µ*_X(·). Necessary and sufficient conditions for the optimality of the solution µ*_X(·) can be obtained by writing down the KKT conditions. In particular, the Lagrangian L(µ_X(·), λ_1, λ_2) of the above optimization problem can be expressed as

L(µ_X(·), λ_1, λ_2) = ∫_y (∫ p_{Y|x}(y) dµ_X(x)) log(∫ p_{Y|x}(y) dµ_X(x)) dy
    + ∫_x 0.5 log((2πe)^T det Σ_x) dµ_X(x)
    + λ_1 (∫ ‖x‖² dµ_X(x) − T) + λ_2 (∫ dµ_X(x) − 1),

where λ_1 ∈ R_+ and λ_2 ∈ R. In the above we used the fact that h(Y | X = x) = 0.5 log((2πe)^T det Σ_x). The first-order necessary condition for the optimal µ*_X(·) states that whenever µ*_X(·) assigns positive measure to a neighborhood around x (i.e., µ*_X(B^x_δ) > 0 for all δ < δ_0, where δ_0 > 0 and B^x_δ ≜ {z : ‖z − x‖ < δ}), the following must hold:

∫_y (1 + log p*_Y(y)) p_{Y|x}(y) dy + 0.5 log((2πe)^T det Σ_x) + λ_1 ‖x‖² + λ_2 ≜ g(p*_Y(·), x) = 0,   (4)

where p*_Y is the distribution on y induced by µ*_X(·), and g(·, ·) is defined so that the above relation holds. We now state some properties of g(·, ·). These may be proved by observing that the optimal µ_X(·) is only a function of r = ‖x‖ (referred to as µ_R(·) afterwards), which in turn follows from the results in [5].

Lemma 1. The following hold:
(a) If there exists an x such that, for every neighborhood around x (i.e., B^x_δ := {z : ‖z − x‖ ≤ δ} for any positive δ), g(p_Y(·), x) is zero at some point inside the neighborhood, then p_Y(·) cannot be a valid probability distribution.
(b) There exists an R < ∞ such that g(p_Y(·), x) > 0 for all x such that ‖x‖ > R.
(c) The optimal distribution µ*_X(·) assigns a non-zero measure to 0.

Proof Sketch. We present a brief sketch of the proofs below. Proof details are presented in the extended version of this manuscript [11].

(a) This follows from the fact that if such a case exists then, in particular, there exists a linear transformation under which g(·, x) is zero in an interval around x along a transformed coordinate. Thus, by the Identity Theorem from complex analysis [13], g(·, ·) is identically zero along that coordinate. One can then use methods very similar to those used in [3] to argue that the probability density function p_Y(·) is non-integrable (i.e., observing that the relation defines a Laplace transform and that the relation can be inverted uniquely to a non-integrable distribution, as described in Section IV-A of [3]).

(b) We first observe that under a power constraint, ∫ log(p*_Y(y)) p_{Y|x}(y) dy is bounded by a term logarithmic in the norm of x, and that λ_2 is fixed regardless of x. The result follows by noting that, since λ_1 > 0, as ‖x‖ → ∞, λ_1 ‖x‖² − c log(‖x‖² + σ²) is unbounded for any finite constant c and hence cannot be equal to zero.

(c) This may be established by contradiction.
If all points in the support of the optimal distribution have a norm greater than zero, then, by arguments similar to those in [3], the mutual information is increased by bringing any coordinate closer to zero, while still meeting the power constraint.

The following corollary results from using this lemma together with results from real and complex analysis:

Corollary 1. The support of the measure µ_R(·) corresponding to the optimal µ*_X(·) is bounded and finite in r = ‖x‖.

Proof. The invariance under unitary transformations follows from [12]. The support is discrete in ‖x‖ because otherwise (a) of Lemma 1 would imply a contradiction. The support of µ*_X(·) and µ_R(·) is bounded by (b) of Lemma 1. The number of points with a non-zero probability mass in ‖x‖ is finite because otherwise, by the Bolzano-Weierstrass theorem, there would be a limit point and (a) of Lemma 1 would again imply a contradiction.

Note that the arguments used to establish this result are very similar to those presented in [3]. The only difference in the analysis is due to the fact that T > 1. To establish the result in this case, we apply a linear transformation (at a limit point of a sequence of points with non-zero probability measure under µ_X(·)) to reduce it to the case considered in [3] and hence establish a contradiction.
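The objective in (2)-(3) can be estimated for any candidate norm distribution by Monte Carlo. The sketch below is a rough, noisy estimator for an input whose norm is supported on a few mass points and whose direction is isotropic; the helper names, mass points, and sample sizes are illustrative assumptions and do not reproduce the paper's numerical procedure.

```python
import numpy as np
from scipy.stats import multivariate_normal

def cond_entropy(r, sigma2, T):
    """h(Y | X = x) = 0.5*log((2*pi*e)^T det(Sigma_x)); depends on x only via r = ||x||."""
    det = sigma2 ** (T - 1) * (r ** 2 + sigma2)
    return 0.5 * np.log((2 * np.pi * np.e) ** T * det)

def mutual_information_nats(radii, probs, sigma2, T, n_out=2000, n_dir=200, seed=0):
    rng = np.random.default_rng(seed)
    # random directions approximate the isotropic part of mu_X
    dirs = rng.standard_normal((n_dir, T))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    covs = [np.outer(r * u, r * u) + sigma2 * np.eye(T) for r in radii for u in dirs]
    weights = np.repeat(probs, n_dir) / n_dir

    # sample Y from the Gaussian mixture and estimate h(Y) = -E[log p_Y(Y)]
    comp = rng.choice(len(covs), size=n_out, p=weights)
    ys = np.array([rng.multivariate_normal(np.zeros(T), covs[c]) for c in comp])
    p_y = sum(w * multivariate_normal(np.zeros(T), c).pdf(ys) for w, c in zip(weights, covs))
    h_y = -np.mean(np.log(p_y))
    h_y_given_x = sum(p * cond_entropy(r, sigma2, T) for p, r in zip(probs, radii))
    return h_y - h_y_given_x

# two mass points on the norm (one at zero), T = 2, SNR = 1 / sigma2 = 2
print(mutual_information_nats(radii=[0.0, 1.6], probs=[0.4, 0.6], sigma2=0.5, T=2))
```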

Based on Corollary 1, we now proceed to compute the capacity and the capacity-achieving distribution for our channel model. In Section V-B we point out connections of these results with channel estimation.

V. NUMERICAL RESULTS

In this section we consider an average transmit power of 1 unit and take the SNR to be completely specified by the noise variance σ², i.e., SNR = 1/σ². We study the effect of SNR on the capacity of the block fading channel as well as on the information that can be extracted about the channel under the capacity-achieving distribution. These distributions are specified by a finite number of support points in ‖x‖ and their corresponding probability mass functions. To obtain these distributions, the number of points in the capacity-achieving distribution was increased until, within the numerical tolerances, the mutual information did not increase further and the dual variables λ_1 and λ_2 stayed the same. The optimizations were performed using the fmin_slsqp routine in SciPy [14]. Multiple random starting points were used to test the numerical stability of the optimization problem and the optimization routine; the capacity was found to be the same regardless of the starting point whenever the optimization completed successfully. We present capacity results in Section V-A and discuss the implications for channel estimation in Section V-B.
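To make the flavor of this optimization concrete, the following self-contained sketch solves the analogous problem for T = 1, where the output density is a finite Gaussian mixture and the objective can be evaluated by deterministic quadrature. It uses scipy.optimize.minimize with the SLSQP method (the same algorithm behind fmin_slsqp); the number of mass points, bounds, and starting values are our own illustrative choices, not the paper's settings.

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import minimize

SIGMA2 = 0.5   # noise variance; SNR = 1 / SIGMA2
K = 3          # number of candidate mass points on the norm r = |x|

def neg_mutual_information(z):
    r, p = z[:K], z[K:]
    var = r ** 2 + SIGMA2                      # Var(Y | |X| = r) for T = 1
    def p_y(y):
        return np.sum(p * np.exp(-y ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var))
    h_y = quad(lambda y: -p_y(y) * np.log(p_y(y) + 1e-300), -40, 40, limit=200)[0]
    h_y_given_x = np.sum(p * 0.5 * np.log(2 * np.pi * np.e * var))
    return -(h_y - h_y_given_x)                # negative I(X; Y), in nats

z0 = np.concatenate([np.linspace(0.0, 1.5, K), np.full(K, 1.0 / K)])
cons = [{"type": "eq",   "fun": lambda z: np.sum(z[K:]) - 1.0},               # probabilities sum to one
        {"type": "ineq", "fun": lambda z: 1.0 - np.sum(z[K:] * z[:K] ** 2)}]  # average power <= 1
bounds = [(0.0, 10.0)] * K + [(0.0, 1.0)] * K
res = minimize(neg_mutual_information, z0, method="SLSQP", bounds=bounds, constraints=cons)
print("capacity estimate (bits/symbol):", -res.fun / np.log(2))
print("mass points:", res.x[:K], "probabilities:", res.x[K:])
```

In the paper's procedure, the number of mass points is then increased until the mutual information and the dual variables stop changing; the sketch above fixes K for brevity.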
A. Capacity results

Fig. 2: Capacity (bits per symbol time) vs. SNR for T = 1 (dotted) and T = 2 (solid).

We first show how the capacity of the channel per symbol time changes with increasing SNR in Fig. 2. We note that, as expected, coding across time improves performance on the order of a few dB of coding gain. We next present a visualization of the optimal µ_X(·) in the 2D space of all x in Figs. 3 and 4. Note that these figures specify µ_X(·) for both T = 1 (i.i.d.) and T = 2 (block fading).

Fig. 3: µ_X(·) for SNR = 0.5. Blue cylinders represent the product distribution over two time slots based on the optimal distribution for T = 1, whereas the red cylinders represent the optimal distribution for T = 2. The height of a cylinder is proportional to the radial probability mass at a particular radius (T = 2) or at a particular point (T = 1); the axes are x_1 and x_2. The blue cylinders are staggered slightly for visibility.

Fig. 4: µ_X(·) for SNR = 1. Same comments as those in the caption of Fig. 3.

We observe that there is a significant mass point at x = 0 for both T = 1 and T = 2. We also observe that the capacity-achieving distribution for T = 1 is not optimal for T = 2. Moreover, in the domain of all x ∈ R², a pilot-based scheme performing channel estimation only would correspond to just a single point with a probability mass of 1, whereas a scheme corresponding to pilot-based channel estimation and subsequent use of Gaussian codebooks would correspond to a distribution supported on a one-dimensional line in the 2D space. The observation that the capacity-achieving scheme in Figs. 3-4 is neither of these demonstrates the suboptimality (from a capacity point of view) of pilot-based channel estimation for maximizing the achievable rates of the block fading channel.

B. Information about channel state

Many existing communication systems separately estimate the channel using pilot symbols and then use the estimate for subsequent data transmission, either assuming that the estimate is perfect or modeling the channel estimation error. In this section, we discuss how the capacity-achieving distribution can inform channel estimation. In Fig. 5 we plot I(H; Y) under the optimal signaling distribution µ*_X for different SNRs and compare it with the mutual information I(H; Y) computed under the distribution corresponding to using pilots for channel estimation over T symbol times, namely, µ_X(x) = 1 if and only if x = x_0, where x_0 is a vector whose norm satisfies the power constraint. The latter is just the AWGN capacity expression 0.5 log₂(1 + T · SNR).
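The pilot-only baseline quoted above has a simple closed form because (H, Y) are then jointly Gaussian. The minimal check below evaluates it on an arbitrary SNR grid of our choosing; it is a sketch of the stated formula, not the paper's plotting code.

```python
import numpy as np

def pilot_mutual_information_bits(snr, T):
    # Y = H * x0 + nu with H ~ N(0, 1), nu ~ N(0, (1/snr) * I), and ||x0||^2 = T,
    # so I(H; Y) = 0.5 * log2(1 + ||x0||^2 * snr) = 0.5 * log2(1 + T * snr).
    return 0.5 * np.log2(1.0 + T * snr)

for snr_db in [-10, 0, 10, 20]:
    snr = 10.0 ** (snr_db / 10.0)
    print(f"{snr_db:>4} dB: {pilot_mutual_information_bits(snr, T=2):.3f} bits")
```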

Fig. 5: Mutual information between the channel and the channel output with pilot symbols (dashed line) and with µ*_X (solid line) for T = 2.

We observe from the figure that even with the capacity-achieving distribution µ*_X, the information content about the channel h in the output y, as measured by the mutual information I(H; Y), is nonzero. This suggests that the capacity-achieving input distribution also allows information about the channel state to be obtained at the decoder even without any pilot symbols. This has implications for both the theory and practice of joint channel estimation and data transmission. The figures show conclusively that, ignoring computational complexity constraints, data transmission at channel capacity does not preclude channel estimation.

VI. CONCLUSIONS

We consider the capacity of a block Rayleigh fading channel without instantaneous channel state information at either the transmitter or the receiver, but with knowledge of the fading statistics. We establish that, similar to known results for the capacity of the Rayleigh and Ricean i.i.d. fading channels, the capacity of the block Rayleigh fading channel is also achieved by an input distribution µ_X(·) which is only a function of r = ‖x‖ and in which the measure µ_R(·) on the norm r is discrete with a finite number of mass points. We use this result to present numerical estimates of the capacity and the corresponding capacity-achieving distributions, and demonstrate numerically that pilot-based channel estimation achieves strictly lower rates than capacity. In addition, our numerical results show that under the capacity-achieving distribution the mutual information between the channel state and the output is non-zero, suggesting that channel estimation is implicitly performed by the capacity-optimal decoder. Further investigation of this phenomenon and its implications for practical system design are topics for future work.

REFERENCES

[1] M. Mushkin and I. Bar-David, "Capacity and coding for the Gilbert-Elliott channels," IEEE Trans. Inf. Theory, vol. 35, no. 6, pp. 1277-1290, 1989.
[2] A. J. Goldsmith and P. P. Varaiya, "Capacity, mutual information, and coding for finite-state Markov channels," IEEE Trans. Inf. Theory, vol. 42, no. 3, pp. 868-886, 1996.
[3] I. C. Abou-Faycal et al., "The capacity of discrete-time memoryless Rayleigh-fading channels," IEEE Trans. Inf. Theory, vol. 47, no. 4, pp. 1290-1301, 2001.
[4] M. C. Gursoy et al., "The noncoherent Rician fading channel - Part I: Structure of the capacity-achieving input," IEEE Trans. Wireless Commun., vol. 4, no. 5, pp. 2193-2206, 2005.
[5] B. M. Hochwald and T. L. Marzetta, "Unitary space-time modulation for multiple-antenna communications in Rayleigh flat fading," IEEE Trans. Inf. Theory, vol. 46, no. 2, pp. 543-564, 2000.
[6] L. Zheng and D. N. C. Tse, "Communication on the Grassmann manifold: A geometric approach to the noncoherent multiple-antenna channel," IEEE Trans. Inf. Theory, vol. 48, no. 2, pp. 359-383, 2002.
[7] S. Shamai and T. L. Marzetta, "Multiuser capacity in block fading with no channel state information," IEEE Trans. Inf. Theory, vol. 48, no. 4, pp. 938-942, 2002.
[8] S. Murugesan et al., "Optimization of training and scheduling in the non-coherent SIMO multiple access channel," IEEE J. Sel. Areas Commun., vol. 25, no. 7, pp. 1446-1456, 2007.
[9] I. Abou-Faycal and B. M. Hochwald, "Coding requirements for multiple-antenna channels with unknown Rayleigh fading," Bell Labs Technical Memorandum, 1999.
[10] N. Jindal and A. Lozano, "Optimum pilot overhead in wireless communication: A unified treatment of continuous and block-fading channels," arXiv preprint arXiv:0903.1379, 2009.
[11] M. Chowdhury and A. Goldsmith, "Capacity of block fading SIMO channels without CSI," to be submitted.
[12] T. L. Marzetta and B. M. Hochwald, "Capacity of a mobile multiple-antenna communication link in Rayleigh flat fading," IEEE Trans. Inf. Theory, vol. 45, no. 1, pp. 139-157, 1999.
[13] W. Rudin, Real and Complex Analysis. Tata McGraw-Hill Education, 1987.
[14] E. Jones et al., "SciPy: Open source scientific tools for Python," 2001. [Online]. Available: http://www.scipy.org/