Classical Capacities of Quantum Channels


Classical Capacities of Quantum Channels

Jens Christian Bo Jørgensen
Supervisor: Jan Philip Solovej

Thesis for the Master's degree in Mathematics
Institute for Mathematical Sciences, University of Copenhagen, Denmark

February 4, 2008

Preface

The work in this master's thesis was carried out during the period February 2007 to February 2008. The initial idea to study the classical capacities of quantum channels was inspired by a talk given by Nilanjana Datta during my participation in the spring school on Theoretical and Technological Perspectives on Quantum Information and Communication in Marseilles. The project later evolved into a review of many results relating to classical capacities of quantum channels, including the Holevo Conjecture. This thesis is intended for an audience of other master's students in mathematics or physics. The prerequisites for reading this paper consist of a basic knowledge of quantum mechanics and bachelor-level mathematics. I would like to take the opportunity to thank Jan Philip Solovej for careful and patient supervision and numerous interesting discussions. I would also like to thank Mary Beth Ruskai and Chris King for taking the time to discuss quantum information theory with me during my visit to Boston in the fall of 2007. For meticulous proofreading and critical comments I would like to thank Jonatan Brask Bohr. Thank you also to my father for correcting my English.

Contents

1 Introduction
  1.1 The Holevo Additivity Conjecture
2 Representation of Quantum Channels
3 Capacity
  3.1 Notation
  3.2 Capacity of a classical channel
  3.3 The Classical Capacity of a Quantum Channel
4 Capacity Relations
  4.1 The HSW Theorem
  4.2 Shannon Capacity
5 Convexity Theory
  5.1 General Convexity Results
  5.2 The Set of States
  5.3 The Holevo Capacity - Reconsidered
6 Equivalence of Additivity Conjectures
7 Towards a Proof or a Counterexample
  7.1 Winter's proof
8 Conclusion and Outlook
A Properties of the Von Neumann Entropy
B Channel Extension Computation
C Equality of Distributions
D Explicit form of gaussian rate function
Bibliography

1 Introduction

A central issue in information theory as well as in quantum information theory is to understand ultimate rates of communication. Suppose two parties, Alice and Bob, are connected by some sort of communication channel, and we ask the question: How much information can Alice transmit to Bob per use of the channel? The answer of course depends on the nature of the channel, and on what is meant by "information". In classical information theory a simple model of communication, the so-called classical noisy channel, has been widely studied. In a noisy channel each input is subject to random noise, so that Alice only knows with some probability what message Bob will receive. Noisy channels are all around us. Any type of digital data transmission is subject to noise due to losses in wires, poor weather conditions and so on. These sources of noise can effectively be modeled as being random. For noisy channels the question above has been answered satisfactorily. Shannon's famous Noisy Channel Coding Theorem provides a formula for the capacity of a noisy channel. The capacity is the ultimate rate of asymptotically perfect transmission of information by $n$ independent uses of the channel, as $n \to \infty$. For a concrete channel, Shannon's formula allows for a relatively easy calculation of the capacity; at least numerically.

Now suppose Alice and Bob are connected by a quantum channel. A quantum channel is a very general model of a device capable of changing and transmitting quantum states. It is a generalization of the classical noisy channel. For a concrete example, think of an optical cable connecting Alice and Bob, through which individual photons are sent. Alice can prepare the individual states of the photons, and Bob can make measurements on the received photons in accordance with the laws of quantum mechanics. On their way through the channel, the states of the photons change due to interactions with the channel and due to noise. How much classical information (Footnote 1) can Alice send to Bob per use of the channel, that is, per photon? In other words, what is the classical capacity of the quantum channel? Though this question was first posed in the early days of quantum information theory, nearly forty years ago, it still remains open.

In the quantum world things are not so simple. There is not one, but (at least) four capacities to consider for a quantum channel. Answering the question above amounts to showing how these four capacities relate to each other. Much progress was made around the turn of the millennium, when it was discovered that two of these four capacities are in fact identical and a (relatively) simple formula was provided for a third capacity, called the Holevo capacity. However, it is still not fully understood how all the capacities relate to each other. The main open question left is the additivity of the Holevo capacity. We will see precisely what this means in Section 1.1. If the Holevo capacity is additive, the four capacities collapse into two, both of which are given by (relatively) simple formulas. If not, the capacity question remains open.

Footnote 1: There exist purely quantum mechanical measures of information in the literature. We shall not consider them here. Everywhere in this paper "information" will mean classical information, as introduced by Claude Shannon.

In this paper we will review some of the major results concerning the classical capacity of a quantum channel and provide a peek into some of the latest research on the issue. The paper consists of two parts. In the first part, which comprises Chapters 1-4, we study quantum channels and define the four different capacities in terms of information transmission rates. These capacity measures are then related to each other, and we provide a formula for each of them. Some questions are necessarily left open as they depend on the Holevo Conjecture. The second part of this paper comprises Chapters 5-7 and focuses on the Holevo Conjecture. We begin with an outline of useful tools from convex analysis. These tools are then applied to the Holevo capacity to provide five equivalent formulations of the Holevo Conjecture. Finally, we close the paper with a presentation of a very recent proof strategy for the Holevo Conjecture, which is based on one of these formulations. The structure of the thesis is displayed in Figure 1.

[Figure 1: The structure of the thesis. Classical Capacity track: 2. Representation of Quantum Channels, 3. Capacity Measures, 4. Capacity Relations. Holevo Conjecture track: 5. Convexity Theory, 6. Equivalence of Additivity Conjectures, 7. Towards a Proof or a Counterexample.]

1.1 The Holevo Additivity Conjecture

In this introductory chapter we will take the shortest path to the Holevo conjecture. We will define the quantities necessary to understand the precise statement of the conjecture, but leave out motivation and physical interpretation for later chapters. The objective is that the reader quickly gets to the mathematical kernel of this paper. This section was inspired by [1].

The Classical Noisy Channel

Let us begin with a definition of a classical noisy channel or stochastic map.

Definition 1.1 (Classical Channel) Let $X$ and $Y$ be finite sets. A stochastic map is a map $\Phi\colon X \times Y \to [0,1]$ satisfying
$$\sum_{y \in Y} \Phi(x,y) = 1, \qquad x \in X. \qquad (1)$$
In this setting we call the sets alphabets, and the elements they contain letters.

[Figure 2: Given the input letter $x$ from Alice, the channel transmits the letter $y$ to Bob with probability $P(y|x)$.]

We should think of $\Phi$ as the mathematical model of a real physical channel. The idea is that Alice is placed at one end of the channel equipped with the alphabet $X$ and Bob at the other end equipped with the alphabet $Y$. Now Alice wants to send messages to Bob, but the channel does not (necessarily) transmit letters faithfully. When Alice transmits the letter $x$, Bob receives the letter $y$ with probability $\Phi(x,y)$. By definition, for any $x \in X$, the map $y \mapsto \Phi(x,y)$ is a probability distribution. Sometimes we shall write the stochastic map as $P(y|x) = \Phi(x,y)$, emphasizing that we think of $P(y|x)$ as the conditional probability of receiving $y$, given that $x$ was sent.

Many noisy channels appearing in real life can be modeled by a stochastic map. Data transmission from one mobile phone to another is an example. On the way from sender to receiver the data is exposed to many sources of noise, such as bad weather conditions, losses in wires, a bird flying into the broadcasting mast, etc. (Footnote 2). The effect of all this noise can be modeled by choosing an appropriate probability distribution $\Phi(x,\cdot)$ for each $x$.

Given a channel $\Phi$ and an input probability distribution $\pi$ on $X$, the channel transforms $\pi$ into an output probability distribution $\pi' = \Phi\pi$ on $Y$, given by
$$\pi'(y) = \sum_{x \in X} \Phi(x,y)\pi(x).$$
Let $\delta_x$, $x \in X$, denote the degenerate probability distribution on $X$, i.e.
$$\delta_x(k) = \begin{cases} 1 & \text{for } k = x \\ 0 & \text{else.} \end{cases} \qquad (2)$$

Footnote 2: The reason we can actually send messages from one mobile phone to another is due to clever error correcting codes. Essentially, the probability of error is reduced by introducing redundancies in the messages that are to be transmitted. As a simplified example: the message "hhhhhhhhhhiiiiiiiiii" is sent instead of "hi".
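For readers who like to experiment, a stochastic map over finite alphabets is conveniently stored as a matrix whose rows are the conditional distributions $\Phi(x,\cdot)$; the action $\pi \mapsto \Phi\pi$ is then a matrix-vector product. The following minimal sketch is my own illustration (the binary symmetric channel and the flip probability 0.1 are arbitrary choices, not examples from the text):

```python
import numpy as np

# A stochastic map Phi: X x Y -> [0,1], stored with Phi[x, y] = P(y | x).
# Illustrative example: a binary symmetric channel with flip probability 0.1.
Phi = np.array([[0.9, 0.1],
                [0.1, 0.9]])
assert np.allclose(Phi.sum(axis=1), 1.0)   # each row Phi(x, .) is a probability distribution, cf. (1)

def apply_channel(Phi, pi):
    """Output distribution (Phi pi)(y) = sum_x Phi(x, y) pi(x)."""
    return Phi.T @ pi

pi = np.array([0.7, 0.3])                  # an input distribution on X
print(apply_channel(Phi, pi))              # the output distribution on Y

delta_0 = np.array([1.0, 0.0])             # the degenerate distribution delta_x of (2), here x = 0
print(apply_channel(Phi, delta_0))         # equals the row Phi(0, .)
```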

A channel $\Phi$ for which for any $x \in X$ we have $\Phi\delta_x = \delta_{y_x}$ for some $y_x \in Y$ is noiseless. Denote by
$$P(X) = \{\pi\colon \pi(x) \geq 0,\ \sum_{x \in X} \pi(x) = 1\}, \qquad (3)$$
the convex set of probability distributions on $X$. In a convex set $K$, a point $x$ is called extreme if it does not lie in the open line segment $]a,b[$ for any points $a,b \in K$. The set of extreme points of $K$ is called the extreme boundary, and is denoted by $\mathrm{ext}\,K$. If any point $x \in K$ can be written uniquely as a convex combination of extreme points, the set $K$ is called a simplex. We will return to convexity theory in Chapter 5. It is straightforward to check that the extreme points of the set $P(X)$ are the degenerate probability distributions and that $P(X)$ is a simplex.

Proposition 1.2 For a direct product $X_1 \times X_2$ of two alphabets, the extreme points of $P(X_1 \times X_2)$ are precisely the products of extreme points of $P(X_1)$ and $P(X_2)$, i.e.,
$$\mathrm{ext}\,P(X_1 \times X_2) = \mathrm{ext}\,P(X_1) \otimes \mathrm{ext}\,P(X_2). \qquad (4)$$

Proof. Follows immediately from $\delta_{(x_1,x_2)} = \delta_{x_1} \otimes \delta_{x_2}$.

When Alice wants to send $n$ letters to Bob she uses the physical channel $n$ times. In the most general situation any use of the physical channel may depend on previous uses of the channel. However, we will assume that the physical channel is "memoryless", so that this is not the case. How do we model multiple uses of a channel? Consider two channels $\Phi_i\colon X_i \times Y_i \to [0,1]$, $i = 1,2$. We define the channel product $\Phi_1 \otimes \Phi_2\colon (X_1 \times X_2) \times (Y_1 \times Y_2) \to [0,1]$ by
$$\Phi_1 \otimes \Phi_2(x_1,x_2,y_1,y_2) = \Phi_1(x_1,y_1)\Phi_2(x_2,y_2). \qquad (5)$$
It is easy to verify that $\Phi_1 \otimes \Phi_2$ is a stochastic map and that $\otimes$ is associative. The product on the right-hand side of (5) is exactly the "memoryless" assumption. Then $n$ uses of the physical channel correspond to applying the channel $\Phi \otimes \cdots \otimes \Phi$ ($n$ factors).

[Figure: $n$ parallel uses of the channel $\Phi$, mapping the input letters $a_1,\ldots,a_n$ to the output letters $b_1,\ldots,b_n$.]
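In the same matrix picture, the channel product (5) is simply a Kronecker product, so $n$ memoryless uses of a channel can be built up by repeated Kronecker products. A small sketch (again my own illustration, continuing the binary symmetric channel example above):

```python
import numpy as np

def channel_product(Phi1, Phi2):
    """Stochastic matrix of Phi1 (x) Phi2 on the product alphabets, cf. (5)."""
    return np.kron(Phi1, Phi2)

Phi = np.array([[0.9, 0.1],
                [0.1, 0.9]])

n = 3
Phi_n = Phi
for _ in range(n - 1):                       # n memoryless uses: Phi (x) ... (x) Phi
    Phi_n = channel_product(Phi_n, Phi)

print(Phi_n.shape)                           # (2**n, 2**n)
print(np.allclose(Phi_n.sum(axis=1), 1.0))   # the product is again a stochastic map
```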

According to Shannon's Noisy Channel Coding Theorem, the capacity of a classical noisy channel is given by
$$C(\Phi) = \sup_{\pi \in P(X)} \Big\{ H(\Phi\pi) - \sum_x \pi(x) H(\Phi\delta_x) \Big\}. \qquad (6)$$
The term in curly brackets is called the Shannon mutual information of the probability distributions $\pi$ and $\Phi\pi$, and
$$H(\pi) = -\sum_x \pi(x) \log \pi(x) \qquad (7)$$
is the Shannon entropy. Here, and in the following, $\log$ will always denote the base 2 logarithm. We have $\lim_{x \to 0} x\log x = 0$, so define $0\log 0 := 0$. Then $x \mapsto x\log x$ is continuous on $[0,\infty)$. In Chapter 3 we will see how to define the capacity in terms of data transmission, but for now we can take (6) either on faith or as a temporary definition.

We can view $P(X)$ as a compact subset of the finite dimensional vector space $\mathbb{R}^X$ and $P(Y)$ as a compact subset of $\mathbb{R}^Y$. The map $\pi \mapsto \Phi\pi$ is continuous, since it can be extended to a linear map $\mathbb{R}^X \to \mathbb{R}^Y$. Furthermore the entropy function $H\colon P(Y) \to [0,\infty)$ is continuous. Thus the mutual information
$$H(\pi : \Phi\pi) = H(\Phi\pi) - \sum_x \pi(x) H(\Phi\delta_x) \qquad (8)$$
is a continuous function on $P(X)$. Hence it follows that
$$C(\Phi) = \max_{\pi \in P(X)} \Big\{ H(\Phi\pi) - \sum_x \pi(x) H(\Phi\delta_x) \Big\}, \qquad (9)$$
i.e. the supremum in the definition of $C$ is in fact a maximum. A probability distribution attaining the max is called optimal. The capacity has the following property.

Proposition 1.3 (Additivity of classical capacity) For two stochastic maps $\Phi_1$ and $\Phi_2$ we have
$$C(\Phi_1 \otimes \Phi_2) = C(\Phi_1) + C(\Phi_2). \qquad (10)$$

In colloquial terms, this proposition says that two independent channels viewed as one channel can transmit as much information as the sum of what the individual channels are capable of transmitting. This is what one would expect.

Proof. "$\geq$" (superadditivity). Observe that $H(\sigma_1 \otimes \sigma_2) = H(\sigma_1) + H(\sigma_2)$ for any two probability distributions. It is also straightforward to see that $\Phi_1 \otimes \Phi_2(\delta_{x_1} \otimes \delta_{x_2}) = \Phi_1\delta_{x_1} \otimes \Phi_2\delta_{x_2}$, for $x_i \in X_i$, $i = 1,2$, simply by unfolding definitions. Now let $\pi_i$ be the optimal distribution for $\Phi_i$ according to (9). Then we get
$$\begin{aligned}
C(\Phi_1 \otimes \Phi_2) &\geq H(\pi_1 \otimes \pi_2 : \Phi_1 \otimes \Phi_2\, \pi_1 \otimes \pi_2) \\
&= H(\Phi_1 \otimes \Phi_2\, \pi_1 \otimes \pi_2) - \sum_{x_1,x_2} \pi_1(x_1)\pi_2(x_2)\, H(\Phi_1 \otimes \Phi_2\, \delta_{(x_1,x_2)}) \\
&= H(\Phi_1\pi_1) + H(\Phi_2\pi_2) - \sum_{x_1,x_2} \pi_1(x_1)\pi_2(x_2)\big[ H(\Phi_1\delta_{x_1}) + H(\Phi_2\delta_{x_2}) \big] \\
&= \Big[ H(\Phi_1\pi_1) - \sum_{x_1} \pi_1(x_1) H(\Phi_1\delta_{x_1}) \Big] + \Big[ H(\Phi_2\pi_2) - \sum_{x_2} \pi_2(x_2) H(\Phi_2\delta_{x_2}) \Big] \\
&= C(\Phi_1) + C(\Phi_2).
\end{aligned}$$

"$\leq$" (subadditivity). Let $\sigma$ be a distribution on $Y_1 \times Y_2$. Let $\sigma_1$ and $\sigma_2$ be the marginal distributions on $Y_1$ and $Y_2$, respectively. That is, $\sigma_1(y_1) = \sum_{y_2} \sigma(y_1,y_2)$ and $\sigma_2(y_2) = \sum_{y_1} \sigma(y_1,y_2)$. The entropy is subadditive, meaning that $H(\sigma) \leq H(\sigma_1) + H(\sigma_2)$. Let $\pi$ be a distribution attaining the maximum in (9) and let $\pi_i$, $i = 1,2$, be its marginal distributions. Then
$$\begin{aligned}
C(\Phi_1 \otimes \Phi_2) &= H(\Phi_1 \otimes \Phi_2\, \pi) - \sum_x \pi(x) H((\Phi_1 \otimes \Phi_2)\delta_x) \\
&\leq H(\Phi_1\pi_1) + H(\Phi_2\pi_2) - \sum_{x_1,x_2} \pi(x_1,x_2)\, H((\Phi_1 \otimes \Phi_2)\,\delta_{x_1} \otimes \delta_{x_2}) \\
&= H(\Phi_1\pi_1) + H(\Phi_2\pi_2) - \sum_{x_1,x_2} \pi(x_1,x_2)\big[ H(\Phi_1\delta_{x_1}) + H(\Phi_2\delta_{x_2}) \big] \\
&= \Big[ H(\Phi_1\pi_1) - \sum_{x_1} \pi_1(x_1) H(\Phi_1\delta_{x_1}) \Big] + \Big[ H(\Phi_2\pi_2) - \sum_{x_2} \pi_2(x_2) H(\Phi_2\delta_{x_2}) \Big] \\
&\leq C(\Phi_1) + C(\Phi_2).
\end{aligned}$$

In particular we have that $C(\Phi^{\otimes n}) = nC(\Phi)$.

The Quantum Channel

According to basic quantum mechanics every physical system comes associated with a Hilbert space $\mathcal{H}$, in such a way that the states of the physical system are described by the set of density matrices
$$D(\mathcal{H}) = \{\varrho \in B(\mathcal{H})\colon \mathrm{tr}(\varrho) = 1,\ \varrho \geq 0\}. \qquad (11)$$
The positivity requirement $\varrho \geq 0$ means that $\langle x|\varrho|x\rangle \geq 0$ for any unit vector $x \in \mathcal{H}$. The space $B(\mathcal{H})$ is itself a Hilbert space when equipped with the Hilbert-Schmidt inner product $\langle a,b\rangle = \mathrm{tr}(ab^*)$, for $a,b \in B(\mathcal{H})$. The space $D(\mathcal{H})$ is easily seen to be convex, and it is not hard to show that the set is bounded in the norm induced by the inner product. Hence $D(\mathcal{H})$ is also compact. Any time evolution of a system in some state $\varrho_A \in D(\mathcal{H}_A)$ to a state $\varrho_B \in D(\mathcal{H}_B)$, where $\dim \mathcal{H}_A, \dim \mathcal{H}_B < \infty$, can be described by a so-called quantum channel. We will consider quantum channels more closely in Chapter 2. A quantum channel is defined as follows.

Definition 1.4 (Quantum Channel) Let $\Phi\colon B(\mathcal{H}_1) \to B(\mathcal{H}_2)$ be a linear map, where $B(\mathcal{H}_i)$ is the space of bounded operators on the finite dimensional Hilbert space $\mathcal{H}_i$, for $i = 1,2$. If $\Phi$ is completely positive and trace preserving (CPT), it is called a quantum channel or a CPT map. That is, the linear map $\Phi$ must satisfy:
1. (trace preserving) $\mathrm{tr}\,\Phi(\varrho) = \mathrm{tr}(\varrho)$ for all $\varrho \in B(\mathcal{H}_1)$.
2. (completely positive) For any auxiliary Hilbert space $V$, the map $I \otimes \Phi\colon B(V \otimes \mathcal{H}_1) \to B(V \otimes \mathcal{H}_2)$, where $I\colon B(V) \to B(V)$ is the identity map, must be positive. That is, $(I \otimes \Phi)(A) \geq 0$ for $A \geq 0$.

Remark 1.5 Note that the domain and range of a quantum channel are vector spaces of the form $B(\mathcal{H})$. This is a matter of convenience rather than a reflection of physical reality.

The states of a system are objects in the much smaller convex set $D(\mathcal{H}) \subset B(\mathcal{H})$. A completely positive map is a map satisfying the last criterion above, but which is not necessarily trace preserving. We shall consider such maps in Chapter 6.

As for the classical channels, we can consider the capacity of a quantum channel. In the quantum world things are not so simple. There are many types of capacities depending on which restrictions we put on the transmission of information, or even depending on what we mean by information. In the rapidly expanding field of quantum information, attempts are being made to introduce a purely quantum mechanical notion of information. We shall not consider this here. Instead we will restrict ourselves to the simpler information transmission protocol in which Alice sends classical information to Bob, but using a quantum channel. That is, first Alice must encode her information in quantum states. The states are then transmitted to Bob using the quantum channel. It is up to Bob to decode the received quantum states in order to retrieve the original message. Capacities for this type of protocol are called classical capacities for quantum channels. Even in this simpler situation, there are (at least) four different capacity measures to consider depending on what restrictions we put on the coding and decoding methods Alice and Bob use. In Chapter 4 we investigate the various relations among these capacities. As part of this, we will prove that two of these seemingly different measures are in fact identical. We will not be able to provide a full description of the capacity relations, as this is a subject of current research.

One of the four (or really three) capacity measures has emerged as particularly important. This is the Holevo capacity $\chi$. It has been conjectured that the Holevo capacity is additive, precisely as was the case for the classical capacity. The conjecture has been standing for nearly 10 years. If additivity is proved to hold, another two of the capacity measures will collapse into one, leaving us with only two capacity measures. Among the two remaining capacities, the Holevo capacity will be the more interesting, since it measures the highest obtainable classical capacity for a quantum channel. The other capacity measure pertains to the restrictive information transmission protocol in which entangled codewords are not allowed, or equivalently in which the quantum channel is used as a classical channel. What is also important is that the Holevo capacity is given by a formula which can be explicitly computed for concrete channels - at least numerically (though not as simply as with Shannon's formula). In this respect, a proof of the additivity conjecture is the missing link in the description of the classical capacity of a quantum channel.

The Holevo Capacity

An ensemble $E = (p_i, \varrho_i)$ in $D(\mathcal{H})$ is a finite collection of states $\varrho_i$ together with a probability distribution $\{p_i\}_i$. The quantum analogue of (6) is the Holevo capacity
$$\chi(\Phi) = \sup_E \Big[ S\big(\sum_i p_i \Phi(\varrho_i)\big) - \sum_j p_j S(\Phi(\varrho_j)) \Big]. \qquad (12)$$
The supremum here is over all ensembles in $D(\mathcal{H})$ and
$$S(\varrho) = -\mathrm{tr}(\varrho \log \varrho) \qquad (13)$$
is the Von Neumann entropy. Any density matrix $\varrho$ has a spectral decomposition $\varrho = \sum_{i=1}^n \lambda_i |x_i\rangle\langle x_i|$, where $\lambda_i$ are the eigenvalues and $\{|x_i\rangle\}_i$ is an orthonormal basis.

By definition then
$$\varrho \log \varrho = \sum_{i=1}^n \lambda_i \log(\lambda_i)\, |x_i\rangle\langle x_i|,$$
where by definition $\lambda_i \log(\lambda_i) = 0$ if $\lambda_i = 0$. Hence $S(\varrho) = H(\{\lambda_i\})$. Consider the Holevo quantity defined by
$$K(E) = S\big(\sum_i p_i \varrho_i\big) - \sum_j p_j S(\varrho_j). \qquad (14)$$
For a channel $\Phi\colon B(\mathcal{H}) \to B(\mathcal{K})$ we let $\Phi E$ denote the ensemble $(p_i, \Phi(\varrho_i))$ in $D(\mathcal{K})$. The Holevo capacity of the channel $\Phi$ can then be written as
$$\chi(\Phi) = \sup_E K(\Phi E), \qquad (15)$$
where again the supremum is over ensembles in $D(\mathcal{H})$.

A state $\varrho$ in $D(\mathcal{H})$ is said to be pure if $\varrho = |x\rangle\langle x|$ for some unit vector $x \in \mathcal{H}$. The set of pure states forms the extreme boundary of the set $D(\mathcal{H})$. This we show in Chapter 5. In that chapter we also show that the Holevo capacity is given by
$$\chi(\Phi) = \max_{\substack{\{p_j,\varrho_j\}_{j=1}^k \\ \varrho_j \text{ pure}}} \Big[ S\big(\sum_i p_i \Phi(\varrho_i)\big) - \sum_j p_j S(\Phi(\varrho_j)) \Big]. \qquad (16)$$
That is, the supremum has been replaced with a maximum over ensembles containing $k$ pure states. Here $k$ is a fixed number depending only on the dimension of $\mathcal{H}_A$. It can be shown that $k = (\dim \mathcal{H}_A)^2$, but we shall not prove nor use this result.

For two quantum mechanical systems $A_1$ and $A_2$ with possible states $D(\mathcal{H}_{A_1})$ and $D(\mathcal{H}_{A_2})$, the axioms of quantum mechanics dictate that the possible states of the combined system $A_1 A_2$ are $D(\mathcal{H}_{A_1} \otimes \mathcal{H}_{A_2})$. For two quantum channels $\Phi_i\colon B(\mathcal{H}_{A_i}) \to B(\mathcal{H}_{B_i})$, with $i = 1,2$, consider the channel $\Phi_1 \otimes \Phi_2\colon B(\mathcal{H}_{A_1} \otimes \mathcal{H}_{A_2}) \to B(\mathcal{H}_{B_1} \otimes \mathcal{H}_{B_2})$, uniquely defined by its action on product states,
$$\Phi_1 \otimes \Phi_2(\varrho_1 \otimes \varrho_2) = \Phi_1(\varrho_1) \otimes \Phi_2(\varrho_2), \qquad (17)$$
for all $\varrho_i \in B(\mathcal{H}_{A_i})$, with $i = 1,2$. As for the classical channel, this definition assumes that the channels $\Phi_1$ and $\Phi_2$ are independent. The additivity conjecture is the following.

Conjecture 1.6 For all quantum channels $\Phi_1$ and $\Phi_2$ the Holevo capacity is additive, i.e.,
$$\chi(\Phi_1 \otimes \Phi_2) \overset{?}{=} \chi(\Phi_1) + \chi(\Phi_2). \qquad (18)$$

Now let us mimic the proof of Proposition 1.3 with the classical capacity replaced by the Holevo capacity, and see where it breaks down.

Incomplete proof of additivity. "$\geq$" (superadditivity). Let $E_1 = \{p_i, \varrho_i\}_i$ and $E_2 = \{q_j, \tau_j\}_j$ be ensembles attaining the maximum in (16), and let $E_1 \otimes E_2$ denote the ensemble $\{p_i q_j, \varrho_i \otimes \tau_j\}_{i,j}$.

Then we have
$$\begin{aligned}
\chi(\Phi_1 \otimes \Phi_2) &\geq S\big(\sum_{i,j} p_i q_j\, \Phi_1 \otimes \Phi_2(\varrho_i \otimes \tau_j)\big) - \sum_{i,j} p_i q_j\, S(\Phi_1 \otimes \Phi_2(\varrho_i \otimes \tau_j)) \\
&= S\big(\sum_i p_i \Phi_1(\varrho_i)\big) + S\big(\sum_j q_j \Phi_2(\tau_j)\big) - \sum_i p_i S(\Phi_1(\varrho_i)) - \sum_j q_j S(\Phi_2(\tau_j)) \\
&= \chi(\Phi_1) + \chi(\Phi_2).
\end{aligned}$$
Here we have used the property of the Von Neumann entropy that $S(\varrho \otimes \tau) = S(\varrho) + S(\tau)$. This proves that $\chi$ is superadditive.

"$\leq$" (subadditivity). The Von Neumann entropy is subadditive, meaning that for $\varrho_{12} \in D(\mathcal{H}_1 \otimes \mathcal{H}_2)$ we have $S(\varrho_{12}) \leq S(\varrho_1) + S(\varrho_2)$, where $\varrho_1 = \mathrm{tr}_2\, \varrho_{12}$ and $\varrho_2 = \mathrm{tr}_1\, \varrho_{12}$, and $\mathrm{tr}_i$ indicates a trace over $\mathcal{H}_i$ for $i = 1,2$. Let $E = \{p_i, \varrho_i\}_i$ be an optimal ensemble in $D(\mathcal{H}_{A_1} \otimes \mathcal{H}_{A_2})$ for $\chi(\Phi_1 \otimes \Phi_2)$. Using the subadditivity property we get
$$\begin{aligned}
\chi(\Phi_1 \otimes \Phi_2) &= S\big(\sum_i p_i\, \Phi_1 \otimes \Phi_2(\varrho_i)\big) - \sum_i p_i S(\Phi_1 \otimes \Phi_2(\varrho_i)) \\
&\leq \Big[ S\big(\sum_i p_i \Phi_1(\varrho_{i,1})\big) + S\big(\sum_i p_i \Phi_2(\varrho_{i,2})\big) \Big] - \sum_i p_i S(\Phi_1 \otimes \Phi_2(\varrho_i)) \ \leq \ ?
\end{aligned}$$
Here $\varrho_{i,1}$ and $\varrho_{i,2}$ denote the reduced states of $\varrho_i$ on $\mathcal{H}_{A_1}$ and $\mathcal{H}_{A_2}$, respectively. The question mark is where the proof breaks down. Looking back at the proof for the additivity of classical capacity, we used at this particular step that
$$\mathrm{ext}\,P(X_1 \times X_2) = \mathrm{ext}\,P(X_1) \otimes \mathrm{ext}\,P(X_2), \qquad (19)$$
to write $\delta_x = \delta_{x_1} \otimes \delta_{x_2}$. In the quantum case we have
$$\mathrm{ext}(D(\mathcal{H}_1 \otimes \mathcal{H}_2)) \neq \mathrm{ext}\,D(\mathcal{H}_1) \otimes \mathrm{ext}\,D(\mathcal{H}_2). \qquad (20)$$
(See Chapter 5.) In other words, there exist states $\varrho \in D(\mathcal{H}_1 \otimes \mathcal{H}_2)$ which cannot be written as $\varrho = \varrho_1 \otimes \varrho_2$ for any choice of $\varrho_i \in D(\mathcal{H}_i)$, $i = 1,2$. Such states are called entangled. The existence of entangled states is a simple consequence of the axioms of quantum mechanics, but the implications of this existence are vast and not well understood. In fact, the whole field of quantum information theory essentially deals with understanding entanglement as a resource for fast communication. Our "failed" proof above is an indication that more sophisticated methods need to be taken into account when attacking the additivity conjecture. In Chapter 7 we will consider some of the latest developments on the conjecture, including a possible proof strategy.
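The Von Neumann entropy (13) and the Holevo quantity (14) are straightforward to evaluate numerically for a given ensemble. The sketch below is my own illustration; the two-state qubit ensemble is an arbitrary choice, and computing $\chi(\Phi)$ itself would additionally require an optimization over ensembles as in (16).

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) = -tr(rho log2 rho), computed from the eigenvalues, cf. (13)."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]          # 0 log 0 := 0
    return float(-np.sum(evals * np.log2(evals)))

def holevo_quantity(probs, states):
    """K(E) = S(sum_i p_i rho_i) - sum_j p_j S(rho_j) for an ensemble E = (p_i, rho_i), cf. (14)."""
    avg = sum(p * rho for p, rho in zip(probs, states))
    return von_neumann_entropy(avg) - sum(p * von_neumann_entropy(rho)
                                          for p, rho in zip(probs, states))

# An arbitrary qubit ensemble of two pure states (illustrative choice).
ket0 = np.array([[1.0], [0.0]])
ket_plus = np.array([[1.0], [1.0]]) / np.sqrt(2)
rho0 = ket0 @ ket0.conj().T
rho_plus = ket_plus @ ket_plus.conj().T

print(holevo_quantity([0.5, 0.5], [rho0, rho_plus]))  # about 0.6 bits, strictly less than 1
```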

2 Representation of Quantum Channels

The quantum channel is the mathematical model describing the evolution of a quantum system with finitely many degrees of freedom. Suppose that the initial state of our system $A$ is $\varrho_A$ and that this state is not coupled with the environment. This means that the joint state of the system and the environment has the form $\varrho_A \otimes \varrho_E$, where $\varrho_E$ is the state of the environment. According to the laws of quantum mechanics, in the Schrödinger picture, the system will undergo unitary evolution as time progresses. Thus, at a later time the joint state will be $U(\varrho_A \otimes \varrho_E)U^*$, for some unitary matrix $U$. The matrix $U$ depends on the nature of the physical system involved. The end state of system $A$ is thus $\mathrm{tr}_E(U(\varrho_A \otimes \varrho_E)U^*)$, where we have traced out the environment. Now in general we may be interested in the end state of some subsystem $B$ of the joint system, and not just the system $A$. Therefore, the most general type of maps we will be concerned with are of the form $\Phi\colon B(\mathcal{H}_A) \to B(\mathcal{H}_B)$, with
$$\Phi(\varrho) = \mathrm{tr}_F(U(\varrho \otimes \varrho_E)U^*), \qquad (21)$$
where $\varrho_E \in B(\mathcal{H}_E)$ is a density matrix, $U \in B(\mathcal{H}_A \otimes \mathcal{H}_E)$ is a unitary operator, and $\mathcal{H}_A$, $\mathcal{H}_B$, $\mathcal{H}_E$, $\mathcal{H}_F$ are finite dimensional Hilbert spaces such that $\mathcal{H}_A \otimes \mathcal{H}_E = \mathcal{H}_B \otimes \mathcal{H}_F$. Figure 3 displays the various Hilbert spaces involved in this definition. We will temporarily call a map of the form (21) a Stinespring map, after W. F. Stinespring, who considered maps of this sort in the 1950s.

[Figure 3: The environment, represented by the systems E and F, participates in the unitary time evolution U of the total system. The quantum channel is obtained by tracing out the environment.]

It turns out that a Stinespring map and a quantum channel are in fact the same object. Furthermore, a quantum channel has a nice operator-sum representation due to K. Kraus and M. D. Choi. We collect these results in the following theorem.

Theorem 2.1 (Representation of Quantum Channels) The following are equivalent for a map $\Phi\colon B(\mathcal{H}_A) \to B(\mathcal{H}_B)$:
(a) $\Phi$ is a quantum channel.
(b) $\Phi$ is a Stinespring map.

(c) $\Phi(\tau) = \sum_i E_i \tau E_i^*$ (finite sum) for all $\tau \in B(\mathcal{H}_A)$ and for some operators $E_i\colon \mathcal{H}_A \to \mathcal{H}_B$ such that $\sum_i E_i^* E_i = I$.

Remark 2.2 Note that the order of $E_i$ and $E_i^*$ is interchanged in the two sums in (c). The operator elements are far from unique. It can be shown that two lists of elements $\{E_1,\ldots,E_n\}$ and $\{F_1,\ldots,F_m\}$ give rise to the same quantum operation when $E_i = \sum_{j=1}^m v_{ij} F_j$ for $i = 1,\ldots,n$, where $v$ is an $n \times m$ matrix such that $v^* v = I_m$ and $v v^* = I_n$.

Proof. (a) $\Rightarrow$ (c). Suppose $\Phi$ is a quantum channel. We will find an operator-sum representation of $\Phi$. Let $\mathcal{H}_R$ be a Hilbert space with the same dimension as $\mathcal{H}_A$ and let $\{|i_R\rangle\}$ and $\{|i_A\rangle\}$ be orthonormal bases for the spaces. Define a vector $\alpha \in \mathcal{H}_R \otimes \mathcal{H}_A$ by
$$|\alpha\rangle = \sum_i |i_R\rangle \otimes |i_A\rangle,$$
and put $\sigma = (I_R \otimes \Phi)(|\alpha\rangle\langle\alpha|)$, where $I_R$ is the identity map on $B(\mathcal{H}_R)$. Note that $\sigma \geq 0$ since $\Phi$ is CP. In particular we can diagonalize it and write $\sigma = \sum_j |s_j\rangle\langle s_j|$ for some (not necessarily unit length) vectors $s_j \in \mathcal{H}_R \otimes \mathcal{H}_B$. Define linear operators $E_i\colon \mathcal{H}_A \to \mathcal{H}_B$ by $E_i |j_A\rangle = \langle j_R | s_i\rangle$. Let us check that these operators fit the bill in (c). We have
$$\begin{aligned}
\sum_i E_i |k_A\rangle\langle l_A| E_i^* &= \sum_i \langle k_R | s_i\rangle\langle s_i | l_R\rangle = \langle k_R|\, \sigma\, |l_R\rangle = \langle k_R|\, (I \otimes \Phi)(|\alpha\rangle\langle\alpha|)\, |l_R\rangle \\
&= \sum_{i,j} \langle k_R|\, \big(|i_R\rangle\langle j_R| \otimes \Phi(|i_A\rangle\langle j_A|)\big)\, |l_R\rangle = \Phi(|k_A\rangle\langle l_A|),
\end{aligned}$$
for any $k,l$. By linearity it follows that $\Phi(\tau) = \sum_i E_i \tau E_i^*$ for any $\tau \in B(\mathcal{H}_A)$. Finally we need to show that $\sum_i E_i^* E_i = I$. Since $\Phi$ is trace preserving, we must have
$$1 = \mathrm{tr}(\Phi(\tau)) = \mathrm{tr}\big(\sum_i E_i \tau E_i^*\big) = \mathrm{tr}\big(\sum_i E_i^* E_i\, \tau\big), \qquad (22)$$
for all density matrices $\tau \in D(\mathcal{H}_A)$. The operator $L = \sum_i E_i^* E_i \in B(\mathcal{H}_A)$ is positive, so we can diagonalize it. Let $\Lambda = \mathrm{diag}(\lambda_1,\ldots,\lambda_a)$ be the diagonal matrix representing $L$ w.r.t. the basis of eigenvectors, with $a = \dim \mathcal{H}_A$. By (22) it follows that $\mathrm{tr}(\Lambda x) = 1$ for any density matrix $x \in B(\mathbb{C}^a)$. In particular for $x = \mathrm{diag}(x_1,\ldots,x_a)$ we get that $\sum_i \lambda_i x_i = 1$, with $\lambda_i, x_i \geq 0$ and $\sum_i x_i = 1$. It follows that $\Lambda = I_a$ and hence $\sum_i E_i^* E_i = I$, where these are the unit matrix and the identity operator, respectively. This proves (c).

Showing (c) $\Rightarrow$ (a) is straightforward. To see that a map $\Phi$ of the form (c) is trace preserving is essentially the calculation in equation (22). Complete positivity follows by noting that $(I \otimes \Phi)(\sigma) = \sum_i (I \otimes E_i)\, \sigma\, (I \otimes E_i)^*$, which is easily seen to be a positive map.
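The construction used in the proof of (a) $\Rightarrow$ (c) can be carried out numerically: form the matrix $\sigma = (I \otimes \Phi)(|\alpha\rangle\langle\alpha|)$ and read off operator elements from its spectral decomposition. The sketch below is my own illustration of that recipe (the qubit dephasing channel and the parameter $p = 0.25$ are arbitrary choices):

```python
import numpy as np

def choi_matrix(channel, d_in, d_out):
    """sigma = sum_{i,j} |i><j| (x) Phi(|i><j|), i.e. (I (x) Phi) applied to |alpha><alpha|."""
    C = np.zeros((d_in * d_out, d_in * d_out), dtype=complex)
    for i in range(d_in):
        for j in range(d_in):
            Eij = np.zeros((d_in, d_in), dtype=complex)
            Eij[i, j] = 1.0
            C += np.kron(Eij, channel(Eij))
    return C

def kraus_from_choi(C, d_in, d_out, tol=1e-12):
    """Operator elements obtained from the eigenvectors of the (positive) matrix sigma."""
    evals, evecs = np.linalg.eigh(C)
    return [np.sqrt(lam) * v.reshape(d_in, d_out).T
            for lam, v in zip(evals, evecs.T) if lam > tol]

# Illustrative channel: qubit dephasing, Phi(rho) = (1 - p) rho + p Z rho Z.
p = 0.25
Z = np.diag([1.0, -1.0])
phi = lambda rho: (1 - p) * rho + p * Z @ rho @ Z

K = kraus_from_choi(choi_matrix(phi, 2, 2), 2, 2)
print(sum(E.conj().T @ E for E in K))         # the identity: sum_i E_i^* E_i = I
rho = np.array([[0.5, 0.5], [0.5, 0.5]])
print(sum(E @ rho @ E.conj().T for E in K))   # agrees with phi(rho)
```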

(c) $\Rightarrow$ (b). Let $\Phi\colon B(\mathcal{H}_A) \to B(\mathcal{H}_B)$ be of the form $\Phi(\tau) = \sum_i E_i \tau E_i^*$, with $I = \sum_i E_i^* E_i$. Put $a = \dim(\mathcal{H}_A)$ and $b = \dim(\mathcal{H}_B)$. Then, as we just showed, (a) also holds. By the construction done in the proof of (a) $\Rightarrow$ (c) it is seen that we can represent $\Phi$ using exactly $a$ operator elements (this observation is the reason why we did the seemingly superfluous part (c) $\Rightarrow$ (a)). Let $\mathcal{H}_E$ be a Hilbert space of the same dimension as $\mathcal{H}_B$ and let $\mathcal{H}_F$ be a Hilbert space of the same dimension as $\mathcal{H}_A$. Let $\{|i_F\rangle\}$ be a basis of $\mathcal{H}_F$, where $i = 1,\ldots,a$. The task is to construct a unitary operator $U\colon \mathcal{H}_A \otimes \mathcal{H}_E \to \mathcal{H}_B \otimes \mathcal{H}_F$ such that
$$\Phi(\tau) = \mathrm{tr}_F(U(\tau \otimes \varrho_E)U^*), \qquad (23)$$
where $\varrho_E$ is a density matrix in $B(\mathcal{H}_E)$. Let $\{|j_E\rangle\}$ be a basis of $\mathcal{H}_E$, and put $\varrho_E = |1_E\rangle\langle 1_E|$. We can rewrite (23) as
$$\Phi(\tau) = \sum_i \big(\langle i_F| U |1_E\rangle\big)\, \tau\, \big(\langle 1_E| U^* |i_F\rangle\big).$$
We are done if we can construct $U$ such that $E_i = \langle i_F| U |1_E\rangle$ for all $i$. To understand this requirement, consider the matrix representation of $U$ w.r.t. the basis $\{|j_A\rangle \otimes |k_E\rangle\}$ for the domain and the basis $\{|l_B\rangle \otimes |m_F\rangle\}$ for the range (Footnote 3). The requirement then translates into $U$ being of the form
$$U = \begin{pmatrix} [E_1] & * & \cdots & * \\ [E_2] & * & \cdots & * \\ \vdots & \vdots & & \vdots \\ [E_a] & * & \cdots & * \end{pmatrix},$$
where $[E_i]$ is the matrix representation of $E_i$ w.r.t. the basis $\{|j_A\rangle\}$ for the domain and $\{|l_B\rangle\}$ for the range, and the asterisks denote columns that are so far unspecified. Note that only the first block column is determined by the requirement. Furthermore, the assumption that $\sum_i E_i^* E_i = I$ says that the first $a$ columns of $U$ consist of mutually orthogonal unit vectors. From basic linear algebra, we know that it is possible to choose the remaining columns of $U$ so that $U$ becomes a unitary matrix. We have thus shown that we can write $\Phi$ in the form (23).

(b) $\Rightarrow$ (a). Suppose $\Phi$ is a Stinespring map. Clearly $\Phi$ is linear. We have $\mathrm{tr}\,\Phi(\tau) = \mathrm{tr}(U(\tau \otimes \varrho_E)U^*) = \mathrm{tr}(\tau \otimes \varrho_E) = \mathrm{tr}(\tau)\,\mathrm{tr}(\varrho_E) = \mathrm{tr}(\tau)$, so $\Phi$ is trace preserving. Note that
$$(I \otimes \Phi)(x) = \mathrm{tr}_F\big((I \otimes U)(x \otimes \varrho_E)(I \otimes U)^*\big), \qquad (24)$$
for any $x \in B(V \otimes \mathcal{H}_A)$. Here $I$ denotes the identity operator on the Hilbert space $V$. The formula (24) is easily seen to hold for $x$ of the form $x = y \otimes z$, with $y \in B(V)$ and $z \in B(\mathcal{H}_A)$. Both sides of (24) are linear in $x$, so the equation must hold for all $x \in B(V \otimes \mathcal{H}_A)$ as well. For $x \geq 0$ we have $x \otimes \varrho_E \geq 0$ and hence $(I \otimes U)(x \otimes \varrho_E)(I \otimes U)^* \geq 0$ by unitary invariance. It follows that $(I \otimes \Phi)(x) \geq 0$. Hence $\Phi$ is a quantum channel.

Footnote 3: The basis elements are ordered lexicographically, that is, according to the convention (1,1), (1,2), ..., (2,1), (2,2), etc.
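Conversely, the equivalence of (b) and (c) is easy to check numerically in a small example: pick a unitary $U$ on $\mathcal{H}_A \otimes \mathcal{H}_E$ and a pure environment state, define $\Phi$ by (21), and compare with the operator-sum form. The sketch below is my own illustration (a random $4 \times 4$ unitary and a two-dimensional environment are arbitrary choices; the first basis vector plays the role of $|1_E\rangle$):

```python
import numpy as np

d = 2                                             # dim of the system; the environment is given the same dimension here

def partial_trace_env(M, d_sys, d_env):
    """Trace out the second (environment) factor of an operator on H_sys (x) H_env."""
    M = M.reshape(d_sys, d_env, d_sys, d_env)
    return np.trace(M, axis1=1, axis2=3)

rng = np.random.default_rng(0)
A = rng.normal(size=(d * d, d * d)) + 1j * rng.normal(size=(d * d, d * d))
U, _ = np.linalg.qr(A)                            # a random unitary on H_A (x) H_E

ket1E = np.zeros((d, 1)); ket1E[0, 0] = 1.0
rho_E = ket1E @ ket1E.conj().T                    # pure environment state |1_E><1_E|

def stinespring_channel(rho):
    """Phi(rho) = tr_E( U (rho (x) rho_E) U^* ), as in (21)."""
    joint = U @ np.kron(rho, rho_E) @ U.conj().T
    return partial_trace_env(joint, d, d)

# The corresponding operator elements E_k = (I (x) <k|) U (I (x) |1_E>).
kraus = [U.reshape(d, d, d, d)[:, k, :, 0] for k in range(d)]

rho = np.array([[0.7, 0.2], [0.2, 0.3]], dtype=complex)
out_stinespring = stinespring_channel(rho)
out_kraus = sum(E @ rho @ E.conj().T for E in kraus)
print(np.allclose(out_stinespring, out_kraus))    # True: the two representations agree
print(np.isclose(np.trace(out_stinespring), 1.0)) # True: Phi is trace preserving
```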

3 Capacity

In Chapter 1 we defined the capacity of a classical channel $\Psi$ as
$$C(\Psi) = \max_{\pi \in P(X)} \Big\{ H(\Psi\pi) - \sum_x \pi(x) H(\Psi\delta_x) \Big\}, \qquad (25)$$
and the Holevo capacity of a quantum channel $\Phi$ as
$$\chi(\Phi) = \sup_E \Big[ S\big(\sum_i p_i \Phi(\varrho_i)\big) - \sum_j p_j S(\Phi(\varrho_j)) \Big]. \qquad (26)$$
So far these are merely numbers associated with channels, and it is far from clear why they deserve to be named capacities. In this chapter we start from scratch and set up a reasonable definition of capacity in terms of information transmission. We begin with the classical case. The main theorem for classical channels is Shannon's Noisy Channel Coding Theorem, which states that the capacity of a classical (noisy) channel is given by the formula (25). There are different proofs of this theorem in the literature. The proof we will present uses the idea of random coding by typical sequences and is taken from [2]. The random coding techniques will be reused later in the proof of the HSW Theorem, Theorem 4.7, at the end of this chapter. After considering capacities of classical channels we move on to quantum channels. The theory will be developed in parallel with the classical counterpart, but with emphasis on points where quantum theory is different. Based on the notion of entanglement we introduce four different capacities. We then move on to derive a formula for each of these capacities, similar to Shannon's formula. One of these four capacities is the Holevo capacity as given in (26), and this is the content of the aforementioned HSW Theorem.

3.1 Notation

Until now we have expressed quantities such as entropy and mutual information in terms of probability distributions. Equivalent formulations in terms of random variables sometimes simplify notation. We shall consider random variables in many places in the coming chapters, so let us take this opportunity to refresh basic definitions. A general random variable is a measurable function $X\colon \Omega \to S$, where $(\Omega, \mathcal{E}, p)$ is a probability space, with probability measure $p$ and $\sigma$-algebra $\mathcal{E}$. Furthermore $(S, \mathcal{F})$ is a measurable space, with $\sigma$-algebra $\mathcal{F}$. The push-forward measure $X_*(p)$ on $S$, induced by $X$, is called the distribution of $X$. It is defined by
$$X_*(p)(A) = p(X^{-1}(A)), \quad \text{for } A \in \mathcal{F}. \qquad (27)$$
For a measure $\alpha$ on $(S, \mathcal{F})$ and a non-negative function $f$ integrable w.r.t. $\alpha$, we define a measure $f \cdot \alpha$ on $(S, \mathcal{F})$ given by
$$f \cdot \alpha(A) = \int_A f\, d\alpha \quad \text{for } A \in \mathcal{F}. \qquad (28)$$

In some situations the sample space $(\Omega, \mathcal{E}, p)$ plays no role in the arguments and only the distribution matters. Then it is customary to "define" the random variable by specifying the image space $(S, \mathcal{F})$ together with the distribution. If $S = \mathbb{R}^n$ or $\mathbb{C}^n$, $\mathcal{F}$ is by default the Borel algebra. If $S$ is discrete, $\mathcal{F}$ is by default the family of all subsets. A probability density function (pdf) for the random variable $X$ on $S$ is a non-negative measurable function $f\colon S \to \mathbb{R}$ such that
$$P(X \in V) = p(\{s \in \Omega\colon X(s) \in V\}) = \int_V f\, d(X_*p), \qquad (29)$$
for any $V \in \mathcal{F}$. Here $P$ is the generic symbol for "probability of". The integration is with respect to the distribution.

Discrete Random Variables

Suppose $X$ is a discrete random variable with distribution $p$ on a set $X$. Then $p$ is uniquely given by the values $p_X(x) := p(\{x\})$, for $x \in X$, and with a slight misuse of words it is customary to refer to the function $p_X$ as the distribution of $X$. For two discrete random variables $X$ and $Y$ on the finite sets $X$ and $Y$, we denote by $(X,Y)$ a random variable on $X \times Y$ with marginal random variables $X$ and $Y$. This means that $p_X(x) = \sum_{y \in Y} p(x,y)$ and $p_Y(y) = \sum_{x \in X} p(x,y)$ are the distributions of $X$ and $Y$, respectively. The notation $(X,Y)$ does not uniquely specify the random variables, as two different random variables can have the same marginals. Let us introduce some entropy quantities we will need in the following. Two of these, the Shannon entropy and the mutual information, we already encountered in Chapter 1.

Definition 3.1 (Entropy) Let $(X,Y)$ be a random variable on a set $X \times Y$ with probability distribution $p$. Here $X$ and $Y$ are the marginal random variables on $X$ and $Y$, respectively. We define the following quantities.
Entropy: $H(X) = -\sum_x p_X(x) \log p_X(x)$ (30)
Joint Entropy: $H(X,Y) = -\sum_{x,y} p(x,y) \log p(x,y)$ (31)
Conditional Entropy: $H(X|Y) = H(X,Y) - H(Y) = \sum_y p_Y(y) H(X|Y=y)$ (32)
Mutual Information: $H(X : Y) = H(X) + H(Y) - H(X,Y)$ (33)

The entropy $H(X)$ of a random variable $X$ quantifies the amount of uncertainty about the value of $X$ before we learn its value. We will not give a review of general classical information theory here. The reader interested in learning more about entropy is encouraged to consult for example [2]. Concavity of the logarithm implies the following result.

Proposition 3.2 (Subadditivity of Shannon Entropy) Let $X$ and $Y$ be discrete random variables with joint random variable $(X,Y)$. The Shannon entropy is subadditive, meaning that
$$H(X,Y) \leq H(X) + H(Y),$$
and equality holds if and only if $X$ and $Y$ are independent.

Proof. We can w.l.o.g. assume that $p_X$ and $p_Y$ are nonzero (if not, we can obtain new random variables $X'$, $Y'$ and $(X',Y')$ with $H(X') = H(X)$, $H(Y') = H(Y)$ and $H(X',Y') = H(X,Y)$ by restricting $X$ and $Y$ to the sets on which $p_X$ and $p_Y$ are nonzero, respectively). We have $\log x \cdot \ln 2 = \ln x \leq x - 1$ for all positive $x$, and equality holds if and only if $x = 1$. Let $p$ be the probability distribution of $(X,Y)$. We have
$$\begin{aligned}
H(X) + H(Y) - H(X,Y) &= -\sum_x p_X(x)\log p_X(x) - \sum_y p_Y(y)\log p_Y(y) + \sum_{x,y} p(x,y)\log p(x,y) \\
&= \sum_{x,y} p(x,y)\log \frac{p(x,y)}{p_X(x)p_Y(y)} \\
&\geq \frac{1}{\ln 2} \sum_{x,y} p(x,y)\Big(1 - \frac{p_X(x)p_Y(y)}{p(x,y)}\Big) \\
&= \frac{1}{\ln 2} \sum_{x,y} \big(p(x,y) - p_X(x)p_Y(y)\big) = 0.
\end{aligned}$$
Equality holds in the third line if and only if $p_X(x)p_Y(y) = p(x,y)$ for all $x$ and $y$. That is, if and only if $X$ and $Y$ are independent.

Random Variables Induced by Channels

Let $\Phi\colon X \times Y \to [0,1]$ be a classical channel, and suppose we are given a random variable $X$ on $X$. The channel $\Phi$ induces a random variable on $X \times Y$, and consequently also a marginal random variable on $Y$, which we denote by $\Phi X$. In this particular case the joint random variable $(X, \Phi X)$ is defined to have the distribution $p(x,y) = \Phi(x,y)p_X(x)$, where $p_X$ is the distribution of $X$. Thus the distribution of $\Phi X$ is given by $p_{\Phi X}(y) = \sum_x \Phi(x,y)p_X(x)$.

A map $F\colon X \to Y$, where $X$ and $Y$ are finite sets, can be viewed as a noiseless classical channel with input $X$ and output $Y$. That is, it corresponds to the channel $(x,y) \mapsto \delta_{F(x)}(y)$, where $\delta$ is the Kronecker delta. When there is no risk of misunderstanding we will denote the corresponding noiseless classical channel by the same letter.

Composition of channels

Two channels $\Phi_1\colon X \times Y \to [0,1]$ and $\Phi_2\colon Y \times Z \to [0,1]$ can be composed (Footnote 4) into a channel $\Phi_2 \diamond \Phi_1\colon X \times Z \to [0,1]$, given by
$$\Phi_2 \diamond \Phi_1(x,z) = \sum_y \Phi_1(x,y)\Phi_2(y,z). \qquad (34)$$
It is easy to verify that $\diamond$ is associative. As this is not ordinary function composition, we have used the symbol $\diamond$ rather than $\circ$ to represent it. The definition is based on the intuitive idea of composing channels in which the output of the first channel is the input of the second, as displayed in Figure 4.

Footnote 4: The composition notation used here is not standard in the literature.
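Using the same matrix convention as before (rows indexed by inputs, columns by outputs), the composition (34) is just a matrix product, with the order reversed relative to the $\diamond$ notation. A minimal sketch with arbitrarily chosen small channels (my own illustration, not an example from the text):

```python
import numpy as np

def compose(Phi2, Phi1):
    """(Phi2 <> Phi1)(x, z) = sum_y Phi1(x, y) Phi2(y, z), cf. (34); here <> stands for the composition."""
    return Phi1 @ Phi2

# Illustrative channels: X -> Y and Y -> Z (arbitrary stochastic matrices).
Phi1 = np.array([[0.9, 0.1],
                 [0.2, 0.8]])
Phi2 = np.array([[0.7, 0.2, 0.1],
                 [0.0, 0.5, 0.5]])

K = compose(Phi2, Phi1)
print(np.allclose(K.sum(axis=1), 1.0))      # the composition is again a stochastic map

# Associativity: composing with a third channel Z -> Z gives the same result either way.
Phi3 = np.array([[0.6, 0.3, 0.1],
                 [0.1, 0.8, 0.1],
                 [0.2, 0.2, 0.6]])
print(np.allclose(compose(Phi3, compose(Phi2, Phi1)),
                  compose(compose(Phi3, Phi2), Phi1)))   # True
```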

Note that $\Phi_2 \diamond \Phi_1(x,z)$ is the probability of getting $z$ out, given that $x$ was sent. To follow the convention from ordinary function composition, the rightmost channel in the composition $\Phi_2 \diamond \Phi_1$ is the channel applied first.

[Figure 4: The composition of the channels $\Phi_1$ and $\Phi_2$.]

3.2 Capacity of a classical channel

Suppose Alice and Bob can communicate via a classical channel, and that Alice wants to send information to Bob. What is the maximal number of bits per use of the channel that Alice can send? Let us make the setup precise. Suppose we are given a channel $\Phi\colon X \times Y \to [0,1]$ with probability matrix $\Phi(x,y) = P(y|x)$, for $x \in X$ and $y \in Y$. Alice has a list of possible messages $\{1,\ldots,2^m\}$, where $m \in \mathbb{N}$. For the sake of simplicity, we assume the number of messages to be a power of 2. Alice performs an encoding described by a map $C_m\colon \{1,\ldots,2^m\} \to X$. The message is then sent through the channel $\Phi$ to Bob, who performs a decoding, described by a map $D_m\colon Y \to \{1,\ldots,2^m\}$. We call $m$ the (bit) length of the coding and decoding, respectively. There is no guarantee that the message sent by Alice is identical to the one Bob reads after decoding. This will in general depend on the encoding, the channel and the decoding.

Consider first a channel $K\colon Z \times Z \to [0,1]$, with the same input and output alphabet. For a channel of this form, define the average error $\delta$ by
$$\delta(K) = \frac{1}{n} \sum_{i \in Z} [1 - K(i,i)], \qquad (35)$$
where $n = |Z|$ is the number of elements in the alphabet. Note that $\delta(K)$ is the probability that Bob does not receive the same letter that Alice sent, given that each letter is sent equally often through the channel. For an arbitrary channel $\Phi$ as above and a coding $C$ and decoding $D$ of bit length $m$, consider the channel $K = D \diamond \Phi \diamond C$. This is a channel with identical input and output alphabets. Note that $\delta(K)$ is the probability that Bob does not read out the same letter that Alice sent, given that all messages are sent equally often by Alice. Define the error $\delta_m$ of $\Phi$ by
$$\delta_m(\Phi) = \min_{C,D} \delta(D \diamond \Phi \diamond C), \qquad (36)$$
where the minimum is over all encodings $C$ and decodings $D$ of bit length $m$. The minimum makes sense since there is a finite number of possible encodings and decodings. An encoding/decoding scheme attaining the minimum is called optimal. Note that $\delta_m(\Phi)$ is the smallest possible error you can get when you are required to send $2^m$ messages.
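As a concrete illustration of the average error (35), one can build the channel $K = D \diamond \Phi^{\otimes 3} \diamond C$ for a specific (not necessarily optimal) coding scheme and evaluate $\delta(K)$ directly. The sketch below is my own example, using a 3-fold repetition code with majority-vote decoding over a binary symmetric channel; the channel and code are arbitrary illustrative choices:

```python
import numpy as np

p = 0.1
Phi = np.array([[1 - p, p],
                [p, 1 - p]])
Phi3 = np.kron(np.kron(Phi, Phi), Phi)   # three independent uses of the channel

# Encoding C: message bit -> codeword (repetition), as a deterministic (noiseless) channel.
C = np.zeros((2, 8))
C[0, 0b000] = 1.0
C[1, 0b111] = 1.0

# Decoding D: received word -> majority vote.
D = np.zeros((8, 2))
for word in range(8):
    bits = [(word >> k) & 1 for k in range(3)]
    D[word, int(sum(bits) >= 2)] = 1.0

K = C @ Phi3 @ D                         # the channel D <> Phi^3 <> C as a stochastic matrix
delta = np.mean(1.0 - np.diag(K))        # the average error delta(K) of (35)
print(delta)                             # approx 3 p^2 - 2 p^3 = 0.028
```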

It is not hard to show that $\delta_m(\Phi) \to 1$ as $m \to \infty$, and clearly $\delta_0(\Phi) = 0$ (only one letter in the alphabet). An encoding/decoding scheme of length $m$ for $\Phi$ is called perfect if $\delta_m(\Phi) = 0$. If we apply the channel multiple times we can of course send more messages with small average error. The relationship between the number of times $n$ we apply the channel and the alphabet bit size $m$ is crucial. The fraction $m/n$ is the bit rate of the channel when applied $n$ times. In Figure 5 a coding protocol for $n$ uses of the channel $\Phi$ is illustrated.

[Figure 5 (Classical Coding/Decoding Protocol): A message is first encoded using the map $C_n$, then sent through the channel $\Phi^{\otimes n}$, and finally decoded by the map $D_n$. The composition of the encoding, the channel $\Phi^{\otimes n}$ and the decoding is equivalent to a classical channel $K_n = D_n \diamond \Phi^{\otimes n} \diamond C_n$.]

We are interested in protocols with arbitrarily low average error. This leads to the following definition.

Definition 3.3 (Reliable rate) Let $R \geq 0$. Then $R$ is called a reliable rate if there exists a strictly increasing sequence $(a_k)_k$ in $\mathbb{N}$ such that
$$\delta_{\lceil a_k R \rceil}(\Phi^{\otimes a_k}) \to 0. \qquad (37)$$
Here $\lceil \cdot \rceil$ is the "ceiling" function, which for a given $x \in \mathbb{R}$ returns the unique integer $k$ such that $k - 1 < x \leq k$.

In other words, a rate $R \geq 0$ is reliable if we can get an arbitrarily small error by applying the channel enough times, while still sending $\lceil nR \rceil / n \approx R$ bits per use of the channel when $n$ is large. The capacity is the ultimate bit rate with which we can reliably send messages through the channel. This is the content of the following definition.

Definition 3.4 (Capacity of a classical channel) For a channel $\Phi$, the capacity $C(\Phi)$ is defined as $C(\Phi) = \sup\{R\colon R \text{ reliable}\}$.

From this definition it is not at all clear how to calculate the capacity of a specific channel. The following theorem, by Claude Shannon, is one of the main theorems of classical information theory. The theorem provides a formula for the capacity from which it can be calculated - at least numerically.

Theorem 3.5 (Shannon's Noisy Channel Coding Theorem) For a classical channel $\Phi\colon X \times Y \to [0,1]$, the capacity $C(\Phi)$ is given by
$$C(\Phi) = \sup_X H(X : \Phi X), \qquad (38)$$
where the supremum is over random variables $X$ on $X$ and $\Phi X$ is the induced random variable on $Y$.
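Shannon's formula (38) can be evaluated numerically. A standard tool for this (from the information theory literature, not discussed in the thesis) is the Blahut-Arimoto algorithm, which iteratively maximizes the mutual information over input distributions. A minimal sketch, with the binary symmetric channel as an arbitrary test case:

```python
import numpy as np

def blahut_arimoto(W, iters=200):
    """Numerically maximize H(X : Phi X) over input distributions, for W[x, y] = P(y|x)."""
    n_in = W.shape[0]
    p = np.full(n_in, 1.0 / n_in)
    for _ in range(iters):
        q = p @ W                                    # output distribution
        with np.errstate(divide="ignore", invalid="ignore"):
            log_ratio = np.where(W > 0, np.log2(W / q), 0.0)
        D = np.sum(W * log_ratio, axis=1)            # relative entropy D(W(.|x) || q)
        c = 2.0 ** D
        p = p * c / np.sum(p * c)
    return float(np.log2(np.sum(p * c))), p          # capacity estimate (bits), optimizing input

W = np.array([[0.9, 0.1],
              [0.1, 0.9]])
cap, p_opt = blahut_arimoto(W)
print(cap)     # approx 1 - H({0.9, 0.1}) = 0.531 bits
print(p_opt)   # the uniform input is optimal for this symmetric channel
```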

The proof is rather long, so we will separate it into two parts according to the inequalities "$\leq$" and "$\geq$". For "$\leq$" we need the following two lemmas.

Lemma 3.6 (The Fano Inequality) Let $U$ be a random variable defined on $\{u_1,\ldots,u_m\}$ and let $V$ be a random variable defined on $\{v_1,\ldots,v_m\}$ such that their joint probability distribution $p(u,v)$ satisfies
$$\sum_{i=1}^m p(u_i, v_i) = 1 - \delta, \qquad \delta > 0.$$
Then $H(U|V) \leq 1 + \delta \log m$.

Proof. We have
$$H(U|V) = \sum_{v_j} p_V(v_j) H(U|V = v_j). \qquad (39)$$
Now fix $j$ and put $s_i = p(u_i | v_j)$. Then
$$H(U|V = v_j) = -\sum_i s_i \log s_i = H(\{s_j, 1 - s_j\}) + (1 - s_j)\, H\Big(\Big\{\frac{s_1}{1 - s_j},\ldots,\frac{\hat{s}_j}{1 - s_j},\ldots,\frac{s_m}{1 - s_j}\Big\}\Big) \leq 1 + (1 - s_j)\log m.$$
Here the hat denotes that the element is omitted from the list. Thus (39) gives
$$H(U|V) \leq \sum_{v_j} p_V(v_j)\big(1 + (1 - s_j)\log m\big) = 1 + \Big(1 - \sum_{v_j} p_V(v_j)s_j\Big)\log m = 1 + \Big(1 - \sum_j p(u_j, v_j)\Big)\log m = 1 + \delta\log m.$$

Corollary 3.7 Let $K\colon Z \times Z \to [0,1]$ be a classical channel with the same input and output alphabet and with $|Z| = 2^m$ for $m \in \mathbb{N}$. Then
$$m \leq 1 + \delta(K)m + \sup_X H(X : KX), \qquad (40)$$
where the supremum is over all random variables $X$ on $Z$.

Proof. Let $U$ be the equidistributed random variable on $Z$ and put $V = KU$. Then
$$m = H(U) = H(U|V) + H(U : V) \quad \text{(see Def. 3.1)}. \qquad (41)$$
The distribution of $(U,V)$ is given by $p(u,v) = K(u,v)\frac{1}{2^m}$. Thus
$$\delta(K) = \frac{1}{2^m}\sum_{i \in Z} [1 - K(i,i)] = 1 - \sum_{i \in Z} p(i,i).$$
By Lemma 3.6, $H(U|V) \leq 1 + \delta(K)m$, and the corollary follows.

A sequence of random variables $X_1 \to X_2 \to X_3 \to \cdots$ is said to form a Markov chain if $X_{n+1}$ is independent of $X_1,\ldots,X_{n-1}$, given $X_n$. Formally this means that
$$p(X_{n+1} = x_{n+1} \mid X_n = x_n,\ldots,X_1 = x_1) = p(X_{n+1} = x_{n+1} \mid X_n = x_n),$$
where $x_i \in X_i$ and $X_i$ is a random variable on $X_i$. Given a composition of classical channels and a random variable on the input space of the first channel, the random variables induced by the channels will form a Markov chain. This is our primary motivation for studying Markov chains. Mutual information can only decrease along a Markov chain, as the following lemma states.

Lemma 3.8 (Data Processing Inequality) Let $X, Y, Z$ be random variables such that $X \to Y \to Z$ forms a Markov chain; that is,
$$p(z|y) = p(z|x,y), \qquad (42)$$
where $x$, $y$ and $z$ are shorthand for $X = x$, $Y = y$ and $Z = z$. Then
$$H(X) \geq H(X : Y) \geq H(X : Z).$$

Proof. From the definition of Shannon entropy it is seen that $H(X : Y) \geq H(X : Z)$ is equivalent to $H(X|Y) \leq H(X|Z)$. Unraveling definitions it is easily seen that (42) is equivalent to $p(x|y) = p(x|y,z)$, saying that $Z \to Y \to X$ is a Markov chain. From this we immediately get $H(X|Y) = H(X|Y,Z)$. Now using the definition of conditional entropy we have
$$H(X|Y,Z) = H(X,Y|Z) - H(Y|Z) \leq H(X|Z).$$
By the last equation in the definition of conditional entropy, Definition 3.1, it suffices to show $H(X,Y|Z = z) - H(Y|Z = z) \leq H(X|Z = z)$ for any $z$. However, this follows from subadditivity of the Shannon entropy. This proves the second inequality. The first inequality follows by applying what we have just shown to the Markov chain $X \to X \to Y$, using that $H(X) = H(X : X)$.

We are now ready to prove the "$\leq$" part of Shannon's Noisy Channel Coding Theorem.

Theorem 3.9 For a classical channel $\Phi\colon X \times Y \to [0,1]$, the capacity $C(\Phi)$ satisfies
$$C(\Phi) \leq \sup_X H(X : \Phi X), \qquad (43)$$
where the supremum is over all random variables $X$ on $X$.

Proof. Suppose $R$ is a reliable rate and fix $n \in \mathbb{N}$. Let $C$ and $D$ be optimal encodings and decodings for $\Phi^{\otimes n}$, respectively, of bit length $m$. Consider the classical channel $K = D \diamond \Phi^{\otimes n} \diamond C\colon Z \times Z \to [0,1]$ induced by this encoding/decoding scheme. Here $Z = \{1,\ldots,2^m\}$. Optimality implies that $\delta(K) = \delta_m(\Phi^{\otimes n})$. From Corollary 3.7 we get that
$$m \leq 1 + \delta_m(\Phi^{\otimes n})m + \sup_U H(U : KU), \qquad (44)$$
where the supremum is over all random variables $U$ on $Z$. Let now $U$ be a fixed random variable on $Z$. Consider the Markov chain of random variables
$$U \xrightarrow{\ C\ } S \xrightarrow{\ \Phi^{\otimes n}\ } T \xrightarrow{\ D\ } V, \qquad (45)$$
where by definition $S = CU$, $T = \Phi^{\otimes n} S$ and $V = DT = KU$. By the Data Processing Inequality, we get $H(U : V) \leq H(U : T) \leq H(S : T)$. For the last term in (44) we thus get
$$\sup_U H(U : KU) \leq \sup_S H(S : \Phi^{\otimes n} S), \qquad (46)$$
where the supremum on the left-hand side is over all random variables $U$ on $Z$ and the supremum on the right-hand side is over all random variables $S$ on $X^n$. According to Proposition 1.3 of Section 1.1, the right-hand side is additive in $\Phi$, meaning that
$$\sup_S H(S : \Phi^{\otimes n} S) = n \sup_X H(X : \Phi X), \qquad (47)$$
where the supremum on the right-hand side is over all random variables $X$ on $X$. Since $R$ is a reliable rate, there exists a strictly increasing sequence $(a_k)_k$ such that
$$\delta_{\lceil a_k R \rceil}(\Phi^{\otimes a_k}) \to 0, \qquad (48)$$
as $k \to \infty$. Setting $m = \lceil a_k R \rceil$ and $n = a_k$ in (44) and dividing by $a_k$ we get
$$R \leq \frac{\lceil a_k R \rceil}{a_k} \leq \frac{1}{a_k} + \delta_{\lceil a_k R \rceil}(\Phi^{\otimes a_k})\,\frac{\lceil a_k R \rceil}{a_k} + \sup_X H(X : \Phi X) \longrightarrow \sup_X H(X : \Phi X), \qquad (49)$$
as $k \to \infty$. This proves the claim.

Typicality

In order to prove the "$\geq$" part of Shannon's Noisy Channel Coding Theorem we first need to establish some properties of randomly generated sequences. These sequences will later be used as codewords in the proof of Shannon's theorem.

Consider an alphabet $X = \{1,\ldots,m\}$ and a random variable $X$ on $X$ with probability distribution $p_1,\ldots,p_m$. Consider also the random variable $X^{(n)} = (X_1,\ldots,X_n)$ on $X^n$, where the $X_i = X$ are independent. Now suppose we choose a codeword $x^{(n)} = (x_1,\ldots,x_n) \in X^n$ according to the distribution of $X^{(n)}$. That is the same as choosing each letter $x_i$ randomly and independently according to the probability distribution $p_1,\ldots,p_m$. If $n$ is large, we would expect a sequence $x^{(n)}$ to contain approximately $p_1 n$ 1's, $p_2 n$ 2's, ... and $p_m n$ $m$'s. Suppose for simplicity that $p_1 n, p_2 n,\ldots,p_m n \in \mathbb{N}$ and that $x^{(n)}$ consists of exactly this number of 1's, 2's, etc. Let us give a rough estimate of the probability $p(x^{(n)}) = P(X^{(n)} = x^{(n)})$, that is, the probability of choosing exactly the sequence $x^{(n)}$. From elementary combinatorics, the number of sequences with these letter counts is
$$\frac{n!}{(p_1 n)!(p_2 n)!\cdots(p_m n)!}. \qquad (50)$$
Stirling's approximation gives $\log(n!) \approx n\log n$ when $n$ is large, so the logarithm of this number is approximately
$$n\log n - p_1 n\log(p_1 n) - \cdots - p_m n\log(p_m n) = nH(X).$$
All sequences with these letter counts have the same probability, and together they carry essentially all of the probability mass, so each of them has probability $p(x^{(n)}) \approx 2^{-nH(X)}$, or equivalently
$$-\frac{1}{n}\log p(x^{(n)}) - H(X) \approx 0. \qquad (51)$$
Any sequence satisfying (51) is said to be typical (to be defined rigorously below).

Now let $Y$ be another alphabet and consider a random variable $(X,Y)$ on $X \times Y$, the marginal random variable $X$ on $X$, and the marginal random variable $Y$ on $Y$. As above, each of these random variables gives rise to a notion of typical sequences in the spaces $X^n$, $Y^n$ and $X^n \times Y^n$, respectively. If $x^{(n)} \in X^n$ is a typical sequence, $y^{(n)} \in Y^n$ is a typical sequence, and the combined sequence $(x^{(n)},y^{(n)}) \in X^n \times Y^n$ is typical, then $x^{(n)}$ and $y^{(n)}$ are said to be jointly typical. The precise definition is the following:

Definition 3.10 (Jointly typical sequences) For $\epsilon > 0$ and a random variable $(X,Y)$ on $X \times Y$ with probability distribution $p(x,y)$, we say that the sequences $x^{(n)} \in X^n$ and $y^{(n)} \in Y^n$ are jointly $\epsilon$-typical if the following three properties hold:
$$\Big| -\tfrac{1}{n}\log p(x^{(n)}) - H(X) \Big| < \epsilon, \qquad \Big| -\tfrac{1}{n}\log p(y^{(n)}) - H(Y) \Big| < \epsilon, \qquad \Big| -\tfrac{1}{n}\log p(x^{(n)},y^{(n)}) - H(X,Y) \Big| < \epsilon.$$
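The estimate (51) is easy to check by simulation: draw i.i.d. sequences and compare $-\frac{1}{n}\log p(x^{(n)})$ with $H(X)$. The sketch below is my own illustration (the three-letter source and the parameters $n$, $\epsilon$ are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)

# A source on X = {0, 1, 2} with an arbitrary distribution (illustrative choice).
p = np.array([0.5, 0.3, 0.2])
H = float(-np.sum(p * np.log2(p)))

n, trials, eps = 1000, 2000, 0.05
samples = rng.choice(len(p), size=(trials, n), p=p)          # i.i.d. sequences x^(n)
log_prob = np.sum(np.log2(p[samples]), axis=1)               # log p(x^(n)) = sum_i log p(x_i)
typical = np.abs(-log_prob / n - H) < eps                    # the condition of (51) / Definition 3.10
print(H, typical.mean())   # the fraction of eps-typical sequences is close to 1 for large n
```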

When $n$ is large and we randomly choose a pair of sequences $x^{(n)}$ and $y^{(n)}$ according to the distribution of $(X,Y)^{(n)}$, it is very likely that the sequences are jointly typical. If $X$ and $Y$ are correlated (i.e. not independent), then the mutual information is non-zero, i.e. $H(X : Y) > 0$. If we choose sequences $x^{(n)}$ and $y^{(n)}$ independently according to the distributions of $X^{(n)}$ and $Y^{(n)}$, respectively, then it is very unlikely that the sequences are jointly typical. This is the content of the following theorem.

Theorem 3.11 (Asymptotic equipartition property) Let $X$ and $Y$ be finite sets, and let $(X,Y)$ be a random variable on the product set $X \times Y$, where $X$ and $Y$ are the marginal random variables on $X$ and $Y$, respectively. Let $(X,Y)^{(n)}$ denote the random variable on $X^n \times Y^n$ formed out of $n$ independent copies of $(X,Y)$, with marginals $X^{(n)}$ and $Y^{(n)}$. Let $A_\epsilon^{(n)}$ denote the set of jointly typical sequences in $X^n \times Y^n$. Then
1. $P(A_\epsilon^{(n)}) \to 1$ as $n \to \infty$.
2. $|A_\epsilon^{(n)}| \leq 2^{n(H(X,Y)+\epsilon)}$, where $|\cdot|$ denotes the number of elements in the set.
3. Given independent random variables $\tilde{X}^{(n)}, \tilde{Y}^{(n)}$ with the same marginals as $X^{(n)}, Y^{(n)}$, then
$$P\big((\tilde{X}^{(n)}, \tilde{Y}^{(n)}) \in A_\epsilon^{(n)}\big) \leq 2^{-n(H(X:Y)-3\epsilon)}. \qquad (52)$$

The set $A_\epsilon^{(n)}$ is illustrated in Figure 6.

[Figure 6 (Sets of typical sequences): If $(x^{(n)},y^{(n)}) \in A_\epsilon^{(n)} \subset X^n \times Y^n$, that is, $(x^{(n)},y^{(n)})$ is jointly typical, then in particular $x^{(n)}$ is typical and $y^{(n)}$ is typical. There are about $2^{nH(X)}$ typical sequences $x^{(n)}$, $2^{nH(Y)}$ typical sequences $y^{(n)}$ and $2^{nH(X,Y)}$ jointly typical sequences $(x^{(n)},y^{(n)})$.]

The proof of Theorem 3.11 is an easy application of the Weak Law of Large Numbers.

Theorem 3.12 (Weak Law of Large Numbers) Consider a sequence of independent, identically distributed (i.i.d.) random variables $X_i$ with finite mean $E(X_i) = \mu$ and finite variance $E(|X_i - \mu|^2) = \sigma^2$. Then for any $\epsilon > 0$,
$$\lim_{n \to \infty} P\Big( \Big| \frac{1}{n}\sum_{i=1}^n X_i - \mu \Big| \geq \epsilon \Big) = 0. \qquad (53)$$
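Parts 1 and 3 of Theorem 3.11 can likewise be illustrated by simulation: pairs drawn from the joint distribution are almost always jointly typical, while independently drawn pairs essentially never are. The following sketch is my own illustration (a uniform input sent through a binary symmetric channel is an arbitrary choice of correlated pair; $n$, $\epsilon$ and the number of trials are arbitrary as well):

```python
import numpy as np

rng = np.random.default_rng(2)

# Joint distribution of (X, Y): a uniform bit X sent through a binary symmetric channel.
p_joint = np.array([[0.45, 0.05],
                    [0.05, 0.45]])
p_X, p_Y = p_joint.sum(axis=1), p_joint.sum(axis=0)
H_X = float(-np.sum(p_X * np.log2(p_X)))
H_Y = float(-np.sum(p_Y * np.log2(p_Y)))
H_XY = float(-np.sum(p_joint * np.log2(p_joint)))

def jointly_typical(xs, ys, n, eps=0.1):
    """Check the three conditions of Definition 3.10 for arrays of sequences."""
    lx = np.sum(np.log2(p_X[xs]), axis=1)
    ly = np.sum(np.log2(p_Y[ys]), axis=1)
    lxy = np.sum(np.log2(p_joint[xs, ys]), axis=1)
    return (np.abs(-lx / n - H_X) < eps) & (np.abs(-ly / n - H_Y) < eps) \
         & (np.abs(-lxy / n - H_XY) < eps)

n, trials = 1000, 2000
# Correlated pairs drawn from the joint distribution ...
flat = rng.choice(4, size=(trials, n), p=p_joint.ravel())
xs, ys = flat // 2, flat % 2
print(jointly_typical(xs, ys, n).mean())          # close to 1  (part 1 of Theorem 3.11)
# ... versus independently drawn marginals:
xs_ind = rng.choice(2, size=(trials, n), p=p_X)
ys_ind = rng.choice(2, size=(trials, n), p=p_Y)
print(jointly_typical(xs_ind, ys_ind, n).mean())  # essentially 0, cf. the bound (52)
```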


More information

On some special cases of the Entropy Photon-Number Inequality

On some special cases of the Entropy Photon-Number Inequality On some special cases of the Entropy Photon-Number Inequality Smarajit Das, Naresh Sharma and Siddharth Muthukrishnan Tata Institute of Fundamental Research December 5, 2011 One of the aims of information

More information

Network coding for multicast relation to compression and generalization of Slepian-Wolf

Network coding for multicast relation to compression and generalization of Slepian-Wolf Network coding for multicast relation to compression and generalization of Slepian-Wolf 1 Overview Review of Slepian-Wolf Distributed network compression Error exponents Source-channel separation issues

More information

Lecture 6 I. CHANNEL CODING. X n (m) P Y X

Lecture 6 I. CHANNEL CODING. X n (m) P Y X 6- Introduction to Information Theory Lecture 6 Lecturer: Haim Permuter Scribe: Yoav Eisenberg and Yakov Miron I. CHANNEL CODING We consider the following channel coding problem: m = {,2,..,2 nr} Encoder

More information

Compression and entanglement, entanglement transformations

Compression and entanglement, entanglement transformations PHYSICS 491: Symmetry and Quantum Information April 27, 2017 Compression and entanglement, entanglement transformations Lecture 8 Michael Walter, Stanford University These lecture notes are not proof-read

More information

x log x, which is strictly convex, and use Jensen s Inequality:

x log x, which is strictly convex, and use Jensen s Inequality: 2. Information measures: mutual information 2.1 Divergence: main inequality Theorem 2.1 (Information Inequality). D(P Q) 0 ; D(P Q) = 0 iff P = Q Proof. Let ϕ(x) x log x, which is strictly convex, and

More information

Problem Set: TT Quantum Information

Problem Set: TT Quantum Information Problem Set: TT Quantum Information Basics of Information Theory 1. Alice can send four messages A, B, C, and D over a classical channel. She chooses A with probability 1/, B with probability 1/4 and C

More information

Noisy channel communication

Noisy channel communication Information Theory http://www.inf.ed.ac.uk/teaching/courses/it/ Week 6 Communication channels and Information Some notes on the noisy channel setup: Iain Murray, 2012 School of Informatics, University

More information

1. Basic rules of quantum mechanics

1. Basic rules of quantum mechanics 1. Basic rules of quantum mechanics How to describe the states of an ideally controlled system? How to describe changes in an ideally controlled system? How to describe measurements on an ideally controlled

More information

A Holevo-type bound for a Hilbert Schmidt distance measure

A Holevo-type bound for a Hilbert Schmidt distance measure Journal of Quantum Information Science, 205, *,** Published Online **** 204 in SciRes. http://www.scirp.org/journal/**** http://dx.doi.org/0.4236/****.204.***** A Holevo-type bound for a Hilbert Schmidt

More information

Chapter 14: Quantum Information Theory and Photonic Communications

Chapter 14: Quantum Information Theory and Photonic Communications Chapter 14: Quantum Information Theory and Photonic Communications Farhan Rana (MIT) March, 2002 Abstract The fundamental physical limits of optical communications are discussed in the light of the recent

More information

EE 4TM4: Digital Communications II. Channel Capacity

EE 4TM4: Digital Communications II. Channel Capacity EE 4TM4: Digital Communications II 1 Channel Capacity I. CHANNEL CODING THEOREM Definition 1: A rater is said to be achievable if there exists a sequence of(2 nr,n) codes such thatlim n P (n) e (C) = 0.

More information

Useful Concepts from Information Theory

Useful Concepts from Information Theory Chapter 2 Useful Concepts from Information Theory 2.1 Quantifying Information 2.1.1 The entropy It turns out that there is a way to quantify the intuitive notion that some messages contain more information

More information

Chapter 2: Entropy and Mutual Information. University of Illinois at Chicago ECE 534, Natasha Devroye

Chapter 2: Entropy and Mutual Information. University of Illinois at Chicago ECE 534, Natasha Devroye Chapter 2: Entropy and Mutual Information Chapter 2 outline Definitions Entropy Joint entropy, conditional entropy Relative entropy, mutual information Chain rules Jensen s inequality Log-sum inequality

More information

EE5139R: Problem Set 7 Assigned: 30/09/15, Due: 07/10/15

EE5139R: Problem Set 7 Assigned: 30/09/15, Due: 07/10/15 EE5139R: Problem Set 7 Assigned: 30/09/15, Due: 07/10/15 1. Cascade of Binary Symmetric Channels The conditional probability distribution py x for each of the BSCs may be expressed by the transition probability

More information

APPENDIX A. Background Mathematics. A.1 Linear Algebra. Vector algebra. Let x denote the n-dimensional column vector with components x 1 x 2.

APPENDIX A. Background Mathematics. A.1 Linear Algebra. Vector algebra. Let x denote the n-dimensional column vector with components x 1 x 2. APPENDIX A Background Mathematics A. Linear Algebra A.. Vector algebra Let x denote the n-dimensional column vector with components 0 x x 2 B C @. A x n Definition 6 (scalar product). The scalar product

More information

Free probability and quantum information

Free probability and quantum information Free probability and quantum information Benoît Collins WPI-AIMR, Tohoku University & University of Ottawa Tokyo, Nov 8, 2013 Overview Overview Plan: 1. Quantum Information theory: the additivity problem

More information

Channel Coding: Zero-error case

Channel Coding: Zero-error case Channel Coding: Zero-error case Information & Communication Sander Bet & Ismani Nieuweboer February 05 Preface We would like to thank Christian Schaffner for guiding us in the right direction with our

More information

Entanglement: concept, measures and open problems

Entanglement: concept, measures and open problems Entanglement: concept, measures and open problems Division of Mathematical Physics Lund University June 2013 Project in Quantum information. Supervisor: Peter Samuelsson Outline 1 Motivation for study

More information

AN INTRODUCTION TO SECRECY CAPACITY. 1. Overview

AN INTRODUCTION TO SECRECY CAPACITY. 1. Overview AN INTRODUCTION TO SECRECY CAPACITY BRIAN DUNN. Overview This paper introduces the reader to several information theoretic aspects of covert communications. In particular, it discusses fundamental limits

More information

Lecture 14 February 28

Lecture 14 February 28 EE/Stats 376A: Information Theory Winter 07 Lecture 4 February 8 Lecturer: David Tse Scribe: Sagnik M, Vivek B 4 Outline Gaussian channel and capacity Information measures for continuous random variables

More information

EE376A: Homeworks #4 Solutions Due on Thursday, February 22, 2018 Please submit on Gradescope. Start every question on a new page.

EE376A: Homeworks #4 Solutions Due on Thursday, February 22, 2018 Please submit on Gradescope. Start every question on a new page. EE376A: Homeworks #4 Solutions Due on Thursday, February 22, 28 Please submit on Gradescope. Start every question on a new page.. Maximum Differential Entropy (a) Show that among all distributions supported

More information

EE376A: Homework #3 Due by 11:59pm Saturday, February 10th, 2018

EE376A: Homework #3 Due by 11:59pm Saturday, February 10th, 2018 Please submit the solutions on Gradescope. EE376A: Homework #3 Due by 11:59pm Saturday, February 10th, 2018 1. Optimal codeword lengths. Although the codeword lengths of an optimal variable length code

More information

Chapter I: Fundamental Information Theory

Chapter I: Fundamental Information Theory ECE-S622/T62 Notes Chapter I: Fundamental Information Theory Ruifeng Zhang Dept. of Electrical & Computer Eng. Drexel University. Information Source Information is the outcome of some physical processes.

More information

Capacity of the Discrete Memoryless Energy Harvesting Channel with Side Information

Capacity of the Discrete Memoryless Energy Harvesting Channel with Side Information 204 IEEE International Symposium on Information Theory Capacity of the Discrete Memoryless Energy Harvesting Channel with Side Information Omur Ozel, Kaya Tutuncuoglu 2, Sennur Ulukus, and Aylin Yener

More information

Density Matrices. Chapter Introduction

Density Matrices. Chapter Introduction Chapter 15 Density Matrices 15.1 Introduction Density matrices are employed in quantum mechanics to give a partial description of a quantum system, one from which certain details have been omitted. For

More information

An Introduction to Quantum Information. By Aditya Jain. Under the Guidance of Dr. Guruprasad Kar PAMU, ISI Kolkata

An Introduction to Quantum Information. By Aditya Jain. Under the Guidance of Dr. Guruprasad Kar PAMU, ISI Kolkata An Introduction to Quantum Information By Aditya Jain Under the Guidance of Dr. Guruprasad Kar PAMU, ISI Kolkata 1. Introduction Quantum information is physical information that is held in the state of

More information

LECTURE 10. Last time: Lecture outline

LECTURE 10. Last time: Lecture outline LECTURE 10 Joint AEP Coding Theorem Last time: Error Exponents Lecture outline Strong Coding Theorem Reading: Gallager, Chapter 5. Review Joint AEP A ( ɛ n) (X) A ( ɛ n) (Y ) vs. A ( ɛ n) (X, Y ) 2 nh(x)

More information

Remarks on the Additivity Conjectures for Quantum Channels

Remarks on the Additivity Conjectures for Quantum Channels Contemporary Mathematics Remarks on the Additivity Conjectures for Quantum Channels Christopher King Abstract. In this article we present the statements of the additivity conjectures for quantum channels,

More information

Quantum Error Correcting Codes and Quantum Cryptography. Peter Shor M.I.T. Cambridge, MA 02139

Quantum Error Correcting Codes and Quantum Cryptography. Peter Shor M.I.T. Cambridge, MA 02139 Quantum Error Correcting Codes and Quantum Cryptography Peter Shor M.I.T. Cambridge, MA 02139 1 We start out with two processes which are fundamentally quantum: superdense coding and teleportation. Superdense

More information

Lecture 2: August 31

Lecture 2: August 31 0-704: Information Processing and Learning Fall 206 Lecturer: Aarti Singh Lecture 2: August 3 Note: These notes are based on scribed notes from Spring5 offering of this course. LaTeX template courtesy

More information

Entanglement-Assisted Capacity of a Quantum Channel and the Reverse Shannon Theorem

Entanglement-Assisted Capacity of a Quantum Channel and the Reverse Shannon Theorem IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 48, NO. 10, OCTOBER 2002 2637 Entanglement-Assisted Capacity of a Quantum Channel the Reverse Shannon Theorem Charles H. Bennett, Peter W. Shor, Member, IEEE,

More information

Module 1. Introduction to Digital Communications and Information Theory. Version 2 ECE IIT, Kharagpur

Module 1. Introduction to Digital Communications and Information Theory. Version 2 ECE IIT, Kharagpur Module ntroduction to Digital Communications and nformation Theory Lesson 3 nformation Theoretic Approach to Digital Communications After reading this lesson, you will learn about Scope of nformation Theory

More information

The Capacity Region for Multi-source Multi-sink Network Coding

The Capacity Region for Multi-source Multi-sink Network Coding The Capacity Region for Multi-source Multi-sink Network Coding Xijin Yan Dept. of Electrical Eng. - Systems University of Southern California Los Angeles, CA, U.S.A. xyan@usc.edu Raymond W. Yeung Dept.

More information

The Adaptive Classical Capacity of a Quantum Channel,

The Adaptive Classical Capacity of a Quantum Channel, The Adaptive Classical Capacity of a Quantum Channel, or Information Capacities of Three Symmetric Pure States in Three Dimensions Peter W. Shor 1 AT&T Labs Research Florham Park, NJ 07932 02139 1 Current

More information

An Introduction To Resource Theories (Example: Nonuniformity Theory)

An Introduction To Resource Theories (Example: Nonuniformity Theory) An Introduction To Resource Theories (Example: Nonuniformity Theory) Marius Krumm University of Heidelberg Seminar: Selected topics in Mathematical Physics: Quantum Information Theory Abstract This document

More information

Lecture 8: Channel and source-channel coding theorems; BEC & linear codes. 1 Intuitive justification for upper bound on channel capacity

Lecture 8: Channel and source-channel coding theorems; BEC & linear codes. 1 Intuitive justification for upper bound on channel capacity 5-859: Information Theory and Applications in TCS CMU: Spring 23 Lecture 8: Channel and source-channel coding theorems; BEC & linear codes February 7, 23 Lecturer: Venkatesan Guruswami Scribe: Dan Stahlke

More information

FRAMES IN QUANTUM AND CLASSICAL INFORMATION THEORY

FRAMES IN QUANTUM AND CLASSICAL INFORMATION THEORY FRAMES IN QUANTUM AND CLASSICAL INFORMATION THEORY Emina Soljanin Mathematical Sciences Research Center, Bell Labs April 16, 23 A FRAME 1 A sequence {x i } of vectors in a Hilbert space with the property

More information

A Graph-based Framework for Transmission of Correlated Sources over Multiple Access Channels

A Graph-based Framework for Transmission of Correlated Sources over Multiple Access Channels A Graph-based Framework for Transmission of Correlated Sources over Multiple Access Channels S. Sandeep Pradhan a, Suhan Choi a and Kannan Ramchandran b, a {pradhanv,suhanc}@eecs.umich.edu, EECS Dept.,

More information

Lecture 11: Polar codes construction

Lecture 11: Polar codes construction 15-859: Information Theory and Applications in TCS CMU: Spring 2013 Lecturer: Venkatesan Guruswami Lecture 11: Polar codes construction February 26, 2013 Scribe: Dan Stahlke 1 Polar codes: recap of last

More information

1 Introduction to information theory

1 Introduction to information theory 1 Introduction to information theory 1.1 Introduction In this chapter we present some of the basic concepts of information theory. The situations we have in mind involve the exchange of information through

More information

MAHALAKSHMI ENGINEERING COLLEGE-TRICHY QUESTION BANK UNIT V PART-A. 1. What is binary symmetric channel (AUC DEC 2006)

MAHALAKSHMI ENGINEERING COLLEGE-TRICHY QUESTION BANK UNIT V PART-A. 1. What is binary symmetric channel (AUC DEC 2006) MAHALAKSHMI ENGINEERING COLLEGE-TRICHY QUESTION BANK SATELLITE COMMUNICATION DEPT./SEM.:ECE/VIII UNIT V PART-A 1. What is binary symmetric channel (AUC DEC 2006) 2. Define information rate? (AUC DEC 2007)

More information

Transmitting and Hiding Quantum Information

Transmitting and Hiding Quantum Information 2018/12/20 @ 4th KIAS WORKSHOP on Quantum Information and Thermodynamics Transmitting and Hiding Quantum Information Seung-Woo Lee Quantum Universe Center Korea Institute for Advanced Study (KIAS) Contents

More information

A Single-letter Upper Bound for the Sum Rate of Multiple Access Channels with Correlated Sources

A Single-letter Upper Bound for the Sum Rate of Multiple Access Channels with Correlated Sources A Single-letter Upper Bound for the Sum Rate of Multiple Access Channels with Correlated Sources Wei Kang Sennur Ulukus Department of Electrical and Computer Engineering University of Maryland, College

More information

Shannon s Noisy-Channel Coding Theorem

Shannon s Noisy-Channel Coding Theorem Shannon s Noisy-Channel Coding Theorem Lucas Slot Sebastian Zur February 13, 2015 Lucas Slot, Sebastian Zur Shannon s Noisy-Channel Coding Theorem February 13, 2015 1 / 29 Outline 1 Definitions and Terminology

More information

Lecture 5: Channel Capacity. Copyright G. Caire (Sample Lectures) 122

Lecture 5: Channel Capacity. Copyright G. Caire (Sample Lectures) 122 Lecture 5: Channel Capacity Copyright G. Caire (Sample Lectures) 122 M Definitions and Problem Setup 2 X n Y n Encoder p(y x) Decoder ˆM Message Channel Estimate Definition 11. Discrete Memoryless Channel

More information

Nullity of Measurement-induced Nonlocality. Yu Guo

Nullity of Measurement-induced Nonlocality. Yu Guo Jul. 18-22, 2011, at Taiyuan. Nullity of Measurement-induced Nonlocality Yu Guo (Joint work with Pro. Jinchuan Hou) 1 1 27 Department of Mathematics Shanxi Datong University Datong, China guoyu3@yahoo.com.cn

More information

Introduction To Information Theory

Introduction To Information Theory Introduction To Information Theory Edward Witten PiTP 2018 We will start with a very short introduction to classical information theory (Shannon theory). Suppose that you receive a message that consists

More information

An introduction to basic information theory. Hampus Wessman

An introduction to basic information theory. Hampus Wessman An introduction to basic information theory Hampus Wessman Abstract We give a short and simple introduction to basic information theory, by stripping away all the non-essentials. Theoretical bounds on

More information

An exponential separation between quantum and classical one-way communication complexity

An exponential separation between quantum and classical one-way communication complexity An exponential separation between quantum and classical one-way communication complexity Ashley Montanaro Centre for Quantum Information and Foundations, Department of Applied Mathematics and Theoretical

More information

Superposition Encoding and Partial Decoding Is Optimal for a Class of Z-interference Channels

Superposition Encoding and Partial Decoding Is Optimal for a Class of Z-interference Channels Superposition Encoding and Partial Decoding Is Optimal for a Class of Z-interference Channels Nan Liu and Andrea Goldsmith Department of Electrical Engineering Stanford University, Stanford CA 94305 Email:

More information

LECTURE 13. Last time: Lecture outline

LECTURE 13. Last time: Lecture outline LECTURE 13 Last time: Strong coding theorem Revisiting channel and codes Bound on probability of error Error exponent Lecture outline Fano s Lemma revisited Fano s inequality for codewords Converse to

More information

Variable Length Codes for Degraded Broadcast Channels

Variable Length Codes for Degraded Broadcast Channels Variable Length Codes for Degraded Broadcast Channels Stéphane Musy School of Computer and Communication Sciences, EPFL CH-1015 Lausanne, Switzerland Email: stephane.musy@ep.ch Abstract This paper investigates

More information

Lecture 22: Final Review

Lecture 22: Final Review Lecture 22: Final Review Nuts and bolts Fundamental questions and limits Tools Practical algorithms Future topics Dr Yao Xie, ECE587, Information Theory, Duke University Basics Dr Yao Xie, ECE587, Information

More information

Lecture 11 September 30, 2015

Lecture 11 September 30, 2015 PHYS 7895: Quantum Information Theory Fall 015 Lecture 11 September 30, 015 Prof. Mark M. Wilde Scribe: Mark M. Wilde This document is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike

More information

(Classical) Information Theory II: Source coding

(Classical) Information Theory II: Source coding (Classical) Information Theory II: Source coding Sibasish Ghosh The Institute of Mathematical Sciences CIT Campus, Taramani, Chennai 600 113, India. p. 1 Abstract The information content of a random variable

More information

Shannon meets Wiener II: On MMSE estimation in successive decoding schemes

Shannon meets Wiener II: On MMSE estimation in successive decoding schemes Shannon meets Wiener II: On MMSE estimation in successive decoding schemes G. David Forney, Jr. MIT Cambridge, MA 0239 USA forneyd@comcast.net Abstract We continue to discuss why MMSE estimation arises

More information

Chapter 9 Fundamental Limits in Information Theory

Chapter 9 Fundamental Limits in Information Theory Chapter 9 Fundamental Limits in Information Theory Information Theory is the fundamental theory behind information manipulation, including data compression and data transmission. 9.1 Introduction o For

More information

18.2 Continuous Alphabet (discrete-time, memoryless) Channel

18.2 Continuous Alphabet (discrete-time, memoryless) Channel 0-704: Information Processing and Learning Spring 0 Lecture 8: Gaussian channel, Parallel channels and Rate-distortion theory Lecturer: Aarti Singh Scribe: Danai Koutra Disclaimer: These notes have not

More information

Lecture 21: Quantum communication complexity

Lecture 21: Quantum communication complexity CPSC 519/619: Quantum Computation John Watrous, University of Calgary Lecture 21: Quantum communication complexity April 6, 2006 In this lecture we will discuss how quantum information can allow for a

More information

The following definition is fundamental.

The following definition is fundamental. 1. Some Basics from Linear Algebra With these notes, I will try and clarify certain topics that I only quickly mention in class. First and foremost, I will assume that you are familiar with many basic

More information

Quantum Information Theory and Cryptography

Quantum Information Theory and Cryptography Quantum Information Theory and Cryptography John Smolin, IBM Research IPAM Information Theory A Mathematical Theory of Communication, C.E. Shannon, 1948 Lies at the intersection of Electrical Engineering,

More information

MAHALAKSHMI ENGINEERING COLLEGE QUESTION BANK. SUBJECT CODE / Name: EC2252 COMMUNICATION THEORY UNIT-V INFORMATION THEORY PART-A

MAHALAKSHMI ENGINEERING COLLEGE QUESTION BANK. SUBJECT CODE / Name: EC2252 COMMUNICATION THEORY UNIT-V INFORMATION THEORY PART-A MAHALAKSHMI ENGINEERING COLLEGE QUESTION BANK DEPARTMENT: ECE SEMESTER: IV SUBJECT CODE / Name: EC2252 COMMUNICATION THEORY UNIT-V INFORMATION THEORY PART-A 1. What is binary symmetric channel (AUC DEC

More information

are Banach algebras. f(x)g(x) max Example 7.4. Similarly, A = L and A = l with the pointwise multiplication

are Banach algebras. f(x)g(x) max Example 7.4. Similarly, A = L and A = l with the pointwise multiplication 7. Banach algebras Definition 7.1. A is called a Banach algebra (with unit) if: (1) A is a Banach space; (2) There is a multiplication A A A that has the following properties: (xy)z = x(yz), (x + y)z =

More information

On Function Computation with Privacy and Secrecy Constraints

On Function Computation with Privacy and Secrecy Constraints 1 On Function Computation with Privacy and Secrecy Constraints Wenwen Tu and Lifeng Lai Abstract In this paper, the problem of function computation with privacy and secrecy constraints is considered. The

More information

(each row defines a probability distribution). Given n-strings x X n, y Y n we can use the absence of memory in the channel to compute

(each row defines a probability distribution). Given n-strings x X n, y Y n we can use the absence of memory in the channel to compute ENEE 739C: Advanced Topics in Signal Processing: Coding Theory Instructor: Alexander Barg Lecture 6 (draft; 9/6/03. Error exponents for Discrete Memoryless Channels http://www.enee.umd.edu/ abarg/enee739c/course.html

More information

Capacity of AWGN channels

Capacity of AWGN channels Chapter 3 Capacity of AWGN channels In this chapter we prove that the capacity of an AWGN channel with bandwidth W and signal-tonoise ratio SNR is W log 2 (1+SNR) bits per second (b/s). The proof that

More information

On the Duality between Multiple-Access Codes and Computation Codes

On the Duality between Multiple-Access Codes and Computation Codes On the Duality between Multiple-Access Codes and Computation Codes Jingge Zhu University of California, Berkeley jingge.zhu@berkeley.edu Sung Hoon Lim KIOST shlim@kiost.ac.kr Michael Gastpar EPFL michael.gastpar@epfl.ch

More information

Majorization-preserving quantum channels

Majorization-preserving quantum channels Majorization-preserving quantum channels arxiv:1209.5233v2 [quant-ph] 15 Dec 2012 Lin Zhang Institute of Mathematics, Hangzhou Dianzi University, Hangzhou 310018, PR China Abstract In this report, we give

More information

Exercise 1. = P(y a 1)P(a 1 )

Exercise 1. = P(y a 1)P(a 1 ) Chapter 7 Channel Capacity Exercise 1 A source produces independent, equally probable symbols from an alphabet {a 1, a 2 } at a rate of one symbol every 3 seconds. These symbols are transmitted over a

More information