Quantum Statistics - First Steps


Michael Nussbaum (Dept. of Mathematics, Cornell University)

November 30, 2007

Abstract

We will try an elementary introduction to quantum probability and statistics, bypassing the physics in a rapid first glance. From a formal point of view, classical probability distributions on finite sets are replaced by matrices, and certain operations on these matrices (called measurements) give rise to random variables. An important class of measurements are projection operators. It will be seen that quantum statistics has a distinct "linear algebra flavor". We will then describe the problem of quantum hypothesis testing and its relation to optimal measurements, and the quantum version of the Neyman-Pearson lemma. We will conclude with the quantum analog of a classical large sample result for Bayesian testing.

1 Mathematical setting

This discussion focuses on the aspect of generalizing classical probability. For an introduction with more physical background, see e.g. Gill [3].

A. Finite probability spaces and random variables. Let us first recall some basic and simple facts about classical probability. Consider a finite sample space S, which for convenience we take to be the numbers S = {1,...,k}. Assume a probability distribution P given by p_j, j = 1,...,k (with Σ_{j=1}^k p_j = 1, p_j ≥ 0). Together they form a finite probability space (S, P). A random variable is a map X : S → R taking real values X(j) = x_j, j = 1,...,k. If X is one-to-one then

    Pr(X = x_j) = p_j,    (1)

and in general X takes the value x_j with probability

    Pr(X = x_j) = Σ_{i: X(i) = x_j} p_i.

In any case, the expectation of X under P is

    E_P X = Σ_{j=1}^k X(j) p_j.
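The classical setup above can be sketched in a few lines of Python; the space S = {1, 2, 3} and the values of p_j and X(j) below are arbitrary illustrative choices, not taken from the text.

```python
import numpy as np

# A finite probability space (S, P) with S = {1, 2, 3} (illustrative values).
p = np.array([0.5, 0.3, 0.2])   # p_j >= 0, summing to 1
x = np.array([1.0, -1.0, 4.0])  # values X(j) of a one-to-one random variable X

assert np.isclose(p.sum(), 1.0)

# Expectation E_P X = sum_j X(j) p_j
expectation = np.dot(x, p)
print(expectation)  # 0.5*1 + 0.3*(-1) + 0.2*4 = 1.0
```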

B. First generalization. Quantum probability uses complex numbers, but we start with a simplified version using only real numbers.

States. The state of a physical system is a symmetric nonnegative definite k × k matrix ρ = (ρ_ij)_{i,j=1}^k having trace 1:

    Tr[ρ] := Σ_{i=1}^k ρ_ii = 1.

Recall that ρ is symmetric if ρ = ρ^T, and ρ is nonnegative definite if for any x ∈ R^k we have x^T ρ x ≥ 0. For brevity we call such matrices positive (and strictly positive if x^T ρ x > 0 for all x ≠ 0). Recall the spectral decomposition of a symmetric matrix:

    ρ = Σ_{j=1}^k λ_j e_j e_j^T    (2)

where e_1,...,e_k is an orthonormal basis of R^k; the e_j are the eigenvectors pertaining to the real eigenvalues λ_j. Another way of writing (2) is ρ = C Λ C^T, where Λ is the diagonal matrix with diagonal elements λ_j and C is the k × k matrix having e_1,...,e_k as columns. Then C is an orthogonal matrix, i.e. C^T C = I. The eigenvectors e_j are uniquely determined (up to sign) if all eigenvalues λ_j are different. When all eigenvalues are the same, λ_j = λ, then ρ = λI and e_1,...,e_k can be chosen as any orthonormal basis. If there are only two different eigenvalues, λ_0 and λ_1 say, then there are two eigenspaces (linear subspaces of R^k) within which the two bases can be chosen arbitrarily.

Measurements. Suppose we have different mechanisms which from a given state ρ generate different probability distributions. Such a mechanism is called a measurement. A measurement M is defined to be a symmetric k × k matrix (note that it is not required to be positive). Let M = Σ_{j=1}^k x_j m_j m_j^T be the spectral decomposition of M; the eigenvalues x_1,...,x_k can be any real numbers. We postulate that M generates a random variable X_M with values in {x_1,...,x_k} in the following way.

Definition 1 The random variable X_M generated by the measurement M = Σ_{j=1}^k x_j m_j m_j^T takes real values x_1,...,x_k. If the x_j are all different then

    Pr(X_M = x_j) = m_j^T ρ m_j.

In general

    Pr(X_M = x_j) = Σ_{i: x_i = x_j} m_i^T ρ m_i.    (3)
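Definition 1 translates directly into a short numerical sketch (the 2 × 2 state and measurement below are illustrative choices with distinct eigenvalues, not taken from the text):

```python
import numpy as np

# An illustrative 2x2 real state: symmetric, positive, trace 1.
rho = np.array([[0.7, 0.2],
                [0.2, 0.3]])

# An illustrative symmetric measurement M (not required to be positive).
M = np.array([[1.0, 2.0],
              [2.0, -1.0]])

# Spectral decomposition M = sum_j x_j m_j m_j^T
x, m = np.linalg.eigh(M)  # eigenvalues x_j, eigenvectors as columns m[:, j]

# Pr(X_M = x_j) = m_j^T rho m_j   (the x_j are distinct here)
probs = np.array([m[:, j] @ rho @ m[:, j] for j in range(len(x))])

# A valid probability distribution, as checked in the text.
assert np.all(probs >= -1e-12) and np.isclose(probs.sum(), 1.0)
print(x, probs)
```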

Let us check that this indeed gives a probability distribution. First, m_j^T ρ m_j ≥ 0 since ρ is positive. Furthermore, Σ_{j=1}^k m_j^T ρ m_j = 1.

Proof. To see this, note that for any a, b ∈ R^k we may write, for the k × k matrix ab^T,

    Tr[ab^T] = Σ_j a_j b_j = a^T b.

Setting a = m_j, b = ρ m_j we obtain

    m_j^T ρ m_j = Tr[ρ m_j m_j^T].

Also, the trace operation is linear on matrices A, B: Tr[A] + Tr[B] = Tr[A + B]. Hence

    Σ_{j=1}^k Tr[ρ m_j m_j^T] = Tr[ρ Σ_{j=1}^k m_j m_j^T].

Now Σ_{j=1}^k m_j m_j^T = I, since the left side is a spectral decomposition of the unit matrix I (recall that any orthonormal basis can be chosen for I and the eigenvalues are all 1). Hence the right side above is Tr[ρ] = 1 by the assumption on the state ρ.

As a consequence we may write the expectation of X_M under ρ as

    E_ρ X_M = Σ_{j=1}^k x_j m_j^T ρ m_j.

Applying the same reasoning as in the proof above, now including the real numbers x_j, we may write

    Σ_{j=1}^k x_j m_j^T ρ m_j = Tr[ρ Σ_{j=1}^k x_j m_j m_j^T] = Tr[ρM].

Thus we have shown

Proposition 2 (Trace rule) The random variable X_M defined above from the measurement M has expectation E_ρ X_M = Tr[ρM].

Note that we write E_ρ for the expectation, i.e. "expectation under the state ρ". Different measurements M give different distributions of the random variable X_M. In statistics, to
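The trace rule is easy to verify numerically; a minimal sketch, with a randomly generated state and measurement (arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
k = 4

# A random symmetric positive state with trace 1.
A = rng.standard_normal((k, k))
rho = A @ A.T
rho /= np.trace(rho)

# A random symmetric measurement M.
B = rng.standard_normal((k, k))
M = (B + B.T) / 2

# Expectation via Definition 1: E = sum_j x_j m_j^T rho m_j ...
x, m = np.linalg.eigh(M)
expectation = sum(x[j] * (m[:, j] @ rho @ m[:, j]) for j in range(k))

# ... equals the trace rule Tr[rho M] of Proposition 2.
assert np.isclose(expectation, np.trace(rho @ M))
```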

discriminate between two possible states, one now has to select a measurement first, from which one obtains two different distributions of the random variable X_M. Then one has to discriminate between these two distributions by means of a classical test.

Classical probability as a special case. Suppose our state ρ is a diagonal matrix

    ρ = diag(ρ_11, ..., ρ_kk).

Setting p_j = ρ_jj we obtain Σ_{j=1}^k p_j = Tr[ρ] = 1 and p_j = η_j^T ρ η_j ≥ 0, where η_j^T = (0,...,1,...,0) with the 1 at the j-th position. Thus a diagonal state ρ gives a classical probability distribution p_j, j = 1,...,k. Suppose also that we admit only one kind of measurement: we fix the orthonormal basis η_1,...,η_k and allow all measurements M = Σ_{j=1}^k x_j η_j η_j^T where the real values x_1,...,x_k are arbitrary. Then M is also a diagonal matrix

    M = diag(x_1, ..., x_k).

By Definition 1 we obtain a family of random variables X_M where

    Pr(X_M = x_j) = η_j^T ρ η_j = p_j,

i.e. we have reproduced (1). That means we have obtained all random variables on a given probability space (S, P), where P is the measure p_1,...,p_k and S = {1,...,k}. It turns out that fixing the set of eigenvectors of the measurement M reduces the setup to the classical one. It can be seen that we may fix any other set of eigenvectors m_1,...,m_k of the measurement and an arbitrary state ρ, and in this way obtain all random variables on a given probability space (S, P'). Then P' is the measure p'_j = m_j^T ρ m_j, j = 1,...,k. In the physical context, the set of eigenvectors m_1,...,m_k of the measurement determines "directions" or "angles" in which one measures. Changing the angle means changing the underlying probability space.

An example: qbits. A qbit (quantum bit) is a 2 × 2 state. (Recall however that we are not yet in "true" quantum probability, which requires use of complex numbers.) Such states generalize Bernoulli distributions: any random variable X_M from a measurement M can take at most two different values x_1, x_2. Consider the qbit

    ρ = diag(1 − ε, ε),

which is diagonal.

a) If we admit only diagonal measurements then we obtain all functions of a Bernoulli random variable with distribution Bern(ε). Indeed, suppose we measure with the system of eigenvectors (1, 0), (0, 1). Then we obtain the probability space (S, P) with S = {1, 2} and P = Bern(ε) (identifying formally S with {0, 1}).

b) If we admit only measurements with system of eigenvectors m_1, m_2, where m_1^T = (cos φ, sin φ), m_2^T = (−sin φ, cos φ), then we obtain the probability space (S, P') with S = {1, 2} and P' given by

    p'_1 = m_1^T ρ m_1 = (1 − ε) cos²φ + ε sin²φ,
    p'_2 = m_2^T ρ m_2 = (1 − ε) sin²φ + ε cos²φ.

We may choose φ = π/4 so that cos φ = sin φ = 1/√2; then p'_1 = p'_2 = 1/2, which means we have the uniform distribution Bern(1/2).

Here is a small bit of physical background: the famous Stern-Gerlach experiment (1922) on the deflection of particles was used to show how the angle φ at which a certain binary outcome (let's call it "spin up or down") is measured affects the probability distribution of the outcome. For one angle a uniform distribution resulted, for another angle a non-uniform distribution like (1 − ε, ε). Observe that in our example it is possible that ε = 0: then the first measurement results in a probability distribution (1, 0) on the set of outcomes S, while the second measurement still results in the uniform distribution (1/2, 1/2). This is even more baffling: an outcome which appears deterministic when measured at one angle appears random with uniform distribution when measured at another angle.

Projection measurements and hypothesis testing. In the last example we have seen measurements with only two possible outcomes. Such measurements can be made on arbitrary states ρ, not just qbits. In that case one uses projection measurements. Recall that a projection matrix is a symmetric matrix M which has only eigenvalues x_j = 0 or x_j = 1. Thus

    M = Σ_{j=1}^k x_j m_j m_j^T = Σ_{j∈µ} m_j m_j^T

where µ is a subset of the indices {1,...,k}. If µ = {1,...,k} then M = I, and if µ = {j} then M = m_j m_j^T is a projector of rank one. In every case M fulfills MM = M and Mx = x for all x in the linear subspace spanned by {m_j, j ∈ µ}. That subspace is called the eigenspace of M; within the eigenspace we can freely change the basis from {m_j, j ∈ µ} to {m̃_j, j ∈ µ} and still obtain M = Σ_{j∈µ} m̃_j m̃_j^T.

Also, I − M is again a projection, and projects onto the orthogonal complement of the eigenspace of M.

Projections are the measurements used in quantum hypothesis testing; they come up in the following way. Suppose we have two states ρ, σ and we wish to measure and then decide which one of the two states is the true one. So one has to produce a random variable X_M by measurement and then decide according to the outcome. We have to select M not knowing which state is the true one. Suppose we have selected an arbitrary M = Σ_{j=1}^k x_j m_j m_j^T; according to Definition 1 we obtain two probability distributions

    Pr(X_M = x_j | ρ) = m_j^T ρ m_j =: p_j,
    Pr(X_M = x_j | σ) = m_j^T σ m_j =: q_j,    j = 1,...,k
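The qbit computation in case b) above can be replayed numerically; a minimal sketch with φ = π/4, where ε = 0.1 is an arbitrary choice (any ε gives the uniform distribution at this angle):

```python
import numpy as np

eps, phi = 0.1, np.pi / 4
rho = np.diag([1 - eps, eps])              # the diagonal qbit state

m1 = np.array([np.cos(phi), np.sin(phi)])  # rotated measurement basis
m2 = np.array([-np.sin(phi), np.cos(phi)])

p1 = m1 @ rho @ m1                         # (1-eps) cos^2(phi) + eps sin^2(phi)
p2 = m2 @ rho @ m2                         # (1-eps) sin^2(phi) + eps cos^2(phi)
print(p1, p2)                              # both equal 1/2 at phi = pi/4
```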

(assume that all x_j are different; indeed we might want this in order to obtain maximal information). After this the problem becomes classical: find a test, i.e. a function ϕ : {x_1,...,x_k} → {0, 1}, and decide "true state is σ" if the random variable ϕ(X_M) takes value 1. Thus effectively we use the Bernoulli random variable ϕ(X_M) for our decision. It has distribution given by

    Pr(ϕ(X_M) = 1 | ρ) = Σ_{j: ϕ(x_j)=1} p_j = Σ_{j: ϕ(x_j)=1} m_j^T ρ m_j = Tr[ρ Σ_{j: ϕ(x_j)=1} m_j m_j^T].

Now define the projection matrix

    M_ϕ = Σ_{j: ϕ(x_j)=1} m_j m_j^T;

then

    Pr(ϕ(X_M) = 1 | ρ) = Tr[ρ M_ϕ],    Pr(ϕ(X_M) = 1 | σ) = Tr[σ M_ϕ]

and as a consequence

    Pr(ϕ(X_M) = 0 | ρ) = Tr[ρ (I − M_ϕ)],    Pr(ϕ(X_M) = 0 | σ) = Tr[σ (I − M_ϕ)]

where I − M_ϕ is also a projection. Thus we can identify a quantum test with a projection matrix M_ϕ. Here M_ϕ can be of any rank and pertain to any eigenspace; we have not fixed the classical test ϕ (we even included the constants ϕ = 1 and ϕ = 0), nor the vector basis m_1,...,m_k. The last four displays determine the error probabilities: the error of first kind is

    Err_1(M_ϕ) = Pr(ϕ(X_M) = 1 | ρ) = Tr[ρ M_ϕ]

and the error of second kind is

    Err_2(M_ϕ) = Pr(ϕ(X_M) = 0 | σ) = Tr[σ (I − M_ϕ)].

C. Second generalization: actual quantum probability. Before we come to the problem of finding the best test between ρ and σ, let us pay tribute to the fact that quantum probability uses complex numbers.

States. The state of a physical system is a self-adjoint positive k × k matrix ρ = (ρ_ij)_{i,j=1}^k with complex elements having trace 1:

    Tr[ρ] := Σ_{i=1}^k ρ_ii = 1.

Recall that ρ is self-adjoint (or Hermitian) if ρ = ρ*, where ρ* is the complex conjugate transpose: take the transpose ρ^T first and then the complex conjugates of all elements (or

the other way round). Also, "ρ is positive" means that for any complex vector x ∈ C^k we have x* ρ x ≥ 0. It is well known that Hermitian matrices have a spectral decomposition analogous to real symmetric ones:

    ρ = Σ_{j=1}^k λ_j e_j e_j*    (4)

where e_1,...,e_k is an orthonormal basis of C^k; the e_j are the eigenvectors pertaining to the real eigenvalues λ_j. In fact it can be shown that all eigenvalues of a self-adjoint ρ must be real, and the same is true for all diagonal elements ρ_jj: if λ̄_j denotes the complex conjugate of λ_j and e_j is a corresponding normalized eigenvector from C^k, then

    λ_j = e_j* ρ e_j    and    λ̄_j = (e_j* ρ e_j)* = e_j* ρ* e_j = e_j* ρ e_j = λ_j

(here we used (AB)* = B* A* and (A*)* = A for any complex matrices, and ρ* = ρ for self-adjoint ρ); hence λ̄_j = λ_j and λ_j is real. Thus Tr[ρ] is always real and can be set to 1. Basically everything is analogous to the real case if R^k is replaced by C^k and the transpose A^T of any matrix (and any vector) is replaced by the complex conjugate transpose (or adjoint) A*.

Measurements. A measurement M is defined to be a self-adjoint k × k matrix (again not required to be positive). Let M = Σ_{j=1}^k x_j m_j m_j* be the spectral decomposition of M; the eigenvalues x_1,...,x_k can be any real numbers. We postulate that M generates a real random variable X_M with values in {x_1,...,x_k} as before: if the x_j are all different then

    Pr(X_M = x_j | ρ) = m_j* ρ m_j.

As above it is seen that all m_j* ρ m_j are real, nonnegative and sum to one. The trace rule then holds,

    E_ρ X_M = Tr[ρM],

and again classical probability is a special case, since the whole "simplified" setup above with real states ρ is a special case. But we may also fix any basis m_1,...,m_k in C^k and limit ourselves to measurements Σ_{j=1}^k x_j m_j m_j*; this also gives a classical probability space.

2 Quantum Neyman-Pearson lemma

Again consider hypothesis testing between states ρ and σ, but now these are states with complex elements. As above it can be shown that the procedure of obtaining a real valued random variable by a measurement M and then applying a classical test is equivalent to generating a Bernoulli random variable ϕ(X_M) = ϕ using a projection measurement M, and the error probabilities are

    Err_1(M) = Pr(ϕ = 1 | ρ) = Tr[ρM],
    Err_2(M) = Pr(ϕ = 0 | σ) = Tr[σ (I − M)].
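These two error probabilities are simple traces; a minimal numerical sketch (the two qbit states and the rank-one projection below are illustrative choices, not from the text):

```python
import numpy as np

# Two illustrative qbit states (symmetric, positive, trace 1).
rho = np.diag([0.9, 0.1])
sigma = np.array([[0.5, 0.4],
                  [0.4, 0.5]])

# A rank-one projection test M = v v^T (decide "true state is sigma" on outcome 1).
v = np.array([1.0, 1.0]) / np.sqrt(2)
M = np.outer(v, v)
I = np.eye(2)

err1 = np.trace(rho @ M)          # Err_1(M) = Tr[rho M]
err2 = np.trace(sigma @ (I - M))  # Err_2(M) = Tr[sigma (I - M)]
print(err1, err2)
```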

In what follows, a quantum test is a complex projection matrix M, i.e. M is self-adjoint and has only eigenvalues 0 and 1. Suppose we have prior probabilities 1 − π, π on the states ρ, σ, i.e. π is the a priori probability that the true state is σ. The Bayesian error probability is

    Err(M) = (1 − π) Err_1(M) + π Err_2(M).

To find the best (Bayesian) test, let us define for any self-adjoint matrix A the expression supp A_+: if A = Σ_{j=1}^k α_j a_j a_j* is a spectral decomposition, then

    supp A_+ = Σ_{j: α_j > 0} a_j a_j*.

It is obvious that supp A_+ is always a projection (and independent of the choice of the basis a_j if this choice is not unique). This projection is called the support projection for the positive part A_+ of A (with the obvious definition A_+ = Σ_{j: α_j > 0} α_j a_j a_j*). If A is strictly positive then supp A_+ is trivial: supp A_+ = I.

Theorem 3 (Holevo-Helstrom) Suppose 0 < π < 1. All tests M fulfill

    Err(M) ≥ Err(R)

where R is the test R = supp(πσ − (1 − π)ρ)_+.

Note that the matrix πσ − (1 − π)ρ is self-adjoint but not positive (it is a difference of two positive matrices), so R is not trivial. However, if both ρ, σ are diagonal then R corresponds to the Bayesian likelihood ratio test: R is diagonal with diagonal elements

    r_ii = 1{π σ_ii − (1 − π) ρ_ii > 0},    i = 1,...,k,

which in the case π = 1/2 reduces to

    r_ii = 1{σ_ii > ρ_ii},    i = 1,...,k.

In the general case, however, even when π = 1/2 the best test R does not have such an explicit expression: then R = supp(σ − ρ)_+, which cannot in general be expressed in terms of the two eigenbases involved (of ρ and of σ). In fact the eigenbasis of σ − ρ is neither that of ρ nor that of σ, and no explicit expression is known. Facts like this, i.e. problems associated with the behaviour of eigenbases of composed matrices, make up much of the challenge of quantum statistics.

Proof. Write the error probability

    Err(M) = (1 − π) Tr[ρM] + π Tr[σ (I − M)]
           = π + Tr[((1 − π)ρ − πσ) M]
           = π − Tr[(πσ − (1 − π)ρ) M].
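Theorem 3 lends itself to a numerical check: build R = supp(πσ − (1 − π)ρ)_+ from an eigendecomposition and compare its Bayesian error against a few other projection tests (a sketch; the qbit states are arbitrary illustrative choices):

```python
import numpy as np

def err(M, rho, sigma, pi):
    """Bayesian error Err(M) = (1-pi) Tr[rho M] + pi Tr[sigma (I - M)]."""
    I = np.eye(len(rho))
    return (1 - pi) * np.trace(rho @ M) + pi * np.trace(sigma @ (I - M))

# Two illustrative qbit states and the symmetric prior pi = 1/2.
rho = np.diag([0.9, 0.1])
sigma = np.array([[0.5, 0.4],
                  [0.4, 0.5]])
pi = 0.5

# Holevo-Helstrom test: support projection of the positive part of pi*sigma - (1-pi)*rho.
A = pi * sigma - (1 - pi) * rho
alpha, a = np.linalg.eigh(A)
R = sum(np.outer(a[:, j], a[:, j]) for j in range(len(alpha)) if alpha[j] > 0)

# R beats a handful of other projection tests, as the theorem asserts.
for v in [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0]) / np.sqrt(2)]:
    assert err(R, rho, sigma, pi) <= err(np.outer(v, v), rho, sigma, pi) + 1e-12
print(err(R, rho, sigma, pi))
```

Note that, consistent with the proof below the theorem, the printed minimal error equals π minus the sum of the positive eigenvalues of πσ − (1 − π)ρ.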

Thus to minimize Err(M) we have to maximize Tr[(πσ − (1 − π)ρ) M] over projections M. Let

    A = πσ − (1 − π)ρ

and let A = Σ_{j=1}^k α_j a_j a_j* be a spectral decomposition; then

    Tr[AM] = Σ_{j=1}^k α_j a_j* M a_j.

Now clearly the numbers a_j* M a_j are between 0 and 1 (indeed a_j* M a_j ≥ 0 since M is positive, and a_j* M a_j ≤ 1 since the largest eigenvalue of M is one). This implies the above sum cannot exceed the sum of all positive α_j, i.e.

    Tr[AM] ≤ Σ_{j: α_j > 0} α_j.

This upper bound is attained for

    M = R = Σ_{j: α_j > 0} a_j a_j*,

indeed:

    Tr[AR] = Σ_{j=1}^k α_j a_j* R a_j = Σ_{j=1}^k α_j a_j* ( Σ_{i: α_i > 0} a_i a_i* ) a_j = Σ_{j: α_j > 0} α_j.

3 Asymptotics for quantum hypothesis testing

What is the analog of n i.i.d. data in the quantum setting? Consider the n-fold tensor product of a density matrix (or state) ρ, i.e. ρ^⊗n. Recall that the tensor product of two k × k matrices A, B is given by

    A ⊗ B = ( a_11 B  ...  a_1k B
              ...           ...
              a_k1 B  ...  a_kk B ),

which is a matrix of dimension k² × k². It can be shown that ρ ⊗ ρ is again a state, i.e. it is positive and has trace one (for the latter, check Tr[A ⊗ B] = Tr[A] Tr[B]). It can also be verified that if ρ is diagonal and we limit ourselves to diagonal measurements, then the classical notion of product measure is obtained. Now, for a large sample asymptotics with n → ∞, one assumes the state is

    ρ^⊗n = ρ ⊗ ρ ⊗ ... ⊗ ρ.

This is a k^n × k^n matrix. In the testing problem, we have to discriminate between ρ^⊗n and σ^⊗n, using a test measurement M on the whole system, i.e. M is a k^n × k^n projection matrix

(a projection in C^(k^n)). The error criterion for symmetric Bayesian hypothesis testing (with π = 1/2) is

    Err_n(M) = (1/2) ( Tr[ρ^⊗n M] + Tr[σ^⊗n (I − M)] )

where I is the identity operator in C^(k^n). According to Theorem 3, the best test is the Holevo-Helstrom projection

    R_n = supp(σ^⊗n − ρ^⊗n)_+.

If we are interested in the asymptotics of the error probability as n → ∞, we are faced with the fact that the Holevo-Helstrom projection uses the eigenbasis of σ^⊗n − ρ^⊗n. As noted already, σ^⊗n − ρ^⊗n has a completely different eigenbasis than either σ^⊗n or ρ^⊗n. Also, this is a computation in k^n-dimensional space. The following generalizes a classical result on the asymptotics of the Bayesian error probability for π = 1/2, the Chernoff bound.

Theorem 4 (Quantum Chernoff lower bound) Let ρ, σ be two k × k density matrices representing quantum states. Then any sequence of k^n × k^n test projections M_n, n ∈ N, satisfies

    liminf_{n→∞} (1/n) log Err_n(M_n) ≥ inf_{0 ≤ s ≤ 1} log Tr[ρ^(1−s) σ^s].    (5)

For the proof and further references see [1]. The lower bound is attainable, cf. [2]. For quantum computing, entanglement and paradoxes cf. [4].

References

[1] Nussbaum, M. and Szkoła, A. (2007). The Chernoff lower bound for symmetric quantum hypothesis testing. To appear, The Annals of Statistics. Available under www.minu.de/math/papers.

[2] Audenaert, K. M. R., Nussbaum, M., Szkoła, A. and Verstraete, F. (2007). Asymptotic error rates in quantum hypothesis testing. arXiv:0708.4282v1 [quant-ph]. To appear, Commun. Math. Phys.

[3] Gill, R. (2001). Asymptotics in quantum statistics. In: State of the Art in Probability and Statistics (A.W. van der Vaart, M. de Gunst, C.A.J. Klaassen, eds.), IMS Lecture Notes - Monograph Series 36, 255-285. Also at arXiv:math/0405571v1.

[4] Nielsen, M. and Chuang, I. (2000). Quantum Computation and Quantum Information. Cambridge University Press.