Appendix B: Inequalities Involving Random Variables and Their Expectations


Chapter Fourteen

In this appendix we present specific properties of the expectation, additional to those of the integral of measurable functions on possibly infinite measure spaces. It is to be expected that on probability spaces we may obtain more specific properties, since the probability space has total measure 1.

Proposition 14.1 (Markov inequality) Let $Z$ be a r.v. and let $g : \mathbb{R} \to [0, \infty]$ be an increasing, positive measurable function. Then
$$E[g(Z)] \ge E\big[g(Z)\mathbf{1}_{\{Z \ge c\}}\big] \ge g(c)\,P(Z \ge c).$$
Thus
$$P(Z \ge c) \le \frac{E[g(Z)]}{g(c)}$$
for all increasing functions $g$ and all $c > 0$.

Proof: Take $\lambda > 0$ arbitrary and define the random variable $Y = \lambda \mathbf{1}_{\{|X| \ge \lambda\}}$.

Handbook of Probability, First Edition. Ionuţ Florescu and Ciprian Tudor. © 2014 John Wiley & Sons, Inc. Published 2014 by John Wiley & Sons, Inc.

Then clearly $Y \le |X|$ and, taking the expectation, we get $EY = \lambda P(|X| \ge \lambda) \le E|X|$.

EXAMPLE 14.1 Special cases of the Markov inequality

If we take $g(x) = x$, an increasing function, and $Z$ a positive random variable, then we obtain
$$P(Z \ge c) \le \frac{E(Z)}{c}.$$
To get rid of the positivity condition, we take the random variable $Z = |X|$. Then we obtain the classical form of the Markov inequality:
$$P(|X| \ge c) \le \frac{E|X|}{c}.$$

If we take $g(x) = x^2$ and $Z = |X - E(X)|$ and we use the definition of variance, we obtain the Chebyshev inequality:
$$P(|X - E(X)| \ge c) \le \frac{\mathrm{Var}(X)}{c^2}.$$

If we denote $E(X) = \mu$ and $\mathrm{Var}(X) = \sigma^2$ and we take $c = k\sigma$ in the previous inequality, we obtain the classical Chebyshev inequality presented in undergraduate courses:

Proposition 14.2 For every $k > 0$ and for any random variable $X$ such that $EX^2 < \infty$, we have
$$P(|X - \mu| \ge k\sigma) \le \frac{1}{k^2}.$$

If we take $g(x) = e^{\theta x}$, with $\theta > 0$, then
$$P(Z \ge c) \le e^{-\theta c} E(e^{\theta Z}).$$
This last inequality states that the tail of the distribution decays exponentially in $c$ if $Z$ has finite exponential moments. With simple manipulations, one can obtain Chernoff's inequality from it.
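As a quick numerical sanity check (not from the text; the Exponential(1) distribution is a hypothetical illustrative choice, convenient because its tail, mean, and variance are all known in closed form), the Markov and Chebyshev bounds above can be compared with exact tail probabilities:

```python
import math

# Illustrative check for X ~ Exponential(1): E(X) = Var(X) = 1 and
# P(X >= c) = exp(-c), so both bounds can be compared with exact values.

def exp_tail(c):
    """Exact P(X >= c) for X ~ Exponential(1)."""
    return math.exp(-c)

def exp_central_tail(c):
    """Exact P(|X - 1| >= c) for X ~ Exponential(1)."""
    upper = math.exp(-(1.0 + c))                          # P(X >= 1 + c)
    lower = 1.0 - math.exp(-(1.0 - c)) if c < 1 else 0.0  # P(X <= 1 - c)
    return upper + lower

# Markov: P(X >= c) <= E(X)/c;  Chebyshev: P(|X - E(X)| >= c) <= Var(X)/c^2
markov_ok = all(exp_tail(c) <= 1.0 / c for c in (0.5, 1.0, 2.0, 5.0))
cheby_ok = all(exp_central_tail(c) <= 1.0 / c**2 for c in (1.5, 2.0, 3.0))
```

Both checks pass, and they also show how loose the bounds can be: at $c = 5$ the exact tail is $e^{-5} \approx 0.007$, while Markov only guarantees $0.2$.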

Remark 14.3 In fact the Chebyshev inequality is far from being sharp. Consider, for example, a random variable $X$ with standard normal distribution $N(0, 1)$. If we calculate the probability using a table of the normal law or using the computer, we obtain $P(X \ge 2) = 1 - \Phi(2) \approx 0.0228$. However, if we bound the probability using the Chebyshev inequality, we obtain
$$P(X \ge 2) = \tfrac{1}{2} P(|X| \ge 2) \le \tfrac{1}{2} \cdot \tfrac{1}{4} = \tfrac{1}{8} = 0.125,$$
which is very far from the actual probability.

The following definition is just a reminder.

Definition 14.4 A function $g : I \to \mathbb{R}$ is called a convex function on $I$ (where $I$ is any open interval in $\mathbb{R}$) if its graph lies below any of its chords. Mathematically: for any $x, y \in I$ and for any $\lambda \in (0, 1)$, we have
$$g(\lambda x + (1 - \lambda)y) \le \lambda g(x) + (1 - \lambda)g(y).$$
A function $g$ is called concave if the opposite happens:
$$g(\lambda x + (1 - \lambda)y) \ge \lambda g(x) + (1 - \lambda)g(y).$$

Some examples of convex functions on the whole of $\mathbb{R}$: $|x|$, $x^2$, and $e^{\theta x}$ with $\theta > 0$.

Lemma 14.5 (Jensen's inequality) Let $f$ be a convex function and let $X$ be a r.v. in $L^1(\Omega)$. Assume that $E(f(X))$ exists. Then
$$f(E(X)) \le E(f(X)).$$

Proof: Skipped. The classic approach (indicators, then simple functions, then positive measurable functions, then general measurable functions) is a standard way to prove Jensen's inequality.

Remark 14.6 The discrete form of Jensen's inequality is as follows. Let $\varphi : \mathbb{R} \to \mathbb{R}$ be a convex function and let $x_1, \dots, x_n \in \mathbb{R}$ and $a_i > 0$ for $i = 1, \dots, n$. Then
$$\varphi\!\left(\frac{\sum_{i=1}^n a_i x_i}{\sum_{i=1}^n a_i}\right) \le \frac{\sum_{i=1}^n a_i \varphi(x_i)}{\sum_{i=1}^n a_i}.$$
If the function $\varphi$ is concave, we have
$$\varphi\!\left(\frac{\sum_{i=1}^n a_i x_i}{\sum_{i=1}^n a_i}\right) \ge \frac{\sum_{i=1}^n a_i \varphi(x_i)}{\sum_{i=1}^n a_i}.$$
The remark is a particular case of the Jensen inequality. Indeed, consider a discrete random variable $X$ with outcomes $x_i$ and corresponding probabilities

$a_i / \sum_j a_j$. Apply Jensen's inequality above to the convex function $\varphi$, using the expression of the expectation of discrete random variables.

A Historical Remark The next inequality, one of the most famous and useful in any area of analysis (not only probability), is usually credited to Cauchy for sums and Schwarz for integrals and is usually known as the Cauchy–Schwarz inequality. However, the Russian mathematician Victor Yakovlevich Bunyakovsky (1804–1889) discovered and first published the inequality for integrals in 1859 (when Schwarz was 16). Unfortunately, he was born in eastern Europe. However, all who are born in eastern Europe (including myself) learn the inequality by its proper name.

Lemma 14.7 (Cauchy–Bunyakovsky–Schwarz inequality) If $X, Y \in L^2(\Omega)$, then $XY \in L^1(\Omega)$ and
$$|E[XY]| \le E[|XY|] \le \|X\|_2 \|Y\|_2,$$
where we used the notation of the norm in $L^p$: $\|X\|_p = \left(E[|X|^p]\right)^{1/p}$.

Proof: The first inequality is clear, applying Jensen's inequality to the function $|x|$. We need to show
$$E[|XY|] \le (E[X^2])^{1/2} (E[Y^2])^{1/2}.$$
Let $W = |X|$ and $Z = |Y|$. Clearly, $W, Z \ge 0$.

Truncation. Let $W_n = W \wedge n$ and $Z_n = Z \wedge n$; that is,
$$W_n(\omega) = \begin{cases} W(\omega), & \text{if } W(\omega) < n, \\ n, & \text{if } W(\omega) \ge n. \end{cases}$$
Clearly, defined in this way, $W_n, Z_n$ are bounded. Let $a, b \in \mathbb{R}$ be two constants. Then
$$0 \le E[(aW_n + bZ_n)^2] = a^2 E(W_n^2) + 2ab\,E(W_n Z_n) + b^2 E(Z_n^2).$$
If we let $a/b = c$, we get
$$c^2 E(W_n^2) + 2c\,E(W_n Z_n) + E(Z_n^2) \ge 0, \qquad \forall c \in \mathbb{R}.$$
This means that the quadratic function in $c$ has to be nonnegative. But this is only possible if the discriminant of the equation is nonpositive and the leading coefficient

$E(W_n^2)$ is strictly positive; the latter condition is obviously true. Thus we must have
$$4(E(W_n Z_n))^2 - 4E(W_n^2)E(Z_n^2) \le 0 \;\Rightarrow\; (E(W_n Z_n))^2 \le E(W_n^2)E(Z_n^2) \le E(W^2)E(Z^2) \quad \forall n,$$
which is in fact the inequality for the truncated variables. If we let $n \to \infty$ and use the monotone convergence theorem, we get
$$(E(WZ))^2 \le E(W^2)E(Z^2).$$

A generalization of the Cauchy–Bunyakovsky–Schwarz inequality is:

Lemma 14.8 (Hölder inequality) If $1/p + 1/q = 1$, $X \in L^p(\Omega)$, and $Y \in L^q(\Omega)$, then $XY \in L^1(\Omega)$ and
$$E|XY| \le \|X\|_p \|Y\|_q = \left(E|X|^p\right)^{1/p} \left(E|Y|^q\right)^{1/q}.$$

Proof: The proof is simple and uses the following inequality (the Young inequality): if $a$ and $b$ are positive real numbers and $p, q$ are as in the theorem, then
$$ab \le \frac{a^p}{p} + \frac{b^q}{q},$$
with equality if and only if $a^p = b^q$. Taking this inequality as given (it is not hard to prove), define
$$f = \frac{|X|}{\|X\|_p}, \qquad g = \frac{|Y|}{\|Y\|_q}.$$
Note that the Hölder inequality is equivalent to $E[fg] \le 1$. (Note that $\|X\|_p$ and $\|Y\|_q$ are just numbers which can be taken in and out of the integral using the linearity property of the integral.) To finish the proof, apply the Young inequality to $f \ge 0$ and $g \ge 0$ and then integrate to obtain
$$E[fg] \le \frac{1}{p} E[f^p] + \frac{1}{q} E[g^q] = \frac{1}{p} + \frac{1}{q} = 1,$$
since $E[f^p] = 1$ and similarly for $g$. Finally, the extreme cases ($p = 1$, $q = \infty$, etc.) may be treated separately, but they yield the same inequality.

This inequality together with the Riesz representation theorem creates the notion of conjugate space. This notion is only provided to create links with real analysis. For further details we recommend Royden (1988).
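Since the empirical (uniform) measure on a finite sample is itself a probability measure, the Cauchy–Bunyakovsky–Schwarz and Hölder inequalities must hold for sample averages as well. The following sketch checks both; the distributions, seed, and sample size are arbitrary illustrative choices, not from the text:

```python
import random

random.seed(42)
n = 1000
xs = [random.gauss(0.0, 1.0) for _ in range(n)]   # arbitrary sample for X
ys = [random.expovariate(1.0) for _ in range(n)]  # arbitrary sample for Y

def lp_norm(vs, p):
    """Empirical L^p norm (E|V|^p)^(1/p) under the uniform measure on the sample."""
    return (sum(abs(v) ** p for v in vs) / len(vs)) ** (1.0 / p)

e_abs_xy = sum(abs(x * y) for x, y in zip(xs, ys)) / n  # empirical E|XY|

# Cauchy-Schwarz is Holder with p = q = 2
cs_ok = e_abs_xy <= lp_norm(xs, 2) * lp_norm(ys, 2)
# Holder with conjugate indices p = 3, q = 3/2 (since 1/3 + 2/3 = 1)
holder_ok = e_abs_xy <= lp_norm(xs, 3) * lp_norm(ys, 1.5)
```

Both inequalities hold deterministically for every sample, so the checks pass for any seed, not just this one.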

Definition 14.9 (Conjugate space of $L^p$) For $p > 0$, let $L^p(\Omega)$ denote the space on $(\Omega, \mathcal{F}, P)$. The number $q > 0$ with the property $1/p + 1/q = 1$ is called the conjugate index of $p$. The corresponding space $L^q(\Omega)$ is called the conjugate space of $L^p(\Omega)$.

Any of these spaces are metric spaces with the distance induced by the norm, that is,
$$d(X, Y) = \|X - Y\|_p = \left(E\left[|X - Y|^p\right]\right)^{1/p}.$$
The fact that this is a properly defined linear space is implied by the triangle inequality in $L^p$, the next theorem.

Lemma 14.10 (Minkowski inequality) If $X, Y \in L^p$, then $X + Y \in L^p$ and
$$\|X + Y\|_p \le \|X\|_p + \|Y\|_p.$$

Proof: We clearly have
$$|X + Y|^p \le 2^{p-1}\left(|X|^p + |Y|^p\right).$$
To show this inequality in terms of real numbers, just use the definition of convexity for the function $x^p$ with $x = |X|$, $y = |Y|$, and $\lambda = 1/2$. Integrating the inequality implies that $X + Y \in L^p$. Now we can write
$$\|X + Y\|_p^p = E[|X + Y|^p] \le E\big[(|X| + |Y|)|X + Y|^{p-1}\big] = E\big[|X||X + Y|^{p-1}\big] + E\big[|Y||X + Y|^{p-1}\big]$$
and, by the Hölder inequality, this is
$$\le \left(E[|X|^p]\right)^{1/p}\left(E[|X + Y|^{(p-1)q}]\right)^{1/q} + \left(E[|Y|^p]\right)^{1/p}\left(E[|X + Y|^{(p-1)q}]\right)^{1/q},$$
so, using $q = p/(p-1)$,
$$= \left(\|X\|_p + \|Y\|_p\right)\left(E[|X + Y|^p]\right)^{1 - \frac{1}{p}} = \left(\|X\|_p + \|Y\|_p\right)\|X + Y\|_p^{p-1}.$$
Finally, identifying the left- and right-hand sides after simplification, we obtain the result.

The Case of $L^2$ The case when $p = 2$ is quite special. This is because 2 is its own conjugate index ($1/2 + 1/2 = 1$). Because of this, the space is quite similar to the Euclidean space. If $X, Y \in L^2$, we may define the inner product
$$\langle X, Y \rangle = E[XY] = \int_\Omega XY \, dP,$$

which is a well-defined quantity by the Cauchy–Bunyakovsky–Schwarz inequality. The existence of the inner product and the completeness of the norm make $L^2$ a Hilbert space, with all the benefits that follow. In particular, the notion of orthogonality is well-defined. Two variables $X$ and $Y$ in $L^2$ are orthogonal if and only if $\langle X, Y \rangle = 0$. In turn, the orthogonality definition allows a Fourier representation and, in general, representations in terms of an orthonormal basis of functions in $L^2$. Again, we do not wish to enter into more details than necessary; please consult (Billingsley, 1995, Section 19) for further reference.

A consequence of the Markov inequality is the Bernstein inequality.

Proposition 14.11 (Bernstein inequality) Let $X_1, X_2, \dots, X_n$ be independent, square integrable random variables with zero expectation. Assume that there exists a constant $M > 0$ such that for every $i = 1, \dots, n$ we have $|X_i| \le M$ almost surely, that is, the variables are bounded by $M$ almost surely. Then, for every $t \ge 0$, we have
$$P\!\left(\sum_{i=1}^n X_i > t\right) \le \exp\!\left(-\frac{t^2}{2\sum_{i=1}^n EX_i^2 + \frac{2Mt}{3}}\right).$$

EXAMPLE 14.2 A random variable $X$ has finite variance $\sigma^2$. Show that for any number $c$,
$$P(X \ge t) \le \frac{E[(X + c)^2]}{(t + c)^2} \quad \text{if } t > -c.$$
Show that if $E(X) = 0$, then
$$P(X \ge t) \le \frac{\sigma^2}{\sigma^2 + t^2}, \qquad t > 0.$$

Solution: Let us use a technique similar to the Markov inequality to prove the first inequality. Let $F(x)$ be the distribution function of $X$. For any $c \in \mathbb{R}$ we may write
$$E\left[(X + c)^2\right] = \int_{-\infty}^{t} (x + c)^2 \, dF(x) + \int_{t}^{\infty} (x + c)^2 \, dF(x).$$

The first integral is always positive and, if $t > -c$, then $t + c > 0$ and on the interval $x \in (t, \infty)$ the function $(x + c)^2$ is increasing. Therefore we may continue:
$$E\left[(X + c)^2\right] \ge \int_{t}^{\infty} (t + c)^2 \, dF(x) = (t + c)^2 P(X > t).$$
Rewriting the final expression gives the first assertion.

To show the second assertion, note that if $E[X] = 0$, then $V(X) = E[X^2]$ and thus $E[(X + c)^2] = \sigma^2 + c^2$. Thus the inequality we just proved reads in this case:
$$P(X \ge t) \le \frac{\sigma^2 + c^2}{(t + c)^2}, \quad \text{if } t > -c.$$
Now take $c = \sigma^2/t$. Then $-c$ is a negative value for any positive $t$, so the condition $t > -c$ is satisfied for any positive $t$. Substituting and simplifying, we obtain exactly what we need. You may wonder (and should wonder) how we came up with the value $\sigma^2/t$. The explanation is simple: that is the value of $c$ which minimizes the expression $\frac{\sigma^2 + c^2}{(t + c)^2}$; in other words, the value of $c$ which produces the best bound.

14.1 Functions of Random Variables. The Transport Formula

In the previous chapters dedicated to discrete and continuous random variables, we learned how to calculate distributions, in particular pdf's for continuous random variables. In this appendix we present a more general result. This general result allows us to construct random variables and, in particular, distributions on any abstract space. This is the result that allows us to claim that studying random variables on $([0, 1], \mathcal{B}([0, 1]), \lambda)$ is enough. We had to postpone presenting the result until this point since we first had to learn how to integrate.

Theorem 14.12 (General Transport Formula) Let $(\Omega, \mathcal{F}, P)$ be a probability space. Let $f$ be a measurable function such that
$$(\Omega, \mathcal{F}) \xrightarrow{\;f\;} (\Omega', \mathcal{G}) \xrightarrow{\;\varphi\;} (\mathbb{R}, \mathcal{B}(\mathbb{R})),$$
where $(\Omega', \mathcal{G})$ is a measurable space. Assuming that at least one of the integrals exists, we then have
$$\int_\Omega \varphi \circ f \, dP = \int_{\Omega'} \varphi \, d(P \circ f^{-1}),$$
for all measurable functions $\varphi$.

Proof: We will use the standard argument technique discussed above.

1. Let $\varphi$ be the indicator function $\varphi = \mathbf{1}_A$ for $A \in \mathcal{G}$:
$$\mathbf{1}_A(\omega) = \begin{cases} 1 & \text{if } \omega \in A, \\ 0 & \text{otherwise.} \end{cases}$$
Then we get
$$\int \mathbf{1}_A \circ f \, dP = \int \mathbf{1}_A(f(\omega)) \, dP(\omega) = \int \mathbf{1}_{f^{-1}(A)}(\omega) \, dP(\omega) = P(f^{-1}(A)) = P \circ f^{-1}(A) = \int \mathbf{1}_A \, d(P \circ f^{-1}),$$
recalling the definition of the integral of an indicator.

2. Let $\varphi$ be a simple function $\varphi = \sum_{i=1}^n a_i \mathbf{1}_{A_i}$, where the $a_i$ are constants and $A_i \in \mathcal{G}$. Then
$$\int \varphi \circ f \, dP = \int \left(\sum_{i=1}^n a_i \mathbf{1}_{A_i}\right) \circ f \, dP = \sum_{i=1}^n a_i \int (\mathbf{1}_{A_i} \circ f) \, dP \stackrel{\text{(part 1)}}{=} \sum_{i=1}^n a_i \int \mathbf{1}_{A_i} \, d(P \circ f^{-1}) = \int \varphi \, d(P \circ f^{-1}).$$

3. Let $\varphi$ be a positive measurable function and let $\varphi_n$ be a sequence of simple functions such that $\varphi_n \uparrow \varphi$. Then
$$\int \varphi \circ f \, dP = \int \left(\lim_n \varphi_n\right) \circ f \, dP = \int \lim_n (\varphi_n \circ f) \, dP \stackrel{\text{(MCT)}}{=} \lim_n \int \varphi_n \circ f \, dP \stackrel{\text{(part 2)}}{=} \lim_n \int \varphi_n \, d(P \circ f^{-1}) \stackrel{\text{(MCT)}}{=} \int \varphi \, d(P \circ f^{-1}),$$
where (MCT) denotes the monotone convergence theorem.

4. Let $\varphi$ be a measurable function. Then $\varphi^+ = \max(\varphi, 0)$ and $\varphi^- = \max(-\varphi, 0)$, which gives us $\varphi = \varphi^+ - \varphi^-$. Since at least one integral is assumed to exist, we get that the integrals of $\varphi^+$ and $\varphi^-$ exist. Also note that
$$\varphi^+ \circ f(\omega) = \varphi^+(f(\omega)) = \max(\varphi(f(\omega)), 0) = \max(\varphi \circ f(\omega), 0) = (\varphi \circ f)^+(\omega).$$

Then
$$\int \varphi^+ \, d(P \circ f^{-1}) = \int \varphi^+ \circ f \, dP = \int (\varphi \circ f)^+ \, dP, \qquad \int \varphi^- \, d(P \circ f^{-1}) = \int \varphi^- \circ f \, dP = \int (\varphi \circ f)^- \, dP.$$
These equalities follow from part 3 of the proof. After subtracting the two, we obtain
$$\int \varphi \, d(P \circ f^{-1}) = \int \varphi \circ f \, dP.$$

EXAMPLE 14.3 If $X$ and $Y$ are independent random variables defined on $(\Omega, \mathcal{F}, P)$ with $X, Y \in L^1(\Omega)$, then $XY \in L^1(\Omega)$ and
$$\int XY \, dP = \int X \, dP \int Y \, dP \qquad (E(XY) = E(X)E(Y)).$$

Solution: Let us solve this example using the transport formula. Take $f : \Omega \to \mathbb{R}^2$, $f(\omega) = (X(\omega), Y(\omega))$, and $\varphi : \mathbb{R}^2 \to \mathbb{R}$, $\varphi(x, y) = xy$. Then we have from the transport formula:
$$\int_\Omega X(\omega)Y(\omega) \, dP(\omega) \stackrel{\text{(T)}}{=} \int_{\mathbb{R}^2} xy \, d\left(P \circ (X, Y)^{-1}\right).$$
The integral on the left is $E(XY)$, while the integral on the right, using independence (the joint law is the product of the marginals), can be calculated as
$$\int_{\mathbb{R}^2} xy \, d\left(P \circ X^{-1} \otimes P \circ Y^{-1}\right) = \int_{\mathbb{R}} x \, d(P \circ X^{-1}) \int_{\mathbb{R}} y \, d(P \circ Y^{-1}) \stackrel{\text{(T)}}{=} \int_\Omega X \, dP \int_\Omega Y \, dP = E(X)E(Y).$$
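The identity $E(XY) = E(X)E(Y)$ for independent variables can also be illustrated by Monte Carlo simulation; the distributions, seed, and sample size below are arbitrary illustrative assumptions, not from the text:

```python
import random

random.seed(1)
n = 100_000
xs = [random.gauss(0.0, 1.0) for _ in range(n)]  # X ~ N(0, 1), so E(X) = 0
ys = [random.random() for _ in range(n)]         # Y ~ Uniform(0, 1), so E(Y) = 1/2

def mean(vs):
    return sum(vs) / len(vs)

e_xy = mean([x * y for x, y in zip(xs, ys)])  # estimates E(XY)
e_x_e_y = mean(xs) * mean(ys)                 # estimates E(X)E(Y)

# Since the samples are drawn independently, the two estimates
# should agree up to Monte Carlo error of order 1/sqrt(n).
gap = abs(e_xy - e_x_e_y)
```

With $n = 100{,}000$ samples the standard error of each estimate is of order $10^{-3}$, so the gap between the two estimates is small.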

EXAMPLE 14.4 Finally, we conclude with an application of the transport formula which produces one of the most useful formulas. Let $X$ be a r.v. defined on the probability space $(\Omega, \mathcal{F}, P)$ with distribution function $F(x)$. Show that
$$E(X) = \int_{\mathbb{R}} x \, dF(x),$$
where the integral is understood in the Riemann–Stieltjes sense.

Proving the formula is immediate. Take $f : \Omega \to \mathbb{R}$, $f(\omega) = X(\omega)$, and $\varphi : \mathbb{R} \to \mathbb{R}$, $\varphi(x) = x$. Then from the transport formula we have
$$E(X) = \int_\Omega X(\omega) \, dP(\omega) \stackrel{\text{(T)}}{=} \int_{\mathbb{R}} x \, d(P \circ X^{-1})(x) = \int_{\mathbb{R}} x \, dF(x).$$
Clearly, if the distribution function $F(x)$ is differentiable with $dF(x) = f(x)\,dx$, we obtain the classical formula for calculating the expectation of a continuous random variable:
$$E(X) = \int_{\mathbb{R}} x f(x) \, dx.$$
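The last formula can be checked numerically. As a sketch (the exponential density, rate, step size, and truncation point are all illustrative assumptions, not from the text), a midpoint Riemann sum of $x f(x)$ for the Exponential(2) density recovers the known mean $1/\lambda = 0.5$:

```python
import math

lam = 2.0  # rate of the illustrative Exponential(lam) distribution

def density(x):
    """pdf f(x) = lam * exp(-lam * x) on [0, infinity)."""
    return lam * math.exp(-lam * x)

# Midpoint Riemann sum for E(X) = integral of x * f(x) dx over [0, 25];
# the tail beyond 25 contributes on the order of exp(-50), i.e. is negligible.
h = 0.001
steps = 25_000
expectation = sum(
    (k + 0.5) * h * density((k + 0.5) * h) * h
    for k in range(steps)
)
# The exact mean of Exponential(lam) is 1/lam = 0.5.
```

The midpoint rule has error of order $h^2$ here, so the computed value matches $0.5$ to well within $10^{-3}$.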


Lecture 4: September Reminder: convergence of sequences 36-705: Intermediate Statistics Fall 2017 Lecturer: Siva Balakrishnan Lecture 4: September 6 In this lecture we discuss the convergence of random variables. At a high-level, our first few lectures focused

More information

Lecture 2 One too many inequalities

Lecture 2 One too many inequalities University of Illinois Department of Economics Spring 2017 Econ 574 Roger Koenker Lecture 2 One too many inequalities In lecture 1 we introduced some of the basic conceptual building materials of the course.

More information

Chapter 1 Preliminaries

Chapter 1 Preliminaries Chapter 1 Preliminaries 1.1 Conventions and Notations Throughout the book we use the following notations for standard sets of numbers: N the set {1, 2,...} of natural numbers Z the set of integers Q the

More information

Limiting Distributions

Limiting Distributions Limiting Distributions We introduce the mode of convergence for a sequence of random variables, and discuss the convergence in probability and in distribution. The concept of convergence leads us to the

More information

2. Variance and Higher Moments

2. Variance and Higher Moments 1 of 16 7/16/2009 5:45 AM Virtual Laboratories > 4. Expected Value > 1 2 3 4 5 6 2. Variance and Higher Moments Recall that by taking the expected value of various transformations of a random variable,

More information

STAT 7032 Probability Spring Wlodek Bryc

STAT 7032 Probability Spring Wlodek Bryc STAT 7032 Probability Spring 2018 Wlodek Bryc Created: Friday, Jan 2, 2014 Revised for Spring 2018 Printed: January 9, 2018 File: Grad-Prob-2018.TEX Department of Mathematical Sciences, University of Cincinnati,

More information

Expectation is a positive linear operator

Expectation is a positive linear operator Department of Mathematics Ma 3/103 KC Border Introduction to Probability and Statistics Winter 2017 Lecture 6: Expectation is a positive linear operator Relevant textbook passages: Pitman [3]: Chapter

More information

4 Hilbert spaces. The proof of the Hilbert basis theorem is not mathematics, it is theology. Camille Jordan

4 Hilbert spaces. The proof of the Hilbert basis theorem is not mathematics, it is theology. Camille Jordan The proof of the Hilbert basis theorem is not mathematics, it is theology. Camille Jordan Wir müssen wissen, wir werden wissen. David Hilbert We now continue to study a special class of Banach spaces,

More information

Lecture Notes 5 Convergence and Limit Theorems. Convergence with Probability 1. Convergence in Mean Square. Convergence in Probability, WLLN

Lecture Notes 5 Convergence and Limit Theorems. Convergence with Probability 1. Convergence in Mean Square. Convergence in Probability, WLLN Lecture Notes 5 Convergence and Limit Theorems Motivation Convergence with Probability Convergence in Mean Square Convergence in Probability, WLLN Convergence in Distribution, CLT EE 278: Convergence and

More information

17. Convergence of Random Variables

17. Convergence of Random Variables 7. Convergence of Random Variables In elementary mathematics courses (such as Calculus) one speaks of the convergence of functions: f n : R R, then lim f n = f if lim f n (x) = f(x) for all x in R. This

More information

Legendre transformation and information geometry

Legendre transformation and information geometry Legendre transformation and information geometry CIG-MEMO #2, v1 Frank Nielsen École Polytechnique Sony Computer Science Laboratorie, Inc http://www.informationgeometry.org September 2010 Abstract We explain

More information

Lecture 2: Review of Basic Probability Theory

Lecture 2: Review of Basic Probability Theory ECE 830 Fall 2010 Statistical Signal Processing instructor: R. Nowak, scribe: R. Nowak Lecture 2: Review of Basic Probability Theory Probabilistic models will be used throughout the course to represent

More information

The Canonical Gaussian Measure on R

The Canonical Gaussian Measure on R The Canonical Gaussian Measure on R 1. Introduction The main goal of this course is to study Gaussian measures. The simplest example of a Gaussian measure is the canonical Gaussian measure P on R where

More information

1. Stochastic Processes and filtrations

1. Stochastic Processes and filtrations 1. Stochastic Processes and 1. Stoch. pr., A stochastic process (X t ) t T is a collection of random variables on (Ω, F) with values in a measurable space (S, S), i.e., for all t, In our case X t : Ω S

More information

(z 0 ) = lim. = lim. = f. Similarly along a vertical line, we fix x = x 0 and vary y. Setting z = x 0 + iy, we get. = lim. = i f

(z 0 ) = lim. = lim. = f. Similarly along a vertical line, we fix x = x 0 and vary y. Setting z = x 0 + iy, we get. = lim. = i f . Holomorphic Harmonic Functions Basic notation. Considering C as R, with coordinates x y, z = x + iy denotes the stard complex coordinate, in the usual way. Definition.1. Let f : U C be a complex valued

More information

Finite-dimensional spaces. C n is the space of n-tuples x = (x 1,..., x n ) of complex numbers. It is a Hilbert space with the inner product

Finite-dimensional spaces. C n is the space of n-tuples x = (x 1,..., x n ) of complex numbers. It is a Hilbert space with the inner product Chapter 4 Hilbert Spaces 4.1 Inner Product Spaces Inner Product Space. A complex vector space E is called an inner product space (or a pre-hilbert space, or a unitary space) if there is a mapping (, )

More information

Probability and Measure

Probability and Measure Chapter 4 Probability and Measure 4.1 Introduction In this chapter we will examine probability theory from the measure theoretic perspective. The realisation that measure theory is the foundation of probability

More information

High Dimensional Probability

High Dimensional Probability High Dimensional Probability for Mathematicians and Data Scientists Roman Vershynin 1 1 University of Michigan. Webpage: www.umich.edu/~romanv ii Preface Who is this book for? This is a textbook in probability

More information

08a. Operators on Hilbert spaces. 1. Boundedness, continuity, operator norms

08a. Operators on Hilbert spaces. 1. Boundedness, continuity, operator norms (February 24, 2017) 08a. Operators on Hilbert spaces Paul Garrett garrett@math.umn.edu http://www.math.umn.edu/ garrett/ [This document is http://www.math.umn.edu/ garrett/m/real/notes 2016-17/08a-ops

More information

Lecture Notes for MA 623 Stochastic Processes. Ionut Florescu. Stevens Institute of Technology address:

Lecture Notes for MA 623 Stochastic Processes. Ionut Florescu. Stevens Institute of Technology  address: Lecture Notes for MA 623 Stochastic Processes Ionut Florescu Stevens Institute of Technology E-mail address: ifloresc@stevens.edu 2000 Mathematics Subject Classification. 60Gxx Stochastic Processes Abstract.

More information

Solution of the 8 th Homework

Solution of the 8 th Homework Solution of the 8 th Homework Sangchul Lee December 8, 2014 1 Preinary 1.1 A simple remark on continuity The following is a very simple and trivial observation. But still this saves a lot of words in actual

More information

CHAPTER 3: LARGE SAMPLE THEORY

CHAPTER 3: LARGE SAMPLE THEORY CHAPTER 3 LARGE SAMPLE THEORY 1 CHAPTER 3: LARGE SAMPLE THEORY CHAPTER 3 LARGE SAMPLE THEORY 2 Introduction CHAPTER 3 LARGE SAMPLE THEORY 3 Why large sample theory studying small sample property is usually

More information

Probability Theory I: Syllabus and Exercise

Probability Theory I: Syllabus and Exercise Probability Theory I: Syllabus and Exercise Narn-Rueih Shieh **Copyright Reserved** This course is suitable for those who have taken Basic Probability; some knowledge of Real Analysis is recommended( will

More information

Functional Analysis. Franck Sueur Metric spaces Definitions Completeness Compactness Separability...

Functional Analysis. Franck Sueur Metric spaces Definitions Completeness Compactness Separability... Functional Analysis Franck Sueur 2018-2019 Contents 1 Metric spaces 1 1.1 Definitions........................................ 1 1.2 Completeness...................................... 3 1.3 Compactness......................................

More information

CHANGE OF MEASURE. D.Majumdar

CHANGE OF MEASURE. D.Majumdar CHANGE OF MEASURE D.Majumdar We had touched upon this concept when we looked at Finite Probability spaces and had defined a R.V. Z to change probability measure on a space Ω. We need to do the same thing

More information

STA 711: Probability & Measure Theory Robert L. Wolpert

STA 711: Probability & Measure Theory Robert L. Wolpert STA 711: Probability & Measure Theory Robert L. Wolpert 6 Independence 6.1 Independent Events A collection of events {A i } F in a probability space (Ω,F,P) is called independent if P[ i I A i ] = P[A

More information

CHAPTER 1. Metric Spaces. 1. Definition and examples

CHAPTER 1. Metric Spaces. 1. Definition and examples CHAPTER Metric Spaces. Definition and examples Metric spaces generalize and clarify the notion of distance in the real line. The definitions will provide us with a useful tool for more general applications

More information

Your first day at work MATH 806 (Fall 2015)

Your first day at work MATH 806 (Fall 2015) Your first day at work MATH 806 (Fall 2015) 1. Let X be a set (with no particular algebraic structure). A function d : X X R is called a metric on X (and then X is called a metric space) when d satisfies

More information

Hilbert Spaces. Hilbert space is a vector space with some extra structure. We start with formal (axiomatic) definition of a vector space.

Hilbert Spaces. Hilbert space is a vector space with some extra structure. We start with formal (axiomatic) definition of a vector space. Hilbert Spaces Hilbert space is a vector space with some extra structure. We start with formal (axiomatic) definition of a vector space. Vector Space. Vector space, ν, over the field of complex numbers,

More information

5 Operations on Multiple Random Variables

5 Operations on Multiple Random Variables EE360 Random Signal analysis Chapter 5: Operations on Multiple Random Variables 5 Operations on Multiple Random Variables Expected value of a function of r.v. s Two r.v. s: ḡ = E[g(X, Y )] = g(x, y)f X,Y

More information

Brownian motion. Samy Tindel. Purdue University. Probability Theory 2 - MA 539

Brownian motion. Samy Tindel. Purdue University. Probability Theory 2 - MA 539 Brownian motion Samy Tindel Purdue University Probability Theory 2 - MA 539 Mostly taken from Brownian Motion and Stochastic Calculus by I. Karatzas and S. Shreve Samy T. Brownian motion Probability Theory

More information

Chapter 5. Random Variables (Continuous Case) 5.1 Basic definitions

Chapter 5. Random Variables (Continuous Case) 5.1 Basic definitions Chapter 5 andom Variables (Continuous Case) So far, we have purposely limited our consideration to random variables whose ranges are countable, or discrete. The reason for that is that distributions on

More information

3 Orthogonality and Fourier series

3 Orthogonality and Fourier series 3 Orthogonality and Fourier series We now turn to the concept of orthogonality which is a key concept in inner product spaces and Hilbert spaces. We start with some basic definitions. Definition 3.1. Let

More information

Exercises to Applied Functional Analysis

Exercises to Applied Functional Analysis Exercises to Applied Functional Analysis Exercises to Lecture 1 Here are some exercises about metric spaces. Some of the solutions can be found in my own additional lecture notes on Blackboard, as the

More information

EE/Stats 376A: Homework 7 Solutions Due on Friday March 17, 5 pm

EE/Stats 376A: Homework 7 Solutions Due on Friday March 17, 5 pm EE/Stats 376A: Homework 7 Solutions Due on Friday March 17, 5 pm 1. Feedback does not increase the capacity. Consider a channel with feedback. We assume that all the recieved outputs are sent back immediately

More information

Mathematics Department Stanford University Math 61CM/DM Inner products

Mathematics Department Stanford University Math 61CM/DM Inner products Mathematics Department Stanford University Math 61CM/DM Inner products Recall the definition of an inner product space; see Appendix A.8 of the textbook. Definition 1 An inner product space V is a vector

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 9 10/2/2013. Conditional expectations, filtration and martingales

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 9 10/2/2013. Conditional expectations, filtration and martingales MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 9 10/2/2013 Conditional expectations, filtration and martingales Content. 1. Conditional expectations 2. Martingales, sub-martingales

More information

ECE Lecture #9 Part 2 Overview

ECE Lecture #9 Part 2 Overview ECE 450 - Lecture #9 Part Overview Bivariate Moments Mean or Expected Value of Z = g(x, Y) Correlation and Covariance of RV s Functions of RV s: Z = g(x, Y); finding f Z (z) Method : First find F(z), by

More information

THEOREMS, ETC., FOR MATH 515

THEOREMS, ETC., FOR MATH 515 THEOREMS, ETC., FOR MATH 515 Proposition 1 (=comment on page 17). If A is an algebra, then any finite union or finite intersection of sets in A is also in A. Proposition 2 (=Proposition 1.1). For every

More information

1 Presessional Probability

1 Presessional Probability 1 Presessional Probability Probability theory is essential for the development of mathematical models in finance, because of the randomness nature of price fluctuations in the markets. This presessional

More information

COMPSCI 240: Reasoning Under Uncertainty

COMPSCI 240: Reasoning Under Uncertainty COMPSCI 240: Reasoning Under Uncertainty Andrew Lan and Nic Herndon University of Massachusetts at Amherst Spring 2019 Lecture 20: Central limit theorem & The strong law of large numbers Markov and Chebyshev

More information

Projection Theorem 1

Projection Theorem 1 Projection Theorem 1 Cauchy-Schwarz Inequality Lemma. (Cauchy-Schwarz Inequality) For all x, y in an inner product space, [ xy, ] x y. Equality holds if and only if x y or y θ. Proof. If y θ, the inequality

More information
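The Markov and Chebyshev inequalities presented in this appendix can be checked numerically. The sketch below (an illustrative example, not from the text) draws samples from an Exponential(1) distribution, for which E[X] = 1 and Var(X) = 1, and verifies that the empirical tail probabilities stay below the corresponding bounds E[X]/c and Var(X)/c²:

```python
import random

random.seed(0)
n = 200_000

# Exponential(1) samples: a positive random variable with E[X] = 1, Var(X) = 1.
xs = [random.expovariate(1.0) for _ in range(n)]

mean = sum(xs) / n
var = sum((x - mean) ** 2 for x in xs) / n

c = 3.0
# Markov: P(X >= c) <= E[X] / c, valid since X >= 0.
tail = sum(x >= c for x in xs) / n
markov = mean / c
# Chebyshev: P(|X - E[X]| >= c) <= Var(X) / c^2.
dev_tail = sum(abs(x - mean) >= c for x in xs) / n
cheby = var / c ** 2

print(f"P(X >= {c}) = {tail:.4f}  <=  Markov bound {markov:.4f}")
print(f"P(|X - EX| >= {c}) = {dev_tail:.4f}  <=  Chebyshev bound {cheby:.4f}")
```

For this distribution the true tail P(X ≥ 3) = e⁻³ ≈ 0.050 is well below the Markov bound 1/3, which illustrates that these bounds are universal but often loose; the exponential-moment bound P(X ≥ c) ≤ e^(−λc) E[e^(λX)] from the last example is much sharper when it applies.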