MAS113 Introduction to Probability and Statistics. Proofs of theorems

Theorem 1 (De Morgan's Laws). See MAS110.

Theorem 2.

M1. By definition, $B$ and $A \setminus B$ are disjoint, and their union is $A$. So, because $m$ is a measure, $m(A) = m(B) + m(A \setminus B)$; rearranging gives the result. Note that more generally (i.e. not assuming $B \subseteq A$), $m(A \setminus B) = m(A) - m(A \cap B)$, by the same argument.

M2. As $m(A \setminus B) \geq 0$ by the definition of a measure, this follows immediately from M1.

M3. Apply M1 with $B = A$; then $A \setminus A = \emptyset$, so the LHS is $m(\emptyset)$, and the RHS is $m(A) - m(A) = 0$.

M4. We can write $A \cup B = A \cup (B \setminus A)$, and the two sets here are disjoint. So, using the definition of a measure, $m(A \cup B) = m(A) + m(B \setminus A)$. Applying M1, we get $m(A) + m(B) - m(A \cap B)$, which gives the result. (Note that $A \cap B$ and $B \cap A$ are the same.)

M5. See Exercise 4.

M6. See Exercise 4.

Theorem 3 (Law of Total Probability). Because the $E_i$ form a partition, they are disjoint. Hence their intersections with $F$, namely $F \cap E_i$, are also disjoint. Again because the $E_i$ are a partition, any element of $F$ must be in one of them, so the union of the $F \cap E_i$ for $i = 1, \ldots, n$ must be the whole of $F$, and the previous sentence says it is a disjoint union. Hence $P(F) = \sum_{i=1}^{n} P(F \cap E_i)$. The second form of the statement (which is the more useful one in practice) follows immediately by writing $P(F \cap E_i) = P(E_i)P(F \mid E_i)$ (from the definition of conditional probability).
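
As an illustration of Theorem 3 (a quick numerical sketch, not part of the original notes), the following Python fragment checks the law of total probability on a made-up example: a fair six-sided die, the partition $E_1 = \{1,2\}$, $E_2 = \{3,4\}$, $E_3 = \{5,6\}$, and $F$ the event that the roll is even.

    from fractions import Fraction

    # Hypothetical example: a fair six-sided die, outcomes 1..6, each with probability 1/6.
    P = {w: Fraction(1, 6) for w in range(1, 7)}

    # A partition E1, E2, E3 of the sample space, and the event F = "the roll is even".
    partition = [{1, 2}, {3, 4}, {5, 6}]
    F = {2, 4, 6}

    def prob(event):
        return sum(P[w] for w in event)

    def cond(A, B):
        # P(A | B) = P(A intersect B) / P(B)
        return prob(A & B) / prob(B)

    lhs = prob(F)
    rhs = sum(prob(E) * cond(F, E) for E in partition)  # sum of P(E_i) P(F | E_i)
    print(lhs, rhs)  # both 1/2
    assert lhs == rhs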

Theorem 4 (Bayes' Theorem). By the definition of conditional probability, $P(E_i \mid F) = P(E_i \cap F)/P(F)$. However, we also know from the definition of conditional probability that $P(E_i \cap F) = P(E_i)P(F \mid E_i)$. Hence

$$P(E_i \mid F) = \frac{P(E_i)P(F \mid E_i)}{P(F)}.$$

Theorem 5. Before the full proof, consider Example 35 again. Here we have a random variable $X$ with range $R_X = \{-1, 0, 1\}$, and we let $Y = X^2$. Thus $R_Y = \{0, 1\}$. By definition, we have $E(Y) = \sum_{y \in R_Y} y P(Y = y) = P(Y = 1)$ (after a bit of simplification). So we need to consider the event $\{Y = 1\}$. For $Y$ to be 1 means that either $X = -1$ or $X = 1$, and by the (obvious) disjointness of the two possibilities $P(Y = 1) = P(X = -1) + P(X = 1)$, so we can say that $E(Y) = P(X = -1) + P(X = 1)$.

Now consider the general case, and let $Y = g(X)$. Then, by definition, $E(Y) = \sum_{y \in R_Y} y\, p_Y(y) = \sum_{y \in R_Y} y P(Y = y)$. In the example above, we split the event $\{Y = 1\}$ up into events in terms of $X$ which give $Y = 1$. More generally, the event $\{Y = y\}$ is the disjoint union of the events $\{X = x\}$ for each $x \in R_X$ such that $g(x) = y$. (If $g$ is injective, there will be only one event in the union.) So

$$E(Y) = \sum_{y \in R_Y} y P(Y = y) = \sum_{y \in R_Y} y \sum_{x \in R_X,\; g(x) = y} P(X = x) = \sum_{y \in R_Y} \sum_{x \in R_X,\; g(x) = y} g(x) P(X = x),$$

and the double sum here is equivalent to $\sum_{x \in R_X}$, giving the result.
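
To illustrate Theorem 5 (a small numerical sketch, not part of the original notes; the pmf below is made up), the code computes $E(Y)$ for $Y = X^2$ twice: once via the pmf of $Y$, and once as $\sum_x g(x)\, p_X(x)$.

    from collections import defaultdict
    from fractions import Fraction

    # Hypothetical pmf for X with range {-1, 0, 1}, as in the Example 35 discussion.
    p_X = {-1: Fraction(1, 4), 0: Fraction(1, 2), 1: Fraction(1, 4)}
    g = lambda x: x * x  # Y = g(X) = X^2

    # Route 1: build the pmf of Y and use E(Y) = sum over y of y * P(Y = y).
    p_Y = defaultdict(Fraction)
    for x, p in p_X.items():
        p_Y[g(x)] += p
    E_Y_direct = sum(y * p for y, p in p_Y.items())

    # Route 2: Theorem 5, E(g(X)) = sum over x of g(x) * p_X(x).
    E_Y_theorem5 = sum(g(x) * p for x, p in p_X.items())

    print(E_Y_direct, E_Y_theorem5)  # both 1/2
    assert E_Y_direct == E_Y_theorem5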

Theorem 6. By definition and Theorem 5,

$$\mathrm{Var}(X) = E\big((X - E(X))^2\big) = \sum_x (x - E(X))^2 p_X(x).$$

Expanding the brackets, we have

$$\mathrm{Var}(X) = \sum_x x^2 p_X(x) - 2E(X) \sum_x x\, p_X(x) + E(X)^2 \sum_x p_X(x).$$

Note that here we have used the fact that $2E(X)$ and $E(X)^2$ are constants which do not depend on $x$, so can be taken outside the sum. Then $\sum_x x\, p_X(x) = E(X)$, by definition, and $\sum_x p_X(x) = 1$ as $p_X$ is a probability mass function, so we get

$$\mathrm{Var}(X) = E(X^2) - 2E(X)E(X) + E(X)^2 = E(X^2) - E(X)^2,$$

as required.

Theorem 7, mean part. By Theorem 5,

$$E(aX + b) = \sum_x (ax + b) p_X(x) = a \sum_x x\, p_X(x) + b \sum_x p_X(x) = aE(X) + b,$$

again using $\sum_x x\, p_X(x) = E(X)$ and $\sum_x p_X(x) = 1$. Hence we have the result.

Theorem 7, variance part. By definition,

$$\mathrm{Var}(aX + b) = E\big((aX + b - E(aX + b))^2\big).$$

By the mean part, we get

$$\mathrm{Var}(aX + b) = E\big((aX + b - aE(X) - b)^2\big) = E\big((a(X - E(X)))^2\big) = E\big(a^2 (X - E(X))^2\big) = a^2 \mathrm{Var}(X).$$
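
A quick check of Theorems 6 and 7 (a sketch, not in the original notes; the pmf and the constants $a$ and $b$ are made up):

    from fractions import Fraction

    # Hypothetical pmf for X.
    p_X = {0: Fraction(1, 6), 1: Fraction(1, 2), 3: Fraction(1, 3)}

    def E(f):
        # E(f(X)) for this pmf
        return sum(f(x) * p for x, p in p_X.items())

    EX = E(lambda x: x)
    var_by_definition = E(lambda x: (x - EX) ** 2)      # E((X - E(X))^2)
    var_by_theorem6 = E(lambda x: x ** 2) - EX ** 2     # E(X^2) - E(X)^2
    assert var_by_definition == var_by_theorem6

    a, b = 2, 5  # made-up constants
    assert E(lambda x: a * x + b) == a * EX + b                                        # Theorem 7, mean part
    assert E(lambda x: (a * x + b - (a * EX + b)) ** 2) == a ** 2 * var_by_definition  # variance part
    print(EX, var_by_definition)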

Theorem 8. The definition of expectation gives

$$E(X + Y) = \sum_{z \in R_{X+Y}} z P(X + Y = z).$$

Now, if $z \in R_{X+Y}$ we can write $z = x + y$ where $x \in R_X$ and $y \in R_Y$. Hence we can replace the sum over $z$ by a sum over $x$ and $y$:

$$E(X + Y) = \sum_{x \in R_X} \sum_{y \in R_Y} (x + y) P(X = x, Y = y).$$

Split the sum up:

$$E(X + Y) = \sum_{x \in R_X} \sum_{y \in R_Y} x P(X = x, Y = y) + \sum_{x \in R_X} \sum_{y \in R_Y} y P(X = x, Y = y) = \sum_{x \in R_X} x \sum_{y \in R_Y} P(X = x, Y = y) + \sum_{y \in R_Y} y \sum_{x \in R_X} P(X = x, Y = y).$$

(If $R_X$ or $R_Y$ is infinite, you'll need to take on trust that the reversal of the order of summation is OK here.) Now $\sum_{y \in R_Y} P(X = x, Y = y) = P(X = x)$, and similarly $\sum_{x \in R_X} P(X = x, Y = y) = P(Y = y)$. So we get

$$E(X + Y) = \sum_{x \in R_X} x P(X = x) + \sum_{y \in R_Y} y P(Y = y) = E(X) + E(Y).$$

Theorem 9. This is similar to Theorem 8. Start with

$$E(XY) = \sum_{z \in R_{XY}} z P(XY = z) = \sum_{x \in R_X} \sum_{y \in R_Y} (xy) P(X = x, Y = y).$$

By independence, $P(X = x, Y = y) = P(X = x)P(Y = y)$, so we get

$$E(XY) = \sum_{x \in R_X} \sum_{y \in R_Y} (xy) P(X = x) P(Y = y).$$

Now, with respect to $y$, we can regard $x$ and $P(X = x)$ as constants, so we take them out of the sum with respect to $y$, and get

$$E(XY) = \sum_{x \in R_X} x P(X = x) \sum_{y \in R_Y} y P(Y = y),$$

which immediately gives $E(XY) = E(X)E(Y)$.
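
The sketch below (not from the notes; the two marginal pmfs are made up) checks Theorems 8 and 9 by building the joint pmf of a pair of independent random variables:

    from fractions import Fraction

    # Hypothetical marginal pmfs.
    p_X = {1: Fraction(1, 3), 2: Fraction(2, 3)}
    p_Y = {0: Fraction(1, 4), 5: Fraction(3, 4)}

    # Under independence, P(X = x, Y = y) = P(X = x) P(Y = y).
    joint = {(x, y): px * py for x, px in p_X.items() for y, py in p_Y.items()}

    E_X = sum(x * p for x, p in p_X.items())
    E_Y = sum(y * p for y, p in p_Y.items())
    E_sum = sum((x + y) * p for (x, y), p in joint.items())
    E_prod = sum(x * y * p for (x, y), p in joint.items())

    assert E_sum == E_X + E_Y    # Theorem 8 (true whether or not X and Y are independent)
    assert E_prod == E_X * E_Y   # Theorem 9 (relies on independence)
    print(E_sum, E_prod)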

Corollary 10. Start with the variance identity (Theorem 6):

$$\mathrm{Var}(X + Y) = E\big((X + Y)^2\big) - \big(E(X + Y)\big)^2.$$

Use Theorems 8 and 7 to get

$$\mathrm{Var}(X + Y) = E(X^2 + 2XY + Y^2) - (E(X))^2 - (E(Y))^2 - 2E(X)E(Y) = E(X^2) + E(Y^2) + 2E(XY) - (E(X))^2 - (E(Y))^2 - 2E(X)E(Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\big(E(XY) - E(X)E(Y)\big),$$

and by Theorem 9, $E(XY) - E(X)E(Y) = 0$, giving the result.

Theorem 11. This is an easy exercise with the definitions of mean and variance: $E(X) = 0 \cdot (1 - p) + 1 \cdot p = p$, and $E(X^2)$ is also $p$ (since $X$ only takes the values 0 and 1, $X$ and $X^2$ are actually the same). Hence $\mathrm{Var}(X) = E(X^2) - (E(X))^2 = p - p^2 = p(1 - p)$.

Theorem 12. Use the fact that $X = \sum_{i=1}^{n} Z_i$, where $Z_i = 1$ if trial $i$ is a success and $Z_i = 0$ if it is a failure. By assumption, the $Z_i$ are independent. Thus Theorems 8 and 11 tell us

$$E(X) = E\Big(\sum_{i=1}^{n} Z_i\Big) = \sum_{i=1}^{n} E(Z_i) = np,$$

and Corollary 10 and Theorem 11 give

$$\mathrm{Var}(X) = \mathrm{Var}\Big(\sum_{i=1}^{n} Z_i\Big) = \sum_{i=1}^{n} \mathrm{Var}(Z_i) = np(1 - p).$$
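
As a check on Theorem 12 (a sketch, not part of the notes; $n = 10$ and $p = 0.3$ are made-up values), the mean and variance of the Binomial$(n, p)$ pmf can be computed directly:

    from math import comb

    n, p = 10, 0.3  # made-up parameters

    # Exact Binomial(n, p) pmf.
    pmf = {x: comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)}

    mean = sum(x * q for x, q in pmf.items())
    var = sum(x**2 * q for x, q in pmf.items()) - mean**2

    print(mean, n * p)           # both 3.0, up to rounding
    print(var, n * p * (1 - p))  # both 2.1, up to rounding
    assert abs(mean - n * p) < 1e-12
    assert abs(var - n * p * (1 - p)) < 1e-12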

Theorem 13. We are looking for

$$\lim_{n \to \infty} \binom{n}{x} \Big(\frac{\lambda}{n}\Big)^x \Big(1 - \frac{\lambda}{n}\Big)^{n - x},$$

which, writing $\binom{n}{x} = \frac{n!}{x!(n-x)!}$ and factoring out terms which do not depend on $n$, becomes

$$\frac{\lambda^x}{x!} \lim_{n \to \infty} \frac{n(n-1)(n-2)\cdots(n-x+1)}{n^x} \Big(1 - \frac{\lambda}{n}\Big)^{-x} \Big(1 - \frac{\lambda}{n}\Big)^{n}.$$

Now,

$$\lim_{n \to \infty} \frac{n-1}{n} = \lim_{n \to \infty} \frac{n-2}{n} = \cdots = \lim_{n \to \infty} \frac{n-x+1}{n} = 1,$$

and $\lim_{n \to \infty} \big(1 - \frac{\lambda}{n}\big)^{-x}$ is also 1. So we are left with

$$\frac{\lambda^x}{x!} \lim_{n \to \infty} \Big(1 - \frac{\lambda}{n}\Big)^{n}.$$

By Note 639 in MAS110, or below, $\lim_{n \to \infty} \big(1 - \frac{\lambda}{n}\big)^{n} = e^{-\lambda}$, so we are left with

$$\frac{e^{-\lambda} \lambda^x}{x!},$$

as required.

Limit of $(1 + r/n)^n$. This limit is needed for Theorem 13, and also occurs in other areas of mathematics. (One example is compound interest in financial mathematics.) In fact, the exponential function is sometimes defined to be equal to this limit; the argument below assumes that we have a different definition of the exponential function which implies the familiar properties of the exponential and logarithmic functions, including the derivative of the latter.

Assume $r \neq 0$. We can start off with the fact that $\lim_{h \to 0} \frac{\log(1+h)}{h} = 1$. This is the statement that the derivative of $\log x$ at $x = 1$ is 1. Multiply both sides by $r$ to get

$$\lim_{h \to 0} \frac{r \log(1 + h)}{h} = r,$$

and now let $n = r/h$ so that $h = r/n$. If $r > 0$ then $n \to \infty$ corresponds to $h \to 0$ from above, and if $r < 0$ (the case we actually need in Theorem 13) then $n \to \infty$ corresponds to $h \to 0$ from below, but the limit is valid in both cases. So

$$\lim_{n \to \infty} n \log\Big(1 + \frac{r}{n}\Big) = r,$$

and now take exponentials of both sides and use the continuity of the exponential function to obtain

$$\lim_{n \to \infty} \Big(1 + \frac{r}{n}\Big)^{n} = e^{r}.$$
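
The limit just derived can also be seen numerically (a sketch, not from the notes; $r = -2$ is a made-up value, chosen negative since that is the case needed in Theorem 13):

    from math import exp

    r = -2.0  # made-up value of r
    for n in (10, 100, 1_000, 10_000, 100_000):
        print(n, (1 + r / n) ** n)
    print("exp(r) =", exp(r))
    # The printed values approach exp(-2) = 0.13533... as n increases.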

Theorem 14: valid pmf. As $p_X(x) \geq 0$, we just need to check $\sum_{x=0}^{\infty} p_X(x) = 1$. Checking, we have

$$\sum_{x=0}^{\infty} p_X(x) = \sum_{x=0}^{\infty} \frac{e^{-\lambda} \lambda^x}{x!} = e^{-\lambda} \sum_{x=0}^{\infty} \frac{\lambda^x}{x!} = e^{-\lambda} e^{\lambda} = 1,$$

recognising the sum as the series expansion of the exponential function.

Theorem 14: mean and variance. We have

$$E(X) = \sum_{x=0}^{\infty} x\, \frac{e^{-\lambda} \lambda^x}{x!} = e^{-\lambda} \sum_{x=1}^{\infty} \frac{\lambda^x}{(x-1)!}$$

(using $x! = x(x-1)!$). Changing variables to $y = x - 1$, we get

$$E(X) = e^{-\lambda} \sum_{y=0}^{\infty} \frac{\lambda^{y+1}}{y!} = \lambda e^{-\lambda} \sum_{y=0}^{\infty} \frac{\lambda^y}{y!} = \lambda e^{-\lambda} e^{\lambda} = \lambda.$$

For the variance, we have $E(X^2) = \lambda^2 + \lambda$ (see Exercise 36) and thus $\mathrm{Var}(X) = (\lambda^2 + \lambda) - \lambda^2 = \lambda$.

Theorem 15. For $x \in \mathbb{N}$,

$$F_X(x) = \sum_{a=1}^{x} P(X = a) = \sum_{a=1}^{x} (1-p)^{a-1} p = \frac{p - (1-p)^x p}{1 - (1-p)}$$

(a geometric series with $x$ terms, first term $p$ and common ratio $1-p$), and that simplifies to $1 - (1-p)^x$ as required.
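
A numerical confirmation of Theorem 14 (a sketch, not part of the notes; $\lambda = 2.5$ is a made-up value), truncating the infinite sums at a point where the tail is negligible:

    from math import exp, factorial

    lam = 2.5  # made-up value of lambda
    pmf = lambda x: exp(-lam) * lam**x / factorial(x)

    xs = range(100)  # truncation point; the tail beyond this is negligible for lam = 2.5
    total = sum(pmf(x) for x in xs)
    mean = sum(x * pmf(x) for x in xs)
    var = sum(x**2 * pmf(x) for x in xs) - mean**2

    print(total, mean, var)  # approximately 1, lambda, lambda
    assert abs(total - 1) < 1e-9
    assert abs(mean - lam) < 1e-9
    assert abs(var - lam) < 1e-9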

Theorem 16. By the definition of the mean,

$$E(X) = \sum_{x=1}^{\infty} x (1-p)^{x-1} p.$$

The Binomial Theorem (negative integer case) tells us that for $|\theta| < 1$,

$$(1 - \theta)^{-2} = \sum_{n=0}^{\infty} (n + 1) \theta^n = \sum_{m=1}^{\infty} m \theta^{m-1},$$

which you can also obtain by differentiating term by term the formula for the sum of an infinite geometric series. Using this with $\theta = 1 - p$ gives

$$\sum_{x=1}^{\infty} x (1-p)^{x-1} p = p \sum_{x=1}^{\infty} x (1-p)^{x-1} = p \big(1 - (1-p)\big)^{-2} = \frac{1}{p}.$$

For the variance, we start by finding

$$E\big(X(X-1)\big) = \sum_{x=1}^{\infty} x(x-1)(1-p)^{x-1} p = \sum_{x=2}^{\infty} x(x-1)(1-p)^{x-1} p,$$

as the $x = 1$ term is zero. Again the Binomial Theorem (or term-by-term differentiation) says

$$(1 - \theta)^{-3} = \sum_{m=2}^{\infty} \frac{m(m-1)}{2} \theta^{m-2},$$

and thus

$$\sum_{x=2}^{\infty} x(x-1)(1-p)^{x-1} p = 2p(1-p) \sum_{x=2}^{\infty} \frac{x(x-1)}{2} (1-p)^{x-2} = 2p(1-p)\big(1 - (1-p)\big)^{-3} = \frac{2(1-p)}{p^2}.$$

Now,

$$E(X^2) = E\big(X(X-1) + X\big) = \frac{2(1-p)}{p^2} + \frac{1}{p},$$

and

$$\mathrm{Var}(X) = E(X^2) - (E(X))^2 = \frac{2(1-p)}{p^2} + \frac{1}{p} - \frac{1}{p^2} = \frac{1-p}{p^2}.$$

Theorem 17. For $x \geq 0$,

$$F_X(x) = P(X \leq x) = \int_0^x \lambda e^{-\lambda t}\, dt = 1 - e^{-\lambda x}.$$

Note that if $x < 0$, $P(X \leq x) = 0$ as $X$ cannot be negative, so in full

$$F_X(x) = \begin{cases} 1 - e^{-\lambda x} & x \geq 0 \\ 0 & x < 0. \end{cases}$$
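
Similarly, Theorem 16 can be checked numerically (a sketch, not in the notes; $p = 0.3$ is made up), truncating the series once the geometric tail is negligible:

    p = 0.3  # made-up success probability
    pmf = lambda x: (1 - p) ** (x - 1) * p  # Geometric pmf on x = 1, 2, 3, ...

    xs = range(1, 500)  # truncation; (1 - p)**500 is negligible
    mean = sum(x * pmf(x) for x in xs)
    var = sum(x * x * pmf(x) for x in xs) - mean**2

    print(mean, 1 / p)          # both about 3.3333
    print(var, (1 - p) / p**2)  # both about 7.7778
    assert abs(mean - 1 / p) < 1e-9
    assert abs(var - (1 - p) / p**2) < 1e-9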

Theorem 18. We have $E(X) = \int_0^{\infty} \lambda x e^{-\lambda x}\, dx$. Integration by parts gives

$$\lambda \bigg( \Big[ -\frac{1}{\lambda} x e^{-\lambda x} \Big]_0^{\infty} + \frac{1}{\lambda} \int_0^{\infty} e^{-\lambda x}\, dx \bigg).$$

As $x e^{-\lambda x} \to 0$ as $x \to \infty$, we get $\int_0^{\infty} e^{-\lambda x}\, dx$, which gives $1/\lambda$. (In the lecture I did this carefully, with the improper integral treated as a limit of integrals from 0 to $t$ as $t \to \infty$.) For the variance, see Exercise 45 for $E(X^2) = 2/\lambda^2$, from which it follows that $\mathrm{Var}(X) = 1/\lambda^2$.

Theorem 19. By the definition of conditional probability, the left hand side is

$$\frac{P(\{X > x + a\} \cap \{X > a\})}{P(X > a)}.$$

However $\{X > x + a\} \cap \{X > a\} = \{X > x + a\}$, so we get

$$\frac{P(X > x + a)}{P(X > a)} = \frac{e^{-\lambda(x+a)}}{e^{-\lambda a}} = e^{-\lambda x} = P(X > x),$$

where we have used Theorem 17 and the fact that it implies $P(X > x) = e^{-\lambda x}$ for all $x > 0$.

Theorem 20. For $x \in [a, b]$,

$$F_X(x) = \int_a^x \frac{1}{b - a}\, dt = \frac{x - a}{b - a}.$$

Theorem 21. We have

$$E(X) = \int_a^b x\, \frac{1}{b - a}\, dx = \Big[ \frac{x^2}{2(b - a)} \Big]_a^b = \frac{b^2 - a^2}{2(b - a)} = \frac{b + a}{2}.$$

For the variance, first find

$$E(X^2) = \int_a^b x^2\, \frac{1}{b - a}\, dx = \Big[ \frac{x^3}{3(b - a)} \Big]_a^b = \frac{b^3 - a^3}{3(b - a)} = \frac{b^2 + ab + a^2}{3}.$$

Then

$$\mathrm{Var}(X) = \frac{b^2 + ab + a^2}{3} - \frac{(a + b)^2}{4} = \frac{b^2 + a^2 - 2ab}{12} = \frac{(b - a)^2}{12}.$$
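
The memorylessness proved in Theorem 19 can be checked directly from the exponential cdf of Theorem 17 (a sketch, not from the notes; the values of $\lambda$, $x$ and $a$ are made up):

    from math import exp

    lam, x, a = 1.7, 0.8, 2.3  # made-up values

    def surv(t):
        # P(X > t) = 1 - F_X(t) = exp(-lam * t) for the Exponential(lam) distribution
        return exp(-lam * t)

    lhs = surv(x + a) / surv(a)  # P(X > x + a | X > a)
    rhs = surv(x)                # P(X > x)
    print(lhs, rhs)              # equal up to rounding
    assert abs(lhs - rhs) < 1e-12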

Theorem 22. By definition,

$$\Phi(-z) = \int_{-\infty}^{-z} \phi(t)\, dt.$$

Change variables to $s = -t$ and use the symmetry of $\phi$ to get

$$\Phi(-z) = \int_{z}^{\infty} \phi(-s)\, ds = \int_{z}^{\infty} \phi(s)\, ds,$$

which is $1 - \Phi(z)$ as required.

Theorem 23. For the expectation,

$$E(Z) = \int_{-\infty}^{\infty} z \phi(z)\, dz = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} z e^{-z^2/2}\, dz.$$

Considering the improper integral as a limit, this is

$$\frac{1}{\sqrt{2\pi}} \lim_{s,t \to \infty} \int_{-s}^{t} z e^{-z^2/2}\, dz,$$

which becomes

$$\frac{1}{\sqrt{2\pi}} \lim_{s,t \to \infty} \Big[ -e^{-z^2/2} \Big]_{-s}^{t} = \frac{1}{\sqrt{2\pi}} \lim_{s,t \to \infty} \big( e^{-s^2/2} - e^{-t^2/2} \big) = 0.$$

For the variance, we need to calculate

$$E(Z^2) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} z^2 e^{-z^2/2}\, dz.$$

Writing $z^2 e^{-z^2/2}$ as $z \cdot z e^{-z^2/2}$ and integrating by parts, we get

$$E(Z^2) = \frac{1}{\sqrt{2\pi}} \bigg( \Big[ -z e^{-z^2/2} \Big]_{-\infty}^{\infty} + \int_{-\infty}^{\infty} e^{-z^2/2}\, dz \bigg).$$

The integral is just the integral of the Normal pdf again (once the factor $\frac{1}{\sqrt{2\pi}}$ is included), so is 1, and $z e^{-z^2/2} \to 0$ both as $z \to \infty$ and as $z \to -\infty$. Hence we get $E(Z^2) = 1$, and so $\mathrm{Var}(Z) = 1 - E(Z)^2 = 1$, as $E(Z) = 0$.
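
Theorem 23 can be illustrated by crude numerical integration of the standard Normal pdf (a sketch, not part of the notes; the truncation at $\pm 10$ and the step size h are arbitrary choices):

    from math import exp, pi, sqrt

    phi = lambda z: exp(-z * z / 2) / sqrt(2 * pi)  # standard Normal pdf

    # Riemann-sum approximation on [-10, 10]; the tails beyond +/-10 are negligible.
    h = 1e-4
    zs = [-10 + h * k for k in range(int(20 / h) + 1)]
    total = h * sum(phi(z) for z in zs)
    mean = h * sum(z * phi(z) for z in zs)
    second = h * sum(z * z * phi(z) for z in zs)

    print(total, mean, second - mean**2)  # approximately 1, 0, 1
    assert abs(total - 1) < 1e-6
    assert abs(mean) < 1e-6
    assert abs(second - mean**2 - 1) < 1e-6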

Theorem 24. If $X = \mu + \sigma Z$, consider the (cumulative) distribution function of $X$:

$$F_X(x) = P(X \leq x) = P(\mu + \sigma Z \leq x) = P\Big(Z \leq \frac{x - \mu}{\sigma}\Big) = \Phi\Big(\frac{x - \mu}{\sigma}\Big).$$

To get the probability density function of $X$, differentiate, using the chain rule:

$$f_X(x) = F_X'(x) = \frac{1}{\sigma} \phi\Big(\frac{x - \mu}{\sigma}\Big).$$

That $E(X) = \mu$ and $\mathrm{Var}(X) = \sigma^2$ follows from Theorem 7.
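
Finally, the density in Theorem 24 can be checked in the same way (a sketch, not from the notes; $\mu = 3$ and $\sigma = 2$ are made-up values): numerically it integrates to 1 and has mean $\mu$ and variance $\sigma^2$.

    from math import exp, pi, sqrt

    mu, sigma = 3.0, 2.0  # made-up parameters
    phi = lambda z: exp(-z * z / 2) / sqrt(2 * pi)  # standard Normal pdf
    f_X = lambda x: phi((x - mu) / sigma) / sigma   # pdf of X = mu + sigma * Z from Theorem 24

    # Riemann-sum approximation on [mu - 10*sigma, mu + 10*sigma].
    h = 1e-3
    xs = [mu - 10 * sigma + h * k for k in range(int(20 * sigma / h) + 1)]
    total = h * sum(f_X(x) for x in xs)
    mean = h * sum(x * f_X(x) for x in xs)
    var = h * sum((x - mean) ** 2 * f_X(x) for x in xs)

    print(total, mean, var)  # approximately 1, mu, sigma**2
    assert abs(total - 1) < 1e-6
    assert abs(mean - mu) < 1e-6
    assert abs(var - sigma**2) < 1e-5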