Probability
Prof. F.P. Kelly
Lent 1996

These notes are maintained by Paul Metcalfe. Comments and corrections to

Revision date: 2002/09/20 4:45:43. The following people have maintained these notes.

    – June 2000              Kate Metcalfe
    June 2000 – July 2004    Andrew Rogers
    July 2004 – date         Paul Metcalfe

Contents

Introduction

1 Basic Concepts
  1.1 Sample Space
  1.2 Classical Probability
  1.3 Combinatorial Analysis
  1.4 Stirling's Formula

2 The Axiomatic Approach
  2.1 The Axioms
  2.2 Independence
  2.3 Distributions
  2.4 Conditional Probability

3 Random Variables
  3.1 Expectation
  3.2 Variance
  3.3 Indicator Function
  3.4 Inclusion-Exclusion Formula
  3.5 Independence

4 Inequalities
  4.1 Jensen's Inequality
  4.2 Cauchy-Schwarz Inequality
  4.3 Markov's Inequality
  4.4 Chebyshev's Inequality
  4.5 Law of Large Numbers

5 Generating Functions
  5.1 Combinatorial Applications
  5.2 Conditional Expectation
  5.3 Properties of Conditional Expectation
  5.4 Branching Processes
  5.5 Random Walks

6 Continuous Random Variables
  6.1 Jointly Distributed Random Variables
  6.2 Transformation of Random Variables
  6.3 Moment Generating Functions
  6.4 Central Limit Theorem
  6.5 Multivariate normal distribution

Introduction

These notes are based on the course Probability given by Prof. F.P. Kelly in Cambridge in the Lent Term 1996. This typed version of the notes is totally unconnected with Prof. Kelly.

Other sets of notes are available for different courses. At the time of typing these courses were:

Probability, Discrete Mathematics, Analysis, Further Analysis, Methods, Quantum Mechanics, Fluid Dynamics, Quadratic Mathematics, Geometry, Dynamics of D.E.'s, Foundations of QM, Electrodynamics, Methods of Math. Phys, Fluid Dynamics 2, Waves (etc.), Statistical Physics, General Relativity, Dynamical Systems, Combinatorics, Bifurcations in Nonlinear Convection.

They may be downloaded from


Chapter 1
Basic Concepts

1.1 Sample Space

Suppose we have an experiment with a set Ω of outcomes. Then Ω is called the sample space. A potential outcome ω ∈ Ω is called a sample point. For instance, if the experiment is tossing coins, then Ω = {H, T}, or if the experiment was tossing two dice, then Ω = {(i, j) : i, j ∈ {1, ..., 6}}.
A subset A of Ω is called an event. An event A occurs if, when the experiment is performed, the outcome ω ∈ Ω satisfies ω ∈ A. For the coin-tossing experiment the event of a head appearing is A = {H}, and for the two dice the event "rolling a four" would be A = {(1, 3), (2, 2), (3, 1)}.

1.2 Classical Probability

If Ω is finite, Ω = {ω_1, ..., ω_n}, and each of the n sample points is equally likely, then the probability of event A occurring is

    P(A) = |A| / |Ω|.

Example. Choose r digits from a table of random numbers. Find the probability that, for 0 ≤ k ≤ 9,
1. no digit exceeds k,
2. k is the greatest digit drawn.

Solution. The event that no digit exceeds k is

    A_k = {(a_1, ..., a_r) : 0 ≤ a_i ≤ k, i = 1, ..., r}.

Now |A_k| = (k + 1)^r, so that P(A_k) = ((k + 1)/10)^r.
Let B_k be the event that k is the greatest digit drawn. Then B_k = A_k \ A_{k−1}. Also A_{k−1} ⊆ A_k, so that |B_k| = (k + 1)^r − k^r. Thus

    P(B_k) = ((k + 1)^r − k^r) / 10^r.
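The formula for P(B_k) is easy to check by simulation; the following short Python sketch (r, k and the trial count are arbitrary illustrative choices) compares it with a Monte Carlo estimate.

import random

def exact_prob_greatest(k, r):
    # P(greatest of r random digits equals k) = ((k+1)^r - k^r) / 10^r
    return ((k + 1) ** r - k ** r) / 10 ** r

def simulated_prob_greatest(k, r, trials=100_000):
    hits = 0
    for _ in range(trials):
        digits = [random.randrange(10) for _ in range(r)]
        if max(digits) == k:
            hits += 1
    return hits / trials

k, r = 5, 3
print("exact:    ", exact_prob_greatest(k, r))
print("simulated:", simulated_prob_greatest(k, r))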

The problem of the points

Players A and B play a series of games. The winner of a game wins a point. The two players are equally skilful and the stake will be won by the first player to reach a target. They are forced to stop when A is within 2 points and B is within 3 points of the target. How should the stake be divided? Pascal suggested that the following continuations were equally likely:

    AAAA AAAB AABB ABBB BBBB
    AABA ABBA BABB
    ABAA ABAB BBAB
    BAAA BABA BBBA
    BAAB BBAA

This makes the ratio 11 : 5. It was previously thought that the ratio should be 6 : 4 on considering termination, but these results are not equally likely.

1.3 Combinatorial Analysis

The fundamental rule is: Suppose r experiments are such that the first may result in any of n_1 possible outcomes, and such that for each of the possible outcomes of the first i − 1 experiments there are n_i possible outcomes to experiment i. Let a_i be the outcome of experiment i. Then there are a total of ∏_{i=1}^{r} n_i distinct r-tuples (a_1, ..., a_r) describing the possible outcomes of the r experiments.

Proof. Induction.

1.4 Stirling's Formula

For functions g(n) and h(n), we say that g is asymptotically equivalent to h, and write g(n) ∼ h(n), if g(n)/h(n) → 1 as n → ∞.

Theorem 1.1 (Stirling's Formula). As n → ∞,

    log (n! / (√(2πn) n^n e^{−n})) → 0,

and thus n! ∼ √(2πn) n^n e^{−n}.

We first prove the weak form of Stirling's formula, that log(n!) ∼ n log n.

Proof. log n! = ∑_{k=1}^{n} log k. Now ∫_1^z log x dx = z log z − z + 1, and since log x is increasing,

    ∫_1^n log x dx ≤ ∑_{k=1}^{n} log k ≤ ∫_1^{n+1} log x dx,

so that n log n − n + 1 ≤ log n! ≤ (n + 1) log(n + 1) − n. Divide by n log n and let n → ∞ to sandwich (log n!)/(n log n) between terms that tend to 1. Therefore log n! ∼ n log n.
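Before the strong form, a quick numerical look at the statement log(n!/(√(2πn) n^n e^{−n})) → 0. The Python sketch below (an illustration only) works in log-space to avoid overflow, using lgamma(n + 1) = log n!.

import math

def log_stirling(n):
    # log of sqrt(2*pi*n) * n^n * e^{-n}
    return 0.5 * math.log(2 * math.pi * n) + n * math.log(n) - n

for n in (10, 100, 1000, 10000):
    diff = math.lgamma(n + 1) - log_stirling(n)   # log n! minus its approximation
    print(n, diff)                                # tends to 0 as n grows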

Now we prove the strong form.

Proof. For x > 0, we have

    1 − x + x² − x³ < 1/(1 + x) < 1 − x + x².

Now integrate from 0 to y to obtain

    y − y²/2 + y³/3 − y⁴/4 < log(1 + y) < y − y²/2 + y³/3.

Let h_n = log (n! e^n / n^{n+1/2}). Then, playing silly buggers with log(1 + 1/n), we obtain

    1/(12n²) − 1/(12n³) ≤ h_n − h_{n+1} ≤ 1/(12n²) + 1/(6n³).

For n ≥ 2, 0 ≤ h_n − h_{n+1} ≤ 1/n². Thus h_n is a decreasing sequence, and

    0 ≤ h_2 − h_{n+1} = ∑_{r=2}^{n} (h_r − h_{r+1}) ≤ ∑_{r=2}^{∞} 1/r².

Therefore h_n is bounded below and decreasing, so it is convergent. Let the limit be A. We have obtained

    n! ∼ e^A n^{n+1/2} e^{−n}.

We need a trick to find A. Let I_r = ∫_0^{π/2} sin^r θ dθ. We obtain the recurrence I_r = ((r − 1)/r) I_{r−2} by integrating by parts. Therefore

    I_{2n} = ((2n)! / (2^n n!)²) (π/2)   and   I_{2n+1} = (2^n n!)² / (2n + 1)!.

Now I_n is decreasing in n, so

    1 ≤ I_{2n}/I_{2n+1} ≤ I_{2n−1}/I_{2n+1} = 1 + 1/(2n).

But by substituting our formula for n! in, we get that

    I_{2n}/I_{2n+1} = (π/2)(2n + 1) ((2n)!)² / (2^n n!)⁴ → 2π / e^{2A}.

Therefore e^{2A} = 2π as required, so that A = log √(2π) and n! ∼ √(2πn) n^n e^{−n}.


Chapter 2
The Axiomatic Approach

2.1 The Axioms

Let Ω be a sample space. Then probability P is a real valued function defined on subsets of Ω satisfying:
1. 0 ≤ P(A) ≤ 1 for A ⊆ Ω,
2. P(Ω) = 1,
3. for a finite or infinite sequence A_1, A_2, ... ⊆ Ω of disjoint events, P(∪_i A_i) = ∑_i P(A_i).

The number P(A) is called the probability of event A.
We can look at some distributions here. Consider an arbitrary finite or countable Ω = {ω_1, ω_2, ...} and an arbitrary collection {p_1, p_2, ...} of non-negative numbers with sum 1. If we define

    P(A) = ∑_{i : ω_i ∈ A} p_i,

it is easy to see that this function satisfies the axioms. The numbers p_1, p_2, ... are called a probability distribution. If Ω is finite with n elements, and if p_1 = p_2 = ... = p_n = 1/n, we recover the classical definition of probability.
Another example would be to let Ω = {0, 1, ...} and attach to outcome r the probability p_r = e^{−λ} λ^r / r! for some λ > 0. This is a distribution (as may be easily verified), and is called the Poisson distribution with parameter λ.

Theorem 2.1 (Properties of P). A probability P satisfies
1. P(A^c) = 1 − P(A),
2. P(∅) = 0,
3. if A ⊆ B then P(A) ≤ P(B),
4. P(A ∪ B) = P(A) + P(B) − P(A ∩ B).

Proof. Note that Ω = A ∪ A^c, and A ∩ A^c = ∅. Thus P(Ω) = P(A) + P(A^c) = 1. Now we can use this to obtain P(∅) = 1 − P(∅^c) = 0. If A ⊆ B, write B = A ∪ (B ∩ A^c), so that P(B) = P(A) + P(B ∩ A^c) ≥ P(A). Finally, write A ∪ B = A ∪ (B ∩ A^c) and B = (B ∩ A) ∪ (B ∩ A^c). Then P(A ∪ B) = P(A) + P(B ∩ A^c) and P(B) = P(B ∩ A) + P(B ∩ A^c), which gives the result.

Theorem 2.2 (Boole's Inequality). For any A_1, A_2, ... ⊆ Ω,

    P(∪_{i=1}^{n} A_i) ≤ ∑_{i=1}^{n} P(A_i)   and   P(∪_{i=1}^{∞} A_i) ≤ ∑_{i=1}^{∞} P(A_i).

Proof. Let B_1 = A_1 and then inductively let B_i = A_i \ ∪_{k=1}^{i−1} B_k. Thus the B_i are disjoint and ∪_i B_i = ∪_i A_i. Therefore

    P(∪_i A_i) = P(∪_i B_i) = ∑_i P(B_i) ≤ ∑_i P(A_i),

as B_i ⊆ A_i.

Theorem 2.3 (Inclusion-Exclusion Formula).

    P(∪_{i=1}^{n} A_i) = ∑_{∅ ≠ S ⊆ {1,...,n}} (−1)^{|S|−1} P(∩_{j ∈ S} A_j).

Proof. We know that P(A_1 ∪ A_2) = P(A_1) + P(A_2) − P(A_1 ∩ A_2), so the result is true for n = 2. We also have that

    P(A_1 ∪ ... ∪ A_n) = P(A_1 ∪ ... ∪ A_{n−1}) + P(A_n) − P((A_1 ∪ ... ∪ A_{n−1}) ∩ A_n).

But by distributivity, (A_1 ∪ ... ∪ A_{n−1}) ∩ A_n = ∪_{i=1}^{n−1} (A_i ∩ A_n). Application of the inductive hypothesis yields the result.

Corollary (Bonferroni Inequalities).

    ∑_{S ⊆ {1,...,n}, 1 ≤ |S| ≤ r} (−1)^{|S|−1} P(∩_{j ∈ S} A_j)   is   ≤ or ≥   P(∪_{i=1}^{n} A_i)

according as r is even or odd. In other words, if the inclusion-exclusion formula is truncated, the error has the sign of the omitted term and is smaller in absolute value. Note that the case r = 1 is Boole's inequality.

Proof. The result is true for n = 2. If it is true for n − 1, then it is true for n and r ≤ n − 1 by the inductive step above, which expresses an n-fold union in terms of two (n − 1)-fold unions. It is true for r = n by the inclusion-exclusion formula.

Example (Derangements). After a dinner, the n guests take coats at random from a pile. Find the probability that at least one guest has the right coat.

Solution. Let A_k be the event that guest k has his own coat¹. We want P(∪_{i=1}^{n} A_i). Now

    P(A_{i_1} ∩ ... ∩ A_{i_r}) = (n − r)!/n!,

by counting the number of ways of matching guests and coats after i_1, ..., i_r have taken theirs. Thus

    ∑_{i_1 < ... < i_r} P(A_{i_1} ∩ ... ∩ A_{i_r}) = C(n, r) (n − r)!/n! = 1/r!,

and the required probability is

    P(∪_{i=1}^{n} A_i) = 1 − 1/2! + 1/3! − ... + (−1)^{n−1} 1/n!,

which tends to 1 − e^{−1} as n → ∞.
Furthermore, let P_m(n) be the probability that exactly m guests take the right coat. Then P_0(n) → e^{−1} as n → ∞, and n! P_0(n) is the number of derangements of n objects. Therefore

    P_m(n) = C(n, m) P_0(n − m) (n − m)!/n! = P_0(n − m)/m! → e^{−1}/m!   as n → ∞.

2.2 Independence

Definition 2.1. Two events A and B are said to be independent if

    P(A ∩ B) = P(A) P(B).

More generally, a collection of events A_i, i ∈ I, are independent if

    P(∩_{i ∈ J} A_i) = ∏_{i ∈ J} P(A_i)

for all finite subsets J ⊆ I.

Example. Two fair dice are thrown. Let A_1 be the event that the first die shows an odd number. Let A_2 be the event that the second die shows an odd number, and finally let A_3 be the event that the sum of the two numbers is odd. Are A_1 and A_2 independent? Are A_1 and A_3 independent? Are A_1, A_2 and A_3 independent?

¹ I'm not being sexist, merely a lazy typist. Sex will be assigned at random...

Solution. We first calculate the probabilities of the events A_1, A_2, A_3, A_1 ∩ A_2, A_1 ∩ A_3 and A_1 ∩ A_2 ∩ A_3.

    Event                Probability
    A_1                  1/2
    A_2                  1/2
    A_3                  1/2
    A_1 ∩ A_2            1/4
    A_1 ∩ A_3            1/4
    A_1 ∩ A_2 ∩ A_3      0

Thus, by a series of multiplications, we can see that A_1 and A_2 are independent, A_1 and A_3 are independent (also A_2 and A_3), but that A_1, A_2 and A_3 are not independent.
Now we wish to state what we mean by two independent experiments². Consider Ω_1 = {α_1, ...} and Ω_2 = {β_1, ...} with associated probability distributions {p_1, ...} and {q_1, ...}. Then, by two independent experiments, we mean the sample space Ω_1 × Ω_2 with probability distribution

    P((α_i, β_j)) = p_i q_j.

Now, suppose A ⊆ Ω_1 and B ⊆ Ω_2. The event A can be interpreted as an event in Ω_1 × Ω_2, namely A × Ω_2, and similarly for B. Then

    P(A ∩ B) = ∑_{α_i ∈ A} ∑_{β_j ∈ B} p_i q_j = ∑_{α_i ∈ A} p_i ∑_{β_j ∈ B} q_j = P(A) P(B),

which is why they are called independent experiments. The obvious generalisation to n experiments can be made, but for an infinite sequence of experiments we mean a sample space Ω_1 × Ω_2 × ... satisfying the appropriate formula for all n ∈ N.
You might like to find the probability that n independent tosses of a biased coin with probability p of heads result in a total of r heads.

2.3 Distributions

The binomial distribution with parameters n and p, 0 ≤ p ≤ 1, has Ω = {0, ..., n} and probabilities p_i = C(n, i) p^i (1 − p)^{n−i}.

Theorem 2.4 (Poisson approximation to binomial). If n → ∞ and p → 0 with np = λ held fixed, then

    C(n, r) p^r (1 − p)^{n−r} → e^{−λ} λ^r / r!.

² or more generally, n.

Proof.

    C(n, r) p^r (1 − p)^{n−r}
      = [n(n − 1) ... (n − r + 1) / r!] p^r (1 − p)^{n−r}
      = [∏_{i=1}^{r} (n − i + 1)/n] (np)^r / r! · (1 − p)^{n−r}
      = [∏_{i=1}^{r} (n − i + 1)/n] (λ^r / r!) (1 − λ/n)^n (1 − λ/n)^{−r}
      → 1 · (λ^r / r!) e^{−λ} · 1 = e^{−λ} λ^r / r!.

Suppose an infinite sequence of independent trials is to be performed. Each trial results in a success with probability p ∈ (0, 1) or a failure with probability 1 − p. Such a sequence is called a sequence of Bernoulli trials. The probability that the first success occurs after exactly r failures is p_r = p(1 − p)^r. This is the geometric distribution with parameter p. Since ∑_{r=0}^{∞} p_r = 1, the probability that all trials result in failure is zero.

2.4 Conditional Probability

Definition 2.2. Provided P(B) > 0, we define the conditional probability of A given B³ to be

    P(A | B) = P(A ∩ B) / P(B).

Whenever we write P(A | B), we assume that P(B) > 0. Note that if A and B are independent then P(A | B) = P(A).

Theorem 2.5.
1. P(A ∩ B) = P(A | B) P(B),
2. P(A ∩ B ∩ C) = P(A | B ∩ C) P(B | C) P(C),
3. P(A | B ∩ C) = P(A ∩ B | C) / P(B | C),
4. the function P(· | B) restricted to subsets of B is a probability function on B.

Proof. Results 1 to 3 are immediate from the definition of conditional probability. For result 4, note that A ∩ B ⊆ B, so P(A ∩ B) ≤ P(B) and thus P(A | B) ≤ 1. P(B | B) = 1 (obviously), so it just remains to show the last axiom. For disjoint A_i,

    P(∪_i A_i | B) = P(∪_i (A_i ∩ B)) / P(B) = ∑_i P(A_i ∩ B) / P(B) = ∑_i P(A_i | B),

as required.

³ read "A given B".
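Returning to the Poisson approximation (Theorem 2.4), the convergence is already visible for moderate n; a small Python sketch with arbitrary n and λ:

import math

def binomial_pmf(n, p, r):
    return math.comb(n, r) * p ** r * (1 - p) ** (n - r)

def poisson_pmf(lam, r):
    return math.exp(-lam) * lam ** r / math.factorial(r)

n, lam = 1000, 3.0
p = lam / n
for r in range(7):
    print(r, round(binomial_pmf(n, p, r), 6), round(poisson_pmf(lam, r), 6))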

Theorem 2.6 (Law of total probability). Let B_1, B_2, ... be a partition of Ω. Then

    P(A) = ∑_i P(A | B_i) P(B_i).

Proof. ∑_i P(A | B_i) P(B_i) = ∑_i P(A ∩ B_i) = P(A ∩ ∪_i B_i) = P(A), as required.

Example (Gambler's Ruin). A fair coin is tossed repeatedly. At each toss a gambler wins £1 if a head shows and loses £1 if tails. He continues playing until his capital reaches m or he goes broke. Find p_x, the probability that he goes broke if his initial capital is x.

Solution. Let A be the event that he goes broke before reaching m, and let H or T be the outcome of the first toss. We condition on the first toss to get P(A) = P(A | H) P(H) + P(A | T) P(T). But P(A | H) = p_{x+1} and P(A | T) = p_{x−1}. Thus we obtain the recurrence

    p_{x+1} − p_x = p_x − p_{x−1}.

So p_x is linear in x, with p_0 = 1 and p_m = 0. Thus p_x = 1 − x/m.

Theorem 2.7 (Bayes' Formula). Let B_1, B_2, ... be a partition of Ω. Then

    P(B_i | A) = P(A | B_i) P(B_i) / ∑_j P(A | B_j) P(B_j).

Proof. P(B_i | A) = P(A ∩ B_i) / P(A) = P(A | B_i) P(B_i) / ∑_j P(A | B_j) P(B_j), by the law of total probability.
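A short simulation sketch of the Gambler's Ruin example above (the initial capital x and target m are arbitrary choices), checking p_x = 1 − x/m for the fair coin:

import random

def ruin_probability(x, m, trials=20_000):
    broke = 0
    for _ in range(trials):
        capital = x
        while 0 < capital < m:
            capital += 1 if random.random() < 0.5 else -1   # fair coin
        broke += capital == 0
    return broke / trials

x, m = 3, 10
print("simulated:", ruin_probability(x, m))
print("1 - x/m:  ", 1 - x / m)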

Chapter 3
Random Variables

Let Ω be finite or countable, and let p_ω = P({ω}) for ω ∈ Ω.

Definition 3.1. A random variable X is a function X : Ω → R.

Note that "random variable" is a somewhat inaccurate term: a random variable is neither random nor a variable.

Example. If Ω = {(i, j) : 1 ≤ i, j ≤ t}, then we can define random variables X and Y by X(i, j) = i + j and Y(i, j) = max{i, j}.

Let R_X be the image of Ω under X. When this range is finite or countable the random variable is said to be discrete. We write P(X = x_i) for ∑_{ω : X(ω) = x_i} p_ω, and for B ⊆ R,

    P(X ∈ B) = ∑_{x ∈ B} P(X = x).

Then (P(X = x), x ∈ R_X) is the distribution of the random variable X. Note that it is a probability distribution over R_X.

3.1 Expectation

Definition 3.2. The expectation of a random variable X is the number

    E[X] = ∑_{ω ∈ Ω} p_ω X(ω),

provided that this sum converges absolutely.

18 2 CHAPTER 3. RANDOM VARIABLES Note that E[X] p w X(ω) ω Ω p ω X(ω) x R X ω:x(ω)x x R X x ω:x(ω)x p ω x R X xp(x x). Absolute convergence allows the sum to be taken in any order. If X is a positive random variable and if ω Ω p ωx(ω) we write E[X] +. If x R X x 0 xp(x x) and xp(x x) x R X x<0 then E[X] is undefined. Example. If P(X r) e λ λr r!, then E[X] λ. Solution. E[X] r0 λ λr re r! λe λ r λ r (r )! λe λ e λ λ Example. If P(X r) ( n r) p r ( p) n r then E[X] np.

19 3.. EXPECTATION 3 Solution. E[X] n ( ) n rp r ( p) n r r r0 n n! r r!(n r)! pr ( p) n r r0 n (n )! n (r )!(n r)! pr ( p) n r np r n r (n )! (r )!(n r)! pr ( p) n r n (n )! np (r)!(n r)! pr ( p) n r r n ( ) n np p r ( p) n r r np r For any function f : R R the composition of f and X defines a new random variable f and X defines the new random variable f(x) given by f(x)(w) f(x(w)). Example. If a, b and c are constants, then a + bx and (X c) 2 are random variables defined by Note that E[X] is a constant. Theorem 3... If X 0 then E[X] 0. (a + bx)(w) a + bx(w) and (X c) 2 (w) (X(w) c) If X 0 and E[X] 0 then P(X 0). 3. If a and b are constants then E[a + bx] a + be[x]. 4. For any random variables X, Y then E[X + Y ] E[X] + E[Y ]. [ 5. E[X] is the constant which minimises E (X c) 2]. Proof.. X 0 means X w 0 w Ω So E[X] ω Ω p ω X(ω) 0 2. If ω Ω with p ω > 0 and X(ω) > 0 then E[X] > 0, therefore P(X 0).

20 4 CHAPTER 3. RANDOM VARIABLES 3. E[a + bx] (a + bx(ω)) p ω ω Ω a p ω + b p ω X(ω) ω Ω ω Ω a + E[X]. 4. Trivial. 5. Now E [ (X c) 2] E [ (X E[X] + E[X] c) 2] E [ [(X E[X]) 2] + 2(X E[X])(E[X] c) + [(E[X] c)] 2 ] E [ (X E[X]) 2] + 2(E[X] c)e[(x E[X])] + (E[X] c) 2 E [ (X E[X]) 2] + (E[X] c) 2. This is clearly minimised when c E[X]. Theorem 3.2. For any random variables X, X 2,..., X n [ n ] n E X i E[X i ] i i Proof. [ n ] [ n ] E X i E X i + X n i E i [ n ] X i + E[X] i Result follows by induction. 3.2 Variance Var X E [ X 2] E[X] 2 Standard Deviation Var X E[X E[X]] 2 σ 2 for Random Variable X Theorem 3.3. Properties of Variance (i) Var X 0 if Var X 0, then P(X E[X]) Proof - from property of expectation (ii) If a, b constants, Var (a + bx) b 2 Var X

21 3.2. VARIANCE 5 Proof. Var a + bx E[a + bx a be[x]] b 2 E[X E[X]] b 2 Var X (iii) Var X E [ X 2] E[X] 2 Proof. E[X E[X]] 2 E [ X 2 2XE[X] + (E[X]) 2] E [ X 2] 2E[X] E[X] + E[X] 2 E [ X 2] (E[X]) 2 Example. Let X have the geometric distribution P(X r) pq r with r 0,, 2... and p + q. Then E[X] q p and Var X q p 2. Solution. E[X] rpq r pq E [ X 2] r0 pq r0 r0 rq r d dq (qr ) pq d ( ) dq q pq( q) 2 q p r 2 p 2 q 2r r0 ( ) pq r(r + )q r rq r r r 2 pq( ( q) 3 ( q) 2 2q p 2 q p Var X E [ X 2] E[X] 2 2q p 2 q p q2 p q p 2 Definition 3.3. The co-variance of random variables X and Y is: The correlation of X and Y is: Cov(X, Y ) E[(X E[X])(Y E[Y ])] Corr(X, Y ) Cov(X, Y ) Var X Var Y
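A simulation sketch of the geometric example above (the success probability p is an arbitrary choice), checking E[X] = q/p and Var X = q/p²:

import random

def geometric_sample(p):
    # number of failures before the first success
    failures = 0
    while random.random() >= p:
        failures += 1
    return failures

p, trials = 0.3, 200_000
xs = [geometric_sample(p) for _ in range(trials)]
mean = sum(xs) / trials
var = sum((x - mean) ** 2 for x in xs) / trials
q = 1 - p
print("mean:    ", mean, "   q/p:  ", q / p)
print("variance:", var, "   q/p^2:", q / p ** 2)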

22 6 CHAPTER 3. RANDOM VARIABLES Linear Regression Theorem 3.4. Var (X + Y ) Var X + Var Y + 2Cov(X, Y ) Proof. Var (X + Y ) E [ (X + Y ) 2 E[X] E[Y ] ] 2 E [ (X E[X]) 2 + (Y E[Y ]) 2 + 2(X E[X])(Y E[Y ]) ] Var X + Var Y + 2Cov(X, Y ) 3.3 Indicator Function Definition 3.4. The Indicator Function I[A] of an event A Ω is the function I[A](w) {, if ω A; 0, if ω / A. (3.) NB that I[A] is a random variable. E[I[A]] P(A) E[I[A]] p ω I[A](w) ω Ω P(A) 2. I[A c ] I[A] 3. I[A B] I[A]I[B] 4. I[A B] I[A] + I[B] I[A]I[B] I[A B](ω) if ω A or ω B I[A B](ω) I[A](ω) + I[B](ω) I[A]I[B](ω) WORKS! Example. n couples are arranged randomly around a table such that males and females alternate. Let N The number of husbands sitting next to their wives. Calculate

23 3.3. INDICATOR FUNCTION 7 the E[N] and the Var N. N n I[A i ] i [ n ] E[N] E I[A i ] i n E[I[A i ]] i n i 2 n Thus E[N] n 2 n 2 E [ ( n ) 2 N 2] E I[A i ] i n E I[A i ] i A i event couple i are together i j I[A i ]I[A j ] ne [ I[A i ] 2] + n(n )E[(I[A ]I[A 2 ])] E [ I[A i ] 2] E[I[A i ]] 2 n E[(I[A ]I[A 2 ])] IE[[A B 2 ]] P(A A 2 ) P(A ) P(A 2 A ) 2 ( n n n n 2 ) 2 n n Var N E [ N 2] E[N] 2 2 ( + 2(n 2)) 2 n 2(n 2) n

24 8 CHAPTER 3. RANDOM VARIABLES 3.4 Inclusion - Exclusion Formula ( N N A i A c i [ N ] [( N I A i I I ) c A c i [ N ) c ] A c i ] N I[A c i] N ( I[A i ]) N I[A i ] i i 2 I[A ]I[A 2 ] ( ) j+ I[A ]I[A 2 ]...I[A j ] +... i i 2... i j Take Expectation [ N ] ( N ) E A i P A i N P(A i ) i i 2 P(A A 2 ) ( ) j+ i i 2... i j P ( Ai A i2... A ij ) Independence Definition 3.5. Discrete random variables X,..., X n are independent if and only if for any x...x n : P(X x, X 2 x 2...X n x n ) N P(X i x i ) Theorem 3.5 (Preservation of Independence). If X,..., X n are independent random variables and f, f 2...f n are functions R R then f (X )...f n (X n ) are independent random variables

25 3.5. INDEPENDENCE 9 Proof. P(f (X ) y,..., f n (X n ) y n ) x :f (X )y,... x n:f n(x n)y n N P(X i x i ) x i:f i(x i)y i N P(f i (X i ) y i ) P(X x,..., X n x n ) Theorem 3.6. If X...X n are independent random variables then: [ N ] E X i N E[X i ] NOTE that E[ X i ] E[X i ] without requiring independence. Proof. Write R i for R Xi the range of X i [ N ] E X i... x..x n P(X x, X 2 x 2..., X n x n ) x R x n R n ( ) N P(X i x i ) x i R i N E[X i ] Theorem 3.7. If X,..., X n are independent random variables and f...f n are function R R then: [ N ] N E f i (X i ) E[f i (X i )] Proof. Obvious from last two theorems! Theorem 3.8. If X,..., X n are independent random variables then: ( n ) n Var X i Var X i i i

26 20 CHAPTER 3. RANDOM VARIABLES Proof. ( n ) ( n ) 2 [ n ] 2 Var X i E X i E X i i i i E Xi 2 + [ n ] 2 X i X j E X i i i j i E [ X 2 ] i + E[X i X j ] E[X i ] 2 E[X i ] E[X j ] i i j i i j ( E [ X 2 ] i E[Xi ] 2) i i Var X i Theorem 3.9. If X,..., X n are independent identically distributed random variables then Proof. Var Var ( n ( n ) n X i n Var X i i ) n X i n 2 Var X i i n n 2 Var X i i n Var X i Example. Experimental Design. Two rods of unknown lengths a, b. A rule can measure the length but with but with error having 0 mean (unbiased) and variance σ 2. Errors independent from measurement to measurement. To estimate a, b we could take separate measurements A, B of each rod. E[A] a Var A σ 2 E[B] b Var B σ 2

27 3.5. INDEPENDENCE 2 Can we do better? YEP! Measure a + b as X and a b as Y E[X] a + b Var X σ 2 E[Y ] a b Var Y σ 2 [ ] X + Y E a 2 Var X + Y 2 2 σ2 [ ] X Y E b 2 Var X Y 2 2 σ2 So this is better. Example. Non standard dice. You choose then I choose one. Around this cycle a B P(A B) 2 3. So the relation A better that B is not transitive.
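A simulation sketch of the measurement-design example (the rod lengths a, b and the error standard deviation σ below are arbitrary, and Gaussian errors are used purely for illustration; the notes only assume mean 0 and variance σ²): estimating a by (X + Y)/2, where X measures a + b and Y measures a − b, has half the variance of a direct measurement A.

import random
import statistics

a, b, sigma, trials = 3.0, 5.0, 1.0, 100_000

direct = [a + random.gauss(0, sigma) for _ in range(trials)]    # measure a directly
combined = []
for _ in range(trials):
    x = a + b + random.gauss(0, sigma)    # measurement of a + b
    y = a - b + random.gauss(0, sigma)    # measurement of a - b
    combined.append((x + y) / 2)          # estimate of a

print("Var of direct estimate:  ", statistics.variance(direct))     # about sigma^2
print("Var of combined estimate:", statistics.variance(combined))   # about sigma^2 / 2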


29 Chapter 4 Inequalities 4. Jensen s Inequality A function f; (a, b) R is convex if f(px + qy) pf(x) + ( p)f(y) - x, y (a, b) - p (0, ) Strictly convex if strict inequality holds when x y f is concave if f is convex. f is strictly concave if f is strictly convex 23

30 24 CHAPTER 4. INEQUALITIES Concave neither concave or convex. We know that if f is twice differentiable and f (x) 0 for x (a, b) the if f is convex and strictly convex if f (x) 0 forx (a, b). Example. f(x) is strictly convex on (0, ) Example. Strictly concave. f(x) log x f (x) x f (x) x 2 0 f(x) x log x f (x) ( + logx) f (x) x 0 Example. f(x x 3 is strictly convex on (0, ) but not on (, ) Theorem 4.. Let f : (a, b) R be a convex function. Then: ( n n ) p i f(x i ) f p i x i i x,..., X n (a, b), p,..., p n (0, ) and p i. Further more if f is strictly convex then equality holds if and only if all x s are equal. i E[f(X)] f(e[x])

31 4.. JENSEN S INEQUALITY 25 Proof. By induction on n n nothing to prove n 2 definition of convexity. Assume results holds up to n-. Consider x,..., x n (a, b), p,..., p n (0, ) and pi For i 2...n, set p i p n i, such that p i p i i2 Then by the inductive hypothesis twice, first for n-, then for 2 n n p i f i (x i ) p f(x ) + ( p ) p if(x i ) ( n ) f p i x i i i2 ( n ) p f(x ) + ( p )f p ix i f ( p x + ( p ) i2 ) n p ix i f is strictly convex n 3 and not all the x i s equal then we assume not all of x 2...x n are equal. But then ( n n ) ( p j ) p if(x i ) ( p j )f p ix i So the inequality is strict. i2 Corollary (AM/GM Inequality). Positive real numbers x,..., x n ( n i x i ) n n n i Equality holds if and only if x x 2 x n Proof. Let P(X x i ) n then f(x) log x is a convex function on (0, ). So x i i2 i2 E[f(x)] f (E[x]) (Jensen s Inequality) E[log x] log E[x] [] Therefore n log x i log n x n n ( n i x i ) n n n x i [2] For strictness since f strictly convex equation holds in [] and hence [2] if and only if x x 2 x n i
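A tiny numerical illustration of the AM/GM inequality, with arbitrary random positive numbers:

import math
import random

xs = [random.uniform(0.1, 10.0) for _ in range(8)]
am = sum(xs) / len(xs)                   # arithmetic mean
gm = math.prod(xs) ** (1 / len(xs))      # geometric mean
print("AM:", am, " GM:", gm, " AM >= GM:", am >= gm)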

32 26 CHAPTER 4. INEQUALITIES If f : (a, b) R is a convex function then it can be shown that at each point y (a, b) a linear function α y + β y x such that f(x) α y + β y x x (a, b) f(y) α y + β y y If f is differentiable at y then the linear function is the tangent f(y) + (x y)f (y) Let y E[X], α α y and β β y f (E[X]) α + βe[x] So for any random variable X taking values in (a, b) E[f(X)] E[α + βx] α + βe[x] f (E[X]) 4.2 Cauchy-Schwarz Inequality Theorem 4.2. For any random variables X, Y, Proof. For a, b R Let E[XY ] 2 E [ X 2] E [ Y 2] LetZ ax by Then0 E [ Z 2] E [ (ax by ) 2] a 2 E [ X 2] 2abE[XY ] + b 2 E [ Y 2] quadratic in a with at most one real root and therefore has discriminant 0.

33 4.3. MARKOV S INEQUALITY 27 Take b 0 E[XY ] 2 E [ X 2] E [ Y 2] Corollary. Corr(X, Y ) Proof. Apply Cauchy-Schwarz to the random variables X E[X] and Y E[Y ] 4.3 Markov s Inequality Theorem 4.3. If X is any random variable with finite mean then, Proof. Let P( X a) E[ X ] a for any a 0 A X a T hen X ai[a] Take expectation E[ X ] ap(a) E[ X ] ap( X a) 4.4 Chebyshev s Inequality Theorem 4.4. Let X be a random variable with E [ X 2]. Then ɛ 0 Proof. P( X ɛ) E[ X 2] ɛ 2 I[ X ɛ] x2 ɛ 2 x

34 28 CHAPTER 4. INEQUALITIES Then Take Expectation I[ X ɛ] x2 ɛ 2 [ ] x 2 P( X ɛ) E E[ X 2] ɛ 2 ɛ 2 Note. The result is distribution free - no assumption about the distribution of X (other than E [ X 2] ). 2. It is the best possible inequality, in the following sense X +ɛ with probability c 2ɛ 2 ɛ with probability c 2ɛ 2 0 with probability c ɛ 2 Then P( X ɛ) c ɛ 2 E [ X 2] c P( X ɛ) c ɛ 2 E [ ] X 2 ɛ 2 3. If µ E[X] then applying the inequality to X µ gives Often the most useful form. P(X µ ɛ) Var X ɛ Law of Large Numbers Theorem 4.5 (Weak law of large numbers). Let X, X 2... be a sequences of independent identically distributed random variables with Variance σ 2 Let S n n i X i Then ( ) S n ɛ 0, P n µ ɛ 0 as n

35 4.5. LAW OF LARGE NUMBERS 29 Proof. By Chebyshev s Inequality ( ) S n P n µ ɛ ( S n Thus P n µ E[ ( Sn n µ)2] ɛ 2 E[ (S n nµ) 2] n 2 ɛ 2 Var S n n 2 ɛ 2 Since E[S n ] nµ But Var S n nσ 2 ) ɛ nσ2 n 2 ɛ 2 σ2 nɛ 2 0 properties of expectation Example. A, A 2... are independent events, each with probability p. Let X i I[A i ]. Then Theorem states that S n n na n number of times A occurs number of trials µ E[I[A i ]] P(A i ) p ( ) S n P n p ɛ 0 as n Which recovers the intuitive definition of probability. Example. A Random Sample of size n is a sequence X, X 2,..., X n of independent identically distributed random variables ( n observations ) X n i X i n is called the SAMPLE MEAN Theorem states that provided the variance of X i is finite, the probability that the sample mean differs from the mean of the distribution by more than ɛ approaches 0 as n. We have shown the weak law of large numbers. Why weak? larger numbers. ( ) Sn P n µ as n This is NOT the same as the weak form. What does this mean? ω Ω determines S n, n, 2,... n as a sequence of real numbers. Hence it either tends to µ or it doesn t. ( P ω : S ) n(ω) µ as n n a strong form of
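A minimal simulation sketch of the weak law (uniform [0, 1] summands, so µ = 1/2; ε and the trial counts are arbitrary): the probability that the sample mean lies more than ε from µ visibly shrinks as n grows.

import random

def tail_probability(n, eps=0.1, trials=2000):
    # estimate P(|S_n / n - 1/2| >= eps) for X_i uniform on [0, 1]
    count = 0
    for _ in range(trials):
        s = sum(random.random() for _ in range(n))
        if abs(s / n - 0.5) >= eps:
            count += 1
    return count / trials

for n in (10, 100, 1000):
    print(n, tail_probability(n))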


37 Chapter 5 Generating Functions In this chapter, assume that X is a random variable taking values in the range 0,, 2,.... Let p r P(X r) r 0,, 2,... Definition 5.. The Probability Generating Function (p.g.f) of the random variable X,or of the distribution p r 0,, 2,..., is p(z) E [ z X] z r P(X r) p r z r r0 This p(z) is a polynomial or a power series. If a power series then it is convergent for z by comparison with a geometric series. r0 p(z) r p r z r r p r Example. p r 6 r,..., 6 p(z) E [ z X] 6 z z 6 6 z ( + z +... z 6 ) Theorem 5.. The distribution of X is uniquely determined by the p.g.f p(z). Proof. We know that we can differential p(z) term by term for z Repeated differentiation gives p (z) p + 2p 2 z +... and so p (0) p (p(0) p 0 ) p (i) (z) ri r! (r i)! p rz r i and has p (i) 0 i!p i Thus we can recover p 0, p,... from p(z) 3

38 32 CHAPTER 5. GENERATING FUNCTIONS Theorem 5.2 (Abel s Lemma). Proof. p (z) E[X] lim z p (z) rp r z r z ri For z (0, ), p (z) is a non decreasing function of z and is bounded above by Choose ɛ 0, N large enough that Then True ɛ 0 and so z ri E[X] rp r ri N rp r E[X] ɛ ri N N lim rp r z r lim rp r z r rp r z ri E[X] lim z p (z) ri Usually p (z) is continuous at z, then E[X] p (). ( Recall p(z) z z 6 ) 6 z Theorem 5.3. Proof. E[X(X )] lim z p (z) p (z) Proof now the same as Abel s Lemma r(r )pz r 2 r2 Theorem 5.4. Suppose that X, X 2,..., X n are independent random variables with p.g.f s p (z), p 2 (z),..., p n (z). Then the p.g.f of X + X X n is Proof. p (z)p 2 (z)... p n (z) E [ z X+X2+...Xn] E [ z X.z X2....z Xn] E [ z X] E [ z X2]... E [ z Xn] p (z)p 2 (z)... p n (z)

39 33 Example. Suppose X has Poisson Distribution Then Let s calculate the variance of X Then λ λr P(X r) e r! E [ z X] r0 r 0,,... z r λ λr e r! e λ e λz e λ( z) p λe λ( z) p λ 2 e λ( z) E[X] lim z p (z) p ()( Since p (z) continuous at z )E[X] λ E[X(X )] p () λ 2 Var X E [ X 2] E[X] 2 E[X(X )] + E[X] E[X] 2 λ 2 + λ λ 2 λ Example. Suppose that Y has a Poisson Distribution with parameter µ. If X and Y are independent then: E [ z X+Y ] E [ z X] E [ z Y ] e λ( z) e µ( z) e (λ+µ)( z) But this is the p.g.f of a Poisson random variable with parameter λ + µ. By uniqueness (first theorem of the p.g.f) this must be the distribution for X + Y Example. X has a binomial Distribution, ( n P(X r) r E [ z X] n r0 ) p r ( p) n r r 0,,... ( ) n p r ( p) n r z r r (pz + p) n This shows that X Y + Y Y n. Where Y + Y Y n are independent random variables each with P(Y i ) p P(Y i 0) p Note if the p.g.f factorizes look to see if the random variable can be written as a sum.
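A simulation sketch of the X + Y example above (λ and µ are arbitrary; the sampler uses the standard product-of-uniforms method for Poisson variables), comparing the empirical distribution of the sum with the Poisson(λ + µ) probabilities:

import math
import random

def poisson_sample(lam):
    # product-of-uniforms method: count factors needed to drop below e^{-lam}
    threshold, k, prod = math.exp(-lam), 0, 1.0
    while True:
        prod *= random.random()
        if prod < threshold:
            return k
        k += 1

lam, mu, trials = 2.0, 3.0, 50_000
sums = [poisson_sample(lam) + poisson_sample(mu) for _ in range(trials)]
for r in range(9):
    empirical = sums.count(r) / trials
    theory = math.exp(-(lam + mu)) * (lam + mu) ** r / math.factorial(r)
    print(r, round(empirical, 4), round(theory, 4))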

40 34 CHAPTER 5. GENERATING FUNCTIONS 5. Combinatorial Applications Tile a (2 n) bathroom with (2 ) tiles. How many ways can this be done? Say f n f n f n + f n 2 f 0 f Let F (z) f n z n n0 f n z n f n z n + f n 2 z n f n z n f n z n + f n 2 z n n2 n2 n0 F (z) f 0 zf z(f (z) f 0 ) + z 2 F (z) F (z)( z z 2 ) f 0 ( z) + zf z + z. Since f 0 f, then F (z) z z 2 Let α + 5 α F (z) ( α z)( α 2 z) α The coefficient of z n, that is f n, is ( α z) α 2 ( α 2 z) ( α α n z n α 2 α α 2 f n n0 5.2 Conditional Expectation n0 α α 2 (α n+ α n+ 2 ) Let X and Y be random variables with joint distribution Then the distribution of X is P(X x) P(X x, Y y) y R y P(X x, Y y) α n 2 z n ) This is often called the Marginal distribution for X. The conditional distribution for X given by Y y is P(X x Y y) P(X x, Y y) P(Y y)

41 5.3. PROPERTIES OF CONDITIONAL EXPECTATION 35 Definition 5.2. The conditional expectation of X given Y y is, E[X x Y y] x R x xp(x x Y y) The conditional Expectation of X given Y is the random variable E[X Y ] defined by Thus E[X Y ] : Ω R E[X Y ] (ω) E[X Y Y (ω)] Example. Let X, X 2,..., X n be independent identically distributed random variables with P(X ) p and P(X 0) p. Let Then Y X + X X n P(X Y r) P(X, Y r) P(Y r) P(X, X X n r ) P(Y r) P(X ) P(X X n r ) P(Y r) p( ) n r p r ( p) n r ) pr ( p) n r ( n r ( n r ( n r) ) r n Then E[X Y r] 0 P(X 0 Y r) + P(X Y r) r n E[X Y Y (ω)] n Y (ω) Therefore E[X Y ] n Y Note a random variable - a function of Y. 5.3 Properties of Conditional Expectation Theorem 5.5. E[E[X Y ]] E[X]

42 36 CHAPTER 5. GENERATING FUNCTIONS Proof. E[E[X Y ]] P(Y y) E[X Y y] y R y y y E[X] P(Y y) P(X x Y y) x R x xp(x x Y y) x Theorem 5.6. If X and Y are independent then E[X Y ] E[X] Proof. If X and Y are independent then for any y R y E[X Y y] xp(x x Y y) xp(x x) E[X] x R x x Example. Let X, X 2,... be i.i.d.r.v s with p.g.f p(z). Let N be a random variable independent of X, X 2,... with p.g.f h(z). What is the p.g.f of: X + X X N E [ z X+,...,Xn] E [ E [ z X+,...,Xn N ]] P(N n) E [ z X+,...,Xn N n ] n0 P(N n) (p(z)) n n0 h(p(z)) Then for example E[X +,..., X n ] d dz h(p(z)) z h ()p () E[N] E[X ] Exercise Calculate d2 dz 2 h(p(z)) and hence Var X +,..., X n In terms of Var N and Var X

43 5.4. BRANCHING PROCESSES Branching Processes X 0, X... sequence of random variables. X n number of individuals in the n th generation of population. Assume.. X 0 2. Each individual lives for unit time then on death produces k offspring, probability f k. f k 3. All offspring behave independently. Assume Where Y n i. f f 0 + f X n+ Y n + Y n Y n n are i.i.d.r.v s. Y n i number of offspring of individual i in generation n. Let F(z) be the probability generating function ofy n i. Let F (z) f k z k E [ z Xi] ] E [z Y n i n0 F n (z) E [ z Xn] Then F (z) F (z) the probability generating function of the offspring distribution. Theorem 5.7. F n (z) is an n-fold iterative formula. F n+ (z) F n (F (z)) F (F (... (F (z))... ))

44 38 CHAPTER 5. GENERATING FUNCTIONS Proof. F n+ (z) E [ z Xn+] E [ E [ ]] z Xn+ X n P(X n k) E [ z Xn+ X n k ] n0 n0 n0 ] P(X n k) E [z Y n +Y n 2 + +Y n n ] [ ] P(X n k) E [z Y n... E z Y n n P(X n k) (F (z)) k n0 F n (F (z)) Theorem 5.8. Mean and Variance of population size If m and σ 2 kf k k0 (k m) 2 f k k0 Mean and Variance of offspring distribution. Then E[X n ] m n Var X n { σ 2 m n (m n ) m, m nσ 2, m (5.) Proof. Prove by calculating F (z), F (z) Alternatively E[X n ] E[E[X n X n ]] E[m X n ] me[x n ] m n by induction E [ (X n mx n ) 2] E [ E [ (X n mx n ) 2 X n ]] E[Var (X n X n )] E [ σ 2 X n ] σ 2 m n Thus E [ Xn] 2 2mE[Xn X n ] + m 2 E [ Xn 2 ] 2 σ 2 m n

45 5.4. BRANCHING PROCESSES 39 Now calculate E[X n X n ] E[E[X n X n X n ]] E[X n E[X n X n ]] E[X n mx n ] me [ Xn 2 ] Then E [ Xn 2 ] σ 2 m n + m 2 E[X n ] 2 Var X n E [ Xn 2 ] E[Xn ] 2 m 2 E [ Xn 2 ] + σ 2 m n m 2 E[X n ] 2 m 2 Var X n + σ 2 m n m 4 Var X n 2 + σ 2 (m n + m n ) m 2(n ) Var X + σ 2 (m n + m n + + m 2n 3 ) σ 2 m n ( + m + + m n ) To deal with extinction we need to be careful with limits as n. Let A n X n 0 Extinction occurs by generation n and let A A n the event that extinction ever occurs Can we calculate P(A) from P(A n )? More generally let A n be an increasing sequence A A 2... and define A lim n A n A n Define B n for n B A B n A n ( n i A n A c n A i ) c

46 40 CHAPTER 5. GENERATING FUNCTIONS B n for n are disjoint events and Thus A i i n A i i i n i B i B i ( ) ( ) P A i P B i i i P(B i ) lim n lim n P(B i ) n B i n i A i n i lim n lim n P(A n) ( ) P lim A n lim P(A n) n n Probability is a continuous set function. Thus P(extinction ever occurs) lim n P(A n) lim n P(X n 0) q, Say Note P(X n 0), n, 2, 3,... is an increasing sequence so limit exists. But P(X n 0) F n (0) F n is the p.g.f of X n So Also q lim n F n(0) ( ) F (q) F lim F n(0) n lim F (F n(0)) n lim F n+(0) n Thus F (q) q Since F is continuous q is called the Extinction Probability.

47 5.4. BRANCHING PROCESSES 4 Alternative Derivation q k P(X k) P(extinction X k) P(X k) q k F (q) Theorem 5.9. The probability of extinction, q, is the smallest positive root of the equation F (q) q. m is the mean of the offspring distribution. If m then q, while if m thenq Proof. F () m 0 kf k lim z F (z) F (z) j(j )z j 2 in 0 z Since f 0 + f Also F (0) f 0 0 jz Thus if m, there does not exists a q (0, ) with F (q) q. If m then let α be the smallest positive root of F (z) z then α. Further, F (0) F (α) α F (F (0)) F (α) α F n (0) α n q lim F n(0) 0 n q α Since q is a root of F (z) z
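A simulation sketch of the extinction probability for one made-up offspring distribution, f_0 = 1/4, f_1 = 1/4, f_2 = 1/2 (so m = 5/4 > 1): here F(q) = 1/4 + q/4 + q²/2, and the smallest positive root of F(q) = q is q = 1/2.

import random

def offspring():
    # f_0 = 0.25, f_1 = 0.25, f_2 = 0.5
    u = random.random()
    if u < 0.25:
        return 0
    if u < 0.50:
        return 1
    return 2

def goes_extinct(max_generations=100, cap=1_000):
    x = 1
    for _ in range(max_generations):
        if x == 0:
            return True
        if x > cap:               # population has exploded; count as survival
            return False
        x = sum(offspring() for _ in range(x))
    return x == 0

trials = 4_000
print("simulated q:", sum(goes_extinct() for _ in range(trials)) / trials)
print("smallest root of F(q) = q:", 0.5)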

48 42 CHAPTER 5. GENERATING FUNCTIONS 5.5 Random Walks Let X, X 2,... be i.i.d.r.vs. Let S n S 0 + X + X X n Where, usually S 0 0 Then S n (n 0,, 2,... is a dimensional Random Walk. We shall assume X n {, with probability p, with probability q (5.2) This is a simple random walk. If p q 2 then the random walk is called symmetric Example (Gambler s Ruin). You have an initial fortune of A and I have an initial fortune of B. We toss coins repeatedly I win with probability p and you win with probability q. What is the probability that I bankrupt you before you bankrupt me?

49 5.5. RANDOM WALKS 43 Set a A + B and z B Stop a random walk starting at z when it hits 0 or a. Let p z be the probability that the random walk hits a before it hits 0, starting from z. Let q z be the probability that the random walk hits 0 before it hits a, starting from z. After the first step the gambler s fortune is either z or z + with prob p and q respectively. From the law of total probability. p z qp z + pp z+ 0 z a Also p 0 0 and p a. Must solve pt 2 t + q 0. t ± 4pq 2p ± 2p 2p or q p General Solution for p q is and so p z A + B ( ) z q A + B 0A p p z If p q, the general solution is A + Bz ( ) z q p ( ) a q p ( ) a q p p z z a To calculate q z, observe that this is the same problem with p, q, z replaced by p, q, a z respectively. Thus ) a ( ) z q q z ( q p ( q p p ) a if p q

50 44 CHAPTER 5. GENERATING FUNCTIONS or q z a z if p q z Thus q z + p z and so on, as we expected, the game ends with probability one. What happens as a? P(hits 0 before a) q z P( paths hit 0 ever) q z ( q p ( q p Or a z z az+ ) a ( q p )z ) a if p q if p q path hits 0 before it hits a P(hits 0 ever) lim P(hits 0 before a) a lim q z a ( ) q p q p p q Let G be the ultimate gain or loss. { a z, with probability p z G (5.3) z, with probability q z E[G] { ap z z, if p q 0, if p q (5.4) Fair game remains fair if the coin is fair then then games based on it have expected reward 0. Duration of a Game Let D z be the expected time until the random walk hits 0 or a, starting from z. Is D z finite? D z is bounded above by x the mean of geometric random variables (number of window s of size a before a window with all + s or s). Hence D z is finite. Consider the first step. Then D z + pd z+ + qd z E[duration] E[E[duration first step]] p (E[duration first step up]) + q (E[duration first step down]) p( + D z+ ) + q( + D z ) Equation holds for 0 z a with D 0 D a 0. Let s try for a particular solution D z C z C q p for p q C z C p (z + ) + C q (z ) +

51 5.5. RANDOM WALKS 45 Consider the homogeneous relation General Solution for p q is Substitute z 0, a to get A and B pt 2 t + q 0 t t 2 q p D z D z A + B z q p ( ) z q + z p q p a q p ( ) z q p ( q p ) a p q If p q then a particular solution is z 2. General solution D z z 2 + A + Bz Substituting the boundary conditions given., D z z(a z) p q Example. Initial Capital. p q z a P(ruin) E[gain] E[duration] Stop the random walk when it hits 0 or a. We have absorption at 0 or a. Let U z,n P(r.w. hits 0 at time n starts at z) U z,n+ pu z+,n + qu z,n 0 z a n 0 U 0,n U a,n 0 n 0 U a,0 U z,0 0 0 z a Let U z U z,n s n. n0 Now multiply by s n+ and add for n 0,, 2... Where U 0 (s) and U a (s) 0 U z (s) psu z+ (s) + qsu z (s) Look for a solution U x (s) (λ(s)) z λ(s)

52 46 CHAPTER 5. GENERATING FUNCTIONS Must satisfy Two Roots, Every Solution of the form λ(s) ps ((λ(s)) 2 + qs λ (s), λ 2 (s) ± 4pqs 2 2ps U z (s) A(s) (λ (s)) z + B(s) (λ 2 (s)) z Substitute U 0 (s) and U a (s) 0.A(s) + B(s) and A(s) (λ (s)) a + B(s) (λ 2 (s)) a 0 U z (s) (λ (s)) a (λ 2 (s)) z (λ (s)) z (λ 2 (s)) a (λ (s)) a (λ 2 (s)) a But λ (s)λ 2 (s) q recall quadratic p ( ) q (λ (s)) a z (λ 2 (s)) a z U z (s) p (λ (s)) a (λ 2 (s)) a Same method give generating function for absorption probabilities at the other barrier. Generating function for the duration of the game is the sum of these two generating functions.

53 Chapter 6 Continuous Random Variables In this chapter we drop the assumption that Ω id finite or countable. Assume we are given a probability p on some subset of Ω. For example, spin a pointer, and let ω Ω give the position at which it stops, with Ω ω : 0 ω 2π. Let P(ω [0, θ]) θ (0 θ 2π) 2π Definition 6.. A continuous random variable X is a function X : Ω R for which Where f(x) is a function satisfying. f(x) 0 P(a X(ω) b) b a f(x)dx 2. + f(x)dx The function f is called the Probability Density Function. For example, if X(ω) ω given position of the pointer then x is a continuous random variable with p.d.f f(x) { 2π, (0 x 2π) 0, otherwise (6.) This is an example of a uniformly distributed random variable. On the interval [0, 2π] 47

54 48 CHAPTER 6. CONTINUOUS RANDOM VARIABLES in this case. Intuition about probability density functions is based on the approximate relation. P(X [x, x + xδx]) x+xδx Proofs however more often use the distribution function F (x) P(X x) F (x) is increasing in x. x f(z)dz If X is a continuous random variable then F (x) and so F is continuous and differentiable. x f(z)dz F (x) f(x) (At any point x where then fundamental theorem of calculus applies). The distribution function is also defined for a discrete random variable, F (x) and so F is a step function. ω:x(ω) x p ω In either case P(a X b) P(X b) P(X a) F (b) F (a)

55 49 Example. The exponential distribution. Let { e λx, 0 x F (x) 0, x 0 (6.2) The corresponding pdf is f(x) λe λx 0 x this is known as the exponential distribution with parameter λ. If X has this distribution then P(X x + z X z) P(X x + z) P(X z) e λ(x+z) e λz e λx P(X x) This is known as the memoryless property of the exponential distribution. Theorem 6.. If X is a continuous random variable with pdf f(x) and h(x) is a continuous strictly increasing function with h (x) differentiable then h(x) is a continuous random variable with pdf f h (x) f ( h (x) ) d dx h (x) Proof. The distribution function of h(x) is P(h(X) x) P ( X h (x) ) F ( h (x) ) Since h is strictly increasing and F is the distribution function of X Then. d P(h(X) x) dx is a continuous random variable with pdf as claimed f h. Note usually need to repeat proof than remember the result.
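A short simulation sketch of the memoryless property (λ, x, z and the sample size are arbitrary choices): conditioning on X ≥ z does not change the distribution of the remaining time.

import math
import random

lam, x, z, n = 1.5, 0.7, 2.0, 300_000
samples = [random.expovariate(lam) for _ in range(n)]

beyond_z = [s for s in samples if s >= z]
conditional = sum(s >= x + z for s in beyond_z) / len(beyond_z)
unconditional = sum(s >= x for s in samples) / n

print("P(X >= x + z | X >= z):", conditional)
print("P(X >= x):             ", unconditional)
print("exp(-lam * x):         ", math.exp(-lam * x))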

56 50 CHAPTER 6. CONTINUOUS RANDOM VARIABLES Example. Suppose X U[0, ] that is it is uniformly distributed on [0, ] Consider Y log x Thus Y is exponentially distributed. More generally P(Y y) P( log X y) P ( X e Y ) e Y dx e Y Theorem 6.2. Let U U[0, ]. For any continuous distribution function F, the random variable X defined by X F (u) has distribution function F. Proof. P(X x) P ( F (u) x ) P(U F (x)) F (x) U[0, ] Remark. a bit more messy for discrete random variables P(X X i ) p i i 0,,... Let X x j if j p i U i0 i0 j p i U U[0, ] 2. useful for simulations 6. Jointly Distributed Random Variables For two random variables X and Y the joint distribution function is Let F (x, y) P(X x, Y y) F : R 2 [0, ] F X (x) P(Xz x) P(X x, Y ) F (x, ) lim F (x, y) y

57 6.. JOINTLY DISTRIBUTED RANDOM VARIABLES 5 This is called the marginal distribution of X. Similarly F Y (x) F (, y) X, X 2,..., X n are jointly distributed continuous random variables if for a set c R b P((X, X 2,..., X n ) c)... f(x,..., x n )dx... dx n (x,...,x n) c For some function f called the joint probability density function satisfying the obvious conditions.. 2. Example. (n 2) f(x,..., x n )dx 0... f(x,..., x n )dx... dx n R n and so f(x, y) 2 F (x, y) x y F (x, y) P(X x, Y y) x y f(u, v)dudv Theorem 6.3. provided defined at (x, y). If X and y are jointly continuous random variables then they are individually continuous. Proof. Since X and Y are jointly continuous random variables is the pdf of X. P(X A) P(X A, Y (, + )) where f X (x) f(x, y)dy A f A f X (x)dx Jointly continuous random variables X and Y are Independent if f(x, y) f X (x)f Y (y) Then P(X A, Y B) P(X A) P(Y B) f(x, y)dxdy Similarly jointly continuous random variables X,..., X n are independent if n f(x,..., x n ) f Xi (x i ) i

58 52 CHAPTER 6. CONTINUOUS RANDOM VARIABLES Where f Xi (x i ) are the pdf s of the individual random variables. Example. Two points X and Y are tossed at random and independently onto a line segment of length L. What is the probability that: X Y l? Suppose that at random means uniformly so that f(x, y) L 2 x, y [0, L] 2

59 6.. JOINTLY DISTRIBUTED RANDOM VARIABLES 53 Desired probability A f(x, y)dxdy area of A L 2 L2 2 2 (L l)2 2Ll l2 L 2 L 2 Example (Buffon s Needle Problem). A needle of length l is tossed at random onto a floor marked with parallel lines a distance L apart l L. What is the probability that the needle intersects one of the parallel lines. Let θ [0, 2π] be the angle between the needle and the parallel lines and let x be the distance from the bottom of the needle to the line closest to it. It is reasonable to suppose that X is distributed Uniformly. X U[0, L] Θ U[0, π) and X and Θ are independent. Thus f(x, θ) lπ 0 x L and 0 θ π

60 54 CHAPTER 6. CONTINUOUS RANDOM VARIABLES The needle intersects the line if and only if X sin θ The event A f(x, θ)dxdθ l A π 0 2l πl sin θ πl dθ Definition 6.2. The expectation or mean of a continuous random variable X is E[X] xf(x)dx provided not both of xf(x)dx and 0 xf(x)dx are infinite Example (Normal Distribution). Let f(x) e (x µ)2 2σ 2 2πσ x This is non-negative for it to be a pdf we also need to check that Make the substitution z x µ σ. Then f(x)dx Thus I 2 [ ] [ ] e x2 2 dx e y2 2 dy 2π I 2πσ 2π 2π 2π 0 2π 0 2π 0 dθ e (x µ) 2 2σ 2 dx e z2 2 dz e (y2 +x 2 ) 2 dxdy re θ2 2 drdθ Therefore I. A random variable with the pdf f(x) given above has a Normal distribution with parameters µ and σ 2 we write this as The Expectation is E[X] 2πσ 2πσ X N[µ, σ 2 ] xe (x µ) 2 2σ 2 dx (x µ)e (x µ) 2 2σ 2 dx + 2πσ µe (x µ) 2 2σ 2 dx.

61 6.. JOINTLY DISTRIBUTED RANDOM VARIABLES 55 The first term is convergent and equals zero by symmetry, so that E[X] 0 + µ µ Theorem 6.4. If X is a continuous random variable then, Proof. result follows. Similarly E[X] P(X x) dx P(X x) dx P(X x) dx 0 0 x P(X x) dx [ 0 0 y ] f(y)dy dx I[y x]f(y)dydx dxf(y)dy yf(y)dy yf(y)dy Note This holds for discrete random variables and is useful as a general way of finding the expectation whether the random variable is discrete or continuous. If X takes values in the set [0,,..., ] Theorem states and a direct proof follows E[X] P(X n) n0 P(X n) n0 n0 m0 I[m n]p(x m) ( ) I[m n] P(X m) m0 n0 mp(x m) m0 Theorem 6.5. Let X be a continuous random variable with pdf f(x) and let h(x) be a continuous real-valued function. Then provided h(x) f(x)dx E[h(x)] h(x)f(x)dx

62 56 CHAPTER 6. CONTINUOUS RANDOM VARIABLES Proof. Similarly 0 0 P(h(X) y) dy 0 0 [ x:h(x) P(h(X) y) x:h(x) 0 f(x)dx x:h(x) 0 x:h(x) 0 ] dy I[h(x) y]f(x)dxdy [ ] h(x) 0 dy f(x)dx 0 x:h(x) 0 h(x)f(x)dy h(x)f(x)dy So the result follows from the last theorem. Definition 6.3. The variance of a continuous random variable X is Var X E [ (X E[X]) 2] Note The properties of expectation and variance are the same for discrete and continuous random variables just replace with in the proofs. Example. Var X E [ X 2] E[X] 2 ( x 2 f(x)dx xf(x)dx Example. Suppose X N[µ, σ 2 ] Let z X µ σ ( ) X µ P(Z z) P z σ ( Let u x µ ) σ P(X µ + σz) µ+σz z e (x µ)2 2σ 2 dx 2πσ 2π e u2 2 du then Φ(z) The distribution function of a N(0, ) random variable Z N(0, ) ) 2

63 6.2. TRANSFORMATION OF RANDOM VARIABLES 57 What is the variance of Z? Var X E [ Z 2] E[Z] 2 Last term is zero 2π Var X [ ze z2 2 2π 0 + z 2 e z2 2 dz ] + e z2 2 dz Variance of X? X µ + σz Thus E[X] µ we know that already Var X σ 2 Var Z Var X σ 2 X (µ, σ 2 ) 6.2 Transformation of Random Variables Suppose X, X 2,..., X n have joint pdf f(x,..., x n ) let Y r (X, X 2,..., X n ) Y 2 r 2 (X, X 2,..., X n ). Y n r n (X, X 2,..., X n ) Let R R n be such that P((X, X 2,..., X n ) R) Let S be the image of R under the above transformation suppose the transformation from R to S is - (bijective).

64 58 CHAPTER 6. CONTINUOUS RANDOM VARIABLES Then inverse functions x s (y, y 2,..., y n ) x 2 s 2 (y, y 2,..., y n )... x n s n (y, y 2,..., y n ) Assume that si y j exists and is continuous at every point (y, y 2,..., y n ) in S s y... J..... s n y... s y n s n y n (6.3) If A R P((X,..., X n ) A) [] A B f(x,..., x n )dx... dx n f (s,..., s n ) J dy... dy n Where B is the image of A P((Y,..., Y n ) B) [2] Since transformation is - then [],[2] are the same Thus the density for Y,..., Y n is g((y, y 2,..., y n ) f (s (y, y 2,..., y n ),..., s n (y, y 2,..., y n )) J y, y 2,..., y n S 0 otherwise. Example (density of products and quotients). Suppose that (X, Y ) has density f(x, y) { 4xy, for 0 x, 0 y 0, Otherwise. (6.4) Let U X Y and V XY

65 6.2. TRANSFORMATION OF RANDOM VARIABLES 59 X V UV Y U x v uv y u x u v x 2 u v u 2 v y u 2 Therefore J 2u and so v 2 u 3 2 g(u, v) 2u (4xy) Note U and V are NOT independent 2u 4 uv v u 2 u v if (u, v) D 0 Otherwise. g(u, v) 2 u I[(u, v) D] v y v 2 uv. not product of the two identities. When the transformations are linear things are simpler still. Let A be the n n invertible matrix. Then the pdf of (Y,..., Y n ) is Y. A Y n X. X n. J det A det A g(y,..., n ) det A f(a g) Example. Suppose X, X 2 have the pdf f(x, x 2 ). Calculate the pdf of X + X 2. Let Y X + X 2 and Z X 2. Then X Y Z and X 2 Z. ( ) A (6.5) 0 Then det A det A g(y, z) f(x, x 2 ) f(y z, z)

66 60 CHAPTER 6. CONTINUOUS RANDOM VARIABLES joint distributions of Y and X. Marginal density of Y is g(y) or g(y) f(y z, z)dz y f(z, y z)dz By change of variable If X and X 2 are independent, with pgf s f and f 2 then f(x, x 2 ) f(x )f(x 2 ) and then g(y) f(y z)f(z)dz - the convolution of f and f 2 For the pdf f(x) ˆx is a mode if f(ˆx) f(x) x ˆx is a median if ˆx f(x)dx ˆx For a discrete random variable, ˆx is a median if f(x)dx 2 P(X ˆx) 2 or P(X ˆx) 2 If X,..., X n is a sample from the distribution then recall that the sample mean is n X i n Let Y,..., Y n (the statistics) be the values of X,..., X n arranged in increasing order. Then the sample median is Y n+ if n is odd or any value in 2 [ ] Y n, Y 2 n+ if n is even 2 If Y n maxx,..., X n and X,..., X n are iidrv s with distribution F and density f then, P(Y n y) P(X y,..., X n y) (F (y)) n

67 6.2. TRANSFORMATION OF RANDOM VARIABLES 6 Thus the density of Y n is g(y) d (F (y))n dy Similarly Y minx,..., X n and is Then the density of Y is n (F (y)) n f(y) P(Y y) P(X y,..., X n y) ( F (y)) n What about the joint density of Y, Y n? G(y, y n ) P(Y y, Y n y n ) n ( F (y)) n f(y) P(Y n y n ) P(Y n y n, Y ) P(Y n y n ) P(y X y n, y X 2 y n,..., y X n y n ) (F (y n )) n (F (y n ) F (y )) n Thus the pdf of Y, Y n is g(y, y n ) 2 y y n G(y, y n ) n(n ) (F (y n ) F (y )) n 2 f(y )f(y n ) y y n 0 otherwise What happens if the mapping is not -? X f(x) and X g(x)? P( X (a, b)) b a (f(x) + f( x)) dx g(x) f(x) + f( x) Suppose X,..., X n are iidrv s. What is the pdf of Y,..., Y n the order statistics? { n!f(y )... f(y n ), y y 2 y n g(y,..., y n ) (6.6) 0, Otherwise Example. Suppose X,..., X n are iidrv s exponentially distributed with parameter λ. Let z Y z 2 Y 2 Y. z n Y n Y n Where Y,..., Y n are the order statistics of X,..., X n. What is the distribution of the z s?

68 62 CHAPTER 6. CONTINUOUS RANDOM VARIABLES Z AY Where det(a) A h(z,..., z n ) g(y,..., y n ) n!f(y )... f(y n ) n!λ n e λy... e λyn n!λ n e λ(y+ +yn) n!λ n e λ(z2z2+ +nzn) n λie λizn+ i i (6.7) Thus h(z,..., z n ) is expressed as the product of n density functions and Z n+ i exp(λi) exponentially distributed with parameter λi, with z,..., z n independent. Example. Let X and Y be independent N(0.) random variables. Let D R 2 X 2 + Y 2 then tan Θ Y X then ( y d x 2 + y 2 and θ arctan x) J 2x y x 2 +( x) y 2 2y x +( x) y 2 2 (6.8)


More information

Actuarial Science Exam 1/P

Actuarial Science Exam 1/P Actuarial Science Exam /P Ville A. Satopää December 5, 2009 Contents Review of Algebra and Calculus 2 2 Basic Probability Concepts 3 3 Conditional Probability and Independence 4 4 Combinatorial Principles,

More information

BASICS OF PROBABILITY

BASICS OF PROBABILITY October 10, 2018 BASICS OF PROBABILITY Randomness, sample space and probability Probability is concerned with random experiments. That is, an experiment, the outcome of which cannot be predicted with certainty,

More information

MA/ST 810 Mathematical-Statistical Modeling and Analysis of Complex Systems

MA/ST 810 Mathematical-Statistical Modeling and Analysis of Complex Systems MA/ST 810 Mathematical-Statistical Modeling and Analysis of Complex Systems Review of Basic Probability The fundamentals, random variables, probability distributions Probability mass/density functions

More information

Summary of basic probability theory Math 218, Mathematical Statistics D Joyce, Spring 2016

Summary of basic probability theory Math 218, Mathematical Statistics D Joyce, Spring 2016 8. For any two events E and F, P (E) = P (E F ) + P (E F c ). Summary of basic probability theory Math 218, Mathematical Statistics D Joyce, Spring 2016 Sample space. A sample space consists of a underlying

More information

Chapter 3, 4 Random Variables ENCS Probability and Stochastic Processes. Concordia University

Chapter 3, 4 Random Variables ENCS Probability and Stochastic Processes. Concordia University Chapter 3, 4 Random Variables ENCS6161 - Probability and Stochastic Processes Concordia University ENCS6161 p.1/47 The Notion of a Random Variable A random variable X is a function that assigns a real

More information

MAS113 Introduction to Probability and Statistics. Proofs of theorems

MAS113 Introduction to Probability and Statistics. Proofs of theorems MAS113 Introduction to Probability and Statistics Proofs of theorems Theorem 1 De Morgan s Laws) See MAS110 Theorem 2 M1 By definition, B and A \ B are disjoint, and their union is A So, because m is a

More information

Lecture 4: Probability and Discrete Random Variables

Lecture 4: Probability and Discrete Random Variables Error Correcting Codes: Combinatorics, Algorithms and Applications (Fall 2007) Lecture 4: Probability and Discrete Random Variables Wednesday, January 21, 2009 Lecturer: Atri Rudra Scribe: Anonymous 1

More information

1: PROBABILITY REVIEW

1: PROBABILITY REVIEW 1: PROBABILITY REVIEW Marek Rutkowski School of Mathematics and Statistics University of Sydney Semester 2, 2016 M. Rutkowski (USydney) Slides 1: Probability Review 1 / 56 Outline We will review the following

More information

Sample Spaces, Random Variables

Sample Spaces, Random Variables Sample Spaces, Random Variables Moulinath Banerjee University of Michigan August 3, 22 Probabilities In talking about probabilities, the fundamental object is Ω, the sample space. (elements) in Ω are denoted

More information

Probability and Distributions

Probability and Distributions Probability and Distributions What is a statistical model? A statistical model is a set of assumptions by which the hypothetical population distribution of data is inferred. It is typically postulated

More information

Random Variables. Random variables. A numerically valued map X of an outcome ω from a sample space Ω to the real line R

Random Variables. Random variables. A numerically valued map X of an outcome ω from a sample space Ω to the real line R In probabilistic models, a random variable is a variable whose possible values are numerical outcomes of a random phenomenon. As a function or a map, it maps from an element (or an outcome) of a sample

More information

CME 106: Review Probability theory

CME 106: Review Probability theory : Probability theory Sven Schmit April 3, 2015 1 Overview In the first half of the course, we covered topics from probability theory. The difference between statistics and probability theory is the following:

More information

Probability Theory and Statistics. Peter Jochumzen

Probability Theory and Statistics. Peter Jochumzen Probability Theory and Statistics Peter Jochumzen April 18, 2016 Contents 1 Probability Theory And Statistics 3 1.1 Experiment, Outcome and Event................................ 3 1.2 Probability............................................

More information

Probability. Copier s Message. Schedules. This is a pure course.

Probability. Copier s Message. Schedules. This is a pure course. Probability This is a pure course Copier s Message These notes may contain errors In fact, they almost certainly do since they were just copied down by me during lectures and everyone makes mistakes when

More information

Part (A): Review of Probability [Statistics I revision]

Part (A): Review of Probability [Statistics I revision] Part (A): Review of Probability [Statistics I revision] 1 Definition of Probability 1.1 Experiment An experiment is any procedure whose outcome is uncertain ffl toss a coin ffl throw a die ffl buy a lottery

More information

Lecture 2: Review of Probability

Lecture 2: Review of Probability Lecture 2: Review of Probability Zheng Tian Contents 1 Random Variables and Probability Distributions 2 1.1 Defining probabilities and random variables..................... 2 1.2 Probability distributions................................

More information

Probability. Computer Science Tripos, Part IA. R.J. Gibbens. Computer Laboratory University of Cambridge. Easter Term 2008/9

Probability. Computer Science Tripos, Part IA. R.J. Gibbens. Computer Laboratory University of Cambridge. Easter Term 2008/9 Probability Computer Science Tripos, Part IA R.J. Gibbens Computer Laboratory University of Cambridge Easter Term 2008/9 Last revision: 2009-05-06/r-36 1 Outline Elementary probability theory (2 lectures)

More information

A D VA N C E D P R O B A B I L - I T Y

A D VA N C E D P R O B A B I L - I T Y A N D R E W T U L L O C H A D VA N C E D P R O B A B I L - I T Y T R I N I T Y C O L L E G E T H E U N I V E R S I T Y O F C A M B R I D G E Contents 1 Conditional Expectation 5 1.1 Discrete Case 6 1.2

More information

1 Generating functions

1 Generating functions 1 Generating functions Even quite straightforward counting problems can lead to laborious and lengthy calculations. These are greatly simplified by using generating functions. 2 Definition 1.1. Given a

More information

Quick Tour of Basic Probability Theory and Linear Algebra

Quick Tour of Basic Probability Theory and Linear Algebra Quick Tour of and Linear Algebra Quick Tour of and Linear Algebra CS224w: Social and Information Network Analysis Fall 2011 Quick Tour of and Linear Algebra Quick Tour of and Linear Algebra Outline Definitions

More information

18.440: Lecture 28 Lectures Review

18.440: Lecture 28 Lectures Review 18.440: Lecture 28 Lectures 18-27 Review Scott Sheffield MIT Outline Outline It s the coins, stupid Much of what we have done in this course can be motivated by the i.i.d. sequence X i where each X i is

More information

S n = x + X 1 + X X n.

S n = x + X 1 + X X n. 0 Lecture 0 0. Gambler Ruin Problem Let X be a payoff if a coin toss game such that P(X = ) = P(X = ) = /2. Suppose you start with x dollars and play the game n times. Let X,X 2,...,X n be payoffs in each

More information

MATH/STAT 3360, Probability Sample Final Examination Model Solutions

MATH/STAT 3360, Probability Sample Final Examination Model Solutions MATH/STAT 3360, Probability Sample Final Examination Model Solutions This Sample examination has more questions than the actual final, in order to cover a wider range of questions. Estimated times are

More information

STAT2201. Analysis of Engineering & Scientific Data. Unit 3

STAT2201. Analysis of Engineering & Scientific Data. Unit 3 STAT2201 Analysis of Engineering & Scientific Data Unit 3 Slava Vaisman The University of Queensland School of Mathematics and Physics What we learned in Unit 2 (1) We defined a sample space of a random

More information

SDS 321: Introduction to Probability and Statistics

SDS 321: Introduction to Probability and Statistics SDS 321: Introduction to Probability and Statistics Lecture 17: Continuous random variables: conditional PDF Purnamrita Sarkar Department of Statistics and Data Science The University of Texas at Austin

More information

1 Random Variable: Topics

1 Random Variable: Topics Note: Handouts DO NOT replace the book. In most cases, they only provide a guideline on topics and an intuitive feel. 1 Random Variable: Topics Chap 2, 2.1-2.4 and Chap 3, 3.1-3.3 What is a random variable?

More information

3 Multiple Discrete Random Variables

3 Multiple Discrete Random Variables 3 Multiple Discrete Random Variables 3.1 Joint densities Suppose we have a probability space (Ω, F,P) and now we have two discrete random variables X and Y on it. They have probability mass functions f

More information

MAS223 Statistical Inference and Modelling Exercises

MAS223 Statistical Inference and Modelling Exercises MAS223 Statistical Inference and Modelling Exercises The exercises are grouped into sections, corresponding to chapters of the lecture notes Within each section exercises are divided into warm-up questions,

More information

Basic Probability. Introduction

Basic Probability. Introduction Basic Probability Introduction The world is an uncertain place. Making predictions about something as seemingly mundane as tomorrow s weather, for example, is actually quite a difficult task. Even with

More information

Random Variables and Probability Distributions

Random Variables and Probability Distributions CHAPTER Random Variables and Probability Distributions Random Variables Suppose that to each point of a sample space we assign a number. We then have a function defined on the sample space. This function

More information

Joint Probability Distributions and Random Samples (Devore Chapter Five)

Joint Probability Distributions and Random Samples (Devore Chapter Five) Joint Probability Distributions and Random Samples (Devore Chapter Five) 1016-345-01: Probability and Statistics for Engineers Spring 2013 Contents 1 Joint Probability Distributions 2 1.1 Two Discrete

More information

LIST OF FORMULAS FOR STK1100 AND STK1110

LIST OF FORMULAS FOR STK1100 AND STK1110 LIST OF FORMULAS FOR STK1100 AND STK1110 (Version of 11. November 2015) 1. Probability Let A, B, A 1, A 2,..., B 1, B 2,... be events, that is, subsets of a sample space Ω. a) Axioms: A probability function

More information

2. Suppose (X, Y ) is a pair of random variables uniformly distributed over the triangle with vertices (0, 0), (2, 0), (2, 1).

2. Suppose (X, Y ) is a pair of random variables uniformly distributed over the triangle with vertices (0, 0), (2, 0), (2, 1). Name M362K Final Exam Instructions: Show all of your work. You do not have to simplify your answers. No calculators allowed. There is a table of formulae on the last page. 1. Suppose X 1,..., X 1 are independent

More information

2 Functions of random variables

2 Functions of random variables 2 Functions of random variables A basic statistical model for sample data is a collection of random variables X 1,..., X n. The data are summarised in terms of certain sample statistics, calculated as

More information

Stochastic Models of Manufacturing Systems

Stochastic Models of Manufacturing Systems Stochastic Models of Manufacturing Systems Ivo Adan Organization 2/47 7 lectures (lecture of May 12 is canceled) Studyguide available (with notes, slides, assignments, references), see http://www.win.tue.nl/

More information

Exercises and Answers to Chapter 1

Exercises and Answers to Chapter 1 Exercises and Answers to Chapter The continuous type of random variable X has the following density function: a x, if < x < a, f (x), otherwise. Answer the following questions. () Find a. () Obtain mean

More information

18.440: Lecture 28 Lectures Review

18.440: Lecture 28 Lectures Review 18.440: Lecture 28 Lectures 17-27 Review Scott Sheffield MIT 1 Outline Continuous random variables Problems motivated by coin tossing Random variable properties 2 Outline Continuous random variables Problems

More information

Probability Review. Gonzalo Mateos

Probability Review. Gonzalo Mateos Probability Review Gonzalo Mateos Dept. of ECE and Goergen Institute for Data Science University of Rochester gmateosb@ece.rochester.edu http://www.ece.rochester.edu/~gmateosb/ September 11, 2018 Introduction

More information

PROBABILITY VITTORIA SILVESTRI

PROBABILITY VITTORIA SILVESTRI PROBABILITY VITTORIA SILVESTRI Contents Preface. Introduction 2 2. Combinatorial analysis 5 3. Stirling s formula 8 4. Properties of Probability measures Preface These lecture notes are for the course

More information

Discrete Mathematics and Probability Theory Spring 2014 Anant Sahai Note 18

Discrete Mathematics and Probability Theory Spring 2014 Anant Sahai Note 18 EECS 7 Discrete Mathematics and Probability Theory Spring 214 Anant Sahai Note 18 A Brief Introduction to Continuous Probability Up to now we have focused exclusively on discrete probability spaces Ω,

More information

Practice Examination # 3

Practice Examination # 3 Practice Examination # 3 Sta 23: Probability December 13, 212 This is a closed-book exam so do not refer to your notes, the text, or any other books (please put them on the floor). You may use a single

More information

Northwestern University Department of Electrical Engineering and Computer Science

Northwestern University Department of Electrical Engineering and Computer Science Northwestern University Department of Electrical Engineering and Computer Science EECS 454: Modeling and Analysis of Communication Networks Spring 2008 Probability Review As discussed in Lecture 1, probability

More information

1.1 Review of Probability Theory

1.1 Review of Probability Theory 1.1 Review of Probability Theory Angela Peace Biomathemtics II MATH 5355 Spring 2017 Lecture notes follow: Allen, Linda JS. An introduction to stochastic processes with applications to biology. CRC Press,

More information

Lecture 2: Review of Basic Probability Theory

Lecture 2: Review of Basic Probability Theory ECE 830 Fall 2010 Statistical Signal Processing instructor: R. Nowak, scribe: R. Nowak Lecture 2: Review of Basic Probability Theory Probabilistic models will be used throughout the course to represent

More information

Stat 134 Fall 2011: Notes on generating functions

Stat 134 Fall 2011: Notes on generating functions Stat 3 Fall 0: Notes on generating functions Michael Lugo October, 0 Definitions Given a random variable X which always takes on a positive integer value, we define the probability generating function

More information

M378K In-Class Assignment #1

M378K In-Class Assignment #1 The following problems are a review of M6K. M7K In-Class Assignment # Problem.. Complete the definition of mutual exclusivity of events below: Events A, B Ω are said to be mutually exclusive if A B =.

More information

Random Variables and Their Distributions

Random Variables and Their Distributions Chapter 3 Random Variables and Their Distributions A random variable (r.v.) is a function that assigns one and only one numerical value to each simple event in an experiment. We will denote r.vs by capital

More information

Deep Learning for Computer Vision

Deep Learning for Computer Vision Deep Learning for Computer Vision Lecture 3: Probability, Bayes Theorem, and Bayes Classification Peter Belhumeur Computer Science Columbia University Probability Should you play this game? Game: A fair

More information

We introduce methods that are useful in:

We introduce methods that are useful in: Instructor: Shengyu Zhang Content Derived Distributions Covariance and Correlation Conditional Expectation and Variance Revisited Transforms Sum of a Random Number of Independent Random Variables more

More information

Math Bootcamp 2012 Miscellaneous

Math Bootcamp 2012 Miscellaneous Math Bootcamp 202 Miscellaneous Factorial, combination and permutation The factorial of a positive integer n denoted by n!, is the product of all positive integers less than or equal to n. Define 0! =.

More information

Lectures on Elementary Probability. William G. Faris

Lectures on Elementary Probability. William G. Faris Lectures on Elementary Probability William G. Faris February 22, 2002 2 Contents 1 Combinatorics 5 1.1 Factorials and binomial coefficients................. 5 1.2 Sampling with replacement.....................

More information

Recap of Basic Probability Theory

Recap of Basic Probability Theory 02407 Stochastic Processes? Recap of Basic Probability Theory Uffe Høgsbro Thygesen Informatics and Mathematical Modelling Technical University of Denmark 2800 Kgs. Lyngby Denmark Email: uht@imm.dtu.dk

More information

8 Laws of large numbers

8 Laws of large numbers 8 Laws of large numbers 8.1 Introduction We first start with the idea of standardizing a random variable. Let X be a random variable with mean µ and variance σ 2. Then Z = (X µ)/σ will be a random variable

More information

Multiple Random Variables

Multiple Random Variables Multiple Random Variables Joint Probability Density Let X and Y be two random variables. Their joint distribution function is F ( XY x, y) P X x Y y. F XY ( ) 1, < x

More information

Single Maths B: Introduction to Probability

Single Maths B: Introduction to Probability Single Maths B: Introduction to Probability Overview Lecturer Email Office Homework Webpage Dr Jonathan Cumming j.a.cumming@durham.ac.uk CM233 None! http://maths.dur.ac.uk/stats/people/jac/singleb/ 1 Introduction

More information

University of Regina. Lecture Notes. Michael Kozdron

University of Regina. Lecture Notes. Michael Kozdron University of Regina Statistics 252 Mathematical Statistics Lecture Notes Winter 2005 Michael Kozdron kozdron@math.uregina.ca www.math.uregina.ca/ kozdron Contents 1 The Basic Idea of Statistics: Estimating

More information

Random Variables. Cumulative Distribution Function (CDF) Amappingthattransformstheeventstotherealline.

Random Variables. Cumulative Distribution Function (CDF) Amappingthattransformstheeventstotherealline. Random Variables Amappingthattransformstheeventstotherealline. Example 1. Toss a fair coin. Define a random variable X where X is 1 if head appears and X is if tail appears. P (X =)=1/2 P (X =1)=1/2 Example

More information

Discrete Mathematics and Probability Theory Fall 2015 Note 20. A Brief Introduction to Continuous Probability

Discrete Mathematics and Probability Theory Fall 2015 Note 20. A Brief Introduction to Continuous Probability CS 7 Discrete Mathematics and Probability Theory Fall 215 Note 2 A Brief Introduction to Continuous Probability Up to now we have focused exclusively on discrete probability spaces Ω, where the number

More information

Math Introduction to Probability. Davar Khoshnevisan University of Utah

Math Introduction to Probability. Davar Khoshnevisan University of Utah Math 5010 1 Introduction to Probability Based on D. Stirzaker s book Cambridge University Press Davar Khoshnevisan University of Utah Lecture 1 1. The sample space, events, and outcomes Need a math model

More information

Chapter 2: Random Variables

Chapter 2: Random Variables ECE54: Stochastic Signals and Systems Fall 28 Lecture 2 - September 3, 28 Dr. Salim El Rouayheb Scribe: Peiwen Tian, Lu Liu, Ghadir Ayache Chapter 2: Random Variables Example. Tossing a fair coin twice:

More information

Lecture 16. Lectures 1-15 Review

Lecture 16. Lectures 1-15 Review 18.440: Lecture 16 Lectures 1-15 Review Scott Sheffield MIT 1 Outline Counting tricks and basic principles of probability Discrete random variables 2 Outline Counting tricks and basic principles of probability

More information

Introduction to Information Entropy Adapted from Papoulis (1991)

Introduction to Information Entropy Adapted from Papoulis (1991) Introduction to Information Entropy Adapted from Papoulis (1991) Federico Lombardo Papoulis, A., Probability, Random Variables and Stochastic Processes, 3rd edition, McGraw ill, 1991. 1 1. INTRODUCTION

More information

Classical Probability

Classical Probability Chapter 1 Classical Probability Probability is the very guide of life. Marcus Thullius Cicero The original development of probability theory took place during the seventeenth through nineteenth centuries.

More information

Chapter 1. Sets and probability. 1.3 Probability space

Chapter 1. Sets and probability. 1.3 Probability space Random processes - Chapter 1. Sets and probability 1 Random processes Chapter 1. Sets and probability 1.3 Probability space 1.3 Probability space Random processes - Chapter 1. Sets and probability 2 Probability

More information

Lecture 2: Repetition of probability theory and statistics

Lecture 2: Repetition of probability theory and statistics Algorithms for Uncertainty Quantification SS8, IN2345 Tobias Neckel Scientific Computing in Computer Science TUM Lecture 2: Repetition of probability theory and statistics Concept of Building Block: Prerequisites:

More information

Markov Chains CK eqns Classes Hitting times Rec./trans. Strong Markov Stat. distr. Reversibility * Markov Chains

Markov Chains CK eqns Classes Hitting times Rec./trans. Strong Markov Stat. distr. Reversibility * Markov Chains Markov Chains A random process X is a family {X t : t T } of random variables indexed by some set T. When T = {0, 1, 2,... } one speaks about a discrete-time process, for T = R or T = [0, ) one has a continuous-time

More information

Discrete Mathematics and Probability Theory Fall 2013 Vazirani Note 16. A Brief Introduction to Continuous Probability

Discrete Mathematics and Probability Theory Fall 2013 Vazirani Note 16. A Brief Introduction to Continuous Probability CS 7 Discrete Mathematics and Probability Theory Fall 213 Vazirani Note 16 A Brief Introduction to Continuous Probability Up to now we have focused exclusively on discrete probability spaces Ω, where the

More information