Probability. Prof. F.P. Kelly. Lent 1996. These notes are maintained by Paul Metcalfe. Comments and corrections to Paul Metcalfe.
Revision date: 2002/09/20 14:45:43. The following people have maintained these notes.
– June 2000: Kate Metcalfe
June 2000 – July 2004: Andrew Rogers
July 2004 – date: Paul Metcalfe
Contents
Introduction
1 Basic Concepts
1.1 Sample Space
1.2 Classical Probability
1.3 Combinatorial Analysis
1.4 Stirling's Formula
2 The Axiomatic Approach
2.1 The Axioms
2.2 Independence
2.3 Distributions
2.4 Conditional Probability
3 Random Variables
3.1 Expectation
3.2 Variance
3.3 Indicator Function
3.4 Inclusion-Exclusion Formula
3.5 Independence
4 Inequalities
4.1 Jensen's Inequality
4.2 Cauchy-Schwarz Inequality
4.3 Markov's Inequality
4.4 Chebyshev's Inequality
4.5 Law of Large Numbers
5 Generating Functions
5.1 Combinatorial Applications
5.2 Conditional Expectation
5.3 Properties of Conditional Expectation
5.4 Branching Processes
5.5 Random Walks
6 Continuous Random Variables
6.1 Jointly Distributed Random Variables
6.2 Transformation of Random Variables
6.3 Moment Generating Functions
6.4 Central Limit Theorem
6.5 Multivariate Normal Distribution
Introduction
These notes are based on the course Probability given by Prof. F.P. Kelly in Cambridge in the Lent Term 1996. This typed version of the notes is totally unconnected with Prof. Kelly.
Other sets of notes are available for different courses. At the time of typing these courses were: Probability, Discrete Mathematics, Analysis, Further Analysis, Methods, Quantum Mechanics, Fluid Dynamics 1, Quadratic Mathematics, Geometry, Dynamics of D.E.'s, Foundations of QM, Electrodynamics, Methods of Math. Phys., Fluid Dynamics 2, Waves (etc.), Statistical Physics, General Relativity, Dynamical Systems, Combinatorics, Bifurcations in Nonlinear Convection.
They may be downloaded from
Chapter 1 Basic Concepts
1.1 Sample Space
Suppose we have an experiment with a set Ω of outcomes. Then Ω is called the sample space. A potential outcome ω ∈ Ω is called a sample point. For instance, if the experiment is tossing coins, then Ω = {H, T}, or if the experiment was tossing two dice, then Ω = {(i, j) : i, j ∈ {1, ..., 6}}.
A subset A of Ω is called an event. An event A occurs if, when the experiment is performed, the outcome ω ∈ Ω satisfies ω ∈ A. For the coin-tossing experiment the event of a head appearing is A = {H}, and for the two dice the event "rolling a four" would be A = {(1, 3), (2, 2), (3, 1)}.
1.2 Classical Probability
If Ω is finite, Ω = {ω₁, ..., ωₙ}, and each of the n sample points is equally likely, then the probability of event A occurring is
P(A) = |A| / |Ω|.
Example. Choose r digits from a table of random numbers. Find the probability that, for 0 ≤ k ≤ 9,
1. no digit exceeds k,
2. k is the greatest digit drawn.
Solution. The event that no digit exceeds k is A_k = {(a₁, ..., a_r) : 0 ≤ aᵢ ≤ k, i = 1, ..., r}. Now |A_k| = (k + 1)^r, so that P(A_k) = ((k + 1)/10)^r.
Let B_k be the event that k is the greatest digit drawn. Then B_k = A_k \ A_{k−1}. Also A_{k−1} ⊂ A_k, so that |B_k| = (k + 1)^r − k^r. Thus
P(B_k) = ((k + 1)^r − k^r) / 10^r.
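The counts in this example can be confirmed by exhaustive enumeration. This is a sketch: the choice r = 3 is an arbitrary illustration, and the check verifies |A_k| = (k+1)^r and |B_k| = (k+1)^r − k^r directly.

```python
# Exhaustive check of the random-digits example:
# P(A_k) = ((k+1)/10)^r and P(B_k) = ((k+1)^r - k^r)/10^r.
from itertools import product

r = 3
tuples = list(product(range(10), repeat=r))  # all 10^r equally likely digit strings

for k in range(10):
    n_A = sum(1 for t in tuples if max(t) <= k)  # no digit exceeds k
    n_B = sum(1 for t in tuples if max(t) == k)  # k is the greatest digit drawn
    assert n_A == (k + 1) ** r
    assert n_B == (k + 1) ** r - k ** r
```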
The problem of the points
Players A and B play a series of games. The winner of a game wins a point. The two players are equally skilful and the stake will be won by the first player to reach a target. They are forced to stop when A is within 2 points and B within 3 points of the target. How should the stake be divided? Pascal suggested that the following 16 continuations were equally likely:
AAAA AAAB AABB ABBB BBBB AABA ABBA BABB ABAA ABAB BBAB BAAA BABA BBBA BAAB BBAA
A wins in 11 of these, so this makes the ratio 11 : 5. It was previously thought that the ratio should be 6 : 4 on considering termination, but these results are not equally likely.
1.3 Combinatorial Analysis
The fundamental rule is: suppose r experiments are such that the first may result in any of n₁ possible outcomes, and such that for each of the possible outcomes of the first i − 1 experiments there are nᵢ possible outcomes to experiment i. Let aᵢ be the outcome of experiment i. Then there are a total of ∏ᵢ₌₁ʳ nᵢ distinct r-tuples (a₁, ..., a_r) describing the possible outcomes of the r experiments.
Proof. Induction.
1.4 Stirling's Formula
For functions g(n) and h(n), we say that g is asymptotically equivalent to h, and write g(n) ∼ h(n), if g(n)/h(n) → 1 as n → ∞.
Theorem 1.1 (Stirling's Formula). As n → ∞,
log( n! / (√(2π) n^{n+1/2} e^{−n}) ) → 0,
and thus n! ∼ √(2πn) nⁿ e^{−n}.
We first prove the weak form of Stirling's formula, that log(n!) ∼ n log n.
Proof. log n! = Σ_{k=1}^{n} log k. Now ∫₁ⁿ log x dx ≤ Σ_{k=1}^{n} log k ≤ ∫₁^{n+1} log x dx, and ∫₁^z log x dx = z log z − z + 1, so
n log n − n + 1 ≤ log n! ≤ (n + 1) log(n + 1) − n.
Divide by n log n and let n → ∞ to sandwich log n! / (n log n) between terms that tend to 1. Therefore log n! ∼ n log n.
Now we prove the strong form.
Proof. For x > 0, we have
1 − x + x² − x³ < 1/(1 + x) < 1 − x + x².
Now integrate from 0 to y to obtain
y − y²/2 + y³/3 − y⁴/4 < log(1 + y) < y − y²/2 + y³/3.
Let h_n = log( n! eⁿ / n^{n+1/2} ). Then h_n − h_{n+1} = (n + 1/2) log(1 + 1/n) − 1, and applying the bounds above with y = 1/n we obtain
1/(12n²) − 1/(12n³) ≤ h_n − h_{n+1} ≤ 1/(12n²) + 1/(6n³).
In particular, for n ≥ 2, 0 ≤ h_n − h_{n+1} ≤ 1/n². Thus h_n is a decreasing sequence, and
0 ≤ h₂ − h_{n+1} = Σ_{r=2}^{n} (h_r − h_{r+1}) ≤ Σ_{r=2}^{∞} 1/r² < ∞.
Therefore h_n is bounded below and decreasing, so is convergent. Let the limit be A. We have obtained n! ∼ e^A n^{n+1/2} e^{−n}.
We need a trick to find A. Let I_r = ∫₀^{π/2} sinʳ θ dθ. We obtain the recurrence I_r = ((r − 1)/r) I_{r−2} by integrating by parts. Therefore
I_{2n} = ((2n)! / (2ⁿ n!)²) (π/2)  and  I_{2n+1} = (2ⁿ n!)² / (2n + 1)!.
Now I_n is decreasing, so 1 ≤ I_{2n}/I_{2n+1} ≤ I_{2n−1}/I_{2n+1} = 1 + 1/(2n). But by substituting our formula in (playing silly buggers with log(1 + 1/n)), we get that
I_{2n}/I_{2n+1} = (π/2) · 2(2n + 1)/(e^{2A} n) → 2π/e^{2A}.
Therefore e^{2A} = 2π as required, and n! ∼ √(2πn) nⁿ e^{−n}.
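Stirling's formula can be sanity-checked numerically. The following sketch computes the difference log n! − log(√(2πn) nⁿ e⁻ⁿ), which should shrink towards 0 (classically at rate roughly 1/(12n)); the sample values of n are arbitrary.

```python
# Numerical check of Stirling's formula: the difference
# log n! - log(sqrt(2*pi*n) * n^n * e^-n) tends to 0 as n grows.
import math

def log_stirling(n):
    return 0.5 * math.log(2 * math.pi * n) + n * math.log(n) - n

# math.log accepts arbitrarily large integers, so factorial(1000) is fine
diffs = [math.log(math.factorial(n)) - log_stirling(n) for n in (10, 100, 1000)]
```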
Chapter 2 The Axiomatic Approach
2.1 The Axioms
Let Ω be a sample space. Then probability P is a real-valued function defined on subsets of Ω satisfying:
1. 0 ≤ P(A) ≤ 1 for A ⊂ Ω,
2. P(Ω) = 1,
3. for a finite or infinite sequence A₁, A₂, ... ⊂ Ω of disjoint events, P(⋃ᵢ Aᵢ) = Σᵢ P(Aᵢ).
The number P(A) is called the probability of event A.
We can look at some distributions here. Consider an arbitrary finite or countable Ω = {ω₁, ω₂, ...} and an arbitrary collection {p₁, p₂, ...} of non-negative numbers with sum 1. If we define
P(A) = Σ_{i : ωᵢ ∈ A} pᵢ,
it is easy to see that this function satisfies the axioms. The numbers p₁, p₂, ... are called a probability distribution. If Ω is finite with n elements, and if p₁ = p₂ = ... = pₙ = 1/n, we recover the classical definition of probability.
Another example would be to let Ω = {0, 1, ...} and attach to outcome r the probability p_r = e^{−λ} λʳ/r! for some λ > 0. This is a distribution (as may be easily verified), and is called the Poisson distribution with parameter λ.
Theorem 2.1 (Properties of P). A probability P satisfies
1. P(Aᶜ) = 1 − P(A),
2. P(∅) = 0,
3. if A ⊂ B then P(A) ≤ P(B),
4. P(A ∪ B) = P(A) + P(B) − P(A ∩ B).
Proof. Note that Ω = A ∪ Aᶜ, and A ∩ Aᶜ = ∅. Thus P(Ω) = P(A) + P(Aᶜ) = 1. Now we can use this to obtain P(∅) = 1 − P(∅ᶜ) = 0. If A ⊂ B, write B = A ∪ (B ∩ Aᶜ), so that P(B) = P(A) + P(B ∩ Aᶜ) ≥ P(A). Finally, write A ∪ B = A ∪ (B ∩ Aᶜ) and B = (B ∩ A) ∪ (B ∩ Aᶜ). Then P(A ∪ B) = P(A) + P(B ∩ Aᶜ) and P(B) = P(B ∩ A) + P(B ∩ Aᶜ), which gives the result.
Theorem 2.2 (Boole's Inequality). For any A₁, A₂, ... ⊂ Ω,
P(⋃ᵢ₌₁ⁿ Aᵢ) ≤ Σᵢ₌₁ⁿ P(Aᵢ)  and  P(⋃ᵢ₌₁^∞ Aᵢ) ≤ Σᵢ₌₁^∞ P(Aᵢ).
Proof. Let B₁ = A₁ and then inductively let Bᵢ = Aᵢ \ ⋃_{k<i} B_k. Thus the Bᵢ's are disjoint and ⋃ᵢ Bᵢ = ⋃ᵢ Aᵢ. Therefore
P(⋃ᵢ Aᵢ) = P(⋃ᵢ Bᵢ) = Σᵢ P(Bᵢ) ≤ Σᵢ P(Aᵢ),
as Bᵢ ⊂ Aᵢ.
Theorem 2.3 (Inclusion-Exclusion Formula).
P(⋃ᵢ₌₁ⁿ Aᵢ) = Σ_{S ⊂ {1,...,n}, S ≠ ∅} (−1)^{|S|−1} P(⋂_{j∈S} Aⱼ).
Proof. We know that P(A₁ ∪ A₂) = P(A₁) + P(A₂) − P(A₁ ∩ A₂). Thus the result is true for n = 2. We also have that
P(A₁ ∪ ... ∪ Aₙ) = P(A₁ ∪ ... ∪ A_{n−1}) + P(Aₙ) − P((A₁ ∪ ... ∪ A_{n−1}) ∩ Aₙ).
But by distributivity, we have
P(⋃ᵢ₌₁ⁿ Aᵢ) = P(⋃ᵢ₌₁^{n−1} Aᵢ) + P(Aₙ) − P(⋃ᵢ₌₁^{n−1} (Aᵢ ∩ Aₙ)).
Application of the inductive hypothesis yields the result.
Corollary (Bonferroni Inequalities).
P(⋃ᵢ₌₁ⁿ Aᵢ) ≥ or ≤ Σ_{S ⊂ {1,...,n}, 1 ≤ |S| ≤ r} (−1)^{|S|−1} P(⋂_{j∈S} Aⱼ)
according as r is even or odd. Or in other words, if the inclusion-exclusion formula is truncated, the error has the sign of the omitted term and is smaller in absolute value. Note that the case r = 1 is Boole's inequality.
Proof. The result is true for n = 2. If true for n − 1, then it is true for n and r < n by the inductive step above, which expresses an n-fold union in terms of two (n−1)-fold unions. It is true for r = n by the inclusion-exclusion formula.
Example (Derangements). After a dinner, the n guests take coats at random from a pile. Find the probability that at least one guest has the right coat.
Solution. Let A_k be the event that guest k has his¹ own coat. We want P(⋃ᵢ₌₁ⁿ Aᵢ). Now
P(A_{i₁} ∩ ... ∩ A_{i_r}) = (n − r)!/n!,
by counting the number of ways of matching guests and coats after i₁, ..., i_r have taken theirs. Thus
Σ_{i₁ < ... < i_r} P(A_{i₁} ∩ ... ∩ A_{i_r}) = C(n, r)(n − r)!/n! = 1/r!,
and the required probability is
P(⋃ᵢ₌₁ⁿ Aᵢ) = 1 − 1/2! + 1/3! − ... + (−1)^{n−1}/n!,
which tends to 1 − e^{−1} as n → ∞.
Furthermore, let P_m(n) be the probability that exactly m guests take the right coat. Then P₀(n) → e^{−1}, and n! P₀(n) is the number of derangements of n objects. Therefore
P_m(n) = C(n, m) P₀(n − m)(n − m)!/n! = P₀(n − m)/m! → e^{−1}/m!  as n → ∞.
2.2 Independence
Definition 2.1. Two events A and B are said to be independent if
P(A ∩ B) = P(A) P(B).
More generally, a collection of events Aᵢ, i ∈ I, are independent if
P(⋂_{i∈J} Aᵢ) = ∏_{i∈J} P(Aᵢ)
for all finite subsets J ⊂ I.
Example. Two fair dice are thrown. Let A₁ be the event that the first die shows an odd number. Let A₂ be the event that the second die shows an odd number and finally let A₃ be the event that the sum of the two numbers is odd. Are A₁ and A₂ independent? Are A₁ and A₃ independent? Are A₁, A₂ and A₃ independent?
¹ I'm not being sexist, merely a lazy typist. Sex will be assigned at random...
Solution. We first calculate the probabilities of the events A₁, A₂, A₃, A₁ ∩ A₂, A₁ ∩ A₃ and A₁ ∩ A₂ ∩ A₃.
Event / Probability:
A₁ : 1/2
A₂ : 1/2
A₃ : 1/2
A₁ ∩ A₂ : 1/4
A₁ ∩ A₃ : 1/4
A₁ ∩ A₂ ∩ A₃ : 0
Thus by a series of multiplications, we can see that A₁ and A₂ are independent, A₁ and A₃ are independent (also A₂ and A₃), but that A₁, A₂ and A₃ are not independent.
Now we wish to state what we mean by two independent experiments². Consider Ω₁ = {α₁, ...} and Ω₂ = {β₁, ...} with associated probability distributions {p₁, ...} and {q₁, ...}. Then, by two independent experiments, we mean the sample space Ω₁ × Ω₂ with probability distribution P((αᵢ, βⱼ)) = pᵢ qⱼ.
Now suppose A ⊂ Ω₁ and B ⊂ Ω₂. The event A can be interpreted as an event in Ω₁ × Ω₂, namely A × Ω₂, and similarly for B. Then
P(A ∩ B) = Σ_{αᵢ∈A} Σ_{βⱼ∈B} pᵢ qⱼ = Σ_{αᵢ∈A} pᵢ · Σ_{βⱼ∈B} qⱼ = P(A) P(B),
which is why they are called independent experiments. The obvious generalisation to n experiments can be made, but for an infinite sequence of experiments we mean a sample space Ω₁ × Ω₂ × ... satisfying the appropriate formula for all n ∈ N.
You might like to find the probability that n independent tosses of a biased coin with probability of heads p results in a total of r heads.
2.3 Distributions
The binomial distribution with parameters n and p, 0 ≤ p ≤ 1, has Ω = {0, ..., n} and probabilities pᵢ = C(n, i) pⁱ (1 − p)^{n−i}.
Theorem 2.4 (Poisson approximation to binomial). If n → ∞ and p → 0 with np = λ held fixed, then
C(n, r) pʳ (1 − p)^{n−r} → e^{−λ} λʳ/r!.
² or more generally, n.
Proof.
C(n, r) pʳ (1 − p)^{n−r} = (n(n−1)···(n−r+1)/r!) pʳ (1 − p)^{n−r}
= (n/n)((n−1)/n)···((n−r+1)/n) · (λʳ/r!) · (1 − λ/n)^{n−r}    [since p = λ/n]
→ 1 · (λʳ/r!) · e^{−λ} = e^{−λ} λʳ/r!.
Suppose an infinite sequence of independent trials is to be performed. Each trial results in a success with probability p ∈ (0, 1) or a failure with probability 1 − p. Such a sequence is called a sequence of Bernoulli trials. The probability that the first success occurs after exactly r failures is p_r = p(1 − p)ʳ. This is the geometric distribution with parameter p. Since Σ_{r=0}^∞ p_r = 1, the probability that all trials result in failure is zero.
2.4 Conditional Probability
Definition 2.2. Provided P(B) > 0, we define the conditional probability of A given B³ to be
P(A | B) = P(A ∩ B)/P(B).
Whenever we write P(A | B), we assume that P(B) > 0. Note that if A and B are independent then P(A | B) = P(A).
Theorem 2.5.
1. P(A ∩ B) = P(A | B) P(B),
2. P(A ∩ B ∩ C) = P(A | B ∩ C) P(B | C) P(C),
3. P(A | B ∩ C) = P(A ∩ B | C)/P(B | C),
4. the function P(· | B) restricted to subsets of B is a probability function on B.
Proof. Results 1 to 3 are immediate from the definition of conditional probability. For result 4, note that A ∩ B ⊂ B, so P(A ∩ B) ≤ P(B) and thus P(A | B) ≤ 1. P(B | B) = 1 (obviously), so it just remains to show the last axiom. For disjoint Aᵢ's,
P(⋃ᵢ Aᵢ | B) = P(⋃ᵢ (Aᵢ ∩ B))/P(B) = Σᵢ P(Aᵢ ∩ B)/P(B) = Σᵢ P(Aᵢ | B),
as required.
³ read "A given B".
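The Poisson approximation to the binomial proved above is easy to check numerically. This sketch fixes λ = 2 (an arbitrary choice) and watches the worst pointwise error shrink as n grows with np = λ held fixed.

```python
# Binomial(n, lam/n) pmf vs Poisson(lam) pmf as n grows with np = lam fixed.
from math import comb, exp, factorial

def binom_pmf(n, p, r):
    return comb(n, r) * p ** r * (1 - p) ** (n - r)

def poisson_pmf(lam, r):
    return exp(-lam) * lam ** r / factorial(r)

lam = 2.0
errs = []
for n in (10, 100, 1000):
    p = lam / n
    errs.append(max(abs(binom_pmf(n, p, r) - poisson_pmf(lam, r)) for r in range(10)))
# errs decreases roughly like 1/n
```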
Theorem 2.6 (Law of total probability). Let B₁, B₂, ... be a partition of Ω. Then
P(A) = Σᵢ P(A | Bᵢ) P(Bᵢ).
Proof. Σᵢ P(A | Bᵢ) P(Bᵢ) = Σᵢ P(A ∩ Bᵢ) = P(⋃ᵢ (A ∩ Bᵢ)) = P(A), as required.
Example (Gambler's Ruin). A fair coin is tossed repeatedly. At each toss a gambler wins 1 if a head shows and loses 1 if tails. He continues playing until his capital reaches m or he goes broke. Find p_x, the probability that he goes broke if his initial capital is x.
Solution. Let A be the event that he goes broke before reaching m, and let H or T be the outcome of the first toss. We condition on the first toss to get P(A) = P(A | H) P(H) + P(A | T) P(T). But P(A | H) = p_{x+1} and P(A | T) = p_{x−1}. Thus we obtain the recurrence
p_x = (p_{x+1} + p_{x−1})/2, i.e. p_{x+1} − p_x = p_x − p_{x−1}.
Thus p_x is linear in x, with p₀ = 1 and p_m = 0, giving p_x = 1 − x/m.
Theorem 2.7 (Bayes' Formula). Let B₁, B₂, ... be a partition of Ω. Then
P(Bᵢ | A) = P(A | Bᵢ) P(Bᵢ) / Σⱼ P(A | Bⱼ) P(Bⱼ).
Proof. P(Bᵢ | A) = P(A | Bᵢ) P(Bᵢ)/P(A) = P(A | Bᵢ) P(Bᵢ) / Σⱼ P(A | Bⱼ) P(Bⱼ), by the law of total probability.
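The Gambler's Ruin recurrence can be solved numerically and compared with the closed form p_x = 1 − x/m. This is a sketch; the target m = 10 is an arbitrary illustration, and the fixed-point iteration simply repeats the averaging equation until it converges.

```python
# Solve p_x = (p_{x+1} + p_{x-1})/2 with p_0 = 1, p_m = 0 by iteration,
# then compare with the linear solution p_x = 1 - x/m.
m = 10
p = [0.0] * (m + 1)
p[0] = 1.0

for _ in range(10000):          # iterate the averaging equations to convergence
    for x in range(1, m):
        p[x] = 0.5 * (p[x + 1] + p[x - 1])

for x in range(m + 1):
    assert abs(p[x] - (1 - x / m)) < 1e-9
```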
17 Chapter 3 Random Variables Let Ω be finite or countable, and let p ω P({ω}) for ω Ω. Definition 3.. A random variable X is a function X : Ω R. Note that random variable is a somewhat inaccurate term, a random variable is neither random nor a variable. Example. If Ω {(i, j), i, j t}, then we can define random variables X and Y by X(i, j) i + j and Y (i, j) max{i, j} Let R X be the image of Ω under X. When the range is finite or countable then the random variable is said to be discrete. We write P(X x i ) for ω:x(ω)x i p ω, and for B R P(X B) x B P(X x). Then (P(X x), x R X ) is the distribution of the random variable X. Note that it is a probability distribution over R X. 3. Expectation Definition 3.2. The expectation of a random variable X is the number E[X] ω Ω p w X(ω) provided that this sum converges absolutely.
18 2 CHAPTER 3. RANDOM VARIABLES Note that E[X] p w X(ω) ω Ω p ω X(ω) x R X ω:x(ω)x x R X x ω:x(ω)x p ω x R X xp(x x). Absolute convergence allows the sum to be taken in any order. If X is a positive random variable and if ω Ω p ωx(ω) we write E[X] +. If x R X x 0 xp(x x) and xp(x x) x R X x<0 then E[X] is undefined. Example. If P(X r) e λ λr r!, then E[X] λ. Solution. E[X] r0 λ λr re r! λe λ r λ r (r )! λe λ e λ λ Example. If P(X r) ( n r) p r ( p) n r then E[X] np.
19 3.. EXPECTATION 3 Solution. E[X] n ( ) n rp r ( p) n r r r0 n n! r r!(n r)! pr ( p) n r r0 n (n )! n (r )!(n r)! pr ( p) n r np r n r (n )! (r )!(n r)! pr ( p) n r n (n )! np (r)!(n r)! pr ( p) n r r n ( ) n np p r ( p) n r r np r For any function f : R R the composition of f and X defines a new random variable f and X defines the new random variable f(x) given by f(x)(w) f(x(w)). Example. If a, b and c are constants, then a + bx and (X c) 2 are random variables defined by Note that E[X] is a constant. Theorem 3... If X 0 then E[X] 0. (a + bx)(w) a + bx(w) and (X c) 2 (w) (X(w) c) If X 0 and E[X] 0 then P(X 0). 3. If a and b are constants then E[a + bx] a + be[x]. 4. For any random variables X, Y then E[X + Y ] E[X] + E[Y ]. [ 5. E[X] is the constant which minimises E (X c) 2]. Proof.. X 0 means X w 0 w Ω So E[X] ω Ω p ω X(ω) 0 2. If ω Ω with p ω > 0 and X(ω) > 0 then E[X] > 0, therefore P(X 0).
20 4 CHAPTER 3. RANDOM VARIABLES 3. E[a + bx] (a + bx(ω)) p ω ω Ω a p ω + b p ω X(ω) ω Ω ω Ω a + E[X]. 4. Trivial. 5. Now E [ (X c) 2] E [ (X E[X] + E[X] c) 2] E [ [(X E[X]) 2] + 2(X E[X])(E[X] c) + [(E[X] c)] 2 ] E [ (X E[X]) 2] + 2(E[X] c)e[(x E[X])] + (E[X] c) 2 E [ (X E[X]) 2] + (E[X] c) 2. This is clearly minimised when c E[X]. Theorem 3.2. For any random variables X, X 2,..., X n [ n ] n E X i E[X i ] i i Proof. [ n ] [ n ] E X i E X i + X n i E i [ n ] X i + E[X] i Result follows by induction. 3.2 Variance Var X E [ X 2] E[X] 2 Standard Deviation Var X E[X E[X]] 2 σ 2 for Random Variable X Theorem 3.3. Properties of Variance (i) Var X 0 if Var X 0, then P(X E[X]) Proof - from property of expectation (ii) If a, b constants, Var (a + bx) b 2 Var X
Proof. Var(a + bX) = E[(a + bX − a − bE[X])²] = b² E[(X − E[X])²] = b² Var X.
(iii) Var X = E[X²] − E[X]².
Proof. E[(X − E[X])²] = E[X² − 2X E[X] + E[X]²] = E[X²] − 2E[X] E[X] + E[X]² = E[X²] − E[X]².
Example. Let X have the geometric distribution P(X = r) = pqʳ with r = 0, 1, 2, ... and p + q = 1. Then E[X] = q/p and Var X = q/p².
Solution.
E[X] = Σ_{r=0}^∞ r p qʳ = pq Σ_{r=0}^∞ d/dq (qʳ) = pq d/dq (1/(1 − q)) = pq(1 − q)^{−2} = q/p.
E[X²] = Σ_{r=0}^∞ r² p qʳ = pq Σ_{r=0}^∞ (r(r + 1) − r) q^{r−1} = pq (2/(1 − q)³ − 1/(1 − q)²) = 2q/p² − q/p.
Var X = E[X²] − E[X]² = 2q/p² − q/p − q²/p² = q/p².
Definition 3.3. The covariance of random variables X and Y is
Cov(X, Y) = E[(X − E[X])(Y − E[Y])].
The correlation of X and Y is
Corr(X, Y) = Cov(X, Y) / √(Var X · Var Y).
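The geometric-distribution example above (E[X] = q/p, Var X = q/p²) can be checked by summing the series numerically. This is a sketch: the parameter 0.3 is an arbitrary choice, and the series is truncated where the tail is negligible.

```python
# Check E[X] = q/p and Var X = q/p^2 for P(X = r) = p q^r, r = 0, 1, 2, ...
p_succ = 0.3                      # arbitrary illustrative parameter
q = 1 - p_succ
pmf = [p_succ * q ** r for r in range(2000)]  # tail beyond 2000 is negligible

mean = sum(r * pr for r, pr in enumerate(pmf))
var = sum(r * r * pr for r, pr in enumerate(pmf)) - mean ** 2
assert abs(mean - q / p_succ) < 1e-9
assert abs(var - q / p_succ ** 2) < 1e-9
```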
22 6 CHAPTER 3. RANDOM VARIABLES Linear Regression Theorem 3.4. Var (X + Y ) Var X + Var Y + 2Cov(X, Y ) Proof. Var (X + Y ) E [ (X + Y ) 2 E[X] E[Y ] ] 2 E [ (X E[X]) 2 + (Y E[Y ]) 2 + 2(X E[X])(Y E[Y ]) ] Var X + Var Y + 2Cov(X, Y ) 3.3 Indicator Function Definition 3.4. The Indicator Function I[A] of an event A Ω is the function I[A](w) {, if ω A; 0, if ω / A. (3.) NB that I[A] is a random variable. E[I[A]] P(A) E[I[A]] p ω I[A](w) ω Ω P(A) 2. I[A c ] I[A] 3. I[A B] I[A]I[B] 4. I[A B] I[A] + I[B] I[A]I[B] I[A B](ω) if ω A or ω B I[A B](ω) I[A](ω) + I[B](ω) I[A]I[B](ω) WORKS! Example. n couples are arranged randomly around a table such that males and females alternate. Let N The number of husbands sitting next to their wives. Calculate
E[N] and Var N.
Solution. Let Aᵢ be the event that couple i are seated together. Then N = Σᵢ₌₁ⁿ I[Aᵢ], and P(Aᵢ) = 2/n (the wife occupies one of the n female seats, 2 of which are adjacent to her husband). Thus
E[N] = Σᵢ₌₁ⁿ E[I[Aᵢ]] = Σᵢ₌₁ⁿ 2/n = 2.
E[N²] = E[(Σᵢ₌₁ⁿ I[Aᵢ])²] = E[Σᵢ Σⱼ I[Aᵢ] I[Aⱼ]] = n E[I[A₁]²] + n(n − 1) E[I[A₁] I[A₂]].
Now E[I[A₁]²] = E[I[A₁]] = 2/n, and
E[I[A₁] I[A₂]] = P(A₁ ∩ A₂) = P(A₁) P(A₂ | A₁) = (2/n) · (1 · 1 + (n − 2) · 2)/(n − 1)² = (2/n)(2n − 3)/(n − 1)²
(given A₁, wife 2 sits in the one free seat adjacent to husband 1 with probability 1/(n − 1), leaving 1 free place adjacent to her, and elsewhere with probability (n − 2)/(n − 1), leaving 2). Therefore
E[N²] = 2 + n(n − 1)(2/n)(2n − 3)/(n − 1)² = 2 + 2(2n − 3)/(n − 1),
Var N = E[N²] − E[N]² = 2(2n − 3)/(n − 1) − 2 = 2(n − 2)/(n − 1).
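The values E[N] = 2 and Var N = 2(n − 2)/(n − 1) (as reconstructed above) can be confirmed exactly by exhaustive enumeration for small n. This sketch fixes the husbands by symmetry and permutes only the wives: woman-seat j sits between the seats of man j and man (j + 1) mod n.

```python
# Exact moments of N (couples sitting together) by enumerating wife seatings.
from fractions import Fraction
from itertools import permutations

def moments(n):
    total = ev = ev2 = 0
    for sigma in permutations(range(n)):      # sigma[j] = wife at woman-seat j
        # woman-seat j is adjacent to man j and man (j + 1) mod n
        N = sum(1 for j in range(n) if sigma[j] in (j, (j + 1) % n))
        total += 1
        ev += N
        ev2 += N * N
    mean = Fraction(ev, total)
    var = Fraction(ev2, total) - mean ** 2
    return mean, var

for n in (3, 4, 5, 6):
    mean, var = moments(n)
    assert mean == 2
    assert var == Fraction(2 * (n - 2), n - 1)
```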
24 8 CHAPTER 3. RANDOM VARIABLES 3.4 Inclusion - Exclusion Formula ( N N A i A c i [ N ] [( N I A i I I ) c A c i [ N ) c ] A c i ] N I[A c i] N ( I[A i ]) N I[A i ] i i 2 I[A ]I[A 2 ] ( ) j+ I[A ]I[A 2 ]...I[A j ] +... i i 2... i j Take Expectation [ N ] ( N ) E A i P A i N P(A i ) i i 2 P(A A 2 ) ( ) j+ i i 2... i j P ( Ai A i2... A ij ) Independence Definition 3.5. Discrete random variables X,..., X n are independent if and only if for any x...x n : P(X x, X 2 x 2...X n x n ) N P(X i x i ) Theorem 3.5 (Preservation of Independence). If X,..., X n are independent random variables and f, f 2...f n are functions R R then f (X )...f n (X n ) are independent random variables
25 3.5. INDEPENDENCE 9 Proof. P(f (X ) y,..., f n (X n ) y n ) x :f (X )y,... x n:f n(x n)y n N P(X i x i ) x i:f i(x i)y i N P(f i (X i ) y i ) P(X x,..., X n x n ) Theorem 3.6. If X...X n are independent random variables then: [ N ] E X i N E[X i ] NOTE that E[ X i ] E[X i ] without requiring independence. Proof. Write R i for R Xi the range of X i [ N ] E X i... x..x n P(X x, X 2 x 2..., X n x n ) x R x n R n ( ) N P(X i x i ) x i R i N E[X i ] Theorem 3.7. If X,..., X n are independent random variables and f...f n are function R R then: [ N ] N E f i (X i ) E[f i (X i )] Proof. Obvious from last two theorems! Theorem 3.8. If X,..., X n are independent random variables then: ( n ) n Var X i Var X i i i
26 20 CHAPTER 3. RANDOM VARIABLES Proof. ( n ) ( n ) 2 [ n ] 2 Var X i E X i E X i i i i E Xi 2 + [ n ] 2 X i X j E X i i i j i E [ X 2 ] i + E[X i X j ] E[X i ] 2 E[X i ] E[X j ] i i j i i j ( E [ X 2 ] i E[Xi ] 2) i i Var X i Theorem 3.9. If X,..., X n are independent identically distributed random variables then Proof. Var Var ( n ( n ) n X i n Var X i i ) n X i n 2 Var X i i n n 2 Var X i i n Var X i Example. Experimental Design. Two rods of unknown lengths a, b. A rule can measure the length but with but with error having 0 mean (unbiased) and variance σ 2. Errors independent from measurement to measurement. To estimate a, b we could take separate measurements A, B of each rod. E[A] a Var A σ 2 E[B] b Var B σ 2
Can we do better? Yes. Measure a + b as X and a − b as Y. Then
E[X] = a + b, Var X = σ², E[Y] = a − b, Var Y = σ²,
and since X and Y are independent measurements,
E[(X + Y)/2] = a, Var((X + Y)/2) = σ²/2,
E[(X − Y)/2] = b, Var((X − Y)/2) = σ²/2.
So this is better: two measurements estimate both rods with half the variance.
Example (Non-standard dice). Dice can be numbered non-standardly so that, arranged around a cycle, each die beats the next with probability P(A beats B) = 2/3. You choose a die first, then I choose one: whichever you pick, I can pick a die that beats it. So the relation "A better than B" is not transitive.
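The non-transitive dice can be made concrete. The transcription does not preserve the notes' own numbering, so the faces below are an assumption: they are Efron's classic four dice, in which each die beats the next around the cycle with probability exactly 2/3.

```python
# Efron's non-transitive dice: A > B > C > D > A, each with probability 2/3.
from fractions import Fraction
from itertools import product

dice = {
    "A": [4, 4, 4, 4, 0, 0],
    "B": [3, 3, 3, 3, 3, 3],
    "C": [6, 6, 2, 2, 2, 2],
    "D": [5, 5, 5, 1, 1, 1],
}

def p_beats(x, y):
    wins = sum(1 for a, b in product(dice[x], dice[y]) if a > b)
    return Fraction(wins, 36)

cycle = ["A", "B", "C", "D", "A"]
probs = [p_beats(cycle[i], cycle[i + 1]) for i in range(4)]
assert all(pr == Fraction(2, 3) for pr in probs)
```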
Chapter 4 Inequalities
4.1 Jensen's Inequality
A function f : (a, b) → R is convex if
f(px + (1 − p)y) ≤ p f(x) + (1 − p) f(y)  for all x, y ∈ (a, b) and p ∈ (0, 1).
It is strictly convex if strict inequality holds whenever x ≠ y. f is concave if −f is convex, and strictly concave if −f is strictly convex.
[Figures in the original sketch convex, concave, and neither-convex-nor-concave functions.]
We know that if f is twice differentiable and f″(x) ≥ 0 for x ∈ (a, b) then f is convex, and strictly convex if f″(x) > 0 for x ∈ (a, b).
Example. f(x) = −log x is strictly convex on (0, ∞): f′(x) = −1/x, f″(x) = 1/x² > 0.
Example. f(x) = −x log x is strictly concave: f′(x) = −(1 + log x), f″(x) = −1/x < 0.
Example. f(x) = x³ is strictly convex on (0, ∞) but not on (−∞, ∞).
Theorem 4.1. Let f : (a, b) → R be a convex function. Then
Σᵢ₌₁ⁿ pᵢ f(xᵢ) ≥ f(Σᵢ₌₁ⁿ pᵢ xᵢ)
for all x₁, ..., xₙ ∈ (a, b) and p₁, ..., pₙ ∈ (0, 1) with Σpᵢ = 1; that is, E[f(X)] ≥ f(E[X]). Furthermore, if f is strictly convex then equality holds if and only if all the xᵢ are equal.
Proof. By induction on n. For n = 1 there is nothing to prove, and n = 2 is the definition of convexity. Assume the result holds up to n − 1. Consider x₁, ..., xₙ ∈ (a, b) and p₁, ..., pₙ ∈ (0, 1) with Σpᵢ = 1. For i = 2, ..., n set pᵢ′ = pᵢ/(1 − p₁), so that Σᵢ₌₂ⁿ pᵢ′ = 1. Then by the inductive hypothesis twice, first for n − 1, then for 2,
Σᵢ₌₁ⁿ pᵢ f(xᵢ) = p₁ f(x₁) + (1 − p₁) Σᵢ₌₂ⁿ pᵢ′ f(xᵢ)
≥ p₁ f(x₁) + (1 − p₁) f(Σᵢ₌₂ⁿ pᵢ′ xᵢ)
≥ f(p₁ x₁ + (1 − p₁) Σᵢ₌₂ⁿ pᵢ′ xᵢ) = f(Σᵢ₌₁ⁿ pᵢ xᵢ).
If f is strictly convex, n ≥ 3 and not all the xᵢ are equal, then we may assume not all of x₂, ..., xₙ are equal. But then
Σᵢ₌₂ⁿ pᵢ′ f(xᵢ) > f(Σᵢ₌₂ⁿ pᵢ′ xᵢ),
so the inequality is strict.
Corollary (AM/GM Inequality). For positive real numbers x₁, ..., xₙ,
(∏ᵢ₌₁ⁿ xᵢ)^{1/n} ≤ (1/n) Σᵢ₌₁ⁿ xᵢ.
Equality holds if and only if x₁ = x₂ = ... = xₙ.
Proof. Let P(X = xᵢ) = 1/n. Then f(x) = −log x is a convex function on (0, ∞), so by Jensen's inequality E[f(X)] ≥ f(E[X]), that is
−E[log X] ≥ −log E[X].  [1]
Therefore (1/n) Σ log xᵢ ≤ log((1/n) Σ xᵢ), and exponentiating,
(∏ᵢ₌₁ⁿ xᵢ)^{1/n} ≤ (1/n) Σᵢ₌₁ⁿ xᵢ.  [2]
For strictness: since f is strictly convex, equality holds in [1], and hence in [2], if and only if x₁ = x₂ = ... = xₙ.
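The AM/GM inequality can be checked numerically on random positive tuples (a sketch; the tuple length and sampling range are arbitrary, and the seed just makes the run reproducible).

```python
# Numerical check of AM/GM: geometric mean <= arithmetic mean.
import math
import random

random.seed(1)
for _ in range(100):
    xs = [random.uniform(0.1, 10.0) for _ in range(5)]
    gm = math.prod(xs) ** (1 / len(xs))
    am = sum(xs) / len(xs)
    assert gm <= am + 1e-12      # small slack for floating-point error

# equality when all the x_i are equal
assert math.isclose(math.prod([3.0] * 5) ** (1 / 5), 3.0)
```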
32 26 CHAPTER 4. INEQUALITIES If f : (a, b) R is a convex function then it can be shown that at each point y (a, b) a linear function α y + β y x such that f(x) α y + β y x x (a, b) f(y) α y + β y y If f is differentiable at y then the linear function is the tangent f(y) + (x y)f (y) Let y E[X], α α y and β β y f (E[X]) α + βe[x] So for any random variable X taking values in (a, b) E[f(X)] E[α + βx] α + βe[x] f (E[X]) 4.2 Cauchy-Schwarz Inequality Theorem 4.2. For any random variables X, Y, Proof. For a, b R Let E[XY ] 2 E [ X 2] E [ Y 2] LetZ ax by Then0 E [ Z 2] E [ (ax by ) 2] a 2 E [ X 2] 2abE[XY ] + b 2 E [ Y 2] quadratic in a with at most one real root and therefore has discriminant 0.
33 4.3. MARKOV S INEQUALITY 27 Take b 0 E[XY ] 2 E [ X 2] E [ Y 2] Corollary. Corr(X, Y ) Proof. Apply Cauchy-Schwarz to the random variables X E[X] and Y E[Y ] 4.3 Markov s Inequality Theorem 4.3. If X is any random variable with finite mean then, Proof. Let P( X a) E[ X ] a for any a 0 A X a T hen X ai[a] Take expectation E[ X ] ap(a) E[ X ] ap( X a) 4.4 Chebyshev s Inequality Theorem 4.4. Let X be a random variable with E [ X 2]. Then ɛ 0 Proof. P( X ɛ) E[ X 2] ɛ 2 I[ X ɛ] x2 ɛ 2 x
Then I[|X| ≥ ɛ] ≤ X²/ɛ². Taking expectations,
P(|X| ≥ ɛ) = E[I[|X| ≥ ɛ]] ≤ E[X²]/ɛ².
Note
1. The result is distribution-free: no assumption is made about the distribution of X (other than E[X²] < ∞).
2. It is the best possible inequality, in the following sense. Let X = +ɛ with probability c/(2ɛ²), X = −ɛ with probability c/(2ɛ²), and X = 0 with probability 1 − c/ɛ². Then P(|X| ≥ ɛ) = c/ɛ² and E[X²] = c, so P(|X| ≥ ɛ) = c/ɛ² = E[X²]/ɛ².
3. If µ = E[X], then applying the inequality to X − µ gives
P(|X − µ| ≥ ɛ) ≤ Var X/ɛ².
Often the most useful form.
4.5 Law of Large Numbers
Theorem 4.5 (Weak law of large numbers). Let X₁, X₂, ... be a sequence of independent identically distributed random variables with mean µ and variance σ². Let Sₙ = Σᵢ₌₁ⁿ Xᵢ. Then for all ɛ > 0,
P(|Sₙ/n − µ| ≥ ɛ) → 0  as n → ∞.
Proof. By Chebyshev's inequality,
P(|Sₙ/n − µ| ≥ ɛ) ≤ E[(Sₙ/n − µ)²]/ɛ² = E[(Sₙ − nµ)²]/(n²ɛ²) = Var Sₙ/(n²ɛ²),
since E[Sₙ] = nµ. But Var Sₙ = nσ², so
P(|Sₙ/n − µ| ≥ ɛ) ≤ nσ²/(n²ɛ²) = σ²/(nɛ²) → 0.
Example. A₁, A₂, ... are independent events, each with probability p. Let Xᵢ = I[Aᵢ]. Then
Sₙ/n = (number of times A occurs)/(number of trials),
and µ = E[I[Aᵢ]] = P(Aᵢ) = p. The theorem states that
P(|Sₙ/n − p| ≥ ɛ) → 0  as n → ∞,
which recovers the intuitive definition of probability.
Example. A random sample of size n is a sequence X₁, X₂, ..., Xₙ of independent identically distributed random variables ("n observations"). X̄ = (1/n) Σᵢ₌₁ⁿ Xᵢ is called the sample mean. The theorem states that, provided the variance of Xᵢ is finite, the probability that the sample mean differs from the mean of the distribution by more than ɛ approaches 0 as n → ∞.
We have shown the weak law of large numbers. Why "weak"? There is a strong form of the law of large numbers:
P(Sₙ/n → µ as n → ∞) = 1.
This is NOT the same as the weak form. What does this mean? Each ω ∈ Ω determines Sₙ/n, n = 1, 2, ..., as a sequence of real numbers; hence it either tends to µ or it doesn't. The strong law states that
P({ω : Sₙ(ω)/n → µ as n → ∞}) = 1.
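The weak law is easy to see in simulation. This sketch uses fair-coin indicators (µ = 1/2); the tolerance ɛ = 0.05, the trial counts and the seed are all arbitrary choices, and the empirical deviation frequency should fall sharply as n grows.

```python
# Simulation of the weak law: P(|S_n/n - mu| >= eps) shrinks with n.
import random

random.seed(42)
mu, eps = 0.5, 0.05

def deviation_freq(n, trials=200):
    """Empirical frequency of |S_n/n - mu| >= eps over independent repetitions."""
    bad = 0
    for _ in range(trials):
        s = sum(1 for _ in range(n) if random.random() < mu)
        if abs(s / n - mu) >= eps:
            bad += 1
    return bad / trials

f_small, f_large = deviation_freq(20), deviation_freq(2000)
```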
Chapter 5 Generating Functions
In this chapter, assume that X is a random variable taking values in the range 0, 1, 2, .... Let p_r = P(X = r), r = 0, 1, 2, ....
Definition 5.1. The probability generating function (p.g.f.) of the random variable X, or of the distribution (p_r, r = 0, 1, 2, ...), is
p(z) = E[z^X] = Σ_{r=0}^∞ z^r P(X = r) = Σ_{r=0}^∞ p_r z^r.
This p(z) is a polynomial or a power series. If a power series, then it is convergent for |z| ≤ 1 by comparison with a geometric series:
|p(z)| ≤ Σ_r p_r |z|^r ≤ Σ_r p_r = 1.
Example. For a fair die, p_r = 1/6 for r = 1, ..., 6, and
p(z) = E[z^X] = (1/6)(z + z² + ... + z⁶) = (z/6)(1 − z⁶)/(1 − z).
Theorem 5.1. The distribution of X is uniquely determined by the p.g.f. p(z).
Proof. We know that we can differentiate p(z) term by term for |z| < 1. Repeated differentiation gives p′(z) = p₁ + 2p₂z + ..., and so p′(0) = p₁ (and p(0) = p₀). In general
p^{(i)}(z) = Σ_{r≥i} (r!/(r − i)!) p_r z^{r−i},
and hence p^{(i)}(0) = i! pᵢ. Thus we can recover p₀, p₁, ... from p(z).
38 32 CHAPTER 5. GENERATING FUNCTIONS Theorem 5.2 (Abel s Lemma). Proof. p (z) E[X] lim z p (z) rp r z r z ri For z (0, ), p (z) is a non decreasing function of z and is bounded above by Choose ɛ 0, N large enough that Then True ɛ 0 and so z ri E[X] rp r ri N rp r E[X] ɛ ri N N lim rp r z r lim rp r z r rp r z ri E[X] lim z p (z) ri Usually p (z) is continuous at z, then E[X] p (). ( Recall p(z) z z 6 ) 6 z Theorem 5.3. Proof. E[X(X )] lim z p (z) p (z) Proof now the same as Abel s Lemma r(r )pz r 2 r2 Theorem 5.4. Suppose that X, X 2,..., X n are independent random variables with p.g.f s p (z), p 2 (z),..., p n (z). Then the p.g.f of X + X X n is Proof. p (z)p 2 (z)... p n (z) E [ z X+X2+...Xn] E [ z X.z X2....z Xn] E [ z X] E [ z X2]... E [ z Xn] p (z)p 2 (z)... p n (z)
39 33 Example. Suppose X has Poisson Distribution Then Let s calculate the variance of X Then λ λr P(X r) e r! E [ z X] r0 r 0,,... z r λ λr e r! e λ e λz e λ( z) p λe λ( z) p λ 2 e λ( z) E[X] lim z p (z) p ()( Since p (z) continuous at z )E[X] λ E[X(X )] p () λ 2 Var X E [ X 2] E[X] 2 E[X(X )] + E[X] E[X] 2 λ 2 + λ λ 2 λ Example. Suppose that Y has a Poisson Distribution with parameter µ. If X and Y are independent then: E [ z X+Y ] E [ z X] E [ z Y ] e λ( z) e µ( z) e (λ+µ)( z) But this is the p.g.f of a Poisson random variable with parameter λ + µ. By uniqueness (first theorem of the p.g.f) this must be the distribution for X + Y Example. X has a binomial Distribution, ( n P(X r) r E [ z X] n r0 ) p r ( p) n r r 0,,... ( ) n p r ( p) n r z r r (pz + p) n This shows that X Y + Y Y n. Where Y + Y Y n are independent random variables each with P(Y i ) p P(Y i 0) p Note if the p.g.f factorizes look to see if the random variable can be written as a sum.
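The p.g.f. argument above shows that the sum of independent Poisson(λ) and Poisson(µ) variables is Poisson(λ + µ). That can be cross-checked by direct convolution of the distributions (a sketch; the truncation length 60 and the parameters 2 and 3 are arbitrary).

```python
# Convolution check: Poisson(2) + Poisson(3) has the Poisson(5) distribution.
from math import exp, factorial, isclose

def poisson(lam, n=60):
    return [exp(-lam) * lam ** r / factorial(r) for r in range(n)]

def convolve(p, q):
    """Distribution of X + Y for independent X ~ p, Y ~ q (same truncation)."""
    n = len(p)
    return [sum(p[i] * q[r - i] for i in range(r + 1)) for r in range(n)]

s = convolve(poisson(2.0), poisson(3.0))
target = poisson(5.0)
assert all(isclose(a, b, rel_tol=1e-9) for a, b in zip(s[:40], target[:40]))
```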
5.1 Combinatorial Applications
Tile a (2 × n) bathroom with (2 × 1) tiles. How many ways fₙ can this be done? Considering the left-most edge, either one vertical tile or two horizontal tiles can be placed there, so
fₙ = f_{n−1} + f_{n−2},  f₀ = f₁ = 1.
Let F(z) = Σ_{n=0}^∞ fₙ zⁿ. Then Σ_{n≥2} fₙ zⁿ = Σ_{n≥2} f_{n−1} zⁿ + Σ_{n≥2} f_{n−2} zⁿ, that is
F(z) − f₀ − z f₁ = z(F(z) − f₀) + z² F(z),
F(z)(1 − z − z²) = f₀(1 − z) + z f₁ = 1.
Since f₀ = f₁ = 1, F(z) = 1/(1 − z − z²). Let α₁ = (1 + √5)/2 and α₂ = (1 − √5)/2, so that 1 − z − z² = (1 − α₁z)(1 − α₂z). Then
F(z) = 1/((1 − α₁z)(1 − α₂z)) = (1/(α₁ − α₂)) (α₁/(1 − α₁z) − α₂/(1 − α₂z)).
The coefficient of zⁿ, that is fₙ, is
fₙ = (α₁^{n+1} − α₂^{n+1})/(α₁ − α₂).
5.2 Conditional Expectation
Let X and Y be random variables with joint distribution P(X = x, Y = y). Then the distribution of X is
P(X = x) = Σ_{y ∈ R_Y} P(X = x, Y = y).
This is often called the marginal distribution for X. The conditional distribution of X given Y = y is
P(X = x | Y = y) = P(X = x, Y = y)/P(Y = y).
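The closed form for the tiling numbers extracted from the generating function above can be checked against the recurrence directly (a sketch; the range of n is arbitrary, and rounding absorbs floating-point error in the powers of the golden ratio).

```python
# Tiling counts: recurrence f_n = f_{n-1} + f_{n-2} vs the closed form
# f_n = (a1^{n+1} - a2^{n+1}) / (a1 - a2), a1,2 = (1 +- sqrt 5)/2.
from math import sqrt

def f_rec(n):
    a, b = 1, 1                   # f_0 = f_1 = 1
    for _ in range(n):
        a, b = b, a + b
    return a

def f_closed(n):
    a1 = (1 + sqrt(5)) / 2
    a2 = (1 - sqrt(5)) / 2
    return (a1 ** (n + 1) - a2 ** (n + 1)) / (a1 - a2)

for n in range(20):
    assert round(f_closed(n)) == f_rec(n)
```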
41 5.3. PROPERTIES OF CONDITIONAL EXPECTATION 35 Definition 5.2. The conditional expectation of X given Y y is, E[X x Y y] x R x xp(x x Y y) The conditional Expectation of X given Y is the random variable E[X Y ] defined by Thus E[X Y ] : Ω R E[X Y ] (ω) E[X Y Y (ω)] Example. Let X, X 2,..., X n be independent identically distributed random variables with P(X ) p and P(X 0) p. Let Then Y X + X X n P(X Y r) P(X, Y r) P(Y r) P(X, X X n r ) P(Y r) P(X ) P(X X n r ) P(Y r) p( ) n r p r ( p) n r ) pr ( p) n r ( n r ( n r ( n r) ) r n Then E[X Y r] 0 P(X 0 Y r) + P(X Y r) r n E[X Y Y (ω)] n Y (ω) Therefore E[X Y ] n Y Note a random variable - a function of Y. 5.3 Properties of Conditional Expectation Theorem 5.5. E[E[X Y ]] E[X]
42 36 CHAPTER 5. GENERATING FUNCTIONS Proof. E[E[X Y ]] P(Y y) E[X Y y] y R y y y E[X] P(Y y) P(X x Y y) x R x xp(x x Y y) x Theorem 5.6. If X and Y are independent then E[X Y ] E[X] Proof. If X and Y are independent then for any y R y E[X Y y] xp(x x Y y) xp(x x) E[X] x R x x Example. Let X, X 2,... be i.i.d.r.v s with p.g.f p(z). Let N be a random variable independent of X, X 2,... with p.g.f h(z). What is the p.g.f of: X + X X N E [ z X+,...,Xn] E [ E [ z X+,...,Xn N ]] P(N n) E [ z X+,...,Xn N n ] n0 P(N n) (p(z)) n n0 h(p(z)) Then for example E[X +,..., X n ] d dz h(p(z)) z h ()p () E[N] E[X ] Exercise Calculate d2 dz 2 h(p(z)) and hence Var X +,..., X n In terms of Var N and Var X
43 5.4. BRANCHING PROCESSES Branching Processes X 0, X... sequence of random variables. X n number of individuals in the n th generation of population. Assume.. X 0 2. Each individual lives for unit time then on death produces k offspring, probability f k. f k 3. All offspring behave independently. Assume Where Y n i. f f 0 + f X n+ Y n + Y n Y n n are i.i.d.r.v s. Y n i number of offspring of individual i in generation n. Let F(z) be the probability generating function ofy n i. Let F (z) f k z k E [ z Xi] ] E [z Y n i n0 F n (z) E [ z Xn] Then F (z) F (z) the probability generating function of the offspring distribution. Theorem 5.7. F n (z) is an n-fold iterative formula. F n+ (z) F n (F (z)) F (F (... (F (z))... ))
44 38 CHAPTER 5. GENERATING FUNCTIONS Proof. F n+ (z) E [ z Xn+] E [ E [ ]] z Xn+ X n P(X n k) E [ z Xn+ X n k ] n0 n0 n0 ] P(X n k) E [z Y n +Y n 2 + +Y n n ] [ ] P(X n k) E [z Y n... E z Y n n P(X n k) (F (z)) k n0 F n (F (z)) Theorem 5.8. Mean and Variance of population size If m and σ 2 kf k k0 (k m) 2 f k k0 Mean and Variance of offspring distribution. Then E[X n ] m n Var X n { σ 2 m n (m n ) m, m nσ 2, m (5.) Proof. Prove by calculating F (z), F (z) Alternatively E[X n ] E[E[X n X n ]] E[m X n ] me[x n ] m n by induction E [ (X n mx n ) 2] E [ E [ (X n mx n ) 2 X n ]] E[Var (X n X n )] E [ σ 2 X n ] σ 2 m n Thus E [ Xn] 2 2mE[Xn X n ] + m 2 E [ Xn 2 ] 2 σ 2 m n
Now calculate
E[X_n X_{n-1}] = E[E[X_n X_{n-1} | X_{n-1}]] = E[X_{n-1} E[X_n | X_{n-1}]] = E[m X_{n-1}²] = m E[X_{n-1}²].
Then E[X_n²] = σ² m^(n-1) + m² E[X_{n-1}²], and so
Var X_n = E[X_n²] - E[X_n]²
        = m² E[X_{n-1}²] + σ² m^(n-1) - m² E[X_{n-1}]²
        = m² Var X_{n-1} + σ² m^(n-1)
        = m⁴ Var X_{n-2} + σ² (m^(n-1) + m^n)
        = ... = m^(2(n-1)) Var X_1 + σ² (m^(n-1) + m^n + ... + m^(2n-3))
        = σ² m^(n-1) (1 + m + ... + m^(n-1)).

To deal with extinction we need to be careful with limits as n → ∞. Let A_n = {X_n = 0}, the event that extinction has occurred by generation n, and let A = ∪_{n=1}^∞ A_n, the event that extinction ever occurs. Can we calculate P(A) from P(A_n)?

More generally, let (A_n) be an increasing sequence of events, A_1 ⊆ A_2 ⊆ ..., and define
A = lim_{n→∞} A_n = ∪_{n=1}^∞ A_n.
Define B_1 = A_1 and, for n ≥ 2,
B_n = A_n ∩ (∪_{i=1}^{n-1} A_i)^c = A_n ∩ A_{n-1}^c.
The events B_n, n ≥ 1, are disjoint, and
∪_{i=1}^∞ B_i = ∪_{i=1}^∞ A_i,   ∪_{i=1}^n B_i = ∪_{i=1}^n A_i = A_n.
Thus
P(∪_{i=1}^∞ A_i) = P(∪_{i=1}^∞ B_i) = Σ_{i=1}^∞ P(B_i)
                 = lim_{n→∞} Σ_{i=1}^n P(B_i) = lim_{n→∞} P(∪_{i=1}^n B_i) = lim_{n→∞} P(A_n),
that is,
P(lim_{n→∞} A_n) = lim_{n→∞} P(A_n):
probability is a continuous set function. Thus
P(extinction ever occurs) = lim_{n→∞} P(A_n) = lim_{n→∞} P(X_n = 0) = q, say.
Note that P(X_n = 0), n = 1, 2, 3, ..., is an increasing sequence, so the limit exists. But P(X_n = 0) = F_n(0), where F_n is the p.g.f. of X_n, so
q = lim_{n→∞} F_n(0).
Also
F(q) = F(lim_{n→∞} F_n(0)) = lim_{n→∞} F(F_n(0)) = lim_{n→∞} F_{n+1}(0) = q,
since F is continuous. Thus F(q) = q; q is called the extinction probability.
Alternative derivation:
q = Σ_k P(X_1 = k) P(extinction | X_1 = k) = Σ_k P(X_1 = k) q^k = F(q).

Theorem 5.9. The probability of extinction, q, is the smallest positive root of the equation F(q) = q. Let m be the mean of the offspring distribution: if m ≤ 1 then q = 1, while if m > 1 then q < 1.

Proof. We have F'(1) = m = Σ_{k=0}^∞ k f_k, and for 0 ≤ z < 1,
F''(z) = Σ_j j(j - 1) f_j z^(j-2) ≥ 0,
so F is increasing and convex on [0, 1]; assuming f_0 + f_1 < 1, it is strictly convex. Also F(0) = f_0 ≥ 0 and F(1) = 1. Thus if m ≤ 1 there does not exist a q ∈ (0, 1) with F(q) = q. If m > 1, let α be the smallest positive root of F(z) = z; then α < 1. Further, since F is increasing,
F(0) ≤ F(α) = α,  F(F(0)) ≤ F(α) = α,  ...,  F_n(0) ≤ α for all n,
so q = lim_{n→∞} F_n(0) ≤ α. Hence q = α, since q is itself a root of F(z) = z and α is the smallest positive one.
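Theorem 5.9, together with q = lim F_n(0), gives a practical recipe for computing the extinction probability: iterate q ← F(q) starting from q = 0. A sketch (the offspring distributions below are illustrative choices, not from the notes; f_0 = 1/4, f_1 = 1/4, f_2 = 1/2 gives m = 5/4 > 1, and F(q) = q then has roots 1/2 and 1):

```python
def extinction_probability(f, iterations=200):
    """Extinction probability of a branching process whose offspring
    distribution is f[k] = f_k.  Iterates q <- F(q) from q = 0, since
    F_n(0) = P(X_n = 0) increases to the smallest positive root of F(q) = q."""
    def F(z):
        return sum(fk * z**k for k, fk in enumerate(f))
    q = 0.0
    for _ in range(iterations):
        q = F(q)
    return q

# Illustrative offspring distributions:
q_super = extinction_probability([0.25, 0.25, 0.5])   # m = 1.25 > 1
q_sub = extinction_probability([0.5, 0.25, 0.25])     # m = 0.75 <= 1
print(q_super)  # near 0.5, the smallest positive root of 1/4 + q/4 + q^2/2 = q
print(q_sub)    # near 1.0, as Theorem 5.9 predicts when m <= 1
```

The iteration converges geometrically (at rate F'(q) < 1 in the supercritical case), so a couple of hundred steps is far more than enough here.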
5.5 Random Walks

Let X_1, X_2, ... be i.i.d.r.v.s, and let
S_n = S_0 + X_1 + X_2 + ... + X_n,
where usually S_0 = 0. Then (S_n, n = 0, 1, 2, ...) is a 1-dimensional random walk. We shall assume
X_n = { +1, with probability p; -1, with probability q = 1 - p. }   (5.2)
This is a simple random walk. If p = q = 1/2 then the random walk is called symmetric.

Example (Gambler's Ruin). You have an initial fortune of A and I have an initial fortune of B. We toss coins repeatedly; I win with probability p and you win with probability q. What is the probability that I bankrupt you before you bankrupt me?
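The question can be probed empirically before it is solved analytically. A simulation sketch with illustrative stakes A = 4, B = 6 and a fair coin: tracking my fortune, the walk starts at z = B = 6 and is absorbed at 0 or a = A + B = 10, and the analysis that follows gives P(hit a before 0) = z/a = 0.6 in the fair case.

```python
import random

random.seed(1)

# Gambler's ruin with illustrative stakes: my fortune starts at z = B = 6,
# absorbing barriers at 0 and a = A + B = 10, fair coin p = 1/2.
def i_bankrupt_you(z, a, p):
    # One play of the game: returns True if the walk hits a before 0.
    while 0 < z < a:
        z += 1 if random.random() < p else -1
    return z == a

trials = 100_000
wins = sum(i_bankrupt_you(6, 10, 0.5) for _ in range(trials))
print(wins / trials)  # for a fair coin the exact answer is z/a = 0.6
```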
Set a = A + B and z = B, and stop the random walk starting at z when it hits 0 or a. Let p_z be the probability that the random walk hits a before it hits 0, starting from z, and let q_z be the probability that it hits 0 before a. After the first step the gambler's fortune is either z - 1 or z + 1, with probabilities q and p respectively, so by the law of total probability
p_z = q p_{z-1} + p p_{z+1},  0 < z < a,
with p_0 = 0 and p_a = 1. Trying p_z = t^z, we must solve p t² - t + q = 0:
t = (1 ± √(1 - 4pq)) / 2p = 1 or q/p.
For p ≠ q the general solution is p_z = A + B (q/p)^z, and the boundary conditions give
p_z = (1 - (q/p)^z) / (1 - (q/p)^a),  p ≠ q.
If p = q, the general solution is A + Bz, giving
p_z = z/a,  p = q.
To calculate q_z, observe that this is the same problem with p, q, z replaced by q, p, a - z respectively. Thus
q_z = ((q/p)^a - (q/p)^z) / ((q/p)^a - 1)  if p ≠ q,
or q_z = (a - z)/a if p = q. Thus q_z + p_z = 1, and so, as we expected, the game ends with probability one.

What happens as a → ∞? A path that hits 0 before it hits a certainly hits 0, so
P(hits 0 ever) = lim_{a→∞} P(hits 0 before a) = lim_{a→∞} q_z = { (q/p)^z, p > q; 1, p ≤ q. }

Let G be the ultimate gain or loss:
G = { a - z, with probability p_z; -z, with probability q_z. }   (5.3)
Then
E[G] = { a p_z - z, if p ≠ q; 0, if p = q. }   (5.4)
A fair game remains fair: if the coin is fair then games based on it have expected reward 0.

Duration of a Game. Let D_z be the expected time until the random walk hits 0 or a, starting from z. Is D_z finite? D_z is bounded above by a times the mean of a geometric random variable (the number of windows of size a before a window containing all +1's or all -1's), hence D_z is finite. Conditioning on the first step,
E[duration] = E[E[duration | first step]]
            = p (E[duration | first step up]) + q (E[duration | first step down])
            = p (1 + D_{z+1}) + q (1 + D_{z-1}),
so
D_z = 1 + p D_{z+1} + q D_{z-1}.
This equation holds for 0 < z < a, with D_0 = D_a = 0. Let's try for a particular solution D_z = Cz:
Cz = C p (z + 1) + C q (z - 1) + 1,  giving C = 1/(q - p) for p ≠ q.
Consider the homogeneous relation p t² - t + q = 0, with roots t_1 = 1 and t_2 = q/p. The general solution for p ≠ q is therefore
D_z = A + B (q/p)^z + z/(q - p).
Substituting z = 0 and z = a to find A and B gives
D_z = z/(q - p) - (a/(q - p)) (1 - (q/p)^z) / (1 - (q/p)^a),  p ≠ q.
If p = q then a particular solution is -z², so the general solution is D_z = -z² + A + Bz, and substituting the boundary conditions gives
D_z = z(a - z),  p = q.

Example. [A table of numerical values of P(ruin), E[gain] and E[duration], for various choices of the initial capital z, the barrier a and the probabilities p and q, appeared here.]

Now stop the random walk when it hits 0 or a: we have absorption at 0 and a. Let
U_{z,n} = P(random walk hits 0 at time n | starts at z).
Then
U_{z,n+1} = p U_{z+1,n} + q U_{z-1,n},  0 < z < a, n ≥ 0,
with
U_{0,n} = U_{a,n} = 0 for n > 0,  U_{0,0} = 1,  U_{z,0} = 0 for 0 < z ≤ a.
Let U_z(s) = Σ_{n=0}^∞ U_{z,n} s^n. Multiplying the recurrence by s^(n+1) and adding for n = 0, 1, 2, ... gives
U_z(s) = p s U_{z+1}(s) + q s U_{z-1}(s),
where U_0(s) = 1 and U_a(s) = 0. Look for a solution of the form U_z(s) = (λ(s))^z.
Then λ(s) must satisfy
λ(s) = p s (λ(s))² + q s.
This quadratic has two roots,
λ_1(s), λ_2(s) = (1 ± √(1 - 4pqs²)) / 2ps,
and every solution is of the form
U_z(s) = A(s) (λ_1(s))^z + B(s) (λ_2(s))^z.
Substituting U_0(s) = 1 and U_a(s) = 0 gives A(s) + B(s) = 1 and A(s)(λ_1(s))^a + B(s)(λ_2(s))^a = 0, so
U_z(s) = ((λ_1(s))^a (λ_2(s))^z - (λ_1(s))^z (λ_2(s))^a) / ((λ_1(s))^a - (λ_2(s))^a).
But λ_1(s) λ_2(s) = q/p (recall the product of the roots of the quadratic), so
U_z(s) = (q/p)^z ((λ_1(s))^(a-z) - (λ_2(s))^(a-z)) / ((λ_1(s))^a - (λ_2(s))^a).
The same method gives the generating function for the absorption probabilities at the other barrier. The generating function for the duration of the game is the sum of these two generating functions.
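The closed forms for p_z and D_z, and the absorption-time recurrence for U_{z,n}, can all be cross-checked numerically. A sketch (the parameter values are illustrative):

```python
def p_hit_a(z, a, p):
    # P(walk started at z hits a before 0): (1 - (q/p)^z)/(1 - (q/p)^a), or z/a if p = q.
    q = 1 - p
    return z / a if p == q else (1 - (q/p)**z) / (1 - (q/p)**a)

def duration(z, a, p):
    # Expected time to absorption D_z; D_z = z(a - z) when p = q.
    q = 1 - p
    if p == q:
        return z * (a - z)
    return z/(q - p) - (a/(q - p)) * (1 - (q/p)**z) / (1 - (q/p)**a)

a, p = 10, 0.3          # illustrative barrier and win probability
q = 1 - p

# The closed forms satisfy the recurrences derived in the text.
for z in range(1, a):
    assert abs(p_hit_a(z, a, p) - (q*p_hit_a(z - 1, a, p) + p*p_hit_a(z + 1, a, p))) < 1e-9
    assert abs(duration(z, a, p) - (1 + p*duration(z + 1, a, p) + q*duration(z - 1, a, p))) < 1e-9

# Absorption-time probabilities U_{z,n} = P(first hit 0 at time n | start at z):
# U_{z,n+1} = p U_{z+1,n} + q U_{z-1,n}, with U_{0,0} = 1 the only mass at the barriers.
nmax = 400
U = [[0.0] * (nmax + 1) for _ in range(a + 1)]
U[0][0] = 1.0
for n in range(nmax):
    for z in range(1, a):
        U[z][n + 1] = p * U[z + 1][n] + q * U[z - 1][n]

total = sum(U[4][n] for n in range(nmax + 1))
print(total)  # should be near q_4 = 1 - p_hit_a(4, a, p)
```

Summing U_{z,n} over n recovers the ruin probability q_z, which is the generating-function identity U_z(1) = q_z in numerical form.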
Chapter 6: Continuous Random Variables

In this chapter we drop the assumption that Ω is finite or countable. Assume we are given a probability P on some subset of Ω. For example, spin a pointer and let ω ∈ Ω give the position at which it stops, with Ω = {ω : 0 ≤ ω ≤ 2π}. Let
P(ω ∈ [0, θ]) = θ/2π  (0 ≤ θ ≤ 2π).

Definition 6.1. A continuous random variable X is a function X : Ω → R for which
P(a ≤ X(ω) ≤ b) = ∫_a^b f(x) dx,
where f(x) is a function satisfying
1. f(x) ≥ 0,
2. ∫_{-∞}^{+∞} f(x) dx = 1.
The function f is called the probability density function (p.d.f.).

For example, if X(ω) = ω, the position of the pointer, then X is a continuous random variable with p.d.f.
f(x) = { 1/2π, 0 ≤ x ≤ 2π; 0, otherwise. }   (6.1)
This is an example of a uniformly distributed random variable, on the interval [0, 2π] in this case.

Intuition about probability density functions is based on the approximate relation
P(X ∈ [x, x + δx]) ≈ f(x) δx.
Proofs, however, more often use the distribution function
F(x) = P(X ≤ x),
which is increasing in x. If X is a continuous random variable then
F(x) = ∫_{-∞}^x f(z) dz,
and so F is continuous and differentiable, with
F'(x) = f(x)
(at any point x where the fundamental theorem of calculus applies). The distribution function is also defined for a discrete random variable,
F(x) = Σ_{ω : X(ω) ≤ x} p_ω,
in which case F is a step function. In either case
P(a < X ≤ b) = P(X ≤ b) - P(X ≤ a) = F(b) - F(a).
Example (the exponential distribution). Let
F(x) = { 1 - e^(-λx), 0 ≤ x < ∞; 0, x < 0. }   (6.2)
The corresponding p.d.f. is
f(x) = λ e^(-λx),  0 ≤ x < ∞;
this is known as the exponential distribution with parameter λ. If X has this distribution then
P(X > x + z | X > z) = P(X > x + z)/P(X > z) = e^(-λ(x+z))/e^(-λz) = e^(-λx) = P(X > x).
This is known as the memoryless property of the exponential distribution.

Theorem 6.1. If X is a continuous random variable with p.d.f. f(x), and h(x) is a continuous, strictly increasing function with h^(-1)(x) differentiable, then h(X) is a continuous random variable with p.d.f.
f_h(x) = f(h^(-1)(x)) d/dx h^(-1)(x).

Proof. The distribution function of h(X) is
P(h(X) ≤ x) = P(X ≤ h^(-1)(x)) = F(h^(-1)(x)),
since h is strictly increasing and F is the distribution function of X. Then
d/dx P(h(X) ≤ x) = f(h^(-1)(x)) d/dx h^(-1)(x),
so h(X) is a continuous random variable with p.d.f. f_h as claimed. Note: it is usually easier to repeat this proof than to remember the result.
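Both the exponential distribution and its memoryless property are easy to probe by simulation. The sketch below samples X = -log(1 - U)/λ with U ~ U[0, 1]: since P(X ≤ x) = P(U ≤ 1 - e^(-λx)) = 1 - e^(-λx), this produces exactly the exponential distribution (this is the inverse-transform idea of Theorem 6.2 below). The parameter values are illustrative.

```python
import math
import random

random.seed(2)

# Sample the exponential distribution with parameter lam via X = -log(1 - U)/lam,
# U uniform on [0, 1) (inverse-transform sampling; lam = 2.0 is illustrative).
lam = 2.0
n = 200_000
xs = [-math.log(1.0 - random.random()) / lam for _ in range(n)]

mean = sum(xs) / n
print(mean)  # should be near E[X] = 1/lam = 0.5

# Memoryless property: P(X > x + z | X > z) should match P(X > x) = e^(-lam*x).
x, z = 0.3, 0.5
tail_z = [v for v in xs if v > z]
cond = sum(1 for v in tail_z if v > x + z) / len(tail_z)
print(cond, math.exp(-lam * x))  # the two values should be close
```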
Example. Suppose X ~ U[0, 1], that is, X is uniformly distributed on [0, 1], and consider Y = -log X. Then
P(Y ≤ y) = P(-log X ≤ y) = P(X ≥ e^(-y)) = ∫_{e^(-y)}^1 dx = 1 - e^(-y).
Thus Y is exponentially distributed (with parameter 1). More generally:

Theorem 6.2. Let U ~ U[0, 1]. For any continuous distribution function F, the random variable X defined by X = F^(-1)(U) has distribution function F.

Proof. P(X ≤ x) = P(F^(-1)(U) ≤ x) = P(U ≤ F(x)) = F(x), since U ~ U[0, 1].

Remark. 1. This is a bit more messy for discrete random variables: if P(X = x_i) = p_i, i = 0, 1, ..., take U ~ U[0, 1] and let X = x_j if
Σ_{i=0}^{j-1} p_i ≤ U < Σ_{i=0}^{j} p_i.
2. This is useful for simulations.

6.1 Jointly Distributed Random Variables

For two random variables X and Y, the joint distribution function is
F(x, y) = P(X ≤ x, Y ≤ y),  F : R² → [0, 1].
Let
F_X(x) = P(X ≤ x) = P(X ≤ x, Y < ∞) = F(x, ∞) = lim_{y→∞} F(x, y).
This is called the marginal distribution of X. Similarly F_Y(y) = F(∞, y).

X_1, X_2, ..., X_n are jointly distributed continuous random variables if, for any set C ⊆ R^n,
P((X_1, X_2, ..., X_n) ∈ C) = ∫...∫_{(x_1,...,x_n) ∈ C} f(x_1, ..., x_n) dx_1 ... dx_n
for some function f, called the joint probability density function, satisfying the obvious conditions:
1. f(x_1, ..., x_n) ≥ 0,
2. ∫...∫_{R^n} f(x_1, ..., x_n) dx_1 ... dx_n = 1.

Example (n = 2).
F(x, y) = P(X ≤ x, Y ≤ y) = ∫_{-∞}^x ∫_{-∞}^y f(u, v) dv du,
and so
f(x, y) = ∂²F(x, y)/∂x∂y,
provided this is defined at (x, y).

Theorem 6.3. If X and Y are jointly continuous random variables then they are individually continuous.

Proof. Since X and Y are jointly continuous random variables,
P(X ∈ A) = P(X ∈ A, Y ∈ (-∞, +∞)) = ∫_A [∫_{-∞}^∞ f(x, y) dy] dx = ∫_A f_X(x) dx,
where f_X(x) = ∫_{-∞}^∞ f(x, y) dy is the p.d.f. of X.

Jointly continuous random variables X and Y are independent if
f(x, y) = f_X(x) f_Y(y);
then
P(X ∈ A, Y ∈ B) = ∫_A ∫_B f(x, y) dy dx = P(X ∈ A) P(Y ∈ B).
Similarly, jointly continuous random variables X_1, ..., X_n are independent if
f(x_1, ..., x_n) = Π_{i=1}^n f_{X_i}(x_i),
where the f_{X_i}(x_i) are the p.d.f.s of the individual random variables.

Example. Two points X and Y are tossed at random and independently onto a line segment of length L. What is the probability that |X - Y| ≤ l? Suppose that "at random" means uniformly, so that
f(x, y) = 1/L²,  (x, y) ∈ [0, L]².
The desired probability is
∫∫_A f(x, y) dx dy = (area of A)/L² = (L² - (L - l)²)/L² = (2Ll - l²)/L²,
where A = {(x, y) : |x - y| ≤ l}.

Example (Buffon's Needle Problem). A needle of length l is tossed at random onto a floor marked with parallel lines a distance L apart, where l ≤ L. What is the probability that the needle intersects one of the parallel lines? Let Θ ∈ [0, π) be the angle between the needle and the parallel lines, and let X be the distance from the bottom of the needle to the line closest to it. It is reasonable to suppose that
X ~ U[0, L],  Θ ~ U[0, π),
and that X and Θ are independent. Thus
f(x, θ) = 1/Lπ,  0 ≤ x ≤ L and 0 ≤ θ < π.
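A Monte Carlo sketch of Buffon's experiment under this joint density (with illustrative l = 1, L = 2): the needle meets a line exactly when X ≤ l sin Θ, and integrating the density over that region gives probability 2l/(πL) = 1/π ≈ 0.3183.

```python
import math
import random

random.seed(3)

# Buffon's needle with illustrative l = 1, L = 2: X ~ U[0, L], Theta ~ U[0, pi),
# and the needle crosses a line iff X <= l*sin(Theta). Exact answer: 2l/(pi*L) = 1/pi.
l, L, trials = 1.0, 2.0, 300_000
hits = sum(
    1 for _ in range(trials)
    if random.uniform(0.0, L) <= l * math.sin(random.uniform(0.0, math.pi))
)
estimate = hits / trials
print(estimate, 1 / math.pi)  # the two values should be close
```

Run in reverse, this experiment is a (slow) way of estimating π, which is how Buffon's problem is usually popularized.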
The needle intersects a line if and only if X ≤ l sin Θ, so the desired probability is
∫∫_A f(x, θ) dx dθ = ∫_0^π (l sin θ)/(Lπ) dθ = 2l/(πL),
where A = {(x, θ) : x ≤ l sin θ}.

Definition 6.2. The expectation or mean of a continuous random variable X is
E[X] = ∫_{-∞}^∞ x f(x) dx,
provided not both of ∫_{-∞}^0 x f(x) dx and ∫_0^∞ x f(x) dx are infinite.

Example (Normal distribution). Let
f(x) = (1/(√(2π) σ)) e^(-(x - µ)²/2σ²),  -∞ < x < ∞.
This is non-negative; for it to be a p.d.f. we also need to check that ∫_{-∞}^∞ f(x) dx = 1. Making the substitution z = (x - µ)/σ,
I = (1/(√(2π) σ)) ∫_{-∞}^∞ e^(-(x-µ)²/2σ²) dx = (1/√(2π)) ∫_{-∞}^∞ e^(-z²/2) dz.
Thus
2π I² = [∫_{-∞}^∞ e^(-x²/2) dx] [∫_{-∞}^∞ e^(-y²/2) dy]
      = ∫_{-∞}^∞ ∫_{-∞}^∞ e^(-(x² + y²)/2) dx dy
      = ∫_0^{2π} ∫_0^∞ r e^(-r²/2) dr dθ = 2π.
Therefore I = 1. A random variable with the p.d.f. f(x) given above has a normal distribution with parameters µ and σ²; we write this as X ~ N[µ, σ²]. The expectation is
E[X] = (1/(√(2π) σ)) ∫_{-∞}^∞ x e^(-(x-µ)²/2σ²) dx
     = (1/(√(2π) σ)) ∫_{-∞}^∞ (x - µ) e^(-(x-µ)²/2σ²) dx + (1/(√(2π) σ)) ∫_{-∞}^∞ µ e^(-(x-µ)²/2σ²) dx.
The first term is convergent and equals zero by symmetry, so that E[X] = 0 + µ = µ.

Theorem 6.4. If X is a continuous random variable then
E[X] = ∫_0^∞ P(X ≥ x) dx - ∫_0^∞ P(X ≤ -x) dx.

Proof.
∫_0^∞ P(X ≥ x) dx = ∫_0^∞ [∫_x^∞ f(y) dy] dx
                  = ∫_0^∞ ∫_0^∞ I[y ≥ x] f(y) dy dx
                  = ∫_0^∞ [∫_0^y dx] f(y) dy
                  = ∫_0^∞ y f(y) dy.
Similarly ∫_0^∞ P(X ≤ -x) dx = -∫_{-∞}^0 y f(y) dy, and the result follows.

Note: this holds for discrete random variables as well, and is useful as a general way of finding the expectation whether the random variable is discrete or continuous. If X takes values in the set {0, 1, 2, ...}, the theorem states that E[X] = Σ_{n=1}^∞ P(X ≥ n), and a direct proof follows:
Σ_{n=1}^∞ P(X ≥ n) = Σ_{n=1}^∞ Σ_{m=0}^∞ I[m ≥ n] P(X = m)
                   = Σ_{m=0}^∞ (Σ_{n=1}^∞ I[m ≥ n]) P(X = m)
                   = Σ_{m=0}^∞ m P(X = m) = E[X].

Theorem 6.5. Let X be a continuous random variable with p.d.f. f(x), and let h(x) be a continuous real-valued function. Then, provided ∫_{-∞}^∞ |h(x)| f(x) dx < ∞,
E[h(X)] = ∫_{-∞}^∞ h(x) f(x) dx.
Proof.
∫_0^∞ P(h(X) ≥ y) dy = ∫_0^∞ [∫_{x : h(x) ≥ y} f(x) dx] dy
                     = ∫_0^∞ ∫_{x : h(x) ≥ 0} I[h(x) ≥ y] f(x) dx dy
                     = ∫_{x : h(x) ≥ 0} [∫_0^{h(x)} dy] f(x) dx
                     = ∫_{x : h(x) ≥ 0} h(x) f(x) dx.
Similarly ∫_0^∞ P(h(X) ≤ -y) dy = -∫_{x : h(x) < 0} h(x) f(x) dx, so the result follows from the last theorem.

Definition 6.3. The variance of a continuous random variable X is
Var X = E[(X - E[X])²].
Note: the properties of expectation and variance are the same for discrete and continuous random variables; just replace Σ by ∫ in the proofs.

Example.
Var X = E[X²] - E[X]² = ∫_{-∞}^∞ x² f(x) dx - (∫_{-∞}^∞ x f(x) dx)².

Example. Suppose X ~ N[µ, σ²], and let Z = (X - µ)/σ. Then
P(Z ≤ z) = P((X - µ)/σ ≤ z) = P(X ≤ µ + σz)
         = (1/(√(2π) σ)) ∫_{-∞}^{µ+σz} e^(-(x-µ)²/2σ²) dx
         = (1/√(2π)) ∫_{-∞}^z e^(-u²/2) du   (substituting u = (x - µ)/σ)
         = Φ(z),
the distribution function of an N(0, 1) random variable. So Z ~ N(0, 1).
What is the variance of Z?
Var Z = E[Z²] - E[Z]²
      = (1/√(2π)) ∫_{-∞}^∞ z² e^(-z²/2) dz   (the last term, E[Z]², is zero)
      = (1/√(2π)) [-z e^(-z²/2)]_{-∞}^∞ + (1/√(2π)) ∫_{-∞}^∞ e^(-z²/2) dz
      = 0 + 1 = 1.
What about the variance of X? Since X = µ + σZ, we have E[X] = µ (which we knew already) and
Var X = σ² Var Z = σ².
Thus X ~ N(µ, σ²) has mean µ and variance σ².

6.2 Transformation of Random Variables

Suppose X_1, X_2, ..., X_n have joint p.d.f. f(x_1, ..., x_n). Let
Y_1 = r_1(X_1, X_2, ..., X_n),
Y_2 = r_2(X_1, X_2, ..., X_n),
...
Y_n = r_n(X_1, X_2, ..., X_n).
Let R ⊆ R^n be such that P((X_1, X_2, ..., X_n) ∈ R) = 1, and let S be the image of R under the above transformation. Suppose the transformation from R to S is 1-1 (bijective).
Then there are inverse functions
x_1 = s_1(y_1, y_2, ..., y_n),  x_2 = s_2(y_1, y_2, ..., y_n),  ...,  x_n = s_n(y_1, y_2, ..., y_n).
Assume that ∂s_i/∂y_j exists and is continuous at every point (y_1, y_2, ..., y_n) in S, and let J be the Jacobian determinant
J = det( ∂s_1/∂y_1 ... ∂s_1/∂y_n ; ... ; ∂s_n/∂y_1 ... ∂s_n/∂y_n ).   (6.3)
If A ⊆ R, then
P((X_1, ..., X_n) ∈ A) = ∫...∫_A f(x_1, ..., x_n) dx_1 ... dx_n   [1]
                       = ∫...∫_B f(s_1, ..., s_n) |J| dy_1 ... dy_n,   [2]
where B is the image of A. Since the transformation is 1-1, [2] is P((Y_1, ..., Y_n) ∈ B). Thus the density for Y_1, ..., Y_n is
g(y_1, y_2, ..., y_n) = { f(s_1(y_1, ..., y_n), ..., s_n(y_1, ..., y_n)) |J|,  (y_1, ..., y_n) ∈ S;  0, otherwise. }

Example (density of products and quotients). Suppose that (X, Y) has density
f(x, y) = { 4xy, for 0 ≤ x ≤ 1, 0 ≤ y ≤ 1; 0, otherwise, }   (6.4)
and let U = X/Y and V = XY.
Then X = √(UV) and Y = √(V/U), with
∂x/∂u = (1/2)√(v/u),  ∂x/∂v = (1/2)√(u/v),  ∂y/∂u = -(1/2)√(v/u³),  ∂y/∂v = (1/2)(1/√(uv)).
Therefore |J| = 1/2u, and so
g(u, v) = (1/2u)(4xy) = (1/2u) · 4 √(uv) √(v/u) = (2v/u) I[(u, v) ∈ D],
where D is the image of the unit square under the transformation. Note that U and V are NOT independent:
g(u, v) = (2v/u) I[(u, v) ∈ D]
is not a product of a function of u alone and a function of v alone, because of the indicator of D.

When the transformations are linear, things are simpler still. Let A be an n × n invertible matrix, and let
(Y_1, ..., Y_n)^T = A (X_1, ..., X_n)^T.
Then |J| = |det A^(-1)| = 1/|det A|, and the p.d.f. of (Y_1, ..., Y_n) is
g(y_1, ..., y_n) = f(A^(-1) y) / |det A|.

Example. Suppose X_1, X_2 have the p.d.f. f(x_1, x_2); calculate the p.d.f. of X_1 + X_2. Let Y = X_1 + X_2 and Z = X_2, so that X_1 = Y - Z and X_2 = Z. Then
A = ( 1 1 ; 0 1 ),   (6.5)
det A = 1, and
g(y, z) = f(x_1, x_2) = f(y - z, z),
the joint density of Y and Z. The marginal density of Y is
g(y) = ∫_{-∞}^∞ f(y - z, z) dz = ∫_{-∞}^∞ f(z, y - z) dz,
by a change of variable. If X_1 and X_2 are independent, with p.d.f.s f_1 and f_2, then f(x_1, x_2) = f_1(x_1) f_2(x_2) and
g(y) = ∫ f_1(y - z) f_2(z) dz
— the convolution of f_1 and f_2.

For a p.d.f. f(x), x̂ is a mode if f(x̂) ≥ f(x) for all x, and x̂ is a median if
∫_{-∞}^{x̂} f(x) dx = 1/2 = ∫_{x̂}^∞ f(x) dx.
For a discrete random variable, x̂ is a median if
P(X ≤ x̂) ≥ 1/2 and P(X ≥ x̂) ≥ 1/2.

If X_1, ..., X_n is a sample from the distribution then recall that the sample mean is
(1/n) Σ_{i=1}^n X_i.
Let Y_1, ..., Y_n (the order statistics) be the values of X_1, ..., X_n arranged in increasing order. Then the sample median is Y_{(n+1)/2} if n is odd, or any value in [Y_{n/2}, Y_{n/2+1}] if n is even.

If Y_n = max{X_1, ..., X_n} and X_1, ..., X_n are i.i.d.r.v.s with distribution function F and density f, then
P(Y_n ≤ y) = P(X_1 ≤ y, ..., X_n ≤ y) = (F(y))^n.
Thus the density of Y_n is
g(y) = d/dy (F(y))^n = n (F(y))^(n-1) f(y).
Similarly, if Y_1 = min{X_1, ..., X_n}, then
P(Y_1 > y) = P(X_1 > y, ..., X_n > y) = (1 - F(y))^n,
and the density of Y_1 is
g(y) = n (1 - F(y))^(n-1) f(y).
What about the joint density of Y_1 and Y_n?
G(y_1, y_n) = P(Y_1 ≤ y_1, Y_n ≤ y_n) = P(Y_n ≤ y_n) - P(Y_n ≤ y_n, Y_1 > y_1)
            = P(Y_n ≤ y_n) - P(y_1 < X_1 ≤ y_n, y_1 < X_2 ≤ y_n, ..., y_1 < X_n ≤ y_n)
            = (F(y_n))^n - (F(y_n) - F(y_1))^n.
Thus the joint p.d.f. of Y_1 and Y_n is
g(y_1, y_n) = ∂²G(y_1, y_n)/∂y_1∂y_n = { n(n - 1)(F(y_n) - F(y_1))^(n-2) f(y_1) f(y_n),  y_1 ≤ y_n;  0, otherwise. }

What happens if the mapping is not 1-1? Say X has density f and we want the density g of |X|. Then
P(|X| ∈ (a, b)) = ∫_a^b (f(x) + f(-x)) dx,  0 ≤ a < b,
so g(x) = f(x) + f(-x).

Suppose X_1, ..., X_n are i.i.d.r.v.s. What is the p.d.f. of the order statistics Y_1, ..., Y_n?
g(y_1, ..., y_n) = { n! f(y_1) ... f(y_n),  y_1 ≤ y_2 ≤ ... ≤ y_n;  0, otherwise. }   (6.6)

Example. Suppose X_1, ..., X_n are i.i.d.r.v.s exponentially distributed with parameter λ. Let
Z_1 = Y_1,  Z_2 = Y_2 - Y_1,  ...,  Z_n = Y_n - Y_{n-1},
where Y_1, ..., Y_n are the order statistics of X_1, ..., X_n. What is the distribution of the Z's?
Here Z = AY, where A is the matrix with 1's on the diagonal, -1's just below it and 0's elsewhere, so det A = 1. Hence
h(z_1, ..., z_n) = g(y_1, ..., y_n) = n! f(y_1) ... f(y_n)
                 = n! λ^n e^(-λy_1) ... e^(-λy_n)
                 = n! λ^n e^(-λ(y_1 + ... + y_n))
                 = n! λ^n e^(-λ(n z_1 + (n-1) z_2 + ... + z_n))
                 = Π_{i=1}^n (λi) e^(-λi z_{n+1-i}).   (6.7)
Thus h(z_1, ..., z_n) is expressed as the product of n density functions, with Z_{n+1-i} ~ exp(λi), i.e. Z_{n+1-i} exponentially distributed with parameter λi, and Z_1, ..., Z_n independent.

Example. Let X and Y be independent N(0, 1) random variables. Let D = R² = X² + Y² and tan Θ = Y/X, so that d = x² + y² and θ = arctan(y/x). Then
J = det( 2x  2y ; (-y/x²)/(1 + (y/x)²)  (1/x)/(1 + (y/x)²) ) = 2.   (6.8)
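The factorization (6.7) can be checked by simulation: the smallest of n exponentials is Z_1 ~ exp(λn), with mean 1/(nλ), and the gap to the second smallest is Z_2 ~ exp(λ(n-1)), with mean 1/((n-1)λ). A sketch with illustrative n = 5, λ = 1:

```python
import math
import random

random.seed(4)

# Order statistics of n i.i.d. exponentials (illustrative n = 5, lam = 1):
# the spacings Z_1 = Y_1 and Z_2 = Y_2 - Y_1 should be exponential with
# parameters n*lam and (n-1)*lam, i.e. means 1/(n*lam) = 0.2 and 1/((n-1)*lam) = 0.25.
n, lam, trials = 5, 1.0, 100_000
z1_total = z2_total = 0.0
for _ in range(trials):
    ys = sorted(-math.log(1.0 - random.random()) / lam for _ in range(n))
    z1_total += ys[0]            # Z_1 = Y_1
    z2_total += ys[1] - ys[0]    # Z_2 = Y_2 - Y_1

print(z1_total / trials)  # near 0.2
print(z2_total / trials)  # near 0.25
```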
Part IA Probability. Definitions. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015
Part IA Probability Definitions Based on lectures by R. Weber Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly) after lectures.
More informationPart IA Probability. Theorems. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015
Part IA Probability Theorems Based on lectures by R. Weber Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly) after lectures.
More information3. DISCRETE RANDOM VARIABLES
IA Probability Lent Term 3 DISCRETE RANDOM VARIABLES 31 Introduction When an experiment is conducted there may be a number of quantities associated with the outcome ω Ω that may be of interest Suppose
More information1 Presessional Probability
1 Presessional Probability Probability theory is essential for the development of mathematical models in finance, because of the randomness nature of price fluctuations in the markets. This presessional
More informationProbability Models. 4. What is the definition of the expectation of a discrete random variable?
1 Probability Models The list of questions below is provided in order to help you to prepare for the test and exam. It reflects only the theoretical part of the course. You should expect the questions
More informationProbability. J.R. Norris. December 13, 2017
Probability J.R. Norris December 13, 2017 1 Contents 1 Mathematical models for randomness 6 1.1 A general definition............................... 6 1.2 Equally likely outcomes.............................
More informationEE514A Information Theory I Fall 2013
EE514A Information Theory I Fall 2013 K. Mohan, Prof. J. Bilmes University of Washington, Seattle Department of Electrical Engineering Fall Quarter, 2013 http://j.ee.washington.edu/~bilmes/classes/ee514a_fall_2013/
More informationCourse: ESO-209 Home Work: 1 Instructor: Debasis Kundu
Home Work: 1 1. Describe the sample space when a coin is tossed (a) once, (b) three times, (c) n times, (d) an infinite number of times. 2. A coin is tossed until for the first time the same result appear
More informationSUMMARY OF PROBABILITY CONCEPTS SO FAR (SUPPLEMENT FOR MA416)
SUMMARY OF PROBABILITY CONCEPTS SO FAR (SUPPLEMENT FOR MA416) D. ARAPURA This is a summary of the essential material covered so far. The final will be cumulative. I ve also included some review problems
More information2 (Statistics) Random variables
2 (Statistics) Random variables References: DeGroot and Schervish, chapters 3, 4 and 5; Stirzaker, chapters 4, 5 and 6 We will now study the main tools use for modeling experiments with unknown outcomes
More informationLecture 11. Probability Theory: an Overveiw
Math 408 - Mathematical Statistics Lecture 11. Probability Theory: an Overveiw February 11, 2013 Konstantin Zuev (USC) Math 408, Lecture 11 February 11, 2013 1 / 24 The starting point in developing the
More information4. CONTINUOUS RANDOM VARIABLES
IA Probability Lent Term 4 CONTINUOUS RANDOM VARIABLES 4 Introduction Up to now we have restricted consideration to sample spaces Ω which are finite, or countable; we will now relax that assumption We
More informationStatistics 100A Homework 5 Solutions
Chapter 5 Statistics 1A Homework 5 Solutions Ryan Rosario 1. Let X be a random variable with probability density function a What is the value of c? fx { c1 x 1 < x < 1 otherwise We know that for fx to
More informationLecture Notes 1 Probability and Random Variables. Conditional Probability and Independence. Functions of a Random Variable
Lecture Notes 1 Probability and Random Variables Probability Spaces Conditional Probability and Independence Random Variables Functions of a Random Variable Generation of a Random Variable Jointly Distributed
More informationChp 4. Expectation and Variance
Chp 4. Expectation and Variance 1 Expectation In this chapter, we will introduce two objectives to directly reflect the properties of a random variable or vector, which are the Expectation and Variance.
More informationRandom Models. Tusheng Zhang. February 14, 2013
Random Models Tusheng Zhang February 14, 013 1 Introduction In this module, we will introduce some random models which have many real life applications. The course consists of four parts. 1. A brief review
More informationLecture Notes 1 Probability and Random Variables. Conditional Probability and Independence. Functions of a Random Variable
Lecture Notes 1 Probability and Random Variables Probability Spaces Conditional Probability and Independence Random Variables Functions of a Random Variable Generation of a Random Variable Jointly Distributed
More information4 Branching Processes
4 Branching Processes Organise by generations: Discrete time. If P(no offspring) 0 there is a probability that the process will die out. Let X = number of offspring of an individual p(x) = P(X = x) = offspring
More informationPROBABILITY VITTORIA SILVESTRI
PROBABILITY VITTORIA SILVESTRI Contents Preface 2 1. Introduction 3 2. Combinatorial analysis 6 3. Stirling s formula 9 4. Properties of Probability measures 12 5. Independence 17 6. Conditional probability
More informationMATHEMATICS 154, SPRING 2009 PROBABILITY THEORY Outline #11 (Tail-Sum Theorem, Conditional distribution and expectation)
MATHEMATICS 154, SPRING 2009 PROBABILITY THEORY Outline #11 (Tail-Sum Theorem, Conditional distribution and expectation) Last modified: March 7, 2009 Reference: PRP, Sections 3.6 and 3.7. 1. Tail-Sum Theorem
More informationWeek 12-13: Discrete Probability
Week 12-13: Discrete Probability November 21, 2018 1 Probability Space There are many problems about chances or possibilities, called probability in mathematics. When we roll two dice there are possible
More informationTopic 2: Probability & Distributions. Road Map Probability & Distributions. ECO220Y5Y: Quantitative Methods in Economics. Dr.
Topic 2: Probability & Distributions ECO220Y5Y: Quantitative Methods in Economics Dr. Nick Zammit University of Toronto Department of Economics Room KN3272 n.zammit utoronto.ca November 21, 2017 Dr. Nick
More informationPCMI Introduction to Random Matrix Theory Handout # REVIEW OF PROBABILITY THEORY. Chapter 1 - Events and Their Probabilities
PCMI 207 - Introduction to Random Matrix Theory Handout #2 06.27.207 REVIEW OF PROBABILITY THEORY Chapter - Events and Their Probabilities.. Events as Sets Definition (σ-field). A collection F of subsets
More informationReview of Probability Theory
Review of Probability Theory Arian Maleki and Tom Do Stanford University Probability theory is the study of uncertainty Through this class, we will be relying on concepts from probability theory for deriving
More informationMultivariate distributions
CHAPTER Multivariate distributions.. Introduction We want to discuss collections of random variables (X, X,..., X n ), which are known as random vectors. In the discrete case, we can define the density
More informationTHE QUEEN S UNIVERSITY OF BELFAST
THE QUEEN S UNIVERSITY OF BELFAST 0SOR20 Level 2 Examination Statistics and Operational Research 20 Probability and Distribution Theory Wednesday 4 August 2002 2.30 pm 5.30 pm Examiners { Professor R M
More informationProbability: Handout
Probability: Handout Klaus Pötzelberger Vienna University of Economics and Business Institute for Statistics and Mathematics E-mail: Klaus.Poetzelberger@wu.ac.at Contents 1 Axioms of Probability 3 1.1
More informationMAS113 Introduction to Probability and Statistics. Proofs of theorems
MAS113 Introduction to Probability and Statistics Proofs of theorems Theorem 1 De Morgan s Laws) See MAS110 Theorem 2 M1 By definition, B and A \ B are disjoint, and their union is A So, because m is a
More informationADVANCED PROBABILITY: SOLUTIONS TO SHEET 1
ADVANCED PROBABILITY: SOLUTIONS TO SHEET 1 Last compiled: November 6, 213 1. Conditional expectation Exercise 1.1. To start with, note that P(X Y = P( c R : X > c, Y c or X c, Y > c = P( c Q : X > c, Y
More informationRecitation 2: Probability
Recitation 2: Probability Colin White, Kenny Marino January 23, 2018 Outline Facts about sets Definitions and facts about probability Random Variables and Joint Distributions Characteristics of distributions
More information2.1 Elementary probability; random sampling
Chapter 2 Probability Theory Chapter 2 outlines the probability theory necessary to understand this text. It is meant as a refresher for students who need review and as a reference for concepts and theorems
More information2. AXIOMATIC PROBABILITY
IA Probability Lent Term 2. AXIOMATIC PROBABILITY 2. The axioms The formulation for classical probability in which all outcomes or points in the sample space are equally likely is too restrictive to develop
More informationActuarial Science Exam 1/P
Actuarial Science Exam /P Ville A. Satopää December 5, 2009 Contents Review of Algebra and Calculus 2 2 Basic Probability Concepts 3 3 Conditional Probability and Independence 4 4 Combinatorial Principles,
More informationBASICS OF PROBABILITY
October 10, 2018 BASICS OF PROBABILITY Randomness, sample space and probability Probability is concerned with random experiments. That is, an experiment, the outcome of which cannot be predicted with certainty,
More informationMA/ST 810 Mathematical-Statistical Modeling and Analysis of Complex Systems
MA/ST 810 Mathematical-Statistical Modeling and Analysis of Complex Systems Review of Basic Probability The fundamentals, random variables, probability distributions Probability mass/density functions
More informationSummary of basic probability theory Math 218, Mathematical Statistics D Joyce, Spring 2016
8. For any two events E and F, P (E) = P (E F ) + P (E F c ). Summary of basic probability theory Math 218, Mathematical Statistics D Joyce, Spring 2016 Sample space. A sample space consists of a underlying
More informationChapter 3, 4 Random Variables ENCS Probability and Stochastic Processes. Concordia University
Chapter 3, 4 Random Variables ENCS6161 - Probability and Stochastic Processes Concordia University ENCS6161 p.1/47 The Notion of a Random Variable A random variable X is a function that assigns a real
More informationMAS113 Introduction to Probability and Statistics. Proofs of theorems
MAS113 Introduction to Probability and Statistics Proofs of theorems Theorem 1 De Morgan s Laws) See MAS110 Theorem 2 M1 By definition, B and A \ B are disjoint, and their union is A So, because m is a
More informationLecture 4: Probability and Discrete Random Variables
Error Correcting Codes: Combinatorics, Algorithms and Applications (Fall 2007) Lecture 4: Probability and Discrete Random Variables Wednesday, January 21, 2009 Lecturer: Atri Rudra Scribe: Anonymous 1
More information1: PROBABILITY REVIEW
1: PROBABILITY REVIEW Marek Rutkowski School of Mathematics and Statistics University of Sydney Semester 2, 2016 M. Rutkowski (USydney) Slides 1: Probability Review 1 / 56 Outline We will review the following
More informationSample Spaces, Random Variables
Sample Spaces, Random Variables Moulinath Banerjee University of Michigan August 3, 22 Probabilities In talking about probabilities, the fundamental object is Ω, the sample space. (elements) in Ω are denoted
More informationProbability and Distributions
Probability and Distributions What is a statistical model? A statistical model is a set of assumptions by which the hypothetical population distribution of data is inferred. It is typically postulated
More informationRandom Variables. Random variables. A numerically valued map X of an outcome ω from a sample space Ω to the real line R
In probabilistic models, a random variable is a variable whose possible values are numerical outcomes of a random phenomenon. As a function or a map, it maps from an element (or an outcome) of a sample
More informationCME 106: Review Probability theory
: Probability theory Sven Schmit April 3, 2015 1 Overview In the first half of the course, we covered topics from probability theory. The difference between statistics and probability theory is the following:
More informationProbability Theory and Statistics. Peter Jochumzen