MA 519: Review

Yingwei Wang
Department of Mathematics, Purdue University, West Lafayette, IN, USA

Contents

1 How to compute the expectation?
  1.1 Tail
  1.2 Index
2 Permutations and combinations
  2.1 Stars and bars
  2.2 Dividing into parts
3 About the Exp(λ)
  3.1 Basic facts
  3.2 Scaling
  3.3 Relation to Geo(p)
  3.4 Memoryless property
  3.5 Hazard rate
  3.6 Comparison between independent variables
    3.6.1 Two variables with different λ
    3.6.2 Two variables with the same λ
    3.6.3 More than two variables with the same λ
4 Order statistics
  4.1 Two variables with joint density
  4.2 n iid variables
  4.3 Conditional case

This is based on the lecture notes of Prof. Sellke.
5 Poisson process
  5.1 Gamma distribution
  5.2 Poisson distribution and binomial distribution
  5.3 Poisson process
    5.3.1 Definition
    5.3.2 Comparison
6 Normal Statistics
  6.1 Independent case
  6.2 Dependent case
  6.3 Prediction
7 Limiting distribution
  7.1 The max of Exp(λ)
  7.2 The min of U[0,1]
8 Useful things
  8.1 Jensen's inequality
  8.2 Stirling's formula
  8.3 Slutsky's theorem
  8.4 Delta method
  8.5 Central limit theorem
9 Mongolian coins problem
  9.1 Versions
  9.2 Questions

1 How to compute the expectation?

1.1 Tail

For a nonnegative integer-valued random variable X,

  E(X) = Σ_{k≥1} P(X ≥ k).

1.2 Index

Write X = Σ_k I_k(ω), where each indicator I_k is 0 or 1; then

  E(X) = Σ_k P(I_k = 1).
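The tail formula can be sanity-checked on a small discrete example. This sketch (my own; the helper name expectation_via_tail is not from the notes) computes E(X) both ways for a fair die:

```python
from fractions import Fraction

def expectation_via_tail(pmf):
    """E(X) = sum_{k>=1} P(X >= k) for a nonnegative integer-valued X."""
    m = max(pmf)
    return sum(sum(p for j, p in pmf.items() if j >= k) for k in range(1, m + 1))

# Fair die: X uniform on {1,...,6}
pmf = {k: Fraction(1, 6) for k in range(1, 7)}
e_tail = expectation_via_tail(pmf)
e_direct = sum(k * p for k, p in pmf.items())
print(e_tail)  # 7/2
```

Both computations agree with the familiar value 7/2.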
2 Permutations and combinations

2.1 Stars and bars

Theorem 2.1. The number of distinguishable ways that n indistinguishable balls can be distributed among r distinguishable boxes is

  C(n+r−1, r−1).

Corollary 2.1. If box i is required to contain at least m_i balls (Σ_{i=1}^r m_i ≤ n), then the answer is

  C(n − Σ_{i=1}^r m_i + r − 1, r−1).

Corollary 2.2. There are C(n−1, r−1) ways to distribute n identical balls to r boxes so that no box is empty.

2.2 Dividing into parts

Theorem 2.2. Suppose n people are to be divided into r groups, with n_i people in group i, i.e. n_1 + ··· + n_r = n. Then there are

  C(n; n_1, n_2, ..., n_r) = n! / (n_1! n_2! ··· n_r!)

ways to do that.

3 About the Exp(λ)

3.1 Basic facts

If X ~ Exp(λ), then

  f_X(t) = 0 if t ≤ 0;  λ e^{−λt} if t > 0,            (3.1)
  F_X(t) = 0 if t ≤ 0;  1 − e^{−λt} if t > 0,          (3.2)
  E(X) = 1/λ,  E(X^k) = k!/λ^k,                        (3.3)
  Var(X) = 1/λ².                                        (3.4)
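The stars-and-bars counts above are easy to verify by brute-force enumeration for small n and r. A minimal sketch (mine; the function names stars_and_bars and brute_force are not from the notes):

```python
from itertools import product
from math import comb

def stars_and_bars(n, r):
    """Theorem 2.1: ways to put n indistinguishable balls into r boxes."""
    return comb(n + r - 1, r - 1)

def brute_force(n, r, min_per_box=0):
    """Count r-tuples of box counts summing to n, each >= min_per_box."""
    return sum(1 for c in product(range(n + 1), repeat=r)
               if sum(c) == n and all(x >= min_per_box for x in c))

n, r = 5, 3
assert stars_and_bars(n, r) == brute_force(n, r)       # Theorem 2.1
assert comb(n - 1, r - 1) == brute_force(n, r, 1)      # Corollary 2.2: no box empty
print(stars_and_bars(n, r))  # 21
```

For n = 5, r = 3 this gives C(7, 2) = 21 distributions, of which C(4, 2) = 6 leave no box empty.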
3.2 Scaling

If X ~ Exp(λ) and c > 0 is a constant, then cX ~ Exp(λ/c).

3.3 Relation to Geo(p)

If X ~ Exp(λ), then

  P(0 < X ≤ 1) = 1 − e^{−λ},
  P(1 < X ≤ 2) = e^{−λ} − e^{−2λ} = e^{−λ}(1 − e^{−λ}),
  P(k < X ≤ k+1) = e^{−kλ}(1 − e^{−λ}).

Let Y be defined by P(Y = k) = P(k < X ≤ k+1); then Y ~ Geo(1 − e^{−λ}).

3.4 Memoryless property

If X ~ Exp(λ), then

  P(X > t+m | X > t) = e^{−λ(t+m)} / e^{−λt} = e^{−λm} = P(X > m).   (3.5)

Note that this is independent of t.

3.5 Hazard rate

Definition 3.1. If X ~ Exp(λ), then for small δ,

  P(X ∈ [t, t+δ) | X ≥ t) = P(t ≤ X < t+δ) / P(X ≥ t)
                          = (e^{−λt} − e^{−λ(t+δ)}) / e^{−λt}
                          = 1 − e^{−λδ} ≈ δλ.

We call λ the hazard rate.

Definition 3.2. Generally, if a random variable T has density f(t) and cdf F(t), then the hazard rate is

  λ_T(t) = f(t) / (1 − F(t)).
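The Exp-to-Geo relation in Section 3.3 can be checked numerically from the cdf alone. A small sketch (mine; the names lam, interval_prob are not from the notes):

```python
import math

lam = 0.7
p = 1 - math.exp(-lam)   # success probability of the induced geometric

def interval_prob(k):
    """P(k < X <= k+1) from the Exp(lam) cdf F(t) = 1 - e^{-lam*t}."""
    return (1 - math.exp(-lam * (k + 1))) - (1 - math.exp(-lam * k))

for k in range(10):
    geo_pmf = (1 - p) ** k * p   # Geo(p) pmf on {0, 1, 2, ...}
    assert abs(interval_prob(k) - geo_pmf) < 1e-12
print("P(k < X <= k+1) matches the Geo(1 - e^{-lam}) pmf")
```

The identity holds exactly since e^{−kλ}(1 − e^{−λ}) = (1−p)^k p with p = 1 − e^{−λ}.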
3.6 Comparison between independent variables

Note that all random variables here are independent of each other.

3.6.1 Two variables with different λ

If X ~ Exp(λ_1), Y ~ Exp(λ_2), then

  P(X < Y) = λ_1 / (λ_1 + λ_2).   (3.6)

Furthermore, let U = min(X, Y); then

  P(U > t) = P(X > t) P(Y > t) = e^{−(λ_1+λ_2)t}, so U ~ Exp(λ_1 + λ_2).   (3.7)

3.6.2 Two variables with the same λ

Suppose X, Y ~ Exp(λ), and let D = X − Y, W = 2X − Y. Then

  f_D(t) = ½ λ e^{−λt} if t > 0;  ½ λ e^{λt} if t < 0,          (3.8)
  f_W(t) = ⅓ λ e^{λt} if t < 0;  ⅓ λ e^{−λt/2} if t > 0.        (3.9)

Furthermore, let U = min(X, Y), V = max(X, Y), T = V − U; then T ~ Exp(λ). Note that T = |D|.

Besides, X_1 / (X_1 + X_2) ~ U[0,1].

3.6.3 More than two variables with the same λ

Suppose X_1, ..., X_n, Y are iid Exp(λ). Then

  min{X_1, ..., X_n} ~ Exp(nλ),                 (3.10)
  P(max{X_1, ..., X_n} < Y) = 1/(n+1),          (3.11)
  P(X_1 + X_2 + ··· + X_n < Y) = (1/2)^n.       (3.12)
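Eqs. (3.6) and (3.7) can be checked by simulation. A minimal Monte Carlo sketch (mine, not from the notes), using Python's random.expovariate:

```python
import random, statistics

random.seed(0)
lam1, lam2, trials = 2.0, 3.0, 200_000
xs = [random.expovariate(lam1) for _ in range(trials)]
ys = [random.expovariate(lam2) for _ in range(trials)]

# Estimate P(X < Y); exact value is lam1/(lam1+lam2) = 0.4
p_hat = sum(x < y for x, y in zip(xs, ys)) / trials
# Mean of min(X, Y); exact value is 1/(lam1+lam2) = 0.2
mean_min = statistics.mean(min(x, y) for x, y in zip(xs, ys))

print(round(p_hat, 3), round(mean_min, 3))
```

With 200,000 trials both estimates should land within about ±0.005 of the exact values.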
Furthermore, if X_1, X_2, ..., X_n are iid Exp(1), let V_n = max{X_1, X_2, ..., X_n}. Considering the order statistics,

  V_n = X_(n) = X_(1) + (X_(2) − X_(1)) + ··· + (X_(n) − X_(n−1))
      =_d Exp(n) + Exp(n−1) + ··· + Exp(1)   (independent spacings),
  E(V_n) = 1/n + 1/(n−1) + ··· + 1/2 + 1 ≈ ln(n),
  Var(V_n) = 1/n² + 1/(n−1)² + ··· + 1 ≈ π²/6.

Remark 3.1. Eq. (3.11) is true whenever {X_i} and Y are iid with a continuous distribution, no matter what kind of distribution.

Remark 3.2. In Eq. (3.12), X_1 + X_2 + ··· + X_n ~ Gamma(n, λ).

4 Order statistics

4.1 Two variables with joint density

Suppose X, Y have joint density f_XY(x, y), and let U = min(X, Y), V = max(X, Y). Then

  f_UV(u, v) = f_XY(u, v) + f_XY(v, u) if u < v;  0 else.   (4.1)

Furthermore, if we are only interested in expectations, it is more convenient to use

  max(X, Y) = (X + Y)/2 + |X − Y|/2,   (4.2)
  min(X, Y) = (X + Y)/2 − |X − Y|/2.   (4.3)

4.2 n iid variables

Suppose X_1, X_2, ..., X_n are iid with density f(t) and cdf F(t). Then the density of each X_(k) is

  f_{X_(k)} = n C(n−1, k−1) F^{k−1} f (1 − F)^{n−k},   (4.4)

and the cdf is

  F_{X_(k)} = Σ_{j=k}^n C(n, j) F^j (1 − F)^{n−j}.   (4.5)

The joint density is

  f_{X_(1), X_(2), ..., X_(n)}(x_1, x_2, ..., x_n) = n! f(x_1) f(x_2) ··· f(x_n),  x_1 < x_2 < ··· < x_n.
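The approximations E(V_n) ≈ ln(n) and Var(V_n) ≈ π²/6 from Section 3.6.3 can be checked numerically from the exact partial sums. A short sketch (mine; the helper name harmonic is not from the notes):

```python
import math

def harmonic(n):
    """H_n = 1 + 1/2 + ... + 1/n, the exact value of E(V_n)."""
    return sum(1.0 / k for k in range(1, n + 1))

for n in (10, 100, 1000):
    print(n, round(harmonic(n) - math.log(n), 4))
# the gap H_n - ln(n) tends to the Euler-Mascheroni constant 0.5772...

# Var(V_n) = 1/1^2 + ... + 1/n^2 tends to pi^2/6
var_1000 = sum(1.0 / k**2 for k in range(1, 1001))
print(round(var_1000, 4), round(math.pi**2 / 6, 4))
```

The variance sum converges quickly: the tail beyond n terms is about 1/n.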
4.3 Conditional case

Suppose X_1, X_2, ..., X_n are iid U[0,1]. Then the joint density of X_(1), X_(2), ..., X_(n) is

  f_{X_(1), ..., X_(n)}(x_1, x_2, ..., x_n) = n! if 0 < x_1 < x_2 < ··· < x_n < 1;  0 else.   (4.6)

The conditional density of X_(1), ..., X_(k−1), X_(k+1), ..., X_(n) given X_(k) = a, a ∈ [0,1], is

  f(x_1, ..., x_{k−1}, a, x_{k+1}, ..., x_n) / ∫ f(x_1, ..., x_{k−1}, a, x_{k+1}, ..., x_n) dx_1 ··· dx_{k−1} dx_{k+1} ··· dx_n

on 0 < x_1 < ··· < x_{k−1} < a < x_{k+1} < ··· < x_n < 1, and 0 else.

5 Poisson process

5.1 Gamma distribution

Definition 5.1 (Gamma function). Define the function Γ(x) as

  Γ(x) = ∫_0^∞ e^{−y} y^{x−1} dy,  x > 0.

Remark 5.1. Special cases: Γ(n) = (n−1)! for n ∈ N; Γ(1/2) = √π, Γ(1) = 1.

Definition 5.2 (Gamma distribution). Say X ~ Gamma(α, λ) if its density is

  f_X(t) = 0 if t ≤ 0;  λ e^{−λt} (λt)^{α−1} / Γ(α) if t > 0.

Remark 5.2. E(X) = α/λ,  E(X^k) = Γ(α+k) / (λ^k Γ(α)),  Var(X) = α/λ².

5.2 Poisson distribution and binomial distribution

Definition 5.3 (Poisson distribution). Say X ~ Poisson(λ) if

  P(X = k) = e^{−λ} λ^k / k!,  k = 0, 1, 2, ....

Remark 5.3. E(X) = λ, Var(X) = λ.

Definition 5.4 (Binomial distribution). Say X ~ Binomial(n, p) if

  P(X = k) = C(n, k) p^k (1−p)^{n−k},  k = 0, 1, ..., n.

Theorem 5.1. Suppose X ~ Binomial(n, p). If n is very large while p is very small, then X ≈ Poisson(λ = np).
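Remark 5.1 and Theorem 5.1 are easy to check numerically with the standard library. A small sketch (mine, not from the notes); the specific n, p used for the Poisson approximation are arbitrary illustrative choices:

```python
import math

# Gamma function special cases (Remark 5.1)
assert math.isclose(math.gamma(5), math.factorial(4))        # Gamma(n) = (n-1)!
assert math.isclose(math.gamma(0.5), math.sqrt(math.pi))     # Gamma(1/2) = sqrt(pi)
assert math.isclose(math.gamma(1.0), 1.0)

# Poisson approximation to Binomial(n, p): large n, small p (Theorem 5.1)
n, p = 1000, 0.003
lam = n * p
max_gap = 0.0
for k in range(12):
    binom = math.comb(n, k) * p**k * (1 - p)**(n - k)
    pois = math.exp(-lam) * lam**k / math.factorial(k)
    max_gap = max(max_gap, abs(binom - pois))
print(round(max_gap, 5))  # pointwise gap is small (at most about n*p^2)
```

The total-variation distance between Binomial(n, p) and Poisson(np) is bounded by np², here 0.009.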
5.3 Poisson process

5.3.1 Definition

Definition 5.5 (Poisson process). Suppose W_i are iid Exp(λ) and T_n = Σ_{i=1}^n W_i. Then the arrival times {T_n} form a Poisson process with rate λ, and T_n ~ Gamma(n, λ).

Theorem 5.2. Suppose {T_n} is a Poisson process with rate λ. Let N = the number of arrivals T_i in the interval [a, b]; then N ~ Poisson(λ(b−a)).

5.3.2 Comparison

Consider two independent Poisson processes: A with rate λ_A and waiting times X_1, X_2, ..., X_n; B with rate λ_B and waiting times Y_1, Y_2, ..., Y_m. Then

(I)
  P(X_1 + X_2 + ··· + X_n < Y_1)
  = P(first n hits in the combined process are all A hits)
  = (λ_A / (λ_A + λ_B))^n.

(II)
  P(X_1 + X_2 + ··· + X_n + 1 < Y_1)
  = P(first n hits are all A hits and there is no B hit in the next 1 time unit)
  = (λ_A / (λ_A + λ_B))^n e^{−λ_B}.

(III)
  P(X_1 + X_2 + X_3 < Y_1 + Y_2)
  = P(3rd A hit is before 2nd B hit)
  = P(at least 3 of the first 4 hits are A hits)
  = C(4, 3) (λ_A / (λ_A + λ_B))^3 (λ_B / (λ_A + λ_B)) + C(4, 4) (λ_A / (λ_A + λ_B))^4.

Remark 5.4. Compare the results here with Section 3.6.3.
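Result (I) can be checked by direct simulation of the two waiting-time sequences. A Monte Carlo sketch (mine, not from the notes), with arbitrary illustrative rates:

```python
import random

random.seed(1)
lamA, lamB, n, trials = 1.0, 2.0, 3, 100_000

hits = 0
for _ in range(trials):
    a = sum(random.expovariate(lamA) for _ in range(n))  # time of n-th A hit
    b = random.expovariate(lamB)                         # time of first B hit
    hits += a < b
p_hat = hits / trials
p_exact = (lamA / (lamA + lamB)) ** n   # result (I): (1/3)^3 = 1/27
print(round(p_hat, 4), round(p_exact, 4))
```

The estimate should agree with 1/27 ≈ 0.037 to within a few thousandths at this sample size.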
6 Normal Statistics

6.1 Independent case

Definition 6.1 (χ² distribution). Say X ~ χ²(n) if

  X = x_1² + x_2² + ··· + x_n²,

where x_1, x_2, ..., x_n are iid N(0,1).

Remark 6.1. χ²(n) = Gamma(n/2, 1/2), χ²(2) = Exp(1/2).

Theorem 6.1. Let X_1, X_2, ..., X_n be iid N(μ, σ²), and define

  X̄ = (1/n) Σ_{i=1}^n X_i,                    (6.1)
  S² = 1/(n−1) Σ_{i=1}^n (X_i − X̄)².          (6.2)

Then

  X̄ ~ N(μ, σ²/n),                              (6.3)
  (n−1) S² / σ² ~ χ²(n−1).                      (6.4)

6.2 Dependent case

Suppose (X, Y) has a standard bivariate normal distribution with correlation ρ. Then

  P(X > 0, Y > 0) = P(X > 0, ρX + √(1−ρ²) Z > 0)
                  = P(X > 0, Z > −ρX / √(1−ρ²))
                  = (π/2 + arctan(ρ / √(1−ρ²))) / (2π),

where Z ~ N(0,1) is independent of X.
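The quadrant probability in Section 6.2 can be verified by simulating (X, Y) exactly as in the derivation, via Y = ρX + √(1−ρ²)Z. A Monte Carlo sketch (mine, not from the notes), using ρ = 0.5 as an arbitrary test value:

```python
import math, random

random.seed(2)
rho, trials = 0.5, 200_000

hits = 0
for _ in range(trials):
    x = random.gauss(0, 1)
    z = random.gauss(0, 1)
    y = rho * x + math.sqrt(1 - rho**2) * z   # (X, Y) standard bivariate normal
    hits += (x > 0) and (y > 0)
p_hat = hits / trials
p_exact = (math.pi / 2 + math.atan(rho / math.sqrt(1 - rho**2))) / (2 * math.pi)
print(round(p_hat, 4), round(p_exact, 4))
# note: p_exact equals 1/4 + arcsin(rho)/(2*pi); for rho = 0.5 this is 1/3
```

Since arctan(ρ/√(1−ρ²)) = arcsin(ρ), the formula reduces to the classical 1/4 + arcsin(ρ)/(2π).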
6.3 Prediction

Suppose X, Y have correlation ρ. Then the (linear) prediction of Y based on X is

  Ŷ* = ρ X*,  i.e.  Ŷ = μ_Y + ρ σ_Y (X − μ_X) / σ_X,

where X* = (X − μ_X)/σ_X, Y* = (Y − μ_Y)/σ_Y.

7 Limiting distribution

Idea: try to use (1 − x/n)^n → e^{−x}.

7.1 The max of Exp(λ)

Suppose X_1, ..., X_n are iid Exp(1), and V_n = max{X_1, X_2, ..., X_n}. Then

  F_{V_n}(t) = P(V_n ≤ t) = (1 − e^{−t})^n,                              (7.1)
  P(V_n − ln(n) ≤ t) = (1 − e^{−(t+ln n)})^n = (1 − e^{−t}/n)^n → e^{−e^{−t}}.   (7.2)

Now we know that, if n is sufficiently large,

  E(V_n) ≈ ln(n),                          (7.3)
  Median(V_n) ≈ ln(n) − ln(ln(2)).         (7.4)

Remark 7.1. Compare the results here with Section 3.6.3.

7.2 The min of U[0,1]

Suppose X_1, ..., X_n are iid U[0,1], and U_n = min{X_1, X_2, ..., X_n}. Then

  F_{U_n}(t) = 1 − P(U_n > t) = 1 − (1 − t)^n,                           (7.5)
  F_{nU_n}(t) = 1 − P(U_n > t/n) = 1 − (1 − t/n)^n → 1 − e^{−t}.         (7.6)

This indicates that nU_n ≈ Exp(1) if n is sufficiently large.
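The limit nU_n → Exp(1) in Section 7.2 is visible in a direct simulation: for each trial, generate n uniforms, take the min, and rescale by n. A sketch (mine, not from the notes), with arbitrary n and trial counts:

```python
import random, statistics

random.seed(3)
n, trials = 200, 20_000
# n * U_n for U_n = min of n iid U[0,1]; should be approximately Exp(1)
samples = [n * min(random.random() for _ in range(n)) for _ in range(trials)]

mean_hat = statistics.mean(samples)                 # Exp(1) has mean 1
tail_hat = sum(s > 1.0 for s in samples) / trials   # P(Exp(1) > 1) = e^{-1} ~ 0.368
print(round(mean_hat, 3), round(tail_hat, 3))
```

Exactly, E(nU_n) = n/(n+1) and P(nU_n > 1) = (1 − 1/n)^n, both already very close to the limiting values at n = 200.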
8 Useful things

8.1 Jensen's inequality

Theorem 8.1 (Jensen). Suppose Q: R → R is convex; then Q(E(X)) ≤ E(Q(X)).

Remark 8.1. Consider Q(x) = x²; then E(X²) − (EX)² = Var(X) ≥ 0.

8.2 Stirling's formula

  n! = (n/e)^n √(2πn) e^{θ/(12n)},  θ ∈ (0,1).   (8.1)

8.3 Slutsky's theorem

Theorem 8.2 (Slutsky). If X_n → X in distribution and Y_n → a, W_n → b in probability, then

  Y_n X_n + W_n → aX + b in distribution.

8.4 Delta method

Suppose a_n → ∞ and a_n(W_n − b) → X in distribution. Let g: R → R be differentiable at b; then

  a_n (g(W_n) − g(b)) → g'(b) X in distribution.

8.5 Central limit theorem

Theorem 8.3 (Central limit). Suppose X_1, X_2, ..., X_n are iid with mean μ and variance σ². Then

  (X_1 + X_2 + ··· + X_n − nμ) / (σ√n) → N(0,1).

9 Mongolian coins problem

9.1 Versions

Suppose in each toss, P(Head) = θ, where θ ~ U[0,1]. Let N = the number of Heads in n tosses.

Version I:

  P(N = k) = ∫_0^1 C(n, k) θ^k (1−θ)^{n−k} dθ.
Recall the order statistics: the density of U_(k+1) from U_1, ..., U_{n+1} iid U[0,1] is

  f_(k+1)(t) = (n+1) C(n, k) t^k (1−t)^{n−k},

which integrates to 1, so

  P(N = k) = 1/(n+1),  for every k = 0, 1, ..., n.

Version II: Let U_0, U_1, ..., U_n be iid U[0,1]. Call θ = U_0 and X_k = I{U_k < U_0}. Then

  N = Σ_{k=1}^n X_k = the number of U_1, ..., U_n which are < U_0,
  P(N = k) = P(U_0 is the U_(k+1) from U_0, ..., U_n) = 1/(n+1).

9.2 Questions

Given θ, let P(X_i = 1) = P(ith toss gets a head) = θ; then X_1, X_2, ..., X_n are iid Bernoulli(θ).

(i) Compute P(X_3 = 1 | X_1 = X_2 = 1).

Method one:

  P(X_3 = 1 | X_1 = X_2 = 1) = P(X_1 = X_2 = X_3 = 1) / P(X_1 = X_2 = 1)
                             = E(θ³) / E(θ²)
                             = (∫_0^1 θ³ dθ) / (∫_0^1 θ² dθ)
                             = (1/4) / (1/3) = 3/4.

Method two: consider the relative order of U_0, U_1, U_2, U_3. We can also get the same answer.

(ii) Compute P(θ < 1/2 | X_1 = X_2 = 1).

  P(θ < 1/2 | X_1 = X_2 = 1) = (∫_0^{1/2} θ² dθ) / (∫_0^1 θ² dθ) = (1/24) / (1/3) = 1/8.
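The Version I computation can be carried out exactly with rational arithmetic, using the Beta integral ∫_0^1 θ^k (1−θ)^{n−k} dθ = k!(n−k)!/(n+1)!. A sketch (mine; the helper name p_heads is not from the notes):

```python
from fractions import Fraction
from math import comb, factorial

def p_heads(n, k):
    """P(N = k) = C(n,k) * Beta(k+1, n-k+1) = C(n,k) * k!(n-k)!/(n+1)!."""
    return Fraction(comb(n, k) * factorial(k) * factorial(n - k), factorial(n + 1))

n = 7
probs = [p_heads(n, k) for k in range(n + 1)]
assert all(p == Fraction(1, n + 1) for p in probs)   # N is uniform on {0,...,n}
assert sum(probs) == 1

# Question (i): P(X_3=1 | X_1=X_2=1) = E(theta^3)/E(theta^2) = (1/4)/(1/3)
assert Fraction(1, 4) / Fraction(1, 3) == Fraction(3, 4)
print(probs[0])  # 1/8
```

The cancellation C(n,k) · k!(n−k)!/(n+1)! = 1/(n+1) is exactly why N is uniform, matching both versions of the problem.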