Convergence in Distribution


1 Convergence in Distribution

Undergraduate version of the central limit theorem: if $X_1, \ldots, X_n$ are iid from a population with mean $\mu$ and standard deviation $\sigma$ then $n^{1/2}(\bar X - \mu)/\sigma$ has approximately a standard normal distribution. Similarly, a Binomial$(n,p)$ random variable has approximately a $N(np, np(1-p))$ distribution.

What is the precise meaning of statements like "$X$ and $Y$ have approximately the same distribution"? The desired meaning: $X$ and $Y$ have nearly the same cdf. But care is needed.
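
A quick numerical illustration of the opening claim (my own sketch, not part of the notes; assumes numpy and scipy are available): Binomial$(n,p)$ probabilities are close to those of $N(np, np(1-p))$.

```python
# Compare Binomial(n, p) cdf values with the normal approximation.
import numpy as np
from scipy.stats import binom, norm

n, p = 100, 0.3
mean, sd = n * p, np.sqrt(n * p * (1 - p))
for k in (20, 30, 40):
    # continuity-corrected normal approximation to P(X <= k)
    print(k, binom.cdf(k, n, p), norm.cdf((k + 0.5 - mean) / sd))
```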

2 Q1) If $n$ is a large number, is the $N(0,1/n)$ distribution close to the distribution of $X \equiv 0$?
Q2) Is $N(0,1/n)$ close to the $N(1/n,1/n)$ distribution?
Q3) Is $N(0,1/n)$ close to the $N(1/\sqrt{n},1/n)$ distribution?
Q4) If $X_n \equiv 2^{-n}$, is the distribution of $X_n$ close to that of $X \equiv 0$?

3 Answers depend on how close "close" needs to be, so it's a matter of definition. In practice the usual sort of approximation we want to make is to say that some random variable $X$, say, has nearly some continuous distribution, like $N(0,1)$. So we want to know that probabilities like $P(X > x)$ are nearly $P(N(0,1) > x)$. The real difficulty is the case of discrete random variables or infinite dimensions: not done in this course.
Mathematicians' meaning of "close": either they can provide an upper bound on the distance between the two things, or they are talking about taking a limit. In this course we take limits and use metrics.

4 Definition: A sequence of random variables $X_n$ taking values in a separable metric space $(S, d)$ converges in distribution to a random variable $X$ if
$$E(g(X_n)) \to E(g(X))$$
for every bounded continuous function $g$ mapping $S$ to the real line.
Notation: $X_n \Rightarrow X$.
Remark: This is abusive language; it is the distributions that converge, not the random variables. Example: if $U$ is Uniform$(0,1)$ and $X_n \equiv U$, $X \equiv 1-U$, then $X_n$ converges in distribution to $X$ (both are Uniform$(0,1)$), even though $X_n$ never gets close to $X$.
Other jargon: weak convergence, weak$^*$ convergence, convergence in law.
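
A minimal simulation of the remark above (my own sketch, assuming numpy): $X_n \equiv U$ and $X \equiv 1-U$ have the same Uniform$(0,1)$ distribution, so $E(g(X_n)) = E(g(X))$ for every bounded continuous $g$, even though the random variables themselves stay far apart.

```python
import numpy as np

rng = np.random.default_rng(0)
u = rng.uniform(size=100_000)
x_n, x = u, 1 - u

for g in (np.sin, np.cos, lambda t: 1 / (1 + t**2)):
    print(g(x_n).mean(), g(x).mean())   # nearly equal: same distribution
print(np.mean(np.abs(x_n - x)))         # about 0.5: the rvs themselves differ
```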

5 General Properties: If $X_n \Rightarrow X$ and $h$ is continuous from $S_1$ to $S_2$ then $Y_n = h(X_n) \Rightarrow Y = h(X)$.
Theorem 1 (Slutsky) If $X_n \Rightarrow X$, $Y_n \Rightarrow y_0$ (a constant) and $h$ from $S_1 \times S_2$ to $S_3$ is continuous at $(x, y_0)$ for each $x$, then
$$Z_n = h(X_n, Y_n) \Rightarrow Z = h(X, y_0).$$

6 We will begin by specializing to the simplest case: $S$ is the real line and $d(x,y) = |x-y|$. In the following we suppose that $X_n$, $X$ are real valued random variables.
Theorem 2 The following are equivalent:
1. $X_n$ converges in distribution to $X$.
2. $P(X_n \le x) \to P(X \le x)$ for each $x$ such that $P(X = x) = 0$.
3. The limit of the characteristic functions of $X_n$ is the characteristic function of $X$:
$$E(e^{itX_n}) \to E(e^{itX}) \quad \text{for every real } t.$$
These are all implied by $M_{X_n}(t) \to M_X(t) < \infty$ for all $|t| \le \epsilon$, for some positive $\epsilon$.

7 Now let's go back to the questions I asked: $X_n \sim N(0,1/n)$ and $X \equiv 0$. Then
$$P(X_n \le x) \to \begin{cases} 1 & x > 0 \\ 0 & x < 0 \\ 1/2 & x = 0. \end{cases}$$
The limit is the cdf of $X \equiv 0$ except at $x = 0$; the cdf of $X$ is not continuous at $x = 0$. So: $X_n \Rightarrow X$.
Does $X_n \sim N(1/n,1/n)$ have distribution close to that of $Y_n \sim N(0,1/n)$? Find a limit $X$ and prove both $X_n \Rightarrow X$ and $Y_n \Rightarrow X$. Take $X \equiv 0$. Then
$$E(e^{tX_n}) = e^{t/n + t^2/(2n)} \to 1 = E(e^{tX})$$
and
$$E(e^{tY_n}) = e^{t^2/(2n)} \to 1,$$
so both $X_n$ and $Y_n$ have the same limit in distribution.
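
A small numerical check of the piecewise limit above (my own illustration, assuming scipy): the cdf of $N(0,1/n)$ tends to 0 left of zero and 1 right of zero, but stays exactly $1/2$ at $x = 0$.

```python
import numpy as np
from scipy.stats import norm

for n in (10, 100, 10_000):
    sd = 1 / np.sqrt(n)
    print(n, norm.cdf(-0.1, scale=sd), norm.cdf(0.0, scale=sd),
          norm.cdf(0.1, scale=sd))
# cdf at -0.1 -> 0 and at 0.1 -> 1 as n grows, but at 0 it is always 0.5
```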

8 [Figure: cdfs of $N(0,1/n)$ and of the point mass $X \equiv 0$, plotted for two values of $n$.]

9 [Figure: cdfs of $N(1/n,1/n)$ and $N(0,1/n)$, plotted for two values of $n$.]

10 Multiply both $X_n$ and $Y_n$ by $n^{1/2}$ and let $X \sim N(0,1)$. Then $n^{1/2}X_n \sim N(n^{-1/2}, 1)$ and $n^{1/2}Y_n \sim N(0,1)$. Use characteristic functions to prove that both $n^{1/2}X_n$ and $n^{1/2}Y_n$ converge to $N(0,1)$ in distribution.
If you now let $X_n \sim N(n^{-1/2}, 1/n)$ and $Y_n \sim N(0,1/n)$ then again both $X_n$ and $Y_n$ converge to 0 in distribution. But if you multiply the $X_n$ and $Y_n$ in the previous point by $n^{1/2}$ then $n^{1/2}X_n \sim N(1,1)$ and $n^{1/2}Y_n \sim N(0,1)$, so $n^{1/2}X_n$ and $n^{1/2}Y_n$ are not close together in distribution.
You can check that $X_n \equiv 2^{-n}$ converges to 0 in distribution.

11 [Figure: cdfs of $N(1/\sqrt{n},1/n)$ and $N(0,1/n)$, plotted for two values of $n$.]

12 Summary: to derive approximate distributions:
Show that the sequence of rvs $X_n$ converges weakly to some $X$.
The limit distribution (i.e. the distribution of $X$) should be non-trivial, like say $N(0,1)$.
Don't say: $X_n$ is approximately $N(1/n,1/n)$.
Do say: $n^{1/2}(X_n - 1/n)$ converges to $N(0,1)$ in distribution.

13 The Central Limit Theorem
Theorem 3 If $X_1, X_2, \ldots$ are iid with mean 0 and variance 1 then $n^{1/2}\bar X$ converges in distribution to $N(0,1)$. That is,
$$P(n^{1/2}\bar X \le x) \to \frac{1}{\sqrt{2\pi}} \int_{-\infty}^x e^{-y^2/2}\, dy.$$
Proof: We will show $E(e^{itn^{1/2}\bar X}) \to e^{-t^2/2}$. This is the characteristic function of $N(0,1)$, so we are done by our theorem.
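
A simulation sketch of Theorem 3 (my own illustration; the centered exponential population is an arbitrary choice with mean 0 and variance 1):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
n, reps = 200, 50_000
x = rng.exponential(size=(reps, n)) - 1.0     # iid, mean 0, variance 1
z = np.sqrt(n) * x.mean(axis=1)               # n^{1/2} * Xbar

for q in (-1.96, 0.0, 1.96):
    print(q, (z <= q).mean(), norm.cdf(q))    # empirical cdf vs N(0,1) cdf
```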

14 Some basic facts: If $Z \sim N(0,1)$ then $E(e^{itZ}) = e^{-t^2/2}$.
Theorem 4 If $X$ is a real random variable with $E(|X|^k) < \infty$ then the function $\psi(t) = E(e^{itX})$ has $k$ continuous derivatives as a function of the real variable $t$. (The real part and the imaginary part each have that many derivatives.) Moreover for $1 \le j \le k$ we find
$$\psi^{(j)}(t) = i^j E(X^j e^{itX}).$$
Theorem 5 (Taylor Expansion) For such an $X$:
$$\psi(t) = 1 + \sum_{j=1}^k i^j E(X^j) t^j/j! + R(t)$$
where the remainder function $R(t)$ satisfies
$$\lim_{t \to 0} R(t)/t^k = 0.$$

15 Finish proof: let $\psi(t) = E(\exp(itX_1))$:
$$E(e^{it\sqrt{n}\bar X}) = \psi^n(t/\sqrt{n}).$$
Since the variance is 1 and the mean is 0:
$$\psi(t) = 1 - t^2/2 + R(t)$$
where $\lim_{t\to 0} R(t)/t^2 = 0$. Fix $t$ and replace $t$ by $t/\sqrt{n}$:
$$\psi(t/\sqrt{n}) = 1 - t^2/(2n) + R(t/\sqrt{n}).$$
Define $x_n = -t^2/2 + nR(t/\sqrt{n})$, so that $\psi^n(t/\sqrt{n}) = (1 + x_n/n)^n$. Notice $x_n \to -t^2/2$ (by the property of $R$) and use the fact that $x_n \to x$ implies
$$(1 + x_n/n)^n \to e^x,$$
valid for all complex $x$. Get
$$E(e^{itn^{1/2}\bar X}) \to e^{-t^2/2}$$
to finish the proof.

16 Proof of Theorem 4: do the case $k = 1$. Must show
$$\lim_{h\to 0} \frac{\psi(t+h) - \psi(t)}{h} = iE(Xe^{itX}).$$
But
$$\frac{\psi(t+h) - \psi(t)}{h} = E\left( \frac{e^{i(t+h)X} - e^{itX}}{h} \right).$$
Fact:
$$\left| \frac{e^{i(t+h)X} - e^{itX}}{h} \right| \le |X| \quad \text{for any } t.$$
By the Dominated Convergence Theorem we can take the limit inside the integral to get
$$\psi'(t) = iE(Xe^{itX}).$$

17 Multivariate convergence in distribution
Definition: $X_n \in R^p$ converges in distribution to $X \in R^p$ if $E(g(X_n)) \to E(g(X))$ for each bounded continuous real valued function $g$ on $R^p$. This is equivalent to either of:
Cramér–Wold Device: $a^T X_n$ converges in distribution to $a^T X$ for each $a \in R^p$; or
Convergence of characteristic functions: $E(e^{ia^T X_n}) \to E(e^{ia^T X})$ for each $a \in R^p$.
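
A rough sketch of the Cramér–Wold idea (my own example, assuming numpy; the population of centered pairs $(U - 1/2, U^2 - 1/3)$ is an arbitrary choice): standardized multivariate sample means are studied one linear combination $a^T X_n$ at a time, each of which should have variance near $a^T \Sigma a$.

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 500, 20_000
u = rng.uniform(size=(reps, n))
w = np.stack([u - 0.5, u**2 - 1/3], axis=2)   # iid mean-zero vectors in R^2
zbar = np.sqrt(n) * w.mean(axis=1)            # approx MVN(0, Sigma) by the CLT

sigma = np.cov(w.reshape(-1, 2).T)            # estimate of Sigma = Var(W_i)
for a in (np.array([1.0, 0.0]), np.array([1.0, -2.0])):
    t = zbar @ a
    print(a, t.var(), a @ sigma @ a)          # variances should agree
```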

18 Extensions of the CLT
1. If $Y_1, Y_2, \ldots$ are iid in $R^p$ with mean $\mu$ and variance-covariance $\Sigma$ then $n^{1/2}(\bar Y - \mu) \Rightarrow MVN(0,\Sigma)$.
2. Lyapunov CLT: if for each $n$ the rvs $X_{n1}, \ldots, X_{nn}$ are independent with
$$E(X_{ni}) = 0 \quad (1)$$
$$\mathrm{Var}\Big(\sum_i X_{ni}\Big) = 1 \quad (2)$$
$$\sum_i E(|X_{ni}|^3) \to 0 \quad (3)$$
then $\sum_i X_{ni} \Rightarrow N(0,1)$.
3. Lindeberg CLT: if conditions (1), (2) hold and $\sum_i E(X_{ni}^2 1(|X_{ni}| > \epsilon)) \to 0$ for each $\epsilon > 0$ then $\sum_i X_{ni} \Rightarrow N(0,1)$. (Lyapunov's condition implies Lindeberg's.)
4. Non-independent rvs: $m$-dependent CLT, martingale CLT, CLT for mixing processes.
5. Not sums: Slutsky's theorem, $\delta$ method.

19 Slutsky's Theorem in $R^p$: If $X_n \Rightarrow X$ and $Y_n$ converges in distribution (or in probability) to $c$, a constant, then $X_n + Y_n \Rightarrow X + c$. More generally, if $f(x,y)$ is continuous then $f(X_n, Y_n) \Rightarrow f(X, c)$.
Warning: the hypothesis that the limit of $Y_n$ is constant is essential.
Definition: $Y_n \to Y$ in probability if for all $\epsilon > 0$: $P(d(Y_n, Y) > \epsilon) \to 0$.
Fact: for $Y$ constant, convergence in distribution and in probability are the same. In general, convergence in probability implies convergence in distribution. Both are weaker than almost sure convergence:
Definition: $Y_n \to Y$ almost surely if
$$P(\{\omega \in \Omega : \lim_n Y_n(\omega) = Y(\omega)\}) = 1.$$

20 Theorem 6 (The delta method) Suppose:
The sequence $Y_n \to y$, a constant.
If $X_n = a_n(Y_n - y)$ then $X_n \Rightarrow X$ for some random variable $X$.
$f$ is a function defined on a neighbourhood of $y \in R^p$ which is differentiable at $y$.
Then $a_n(f(Y_n) - f(y))$ converges in distribution to $f'(y)X$. If $X_n \in R^p$ and $f : R^p \to R^q$ then $f'$ is the $q \times p$ matrix of first derivatives of the components of $f$.
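
A one-dimensional sketch of Theorem 6 (my own example; the choice $f(x) = x^2$ and a normal population are assumptions for illustration): if $\sqrt{n}(\bar Y - \mu) \Rightarrow N(0, s^2)$ then $\sqrt{n}(\bar Y^2 - \mu^2) \Rightarrow N(0, (2\mu s)^2)$ since $f'(\mu) = 2\mu$.

```python
import numpy as np

rng = np.random.default_rng(3)
mu, s, n, reps = 2.0, 1.0, 1_000, 20_000
y = mu + s * rng.standard_normal((reps, n))
ybar = y.mean(axis=1)

lhs = np.sqrt(n) * (ybar**2 - mu**2)   # a_n (f(Y_n) - f(y)) with f(x) = x^2
print(lhs.std())                       # should be close to |2*mu|*s = 4.0
```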

21 Proof: The function $f : R^p \to R^q$ is differentiable at $y \in R^p$ if there is a matrix $Df$ such that
$$\lim_{h \to 0} \frac{\|f(y+h) - f(y) - Df\,h\|}{\|h\|} = 0;$$
that is, for each $\epsilon > 0$ there is a $\delta > 0$ such that $\|h\| \le \delta$ implies
$$\|f(y+h) - f(y) - Df\,h\| \le \epsilon \|h\|.$$
Define
$$R_n = a_n(f(Y_n) - f(y)) - a_n Df (Y_n - y)$$
and
$$S_n = a_n Df (Y_n - y) = Df X_n.$$
According to Slutsky's theorem, $S_n \Rightarrow Df X$. If we now prove $R_n \to 0$ then by Slutsky's theorem we find
$$a_n(f(Y_n) - f(y)) = S_n + R_n \Rightarrow Df X.$$

22 Now fix $\epsilon_1, \epsilon_2 > 0$. I claim there is a $K$ so big that for all $n$
$$P(B_n) \equiv P(\|a_n(Y_n - y)\| > K) \le \epsilon_1.$$
Let $\delta > 0$ be the value in the definition of the derivative corresponding to $\epsilon_2/K$. Choose $N$ so large that $n \ge N$ implies $K/a_n \le \delta$. For $n \ge N$: if $\|a_n(Y_n - y)\| \le K$ then $\|Y_n - y\| \le K/a_n \le \delta$, and then $\|R_n\| \le (\epsilon_2/K)\, a_n \|Y_n - y\| \le \epsilon_2$. That is,
$$\{\|R_n\| > \epsilon_2\} \subset \{\|a_n(Y_n - y)\| > K\},$$
so $n \ge N$ implies
$$P(\|R_n\| > \epsilon_2) \le \epsilon_1,$$
which means $R_n \to 0$ in probability.

23 Example: Suppose $X_1, \ldots, X_n$ are a sample from a population with mean $\mu$, variance $\sigma^2$, and third and fourth central moments $\mu_3$ and $\mu_4$. Then
$$n^{1/2}(s^2 - \sigma^2) \Rightarrow N(0, \mu_4 - \sigma^4)$$
where $\Rightarrow$ is notation for convergence in distribution. For simplicity I define $s^2 = \overline{X^2} - \bar X^2$.

24 How to apply the $\delta$ method:
1) Write the statistic as a function of averages: Define
$$W_i = \begin{bmatrix} X_i^2 \\ X_i \end{bmatrix}.$$
See that
$$\bar W_n = \begin{bmatrix} \overline{X^2} \\ \bar X \end{bmatrix}.$$
Define $f(x_1, x_2) = x_1 - x_2^2$. See that $s^2 = f(\bar W_n)$.
2) Compute the mean of your averages:
$$\mu_W \equiv E(\bar W_n) = \begin{bmatrix} E(X_i^2) \\ E(X_i) \end{bmatrix} = \begin{bmatrix} \mu^2 + \sigma^2 \\ \mu \end{bmatrix}.$$
3) In the $\delta$ method theorem take $Y_n = \bar W_n$ and $y = E(Y_n)$.

25 4) Take $a_n = n^{1/2}$.
5) Use the central limit theorem:
$$n^{1/2}(Y_n - y) \Rightarrow MVN(0, \Sigma)$$
where $\Sigma = \mathrm{Var}(W_i)$.
6) To compute $\Sigma$ take the expected value of
$$(W - \mu_W)(W - \mu_W)^T.$$
There are 4 entries in this matrix. The top left entry is
$$(X^2 - \mu^2 - \sigma^2)^2.$$
This has expectation:
$$E\{(X^2 - \mu^2 - \sigma^2)^2\} = E(X^4) - (\mu^2 + \sigma^2)^2.$$

26 Using the binomial expansion (and $E(X - \mu) = 0$):
$$E(X^4) = E\{(X - \mu + \mu)^4\} = \mu_4 + 4\mu\mu_3 + 6\mu^2\sigma^2 + 4\mu^3 E(X - \mu) + \mu^4.$$
So
$$\Sigma_{11} = \mu_4 - \sigma^4 + 4\mu\mu_3 + 4\mu^2\sigma^2.$$
The top right entry is the expectation of
$$(X^2 - \mu^2 - \sigma^2)(X - \mu),$$
which is $E(X^3) - \mu E(X^2)$; similar to the 4th moment calculation, this gives $\mu_3 + 2\mu\sigma^2$. The lower right entry is $\sigma^2$. So
$$\Sigma = \begin{bmatrix} \mu_4 - \sigma^4 + 4\mu\mu_3 + 4\mu^2\sigma^2 & \mu_3 + 2\mu\sigma^2 \\ \mu_3 + 2\mu\sigma^2 & \sigma^2 \end{bmatrix}.$$

27 7) Compute the derivative (gradient) of $f$: it has components $(1, -2x_2)$. Evaluate at $y = (\mu^2 + \sigma^2, \mu)$ to get $a^T = (1, -2\mu)$. This leads to
$$n^{1/2}(s^2 - \sigma^2) \approx [1, -2\mu]\; n^{1/2} \begin{bmatrix} \overline{X^2} - (\mu^2 + \sigma^2) \\ \bar X - \mu \end{bmatrix},$$
which converges in distribution to $(1, -2\mu)\, MVN(0, \Sigma)$. This rv is
$$N(0, a^T \Sigma a) = N(0, \mu_4 - \sigma^4).$$
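
A simulation check of the worked example (my own illustration; the Exponential(1) population is an arbitrary choice, for which $\sigma^2 = 1$ and $\mu_4 = 9$, so the predicted limit is $N(0, \mu_4 - \sigma^4) = N(0, 8)$):

```python
import numpy as np

rng = np.random.default_rng(4)
n, reps = 2_000, 20_000
x = rng.exponential(size=(reps, n))
s2 = (x**2).mean(axis=1) - x.mean(axis=1)**2   # s^2 = mean(X^2) - Xbar^2
z = np.sqrt(n) * (s2 - 1.0)                    # sqrt(n)(s^2 - sigma^2)
print(z.var())                                 # should be near mu_4 - sigma^4 = 8
```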

28 An alternative approach worth pursuing. Suppose $c$ is a constant. Define $X_i^* = X_i - c$. Then the sample variance of the $X_i^*$ is the same as the sample variance of the $X_i$. Notice that all the central moments of $X_i^*$ are the same as for $X_i$. Conclusion: no loss in assuming $\mu = 0$. In this case $a^T = (1, 0)$ and
$$\Sigma = \begin{bmatrix} \mu_4 - \sigma^4 & \mu_3 \\ \mu_3 & \sigma^2 \end{bmatrix}.$$
Notice that $a^T\Sigma = [\mu_4 - \sigma^4, \mu_3]$ and $a^T\Sigma a = \mu_4 - \sigma^4$.

29 Special case: if the population is $N(\mu, \sigma^2)$ then $\mu_3 = 0$ and $\mu_4 = 3\sigma^4$. Our calculation gives
$$n^{1/2}(s^2 - \sigma^2) \Rightarrow N(0, 2\sigma^4).$$
You can divide through by $\sigma^2$ and get
$$n^{1/2}\left(\frac{s^2}{\sigma^2} - 1\right) \Rightarrow N(0, 2).$$
In fact $ns^2/\sigma^2$ has a $\chi^2_{n-1}$ distribution, and so the usual central limit theorem shows that
$$(n-1)^{-1/2}\left[ns^2/\sigma^2 - (n-1)\right] \Rightarrow N(0, 2)$$
(using the fact that a $\chi^2_1$ has mean 1 and variance 2). Factor out $n$ to get
$$\sqrt{\frac{n}{n-1}}\; n^{1/2}(s^2/\sigma^2 - 1) + (n-1)^{-1/2} \Rightarrow N(0, 2),$$
which is the $\delta$ method calculation except for some constants. The difference is unimportant: Slutsky's theorem.
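
A quick check of the chi-square connection (my own illustration, assuming numpy): with this definition of $s^2$ and normal data, $ns^2/\sigma^2 \sim \chi^2_{n-1}$, so the standardized version should have mean near 0 and variance near 2.

```python
import numpy as np

rng = np.random.default_rng(5)
n, reps = 50, 100_000
x = rng.standard_normal((reps, n))             # N(0,1) population, sigma^2 = 1
s2 = (x**2).mean(axis=1) - x.mean(axis=1)**2
t = (n * s2 - (n - 1)) / np.sqrt(n - 1)        # standardized chi^2_{n-1}
print(t.mean(), t.var())                       # near 0 and 2
```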

30 Extending the ideas to higher dimensions. Let $W_1, W_2, \ldots$ be iid with density
$$f(x) = \exp(-(x+1))\, 1(x > -1),$$
a mean-0 shifted exponential. Set $S_k = W_1 + \cdots + W_k$. Plot $S_k/\sqrt{n}$ against $k/n$ for $k = 1, \ldots, n$. Label the $x$ axis to run from 0 to 1 and rescale the vertical axis to fit in a square.
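
A sketch reproducing the plots on the next slide (my own code, assuming numpy and matplotlib are available):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(6)
fig, axes = plt.subplots(2, 2, figsize=(8, 8))
for ax, n in zip(axes.flat, (100, 400, 1600, 10000)):
    w = rng.exponential(size=n) - 1.0          # density exp(-(x+1)) on x > -1
    s = np.cumsum(w)                           # partial sums S_k
    ax.plot(np.arange(1, n + 1) / n, s / np.sqrt(n))
    ax.set(title=f"n={n}", xlabel="k/n", ylabel="S_k/sqrt(n)")
plt.tight_layout()
plt.show()
```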

31 [Figure: $S_k/\sqrt{n}$ plotted against $k/n$ for $n = 100$, 400, 1600, and 10000.]

32 Proof of Slutsky's Theorem: First, why is it true? If $X_n \Rightarrow X$ and $Y_n \Rightarrow y$ then we will show $(X_n, Y_n) \Rightarrow (X, y)$. The point is that the joint law of $(X, y)$ is determined by the marginal laws, because $y$ is constant! Once this is done then
$$E(h(X_n, Y_n)) \to E(h(X, y))$$
by definition.
Note: You don't need continuity for all $(x, y)$, but I will do only the easy case.

33 Definition: A family $\{P_\alpha, \alpha \in A\}$ of probability measures on $(S, d)$ is tight if for each $\epsilon > 0$ there is a $K$ compact in $S$ such that for every $\alpha \in A$
$$P_\alpha(K) \ge 1 - \epsilon.$$
Theorem 7 If $S$ is a complete separable metric space then each probability measure $P$ on the Borel sets in $S$ is tight.
Proof: Let $x_1, x_2, \ldots$ be dense in $S$. For each $n$ draw balls $B_{n,1}, B_{n,2}, \ldots$ of radius $1/n$ around $x_1, x_2, \ldots$. Each point in $S$ is in one of these balls because the $x_j$ sequence is dense. That is:
$$S = \bigcup_{j=1}^\infty B_{n,j}.$$

34 Thus
$$1 = P(S) = \lim_{J \to \infty} P\left( \bigcup_{j=1}^J B_{n,j} \right).$$
Pick $J_n$ so large that
$$P\left( \bigcup_{j=1}^{J_n} B_{n,j} \right) \ge 1 - \epsilon/2^n.$$
Let $F_n$ be the closure of $\bigcup_{j=1}^{J_n} B_{n,j}$. Let $K = \bigcap_{n=1}^\infty F_n$. I claim $K$ is compact and has probability at least $1 - \epsilon$. First,
$$P(K) = 1 - P(K^c) = 1 - P\left( \bigcup_n F_n^c \right) \ge 1 - \sum_n P(F_n^c) \ge 1 - \sum_n \epsilon/2^n = 1 - \epsilon.$$
(Incidentally, you see that $K$ is not empty!)

35 Second: $K$ is closed (an intersection of closed sets). Third: $K$ is totally bounded, since each $F_n$ is a cover of $K$ by (closed) balls of radius $1/n$. So $K$ is compact.
Theorem 8 If $X_n$ converge in distribution to some $X$ in a complete separable metric space $S$ then the sequence $X_n$ is tight.
Conversely:
Theorem 9 If the sequence $X_n$ is tight then every subsequence is also tight, and there is a subsequence $X_{n_k}$ and a random variable $X$ such that $X_{n_k} \Rightarrow X$ as $k \to \infty$.
Theorem 10 If there is a rv $X$ such that every subsequence of $X_n$ has a further subsubsequence converging in distribution to $X$, then $X_n \Rightarrow X$.

36 Proof of Theorem 9: we do $R^p$. The first assertion is obvious. Let $x_1, x_2, \ldots$ be dense in $R^p$. Find a sequence $n_{1,1} < n_{1,2} < \cdots$ such that the sequence $F_{n_{1,k}}(x_1)$ has a limit, which we denote $y_1$. This exists because the probabilities are trapped in $[0,1]$ (Bolzano–Weierstrass). Pick $n_{2,1} < n_{2,2} < \cdots$, a subsequence of the $n_{1,k}$, such that $F_{n_{2,k}}(x_2)$ has a limit, which we denote $y_2$. Continue, picking the subsequence $n_{m+1,k}$ from the sequence $n_{m,k}$ so that $F_{n_{m+1,k}}(x_{m+1})$ has a limit, which we denote $y_{m+1}$.

37 Trick: Diagonalization. Consider the sequence $n_{1,1} < n_{2,2} < \cdots$. After the $k$th entry, all remaining entries are a subsequence of the $k$th subsequence $n_{k,j}$. So
$$\lim_{k \to \infty} F_{n_{k,k}}(x_j) = y_j$$
for each $j$.
Idea: we would like to define $F_X(x_j) = y_j$, but that might not give a cdf. Instead set
$$F_X(x) = \inf\{y_j : x_j > x\}.$$
Next: prove $F_X$ is a cdf. Then prove the subsequence converges to $F_X$.
