Theoretical Statistics. Lecture 1.


1. Organizational issues. 2. Overview. 3. Stochastic convergence. Theoretical Statistics. Lecture 1. Peter Bartlett

Organizational Issues Lectures: Tue/Thu 11am-12:30pm, 332 Evans. Peter Bartlett. bartlett@stat. Office hours: Tue 1-2pm, Wed 1:30-2:30pm (Evans 399). GSI: Siqi Wu. siqi@stat. Office hours: Mon 3:30-4:30pm, Tue 3:30-4:30pm (Evans 307). http://www.stat.berkeley.edu/~bartlett/courses/210b-spring2013/ Check it for announcements, homework assignments, ... Texts: Asymptotic Statistics, Aad van der Vaart. Cambridge, 1998. Convergence of Stochastic Processes, David Pollard. Springer, 1984. Available on-line at http://www.stat.yale.edu/~pollard/1984book/

Organizational Issues Assessment: Homework Assignments (60%): posted on the website. Final Exam (40%): scheduled for Thursday, 5/16/13, 8-11am. Required background: Stat 210A, and either Stat 205A or Stat 204.

Asymptotics: Why? Example: We have a sample of size n from a density p_θ. Some estimator gives θ̂_n. Consistent? i.e., θ̂_n →_P θ? Stochastic convergence. Rate? Is it optimal? Often no finite sample optimality results. Asymptotically optimal? Variance of estimate? Optimal? Asymptotically? Distribution of estimate? Confidence region. Asymptotically?

Asymptotics: Approximate confidence regions Example: We have a sample of size n from a density p_θ. Maximum likelihood estimator gives θ̂_n. Under mild conditions, √n (θ̂_n − θ) is asymptotically N(0, I_θ⁻¹). Thus √n I_θ^{1/2} (θ̂_n − θ) →_d N(0, I), and n (θ̂_n − θ)ᵀ I_θ (θ̂_n − θ) →_d χ²_k. So we have an approximate 1 − α confidence region for θ: { θ : (θ − θ̂_n)ᵀ I_{θ̂_n} (θ − θ̂_n) ≤ χ²_{k,α} / n }.
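A minimal simulation sketch (not from the slides; numpy and scipy assumed) of this approximate region, for a Bernoulli(θ) sample: the MLE is the sample mean, the Fisher information is I_θ = 1/(θ(1−θ)), and the region reduces to an interval whose empirical coverage should be close to 1 − α.

```python
# Empirical coverage of the approximate chi-square confidence region,
# specialized to a one-parameter Bernoulli(theta) model (k = 1).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
theta, n, alpha, reps = 0.3, 500, 0.05, 2000
chi2_crit = stats.chi2.ppf(1 - alpha, df=1)  # chi^2_{k, alpha} with k = 1

covered = 0
for _ in range(reps):
    x = rng.binomial(1, theta, size=n)
    theta_hat = x.mean()                                  # MLE
    info_hat = 1.0 / (theta_hat * (1.0 - theta_hat))      # plug-in Fisher info
    # Region: n * (theta - theta_hat)^2 * I_{theta_hat} <= chi2_crit
    if n * (theta - theta_hat) ** 2 * info_hat <= chi2_crit:
        covered += 1

coverage = covered / reps
print(f"empirical coverage: {coverage:.3f}")  # close to 1 - alpha = 0.95
```

The asymptotic approximation is only as good as n allows; for small n or θ near 0 or 1 the coverage can drift from the nominal level.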

Overview of the Course 1. Tools for consistency, rates, asymptotic distributions: Stochastic convergence. Concentration inequalities. Projections. U-statistics. Delta method. 2. Tools for richer settings (eg: function space vs R^k): Uniform laws of large numbers. Empirical process theory. Metric entropy. Functional delta method.

3. Tools for asymptotics of likelihood ratios: Contiguity. Local asymptotic normality. 4. Asymptotic optimality: Efficiency of estimators. Efficiency of tests. 5. Applications: Nonparametric regression. Nonparametric density estimation. M-estimators. Bootstrap estimators.

Convergence in Distribution X_1, X_2, ..., X are random vectors. Definition: X_n converges in distribution (or weakly converges) to X (written X_n →_d X) means that their distribution functions satisfy F_n(x) → F(x) at all continuity points x of F.
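A small numerical sketch (not from the slides; scipy assumed): the distribution function of the standardized Binomial(n, p) approaches the standard normal CDF pointwise, illustrating F_n(x) → F(x) at the continuity points of F.

```python
# F_n(x) -> Phi(x) for the standardized Binomial(n, p), checked on a grid.
import numpy as np
from scipy import stats

p = 0.3
grid = np.linspace(-2.5, 2.5, 11)

def standardized_binom_cdf(n, x):
    # P((X_n - n p) / sqrt(n p (1 - p)) <= x) with X_n ~ Binomial(n, p)
    return stats.binom.cdf(n * p + x * np.sqrt(n * p * (1 - p)), n, p)

err = {n: np.max(np.abs(standardized_binom_cdf(n, grid) - stats.norm.cdf(grid)))
       for n in (10, 100, 10000)}
print(err)  # maximal error over the grid shrinks as n grows
```

Note that the limit F = Φ is continuous everywhere here, so the "continuity points" caveat is automatic; it matters when the limit has atoms.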

Review: Other Types of Convergence d is a distance on R^k (for which the Borel σ-algebra is the usual one). Definition: X_n converges almost surely to X (written X_n →_a.s. X) means that d(X_n, X) → 0 a.s. Definition: X_n converges in probability to X (written X_n →_P X) means that, for all ε > 0, P(d(X_n, X) > ε) → 0.

Review: Other Types of Convergence Theorem: X_n →_a.s. X ⇒ X_n →_P X ⇒ X_n →_d X, and, for a constant c, X_n →_d c ⇔ X_n →_P c. NB: For X_n →_a.s. X and X_n →_P X, X_n and X must be functions on the sample space of the same probability space. But not for convergence in distribution.

Convergence in Distribution: Equivalent Definitions Theorem: [Portmanteau] The following are equivalent: 1. P(X_n ≤ x) → P(X ≤ x) for all continuity points x of x ↦ P(X ≤ x). 2. Ef(X_n) → Ef(X) for all bounded, continuous f. 3. Ef(X_n) → Ef(X) for all bounded, Lipschitz f. 4. E e^{i tᵀ X_n} → E e^{i tᵀ X} for all t ∈ R^k. (Lévy's Continuity Theorem) 5. For all t ∈ R^k, tᵀ X_n →_d tᵀ X. (Cramér-Wold Device) 6. lim inf Ef(X_n) ≥ Ef(X) for all nonnegative, continuous f. 7. lim inf P(X_n ∈ U) ≥ P(X ∈ U) for all open U. 8. lim sup P(X_n ∈ F) ≤ P(X ∈ F) for all closed F. 9. P(X_n ∈ B) → P(X ∈ B) for all continuity sets B (i.e., P(X ∈ ∂B) = 0).

Convergence in Distribution: Equivalent Definitions Example: [Why do we need continuity?] Consider f(x) = 1[x > 0] and X_n = 1/n. Then X_n →_d 0 and f(X_n) = 1 → 1, but f(0) = 0. [Why do we need boundedness?] Consider f(x) = x and X_n = n w.p. 1/n, X_n = 0 w.p. 1 − 1/n. Then X_n →_d 0 and Ef(X_n) = 1 → 1, but f(0) = 0.

Relating Convergence Properties Theorem: X_n →_d X and d(X_n, Y_n) →_P 0 ⇒ Y_n →_d X. X_n →_d X and Y_n →_P c ⇒ (X_n, Y_n) →_d (X, c). X_n →_P X and Y_n →_P Y ⇒ (X_n, Y_n) →_P (X, Y).

Relating Convergence Properties Example: NB: NOT X_n →_d X and Y_n →_d Y ⇒ (X_n, Y_n) →_d (X, Y). (joint convergence versus marginal convergence in distribution) Consider X, Y independent N(0,1), X_n ∼ N(0,1), Y_n = −X_n. Then X_n →_d X and Y_n →_d Y, but (X_n, Y_n) →_d (X, −X), which has a very different distribution from that of (X, Y).
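A quick numeric illustration of this counterexample (a sketch, numpy assumed): with Y_n = −X_n each coordinate is N(0,1), but the sum X_n + Y_n is identically 0, whereas for independent X, Y ∼ N(0,1) the sum has variance 2, so (X_n, Y_n) cannot converge in distribution to (X, Y).

```python
# Marginal N(0,1) coordinates, but very different joint behavior of the sum.
import numpy as np

rng = np.random.default_rng(1)
xn = rng.standard_normal(100_000)
yn = -xn                                   # marginally N(0,1) as well
x, y = rng.standard_normal((2, 100_000))   # independent N(0,1) pair

var_sum_n = np.var(xn + yn)   # exactly 0: the sum is degenerate
var_sum = np.var(x + y)       # approximately 2
print(var_sum_n, var_sum)
```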

Relating Convergence Properties: Continuous Mapping Suppose f : R^k → R^m is almost surely continuous (i.e., for some S with P(X ∈ S) = 1, f is continuous on S). Theorem: [Continuous mapping] X_n →_d X ⇒ f(X_n) →_d f(X). X_n →_P X ⇒ f(X_n) →_P f(X). X_n →_a.s. X ⇒ f(X_n) →_a.s. f(X).

Relating Convergence Properties: Continuous Mapping Example: For X_1, ..., X_n i.i.d. with mean µ and variance σ², we have (√n/σ)(X̄_n − µ) →_d N(0,1). So (n/σ²)(X̄_n − µ)² →_d (N(0,1))² = χ²_1. Example: We also have X̄_n − µ →_P 0, hence (X̄_n − µ)² →_P 0. Consider f(x) = 1[x > 0]. Then f((X̄_n − µ)²) = 1 whenever X̄_n ≠ µ, which does not converge to f(0) = 0. (The problem is that f is not continuous at 0, and P(X = 0) > 0 for the limit X = 0 of (X̄_n − µ)².)
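A minimal check of the first example (a sketch, not from the slides; scipy assumed): for i.i.d. Exponential(1) data, µ = σ² = 1, so (n/σ²)(X̄_n − µ)² should be approximately χ² with 1 degree of freedom.

```python
# Simulate n * (Xbar_n - mu)^2 for Exponential(1) data and compare with chi^2_1.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, reps = 2000, 2000
samples = rng.exponential(1.0, size=(reps, n))
stat = n * (samples.mean(axis=1) - 1.0) ** 2  # mu = 1, sigma^2 = 1

# Kolmogorov-Smirnov distance between the simulated values and chi^2_1
ks = stats.kstest(stat, stats.chi2(df=1).cdf).statistic
print(ks)  # small when n is large
```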

Relating Convergence Properties: Slutsky's Lemma Theorem: X_n →_d X and Y_n →_P c imply X_n + Y_n →_d X + c, Y_n X_n →_d cX, Y_n⁻¹ X_n →_d c⁻¹ X (the last provided c is invertible). (Why does X_n →_d X and Y_n →_d Y not imply X_n + Y_n →_d X + Y?)

Relating Convergence Properties: Examples Theorem: For i.i.d. Y_i with EY_1 = µ and Var(Y_1) = σ² < ∞, √n (Ȳ_n − µ)/S_n →_d N(0,1), where Ȳ_n = n⁻¹ Σ_{i=1}^n Y_i and S_n² = (n − 1)⁻¹ Σ_{i=1}^n (Y_i − Ȳ_n)².

Proof: S_n² = [n/(n − 1)] · [ n⁻¹ Σ_{i=1}^n Y_i² − Ȳ_n² ], where n/(n − 1) → 1, n⁻¹ Σ Y_i² →_P EY_1², and Ȳ_n →_P EY_1 (weak law of large numbers). Hence S_n² →_P EY_1² − (EY_1)² = σ² (continuous mapping theorem, Slutsky's Lemma).

Also, √n (Ȳ_n − µ)/S_n = √n (Ȳ_n − µ) · (1/S_n), where √n (Ȳ_n − µ) →_d N(0, σ²) (central limit theorem) and 1/S_n →_P 1/σ (continuous mapping theorem). Hence √n (Ȳ_n − µ)/S_n →_d N(0,1) (Slutsky's Lemma).
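A short simulation sketch of this theorem (not from the slides; numpy and scipy assumed): the studentized mean √n (Ȳ_n − µ)/S_n is approximately N(0, 1) even for strongly skewed data.

```python
# Studentized mean for Exponential(scale=2) data (mu = 2, far from normal).
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n, reps = 1000, 4000
y = rng.exponential(2.0, size=(reps, n))
t = np.sqrt(n) * (y.mean(axis=1) - 2.0) / y.std(axis=1, ddof=1)

# Kolmogorov-Smirnov distance to the standard normal distribution function
ks = stats.kstest(t, stats.norm.cdf).statistic
print(ks)  # small: the studentized mean is close to N(0, 1)
```

Note the theorem needs no normality assumption on the Y_i, only a finite variance; that is exactly what the exponential example exercises.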

Showing Convergence in Distribution Recall that the characteristic function demonstrates weak convergence: X_n →_d X ⇔ E e^{i tᵀ X_n} → E e^{i tᵀ X} for all t ∈ R^k. Theorem: [Lévy's Continuity Theorem] If E e^{i tᵀ X_n} → φ(t) for all t in R^k, and φ : R^k → C is continuous at 0, then X_n →_d X, where E e^{i tᵀ X} = φ(t). Special case: X_n = Y. So X, Y have the same distribution iff φ_X = φ_Y.

Showing Convergence in Distribution Theorem: [Weak law of large numbers] Suppose X_1, ..., X_n are i.i.d. Then X̄_n →_P µ iff φ′_{X_1}(0) = iµ. Proof: We'll show that φ′_{X_1}(0) = iµ implies X̄_n →_P µ. Indeed, E e^{i t X̄_n} = φ_{X_1}ⁿ(t/n) = (1 + t iµ/n + o(1/n))ⁿ → e^{itµ} = φ_µ(t), the characteristic function of the constant µ. Lévy's Theorem implies X̄_n →_d µ, hence X̄_n →_P µ.
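A numerical sketch (not from the slides; numpy assumed) of the key step φ_{X_1}ⁿ(t/n) → e^{itµ}: for X_1 ∼ Bernoulli(p) the characteristic function is φ(s) = 1 − p + p e^{is}, and µ = p.

```python
# phi(t/n)^n approaches e^{i t mu} as n grows, for Bernoulli(p) with mu = p.
import numpy as np

p, t = 0.3, 2.0
phi = lambda s: 1 - p + p * np.exp(1j * s)  # characteristic function of X_1
limit = np.exp(1j * t * p)                  # e^{i t mu}

errs = [abs(phi(t / n) ** n - limit) for n in (10, 100, 10000)]
print(errs)  # decreasing toward 0
```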

Showing Convergence in Distribution e.g., X ∼ N(µ, Σ) has characteristic function φ_X(t) = E e^{i tᵀ X} = e^{i tᵀ µ − tᵀ Σ t / 2}. Theorem: [Central limit theorem] Suppose X_1, ..., X_n are i.i.d., EX_1 = 0, EX_1² = 1. Then √n X̄_n →_d N(0,1).

Proof: φ_{X_1}(0) = 1, φ′_{X_1}(0) = i EX_1 = 0, φ″_{X_1}(0) = i² EX_1² = −1. So E e^{i t √n X̄_n} = φ_{X_1}ⁿ(t/√n) = (1 + 0 − t² EX_1²/(2n) + o(1/n))ⁿ → e^{−t²/2} = φ_{N(0,1)}(t).

Uniformly tight Definition: X is tight means that for all ε > 0 there is an M for which P(|X| > M) < ε. {X_n} is uniformly tight (or bounded in probability) means that for all ε > 0 there is an M for which sup_n P(|X_n| > M) < ε. (So there is a compact set that contains each X_n with high probability.)

Uniformly tight Theorem: [Prohorov's Theorem] 1. X_n →_d X implies {X_n} is uniformly tight. 2. {X_n} uniformly tight implies that for some X and some subsequence, X_{n_j} →_d X.

Notation for rates: o_P, O_P Definition: X_n = o_P(1) means X_n →_P 0; X_n = o_P(R_n) means X_n = Y_n R_n with Y_n = o_P(1). X_n = O_P(1) means {X_n} is uniformly tight; X_n = O_P(R_n) means X_n = Y_n R_n with Y_n = O_P(1). (i.e., o_P, O_P specify rates of growth of a sequence: o_P means strictly slower (the sequence Y_n converges in probability to zero); O_P means within some constant (the sequence Y_n lies in a ball with high probability).)

Relations between rates o_P(1) + o_P(1) = o_P(1). o_P(1) + O_P(1) = O_P(1). o_P(1) O_P(1) = o_P(1). (1 + o_P(1))⁻¹ = O_P(1). o_P(O_P(1)) = o_P(1). X_n →_P 0 and R(h) = o(|h|^p) imply R(X_n) = o_P(|X_n|^p). X_n →_P 0 and R(h) = O(|h|^p) imply R(X_n) = O_P(|X_n|^p).
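A closing illustration of the O_P notation (a sketch, not from the slides; numpy assumed): X̄_n − µ = O_P(n^{−1/2}) means √n (X̄_n − µ) is uniformly tight, i.e. P(|√n (X̄_n − µ)| > M) stays small for a fixed M, uniformly over n.

```python
# sqrt(n) * (Xbar_n - mu) stays bounded in probability as n grows.
import numpy as np

rng = np.random.default_rng(4)
M, reps = 1.0, 500
fracs = []
for n in (100, 1000, 10000):
    x = rng.uniform(0, 1, size=(reps, n))         # mu = 1/2, sigma^2 = 1/12
    scaled = np.sqrt(n) * (x.mean(axis=1) - 0.5)  # approx N(0, 1/12), every n
    fracs.append(np.mean(np.abs(scaled) > M))     # estimate of P(|.| > M)
print(fracs)  # all small, and they do not grow with n
```

Contrast with X̄_n − µ itself, which is o_P(1): without the √n scaling the deviations shrink to zero rather than merely staying bounded.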