Maximum Likelihood Large Sample Theory
MIT 18.443, Dr. Kempthorne, Spring 2015
Outline
1. Large Sample Theory of Maximum Likelihood Estimates
Asymptotic Results: Overview

Asymptotic Framework
Data Model: $X_n = (X_1, X_2, \ldots, X_n)$, an i.i.d. sample with pdf/pmf $f(x_1, \ldots, x_n \mid \theta) = \prod_{i=1}^n f(x_i \mid \theta)$.
Data Realization: $X_n = x_n = (x_1, \ldots, x_n)$.
Likelihood of $\theta$ (given $x_n$): $lik(\theta) = f(x_1, \ldots, x_n \mid \theta)$.
$\hat\theta_n$: MLE of $\theta$ given $x_n = (x_1, \ldots, x_n)$.
$\{\hat\theta_n\}$: sequence of MLEs indexed by sample size $n$.

Results
Consistency: $\hat\theta_n \xrightarrow{P} \theta$.
Asymptotic Variance: $\sigma^2_{\hat\theta_n} = Var(\hat\theta_n) \approx \kappa(\theta)/n$, where $\kappa(\theta)$ is an explicit function of the pdf/pmf $f(\cdot \mid \theta)$.
Limiting Distribution: $\sqrt{n}\,(\hat\theta_n - \theta) \xrightarrow{L} N(0, \kappa(\theta))$.
Theorems for Asymptotic Results

Setting: $x_1, \ldots, x_n$ a realization of an i.i.d. sample from a distribution with density/pmf $f(x \mid \theta)$.
$\ell(\theta) = \sum_{i=1}^n \ln f(x_i \mid \theta)$ (the log likelihood).
$\theta_0$: true value of $\theta$.
$\hat\theta_n$: the MLE.

Theorem A. Under appropriate smoothness conditions on $f$, the MLE $\hat\theta_n$ is consistent, i.e., for any true value $\theta_0$ and every $\epsilon > 0$, $P(|\hat\theta_n - \theta_0| > \epsilon) \to 0$.

Proof: By the Weak Law of Large Numbers (WLLN),
$\frac{1}{n}\ell(\theta) \xrightarrow{P} E[\log f(X \mid \theta) \mid \theta_0] = \int \log[f(x \mid \theta)]\, f(x \mid \theta_0)\,dx.$
(Note!! The statement holds for every $\theta$, given any value of $\theta_0$.)
Theorem 8.5.2A (continued)

Proof (continued):
The MLE $\hat\theta_n$ maximizes $\frac{1}{n}\ell(\theta)$.
Since $\frac{1}{n}\ell(\theta) \to E[\log f(X \mid \theta) \mid \theta_0]$, $\hat\theta_n$ is close to the $\theta$ maximizing $E[\log f(X \mid \theta) \mid \theta_0]$.
Under smoothness conditions on $f(x \mid \theta)$, $\theta$ maximizes $E[\log f(X \mid \theta) \mid \theta_0]$ if $\theta$ solves
$\frac{d}{d\theta} E[\log f(X \mid \theta) \mid \theta_0] = 0.$
Claim: $\theta = \theta_0$ solves this equation:
$\frac{d}{d\theta} E[\log f(X \mid \theta) \mid \theta_0] = \frac{d}{d\theta} \int \log[f(x \mid \theta)]\, f(x \mid \theta_0)\,dx$
$= \int \left[\frac{d}{d\theta} \log f(x \mid \theta)\right] f(x \mid \theta_0)\,dx = \int \frac{\frac{d}{d\theta} f(x \mid \theta)}{f(x \mid \theta)}\, f(x \mid \theta_0)\,dx$
(at $\theta = \theta_0$) $= \left[\int \frac{d}{d\theta} f(x \mid \theta)\,dx\right]_{\theta=\theta_0} = \left[\frac{d}{d\theta} \int f(x \mid \theta)\,dx\right]_{\theta=\theta_0} = \frac{d}{d\theta}(1) = 0.$
Theorems for Asymptotic Results

Theorem 8.5.B. Under smoothness conditions on $f(x \mid \theta)$,
$\sqrt{n}\,(\hat\theta_n - \theta_0) \xrightarrow{L} N(0, 1/I(\theta_0)),$
where $I(\theta) = E\left[\left(\frac{\partial}{\partial\theta} \log f(X \mid \theta)\right)^2\right]$.

Proof: Using the Taylor approximation to $\ell'(\theta)$, centered at $\theta_0$, consider the following development:
$0 = \ell'(\hat\theta) \approx \ell'(\theta_0) + (\hat\theta - \theta_0)\ell''(\theta_0)$
$-\ell'(\theta_0) = (\hat\theta - \theta_0)\ell''(\theta_0)$
$\sqrt{n}\left[\tfrac{1}{n}\ell'(\theta_0)\right] = \sqrt{n}\,(\hat\theta - \theta_0)\cdot \tfrac{1}{n}[-\ell''(\theta_0)]$
By the CLT, $\sqrt{n}\left[\tfrac{1}{n}\ell'(\theta_0)\right] \xrightarrow{L} N(0, I(\theta_0))$ (Lemma A).
By the WLLN, $\tfrac{1}{n}[-\ell''(\theta_0)] \xrightarrow{P} I(\theta_0)$.
Thus $\sqrt{n}\,(\hat\theta - \theta_0) \xrightarrow{L} \frac{1}{I(\theta_0)} N(0, I(\theta_0)) = N(0, 1/I(\theta_0))$.
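Theorem 8.5.B can be checked numerically. The sketch below (an illustration, not part of the original slides) simulates i.i.d. Exponential($\lambda$) samples, for which the MLE is $\hat\lambda = 1/\bar X$ and $I(\lambda) = 1/\lambda^2$, so the theorem predicts $\sqrt{n}(\hat\lambda - \lambda) \approx N(0, \lambda^2)$:

```python
import random
import statistics

random.seed(0)
lam, n, reps = 2.0, 400, 2000   # true rate, sample size, replications
zs = []
for _ in range(reps):
    x = [random.expovariate(lam) for _ in range(n)]
    lam_hat = 1.0 / (sum(x) / n)            # exponential MLE: 1 / sample mean
    zs.append((n ** 0.5) * (lam_hat - lam)) # sqrt(n)(theta_hat - theta_0)
# Theorem 8.5.B predicts mean 0 and variance 1/I(lam) = lam**2 = 4
print(statistics.mean(zs), statistics.variance(zs))
```

The empirical variance of the simulated $\sqrt{n}(\hat\lambda - \lambda)$ values comes out close to $\lambda^2 = 4$, and the mean close to 0, as the theorem predicts.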
Theorems for Asymptotic Results

Lemma A (extended). For the distribution with pdf/pmf $f(x \mid \theta)$, define
the Score Function: $U(X; \theta) = \frac{\partial}{\partial\theta} \log f(X \mid \theta)$
the (Fisher) Information of the distribution: $I(\theta) = E\left[-\frac{\partial^2}{\partial\theta^2} \log f(X \mid \theta)\right]$
Then, under sufficient smoothness conditions on $f(x \mid \theta)$:
(a) $E[U(X; \theta) \mid \theta] = 0$
(b) $Var[U(X; \theta) \mid \theta] = I(\theta) = E[U(X; \theta)^2 \mid \theta]$
Proof: Differentiate $\int f(x \mid \theta)\,dx = 1$ with respect to $\theta$ two times, interchanging the order of differentiation and integration. (a) follows from the first derivative; (b) follows from the second derivative.
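For a discrete model the two identities in Lemma A can be verified exactly by summing over the sample space. The following sketch (an added illustration) does this for Bernoulli($p$), where the score is $U(x; p) = x/p - (1-x)/(1-p)$ and the information has the known closed form $I(p) = 1/(p(1-p))$:

```python
# Finite check of Lemma A for the Bernoulli(p) model.
def score(x, p):
    # d/dp log f(x|p), with f(x|p) = p**x * (1 - p)**(1 - x)
    return x / p - (1 - x) / (1 - p)

p = 0.3
pmf = {0: 1 - p, 1: p}                                  # P(X = x)
e_u = sum(score(x, p) * w for x, w in pmf.items())      # E[U] -- should be 0
var_u = sum(score(x, p) ** 2 * w for x, w in pmf.items())  # E[U^2] = Var[U]
fisher = 1.0 / (p * (1 - p))                            # I(p) for Bernoulli
print(e_u, var_u, fisher)
```

Up to floating-point error, $E[U] = 0$ and $Var[U] = I(p)$, as parts (a) and (b) assert.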
Theorems for Asymptotic Results

Qualifications/Extensions
The results require the true $\theta_0$ to lie in the interior of the parameter space.
The results require that the support $\{x : f(x \mid \theta) > 0\}$ not vary with $\theta$.
The results extend to multi-dimensional $\theta$:
  vector-valued Score Function
  matrix-valued (Fisher) Information
Outline
1. Large Sample Theory of Maximum Likelihood Estimates
Confidence Intervals

Confidence Interval for a Normal Mean
$X_1, X_2, \ldots, X_n$ i.i.d. $N(\mu, \sigma_0^2)$, unknown mean $\mu$ (known $\sigma_0^2$).
Parameter Estimate: $\bar X$ (sample mean), $\bar X = \frac{1}{n}\sum_{i=1}^n X_i$
$E[\bar X] = \mu$
$Var[\bar X] = \sigma^2_{\bar X} = \sigma_0^2/n$
A 95% confidence interval for $\mu$ is a random interval, calculated from the data, that contains $\mu$ with probability 0.95, no matter what the value of the true $\mu$.
Confidence Interval for a Normal Mean

Sampling distribution of $\bar X$:
$\bar X \sim N(\mu, \sigma^2_{\bar X}) = N(\mu, \sigma_0^2/n)$
$Z = \frac{\bar X - \mu}{\sigma_{\bar X}} \sim N(0, 1)$ (standard normal)
For any $0 < \alpha < 1$ (e.g., $\alpha = 0.05$), define $z(\alpha/2)$ by $P(Z > z(\alpha/2)) = \alpha/2$.
By symmetry of the standard normal distribution,
$P(-z(\alpha/2) < Z < +z(\alpha/2)) = 1 - \alpha$
i.e., $P\left(-z(\alpha/2) < \frac{\bar X - \mu}{\sigma_{\bar X}} < +z(\alpha/2)\right) = 1 - \alpha$
Re-express the interval for $Z$ as an interval for $\mu$:
$P\left(\bar X - z(\alpha/2)\sigma_{\bar X} < \mu < \bar X + z(\alpha/2)\sigma_{\bar X}\right) = 1 - \alpha$
The interval given by $[\bar X \pm z(\alpha/2)\,\sigma_{\bar X}]$ is the $100(1-\alpha)\%$ confidence interval for $\mu$.
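The defining property of the interval, its coverage probability, can be demonstrated by simulation. The following sketch (an added illustration, not from the slides) constructs the known-$\sigma_0$ interval repeatedly and counts how often it covers the true $\mu$:

```python
import random
from statistics import NormalDist

random.seed(1)
mu, sigma0, n, alpha = 5.0, 2.0, 25, 0.05
z = NormalDist().inv_cdf(1 - alpha / 2)   # z(alpha/2), approx. 1.96
reps, covered = 2000, 0
for _ in range(reps):
    xbar = sum(random.gauss(mu, sigma0) for _ in range(n)) / n
    half = z * sigma0 / n ** 0.5          # z(alpha/2) * sigma_xbar
    if xbar - half < mu < xbar + half:
        covered += 1
print(covered / reps)                     # empirical coverage, near 1 - alpha
```

Because the normal-theory interval is exact, the empirical coverage fraction settles near 0.95.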
Confidence Interval for a Normal Mean

Important Properties/Qualifications
The confidence interval is random; the parameter $\mu$ is not random.
$100(1-\alpha)\%$, the confidence level of the confidence interval, is the probability that the random interval contains the fixed parameter $\mu$ (the coverage probability of the confidence interval).
Given a data realization $(X_1, \ldots, X_n) = (x_1, \ldots, x_n)$, $\mu$ is either inside the realized confidence interval or not.
The confidence level measures the reliability of a sequence of confidence intervals constructed in this way.
The confidence interval quantifies the uncertainty of the parameter estimate.
Confidence Intervals for Normal Distribution Parameters

Normal distribution with unknown mean $\mu$ and variance $\sigma^2$:
$X_1, \ldots, X_n$ i.i.d. $N(\mu, \sigma^2)$, with MLEs
$\hat\mu = \bar X$ and $\hat\sigma^2 = \frac{1}{n}\sum_{i=1}^n (X_i - \bar X)^2$.
Confidence interval for $\mu$ based on the T-statistic:
$T = \frac{\bar X - \mu}{S/\sqrt{n}} \sim t_{n-1}$
where $t_{n-1}$ is Student's t distribution with $(n-1)$ degrees of freedom and $S^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar X)^2$.
Define $t_{n-1}(\alpha/2)$ by $P(t_{n-1} > t_{n-1}(\alpha/2)) = \alpha/2$.
By symmetry of Student's t distribution,
$P\left[-t_{n-1}(\alpha/2) < T < +t_{n-1}(\alpha/2)\right] = 1 - \alpha$
i.e., $P\left[-t_{n-1}(\alpha/2) < \frac{\bar X - \mu}{S/\sqrt{n}} < +t_{n-1}(\alpha/2)\right] = 1 - \alpha$
Confidence Interval for a Normal Mean

Re-express the interval for $T$ as an interval for $\mu$:
$P\left[-t_{n-1}(\alpha/2) < \frac{\bar X - \mu}{S/\sqrt{n}} < +t_{n-1}(\alpha/2)\right] = 1 - \alpha$
$\Rightarrow P\left[\bar X - t_{n-1}(\alpha/2)\,S/\sqrt{n} < \mu < \bar X + t_{n-1}(\alpha/2)\,S/\sqrt{n}\right] = 1 - \alpha$
The interval given by $[\bar X \pm t_{n-1}(\alpha/2)\,S/\sqrt{n}]$ is the $100(1-\alpha)\%$ confidence interval for $\mu$.
Properties of confidence intervals for $\mu$:
The center is $\bar X = \hat\mu_{MLE}$.
The width is proportional to $\hat\sigma_{\hat\mu_{MLE}} = S/\sqrt{n}$ (random!).
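A worked instance of the t-interval (an added sketch; the data below are hypothetical, and the quantile $t_9(0.025) = 2.262$ is the standard table value for $n = 10$):

```python
from math import sqrt

# t-interval for the mean with unknown variance, n = 10 (df = 9).
t_quantile = 2.262                   # t_9(0.025), from a standard t table
x = [5.1, 4.8, 5.6, 5.0, 4.7, 5.3, 5.2, 4.9, 5.4, 5.0]  # hypothetical data
n = len(x)
xbar = sum(x) / n
s2 = sum((xi - xbar) ** 2 for xi in x) / (n - 1)  # S^2, divisor n - 1
half = t_quantile * sqrt(s2) / sqrt(n)            # t_{n-1}(alpha/2) * S / sqrt(n)
print(xbar - half, xbar + half)                   # the 95% interval for mu
```

Note the divisor $n-1$ in $S^2$; using the MLE $\hat\sigma^2$ (divisor $n$) here would not match the $t_{n-1}$ reference distribution.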
Confidence Interval for a Normal Variance

Normal distribution with unknown mean $\mu$ and variance $\sigma^2$:
$X_1, \ldots, X_n$ i.i.d. $N(\mu, \sigma^2)$, with MLEs
$\hat\mu = \bar X$ and $\hat\sigma^2 = \frac{1}{n}\sum_{i=1}^n (X_i - \bar X)^2$.
Confidence interval for $\sigma^2$ based on the sampling distribution of the MLE $\hat\sigma^2$:
$\Omega = \frac{n\hat\sigma^2}{\sigma^2} \sim \chi^2_{n-1}$
where $\chi^2_{n-1}$ is the chi-squared distribution with $(n-1)$ degrees of freedom.
Define $\chi^2_{n-1}(\alpha^*)$ by $P(\chi^2_{n-1} > \chi^2_{n-1}(\alpha^*)) = \alpha^*$.
Using $\alpha^* = \alpha/2$ and $\alpha^* = 1 - \alpha/2$,
$P\left[\chi^2_{n-1}(1-\alpha/2) < \Omega < \chi^2_{n-1}(\alpha/2)\right] = 1 - \alpha$
i.e., $P\left[\chi^2_{n-1}(1-\alpha/2) < \frac{n\hat\sigma^2}{\sigma^2} < \chi^2_{n-1}(\alpha/2)\right] = 1 - \alpha$
Confidence Interval for a Normal Variance

Re-express the interval for $\Omega$ as an interval for $\sigma^2$:
$P\left[\chi^2_{n-1}(1-\alpha/2) < \Omega < \chi^2_{n-1}(\alpha/2)\right] = 1 - \alpha$
$P\left[\chi^2_{n-1}(1-\alpha/2) < \frac{n\hat\sigma^2}{\sigma^2} < \chi^2_{n-1}(\alpha/2)\right] = 1 - \alpha$
$P\left[\frac{n\hat\sigma^2}{\chi^2_{n-1}(\alpha/2)} < \sigma^2 < \frac{n\hat\sigma^2}{\chi^2_{n-1}(1-\alpha/2)}\right] = 1 - \alpha$
The $100(1-\alpha)\%$ confidence interval for $\sigma^2$ is given by
$\left[\frac{n\hat\sigma^2}{\chi^2_{n-1}(\alpha/2)},\; \frac{n\hat\sigma^2}{\chi^2_{n-1}(1-\alpha/2)}\right]$
Properties of the confidence interval for $\sigma^2$:
Asymmetrical about the MLE $\hat\sigma^2$.
Width proportional to $\hat\sigma^2$ (random!).
A $100(1-\alpha)\%$ confidence interval for $\sigma$ is immediate:
$\left[\hat\sigma\sqrt{\frac{n}{\chi^2_{n-1}(\alpha/2)}},\; \hat\sigma\sqrt{\frac{n}{\chi^2_{n-1}(1-\alpha/2)}}\right]$
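The variance interval above can also be checked by simulation. In this added sketch the $\chi^2_9$ quantiles (for $n = 10$) are taken from a standard table, and the coverage of the resulting interval is estimated empirically:

```python
import random

random.seed(2)
# chi-square quantiles for df = 9, from a standard table:
# P(chi2_9 > 19.023) = 0.025 and P(chi2_9 > 2.700) = 0.975
chi_upper, chi_lower = 19.023, 2.700
mu, sigma2, n, reps = 0.0, 4.0, 10, 2000
covered = 0
for _ in range(reps):
    x = [random.gauss(mu, sigma2 ** 0.5) for _ in range(n)]
    xbar = sum(x) / n
    s2_mle = sum((xi - xbar) ** 2 for xi in x) / n   # MLE of sigma^2 (divisor n)
    lo = n * s2_mle / chi_upper                      # n*sigma_hat^2 / chi2(alpha/2)
    hi = n * s2_mle / chi_lower                      # n*sigma_hat^2 / chi2(1-alpha/2)
    if lo < sigma2 < hi:
        covered += 1
print(covered / reps)   # empirical coverage of the 95% interval
```

Because $n\hat\sigma^2/\sigma^2$ is exactly $\chi^2_{n-1}$ under the normal model, the coverage is exactly 0.95 and the simulation estimate lands close to it.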
Confidence Intervals for Normal Distribution Parameters

Important features of the normal distribution case:
Required use of the exact sampling distributions of the MLEs $\hat\mu$ and $\hat\sigma^2$.
Construction of each confidence interval is based on a pivotal quantity: a function of the data and the parameters whose distribution does not involve any unknown parameters.
Examples of pivotals:
$T = \frac{\bar X - \mu}{S/\sqrt{n}} \sim t_{n-1}$
$\Omega = \frac{n\hat\sigma^2}{\sigma^2} \sim \chi^2_{n-1}$
Confidence Intervals Based on Large Sample Theory

Asymptotic Framework (Recap)
Data Model: $X_n = (X_1, X_2, \ldots, X_n)$, an i.i.d. sample with pdf/pmf $f(x_1, \ldots, x_n \mid \theta) = \prod_{i=1}^n f(x_i \mid \theta)$.
Likelihood of $\theta$ given $X_n = x_n = (x_1, \ldots, x_n)$: $lik(\theta) = f(x_1, \ldots, x_n \mid \theta)$.
$\hat\theta_n = \hat\theta_n(x_n)$: MLE of $\theta$ given $x_n = (x_1, \ldots, x_n)$.
$\{\hat\theta_n\}$: sequence of MLEs indexed by sample size $n$.
Results (subject to sufficient smoothness conditions on $f$):
Consistency: $\hat\theta_n \xrightarrow{P} \theta$.
Asymptotic Variance: $\sigma^2_{\hat\theta_n} = Var(\hat\theta_n) \approx \frac{1}{nI(\theta)}$,
where $I(\theta) = E\left[\left(\frac{d}{d\theta}\log f(x \mid \theta)\right)^2\right] = -E\left[\frac{d^2}{d\theta^2}\log f(x \mid \theta)\right]$.
Limiting Distribution: $\sqrt{nI(\theta)}\,(\hat\theta_n - \theta) \xrightarrow{L} N(0, 1)$.
Confidence Intervals Based on Large Sample Theory

Large-Sample Confidence Interval
Exploit the limiting pivotal quantity
$Z_n = \sqrt{nI(\theta)}\,(\hat\theta_n - \theta) \xrightarrow{L} N(0, 1).$
I.e.,
$P(-z(\alpha/2) < Z_n < +z(\alpha/2)) \to 1 - \alpha$
$P(-z(\alpha/2) < \sqrt{nI(\theta)}\,(\hat\theta_n - \theta) < +z(\alpha/2)) \to 1 - \alpha$
$P(-z(\alpha/2) < \sqrt{nI(\hat\theta_n)}\,(\hat\theta_n - \theta) < +z(\alpha/2)) \to 1 - \alpha$
Note (!): $I(\hat\theta_n)$ is substituted for $I(\theta)$.
Re-express the interval for $Z_n$ as an interval for $\theta$:
$P\left(\hat\theta_n - z(\alpha/2)\frac{1}{\sqrt{nI(\hat\theta_n)}} < \theta < \hat\theta_n + z(\alpha/2)\frac{1}{\sqrt{nI(\hat\theta_n)}}\right) \to 1 - \alpha$
The interval given by $\left[\hat\theta_n \pm z(\alpha/2)\frac{1}{\sqrt{nI(\hat\theta_n)}}\right]$ is the $100(1-\alpha)\%$ (large-sample) confidence interval for $\theta$.
Large-Sample Confidence Intervals

Example 8.5.B. Poisson Distribution
$X_1, \ldots, X_n$ i.i.d. Poisson($\lambda$), $f(x \mid \lambda) = \frac{\lambda^x}{x!}e^{-\lambda}$
$\ell(\lambda) = \sum_{i=1}^n [x_i \ln(\lambda) - \lambda - \ln(x_i!)]$
The MLE $\hat\lambda$ solves $\frac{d\ell(\lambda)}{d\lambda} = \sum_{i=1}^n \left[\frac{x_i}{\lambda} - 1\right] = 0$, so $\hat\lambda = \bar X$.
$I(\lambda) = E\left[-\frac{d^2}{d\lambda^2}\log f(X \mid \lambda)\right] = E\left[\frac{X}{\lambda^2}\right] = \frac{1}{\lambda}$
$Z_n = \sqrt{nI(\hat\lambda)}\,(\hat\lambda - \lambda) = \frac{\hat\lambda - \lambda}{\sqrt{\hat\lambda/n}} \xrightarrow{L} N(0, 1)$
Large-Sample Confidence Interval for $\lambda$: the approximate $100(1-\alpha)\%$ confidence interval for $\lambda$ is
$[\hat\lambda \pm z(\alpha/2)\,\hat\sigma_{\hat\lambda}] = \left[\bar X \pm z(\alpha/2)\sqrt{\frac{\bar X}{n}}\right]$
Confidence Interval for Poisson Parameter

Deaths by Horse Kick in the Prussian Army (Bortkiewicz, 1898)
Annual counts of fatalities in 10 corps of Prussian cavalry over a period of 20 years: $n = 200$ corps-years worth of data.
(The table of observed annual fatality counts is not recoverable from this transcription.)
Model $X_1, \ldots, X_{200}$ as i.i.d. Poisson($\lambda$):
$\hat\lambda_{MLE} = \bar X = \frac{122}{200} = 0.61$
$\hat\sigma_{\hat\lambda_{MLE}} = \sqrt{\hat\lambda_{MLE}/n} = 0.0552$
For a 95% confidence interval, $z(\alpha/2) = 1.96$, giving
$0.61 \pm (1.96)(0.0552) = [0.5018, 0.7182]$
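The horse-kick interval can be reproduced in a few lines from the numbers on the slide (122 total deaths over $n = 200$ corps-years):

```python
from math import sqrt

deaths, n = 122, 200
lam_hat = deaths / n                 # MLE = sample mean = 0.61
se = sqrt(lam_hat / n)               # 1/sqrt(n * I(lam_hat)), approx. 0.0552
lo = lam_hat - 1.96 * se
hi = lam_hat + 1.96 * se
print(round(lo, 4), round(hi, 4))    # the 95% large-sample interval
```

This recovers the interval $[0.5018, 0.7182]$ reported on the slide.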
Confidence Interval for Multinomial Parameter

Definition: Multinomial Distribution
$W_1, \ldots, W_n$ are i.i.d. Multinomial$(1, probs = (p_1, \ldots, p_m))$ random variables.
The sample space of each $W_i$ is $\mathcal{W} = \{1, 2, \ldots, m\}$, a set of $m$ distinct outcomes.
$P(W_i = k) = p_k$, $k = 1, 2, \ldots, m$.
Define count statistics from the multinomial sample:
$X_k = \sum_{i=1}^n 1(W_i = k)$ (the sum of indicators of outcome $k$), $k = 1, \ldots, m$.
$X = (X_1, \ldots, X_m)$ follows a multinomial distribution:
$f(x_1, \ldots, x_m \mid p_1, \ldots, p_m) = \frac{n!}{\prod_{j=1}^m x_j!}\prod_{j=1}^m p_j^{x_j}$
where $(p_1, \ldots, p_m)$ is the vector of cell probabilities with $\sum_{i=1}^m p_i = 1$ and $n = \sum_{j=1}^m x_j$ is the total count.
Note: for $m = 2$, the $W_i$ are Bernoulli($p_1$), $X_1$ is Binomial($n, p_1$), and $X_2 = n - X_1$.
MLEs of Multinomial Parameter

Maximum Likelihood Estimation for the Multinomial
Log-likelihood function of the counts:
$lik(p_1, \ldots, p_m) = \log[f(x_1, \ldots, x_m \mid p_1, \ldots, p_m)]$
$= \log(n!) - \sum_{j=1}^m \log(x_j!) + \sum_{j=1}^m x_j \log(p_j)$
Note: the log-likelihood function of the multinomial sample $w_1, \ldots, w_n$ is
$lik^*(p_1, \ldots, p_m) = \log[f(w_1, \ldots, w_n \mid p_1, \ldots, p_m)] = \sum_{i=1}^n \sum_{j=1}^m \log(p_j)\,1(W_i = j) = \sum_{j=1}^m x_j \log(p_j)$
The Maximum Likelihood Estimate (MLE) of $(p_1, \ldots, p_m)$ maximizes $lik(p_1, \ldots, p_m)$ (with $x_1, \ldots, x_m$ fixed!).
The maximum is achieved where the differential is zero, subject to the constraint $\sum_{j=1}^m p_j = 1$; apply the method of Lagrange multipliers.
Solution: $\hat p_j = x_j/n$, $j = 1, \ldots, m$.
Note: if any $x_j = 0$, then $\hat p_j = 0$ is obtained as a limit.
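The closed form $\hat p_j = x_j/n$ can be checked numerically: for hypothetical counts (an added illustration), the log-likelihood at $\hat p$ is at least as large as at other points on the probability simplex:

```python
from math import log

x = [30, 50, 20]                  # hypothetical counts, n = 100
n = sum(x)
p_hat = [xj / n for xj in x]      # MLE: observed proportions

def loglik(p):
    # multinomial log-likelihood with the constant terms dropped
    return sum(xj * log(pj) for xj, pj in zip(x, p))

# compare the MLE against a few other points on the simplex
others = [[0.25, 0.55, 0.20], [0.35, 0.45, 0.20], [1 / 3, 1 / 3, 1 / 3]]
print(all(loglik(p_hat) >= loglik(q) for q in others))
```

This is only a spot check at a few alternatives, not a proof; the Lagrange-multiplier argument on the slide establishes the maximum over the whole simplex.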
MLEs of Multinomial Parameter

Example A: Hardy-Weinberg Equilibrium
Equilibrium frequency of genotypes AA, Aa, and aa, with $P(a) = \theta$ and $P(A) = 1 - \theta$.
Equilibrium probabilities of the genotypes: $(1-\theta)^2$, $2\theta(1-\theta)$, and $\theta^2$.
Multinomial Data: $(X_1, X_2, X_3)$, the counts of AA, Aa, and aa in a sample of size $n$.
Sample Data:
Genotype: AA, Aa, aa; Total
Count: $X_1$, $X_2$, $X_3$; $n$
(The row of observed frequencies is not recoverable from this transcription.)
Hardy-Weinberg Equilibrium

Maximum Likelihood Estimation of $\theta$
$(X_1, X_2, X_3) \sim$ Multinomial$(n, p = ((1-\theta)^2,\, 2\theta(1-\theta),\, \theta^2))$
Log-likelihood for $\theta$:
$\ell(\theta) = \log(f(x_1, x_2, x_3 \mid p_1(\theta), p_2(\theta), p_3(\theta)))$
$= \log\left(\frac{n!}{x_1!\,x_2!\,x_3!}\, p_1(\theta)^{x_1} p_2(\theta)^{x_2} p_3(\theta)^{x_3}\right)$
$= x_1\log((1-\theta)^2) + x_2\log(2\theta(1-\theta)) + x_3\log(\theta^2) + (\text{non-}\theta\text{ terms})$
$= (2x_1 + x_2)\log(1-\theta) + (2x_3 + x_2)\log(\theta) + (\text{non-}\theta\text{ terms})$
First differential of the log-likelihood:
$\ell'(\theta) = -\frac{2x_1 + x_2}{1-\theta} + \frac{2x_3 + x_2}{\theta}$
Setting $\ell'(\theta) = 0$:
$\hat\theta = \frac{2x_3 + x_2}{(2x_1 + x_2) + (2x_3 + x_2)} = \frac{2x_3 + x_2}{2n}$
Hardy-Weinberg Equilibrium (continued)

Asymptotic variance of the MLE $\hat\theta$:
$Var(\hat\theta) \approx \frac{1}{E[-\ell''(\theta)]}$
Second differential of the log-likelihood:
$\ell''(\theta) = \frac{d}{d\theta}\left[-\frac{2x_1 + x_2}{1-\theta} + \frac{2x_3 + x_2}{\theta}\right] = -\frac{2x_1 + x_2}{(1-\theta)^2} - \frac{2x_3 + x_2}{\theta^2}$
Each of the $X_i$ is Binomial$(n, p_i(\theta))$, so
$E[X_1] = np_1(\theta) = n(1-\theta)^2$
$E[X_2] = np_2(\theta) = 2n\theta(1-\theta)$
$E[X_3] = np_3(\theta) = n\theta^2$
Hence $E[2X_1 + X_2] = 2n(1-\theta)$ and $E[2X_3 + X_2] = 2n\theta$, giving
$E[-\ell''(\theta)] = \frac{2n(1-\theta)}{(1-\theta)^2} + \frac{2n\theta}{\theta^2} = \frac{2n}{1-\theta} + \frac{2n}{\theta} = \frac{2n}{\theta(1-\theta)}$
$\hat\sigma_{\hat\theta} = \sqrt{\frac{\hat\theta(1-\hat\theta)}{2n}} = \sqrt{\frac{0.4247(1 - 0.4247)}{2n}}$ for the observed data.
Hardy-Weinberg Model

Approximate 95% Confidence Interval for $\theta$
Interval: $\hat\theta \pm z(\alpha/2)\,\hat\sigma_{\hat\theta}$, with
$\hat\theta = 0.4247$, $\hat\sigma_{\hat\theta} = 0.0109$, $z(\alpha/2) = 1.96$ (with $\alpha = 0.05$)
Interval: $0.4247 \pm 1.96 \times (0.0109) = [0.4033, 0.4461]$
Note (!!): In a bootstrap simulation in R of $\hat\theta$, the RMSE of $\hat\theta$ is virtually equal to $\hat\sigma_{\hat\theta}$.
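The closed forms above translate directly to code. The sketch below (an added illustration; the genotype counts are hypothetical, not the lecture's data) computes $\hat\theta = (2x_3 + x_2)/(2n)$, its standard error $\sqrt{\hat\theta(1-\hat\theta)/(2n)}$, and the approximate 95% interval:

```python
from math import sqrt

# Hypothetical genotype counts for AA, Aa, aa (illustrative only)
x1, x2, x3 = 190, 500, 310
n = x1 + x2 + x3
theta_hat = (2 * x3 + x2) / (2 * n)               # (2*310 + 500) / 2000 = 0.56
se = sqrt(theta_hat * (1 - theta_hat) / (2 * n))  # 1 / sqrt(E[-l''(theta_hat)])
ci = (theta_hat - 1.96 * se, theta_hat + 1.96 * se)
print(theta_hat, se, ci)
```

Substituting the lecture's estimate $\hat\theta = 0.4247$ and its $n$ would reproduce the slide's interval $[0.4033, 0.4461]$ in the same way.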
MIT OpenCourseWare
Statistics for Applications, Spring 2015
For information about citing these materials or our Terms of Use, visit:
MIT Spring 2015
Assessing Goodness Of Fit MIT 8.443 Dr. Kempthorne Spring 205 Outline 2 Poisson Distribution Counts of events that occur at constant rate Counts in disjoint intervals/regions are independent If intervals/regions
More informationf(x θ)dx with respect to θ. Assuming certain smoothness conditions concern differentiating under the integral the integral sign, we first obtain
0.1. INTRODUCTION 1 0.1 Introduction R. A. Fisher, a pioneer in the development of mathematical statistics, introduced a measure of the amount of information contained in an observaton from f(x θ). Fisher
More informationSTAT 135 Lab 2 Confidence Intervals, MLE and the Delta Method
STAT 135 Lab 2 Confidence Intervals, MLE and the Delta Method Rebecca Barter February 2, 2015 Confidence Intervals Confidence intervals What is a confidence interval? A confidence interval is calculated
More informationStatistics 3858 : Maximum Likelihood Estimators
Statistics 3858 : Maximum Likelihood Estimators 1 Method of Maximum Likelihood In this method we construct the so called likelihood function, that is L(θ) = L(θ; X 1, X 2,..., X n ) = f n (X 1, X 2,...,
More informationMethods of Estimation
Methods of Estimation MIT 18.655 Dr. Kempthorne Spring 2016 1 Outline Methods of Estimation I 1 Methods of Estimation I 2 X X, X P P = {P θ, θ Θ}. Problem: Finding a function θˆ(x ) which is close to θ.
More informationCentral Limit Theorem ( 5.3)
Central Limit Theorem ( 5.3) Let X 1, X 2,... be a sequence of independent random variables, each having n mean µ and variance σ 2. Then the distribution of the partial sum S n = X i i=1 becomes approximately
More informationMATH4427 Notebook 2 Fall Semester 2017/2018
MATH4427 Notebook 2 Fall Semester 2017/2018 prepared by Professor Jenny Baglivo c Copyright 2009-2018 by Jenny A. Baglivo. All Rights Reserved. 2 MATH4427 Notebook 2 3 2.1 Definitions and Examples...................................
More informationSTAT 135 Lab 3 Asymptotic MLE and the Method of Moments
STAT 135 Lab 3 Asymptotic MLE and the Method of Moments Rebecca Barter February 9, 2015 Maximum likelihood estimation (a reminder) Maximum likelihood estimation Suppose that we have a sample, X 1, X 2,...,
More informationPost-exam 2 practice questions 18.05, Spring 2014
Post-exam 2 practice questions 18.05, Spring 2014 Note: This is a set of practice problems for the material that came after exam 2. In preparing for the final you should use the previous review materials,
More informationLecture 10: Generalized likelihood ratio test
Stat 200: Introduction to Statistical Inference Autumn 2018/19 Lecture 10: Generalized likelihood ratio test Lecturer: Art B. Owen October 25 Disclaimer: These notes have not been subjected to the usual
More informationStatistics - Lecture One. Outline. Charlotte Wickham 1. Basic ideas about estimation
Statistics - Lecture One Charlotte Wickham wickham@stat.berkeley.edu http://www.stat.berkeley.edu/~wickham/ Outline 1. Basic ideas about estimation 2. Method of Moments 3. Maximum Likelihood 4. Confidence
More informationLecture 28: Asymptotic confidence sets
Lecture 28: Asymptotic confidence sets 1 α asymptotic confidence sets Similar to testing hypotheses, in many situations it is difficult to find a confidence set with a given confidence coefficient or level
More informationTheory of Maximum Likelihood Estimation. Konstantin Kashin
Gov 2001 Section 5: Theory of Maximum Likelihood Estimation Konstantin Kashin February 28, 2013 Outline Introduction Likelihood Examples of MLE Variance of MLE Asymptotic Properties What is Statistical
More informationSTAT 512 sp 2018 Summary Sheet
STAT 5 sp 08 Summary Sheet Karl B. Gregory Spring 08. Transformations of a random variable Let X be a rv with support X and let g be a function mapping X to Y with inverse mapping g (A = {x X : g(x A}
More informationStatistics Ph.D. Qualifying Exam
Department of Statistics Carnegie Mellon University May 7 2008 Statistics Ph.D. Qualifying Exam You are not expected to solve all five problems. Complete solutions to few problems will be preferred to
More informationMIT Spring 2016
MIT 18.655 Dr. Kempthorne Spring 2016 1 MIT 18.655 Outline 1 2 MIT 18.655 Decision Problem: Basic Components P = {P θ : θ Θ} : parametric model. Θ = {θ}: Parameter space. A{a} : Action space. L(θ, a) :
More informationStatistics 135 Fall 2008 Final Exam
Name: SID: Statistics 135 Fall 2008 Final Exam Show your work. The number of points each question is worth is shown at the beginning of the question. There are 10 problems. 1. [2] The normal equations
More informationCherry Blossom run (1) The credit union Cherry Blossom Run is a 10 mile race that takes place every year in D.C. In 2009 there were participants
18.650 Statistics for Applications Chapter 5: Parametric hypothesis testing 1/37 Cherry Blossom run (1) The credit union Cherry Blossom Run is a 10 mile race that takes place every year in D.C. In 2009
More informationP n. This is called the law of large numbers but it comes in two forms: Strong and Weak.
Large Sample Theory Large Sample Theory is a name given to the search for approximations to the behaviour of statistical procedures which are derived by computing limits as the sample size, n, tends to
More informationMIT Spring 2016
Dr. Kempthorne Spring 2016 1 Outline Building 1 Building 2 Definition Building Let X be a random variable/vector with sample space X R q and probability model P θ. The class of probability models P = {P
More informationGoodness of Fit Goodness of fit - 2 classes
Goodness of Fit Goodness of fit - 2 classes A B 78 22 Do these data correspond reasonably to the proportions 3:1? We previously discussed options for testing p A = 0.75! Exact p-value Exact confidence
More informationsimple if it completely specifies the density of x
3. Hypothesis Testing Pure significance tests Data x = (x 1,..., x n ) from f(x, θ) Hypothesis H 0 : restricts f(x, θ) Are the data consistent with H 0? H 0 is called the null hypothesis simple if it completely
More informationA Very Brief Summary of Statistical Inference, and Examples
A Very Brief Summary of Statistical Inference, and Examples Trinity Term 2009 Prof. Gesine Reinert Our standard situation is that we have data x = x 1, x 2,..., x n, which we view as realisations of random
More information2017 Financial Mathematics Orientation - Statistics
2017 Financial Mathematics Orientation - Statistics Written by Long Wang Edited by Joshua Agterberg August 21, 2018 Contents 1 Preliminaries 5 1.1 Samples and Population............................. 5
More informationChapter 8 - Statistical intervals for a single sample
Chapter 8 - Statistical intervals for a single sample 8-1 Introduction In statistics, no quantity estimated from data is known for certain. All estimated quantities have probability distributions of their
More informationFor iid Y i the stronger conclusion holds; for our heuristics ignore differences between these notions.
Large Sample Theory Study approximate behaviour of ˆθ by studying the function U. Notice U is sum of independent random variables. Theorem: If Y 1, Y 2,... are iid with mean µ then Yi n µ Called law of
More informationBTRY 4090: Spring 2009 Theory of Statistics
BTRY 4090: Spring 2009 Theory of Statistics Guozhang Wang September 25, 2010 1 Review of Probability We begin with a real example of using probability to solve computationally intensive (or infeasible)
More informationInstitute of Actuaries of India
Institute of Actuaries of India Subject CT3 Probability & Mathematical Statistics May 2011 Examinations INDICATIVE SOLUTION Introduction The indicative solution has been written by the Examiners with the
More informationThis does not cover everything on the final. Look at the posted practice problems for other topics.
Class 7: Review Problems for Final Exam 8.5 Spring 7 This does not cover everything on the final. Look at the posted practice problems for other topics. To save time in class: set up, but do not carry
More informationML Testing (Likelihood Ratio Testing) for non-gaussian models
ML Testing (Likelihood Ratio Testing) for non-gaussian models Surya Tokdar ML test in a slightly different form Model X f (x θ), θ Θ. Hypothesist H 0 : θ Θ 0 Good set: B c (x) = {θ : l x (θ) max θ Θ l
More information7.1 Basic Properties of Confidence Intervals
7.1 Basic Properties of Confidence Intervals What s Missing in a Point Just a single estimate What we need: how reliable it is Estimate? No idea how reliable this estimate is some measure of the variability
More informationMISCELLANEOUS TOPICS RELATED TO LIKELIHOOD. Copyright c 2012 (Iowa State University) Statistics / 30
MISCELLANEOUS TOPICS RELATED TO LIKELIHOOD Copyright c 2012 (Iowa State University) Statistics 511 1 / 30 INFORMATION CRITERIA Akaike s Information criterion is given by AIC = 2l(ˆθ) + 2k, where l(ˆθ)
More information2014/2015 Smester II ST5224 Final Exam Solution
014/015 Smester II ST54 Final Exam Solution 1 Suppose that (X 1,, X n ) is a random sample from a distribution with probability density function f(x; θ) = e (x θ) I [θ, ) (x) (i) Show that the family of
More informationUNIVERSITY OF TORONTO SCARBOROUGH Department of Computer and Mathematical Sciences FINAL EXAMINATION, APRIL 2013
UNIVERSITY OF TORONTO SCARBOROUGH Department of Computer and Mathematical Sciences FINAL EXAMINATION, APRIL 2013 STAB57H3 Introduction to Statistics Duration: 3 hours Last Name: First Name: Student number:
More informationGraduate Econometrics I: Maximum Likelihood I
Graduate Econometrics I: Maximum Likelihood I Yves Dominicy Université libre de Bruxelles Solvay Brussels School of Economics and Management ECARES Yves Dominicy Graduate Econometrics I: Maximum Likelihood
More informationSTATS 200: Introduction to Statistical Inference. Lecture 29: Course review
STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout
More information2.6.3 Generalized likelihood ratio tests
26 HYPOTHESIS TESTING 113 263 Generalized likelihood ratio tests When a UMP test does not exist, we usually use a generalized likelihood ratio test to verify H 0 : θ Θ against H 1 : θ Θ\Θ It can be used
More informationLecture 1: Introduction
Principles of Statistics Part II - Michaelmas 208 Lecturer: Quentin Berthet Lecture : Introduction This course is concerned with presenting some of the mathematical principles of statistical theory. One
More informationChapter 4: Constrained estimators and tests in the multiple linear regression model (Part III)
Chapter 4: Constrained estimators and tests in the multiple linear regression model (Part III) Florian Pelgrin HEC September-December 2010 Florian Pelgrin (HEC) Constrained estimators September-December
More informationFall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.
1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n
More informationSTAT 135 Lab 6 Duality of Hypothesis Testing and Confidence Intervals, GLRT, Pearson χ 2 Tests and Q-Q plots. March 8, 2015
STAT 135 Lab 6 Duality of Hypothesis Testing and Confidence Intervals, GLRT, Pearson χ 2 Tests and Q-Q plots March 8, 2015 The duality between CI and hypothesis testing The duality between CI and hypothesis
More informationHypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3
Hypothesis Testing CB: chapter 8; section 0.3 Hypothesis: statement about an unknown population parameter Examples: The average age of males in Sweden is 7. (statement about population mean) The lowest
More informationChapter 3 : Likelihood function and inference
Chapter 3 : Likelihood function and inference 4 Likelihood function and inference The likelihood Information and curvature Sufficiency and ancilarity Maximum likelihood estimation Non-regular models EM
More informationTopic 19 Extensions on the Likelihood Ratio
Topic 19 Extensions on the Likelihood Ratio Two-Sided Tests 1 / 12 Outline Overview Normal Observations Power Analysis 2 / 12 Overview The likelihood ratio test is a popular choice for composite hypothesis
More informationEstimation MLE-Pandemic data MLE-Financial crisis data Evaluating estimators. Estimation. September 24, STAT 151 Class 6 Slide 1
Estimation September 24, 2018 STAT 151 Class 6 Slide 1 Pandemic data Treatment outcome, X, from n = 100 patients in a pandemic: 1 = recovered and 0 = not recovered 1 1 1 0 0 0 1 1 1 0 0 1 0 1 0 0 1 1 1
More informationDA Freedman Notes on the MLE Fall 2003
DA Freedman Notes on the MLE Fall 2003 The object here is to provide a sketch of the theory of the MLE. Rigorous presentations can be found in the references cited below. Calculus. Let f be a smooth, scalar
More informationTUTORIAL 8 SOLUTIONS #
TUTORIAL 8 SOLUTIONS #9.11.21 Suppose that a single observation X is taken from a uniform density on [0,θ], and consider testing H 0 : θ = 1 versus H 1 : θ =2. (a) Find a test that has significance level
More informationLoglikelihood and Confidence Intervals
Stat 504, Lecture 2 1 Loglikelihood and Confidence Intervals The loglikelihood function is defined to be the natural logarithm of the likelihood function, l(θ ; x) = log L(θ ; x). For a variety of reasons,
More informationHT Introduction. P(X i = x i ) = e λ λ x i
MODS STATISTICS Introduction. HT 2012 Simon Myers, Department of Statistics (and The Wellcome Trust Centre for Human Genetics) myers@stats.ox.ac.uk We will be concerned with the mathematical framework
More informationChapters 10. Hypothesis Testing
Chapters 10. Hypothesis Testing Some examples of hypothesis testing 1. Toss a coin 100 times and get 62 heads. Is this coin a fair coin? 2. Is the new treatment on blood pressure more effective than the
More informationSTAT 135 Lab 5 Bootstrapping and Hypothesis Testing
STAT 135 Lab 5 Bootstrapping and Hypothesis Testing Rebecca Barter March 2, 2015 The Bootstrap Bootstrap Suppose that we are interested in estimating a parameter θ from some population with members x 1,...,
More informationUnbiased Estimation. Binomial problem shows general phenomenon. An estimator can be good for some values of θ and bad for others.
Unbiased Estimation Binomial problem shows general phenomenon. An estimator can be good for some values of θ and bad for others. To compare ˆθ and θ, two estimators of θ: Say ˆθ is better than θ if it
More informationChapter 5. Chapter 5 sections
1 / 43 sections Discrete univariate distributions: 5.2 Bernoulli and Binomial distributions Just skim 5.3 Hypergeometric distributions 5.4 Poisson distributions Just skim 5.5 Negative Binomial distributions
More informationMIT Spring 2015
MIT 18.443 Dr. Kempthorne Spring 2015 MIT 18.443 1 Outline 1 MIT 18.443 2 Batches of data: single or multiple x 1, x 2,..., x n y 1, y 2,..., y m w 1, w 2,..., w l etc. Graphical displays Summary statistics:
More informationMath 152. Rumbos Fall Solutions to Assignment #12
Math 52. umbos Fall 2009 Solutions to Assignment #2. Suppose that you observe n iid Bernoulli(p) random variables, denoted by X, X 2,..., X n. Find the LT rejection region for the test of H o : p p o versus
More informationHorse Kick Example: Chi-Squared Test for Goodness of Fit with Unknown Parameters
Math 3080 1. Treibergs Horse Kick Example: Chi-Squared Test for Goodness of Fit with Unknown Parameters Name: Example April 4, 014 This R c program explores a goodness of fit test where the parameter is
More informationChapter 3: Maximum Likelihood Theory
Chapter 3: Maximum Likelihood Theory Florian Pelgrin HEC September-December, 2010 Florian Pelgrin (HEC) Maximum Likelihood Theory September-December, 2010 1 / 40 1 Introduction Example 2 Maximum likelihood
More informationSpring 2012 Math 541B Exam 1
Spring 2012 Math 541B Exam 1 1. A sample of size n is drawn without replacement from an urn containing N balls, m of which are red and N m are black; the balls are otherwise indistinguishable. Let X denote
More informationProbability and Statistics Notes
Probability and Statistics Notes Chapter Five Jesse Crawford Department of Mathematics Tarleton State University Spring 2011 (Tarleton State University) Chapter Five Notes Spring 2011 1 / 37 Outline 1
More information1 One-way analysis of variance
LIST OF FORMULAS (Version from 21. November 2014) STK2120 1 One-way analysis of variance Assume X ij = µ+α i +ɛ ij ; j = 1, 2,..., J i ; i = 1, 2,..., I ; where ɛ ij -s are independent and N(0, σ 2 ) distributed.
More informationIntroduction to Bayesian Methods
Introduction to Bayesian Methods Jessi Cisewski Department of Statistics Yale University Sagan Summer Workshop 2016 Our goal: introduction to Bayesian methods Likelihoods Priors: conjugate priors, non-informative
More informationComposite Hypotheses and Generalized Likelihood Ratio Tests
Composite Hypotheses and Generalized Likelihood Ratio Tests Rebecca Willett, 06 In many real world problems, it is difficult to precisely specify probability distributions. Our models for data may involve
More informationSOLUTION FOR HOMEWORK 4, STAT 4352
SOLUTION FOR HOMEWORK 4, STAT 4352 Welcome to your fourth homework. Here we begin the study of confidence intervals, Errors, etc. Recall that X n := (X 1,...,X n ) denotes the vector of n observations.
More informationMIT Spring 2016
MIT 18.655 Dr. Kempthorne Spring 2016 1 MIT 18.655 Outline 1 2 MIT 18.655 3 Decision Problem: Basic Components P = {P θ : θ Θ} : parametric model. Θ = {θ}: Parameter space. A{a} : Action space. L(θ, a)
More informationMathematical statistics
October 18 th, 2018 Lecture 16: Midterm review Countdown to mid-term exam: 7 days Week 1 Chapter 1: Probability review Week 2 Week 4 Week 7 Chapter 6: Statistics Chapter 7: Point Estimation Chapter 8:
More informationBIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation
BIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation Yujin Chung November 29th, 2016 Fall 2016 Yujin Chung Lec13: MLE Fall 2016 1/24 Previous Parametric tests Mean comparisons (normality assumption)
More informationInference in non-linear time series
Intro LS MLE Other Erik Lindström Centre for Mathematical Sciences Lund University LU/LTH & DTU Intro LS MLE Other General Properties Popular estimatiors Overview Introduction General Properties Estimators
More informationMath 70 Homework 8 Michael Downs. and, given n iid observations, has log-likelihood function: k i n (2) = 1 n
1. (a) The poisson distribution has density function f(k; λ) = λk e λ k! and, given n iid observations, has log-likelihood function: l(λ; k 1,..., k n ) = ln(λ) k i nλ ln(k i!) (1) and score equation:
More informationHypothesis Testing. Testing Hypotheses MIT Dr. Kempthorne. Spring MIT Testing Hypotheses
Testing Hypotheses MIT 18.443 Dr. Kempthorne Spring 2015 1 Outline Hypothesis Testing 1 Hypothesis Testing 2 Hypothesis Testing: Statistical Decision Problem Two coins: Coin 0 and Coin 1 P(Head Coin 0)
More information10. Composite Hypothesis Testing. ECE 830, Spring 2014
10. Composite Hypothesis Testing ECE 830, Spring 2014 1 / 25 In many real world problems, it is difficult to precisely specify probability distributions. Our models for data may involve unknown parameters
More informationRecall that in order to prove Theorem 8.8, we argued that under certain regularity conditions, the following facts are true under H 0 : 1 n
Chapter 9 Hypothesis Testing 9.1 Wald, Rao, and Likelihood Ratio Tests Suppose we wish to test H 0 : θ = θ 0 against H 1 : θ θ 0. The likelihood-based results of Chapter 8 give rise to several possible
More informationLecture 32: Asymptotic confidence sets and likelihoods
Lecture 32: Asymptotic confidence sets and likelihoods Asymptotic criterion In some problems, especially in nonparametric problems, it is difficult to find a reasonable confidence set with a given confidence
More informationBasic Concepts of Inference
Basic Concepts of Inference Corresponds to Chapter 6 of Tamhane and Dunlop Slides prepared by Elizabeth Newton (MIT) with some slides by Jacqueline Telford (Johns Hopkins University) and Roy Welsch (MIT).
More informationStatistics. Statistics
The main aims of statistics 1 1 Choosing a model 2 Estimating its parameter(s) 1 point estimates 2 interval estimates 3 Testing hypotheses Distributions used in statistics: χ 2 n-distribution 2 Let X 1,
More information9 Asymptotic Approximations and Practical Asymptotic Tools
9 Asymptotic Approximations and Practical Asymptotic Tools A point estimator is merely an educated guess about the true value of an unknown parameter. The utility of a point estimate without some idea
More informationStatistics Ph.D. Qualifying Exam: Part I October 18, 2003
Statistics Ph.D. Qualifying Exam: Part I October 18, 2003 Student Name: 1. Answer 8 out of 12 problems. Mark the problems you selected in the following table. 1 2 3 4 5 6 7 8 9 10 11 12 2. Write your answer
More information2.830J / 6.780J / ESD.63J Control of Manufacturing Processes (SMA 6303) Spring 2008
MIT OpenCourseWare http://ocw.mit.edu 2.830J / 6.780J / ESD.63J Control of Processes (SMA 6303) Spring 2008 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
More information15 Discrete Distributions
Lecture Note 6 Special Distributions (Discrete and Continuous) MIT 4.30 Spring 006 Herman Bennett 5 Discrete Distributions We have already seen the binomial distribution and the uniform distribution. 5.
More informationBrandon C. Kelly (Harvard Smithsonian Center for Astrophysics)
Brandon C. Kelly (Harvard Smithsonian Center for Astrophysics) Probability quantifies randomness and uncertainty How do I estimate the normalization and logarithmic slope of a X ray continuum, assuming
More informationSome General Types of Tests
Some General Types of Tests We may not be able to find a UMP or UMPU test in a given situation. In that case, we may use test of some general class of tests that often have good asymptotic properties.
More informationIntroduction to Estimation Methods for Time Series models Lecture 2
Introduction to Estimation Methods for Time Series models Lecture 2 Fulvio Corsi SNS Pisa Fulvio Corsi Introduction to Estimation () Methods for Time Series models Lecture 2 SNS Pisa 1 / 21 Estimators:
More informationSome New Aspects of Dose-Response Models with Applications to Multistage Models Having Parameters on the Boundary
Some New Aspects of Dose-Response Models with Applications to Multistage Models Having Parameters on the Boundary Bimal Sinha Department of Mathematics & Statistics University of Maryland, Baltimore County,
More information1/24/2008. Review of Statistical Inference. C.1 A Sample of Data. C.2 An Econometric Model. C.4 Estimating the Population Variance and Other Moments
/4/008 Review of Statistical Inference Prepared by Vera Tabakova, East Carolina University C. A Sample of Data C. An Econometric Model C.3 Estimating the Mean of a Population C.4 Estimating the Population
More informationConfidence Intervals for Normal Data Spring 2014
Confidence Intervals for Normal Data 18.05 Spring 2014 Agenda Today Review of critical values and quantiles. Computing z, t, χ 2 confidence intervals for normal data. Conceptual view of confidence intervals.
More informationMIT Spring 2016
Exponential Families II MIT 18.655 Dr. Kempthorne Spring 2016 1 Outline Exponential Families II 1 Exponential Families II 2 : Expectation and Variance U (k 1) and V (l 1) are random vectors If A (m k),
More informationExercises and Answers to Chapter 1
Exercises and Answers to Chapter The continuous type of random variable X has the following density function: a x, if < x < a, f (x), otherwise. Answer the following questions. () Find a. () Obtain mean
More informationThe purpose of this section is to derive the asymptotic distribution of the Pearson chi-square statistic. k (n j np j ) 2. np j.
Chapter 9 Pearson s chi-square test 9. Null hypothesis asymptotics Let X, X 2, be independent from a multinomial(, p) distribution, where p is a k-vector with nonnegative entries that sum to one. That
More informationUnbiased Estimation. Binomial problem shows general phenomenon. An estimator can be good for some values of θ and bad for others.
Unbiased Estimation Binomial problem shows general phenomenon. An estimator can be good for some values of θ and bad for others. To compare ˆθ and θ, two estimators of θ: Say ˆθ is better than θ if it
More informationStatistical Inference
Statistical Inference Classical and Bayesian Methods Revision Class for Midterm Exam AMS-UCSC Th Feb 9, 2012 Winter 2012. Session 1 (Revision Class) AMS-132/206 Th Feb 9, 2012 1 / 23 Topics Topics We will
More informationCh. 5 Hypothesis Testing
Ch. 5 Hypothesis Testing The current framework of hypothesis testing is largely due to the work of Neyman and Pearson in the late 1920s, early 30s, complementing Fisher s work on estimation. As in estimation,
More informationStat 5102 Lecture Slides Deck 3. Charles J. Geyer School of Statistics University of Minnesota
Stat 5102 Lecture Slides Deck 3 Charles J. Geyer School of Statistics University of Minnesota 1 Likelihood Inference We have learned one very general method of estimation: method of moments. the Now we
More informationSome Assorted Formulae. Some confidence intervals: σ n. x ± z α/2. x ± t n 1;α/2 n. ˆp(1 ˆp) ˆp ± z α/2 n. χ 2 n 1;1 α/2. n 1;α/2
STA 248 H1S MIDTERM TEST February 26, 2008 SURNAME: SOLUTIONS GIVEN NAME: STUDENT NUMBER: INSTRUCTIONS: Time: 1 hour and 50 minutes Aids allowed: calculator Tables of the standard normal, t and chi-square
More informationNotes on the Multivariate Normal and Related Topics
Version: July 10, 2013 Notes on the Multivariate Normal and Related Topics Let me refresh your memory about the distinctions between population and sample; parameters and statistics; population distributions
More informationSTT 315 Problem Set #3
1. A student is asked to calculate the probability that x = 3.5 when x is chosen from a normal distribution with the following parameters: mean=3, sd=5. To calculate the answer, he uses this command: >
More informationDelta Method. Example : Method of Moments for Exponential Distribution. f(x; λ) = λe λx I(x > 0)
Delta Method Often estimators are functions of other random variables, for example in the method of moments. These functions of random variables can sometimes inherit a normal approximation from the underlying
More informationMathematics Ph.D. Qualifying Examination Stat Probability, January 2018
Mathematics Ph.D. Qualifying Examination Stat 52800 Probability, January 2018 NOTE: Answers all questions completely. Justify every step. Time allowed: 3 hours. 1. Let X 1,..., X n be a random sample from
More information6. MAXIMUM LIKELIHOOD ESTIMATION
6 MAXIMUM LIKELIHOOD ESIMAION [1] Maximum Likelihood Estimator (1) Cases in which θ (unknown parameter) is scalar Notational Clarification: From now on, we denote the true value of θ as θ o hen, view θ
More informationElements of statistics (MATH0487-1)
Elements of statistics (MATH0487-1) Prof. Dr. Dr. K. Van Steen University of Liège, Belgium November 12, 2012 Introduction to Statistics Basic Probability Revisited Sampling Exploratory Data Analysis -
More informationHypothesis testing: theory and methods
Statistical Methods Warsaw School of Economics November 3, 2017 Statistical hypothesis is the name of any conjecture about unknown parameters of a population distribution. The hypothesis should be verifiable
More informationLet us first identify some classes of hypotheses. simple versus simple. H 0 : θ = θ 0 versus H 1 : θ = θ 1. (1) one-sided
Let us first identify some classes of hypotheses. simple versus simple H 0 : θ = θ 0 versus H 1 : θ = θ 1. (1) one-sided H 0 : θ θ 0 versus H 1 : θ > θ 0. (2) two-sided; null on extremes H 0 : θ θ 1 or
More information