Inference in non-linear time series


1 Erik Lindström, Centre for Mathematical Sciences, Lund University (LU/LTH & DTU)

2 Overview: Introduction (general properties, popular estimators); Least squares; Maximum likelihood (MLE asymptotics, two theorems, Fisher information, proofs); Other estimators.

3 General. We are interested in estimating a parameter vector $\theta_0$ from data $X$. Ad hoc or formal estimation? Definition: an estimator is a function of the data, $\hat\theta = T(X)$. Its properties and interpretation follow below.

4 Properties.
Bias: $b = \theta_0 - \mathrm{E}[T(X)]$.
Asymptotic bias: $\lim_{N\to\infty} \theta_0 - \mathrm{E}[T(X)]$.
Consistency: $\hat\theta \xrightarrow{p} \theta_0$, i.e. $P(|\hat\theta - \theta_0| > \varepsilon) \to 0$ as $N \to \infty$.
Strong consistency: $\hat\theta \xrightarrow{a.s.} \theta_0$.
Efficiency: $\mathrm{Var}[T(X)] \geq I^{-1}(\theta_0)$, where $I(\theta_0)$ is the Fisher information matrix.
Asymptotic normality: convergence $N^{\alpha}(\hat\theta - \theta_0) \xrightarrow{d} \mathcal{N}(\mu, \Sigma)$.

5 Popular estimators. How are the estimates computed?
Optimization: least squares (LS), weighted least squares (WLS), prediction error methods (PEM), generalized method of moments (GMM), etc.
Solving equations: estimating functions (EFs), method of moments (MM).
Either: Bayesian methods (MCMC), maximum likelihood, quasi maximum likelihood.

6 Least squares. Observations $y_1, \dots, y_N$; predictors $\sum_{j=1}^J B_j(x)\theta_j$. Vector form: $Y = Z\theta + \varepsilon$, $\varepsilon \sim \mathcal{N}(0, \Omega)$. (Weighted) parameter estimation:
$$\hat\theta_{LS} = \operatorname*{argmin}_{\theta\in\Theta} \sum_{n=1}^N \lambda_n \Big( y_n - \sum_j B_j(x_n)\theta_j \Big)^2 \quad (1)$$
$$= \operatorname*{argmin}_{\theta\in\Theta} (Y - Z\theta)^T W (Y - Z\theta) \quad (2)$$
Generalization:
$$\hat\theta_{PLS} = \operatorname*{argmin}_{\theta\in\Theta} (Y - Z\theta)^T W (Y - Z\theta) + (\theta - \theta_0)^T D (\theta - \theta_0) \quad (3)$$

8 Estimate:
$$\hat\theta = (Z^T W Z)^{-1} Z^T W Y \quad (4)$$
Bias? No, not if the correct model is used:
$$\mathrm{E}[\hat\theta] = \mathrm{E}\big[ (Z^T W Z)^{-1} Z^T W (Z\theta + \varepsilon) \big] \quad (5)$$
$$= \theta + (Z^T W Z)^{-1} Z^T W \,\mathrm{E}[\varepsilon] = \theta \quad (6)$$
Variance:
$$\mathrm{Var}[\hat\theta] = (Z^T W Z)^{-1} (Z^T W \Omega W Z) (Z^T W Z)^{-1} \quad (7)$$
This simplifies if $W = \Omega^{-1}$ and $\Omega = \sigma^2 I$. We then get
$$\mathrm{Var}[\hat\theta] = \sigma^2 (Z^T Z)^{-1}. \quad (8)$$
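As a numerical illustration of equations (4) and (7), a minimal sketch assuming simulated data and NumPy; the matrix names follow the slides, everything else (dimensions, parameter values) is illustrative.

```python
import numpy as np

# Simulate a linear model Y = Z theta + eps, then compute eq. (4) and eq. (7).
rng = np.random.default_rng(0)
N, J = 200, 3
Z = rng.normal(size=(N, J))                     # design matrix (B_j(x_n) stacked)
theta_true = np.array([1.0, -0.5, 2.0])
sigma2 = 0.25
Y = Z @ theta_true + rng.normal(scale=np.sqrt(sigma2), size=N)

W = np.eye(N)                                   # ordinary LS; use W = inv(Omega) for WLS
ZtWZ = Z.T @ W @ Z
theta_hat = np.linalg.solve(ZtWZ, Z.T @ W @ Y)  # eq. (4)

Omega = sigma2 * np.eye(N)                      # true noise covariance (known here)
bread = np.linalg.inv(ZtWZ)
var_hat = bread @ (Z.T @ W @ Omega @ W @ Z) @ bread  # sandwich form, eq. (7)
print(theta_hat)
print(np.sqrt(np.diag(var_hat)))                # standard errors
```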

12 F-tests. We estimate $\hat\sigma^2 = (Y - Z\hat\theta)^T (Y - Z\hat\theta) / (N - J)$. Define $Q(\hat\theta) = (Y - Z\hat\theta)^T (Y - Z\hat\theta) = Y^T (I - P)^T (I - P) Y$, where $P$ is the projection matrix. We approximate $Q$ by $\mathrm{E}[Q] = \mathrm{tr}\big( (I - P)^T (I - P)\, \mathrm{Cov}[Y, Y] \big)$. It holds for the standard model that $\mathrm{Cov}[Y, Y] = \sigma^2 I$ and $(I - P)^T (I - P) = (I - P)$. It then follows that
$$\mathrm{tr}\big( (I - P)^T (I - P)\, \mathrm{Cov}[Y, Y] \big) = \sigma^2\, \mathrm{tr}(I - P) \quad (9)$$
$$= \sigma^2\, \mathrm{tr}(I) - \sigma^2\, \mathrm{tr}(P) \quad (10)$$
$$= \sigma^2 N - \sigma^2\, \mathrm{tr}\big( Z (Z^T Z)^{-1} Z^T \big) \quad (11)$$
$$= \sigma^2 N - \sigma^2\, \mathrm{tr}\big( (Z^T Z)^{-1} (Z^T Z) \big) = \sigma^2 (N - J) \quad (12)$$
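A quick numerical check of the identity $\mathrm{tr}(I - P) = N - J$ used in (9)-(12), assuming NumPy and an arbitrary design matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
N, J = 50, 4
Z = rng.normal(size=(N, J))
P = Z @ np.linalg.inv(Z.T @ Z) @ Z.T            # projection onto the columns of Z

print(np.trace(np.eye(N) - P))                  # ~= N - J = 46
# (I - P) is symmetric and idempotent, so (I - P)'(I - P) = (I - P):
M = np.eye(N) - P
print(np.allclose(M @ M, M))                    # True
```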

13 What about the penalized asymptotics? What if the estimate is given by
$$\hat\theta_{PLS} = \operatorname*{argmin}_{\theta\in\Theta} (Y - Z\theta)^T W (Y - Z\theta) + (\theta - \theta_0)^T D (\theta - \theta_0)? \quad (13)$$
Derivation on the blackboard. Another popular form is the adaptive LASSO,
$$\hat\theta_{\text{Adaptive LASSO}} = \operatorname*{argmin}_{\theta\in\Theta} (Y - Z\theta)^T W (Y - Z\theta) + \| D(\theta - \theta_0) \|_1. \quad (14)$$
A closed-form sketch for (13) follows.
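The quadratic penalty in (13) admits a closed-form minimizer, obtained by setting the gradient to zero; a sketch assuming NumPy, where the choices of D and theta0 are illustrative (ridge regression is the special case $W = I$, $\theta_0 = 0$, $D = \lambda I$).

```python
import numpy as np

def pls_estimate(Z, Y, W, D, theta0):
    """Minimize (Y - Z t)' W (Y - Z t) + (t - theta0)' D (t - theta0).

    The zero-gradient condition is (Z'WZ + D) t = Z'WY + D theta0.
    """
    A = Z.T @ W @ Z + D
    b = Z.T @ W @ Y + D @ theta0
    return np.linalg.solve(A, b)

rng = np.random.default_rng(2)
N, J = 100, 5
Z = rng.normal(size=(N, J))
Y = Z @ np.ones(J) + rng.normal(size=N)
# Ridge example: W = I, theta0 = 0, D = 0.1 * I
print(pls_estimate(Z, Y, np.eye(N), 0.1 * np.eye(J), np.zeros(J)))
```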

15 Maximum likelihood estimators. Defined as
$$\hat\theta_{MLE} = \operatorname*{argmax}_{\theta\in\Theta} p_\theta(X_1, \dots, X_N).$$
The MLE uses all information in the data and has nice properties:
Ia. Consistency: $\hat\theta \xrightarrow{p} \theta_0$.
Ib. Asymptotic normality: we have under general conditions that $\sqrt{N}(\hat\theta - \theta_0) \xrightarrow{d} \mathcal{N}(0, I^{-1}(\theta_0))$.
II. Efficiency: $\mathrm{Var}[T(X)] = I^{-1}(\theta_0)$.
III. Invariance.
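A minimal numerical MLE sketch, assuming SciPy and an iid Gaussian model chosen purely for illustration (the lecture's setting is more general):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
x = rng.normal(loc=1.5, scale=2.0, size=500)

def neg_log_lik(theta):
    mu, log_sigma = theta                  # log-parametrize sigma to keep it positive
    s2 = np.exp(2 * log_sigma)
    return 0.5 * np.sum(np.log(2 * np.pi * s2) + (x - mu) ** 2 / s2)

res = minimize(neg_log_lik, x0=np.zeros(2), method="BFGS")
print(res.x[0], np.exp(res.x[1]))          # close to mu = 1.5, sigma = 2.0
```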

16 Two useful theorems. Assume $X_i$ iid with $\mathrm{E}[X_i] = \mu$ and $\mathrm{Var}[X_i] = \sigma^2$.
Law of Large Numbers:
$$\frac{1}{N} \sum_{i=1}^N X_i \xrightarrow{p/a.s.} \mu.$$
Central Limit Theorem:
$$\sqrt{N}\, \frac{\frac{1}{N} \sum_{i=1}^N X_i - \mu}{\sigma} \xrightarrow{d} Z,$$
where $Z \sim \mathcal{N}(0, 1)$. These theorems can be generalized further; a small simulation follows.
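A small simulation illustrating both theorems, assuming NumPy; the exponential distribution (with $\mu = \sigma = 1$) is an arbitrary choice:

```python
import numpy as np

rng = np.random.default_rng(4)
N, reps = 10_000, 2_000
X = rng.exponential(scale=1.0, size=(reps, N))  # mu = 1, sigma = 1

means = X.mean(axis=1)
print(means.mean())                             # LLN: concentrates near mu = 1
z = np.sqrt(N) * (means - 1.0)                  # sigma = 1, so no further scaling
print(z.mean(), z.std())                        # CLT: approximately N(0, 1)
```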

17 Fisher information. The Fisher information matrix is defined as
$$I(\theta) = \mathrm{Var}\Big[ \frac{\partial}{\partial\theta} \log p_\theta(X) \Big],$$
which is equivalent to
$$\mathrm{E}\Big[ \Big( \frac{\partial}{\partial\theta} \log p_\theta(X) \Big)^2 \Big] \quad \text{and} \quad -\mathrm{E}\Big[ \frac{\partial^2}{\partial\theta^2} \log p_\theta(X) \Big].$$
Note that $\mathrm{E}\big[ \frac{\partial}{\partial\theta} \log p_\theta(X) \big] = 0$.
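A worked instance of the definition, assuming the iid Gaussian location model $X_1, \dots, X_N \sim \mathcal{N}(\mu, \sigma^2)$ with $\sigma^2$ known:

```latex
\[
\frac{\partial}{\partial\mu}\log p_\mu(X)
  = \frac{1}{\sigma^2}\sum_{i=1}^{N}(X_i-\mu),
\qquad
\mathrm{E}\!\left[\frac{\partial}{\partial\mu}\log p_\mu(X)\right] = 0,
\]
\[
I(\mu)
  = \mathrm{Var}\!\left[\frac{1}{\sigma^2}\sum_{i=1}^{N}(X_i-\mu)\right]
  = \frac{N}{\sigma^2}
  = -\,\mathrm{E}\!\left[\frac{\partial^2}{\partial\mu^2}\log p_\mu(X)\right],
\]
% since the second derivative is the constant -N/sigma^2.
```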

18 Proofs.
Ia. Kullback–Leibler divergence and the law of large numbers.
Ib. Second-order Taylor expansion of the likelihood.
II. Cauchy–Schwarz inequality.
III. Direct calculations.

19 Consistency. The log-likelihood function is defined as
$$l(\theta) = \sum_{n=1}^N \log p_\theta(x_n \mid x_{1:n-1}). \quad (15)$$
The estimate is given by
$$\hat\theta_{MLE} = \operatorname*{argmax}_{\theta\in\Theta} l(\theta) = \operatorname*{argmax}_{\theta\in\Theta} \frac{1}{N} l(\theta) \quad (16)$$
$$= \operatorname*{argmax}_{\theta\in\Theta} \frac{1}{N} \sum_{n=1}^N \log p_\theta(x_n \mid x_{1:n-1}). \quad (17)$$
Now we rewrite this as
$$\frac{1}{N} l(\theta) = \frac{1}{N} l(\theta) \pm \mathrm{E}[l(\theta)] \quad (18)$$
$$= \mathrm{E}[l(\theta)] + \Big( \frac{1}{N} l(\theta) - \mathrm{E}[l(\theta)] \Big). \quad (19)$$

22 Consistency, cont. This decomposition shows that
$$\frac{1}{N} l(\theta) = \underbrace{\mathrm{E}[l(\theta)]}_{\text{expected log-likelihood}} + \underbrace{\Big( \frac{1}{N} l(\theta) - \mathrm{E}[l(\theta)] \Big)}_{\text{random error}}. \quad (20)$$
It then follows that
$$\hat\theta_{MLE} = \operatorname*{argmax}_{\theta\in\Theta} \Big[ \mathrm{E}[l(\theta)] + \text{random error} \Big] \approx \operatorname*{argmax}_{\theta\in\Theta} \mathrm{E}[l(\theta)]. \quad (21)$$

23 First we look at the random error, which is written so that we can apply the Law of Large Numbers. If
$$\lim_{N\to\infty} \Big| \frac{1}{N} l(\theta) - \mathrm{E}[l(\theta)] \Big| = 0 \quad (22)$$
uniformly over $\Theta$, then it follows that
$$\lim_{N\to\infty} \hat\theta_{MLE} = \operatorname*{argmax}_{\theta\in\Theta} \mathrm{E}[l(\theta)]. \quad (23)$$

24 Finally, we denote any feasible density by $q_\theta \in Q_\theta$ and assume that the true density $p_{\theta_0} \in Q_\theta$. The difference in expected log-likelihood between any other density and the true density is given by
$$\mathrm{E}[\log q_\theta(X)] - \mathrm{E}[\log p_{\theta_0}(X)] = \int \big( \log q_\theta(x) - \log p_{\theta_0}(x) \big)\, p_{\theta_0}(x)\, dx \quad (24)$$
$$= \int \log\Big( \frac{q_\theta(x)}{p_{\theta_0}(x)} \Big)\, p_{\theta_0}(x)\, dx. \quad (25)$$
Log is a concave function, so we know from Jensen's inequality that $\mathrm{E}[\log \xi] \leq \log(\mathrm{E}[\xi])$.

27 That means that
$$\int \log\Big( \frac{q_\theta(x)}{p_{\theta_0}(x)} \Big)\, p_{\theta_0}(x)\, dx \leq \log \int \frac{q_\theta(x)}{p_{\theta_0}(x)}\, p_{\theta_0}(x)\, dx = \log \int q_\theta(x)\, dx = 0. \quad (26)$$
We only get equality when $q_\theta(x) = p_{\theta_0}(x)$ for all values of $x$. The conclusion is that
$$\theta_0 = \operatorname*{argmax}_{\theta\in\Theta} \mathrm{E}[l(\theta)] \quad (27)$$
and hence that
$$\lim_{N\to\infty} \hat\theta_{MLE} = \theta_0 \quad (28)$$
as the random error vanishes when $N \to \infty$.

29 Proof Ib. Rewrite $L(X) = \exp\big( \log p_\theta(X_1) + \sum_{n=2}^N \log p_\theta(X_n \mid X_{1:n-1}) \big)$. Take a second-order Taylor expansion around $\theta_0$ and maximize over $(\hat\theta - \theta_0)$; consistency and asymptotic normality follow.

30 Proof I, extended. Write $l_N(\theta) = \log L(X \mid \theta)$. This is a sum consisting of $N$ terms (think LLN and CLT!). Approximate $l_N(\theta)$ using a second-order Taylor expansion around $\theta_0$. Thus
$$l_N(\theta) = l_N(\theta_0) + \partial_\theta l_N(\theta_0)(\theta - \theta_0) + \frac{1}{2} \partial^2_\theta l_N(\theta_0)(\theta - \theta_0)^2 + R.$$
Ignore the remainder term, multiply $l_N$ by $1/N$, and maximize w.r.t. $\theta$ to obtain $\hat\theta$. We obtain
$$\frac{1}{N} \partial_\theta l_N(\theta_0) + \frac{1}{N} \partial^2_\theta l_N(\theta_0)(\hat\theta - \theta_0) := 0.$$

31 Proof I, extended. We have from the LLN that
$$-\frac{1}{N} \partial^2_\theta l_N(\theta_0) \xrightarrow{p} I(\theta_0),$$
and from the CLT that
$$\frac{1}{\sqrt{N}} \partial_\theta l_N(\theta_0) \xrightarrow{d} Z, \quad \text{where } Z \sim \mathcal{N}(0, I(\theta_0)).$$
Rearranging gives $\sqrt{N}(\hat\theta - \theta_0) \xrightarrow{d} \mathcal{N}(0, I^{-1}(\theta_0))$.

32 Proof II. Denote the score function by $U = \partial_\theta \log p_\theta(X)$ and another estimator by $T$. Assume that $\mathrm{E}[T] = \theta_0$ (to rule out anomalies). Calculate $\mathrm{Cov}(U, T)$:
$$\mathrm{Cov}(U, T) = \mathrm{E}[UT] - \mathrm{E}[U]\,\mathrm{E}[T] = \mathrm{E}[UT] - 0 \cdot \theta_0,$$
$$\mathrm{E}[UT] = \int T(x)\, \partial_\theta \log p_{\theta_0}(x)\, p_{\theta_0}(x)\, dx = \partial_\theta \int T(x)\, p_{\theta_0}(x)\, dx = \partial_\theta\, \theta = 1.$$
Cauchy–Schwarz states that $\mathrm{Cov}(U, T)^2 \leq \mathrm{Var}[U]\, \mathrm{Var}[T]$. Thus
$$\mathrm{Var}[T] \geq \mathrm{Var}\big[ \partial_\theta \log p_\theta(X) \big]^{-1},$$
i.e. $\mathrm{Var}[T] \geq \mathrm{Var}[\hat\theta_{MLE}]$!

33 Proof III. Original problem:
$$\hat\theta_{MLE} = \operatorname*{argmax}_{\theta\in\Theta} p_\theta(X_1, \dots, X_N).$$
Define $Y = g(X)$. Calculations... OK.

34 Other estimators. We can derive the asymptotic distribution using the same arguments for any M-estimator, i.e. any estimator taking the estimate as the value that maximizes/minimizes a function of the data. Similar arguments can also be used for Z-estimators, i.e. estimators taking the estimate as the value that solves a system of equations depending on the data.

35 M-estimators. Take any loss function $J(\theta)$ and define
$$\hat\theta = \operatorname*{argmax}_{\theta\in\Theta} J(\theta) \quad (29)$$
(or argmin; e.g. PEM minimizes $J(\theta) = \sum_{i=1}^N \varepsilon_i(\theta)^2$). The corresponding first-order condition is
$$\frac{1}{N} \partial_\theta J(\theta_0) + \frac{1}{N} \partial^2_\theta J(\theta_0)(\hat\theta - \theta_0) := 0.$$
Assume that $\mathrm{E}\big[ \frac{1}{\sqrt{N}} \partial_\theta J(\theta_0) \big] = 0$ (why?), $\mathrm{Var}\big[ \frac{1}{\sqrt{N}} \partial_\theta J(\theta_0) \big] = W$ and $\mathrm{E}\big[ \frac{1}{N} \partial^2_\theta J(\theta_0) \big] = V$. Applying the limit theorems as in the maximum likelihood setup gives the asymptotics. Result:
$$\sqrt{N}(\hat\theta - \theta_0) \xrightarrow{d} \mathcal{N}\big( 0, V^{-1} W (V^{-1})^T \big).$$
A numerical sketch of the sandwich result follows.
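A sketch of the sandwich covariance $V^{-1} W (V^{-1})^T$ estimated from data for a scalar least-squares loss, assuming NumPy; the heavy-tailed noise and all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
N = 500
x = rng.normal(size=N)
y = 2.0 * x + rng.standard_t(df=5, size=N)      # non-Gaussian noise

theta_hat = (x @ y) / (x @ x)                   # minimizer of J(t) = sum (y - t x)^2

score = -2 * x * (y - theta_hat * x)            # per-observation dJ_i/dt at theta_hat
W = np.mean(score ** 2)                         # variance of the normalized score
V = np.mean(2 * x ** 2)                         # mean per-observation second derivative
var_sandwich = W / (V ** 2 * N)                 # V^{-1} W V^{-1} / N in the scalar case
print(theta_hat, np.sqrt(var_sandwich))         # estimate and its standard error
```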

36 Z-estimators. Take any function $G(\theta)$ and define $\hat\theta$ as the solution of
$$G(\hat\theta) = 0. \quad (30)$$
E.g. estimating functions: $G(\theta) = \sum_i g_i(\theta)(X_i - \mathrm{E}[X_i])$. Compute a first-order Taylor expansion around $\theta_0$ and solve w.r.t. $\theta$. Thus
$$G(\theta_0) + \partial_\theta G(\theta_0)(\hat\theta - \theta_0) := 0.$$
Multiply by $1/N$ and apply the limit theorems. Let $\mathrm{E}\big[ \frac{1}{\sqrt{N}} G(\theta_0) \big] = 0$ (why?), $\mathrm{Var}\big[ \frac{1}{\sqrt{N}} G(\theta_0) \big] = W$ and $\mathrm{E}\big[ \frac{1}{N} \partial_\theta G(\theta_0) \big] = V$. Then
$$\sqrt{N}(\hat\theta - \theta_0) \xrightarrow{d} \mathcal{N}\big( 0, V^{-1} W (V^{-1})^T \big).$$

37 The likelihood framework can be used to test the fit of models: confidence intervals and likelihood ratio tests.

38 Confidence intervals. Assume $\sqrt{N}(\hat\theta - \theta_0) \xrightarrow{d} \mathcal{N}(0, \Sigma)$. Then
$$I_{\theta_0} \approx \hat\theta \pm 2 \sqrt{\mathrm{diag}(\Sigma)/N}.$$
Better accuracy: use the profile likelihood! A numerical sketch follows.
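A Wald-type interval from the BFGS inverse-Hessian approximation, assuming SciPy and the illustrative Gaussian model from the MLE sketch above; a profile-likelihood interval would instead trace the likelihood in each parameter.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(6)
x = rng.normal(loc=1.5, scale=2.0, size=400)

def nll(theta):
    mu, log_sigma = theta
    s2 = np.exp(2 * log_sigma)
    return 0.5 * np.sum(np.log(2 * np.pi * s2) + (x - mu) ** 2 / s2)

res = minimize(nll, x0=np.zeros(2), method="BFGS")
cov = res.hess_inv                         # approximate inverse observed information
se = np.sqrt(np.diag(cov))
print(f"mu: {res.x[0]:.3f} +/- {2 * se[0]:.3f}")   # rough 95% interval
```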

39 Likelihood-based tests. Likelihood ratio tests: with $\Theta_R \subset \Theta_F$,
$$\Lambda = \frac{\sup\{ L(X \mid \theta),\, \theta \in \Theta_F \}}{\sup\{ L(X \mid \theta),\, \theta \in \Theta_R \}}, \qquad 2 \log(\Lambda) \sim \chi^2(k),$$
$k$ being the degrees of freedom. LR tests are optimal, cf. Neyman–Pearson, and are closely related to AIC, BIC, etc. A sketch of such a test follows.
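A sketch of a likelihood ratio test, assuming SciPy and an illustrative nested pair of Gaussian models (restricted: $\mu = 0$; full: $\mu$ free; one restriction, so $k = 1$):

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import chi2

rng = np.random.default_rng(7)
x = rng.normal(loc=0.3, scale=1.0, size=200)

def nll(mu):                                # sigma fixed at 1 for simplicity
    return 0.5 * np.sum(np.log(2 * np.pi) + (x - mu) ** 2)

ll_full = -minimize_scalar(nll).fun         # sup over the full space (mu free)
ll_restr = -nll(0.0)                        # restricted model: mu = 0

lr = 2 * (ll_full - ll_restr)               # 2 log(Lambda), ~ chi2(1) under H0
print(lr, chi2.sf(lr, df=1))                # statistic and p-value
```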

40 Likelihood ratios. Compare $L(X \mid \hat\theta)$ to $L(X \mid \theta_0)$. We have that
$$2 \log(\Lambda) = 2 \big( l_N(\hat\theta) - l_N(\theta_0) \big).$$
Taylor expand:
$$l_N(\theta) = l_N(\theta_0) + \partial_\theta l_N(\theta_0)(\theta - \theta_0) + \frac{1}{2} \partial^2_\theta l_N(\theta_0)(\theta - \theta_0)^2 + R.$$
The estimate must solve
$$\partial_\theta l_N(\theta_0) + \partial^2_\theta l_N(\theta_0)(\hat\theta - \theta_0) = 0.$$
Thus
$$2 \log(\Lambda) = 2 \Big( -\frac{1}{2} \partial^2_\theta l_N(\theta_0)(\hat\theta - \theta_0)^2 \Big) = -\partial^2_\theta l_N(\theta_0)(\hat\theta - \theta_0)^2.$$

41 Likelihood ratios, cont. Plugging in $(\hat\theta - \theta_0) = \big( -\partial^2_\theta l_N(\theta_0) \big)^{-1} \partial_\theta l_N(\theta_0)$ gives
$$2 \log(\Lambda) = \partial_\theta l_N(\theta_0)^T \big( -\partial^2_\theta l_N(\theta_0) \big)^{-1} \partial_\theta l_N(\theta_0),$$
or written more nicely,
$$2 \log(\Lambda) = \frac{1}{\sqrt{N}} \partial_\theta l_N(\theta_0)^T \Big( -\frac{1}{N} \partial^2_\theta l_N(\theta_0) \Big)^{-1} \frac{1}{\sqrt{N}} \partial_\theta l_N(\theta_0).$$
This is a quadratic form, distributed as $\chi^2(k)$, $k$ being the degrees of freedom.

42 Feedback. Send feedback, questions and information about typos to
