Chapter 3: Maximum Likelihood Theory


Slide 1 — Chapter 3: Maximum Likelihood Theory. Florian Pelgrin, HEC, September-December 2010.

Slide 2 — Outline

1. Introduction: Example
2. Maximum likelihood estimator: Notation; Likelihood and log-likelihood; Maximum likelihood principle; Equivariance principle
3. Fisher information: Score vector; Fisher information matrix
4. Asymptotic results: Overview; Consistency; Asymptotic efficiency; Large sample distribution; Back to the equivariance

Slide 3 — Introduction: Example

Example 1: Suppose that $Y_1, Y_2, \dots, Y_n$ are i.i.d. random variables with $Y_i \sim \mathcal{B}(p)$:

$$Y_i = \begin{cases} 1 & \text{with probability } p \\ 0 & \text{with probability } 1-p \end{cases}$$

where $p$ is an unknown parameter to estimate. The sample $(y_1, y_2, \dots, y_n)$ is observed. Note the explicit assumption regarding the distribution of $Y_i$. Can we find an estimate (estimator) of $p$?

Slide 4 — Example 1 (cont'd)

The joint distribution of the sample is:

$$P\left(\bigcap_{i=1}^n (Y_i = y_i)\right) = \prod_{i=1}^n P(Y_i = y_i) = \prod_{i=1}^n p^{y_i}(1-p)^{1-y_i} = p^{\sum_i y_i}(1-p)^{n - \sum_i y_i}.$$

The likelihood function is the joint density of the data, except that we treat it as a function of the parameter:

$$L(p \mid y) \equiv L(y; p) = \prod_{i=1}^n p^{y_i}(1-p)^{1-y_i}.$$

...The likely values of the unknown parameter given the realizations of the random variables...

Slide 5 — Example 1 (cont'd)

Suppose that two estimates of $p$, $\hat{p}_{1,n}(y)$ and $\hat{p}_{2,n}(y)$, are such that

$$L_n(y; \hat{p}_{1,n}(y)) > L_n(y; \hat{p}_{2,n}(y)).$$

Then the sample we observe, $y = (y_1, \dots, y_n)$, is more likely to have occurred if $p = \hat{p}_{1,n}(y)$ than if $p = \hat{p}_{2,n}(y)$: $\hat{p}_{1,n}(y)$ is the more plausible value.

Slide 6 — Example 1 (cont'd)

Under suitable regularity conditions, the maximum likelihood estimate (estimator) is defined to be

$$\hat{p} = \operatorname*{argmax}_p L(y; p) = \operatorname*{argmax}_p l(y; p),$$

where $l(y; p) = \log L(y; p)$ is the log-likelihood function. The maximum likelihood estimate is $\hat{p}(y) = \frac{1}{n}\sum_{i=1}^n y_i$; the maximum likelihood estimator is $\hat{p} = \frac{1}{n}\sum_{i=1}^n Y_i$.
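The closed form can be checked numerically. Below is a minimal sketch (not from the slides) that maximizes the Bernoulli log-likelihood with scipy and confirms the maximizer coincides with the sample mean; the simulated data and seed are purely illustrative.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)
y = rng.binomial(1, 0.3, size=1000)  # simulated i.i.d. Bernoulli(p = 0.3) sample

def neg_loglik(p):
    # l(y; p) = sum_i [ y_i log p + (1 - y_i) log(1 - p) ]
    return -(y * np.log(p) + (1 - y) * np.log(1 - p)).sum()

res = minimize_scalar(neg_loglik, bounds=(1e-6, 1 - 1e-6), method="bounded")
print(res.x, y.mean())  # the two values agree: p_hat = (1/n) sum_i y_i
```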

Slide 7 — Introduction: Example

How do we apply the maximum likelihood principle to the multiple linear regression model? What are the main properties of the maximum likelihood estimator? Is it asymptotically unbiased? Is it asymptotically efficient, and under which condition(s)? Is it consistent? What is its asymptotic distribution? What are the main properties of a transformation of the estimator, say $\theta = g(p)$? ...All of these questions are answered in this lecture...

Slide 8 — Maximum likelihood estimator: 2.1 Notation

Consider the multiple linear regression model

$$y_i = x_i' b + u_i,$$

where the error terms are spherical and the observations $(y_i, x_i)$, $i = 1, \dots, n$, are i.i.d. The joint density function is written $f(y_i, x_i) \equiv L_i(y_i, x_i; \theta) \equiv L(y_i, x_i; \theta)$, where $\theta = (b', \sigma^2)'$. By definition,

$$f(y_i, x_i) = f(y_i \mid x_i)\, f(x_i),$$

where $f(y_i \mid x_i)$ is the conditional density of $Y_i \mid X_i = x_i$ and $f(x_i)$ is the marginal density of $X_i$.

Slide 9 — Notation (cont'd)

To get the (log-)likelihood function, one needs some parametric assumptions:

1. One can specify the conditional distribution of $u \mid X$, i.e. the conditional distribution of $Y \mid X$:

$$u \mid X \sim \mathcal{N}(0_{n \times 1}, \sigma^2 I_n), \quad \text{i.e.} \quad Y \mid X \sim \mathcal{N}(Xb, \sigma^2 I_n).$$

2. One can specify the joint (multivariate) distribution of $(Y, X)$ and the marginal (multivariate) distribution of $X$:

$$\begin{pmatrix} Y \\ X \end{pmatrix} \sim \mathcal{N}\left( \begin{pmatrix} \mathbb{E}Y \\ \mathbb{E}X \end{pmatrix}, \begin{pmatrix} \Sigma_{yy} & \Sigma_{yx} \\ \Sigma_{xy} & \Sigma_{xx} \end{pmatrix} \right), \quad \text{with} \quad X \sim \mathcal{N}(\mathbb{E}X, \Sigma_{xx}).$$

Slide 10 — Notation (cont'd)

In the first case (conditional distribution), the estimator of $\theta$ is obtained from the conditional (log-)likelihood function. In the second case (joint distribution), the estimator of $\theta$ is derived from the joint (log-)likelihood function. The joint likelihood function is the product of the conditional likelihood function and the marginal likelihood function (the information provided by the marginal distribution of $X$); the joint log-likelihood function is the sum of the conditional and marginal log-likelihood functions.

Slide 11 — Notation (cont'd)

The conditional and marginal (log-)likelihood functions (and thus the joint and conditional (log-)likelihood functions) are conceptually different, and so are the two corresponding estimators (especially in finite samples). Choosing one or the other depends on the empirical setting. For instance, the distribution of the sample data can be conditionally normal but not jointly normal (e.g., the variables $X$ are arbitrarily determined in some experimental settings). In the sequel, we only consider the conditional maximum likelihood estimator (under the assumption of independent samples).

Slide 12 — 2.2 Likelihood and log-likelihood

Definition. The (conditional) likelihood function is defined to be

$$L_n : \mathcal{Y} \times \Theta \to [0, +\infty), \quad ((y, x), \theta) \mapsto L_n(y \mid x; \theta) = \prod_{i=1}^n L_i(y_i \mid x_i; \theta).$$

Remark: the conditional likelihood function is the joint conditional density of the data, viewed as a function of the unknown parameter $\theta$.

Slide 13 — Likelihood and log-likelihood (cont'd)

Definition. The (conditional) log-likelihood function is defined to be

$$l_n : \mathcal{Y} \times \Theta \to \mathbb{R}, \quad ((y, x), \theta) \mapsto l_n(y \mid x; \theta) = \sum_{i=1}^n \log L_i(y_i \mid x_i; \theta).$$

Slide 14 — Application: the multiple linear regression model

Under the conditional normality assumption,

$$f(y_i \mid x_i; \theta) \equiv L_i(y_i \mid x_i; \theta) = (2\pi\sigma^2)^{-1/2} \exp\left( -\frac{(y_i - x_i' b)^2}{2\sigma^2} \right).$$

Therefore

$$L_n(y \mid x; \theta) = \prod_{i=1}^n L_i(y_i \mid x_i; \theta) = (2\pi\sigma^2)^{-n/2} \exp\left( -\frac{1}{2\sigma^2} \sum_{i=1}^n (y_i - x_i' b)^2 \right)$$

and

$$l_n(y \mid x; \theta) = -\frac{n}{2}\log(2\pi) - \frac{n}{2}\log(\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^n (y_i - x_i' b)^2.$$
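As a quick illustration, here is a minimal sketch of this conditional log-likelihood coded directly from the formula above; numpy is assumed and the function name is illustrative.

```python
import numpy as np

def loglik(theta, y, X):
    """l_n(y | x; theta) for theta = (b, sigma2), with b of length k."""
    b, sigma2 = theta[:-1], theta[-1]
    resid = y - X @ b
    n = len(y)
    # -(n/2) log(2*pi) - (n/2) log(sigma2) - RSS / (2 * sigma2)
    return (-n / 2 * np.log(2 * np.pi)
            - n / 2 * np.log(sigma2)
            - (resid ** 2).sum() / (2 * sigma2))
```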

Slide 15 — 2.3 Maximum likelihood principle

Definition. A maximum likelihood estimator of $\theta \in \Theta \subseteq \mathbb{R}^k$ is a solution to the maximization problem

$$\hat{\theta}_n = \operatorname*{argmax}_{\theta \in \Theta} L_n(\theta) \quad \text{or equivalently} \quad \hat{\theta}_n = \operatorname*{argmax}_{\theta \in \Theta} l_n(\theta).$$

Slide 16 — 2.3 The maximum likelihood principle: using the first-order conditions

Definition. Under suitable regularity conditions, a maximum likelihood estimator of $\theta \in \Theta \subseteq \mathbb{R}^k$ is defined to be a solution of the first-order conditions (likelihood or log-likelihood equations):

$$\frac{\partial L_n}{\partial \theta}(y \mid x; \hat{\theta}_n) = 0_{k \times 1} \quad \text{or} \quad \frac{\partial l_n}{\partial \theta}(y \mid x; \hat{\theta}_n) = 0_{k \times 1}.$$

Remark: regularity conditions are fundamental!

Slide 17 — Application: the multiple linear regression model (cont'd)

Under suitable regularity conditions, the first-order conditions are

$$\frac{\partial l_n}{\partial b}(y \mid x; \hat{\theta}_n) = \frac{1}{\hat{\sigma}_n^2} \sum_{i=1}^n x_i (y_i - x_i' \hat{b}_n) = 0_{k \times 1}$$

$$\frac{\partial l_n}{\partial \sigma^2}(y \mid x; \hat{\theta}_n) = -\frac{n}{2\hat{\sigma}_n^2} + \frac{1}{2\hat{\sigma}_n^4} \sum_{i=1}^n (y_i - x_i' \hat{b}_n)^2 = 0.$$

The maximum likelihood estimate of $\theta$ is

$$\hat{b}_n = \left( \sum_{i=1}^n x_i x_i' \right)^{-1} \sum_{i=1}^n x_i y_i, \qquad \hat{\sigma}_n^2 = \frac{1}{n} \sum_{i=1}^n (y_i - x_i' \hat{b}_n)^2.$$
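A minimal sketch of these closed-form solutions on simulated data, assuming numpy; the design, parameters, and seed are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 500, 3
X = rng.normal(size=(n, k))
b0, sigma0 = np.array([1.0, -2.0, 0.5]), 1.5
y = X @ b0 + rng.normal(scale=sigma0, size=n)

b_hat = np.linalg.solve(X.T @ X, X.T @ y)   # (X'X)^{-1} X'y
sigma2_hat = ((y - X @ b_hat) ** 2).mean()  # (1/n) * RSS: divisor n, not n - k
print(b_hat, sigma2_hat)
```

Note that $\hat{b}_n$ coincides with the OLS estimator, while $\hat{\sigma}_n^2$ divides by $n$ rather than $n - k$: it is biased in finite samples but consistent.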

Slide 18 — Second-order conditions

The Hessian matrix evaluated at $\theta = \hat{\theta}_n$ must be negative definite. The Hessian matrix is

$$H = \begin{pmatrix} -\frac{1}{\sigma^2} \sum_i x_i x_i' & -\frac{1}{\sigma^4} \sum_i x_i (y_i - x_i' b) \\ -\frac{1}{\sigma^4} \sum_i x_i' (y_i - x_i' b) & \frac{n}{2\sigma^4} - \frac{1}{\sigma^6} \sum_i (y_i - x_i' b)^2 \end{pmatrix}$$

and

$$H\big|_{\theta = \hat{\theta}_n} = \begin{pmatrix} -\frac{1}{\hat{\sigma}_n^2} \sum_i x_i x_i' & 0_{k \times 1} \\ 0_{1 \times k} & -\frac{n}{2\hat{\sigma}_n^4} \end{pmatrix}.$$

Given that $X'X$ is positive definite and $\hat{\sigma}_n^2 > 0$, $H|_{\theta = \hat{\theta}_n}$ is negative definite and $\hat{\theta}_n$ is a maximum.

Slide 19 — 2.4 Equivariance principle

Definition. Under suitable regularity conditions, the maximum likelihood estimator of a function $g(\theta)$ of the parameter $\theta$ is $g(\hat{\theta}_n)$, where $\hat{\theta}_n$ is the maximum likelihood estimator of $\theta$.

Slide 20 — Equivariance principle: Example

Suppose $Y_1, \dots, Y_n$ are i.i.d. $\mathcal{E}(\theta)$. The likelihood function is

$$L_n(y; \theta) = \prod_{i=1}^n \theta \exp(-\theta y_i) = \theta^n \exp\left( -\theta \sum_{i=1}^n y_i \right).$$

One gets (the second-order condition holds):

$$\hat{\theta}_n = \frac{1}{\bar{Y}_n}, \qquad \hat{\theta}_n(y) = \frac{1}{\bar{y}_n}.$$

Consider now the probability density function under the scale parametrization:

$$f_{Y_i}(y_i; \lambda) = \frac{1}{\lambda} \exp\left( -\frac{y_i}{\lambda} \right).$$

Slide 21 — Example (continued)

The log-likelihood function is

$$l(y; \lambda) = -n \log(\lambda) - \frac{1}{\lambda} \sum_{i=1}^n y_i.$$

The first-order condition with respect to $\lambda$ is

$$-\frac{n}{\lambda} + \frac{1}{\lambda^2} \sum_{i=1}^n y_i = 0.$$

Since the second-order condition holds, one gets (as is to be expected!):

$$\hat{\lambda}_n = \bar{Y}_n = \frac{1}{\hat{\theta}_n}, \qquad \hat{\lambda}_n(y) = \bar{y}_n = \frac{1}{\hat{\theta}_n(y)}.$$
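Equivariance can be checked numerically by maximizing both parametrizations separately. A minimal sketch, assuming numpy and scipy; the sample, bounds, and seed are illustrative.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(2)
y = rng.exponential(scale=2.0, size=1000)  # E(theta) sample with theta = 0.5

# negative log-likelihoods under the rate and scale parametrizations
nll_theta = lambda t: -(len(y) * np.log(t) - t * y.sum())
nll_lam = lambda lam: len(y) * np.log(lam) + y.sum() / lam

t_hat = minimize_scalar(nll_theta, bounds=(1e-6, 10), method="bounded").x
l_hat = minimize_scalar(nll_lam, bounds=(1e-6, 10), method="bounded").x
print(t_hat, 1 / l_hat, 1 / y.mean())  # all three agree: lambda_hat = 1/theta_hat
```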

Slide 22 — 3. Fisher information: 3.1 Score vector

Definition. The score vector $s$ is the vector of first (partial) derivatives of the (conditional) log-likelihood with respect to the parameters $\theta \in \Theta \subseteq \mathbb{R}^k$:

$$s(\theta) \equiv \frac{\partial l_n}{\partial \theta}(Y \mid x; \theta) = \left( \frac{\partial l_n}{\partial \theta_i}(Y \mid x; \theta) \right)_{1 \le i \le k}.$$

It satisfies

$$\mathbb{E}_\theta\left[ \frac{\partial l_n}{\partial \theta}(Y \mid x; \theta) \right] = 0_{k \times 1}, \quad \forall x, \theta.$$

Remark: $\mathbb{E}_\theta$ denotes the expectation with respect to the conditional distribution of $Y \mid X$.

Slide 23 — Application: the multiple linear regression model

The score vector is

$$s(\theta) = \begin{pmatrix} \frac{1}{\sigma^2} \sum_i x_i (Y_i - x_i' b) \\ -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4} \sum_i (Y_i - x_i' b)^2 \end{pmatrix},$$

and $\mathbb{E}_\theta[s(\theta)] = 0_{(k+1) \times 1}$, since

$$\mathbb{E}_\theta\left[ \frac{\partial l_n}{\partial b}(Y \mid x; b, \sigma^2) \right] = \frac{1}{\sigma^2} \sum_i x_i \left( \mathbb{E}_\theta(Y_i \mid x_i) - x_i' b \right) = 0_{k \times 1}$$

and

$$\mathbb{E}_\theta\left[ -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4} \sum_i (Y_i - \mathbb{E}_\theta(Y_i \mid x_i))^2 \right] = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4} \sum_i \underbrace{\mathbb{E}_\theta\left[ (Y_i - \mathbb{E}(Y_i \mid x_i))^2 \right]}_{V_\theta(Y_i \mid x_i) = \sigma^2} = 0.$$
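The mean-zero property of the score can be illustrated by Monte Carlo: hold the design fixed, redraw $Y$ at the true parameter, and average the score across replications. A minimal sketch, assuming numpy; all values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
n, b0, sigma2 = 200, 1.0, 2.0
x = rng.normal(size=n)  # scalar regressor, held fixed across replications

def score(y):
    r = y - x * b0
    s_b = (x * r).sum() / sigma2
    s_s2 = -n / (2 * sigma2) + (r ** 2).sum() / (2 * sigma2 ** 2)
    return np.array([s_b, s_s2])

draws = [score(x * b0 + rng.normal(scale=np.sqrt(sigma2), size=n))
         for _ in range(5000)]
print(np.mean(draws, axis=0))  # close to (0, 0)
```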

Slide 24 — 3.2 Fisher information matrix

Definition. The Fisher information matrix at $x$ is the variance-covariance matrix of the score vector:

$$I_F^x = V_\theta\left[ \frac{\partial l_n}{\partial \theta}(Y \mid x; \theta) \right] = \mathbb{E}_\theta\left[ \frac{\partial l_n}{\partial \theta}(Y \mid x; \theta) \cdot \frac{\partial l_n}{\partial \theta'}(Y \mid x; \theta) \right].$$

Slide 25 — Fisher information matrix (cont'd)

Definition. The Fisher information matrix at $x$ is also given by

$$I_F^x = -\mathbb{E}_\theta\left[ \frac{\partial^2 l_n}{\partial \theta \partial \theta'}(Y \mid x; \theta) \right].$$

Remarks:
1. Three equivalent definitions of the Fisher information matrix imply three different consistent estimators of the Fisher information matrix.
2. Their finite sample properties can be quite different!
3. $I_F^x$ can be defined from the Fisher information matrix for observation $i$.

Slide 26 — Fisher information matrix (cont'd)

Definition. The Fisher information matrix for observation $i$ (or $x_i$) can be defined by

$$\tilde{I}_F^{x_i}(\theta) = V_\theta\left[ \frac{\partial l}{\partial \theta}(Y_i \mid x_i; \theta) \right] = \mathbb{E}_\theta\left[ \frac{\partial l}{\partial \theta}(Y_i \mid x_i; \theta) \cdot \frac{\partial l}{\partial \theta'}(Y_i \mid x_i; \theta) \right] = -\mathbb{E}_\theta\left[ \frac{\partial^2 l}{\partial \theta \partial \theta'}(Y_i \mid x_i; \theta) \right].$$

Slide 27 — Fisher information matrix (cont'd)

Proposition. The Fisher information matrix at $x = (x_1, \dots, x_n)$ (or for $n$ observations) is given by

$$I_F^x(\theta) = \sum_{i=1}^n \tilde{I}_F^{x_i}(\theta).$$

Remark: in a sampling model (with i.i.d. observations), one has $I_F^x(\theta) = n \tilde{I}_F^{x_i}(\theta)$.

Slide 28 — Fisher information matrix (cont'd)

Definition. The average Fisher information matrix for one observation is defined by

$$\tilde{I}_F(\theta) = \operatorname*{plim}_{n \to \infty} \frac{1}{n} I_F^X(\theta).$$

Theorem.
(a) $\tilde{I}_F(\theta) = \mathbb{E}_{X_i}\left[ \tilde{I}_F^{X_i}(\theta) \right]$
(b) $\tilde{I}_F(\theta) = \mathbb{E}\left[ \frac{\partial l}{\partial \theta}(Y_i \mid X_i; \theta) \cdot \frac{\partial l}{\partial \theta'}(Y_i \mid X_i; \theta) \right]$
(c) $\tilde{I}_F(\theta) = -\mathbb{E}\left[ \frac{\partial^2 l}{\partial \theta \partial \theta'}(Y_i \mid X_i; \theta) \right]$

Slide 29 — A consistent estimator of the Fisher information matrix?

Proposition. If $\hat{\theta}_{n,ML}$ converges in probability to $\theta_0$, then

$$\hat{I}_F^{(1)}(\hat{\theta}_{n,ML}) = \frac{1}{n} \sum_{i=1}^n \tilde{I}_F^{x_i}(\hat{\theta}_{n,ML}),$$

$$\hat{I}_F^{(2)}(\hat{\theta}_{n,ML}) = \frac{1}{n} \sum_{i=1}^n \frac{\partial l_i}{\partial \theta}(y_i \mid x_i; \hat{\theta}_{n,ML}) \, \frac{\partial l_i}{\partial \theta'}(y_i \mid x_i; \hat{\theta}_{n,ML}),$$

$$\hat{I}_F^{(3)}(\hat{\theta}_{n,ML}) = -\frac{1}{n} \sum_{i=1}^n \frac{\partial^2 l_i}{\partial \theta \partial \theta'}(y_i \mid x_i; \hat{\theta}_{n,ML}) = -\frac{1}{n} \frac{\partial^2 l_n}{\partial \theta \partial \theta'}(y \mid x; \hat{\theta}_{n,ML})$$

are three consistent estimators of the Fisher information matrix.

Slide 30 — Fisher information matrix (cont'd)

These three consistent estimators of the Fisher information matrix are asymptotically equivalent, and none is preferable to the others on statistical grounds. The main difficulty is that they can have very different finite sample properties (again!), which can lead to different statistical conclusions for the same problem! A sketch of the second and third estimators for the linear model follows.
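For the linear model, $\hat{I}_F^{(2)}$ (outer product of gradients) and $\hat{I}_F^{(3)}$ (negative average Hessian) can be computed directly from the analytic per-observation derivatives. A minimal sketch, assuming numpy; the function name is illustrative and `b_hat`, `s2_hat` come from the estimation step.

```python
import numpy as np

def opg_and_hessian(y, X, b_hat, s2_hat):
    """I_hat^(2) (OPG) and I_hat^(3) (negative average Hessian)."""
    n, k = X.shape
    r = y - X @ b_hat
    # per-observation score: ( x_i r_i / s2, -1/(2 s2) + r_i^2 / (2 s2^2) )
    g = np.column_stack([X * (r / s2_hat)[:, None],
                         -1 / (2 * s2_hat) + r ** 2 / (2 * s2_hat ** 2)])
    I2 = g.T @ g / n                            # outer product of gradients
    H = np.zeros((k + 1, k + 1))                # Hessian of l_n
    H[:k, :k] = -(X.T @ X) / s2_hat
    H[:k, k] = H[k, :k] = -(X.T @ r) / s2_hat ** 2
    H[k, k] = n / (2 * s2_hat ** 2) - (r ** 2).sum() / s2_hat ** 3
    I3 = -H / n
    return I2, I3
```

At the MLE the cross term $X'r$ vanishes by the first-order conditions, but the two estimators still differ in finite samples, which is exactly the point of slide 30.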

Slide 31 — Application: the multiple linear regression model

Computation of $\tilde{I}_F(\theta)$. Derivation of the Hessian matrix of the log-likelihood function for observation $i$:

$$\frac{\partial^2 l_i}{\partial \theta \partial \theta'}(y_i \mid x_i; \theta) = \begin{pmatrix} -\frac{1}{\sigma^2} x_i x_i' & -\frac{1}{\sigma^4} x_i (y_i - x_i' b) \\ -\frac{1}{\sigma^4} x_i' (y_i - x_i' b) & \frac{1}{2\sigma^4} - \frac{1}{\sigma^6} (y_i - x_i' b)^2 \end{pmatrix}$$

Expectation with respect to the conditional distribution of $Y_i \mid X_i = x_i$:

$$-\mathbb{E}_\theta\left[ \frac{\partial^2 l_i}{\partial \theta \partial \theta'} \,\Big|\, x_i \right] = \begin{pmatrix} \frac{1}{\sigma^2} x_i x_i' & 0_{k \times 1} \\ 0_{1 \times k} & \frac{1}{2\sigma^4} \end{pmatrix}$$

Expectation with respect to the distribution of $X_i$:

$$\tilde{I}_F(\theta) = \mathbb{E}_{X_i}\left[ -\mathbb{E}_\theta\left( \frac{\partial^2 l_i}{\partial \theta \partial \theta'} \,\Big|\, X_i \right) \right] = \begin{pmatrix} \frac{1}{\sigma^2} \mathbb{E}(X_i X_i') & 0_{k \times 1} \\ 0_{1 \times k} & \frac{1}{2\sigma^4} \end{pmatrix}$$

Slide 32 — 4. Asymptotic results: 4.1 Overview

Under certain regularity conditions, the maximum likelihood estimator $\hat{\theta}_n$ possesses many appealing properties:
1. The maximum likelihood estimator is consistent.
2. The maximum likelihood estimator is asymptotically normal: $\sqrt{n}(\hat{\theta}_n - \theta_0) \xrightarrow{d} \mathcal{N}(\cdot, \cdot)$.
3. The maximum likelihood estimator is asymptotically optimal or efficient.
4. The maximum likelihood estimator is equivariant: if $\hat{\theta}_n$ is an estimator of $\theta_0$, then $g(\hat{\theta}_n)$ is an estimator of $g(\theta_0)$.

Slide 33 — Overview (cont'd)

At the same time:
- These properties depend on the explicit assumptions regarding $Y_1, \dots, Y_n$.
- Finite sample properties can be very different from large sample properties:
  - the maximum likelihood estimator is consistent but can be severely biased in finite samples;
  - the estimation of the variance-covariance matrix can be seriously doubtful in finite samples.

Slide 34 — 4.2 Consistency

Theorem. Under suitable regularity conditions, $\hat{\theta}_{n,ML} \xrightarrow{a.s.} \theta_0$.

Remark: this implies that $\hat{\theta}_{n,ML} \xrightarrow{p} \theta_0$.

Slide 35 — 4.3 Asymptotic efficiency

Proposition. An unbiased maximum likelihood estimator of $\theta$ or $g(\theta)$ attains the FDCR (Fréchet-Darmois-Cramér-Rao) lower bound and is thus (asymptotically) efficient.

Slide 36 — 4.4 Large sample distribution

Theorem. Under suitable regularity conditions,

$$\sqrt{n}(\hat{\theta}_{n,ML} - \theta_0) \xrightarrow{d} \mathcal{N}\left( 0, \tilde{I}_F^{-1}(\theta_0) \right), \quad \text{i.e.} \quad \hat{\theta}_{n,ML} \overset{a}{\sim} \mathcal{N}\left( \theta_0, n^{-1} \tilde{I}_F^{-1}(\theta_0) \right).$$

Remark: in a sampling model, $I_F(\theta_0)$ does not depend on $x$ and equals the Fisher information matrix for one observation, $I_1(\theta_0)$:

$$\sqrt{n}(\hat{\theta}_{n,ML} - \theta_0) \xrightarrow{d} \mathcal{N}\left( 0, I_1^{-1}(\theta_0) \right).$$

Slide 37 — Interpretation

For $n$ large, the distribution of $\hat{\theta}_n$ is approximately normal, with expectation the true unknown parameter and variance-covariance matrix the FDCR lower bound. The maximum likelihood estimator is asymptotically unbiased and asymptotically efficient.

Slide 38 — 4.5 Back to the equivariance...

Proposition. Assume H1, H2, H3-H8 hold and $g$ is a continuously differentiable function of $\theta$ from $\mathbb{R}^k$ to $\mathbb{R}^p$; then

$$g(\hat{\theta}_n) \xrightarrow{a.s.} g(\theta_0)$$

$$\sqrt{n}\left( g(\hat{\theta}_n) - g(\theta_0) \right) \xrightarrow{d} \mathcal{N}\left( 0, \left[ \frac{\partial g}{\partial \theta'}(\theta_0) \right] \tilde{I}_F^{-1}(\theta_0) \left[ \frac{\partial g}{\partial \theta'}(\theta_0) \right]' \right).$$
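In practice the delta-method variance is computed by plugging a gradient (here approximated by central differences) into the sandwich formula above. A minimal sketch for a scalar transform $g$, assuming numpy; all names are illustrative, and `I_tilde_inv` is taken as given from the estimation step.

```python
import numpy as np

def delta_method_var(g, theta_hat, I_tilde_inv, n, eps=1e-6):
    """Asymptotic variance of g(theta_hat): (1/n) * grad' I^{-1} grad."""
    p = len(theta_hat)
    grad = np.array([(g(theta_hat + eps * np.eye(p)[j])
                      - g(theta_hat - eps * np.eye(p)[j])) / (2 * eps)
                     for j in range(p)])  # central-difference gradient
    return grad @ I_tilde_inv @ grad / n
```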

Slide 39 — Application: the multiple linear regression model

The inverse (average) Fisher information matrix is

$$\tilde{I}_F^{-1}(\theta_0) = \begin{pmatrix} \sigma_0^2 \left( \mathbb{E}(X_i X_i') \right)^{-1} & 0_{k \times 1} \\ 0_{1 \times k} & 2\sigma_0^4 \end{pmatrix}.$$

Therefore

$$\sqrt{n}(\hat{b}_{n,ML} - b_0) \xrightarrow{d} \mathcal{N}\left( 0, \sigma_0^2 \left( \mathbb{E}(X_i X_i') \right)^{-1} \right), \qquad \sqrt{n}(\hat{\sigma}^2_{n,ML} - \sigma_0^2) \xrightarrow{d} \mathcal{N}(0, 2\sigma_0^4),$$

and the two vectors $\sqrt{n}(\hat{b}_{n,ML} - b_0)$ and $\sqrt{n}(\hat{\sigma}^2_{n,ML} - \sigma_0^2)$ are asymptotically independent.

Slide 40 — Application (cont'd)

A consistent estimate of the Fisher information matrix is given by

$$\hat{\tilde{I}}_F^{(1)} = \frac{1}{n} \sum_{i=1}^n \left( -\mathbb{E}\left[ \frac{\partial^2 l}{\partial \theta \partial \theta'}(Y_i \mid x_i; \theta) \right] \right)\Bigg|_{\theta = \hat{\theta}_n} = \begin{pmatrix} \frac{1}{\hat{\sigma}_n^2} \left( \frac{1}{n} X'X \right) & 0_{k \times 1} \\ 0_{1 \times k} & \frac{1}{2\hat{\sigma}_n^4} \end{pmatrix},$$

so that

$$\hat{b}_{n,ML} \overset{a}{\sim} \mathcal{N}\left( b_0, \hat{\sigma}_n^2 (X'X)^{-1} \right), \qquad \hat{\sigma}^2_{n,ML} \overset{a}{\sim} \mathcal{N}\left( \sigma_0^2, \frac{2\hat{\sigma}_n^4}{n} \right).$$
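These approximate distributions translate directly into standard errors. A minimal sketch, assuming numpy and reusing $\hat{\sigma}_n^2$ from the earlier estimation sketch; the function name is illustrative.

```python
import numpy as np

def mle_standard_errors(X, sigma2_hat):
    """se(b_hat_j) = sqrt(s2 * [(X'X)^{-1}]_jj); se(s2_hat) = sqrt(2 s2^2 / n)."""
    n = X.shape[0]
    cov_b = sigma2_hat * np.linalg.inv(X.T @ X)
    se_b = np.sqrt(np.diag(cov_b))
    se_s2 = np.sqrt(2 * sigma2_hat ** 2 / n)
    return se_b, se_s2
```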
