Advanced Quantitative Methods: maximum likelihood

1 Advanced Quantitative Methods: Maximum Likelihood. University College Dublin, 4 March 2014

3 Outline

4 Derivatives of straight lines: for y = (1/2)x + 2, the derivative is dy/dx = 1/2.

7 Derivatives of curves: y = x² - 4x + 5. Using the rules

d(ax^b)/dx = bax^(b-1)
d(a + b)/dx = da/dx + db/dx

we get dy/dx = 2x - 4.

10 Finding the location of peaks and valleys: a maximum or minimum value of a curve can be found by setting the derivative equal to zero. For y = x² - 4x + 5:

dy/dx = 2x - 4 = 0
2x = 4
x = 4/2 = 2

So at x = 2 we will find a (local) maximum or minimum value; the plot shows that in this case it is a minimum.
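
We can confirm this numerically in R (a minimal sketch; optimize() searches a given interval for the extremum of a function):

f <- function(x) x^2 - 4*x + 5
optimize(f, interval = c(-10, 10))   # minimum at x = 2, where f(2) = 1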

11 Derivatives in regression: this concept of finding the minimum or maximum is crucial for regression analysis. With ordinary least squares (OLS), we estimate the β-coefficients by finding the minimum of the sum of squared errors. With maximum likelihood (ML), we estimate the β-coefficients by finding the maximum of the (log)likelihood function.

13 Derivatives of matrices: the derivative of a matrix is the matrix containing the derivatives of each element of the original matrix. For example:

M = [ x²       2x - 3 ]        dM/dx = [ 2x   2   ]
    [ 2x + 4   3x³    ]                [ 2    9x² ]
    [ 2x²      0      ]                [ 4x   0   ]

14 Derivatives of matrices: some useful rules (′ denotes the transpose):

d(v′x)/dx = v
d(x′Ax)/dx = (A + A′)x
d(Bx)/dx = B
d²(x′Ax)/dxdx′ = A + A′
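
A numerical spot check of the quadratic-form rule (a sketch; it assumes the numDeriv package is installed, and A and x0 are arbitrary illustrative values):

library(numDeriv)                              # provides grad()
A <- matrix(c(1, 2, 0, 3), 2, 2)
x0 <- c(1, -1)
grad(function(x) drop(t(x) %*% A %*% x), x0)   # numerical gradient of x'Ax
drop((A + t(A)) %*% x0)                        # analytical result: (A + A')x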

15 Outline

16 Preliminaries: ordinary least squares and its variations (GLS, WLS, 2SLS, etc.) are very flexible and applicable in many circumstances, but for some models (e.g. limited dependent variable models) we need more flexible estimation procedures. The most prominent alternative estimators are maximum likelihood (ML) and Bayesian estimators. This lecture is about the former.

17 Logarithms: some rules about logarithms, which are important for understanding ML:

log ab = log a + log b
log e^a = a
log a^b = b log a
log a > log b iff a > b
d(log a)/da = 1/a
d(log f(a))/da = (df(a)/da) / f(a)

19 Likelihood: intuition. Dice example: imagine a die with an unknown number of sides k is thrown. We know after throwing the die that the outcome is 5. How likely is this outcome if k = 0? And if k = 1? Continue until k = 10. A plot of these values against k is the likelihood function.
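
A sketch of that plot in R: for a fair k-sided die the probability of observing a 5 is 1/k when k ≥ 5 and 0 otherwise.

k <- 0:10
lik <- ifelse(k >= 5, 1 / k, 0)   # likelihood of the observed outcome 5
plot(k, lik, type = "h", xlab = "k (number of sides)", ylab = "likelihood")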

20 Likelihood Likelihood is the hypothetical probability that an event that has already occurred would yield a specific outcome. The concept differs from that of a probability in that a probability refers to the occurrence of future events, while a likelihood refers to past events with known outcomes. (Wolfram Mathworld)

22 Preliminaries: looking first at just univariate models, we can express the model as the probability density function (PDF) f(y, θ), where y is the dependent variable, θ the set of parameters, and f(·) expresses the model specification. ML is purely parametric: we need to make strong assumptions about the shape of f(·) before we can estimate θ.

25 Likelihood: given θ, f(·, θ) is the PDF of y. Given y, f(y, ·) cannot be interpreted as a PDF and is therefore called the likelihood function. θ̂_ML is the θ that maximizes this likelihood function. (Davidson & MacKinnon 1999)

27 (Log)likelihood function: generally, observations are assumed to be independent, in which case the joint density of the entire sample is the product of the densities of the individual observations:

f(y, θ) = ∏_{i=1}^n f(y_i, θ)

It is easier to work with the log of this function, because sums are easier to deal with than products and the logarithmic transformation is monotonic:

l(y, θ) ≡ log f(y, θ) = Σ_{i=1}^n log f_i(y_i, θ) = Σ_{i=1}^n l_i(y_i, θ)

(Davidson & MacKinnon 1999)

29 Example: exponential distribution. For example, take the PDF of y to be

f(y_i, θ) = θe^(-θy_i),  y_i > 0, θ > 0

This distribution is useful when modeling something which is by definition positive. (Davidson & MacKinnon 1999)

30 Example: exponential distribution. Deriving the loglikelihood function:

f(y, θ) = ∏_{i=1}^n f(y_i, θ) = ∏_{i=1}^n θe^(-θy_i)

l(y, θ) = Σ_{i=1}^n log f(y_i, θ) = Σ_{i=1}^n log(θe^(-θy_i)) = Σ_{i=1}^n (log θ - θy_i) = n log θ - θ Σ_{i=1}^n y_i

(Davidson & MacKinnon 1999)

33 Maximum likelihood estimates: the estimate of θ is the value of θ which maximizes the (log)likelihood function l(y, θ). In some cases, such as the above exponential distribution or the linear model, we can find this value analytically, by taking the derivative and setting it equal to zero. In many other cases, there is no such analytical solution, and we use numerical search algorithms to find the value.

34 Example: exponential distribution. To find the maximum, we take the derivative and set it equal to zero:

l(y, θ) = n log θ - θ Σ_{i=1}^n y_i
dl(y, θ)/dθ = n/θ - Σ_{i=1}^n y_i
n/θ̂_ML - Σ_{i=1}^n y_i = 0
θ̂_ML = n / Σ_{i=1}^n y_i

(Davidson & MacKinnon 1999)
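
Checking the analytical result against a numerical maximization in R (a sketch with simulated data; the true θ is set to 2 here):

set.seed(42)
y <- rexp(500, rate = 2)
length(y) / sum(y)        # analytical MLE: n / sum of y_i
ll <- function(theta, y) length(y) * log(theta) - theta * sum(y)
optimize(ll, interval = c(0.01, 10), y = y, maximum = TRUE)$maximum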

36 Numerical methods In many cases, closed form solutions like the examples above do not exist and numerical methods are necessary. A computer algorithm searches for the value that maximizes the loglikelihood. In R: optim()

37 Newton's method: starting from an initial guess, repeatedly update

θ_(m+1) = θ_(m) - H⁻¹_(m) g_(m)

where g_(m) is the gradient and H_(m) the Hessian of the loglikelihood at θ_(m). (Davidson & MacKinnon 1999, 401)
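
A minimal sketch of this update for the exponential loglikelihood above, where the gradient is n/θ - Σ y_i and the Hessian is -n/θ² (the function name and tolerance are illustrative; it reuses the simulated y from the earlier sketch):

newton_exp <- function(y, theta = 1, tol = 1e-10) {
  n <- length(y)
  repeat {
    g <- n / theta - sum(y)   # gradient at the current theta
    H <- -n / theta^2         # Hessian at the current theta
    step <- g / H             # H^{-1} g, a scalar in this one-parameter case
    theta <- theta - step     # the Newton update
    if (abs(step) < tol) return(theta)
  }
}
newton_exp(y)                 # converges to n / sum(y)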

38 Outline

40 Linear model: y = Xβ + ε, ε ~ N(0, σ²I). We assume ε to be independently distributed and therefore, conditional on X, y is assumed to be as well.

f_i(y_i, β, σ²) = (1/√(2πσ²)) e^(-(y_i - X_iβ)²/(2σ²))

(Davidson & MacKinnon 1999)
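
In R this density can be evaluated directly with dnorm() (a minimal sketch; y_i, Xb and s2 are hypothetical values standing for one observation, its mean X_iβ, and σ²):

y_i <- 1.2; Xb <- 1.0; s2 <- 0.5                       # hypothetical values
dnorm(y_i, mean = Xb, sd = sqrt(s2))                   # built-in normal density
1 / sqrt(2 * pi * s2) * exp(-(y_i - Xb)^2 / (2 * s2))  # the formula above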

41 Linear model: loglikelihood function

f(y, β, σ²) = ∏_{i=1}^n f_i(y_i, β, σ²) = ∏_{i=1}^n (1/√(2πσ²)) e^(-(y_i - X_iβ)²/(2σ²))

l(y, β, σ²) = Σ_{i=1}^n log f_i(y_i, β, σ²) = Σ_{i=1}^n log((1/√(2πσ²)) e^(-(y_i - X_iβ)²/(2σ²)))
= Σ_{i=1}^n ( -(1/2) log σ² - (1/2) log 2π - (1/(2σ²))(y_i - X_iβ)² )
= -(n/2) log σ² - (n/2) log 2π - (1/(2σ²)) Σ_{i=1}^n (y_i - X_iβ)²
= -(n/2) log σ² - (n/2) log 2π - (1/(2σ²)) (y - Xβ)′(y - Xβ)

42 Linear model: loglikelihood function. To zoom in on one part of this function:

log(1/√(2πσ²)) = log 1 - log(σ√(2π)) = 0 - (log σ + log √(2π)) = -(1/2) log σ² - (1/2) log 2π,

using log σ = (1/2) log σ².

45 Linear model: estimating σ²

∂l(y, β, σ²)/∂σ² = -n/(2σ²) + (1/(2σ⁴))(y - Xβ)′(y - Xβ)
0 = -n/(2σ̂²) + (1/(2σ̂⁴))(y - Xβ)′(y - Xβ)
n/(2σ̂²) = (1/(2σ̂⁴))(y - Xβ)′(y - Xβ)
n = (1/σ̂²)(y - Xβ)′(y - Xβ)
σ̂² = (1/n)(y - Xβ)′(y - Xβ)

Thus σ̂² is a function of β. (Davidson & MacKinnon 1999)

46 Linear model: estimating σ²

σ̂² = (1/n)(y - Xβ)′(y - Xβ) = e′e/n

Note that this is almost the equivalent of the variance estimator of OLS (e′e/(n - k)). This is a biased, but consistent, estimator of σ².
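
The two estimators side by side in R (a sketch; it assumes a fitted lm object called ols):

e <- residuals(ols)
n <- length(e)
k <- length(coef(ols))
sum(e^2) / n         # ML estimator of sigma^2: biased but consistent
sum(e^2) / (n - k)   # OLS estimator of sigma^2: unbiased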

48 Linear model: estimating β

l(y, β, σ²) = -(n/2) log σ² - (n/2) log 2π - (1/(2σ²))(y - Xβ)′(y - Xβ)

Substituting σ̂² = (1/n)(y - Xβ)′(y - Xβ) gives the concentrated loglikelihood:

l(y, β) = -(n/2) log((1/n)(y - Xβ)′(y - Xβ)) - (n/2) log 2π - (y - Xβ)′(y - Xβ) / (2 · (1/n)(y - Xβ)′(y - Xβ))
= -(n/2) log((1/n)(y - Xβ)′(y - Xβ)) - (n/2) log 2π - n/2

Because the only term involving β is minus n/2 times the log of (1/n times) the SSR, maximizing l(y, β) with respect to β is equivalent to minimizing the SSR: β̂_ML = β̂_OLS. (Davidson & MacKinnon 1999)

49 In R:

l(y, β, σ²) = -(n/2) log σ² - (n/2) log 2π - (1/(2σ²))(y - Xβ)′(y - Xβ)

ll <- function(par, X, y) {
  s2 <- exp(par[1])     # variance, kept positive via exp()
  beta <- par[-1]
  n <- length(y)
  r <- y - X %*% beta   # residuals
  -n/2 * log(s2) - n/2 * log(2*pi) - 1/(2*s2) * sum(r^2)
}
mlest <- optim(c(1, 0, 0, 0), ll, NULL, X, y,
               control = list(fnscale = -1), hessian = TRUE)

50 Some practical tricks: by default, optim() minimizes rather than maximizes, hence the need for fnscale = -1. To get σ̂², we need to take the exponent of the first parameter; this transformation is used to make sure σ² is always positive.
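
Recovering the estimates from mlest under this parameterization (a short sketch):

s2hat <- exp(mlest$par[1])   # sigma^2: back-transform the log-variance
betahat <- mlest$par[-1]     # the beta coefficients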

51 Outline

52 Gradient vector: the gradient vector or score vector is a vector with typical element

g_j(y, θ) ≡ ∂l(y, θ)/∂θ_j,

i.e. the vector of partial derivatives of the loglikelihood function with respect to each parameter in θ. (Davidson & MacKinnon 1999, 400)

54 Hessian matrix: H(θ) is a k × k matrix with typical element

h_ij = ∂²l(θ)/∂θ_i ∂θ_j,

i.e. the matrix of second derivatives of the loglikelihood function. Asymptotic equivalent: ℋ(θ) ≡ plim_{n→∞} (1/n) H(y, θ). (Davidson & MacKinnon 1999, 401, 407)

57 Information matrix

I(θ) ≡ Σ_{i=1}^n E_θ(G_i(y, θ)′ G_i(y, θ))

I(θ) is the information matrix, or the covariance matrix of the score vector. There is also an asymptotic equivalent, 𝓘(θ) ≡ plim_{n→∞} (1/n) I(θ). 𝓘(θ) = -ℋ(θ): they both measure the amount of curvature in the loglikelihood function. (Davidson & MacKinnon 1999)

58 Outline

60 Under some weak conditions, ML estimators are consistent, asymptotically efficient and asymptotically normally distributed. New ML estimators thus do not require extensive proof of these properties: once an estimator is shown to be an ML estimator, it inherits them. (Davidson & MacKinnon 1999)

62 Variance-covariance matrix of θ̂_ML

Var(plim_{n→∞} √n(θ̂_ML - θ)) = ℋ⁻¹(θ) 𝓘(θ) ℋ⁻¹(θ) = 𝓘⁻¹(θ),

the latter being true iff 𝓘(θ) = -ℋ(θ) holds. One estimator for the variance is V̂(θ̂_ML) = -H⁻¹(θ̂_ML). For alternative estimators see the reference below. (Davidson & MacKinnon 1999)

63 Estimating V̂(θ̂_ML)

mlest <- optim(..., hessian = TRUE)
V <- solve(-mlest$hessian)   # inverse of minus the Hessian
sqrt(diag(V))                # standard errors

65 Exercise: imagine we take a random sample of N colored balls from a vase, with replacement. In the vase, a fraction p of the balls are yellow and the other ones blue. In our sample, there are q yellow balls. Assuming that y_i = 1 if the ball is yellow and y_i = 0 if blue, the probability of obtaining our sample is

P(q) = p^q (1 - p)^(N-q),

which is the likelihood function with unknown parameter p. Write down the loglikelihood function.

l(p) = log(p^q (1 - p)^(N-q)) = q log p + (N - q) log(1 - p)

(Verbeek 2008)

68 Exercise

l(p) = q log p + (N - q) log(1 - p)

Take the derivative with respect to p:

dl(p)/dp = q/p - (N - q)/(1 - p)

Set equal to zero and find p̂_ML:

q/p̂ - (N - q)/(1 - p̂) = 0  ⟹  p̂ = q/N

(Verbeek 2008)

71 Exercise

l(p) = q log p + (N - q) log(1 - p)

Using the loglikelihood function and optim(), find p̂ for N = 100 and q = 30. ML estimates are invariant under reparametrization, so we optimize over an unconstrained parameter p and map it to a probability with the inverse logit transform, which avoids probabilities outside [0, 1]: p* = 1/(1 + e^(-p)).

ll <- function(p, N, q) {
  pstar <- 1 / (1 + exp(-p))   # inverse logit: maps p into (0, 1)
  q * log(pstar) + (N - q) * log(1 - pstar)
}
mlest <- optim(0, ll, NULL, N = 100, q = 30,
               hessian = TRUE, control = list(fnscale = -1))
phat <- 1 / (1 + exp(-mlest$par))

73 Exercise: using the previous estimate, including the Hessian, calculate a 95% confidence interval around this p̂.

se <- sqrt(-1 / mlest$hessian)                         # standard error on the logit scale
ci <- c(mlest$par - 1.96 * se, mlest$par + 1.96 * se)  # 95% interval on the logit scale
ci <- 1 / (1 + exp(-ci))                               # map back to probabilities

Repeat for N = 1000, q = 300.

74 Exercise: using the uswages.dta data, estimate the model

log(wage)_i = β₀ + β₁ educ_i + β₂ exper_i + ε_i

with the lm() function and with the optim() function. Compare the estimates for β, σ² and V̂(β). Try different optimization algorithms in optim().
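
One possible setup for this exercise (a sketch only: it assumes uswages.dta sits in the working directory, that the variables are named wage, educ and exper, and it reuses the ll() function from the linear-model slide above):

library(foreign)                 # read.dta() reads Stata files
uswages <- read.dta("uswages.dta")
y <- log(uswages$wage)
X <- cbind(1, uswages$educ, uswages$exper)
olsest <- lm(log(wage) ~ educ + exper, data = uswages)
mlest <- optim(c(1, 0, 0, 0), ll, NULL, X = X, y = y,
               method = "BFGS",  # try also "Nelder-Mead", "CG"
               control = list(fnscale = -1), hessian = TRUE)
coef(olsest)                     # compare with...
mlest$par[-1]                    # ...the ML estimates of beta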

75 Outline

77 Within the ML framework, there are three common tests: the likelihood ratio test (LR), the Wald test (W), and the Lagrange multiplier test (LM). These are asymptotically equivalent: given r restrictions, all three asymptotically follow a χ²(r) distribution. In the following, θ̃_ML will denote the restricted estimate and θ̂_ML the unrestricted one. (Davidson & MacKinnon 1999, 414)

78 Likelihood ratio test

LR = 2(l(θ̂_ML) - l(θ̃_ML)) = 2 log(L(θ̂_ML) / L(θ̃_ML)),

thus twice the log of the ratio of the likelihood functions. If we have the two separate estimates, this is very easy to compute or even eyeball. (Davidson & MacKinnon 1999)
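
A sketch of an LR test for two nested lm() fits (the model formulas are hypothetical; logLik() extracts the maximized loglikelihood):

unrestricted <- lm(y ~ x1 + x2)
restricted <- lm(y ~ x1)          # imposes beta_2 = 0, i.e. r = 1 restriction
LR <- 2 * (logLik(unrestricted) - logLik(restricted))
pchisq(as.numeric(LR), df = 1, lower.tail = FALSE)   # asymptotic p-value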

79 Likelihood ratio test. [Figure: the loglikelihood plotted against θ, marking the restricted and unrestricted estimates; the vertical distance between the two loglikelihood values equals LR/2.]

81 Wald test: the basic intuition is that the Wald test is quite comparable to an F-test on a set of restrictions. For a single regression parameter the formula would be

W = (β̂ - β₀)² (-d²l(β̂)/dβ̂²)

This test does not require an estimate of θ̃_ML. The test is somewhat sensitive to the formulation of the restrictions, so as long as estimating θ̃_ML is not too costly, it is better to use an LR or LM test. (Davidson & MacKinnon 1999)

82 Wald test: if the second derivative is larger in magnitude, the slope of l(·) changes faster, and thus the difference between the loglikelihoods at the restricted and unrestricted positions will be larger.

83 Wald test. [Figure: the loglikelihood plotted against θ, marking the restricted and unrestricted estimates.]

84 Lagrange multiplier test. For a single parameter:

LM = (dl(β̃)/dβ̃)² / (-d²l(β̃)/dβ̃²)

Thus the steeper the slope of the loglikelihood function at the point of the restricted estimate, the more likely it is significantly different from the unrestricted estimate, unless this slope changes very quickly (i.e. the second derivative is large).

85 Lagrange multiplier test. [Figure: the loglikelihood plotted against θ, marking the restricted and unrestricted estimates; the LM test is based on the slope at the restricted estimate.]
