Lecture 6: Discrete Choice: Qualitative Response


1 Lecture 6: Discrete Choice: Qualitative Response
Instructor:
Department of Economics, Stanford University, 2011

2 Types of Discrete Choice Models

Univariate models. Binary: linear, probit, logit, arctan, etc. Multinomial: logit, nested logit, GEV, probit. Multivariate models.

Binary choice: only the choice is observed. The choice maximizes the latent utility $y_i^*$:
$$y_i^* = x_i'\beta + \varepsilon_i, \qquad \varepsilon_i \sim F(\cdot),$$
where $F(\cdot)$ is known and symmetric around 0. We observe $y_i = 1(y_i^* \geq 0)$, so
$$P(y_i = 1) = P(\varepsilon_i > -x_i'\beta) = 1 - F(-x_i'\beta) = F(x_i'\beta).$$

3 Choice of $F$

Logit: $F(t) = \Lambda(t) = \frac{\exp(t)}{1+\exp(t)}$. Probit: $F(t) = \Phi(t)$, the standard normal CDF. Arctan: $F(t) = \frac{1}{2} + \frac{1}{\pi}\arctan(t)$, the Cauchy distribution.

Is the choice of $F(\cdot)$ important?

NO. If the truth is $P(y_i = 1) = G(x_i'\beta)$ but you specify $F$, then there exists $H(\cdot)$ such that $P(y_i = 1) = F(H(x_i, \beta))$. Since $H(x_i, \beta)$ can be approximated by $z_i'\beta$ for some transformation $z_i$ of $x_i$, it is fine to run $P(y_i = 1) = F(z_i'\beta)$; the choice of $z_i$ is then the important factor.

YES. In practice the choice of $x_i$ is usually determined by how the coefficients are to be interpreted, and one will not transform it into $z_i$ simply to get a better fit. So the choice of $F(\cdot)$ is in fact important.
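The three link functions above can be sketched in pure Python using only the standard library ($\Phi$ via `math.erf`). All three are symmetric around 0 and equal $\frac{1}{2}$ at $t = 0$; they differ in the tails, with Cauchy heaviest and normal lightest.

```python
import math

def logit_cdf(t):
    # Lambda(t) = exp(t) / (1 + exp(t)), written in a numerically stable form
    return 1.0 / (1.0 + math.exp(-t))

def probit_cdf(t):
    # Phi(t), the standard normal CDF, via the error function
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def arctan_cdf(t):
    # Cauchy CDF: 1/2 + arctan(t)/pi
    return 0.5 + math.atan(t) / math.pi

# symmetry around 0 and tail behavior at a glance
for F in (logit_cdf, probit_cdf, arctan_cdf):
    print(F.__name__, round(F(0.0), 3), round(F(2.0), 4))
```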

4 Maximum Likelihood Estimation

Is the estimate $\hat\beta$ itself important? NO. Compare $\partial F(x'\hat\beta)/\partial x$ across models, usually evaluated at the average of $x$.

MLE:
$$L = \prod_{i=1}^n F(x_i'\beta)^{y_i}\,(1 - F(x_i'\beta))^{1-y_i}$$
$$\log L = \sum_{i=1}^n \left[ y_i \log F(x_i'\beta) + (1-y_i)\log(1 - F(x_i'\beta)) \right]$$
$$\frac{\partial \log L}{\partial \beta} = \sum_{i=1}^n \left[ \frac{y_i}{F(x_i'\beta)} - \frac{1-y_i}{1-F(x_i'\beta)} \right] f(x_i'\beta)\, x_i = \sum_{i=1}^n \frac{y_i - F(x_i'\beta)}{F(x_i'\beta)(1-F(x_i'\beta))}\, f(x_i'\beta)\, x_i$$
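A minimal sketch of the logit MLE: for the logit the score simplifies to $\sum_i (y_i - \Lambda(x_i'\beta))\,x_i = 0$, and because $\log L$ is globally concave, even plain gradient ascent finds the unique maximum. The toy data (scalar $x$ plus intercept) are invented for illustration.

```python
import math

def lam(t):
    # logistic CDF
    return 1.0 / (1.0 + math.exp(-t))

# toy sample of (x_i, y_i); overlap between groups keeps the MLE finite
data = [(-2.0, 0), (-1.5, 0), (-1.0, 0), (-0.5, 0), (0.0, 1),
        (0.5, 0), (1.0, 1), (1.5, 1), (2.0, 1), (2.5, 1)]

def score(b0, b1):
    # gradient of log L: sum_i (y_i - Lambda(b0 + b1 x_i)) * (1, x_i)
    g0 = sum(y - lam(b0 + b1 * x) for x, y in data)
    g1 = sum((y - lam(b0 + b1 * x)) * x for x, y in data)
    return g0, g1

# fixed-step gradient ascent; concavity guarantees convergence to the MLE
b0 = b1 = 0.0
for _ in range(20000):
    g0, g1 = score(b0, b1)
    b0 += 0.05 * g0
    b1 += 0.05 * g1
```

At convergence the score is (numerically) zero, which is exactly the first-order condition displayed above.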

5 Then
$$-\frac{1}{n}\frac{\partial^2 \log L(\beta_0)}{\partial \beta\, \partial \beta'} \xrightarrow{p} \lim_n \frac{1}{n}\sum_{i=1}^n \frac{f_i^2}{F_i(1-F_i)}\, x_i x_i'$$
since $E(y_i - F(x_i'\beta) \mid x_i) = 0$, and
$$\frac{1}{\sqrt{n}}\frac{\partial \log L(\beta_0)}{\partial \beta} \xrightarrow{d} N\!\left(0,\ \lim_n \frac{1}{n}\sum_{i=1}^n \frac{f_i^2}{F_i(1-F_i)}\, x_i x_i'\right)$$
since $\mathrm{Var}(y_i \mid x_i) = F_i(1-F_i)$. Therefore
$$\sqrt{n}\,(\hat\beta - \beta_0) = \left[-\frac{1}{n}\frac{\partial^2 \log L(\beta_0)}{\partial \beta\, \partial \beta'}\right]^{-1} \frac{1}{\sqrt{n}}\frac{\partial \log L(\beta_0)}{\partial \beta} \xrightarrow{d} N\!\left(0,\ \left[\lim_n \frac{1}{n}\sum_{i=1}^n \frac{f_i^2}{F_i(1-F_i)}\, x_i x_i'\right]^{-1}\right)$$

6 Probit and Logit Case

The two distributions are similar in both location and scale; only the tails of the logit are heavier. The variance of the logistic distribution is $\pi^2/3$. For probit:
$$\sqrt{n}\,(\hat\beta - \beta) \xrightarrow{d} N\!\left(0,\ \left[\lim_n \frac{1}{n}\sum_{i=1}^n \frac{\phi_i^2}{\Phi_i(1-\Phi_i)}\, x_i x_i'\right]^{-1}\right)$$
For logit, $f(t) = \Lambda'(t) = \Lambda(t)(1-\Lambda(t))$, so
$$\sqrt{n}\,(\hat\beta - \beta) \xrightarrow{d} N\!\left(0,\ \left[\lim_n \frac{1}{n}\sum_{i=1}^n \Lambda(x_i'\beta)(1-\Lambda(x_i'\beta))\, x_i x_i'\right]^{-1}\right)$$
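The scale comparison can be checked numerically: rescaling the logistic index by its standard deviation $\pi/\sqrt{3}$ brings $\Lambda$ close to $\Phi$, and the remaining gap (a couple of percentage points at most) is exactly the heavier-tail effect noted above.

```python
import math

def lam(t):
    return 1.0 / (1.0 + math.exp(-t))

def phi_cdf(t):
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

# The logistic distribution has variance pi^2/3; rescale it to unit
# variance and compare with the standard normal CDF over a grid.
scale = math.sqrt(3.0) / math.pi          # 1 / (logistic standard deviation)
grid = [i / 100.0 for i in range(-500, 501)]
max_diff = max(abs(lam(t) - phi_cdf(scale * t)) for t in grid)
print(round(max_diff, 4))  # small but not zero: the logit tails are heavier
```

In practice the rough conversion factor 1.6 (rather than $\pi/\sqrt{3} \approx 1.81$) is often used to compare logit and probit coefficients, because it matches the two CDFs better near the center.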

7 Nonlinear Least Squares

Alternatively,
$$y_i = F(x_i'\beta) + \epsilon_i, \qquad E(\epsilon_i \mid x_i) = 0, \qquad V(\epsilon_i \mid x_i) = F(x_i'\beta)(1 - F(x_i'\beta)).$$
Minimize $\sum_{i=1}^n (y_i - F(x_i'\beta))^2$ to get $\tilde\beta$. $\tilde\beta$ is not efficient because of heteroskedasticity, so do weighted least squares:
$$\hat\beta = \arg\min_\beta \sum_{i=1}^n \left[ F(x_i'\tilde\beta)\,(1 - F(x_i'\tilde\beta)) \right]^{-1} \left( y_i - F(x_i'\beta) \right)^2$$
But the usual choice for probit and logit is MLE: the least squares problem is also nonlinear, and $\log L$ for probit and logit is concave, so any local max is a global max.

8 Discriminant Analysis (DA)

The DA method specifies the density of $x$ given $y = 1$ or $y = 0$, and predicts whether $x$ belongs to group $y = 1$ or $y = 0$ by checking whether $P(y \mid x) > \frac{1}{2}$ or not.

Normal discriminant analysis: assume $x \mid (y=1) \sim N(\mu_1, \Sigma)$, $x \mid (y=0) \sim N(\mu_0, \Sigma)$, $P(y=1) = q_1$, $P(y=0) = q_0$, then use Bayes' rule:
$$P(y = 1 \mid x) = \frac{q_1\, \phi(x; \mu_1, \Sigma)}{q_1\, \phi(x; \mu_1, \Sigma) + q_0\, \phi(x; \mu_0, \Sigma)} = \frac{1}{1 + \frac{q_0\, \phi(x; \mu_0, \Sigma)}{q_1\, \phi(x; \mu_1, \Sigma)}} = \frac{1}{1 + \exp\!\left( -(x'\beta + C) \right)}$$
for $\beta = \Sigma^{-1}(\mu_1 - \mu_0)$ and $C = \ln\frac{q_1}{q_0} - \frac{1}{2}\mu_1'\Sigma^{-1}\mu_1 + \frac{1}{2}\mu_0'\Sigma^{-1}\mu_0$.

9 You can estimate $\hat q_0, \hat q_1, \hat\mu_1, \hat\mu_0, \hat\Sigma$ from sample moments, then $\hat\beta = \hat\Sigma^{-1}(\hat\mu_1 - \hat\mu_0)$. If you only care about $P(y = 1 \mid x)$, you can instead estimate $\beta$ and $C$ using a logit model. Both are consistent; the sample-moment estimator is more efficient.

If the true model is a logit model, and NOT a DA model, then you have to estimate it using the logit likelihood; otherwise you will get inconsistent estimates.

If the true model is DA, then it can be shown that running the linear regression $y = \hat a + \hat b' x$ gives $\hat b = \lambda \hat\beta = \lambda \hat\Sigma^{-1}(\hat\mu_1 - \hat\mu_0)$ for some proportionality constant $\lambda$.
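A sketch of the moment-based DA slope in two dimensions, with the $2 \times 2$ inverse written out by hand. The group means and common covariance are made-up numbers standing in for the sample moments $\hat\mu_1, \hat\mu_0, \hat\Sigma$.

```python
# Normal discriminant analysis: beta = Sigma^{-1} (mu1 - mu0).
# Illustrative (invented) moments for a two-dimensional x.
mu1 = (1.0, 0.5)
mu0 = (-0.5, 0.0)
sigma = ((2.0, 0.5),
         (0.5, 1.0))

# 2x2 inverse by the adjugate formula
det = sigma[0][0] * sigma[1][1] - sigma[0][1] * sigma[1][0]
inv = ((sigma[1][1] / det, -sigma[0][1] / det),
       (-sigma[1][0] / det, sigma[0][0] / det))

d = (mu1[0] - mu0[0], mu1[1] - mu0[1])          # mu1 - mu0
beta = (inv[0][0] * d[0] + inv[0][1] * d[1],
        inv[1][0] * d[0] + inv[1][1] * d[1])     # Sigma^{-1} (mu1 - mu0)
```

A quick sanity check is that $\Sigma \beta$ recovers $\mu_1 - \mu_0$.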

10 Grouped Data, min $\chi^2$

There is one exception where you cannot use MLE: when only grouped data, not individual data, are available. Example: you observe the proportion $\hat P_t$ of people voting yes on some bill, but no individual voting records.

Let $P_t = F(x_t'\beta)$; then $\hat P_t = P_t + \epsilon_t$ with $E(\epsilon_t) = 0$ and $\mathrm{Var}(\epsilon_t) = P_t(1 - P_t)/n_t$. Note that $F^{-1}(P_t) = x_t'\beta$, so do a Taylor expansion:
$$F^{-1}(\hat P_t) \approx F^{-1}(P_t) + \frac{\partial F^{-1}(P_t)}{\partial P_t}\,(\hat P_t - P_t) = x_t'\beta + v_t$$
with $E(v_t) = 0$ and $\mathrm{Var}(v_t) = \dfrac{P_t(1 - P_t)}{n_t\, f^2(F^{-1}(P_t))}$.

11 Then estimate $\mathrm{Var}(v_t)$ by plugging $\hat P_t$ in for $P_t$, and run WLS on this Taylor expansion:
$$\hat\beta = \arg\min_\beta \sum_{t=1}^T \frac{n_t\, f^2(F^{-1}(\hat P_t))}{\hat P_t (1 - \hat P_t)} \left( F^{-1}(\hat P_t) - x_t'\beta \right)^2$$
Alternatively, run nonlinear weighted least squares without inverting $F$:
$$\tilde\beta = \arg\min_\beta \sum_{t=1}^T \frac{n_t}{\hat P_t (1 - \hat P_t)} \left( \hat P_t - F(x_t'\beta) \right)^2$$
But $\hat\beta$ is preferable because it solves a linear rather than a nonlinear problem.
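For the logit, $f(F^{-1}(p)) = p(1-p)$, so $\mathrm{Var}(v_t) = 1/(n_t P_t(1-P_t))$ and the WLS above reduces to Berkson's minimum chi-square: regress the empirical log-odds on $x_t$ with weights $n_t \hat P_t (1 - \hat P_t)$. A sketch with invented grouped data (scalar $x_t$ plus intercept, closed-form WLS):

```python
import math

# Toy grouped data: (x_t, n_t, p_hat_t) for T = 4 groups; numbers are illustrative.
groups = [(-1.0, 50, 0.20), (0.0, 60, 0.45), (1.0, 50, 0.70), (2.0, 40, 0.90)]

rows = []
for x, n, p in groups:
    w = n * p * (1.0 - p)            # inverse of Var(v_t) under the logit
    z = math.log(p / (1.0 - p))      # F^{-1}(p_hat): the empirical log-odds
    rows.append((x, w, z))

# weighted least squares of z on (1, x), in closed form
sw = sum(w for _, w, _ in rows)
xbar = sum(w * x for x, w, _ in rows) / sw
zbar = sum(w * z for _, w, z in rows) / sw
b1 = sum(w * (x - xbar) * (z - zbar) for x, w, z in rows) / \
     sum(w * (x - xbar) ** 2 for x, w, _ in rows)
b0 = zbar - b1 * xbar
```

Because the transformed problem is linear in $\beta$, no iteration is needed, which is exactly why the slide prefers $\hat\beta$ to the nonlinear alternative.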

12 Multinomial Models

Utility from taking bus, train, or car:
Bus ($y = 1$): $U_{1i} = \mu_{1i} + \epsilon_{1i}$
Train ($y = 2$): $U_{2i} = \mu_{2i} + \epsilon_{2i}$
Car ($y = 3$): $U_{3i} = \mu_{3i} + \epsilon_{3i}$
Then
$$P(y_i = 1) = P(U_{1i} > U_{2i},\ U_{1i} > U_{3i})$$
$$P(y_i = 2) = P(U_{2i} > U_{1i},\ U_{2i} > U_{3i})$$
$$P(y_i = 3) = P(U_{3i} > U_{1i},\ U_{3i} > U_{2i}).$$

13 Multinomial Logit

$\epsilon_{1i}, \epsilon_{2i}, \epsilon_{3i}$ are i.i.d. Type-I extreme value: $F(\epsilon) = e^{-e^{-\epsilon}}$, $f(\epsilon) = e^{-\epsilon} e^{-e^{-\epsilon}}$. The choice probability can be calculated:
$$P(y_i = 2) = P(\epsilon_2 + \mu_2 - \mu_1 > \epsilon_1,\ \epsilon_2 + \mu_2 - \mu_3 > \epsilon_3)$$
$$= \int_{-\infty}^{\infty} f(\epsilon_2)\, P(\epsilon_1 < \epsilon_2 + \mu_2 - \mu_1)\, P(\epsilon_3 < \epsilon_2 + \mu_2 - \mu_3)\, d\epsilon_2$$
$$= \int_{-\infty}^{\infty} e^{-\epsilon_2}\, e^{-e^{-\epsilon_2}\left[ 1 + e^{\mu_1 - \mu_2} + e^{\mu_3 - \mu_2} \right]}\, d\epsilon_2$$
$$= \int_0^{\infty} e^{-\left[ 1 + e^{\mu_1 - \mu_2} + e^{\mu_3 - \mu_2} \right] u}\, du \qquad (u = e^{-\epsilon_2})$$
$$= \frac{1}{1 + e^{\mu_1 - \mu_2} + e^{\mu_3 - \mu_2}} = \frac{e^{\mu_2}}{e^{\mu_1} + e^{\mu_2} + e^{\mu_3}}.$$
Similarly, $P(y_i = j) = \dfrac{e^{\mu_j}}{e^{\mu_1} + e^{\mu_2} + e^{\mu_3}}$. It is easy to show that $P(y = 1 \mid y = 1 \text{ or } y = 2) = \dfrac{e^{\mu_1}}{e^{\mu_1} + e^{\mu_2}}$, etc.
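The closed-form probabilities are just a softmax over the mean utilities. A small sketch with illustrative utilities, which also exhibits the IIA property: the odds of bus versus car are unchanged when train is dropped from the choice set.

```python
import math

# Multinomial logit: P(y = j) = e^{mu_j} / sum_k e^{mu_k}.
# Mean utilities for bus, train, car are invented numbers.
mu = {"bus": 0.5, "train": 1.0, "car": 2.0}

denom = sum(math.exp(v) for v in mu.values())
prob = {j: math.exp(v) / denom for j, v in mu.items()}

# IIA check: odds of bus vs car with and without train in the choice set
odds_with_train = prob["bus"] / prob["car"]
denom2 = math.exp(mu["bus"]) + math.exp(mu["car"])
odds_without_train = (math.exp(mu["bus"]) / denom2) / (math.exp(mu["car"]) / denom2)
```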

14 IIA Property and Nested Logit

Suppose $y = 2$ is not available; then similarly
$$P(y = 3 \mid \text{choice 2 unavailable}) = P(U_3 > U_1) = \frac{e^{\mu_3}}{e^{\mu_1} + e^{\mu_3}}.$$
The IIA (independence of irrelevant alternatives) property: the relative frequency of choosing between 1 (bus) and 3 (car) does not change whether or not choice 2 (train) is present. This appears unreasonable, since bus (1) and train (2) are both public transportation and are plausibly correlated with each other. To allow for correlation between choices 1 and 2, introduce
$$F(\epsilon_1, \epsilon_2, \epsilon_3) = G(\epsilon_1, \epsilon_2)\, F(\epsilon_3) = G(\epsilon_1, \epsilon_2)\, e^{-e^{-\epsilon_3}}$$
with
$$G(\epsilon_1, \epsilon_2) = \exp\!\left( -\left[ e^{-\epsilon_1/\rho} + e^{-\epsilon_2/\rho} \right]^\rho \right), \qquad 0 < \rho \leq 1.$$

15 Since $\epsilon_3$ is independent of $(\epsilon_1, \epsilon_2)$,
$$P(U_1 > U_2) = P(U_1 > U_2 \mid U_1 > U_3 \text{ or } U_2 > U_3) = \frac{P(y = 1)}{P(y = 1) + P(y = 2)}.$$
In fact, both sides can be verified to equal $\dfrac{e^{\mu_1/\rho}}{e^{\mu_1/\rho} + e^{\mu_2/\rho}}$. Also,
$$P(y = 3) = P(U_3 > U_1,\ U_3 > U_2) = \frac{e^{\mu_3}}{e^{\mu_3} + \left( e^{\mu_1/\rho} + e^{\mu_2/\rho} \right)^\rho}.$$
Now the IIA property does not hold, since
$$P(y = 3 \mid \text{choice 2 unavailable}) = P(U_3 > U_1) \neq P(U_3 > U_1 \mid U_3 > U_2 \text{ or } U_1 > U_2) = P(y = 3 \mid y = 1 \text{ or } y = 3).$$
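These formulas can be checked numerically. With $\rho = 1$ the nested logit collapses to the multinomial logit and IIA holds exactly; with $\rho = 0.5$ the conditional share of car among $\{$bus, car$\}$ no longer equals its share when train is removed. The mean utilities are illustrative numbers.

```python
import math

# Nested logit: bus (1) and train (2) share a nest; car (3) stands alone.
# P(y=3) = e^{mu3} / (e^{mu3} + (e^{mu1/rho} + e^{mu2/rho})^rho), 0 < rho <= 1.
mu1, mu2, mu3 = 0.5, 1.0, 2.0

def probs(rho):
    inc = math.exp(mu1 / rho) + math.exp(mu2 / rho)      # inclusive value (exp form)
    p3 = math.exp(mu3) / (math.exp(mu3) + inc ** rho)
    p1 = (math.exp(mu1 / rho) / inc) * (1.0 - p3)        # within-nest share x nest prob
    return p1, p3

def p_car_when_train_removed():
    return math.exp(mu3) / (math.exp(mu1) + math.exp(mu3))

for rho in (1.0, 0.5):
    p1, p3 = probs(rho)
    print(rho, round(p3 / (p1 + p3), 4), round(p_car_when_train_removed(), 4))
```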

16 Estimation Method

A convenient way is to use a two-step procedure to estimate $\beta$ and $\rho$ first, and then use the resulting estimates as the initial value for the MLE iteration. The likelihood function:
$$L = \prod_{i=1}^n P(y = 1)^{y_{1i}}\, P(y = 2)^{y_{2i}}\, P(y = 3)^{y_{3i}} = L_1 L_2$$
for
$$L_1 = \prod_{i=1}^n P(y = 1 \mid y = 1 \text{ or } 2)^{y_{1i}}\, P(y = 2 \mid y = 1 \text{ or } 2)^{y_{2i}}$$
$$L_2 = \prod_{i=1}^n P(y = 1 \text{ or } 2)^{y_{1i} + y_{2i}}\, P(y = 3)^{y_{3i}}$$

17 First step: $L_1$ is the logit likelihood for the subsample that chooses between 1 and 2, and is a function of $\beta/\rho$ only:
$$\log L_1 = \sum_i \left[ y_{1i} \log \frac{e^{x_{1i}'\beta/\rho}}{e^{x_{1i}'\beta/\rho} + e^{x_{2i}'\beta/\rho}} + y_{2i} \log \frac{e^{x_{2i}'\beta/\rho}}{e^{x_{1i}'\beta/\rho} + e^{x_{2i}'\beta/\rho}} \right]$$
Second step: plug the resulting estimate $\widehat{\beta/\rho}$ into $L_2$ and maximize with respect to $\beta$ and $\rho$:
$$\log L_2 = \sum_i \left[ (y_{1i} + y_{2i}) \log \frac{\left( e^{x_{1i}'\widehat{\beta/\rho}} + e^{x_{2i}'\widehat{\beta/\rho}} \right)^\rho}{e^{x_{3i}'\beta} + \left( e^{x_{1i}'\widehat{\beta/\rho}} + e^{x_{2i}'\widehat{\beta/\rho}} \right)^\rho} + y_{3i} \log \frac{e^{x_{3i}'\beta}}{e^{x_{3i}'\beta} + \left( e^{x_{1i}'\widehat{\beta/\rho}} + e^{x_{2i}'\widehat{\beta/\rho}} \right)^\rho} \right]$$
Use these as the initial starting value for maximizing the full likelihood function $L$.

18 In fact, for this simple example there is not much point: it is easier to simply maximize $L$ directly (FIML: full information maximum likelihood) instead of using the sequential procedure. However, if the choice set is large and there are many branches, the two-step procedure is important for finding a good starting value.

Limitations of logit and probit: nested logit suffers from the problem of how to decide on the independence and correlation structure of the choice set. Probit is a way to allow for flexible correlation among different choices, but multinomial probit suffers from the computational burden of integrating a high-dimensional normal density.

19 Distribution-Free Methods

All parametric models rely on specifying a completely known $F(\cdot)$. The aim here is to relax this parametric assumption. There are two types of methods. One allows for heteroskedasticity of $\epsilon$ while maintaining minimal identifying moment conditions: this is Manski's maximum score estimator. The other relaxes all restrictions on $F(\cdot)$ while maintaining conditional homoskedasticity: this is Cosslett (1983) and Klein and Spady (1993), among others.

20 Maximum Score Estimator of Manski

The identifying assumption is that $\mathrm{med}(\epsilon \mid x) = 0$, i.e. $P(\epsilon < 0 \mid x) = \frac{1}{2}$ for all $x$. Maximize the number of correct predictions (called the score):
$$Q_n(\beta) = \frac{1}{n} \sum_{i=1}^n \left[ y_i\, 1(x_i'\beta \geq 0) + (1 - y_i)\, 1(x_i'\beta < 0) \right] \equiv \frac{1}{n} \sum_{i=1}^n q(y_i, x_i, \beta)$$
Alternatively, minimize the number of wrong predictions; since the prediction is wrong iff $|y_i - 1(x_i'\beta \geq 0)| = 1$, this can be written as a LAD problem:
$$\frac{1}{n} \sum_{i=1}^n \left| y_i - 1(x_i'\beta \geq 0) \right|$$
In fact, when $x_i'\beta_0 \geq 0$, $P(y_i = 1) \geq \frac{1}{2}$, so $\mathrm{med}(y_i \mid x_i) = 1$; when $x_i'\beta_0 < 0$, $P(y_i = 0) \geq \frac{1}{2}$, so $\mathrm{med}(y_i \mid x_i) = 0$.
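A sketch of the score objective for a scalar regressor with the slope normalized to one (so only the intercept is searched over a grid); the data are invented. Note that the maximizer is typically a whole interval of intercepts, one symptom of the estimator's nonstandard asymptotics discussed next.

```python
# Maximum score: choose beta = (b0, 1) to maximize the fraction of
# observations with y_i = 1(b0 + x_i >= 0). Toy data for illustration.
data = [(-2.0, 0), (-1.5, 0), (-1.0, 0), (-0.5, 1), (0.0, 0),
        (0.5, 1), (1.0, 1), (1.5, 0), (2.0, 1), (2.5, 1)]

def score(b0):
    # fraction of correct sign predictions at intercept b0
    correct = sum(y == (1 if b0 + x >= 0 else 0) for x, y in data)
    return correct / len(data)

grid = [i / 100.0 for i in range(-300, 301)]
best_b0 = max(grid, key=score)   # first grid point attaining the max score
```

Because the objective is a step function of $\beta$, gradient methods do not apply; grid or combinatorial search is the standard approach.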

21 The asymptotic distribution is very difficult, but there is an intuitive reason for the slow $n^{-1/3}$ rate of convergence. $\hat\beta$ maximizes:
$$Q_n(\beta) - Q_n(\beta_0) = \underbrace{Q(\beta) - Q(\beta_0)}_{A} + \underbrace{Q_n(\beta) - Q_n(\beta_0) - Q(\beta) + Q(\beta_0)}_{B_n}$$
As before, $A$ is approximated by a quadratic function: $A \sim -O\!\left( \|\beta - \beta_0\|^2 \right)$.

22 The convergence rate of $B_n$, which has mean 0, is governed by $\sqrt{\mathrm{Var}(B_n)}$:
$$\mathrm{Var}(B_n) = \frac{1}{n^2} \sum_{i=1}^n \mathrm{Var}\!\left( q(y_i, x_i, \beta) - q(y_i, x_i, \beta_0) \right) \leq \frac{1}{n}\, E\!\left( q(y_i, x_i, \beta) - q(y_i, x_i, \beta_0) \right)^2$$
$$= \frac{1}{n}\, E\!\left[ y_i\, 1(x_i'\beta \geq 0,\ x_i'\beta_0 < 0) - (1 - y_i)\, 1(x_i'\beta_0 \geq 0,\ x_i'\beta < 0) \right]^2$$
The indicator is nonzero only when $x_i'\beta$ and $x_i'\beta_0$ disagree in sign, an event of probability $O(\|\beta - \beta_0\|)$, so
$$\mathrm{Var}(B_n) \leq O\!\left( \frac{\|\beta - \beta_0\|}{n} \right) \quad \Longrightarrow \quad B_n = O_p\!\left( \sqrt{\frac{\|\beta - \beta_0\|}{n}} \right).$$
In the regular linear case, by contrast, $B_n = O_p\!\left( \dfrac{\|\beta - \beta_0\|}{\sqrt{n}} \right)$.

23 Expect $\hat\beta$ to maximize an objective function of the form
$$A + B_n = -O\!\left( \|\beta - \beta_0\|^2 \right) + O_p\!\left( \sqrt{\frac{\|\beta - \beta_0\|}{n}} \right)$$
Since $A(\beta_0) + B_n(\beta_0) = 0$ and $\hat\beta$ maximizes, $A(\hat\beta) + B_n(\hat\beta) \geq 0$, so that
$$-O\!\left( \|\hat\beta - \beta_0\|^2 \right) + O_p\!\left( \sqrt{\frac{\|\hat\beta - \beta_0\|}{n}} \right) \geq 0$$
$$\Longrightarrow \quad O\!\left( \|\hat\beta - \beta_0\|^{3/2} \right) \leq O_p\!\left( n^{-1/2} \right) \quad \Longrightarrow \quad \|\hat\beta - \beta_0\| = O_p\!\left( n^{-1/3} \right).$$

24 Panel Data Probit

$$y_{it}^* = \beta x_{it} + \alpha_i + \epsilon_{it}, \qquad \epsilon_i \sim N(0, \Sigma) \text{ independent of everything else},$$
$$\alpha_i = \phi + \lambda_1 x_{i1} + \dots + \lambda_T x_{iT} + v_i, \qquad v_i \sim N(0, \sigma_v^2) \text{ independent of everything else}.$$
We observe only the probit part:
$$y_{it} = 1(y_{it}^* \geq 0) = 1(\beta x_{it} + \alpha_i + \epsilon_{it} \geq 0) = 1(\beta x_{it} + \phi + \lambda_1 x_{i1} + \dots + \lambda_T x_{iT} + v_i + \epsilon_{it} \geq 0).$$
To do MLE, write down the likelihood function. Let $\sigma_{tt}^2$ be the $t$th diagonal element of $\Sigma$ and simply take $\phi = 0$:
$$P(y_t = 1 \mid x_1, \dots, x_T) = \Phi\!\left( \frac{\beta x_t + \lambda_1 x_1 + \dots + \lambda_T x_T}{\sqrt{\sigma_{tt}^2 + \sigma_v^2}} \right).$$
But to do MLE you need the JOINT probability $P(y_1 = 1, y_2 = 0, \dots, y_T = 1)$, etc. Use simulation estimation.

25 In contrast to MLE, Chamberlain suggests doing minimum distance by first running a cross-sectional probit for each period $t = 1, \dots, T$:
$$P(y_t = 1 \mid x_1, \dots, x_T) = \Phi(\pi_{t1} x_1 + \dots + \pi_{tT} x_T)$$
and then estimating $\psi_t \equiv (\sigma_{tt}^2 + \sigma_v^2)^{-1/2}$, $\lambda_t$, $t = 1, \dots, T$, and $\beta$ by minimizing the distance (using any weighting matrix you want) between the estimated reduced-form matrix $\hat\Pi = (\hat\pi_{ts})$ and its structural form
$$\Pi = \begin{pmatrix} \psi_1 & & \\ & \ddots & \\ & & \psi_T \end{pmatrix} \left( \beta I_T + \boldsymbol{1}\, (\lambda_1, \dots, \lambda_T) \right),$$
where $\boldsymbol{1}$ is a $T \times 1$ vector of ones.
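The restriction being imposed is elementwise $\pi_{ts} = \psi_t \left( \beta\, 1\{t = s\} + \lambda_s \right)$; building the implied $\Pi$ for illustrative parameter values makes the structure explicit.

```python
# Structural form of the reduced-form probit slope matrix:
# Pi = diag(psi) (beta I_T + 1 lambda'), i.e. pi_ts = psi_t (beta 1{t==s} + lambda_s).
# Parameter values below are invented, for T = 3.
T = 3
beta = 0.8
psi = [1.0, 0.9, 1.1]     # psi_t = (sigma_tt^2 + sigma_v^2)^(-1/2)
lam = [0.2, -0.1, 0.3]    # lambda_s

Pi = [[psi[t] * (beta * (t == s) + lam[s]) for s in range(T)] for t in range(T)]
```

Minimum distance then chooses $(\beta, \psi, \lambda)$ to bring this $\Pi$ as close as possible to the unrestricted estimates $\hat\pi_{ts}$.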

26 In cross-sectional models without the endogenous fixed effect $\alpha_i$, the proper measure for comparing different models is
$$\frac{\partial}{\partial x} P(y_t = 1 \mid x_t = x) \approx P(y_t = 1 \mid x_t = x) - P(y_t = 1 \mid x_t = x')$$
for $x'$ close to $x$. In panel data with a latent fixed effect, this measure is harder to define. Chamberlain suggests that a proper measure is
$$\int \left[ P(y_t = 1 \mid x_t = x, c) - P(y_t = 1 \mid x_t = x', c) \right] f_c(c)\, dc.$$
Under the panel probit assumptions, this can be conveniently estimated.

27 Consider $z \equiv \phi + \lambda_1 x_1 + \dots + \lambda_T x_T$, so that $c = z + v$ and $f(z, v) = f(z) f(v)$. Then
$$\int P(y_t = 1 \mid x_t = x, c)\, f_c(c)\, dc = \iint P(y_t = 1 \mid x_t = x, z, v)\, f_v(v)\, f_z(z)\, dv\, dz = \int P(y_t = 1 \mid x_t = x, z)\, f_z(z)\, dz,$$
where the last step integrates $v$ out against $f_v$. This can be estimated by
$$\int P(y_t = 1 \mid x_t = x, z)\, d\hat F_z(z) = \frac{1}{n} \sum_{i=1}^n \Phi\!\left( \psi_t \left( \beta x + \lambda_1 x_{i1} + \dots + \lambda_T x_{iT} \right) \right)$$
with $\psi_t = (\sigma_{tt}^2 + \sigma_v^2)^{-1/2}$ as before.

28 Conditional Logit

This approach allows you not to make ANY assumption about the latent fixed effect $c$. Assume
$$P(y_t = 1 \mid x_1, \dots, x_T, c) = G(\beta x_t + c) = \frac{e^{\beta x_t + c}}{1 + e^{\beta x_t + c}}$$
and that the $y_t$ are independent across $t$. Then instead of maximizing the likelihood function, maximize the conditional likelihood, conditional on $\sum_t y_t = s$ for $s = 0, \dots, T$:
$$P\!\left( y_t,\ t = 1, \dots, T \,\Big|\, \sum_t y_t = s,\ x_1, \dots, x_T,\ c \right) = \frac{P\!\left( y_t,\ t = 1, \dots, T \text{ s.t. } \sum_t y_t = s \right)}{P\!\left( \sum_t y_t = s \right)} = \frac{P(y_1, \dots, y_T \mid x_1, \dots, x_T, c)}{\sum_{d \in B} P(d \mid x_1, \dots, x_T, c)}$$
where $d = (d_1, \dots, d_T)$ and $B = \{ d : \sum_{t=1}^T d_t = s \}$.

29 By considering $y_t = 1, 0$ separately, you can see that
$$P(y_t \mid x_1, \dots, x_T, c) = \frac{\exp\!\left( (\beta x_t + c)\, y_t \right)}{1 + \exp(\beta x_t + c)},$$
and so
$$P\!\left( y_t,\ t = 1, \dots, T \,\Big|\, \sum_t y_t = s,\ x_1, \dots, x_T,\ c \right) = \frac{\prod_{t=1}^T \frac{\exp((\beta x_t + c) y_t)}{1 + \exp(\beta x_t + c)}}{\sum_{d \in B} \prod_{t=1}^T \frac{\exp((\beta x_t + c) d_t)}{1 + \exp(\beta x_t + c)}} = \frac{\exp\!\left( \beta \sum_{t=1}^T x_t y_t \right)}{\sum_{d \in B} \exp\!\left( \beta \sum_{t=1}^T x_t d_t \right)},$$
which no longer involves $c$. The conditional log likelihood function to be maximized is
$$\log L = \sum_{i=1}^n \left[ \log \exp\!\left( \beta \sum_{t=1}^T x_{it} y_{it} \right) - \log \sum_{d_i \in B_i} \exp\!\left( \beta \sum_{t=1}^T x_{it} d_{it} \right) \right]$$
for $B_i = \{ d_i : \sum_{t=1}^T d_{it} = \sum_{t=1}^T y_{it} \}$.

30 For example, when $T = 3$ and $\sum_{t=1}^T y_{it} = 1$, with $x_i = (x_{i1}, \dots, x_{iT})$:
$$P\!\left( y_{i1} = 1, y_{i2} = 0, y_{i3} = 0 \,\Big|\, x_i, c_i, \textstyle\sum_t y_{it} = 1 \right) = \frac{e^{\beta x_{i1}}}{e^{\beta x_{i1}} + e^{\beta x_{i2}} + e^{\beta x_{i3}}}$$
$$P\!\left( y_{i1} = 0, y_{i2} = 1, y_{i3} = 0 \,\Big|\, x_i, c_i, \textstyle\sum_t y_{it} = 1 \right) = \frac{e^{\beta x_{i2}}}{e^{\beta x_{i1}} + e^{\beta x_{i2}} + e^{\beta x_{i3}}}$$
$$P\!\left( y_{i1} = 0, y_{i2} = 0, y_{i3} = 1 \,\Big|\, x_i, c_i, \textstyle\sum_t y_{it} = 1 \right) = \frac{e^{\beta x_{i3}}}{e^{\beta x_{i1}} + e^{\beta x_{i2}} + e^{\beta x_{i3}}}$$
The undesirable feature is that the independence assumption across $t$ is needed; in contrast, the fixed-effect probit allows for a flexible $\Sigma$. In addition, calculating $\int [P(y_t = 1 \mid x_t = x, c) - P(y_t = 1 \mid x_t = x', c)] f(c)\, dc$ requires a parametric assumption on $f(c)$ anyway.
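A quick numerical check that the conditioning really removes the fixed effect: the three conditional probabilities above are unchanged when a common $c$ is added to every period's index. $\beta$ and the $x$'s are made-up values.

```python
import math

# Conditional logit with T = 3 and sum_t y_t = 1:
# P(y_t = 1 | sum y = 1) = e^{beta x_t} / (e^{beta x_1} + e^{beta x_2} + e^{beta x_3}).
beta = 0.7
x = [1.0, 2.0, 0.5]

def cond_prob(t, c=0.0):
    # the fixed effect c cancels between numerator and denominator
    num = math.exp(beta * x[t] + c)
    den = sum(math.exp(beta * x_s + c) for x_s in x)
    return num / den

probs_c0 = [cond_prob(t, c=0.0) for t in range(3)]
probs_c5 = [cond_prob(t, c=5.0) for t in range(3)]
```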


More information

Data Mining Stat 588

Data Mining Stat 588 Data Mining Stat 588 Lecture 02: Linear Methods for Regression Department of Statistics & Biostatistics Rutgers University September 13 2011 Regression Problem Quantitative generic output variable Y. Generic

More information

Introduction to Discrete Choice Models

Introduction to Discrete Choice Models Chapter 7 Introduction to Dcrete Choice Models 7.1 Introduction It has been mentioned that the conventional selection bias model requires estimation of two structural models, namely the selection model

More information

6.867 Machine Learning

6.867 Machine Learning 6.867 Machine Learning Problem set 1 Solutions Thursday, September 19 What and how to turn in? Turn in short written answers to the questions explicitly stated, and when requested to explain or prove.

More information

Contents Lecture 4. Lecture 4 Linear Discriminant Analysis. Summary of Lecture 3 (II/II) Summary of Lecture 3 (I/II)

Contents Lecture 4. Lecture 4 Linear Discriminant Analysis. Summary of Lecture 3 (II/II) Summary of Lecture 3 (I/II) Contents Lecture Lecture Linear Discriminant Analysis Fredrik Lindsten Division of Systems and Control Department of Information Technology Uppsala University Email: fredriklindsten@ituuse Summary of lecture

More information

Discrete Choice Modeling

Discrete Choice Modeling [Part 4] 1/43 Discrete Choice Modeling 0 Introduction 1 Summary 2 Binary Choice 3 Panel Data 4 Bivariate Probit 5 Ordered Choice 6 Count Data 7 Multinomial Choice 8 Nested Logit 9 Heterogeneity 10 Latent

More information

Instructions: Closed book, notes, and no electronic devices. Points (out of 200) in parentheses

Instructions: Closed book, notes, and no electronic devices. Points (out of 200) in parentheses ISQS 5349 Final Spring 2011 Instructions: Closed book, notes, and no electronic devices. Points (out of 200) in parentheses 1. (10) What is the definition of a regression model that we have used throughout

More information

[y i α βx i ] 2 (2) Q = i=1

[y i α βx i ] 2 (2) Q = i=1 Least squares fits This section has no probability in it. There are no random variables. We are given n points (x i, y i ) and want to find the equation of the line that best fits them. We take the equation

More information

Ordered Response and Multinomial Logit Estimation

Ordered Response and Multinomial Logit Estimation Ordered Response and Multinomial Logit Estimation Quantitative Microeconomics R. Mora Department of Economics Universidad Carlos III de Madrid Outline Introduction 1 Introduction 2 3 Introduction The Ordered

More information

Goals. PSCI6000 Maximum Likelihood Estimation Multiple Response Model 2. Recap: MNL. Recap: MNL

Goals. PSCI6000 Maximum Likelihood Estimation Multiple Response Model 2. Recap: MNL. Recap: MNL Goals PSCI6000 Maximum Likelihood Estimation Multiple Response Model 2 Tetsuya Matsubayashi University of North Texas November 9, 2010 Learn multiple responses models that do not require the assumption

More information

Lectures on Simple Linear Regression Stat 431, Summer 2012

Lectures on Simple Linear Regression Stat 431, Summer 2012 Lectures on Simple Linear Regression Stat 43, Summer 0 Hyunseung Kang July 6-8, 0 Last Updated: July 8, 0 :59PM Introduction Previously, we have been investigating various properties of the population

More information

Lecture 2: Consistency of M-estimators

Lecture 2: Consistency of M-estimators Lecture 2: Instructor: Deartment of Economics Stanford University Preared by Wenbo Zhou, Renmin University References Takeshi Amemiya, 1985, Advanced Econometrics, Harvard University Press Newey and McFadden,

More information

STAT 100C: Linear models

STAT 100C: Linear models STAT 100C: Linear models Arash A. Amini June 9, 2018 1 / 56 Table of Contents Multiple linear regression Linear model setup Estimation of β Geometric interpretation Estimation of σ 2 Hat matrix Gram matrix

More information

Summer School in Statistics for Astronomers V June 1 - June 6, Regression. Mosuk Chow Statistics Department Penn State University.

Summer School in Statistics for Astronomers V June 1 - June 6, Regression. Mosuk Chow Statistics Department Penn State University. Summer School in Statistics for Astronomers V June 1 - June 6, 2009 Regression Mosuk Chow Statistics Department Penn State University. Adapted from notes prepared by RL Karandikar Mean and variance Recall

More information

Administration. Homework 1 on web page, due Feb 11 NSERC summer undergraduate award applications due Feb 5 Some helpful books

Administration. Homework 1 on web page, due Feb 11 NSERC summer undergraduate award applications due Feb 5 Some helpful books STA 44/04 Jan 6, 00 / 5 Administration Homework on web page, due Feb NSERC summer undergraduate award applications due Feb 5 Some helpful books STA 44/04 Jan 6, 00... administration / 5 STA 44/04 Jan 6,

More information

LECTURE 18: NONLINEAR MODELS

LECTURE 18: NONLINEAR MODELS LECTURE 18: NONLINEAR MODELS The basic point is that smooth nonlinear models look like linear models locally. Models linear in parameters are no problem even if they are nonlinear in variables. For example:

More information

Machine Learning. Regression-Based Classification & Gaussian Discriminant Analysis. Manfred Huber

Machine Learning. Regression-Based Classification & Gaussian Discriminant Analysis. Manfred Huber Machine Learning Regression-Based Classification & Gaussian Discriminant Analysis Manfred Huber 2015 1 Logistic Regression Linear regression provides a nice representation and an efficient solution to

More information

Generalized Method of Moments (GMM) Estimation

Generalized Method of Moments (GMM) Estimation Econometrics 2 Fall 2004 Generalized Method of Moments (GMM) Estimation Heino Bohn Nielsen of29 Outline of the Lecture () Introduction. (2) Moment conditions and methods of moments (MM) estimation. Ordinary

More information

Lecture notes to Chapter 11, Regression with binary dependent variables - probit and logit regression

Lecture notes to Chapter 11, Regression with binary dependent variables - probit and logit regression Lecture notes to Chapter 11, Regression with binary dependent variables - probit and logit regression Tore Schweder October 28, 2011 Outline Examples of binary respons variables Probit and logit - examples

More information

What s New in Econometrics? Lecture 14 Quantile Methods

What s New in Econometrics? Lecture 14 Quantile Methods What s New in Econometrics? Lecture 14 Quantile Methods Jeff Wooldridge NBER Summer Institute, 2007 1. Reminders About Means, Medians, and Quantiles 2. Some Useful Asymptotic Results 3. Quantile Regression

More information

Lecture 14 More on structural estimation

Lecture 14 More on structural estimation Lecture 14 More on structural estimation Economics 8379 George Washington University Instructor: Prof. Ben Williams traditional MLE and GMM MLE requires a full specification of a model for the distribution

More information

h=1 exp (X : J h=1 Even the direction of the e ect is not determined by jk. A simpler interpretation of j is given by the odds-ratio

h=1 exp (X : J h=1 Even the direction of the e ect is not determined by jk. A simpler interpretation of j is given by the odds-ratio Multivariate Response Models The response variable is unordered and takes more than two values. The term unordered refers to the fact that response 3 is not more favored than response 2. One choice from

More information

Statement: With my signature I confirm that the solutions are the product of my own work. Name: Signature:.

Statement: With my signature I confirm that the solutions are the product of my own work. Name: Signature:. MATHEMATICAL STATISTICS Homework assignment Instructions Please turn in the homework with this cover page. You do not need to edit the solutions. Just make sure the handwriting is legible. You may discuss

More information

New Developments in Econometrics Lecture 16: Quantile Estimation

New Developments in Econometrics Lecture 16: Quantile Estimation New Developments in Econometrics Lecture 16: Quantile Estimation Jeff Wooldridge Cemmap Lectures, UCL, June 2009 1. Review of Means, Medians, and Quantiles 2. Some Useful Asymptotic Results 3. Quantile

More information

1 General problem. 2 Terminalogy. Estimation. Estimate θ. (Pick a plausible distribution from family. ) Or estimate τ = τ(θ).

1 General problem. 2 Terminalogy. Estimation. Estimate θ. (Pick a plausible distribution from family. ) Or estimate τ = τ(θ). Estimation February 3, 206 Debdeep Pati General problem Model: {P θ : θ Θ}. Observe X P θ, θ Θ unknown. Estimate θ. (Pick a plausible distribution from family. ) Or estimate τ = τ(θ). Examples: θ = (µ,

More information

6.867 Machine Learning

6.867 Machine Learning 6.867 Machine Learning Problem set 1 Due Thursday, September 19, in class What and how to turn in? Turn in short written answers to the questions explicitly stated, and when requested to explain or prove.

More information

Parametric Techniques

Parametric Techniques Parametric Techniques Jason J. Corso SUNY at Buffalo J. Corso (SUNY at Buffalo) Parametric Techniques 1 / 39 Introduction When covering Bayesian Decision Theory, we assumed the full probabilistic structure

More information

Hypothesis Testing. Econ 690. Purdue University. Justin L. Tobias (Purdue) Testing 1 / 33

Hypothesis Testing. Econ 690. Purdue University. Justin L. Tobias (Purdue) Testing 1 / 33 Hypothesis Testing Econ 690 Purdue University Justin L. Tobias (Purdue) Testing 1 / 33 Outline 1 Basic Testing Framework 2 Testing with HPD intervals 3 Example 4 Savage Dickey Density Ratio 5 Bartlett

More information

HT Introduction. P(X i = x i ) = e λ λ x i

HT Introduction. P(X i = x i ) = e λ λ x i MODS STATISTICS Introduction. HT 2012 Simon Myers, Department of Statistics (and The Wellcome Trust Centre for Human Genetics) myers@stats.ox.ac.uk We will be concerned with the mathematical framework

More information

Cross-fitting and fast remainder rates for semiparametric estimation

Cross-fitting and fast remainder rates for semiparametric estimation Cross-fitting and fast remainder rates for semiparametric estimation Whitney K. Newey James M. Robins The Institute for Fiscal Studies Department of Economics, UCL cemmap working paper CWP41/17 Cross-Fitting

More information

Flexible Estimation of Treatment Effect Parameters

Flexible Estimation of Treatment Effect Parameters Flexible Estimation of Treatment Effect Parameters Thomas MaCurdy a and Xiaohong Chen b and Han Hong c Introduction Many empirical studies of program evaluations are complicated by the presence of both

More information

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Winter 2014 Instructor: Victor Aguirregabiria

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Winter 2014 Instructor: Victor Aguirregabiria ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Winter 2014 Instructor: Victor guirregabiria SOLUTION TO FINL EXM Monday, pril 14, 2014. From 9:00am-12:00pm (3 hours) INSTRUCTIONS:

More information

ISQS 5349 Spring 2013 Final Exam

ISQS 5349 Spring 2013 Final Exam ISQS 5349 Spring 2013 Final Exam Name: General Instructions: Closed books, notes, no electronic devices. Points (out of 200) are in parentheses. Put written answers on separate paper; multiple choices

More information

Gaussian Processes. Le Song. Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012

Gaussian Processes. Le Song. Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012 Gaussian Processes Le Song Machine Learning II: Advanced Topics CSE 8803ML, Spring 01 Pictorial view of embedding distribution Transform the entire distribution to expected features Feature space Feature

More information

GLS and FGLS. Econ 671. Purdue University. Justin L. Tobias (Purdue) GLS and FGLS 1 / 22

GLS and FGLS. Econ 671. Purdue University. Justin L. Tobias (Purdue) GLS and FGLS 1 / 22 GLS and FGLS Econ 671 Purdue University Justin L. Tobias (Purdue) GLS and FGLS 1 / 22 In this lecture we continue to discuss properties associated with the GLS estimator. In addition we discuss the practical

More information

COM336: Neural Computing

COM336: Neural Computing COM336: Neural Computing http://www.dcs.shef.ac.uk/ sjr/com336/ Lecture 2: Density Estimation Steve Renals Department of Computer Science University of Sheffield Sheffield S1 4DP UK email: s.renals@dcs.shef.ac.uk

More information

Master s Written Examination

Master s Written Examination Master s Written Examination Option: Statistics and Probability Spring 016 Full points may be obtained for correct answers to eight questions. Each numbered question which may have several parts is worth

More information

Stat 579: Generalized Linear Models and Extensions

Stat 579: Generalized Linear Models and Extensions Stat 579: Generalized Linear Models and Extensions Linear Mixed Models for Longitudinal Data Yan Lu April, 2018, week 15 1 / 38 Data structure t1 t2 tn i 1st subject y 11 y 12 y 1n1 Experimental 2nd subject

More information

DA Freedman Notes on the MLE Fall 2003

DA Freedman Notes on the MLE Fall 2003 DA Freedman Notes on the MLE Fall 2003 The object here is to provide a sketch of the theory of the MLE. Rigorous presentations can be found in the references cited below. Calculus. Let f be a smooth, scalar

More information

COS513: FOUNDATIONS OF PROBABILISTIC MODELS LECTURE 9: LINEAR REGRESSION

COS513: FOUNDATIONS OF PROBABILISTIC MODELS LECTURE 9: LINEAR REGRESSION COS513: FOUNDATIONS OF PROBABILISTIC MODELS LECTURE 9: LINEAR REGRESSION SEAN GERRISH AND CHONG WANG 1. WAYS OF ORGANIZING MODELS In probabilistic modeling, there are several ways of organizing models:

More information

CS 195-5: Machine Learning Problem Set 1

CS 195-5: Machine Learning Problem Set 1 CS 95-5: Machine Learning Problem Set Douglas Lanman dlanman@brown.edu 7 September Regression Problem Show that the prediction errors y f(x; ŵ) are necessarily uncorrelated with any linear function of

More information

4.2 Estimation on the boundary of the parameter space

4.2 Estimation on the boundary of the parameter space Chapter 4 Non-standard inference As we mentioned in Chapter the the log-likelihood ratio statistic is useful in the context of statistical testing because typically it is pivotal (does not depend on any

More information