Day 1A Count Data Regression: Part 1

Size: px
Start display at page:

Download "Day 1A Count Data Regression: Part 1"

Transcription

1 Day 1A Count Data Regression: Part 1 A. Colin Cameron Univ. of Calif. - Davis... for Center of Labor Economics Norwegian School of Economics Advanced Microeconometrics Aug 28 - Sept 1, Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School of Economics Aug 28 - Sept Advanced 1, 2017Microeconomet 1 / 65

2 1. ntroduction 1. ntroduction We consider nonlinear regression with cross-section data. The slides use count regression as a great illustrative example. Count data models are for dependent variable y = 0, 1, 2,... Example: y: Number of doctor visits (usually cross-section) x: health status, age, gender,... Many approaches and issues are general nonlinear model issues. Econometrics: F F Fully parametric: MLE Conditional mean: Quasi-MLE, generalized methods of moments (GMM) Statistics: generalized linear models (GLM). A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School of Economics Aug 28 - Sept Advanced 1, 2017Microeconomet 2 / 65

3 1. ntroduction Analysis is straightforward in the usual case of model the conditional mean E[y jx]: in Stata replace command regress with poisson and for panel data (later) replace command xtreg with command xtpoisson nterpretation of marginal e ects, however, is more complicated: E[yjx] = exp(x 0 β) so ME j = E[yjx]/ x j = β j exp(x 0 β) 6= β j. Analysis is more complicated for better parametric models for prediction, censoring, selection autoregressive time series of counts (not covered here).n The session will focus on topics 2, 3 and 4. A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School of Economics Aug 28 - Sept Advanced 1, 2017Microeconomet 3 / 65

4 1. ntroduction Outline 1 ntroduction 2 Poisson (with cross-section data) Theory* 3 Poisson Application* 4 Summary of the remaining topics* 5 Generalized linear models 6 Diagnostics 7 Negative binomial model 8 Richer Parametric Model (censored, truncated, hurdle,..) 9 Summary 10 References A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School of Economics Aug 28 - Sept Advanced 1, 2017Microeconomet 4 / 65

5 2. Poisson cross-section regression theory 2. Poisson cross section regression theory Count data example Many health surveys measure health use as counts as people have better recall of counts than of dollars spent U.S. Medical Expenditure Panel Survey (MEPS). Sample of Medicare population aged 65 and higher (N = 3,677) docvis = annual number of doctor visits. use mus17data.dta. summarize docvis Variable Obs Mean Std. Dev. Min Max docvis A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School of Economics Aug 28 - Sept Advanced 1, 2017Microeconomet 5 / 65

6 2. Poisson cross-section regression theory Doctor visits: Histogram dropping observations with more than 40 visits Density # Doctor visits (for < 40 visits). Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School of Economics Aug 28 - Sept Advanced 1, 2017Microeconomet 6 / 65

7 2. Poisson cross-section regression theory Poisson distribution Poisson distribution From stochastic process theory, natural model for counts is Probability mass function: Mean and variance: Equidispersion: variance = mean Restriction imposed by Poisson Overdispersion: variance > mean y Poisson[λ]. Pr[Y = yjλ] = e λ λ y y! E[y] = λ and V[y] = λ More common feature of count data Doctor visits data: ȳ = 6.82, sy 2 = ' 8.01ȳ. A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School of Economics Aug 28 - Sept Advanced 1, 2017Microeconomet 7 / 65

8 2. Poisson cross-section regression theory Poisson regression Poisson regression: summary Poisson regression is straightforward many packages do Poisson regression coe cients are easily interpreted as semi-elasticities. Do Poisson rather than OLS with dependent variable y ln p y (with adjustment for ln 0) y (a variance-stabilizing transformation). Poisson MLE is consistent provided only that E[yjx] = exp(x 0 β). But make sure standard errors etc. are robust to V[yjx] 6= E[yjx]. And generally don t use Poisson if need to predict probabilities. A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School of Economics Aug 28 - Sept Advanced 1, 2017Microeconomet 8 / 65

9 2. Poisson cross-section regression theory Poisson regression Poisson regression: Poisson MLE Let the Poisson rate parameter vary across individuals with x in way to ensure λ > 0. λ = E[yjx] = exp(x 0 β). MLE is straightforward given data independent over i. f (y) = e λ λ y /y! ) ln f (y) = exp(x 0 β) + yx 0 β ln y! ) ln L(β) = n i=1 f exp(x 0 i β) + y i x 0 i β ln y i!g ) ln L(β) β = n i=1 f exp(x 0 i β)x i + y i x i g A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School of Economics Aug 28 - Sept Advanced 1, 2017Microeconomet 9 / 65

10 2. Poisson cross-section regression theory Poisson regression Poisson regression: rst-order conditions The ML rst-order conditions are n i=1 (y i exp(xi 0 β))x b i = 0. No explicit solution for β. b nstead use Newton-Raphson iterative method. Fast as objective function is globally concave in β. A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 10 / 65

11 2. Poisson cross-section regression theory Poisson regression Poisson regression: consistency of Poisson MLE ML rst-order conditions are n i=1 (y i exp(xi 0 β))x b i = 0. Consistency only requires (given independence over i) E[(y i exp(xi 0 β))x i ] = 0 So consistent if E[y i jx i ] = exp(xi 0 β) Poisson MLE is consistent if the conditional mean is correctly speci ed like MLE for linear model under normality (OLS) this robustness holds for only some likelihood based models. A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 11 / 65

12 2. Poisson cross-section regression theory Poisson regression Poisson regression: distribution of Poisson MLE f distribution is Poisson then β b a N [β, V MLE [ β]] b where bv MLE [ β] b 1 = i bµ i x i xi 0 using 2 ln L(β)/ β β 0 = i exp(xi 0 β)) f distribution is not Poisson but E[y i jx i ] = exp(xi 0 β) and V[y i jx i ] = σ 2 i then β b a N [β, V ROB [ β]] b and we use the robust sandwich estimate of variance (White (1982), Huber (1967)) bv ROB [ β] b 1 = i bµ i x i xi 0 i (y i bµ i ) 2 x i x 0 i i bµ i x i x 0 i 1 V ROB [ β] b =V MLE [ β] b if σ 2 i = µ i (imposed by Poisson) V ROB [ β] b = αv MLE [ β] b if σ 2 i = αµ i (used in GLM literature) Robust se s are much larger than default ML se s if α >> 1. A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 12 / 65

13 2. Poisson cross-section regression theory Poisson regression ASDE: derivation of robust sandwich Take a rst-order Taylor series expansion of i (y i about β. exp(x 0 i b β))x i i (y i exp(x 0 i b β))x i = i (y i exp(x 0 i β))x i i exp(x 0 i β)x i x 0 i ( b β β) + F.o.c. set this to zero and can show that R disappears asymptotically i (y i µ i )x i + ( i µ i x i xi 0) (b β β) = 0 ) ( β b β) = ( i µ i x i xi 0) 1 i (y i µ i )x i a ( i µ i x i xi 0) 1 N 0, i σ 2 i x i xi 0 h a N 0, ( i µ i x i xi 0) 1 i σ 2 i x i xi 0 ( i µ i x i xi 0) 1i where µ i = exp(x 0 i β), bµ i = exp(x 0 i b β) and σ 2 i = E[(y i µ i ) 2 ]. Asymptotically can estimate i σ 2 i x i x 0 i by i (y i bµ i ) 2 x i x 0 i. f density is Poisson then simpli es to ( i µ i x i x 0 i ) 1 as σ 2 i = µ i. A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 13 / 65

14 3. Poisson regression example 3. Poisson regression example 2003 MEPS data for over 65 in Medicare Dependent variable: docvis Regressors grouped into three categories: Health insurance status indicators F F private medicaid Socioeconomic F F F age age2 educyr Health status measures F F actlim totchr global xlist private medicaid age age2 educyr actlim totchr in commands refer to as $xlist A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 14 / 65

15 3. Poisson regression example Summary statistics. describe docvis $xlist storage display value variable name type format label variable label docvis float %9.0g # doctor visits private byte %8.0g =1 if has private supplementary insuran medicaid byte %8.0g =1 if has Medicaid public insurance age byte %8.0g Age age2 float %9.0g Age squared educyr byte %8.0g Years of education actlim byte %8.0g =1 if activity limitation totchr byte %8.0g # chronic conditions.. summarize docvis $xlist, sep(10) Variable Obs Mean Std. Dev. Min Max docvis private medicaid age age educyr actlim totchr A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 15 / 65

16 3. Poisson regression example Poisson MLE with robust sandwich standard errors - preferred. * Poisson with robust standard errors. poisson docvis $xlist, vce(robust) nolog // Poisson robust SEs Poisson regression Number of obs = 3677 Wald chi2(7) = Prob > chi2 = Log pseudolikelihood = Pseudo R2 = Robust docvis Coef. Std. Err. z P> z [95% Conf. nterval] private medicaid age age educyr actlim totchr _cons A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 16 / 65

17 3. Poisson regression example Poisson MLE with default ML standard errors - do not use - These are misleadingly small due to overdispersion!!. * Poisson with default ML standard errors. poisson docvis $xlist // Poisson default ML standard errors teration 0: log likelihood = teration 1: log likelihood = teration 2: log likelihood = Poisson regression Number of obs = 3677 LR chi2(7) = Prob > chi2 = Log likelihood = Pseudo R2 = docvis Coef. Std. Err. z P> z [95% Conf. nterval] private medicaid age age educyr actlim totchr _cons Robustq se s are times larger Note: sy 2 /ȳ = p /6.82 = p 8.01 = A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 17 / 65

18 3. Poisson regression example Coe cient interpretation Poisson regression: coe cient interpretation For the exponential conditional mean the marginal e ect ME j = E[yjx] x j = exp(x 0 β) β j = E[yjx] β j 1 Conditional mean is strictly monotonic increasing (or decreasing) in x ij according to the sign of β j. 2 Coe cients are semi-elasticities: β j is proportionate change in conditional mean when x ij changes by one unit. 3 More precisely (exp(β j ) 1) is proportionate change. Programs have options to report exponentiated coe cients (incidence-rate ratios). 4 Like all single-index models, if β j = 2β k, then the e ect of one-unit change in x j is twice that of x k. A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 18 / 65

19 3. Poisson regression example Coe cient interpretation Poisson regression: coe cient interpretation (continued) Example: bβ Private = Private insurance is associated with an increase in mean doctor visits of 14.2%. More precisely the increase is 100 (e ) = 100 ( ) = 15.3%. Alternatively the exponentiated coe cient is e = 1.153, so the multiplicative e ect is Example: bβ Private = and bβ totchr = Since 0.142/0.248 = 0.57, private insurance has the same impact on mean doctor visits as 0.57 more chronic conditions. A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 19 / 65

20 3. Poisson regression example Marginal e ects Marginal e ects: Three types 1. Average marginal e ect (AME): Evaluate at each x i and average AME = i E[y i jx i ] x ij = i exp(x 0 i b β) bβ j. 2. Marginal e ect at mean (MEM): Evaluate at x = x MEM = E[yjx] x j = exp(x 0 β) b bβ j x=x 3. Marginal e ect at representative value (MER): Evaluate at x = x AME is nest For population AME use population weights in computing AME. A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 20 / 65

21 3. Poisson regression example Marginal e ects For Poisson with intercept in model AME = ȳ bβ j Reason: f.o.c. i (y i exp(xi 0 β)) b = 0 imply i exp(xi 0 β) b = ȳ For Poisson can show that AME > MEM. Computation of marginal e ects in Stata after poisson (or other regression command) give command margins, dydx(*) for AME margins, dydx(*) atmean for MEM margins, dydx(*) at(age=30 educyr=12) for MER Old Stata 10: add-ons mfx for MEM and margeff for AME. A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 21 / 65

22 3. Poisson regression example Marginal e ects Marginal e ects: AME (This page) versus MEM (next page). * AME and MEM for Poisson. quietly poisson docvis $xlist, vce(robust). margins, dydx(*) // AME: Average marginal effect for Poisson Average marginal effects Number of obs = 3,677 Model VCE : Robust Expression : Predicted number of events, predict() dy/dx w.r.t. : private medicaid age age2 educyr actlim totchr Delta method dy/dx Std. Err. z P> z [95% Conf. nterval] private medicaid age age educyr actlim totchr A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 22 / 65

23 3. Poisson regression example Marginal e ects Marginal e ects: MEM. margins, dydx(*) atmean // MEM: ME for Poisson evaluated at average of x Conditional marginal effects Number of obs = 3,677 Model VCE : Robust Expression : Predicted number of events, predict() dy/dx w.r.t. : private medicaid age age2 educyr actlim totchr at : private = (mean) medicaid = (mean) age = (mean) age2 = (mean) educyr = (mean) actlim = (mean) totchr = (mean) Delta method dy/dx Std. Err. z P> z [95% Conf. nterval] private medicaid age age educyr actlim totchr A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 23 / 65

24 3. Poisson regression example Factor Variables Factor variables Problem: obtaining ME in models with interactions and polynomials Solution: Factor variable operators # and ## also i. for discrete variables uses nite di erence not derivatives Now can get ME with respect to age allowing for age 2 in model. Also now the i. variables have a somewhat di erent ME.. * Also factor variables and noncaculus methods. * The i. are discrete and will calculate ME of one unit change (not derivative). * The c.age##age means age and age sauared appear and ME is w.r.t. age. quietly poisson docvis i.private i.medicaid c.age##c.age educyr i.actlim totchr, vce(robust). margins, dydx(*) // MEM: ME for Poisson evaluated at average of x Average marginal effects Number of obs = 3,677 Model VCE : Robust Expression : Predicted number of events, predict() dy/dx w.r.t. : 1.private 1.medicaid age educyr 1.actlim totchr Delta method dy/dx Std. Err. z P> z [95% Conf. nterval] 1.private medicaid age educyr actlim totchr Note: dy/dx for factor levels is the discrete change from the base level. A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 24 / 65

25 4. Summary of remaining topics 4. Summary of remaining topics The Poisson is a generalized linear model (GLM) the framework used in the statistics literature for nonlinear regression leading examples are OLS, logit, probit, gamma regression in Stata use command glm with family member poisson and log ling function. The Poisson MLE is generally ine cient but generally not great e ciency loss (like OLS versus GLS) But it is usually the wrong model if we want to predict probabilities intuitively Poisson has only one parameter µ whereas e.g. the normal has µ and σ 2 The standard better ML model is the negative binomial distribution F then Var (y ) = µ + αµ 2 (overdispersion) versus Poisson Var (y ) = µ. F in Stata use command nbreg. A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 25 / 65

26 4. Summary of remaining topics 4. Summary of remaining topics (continued) Count data may be record no zeroes e.g. record number of doctor visits only for those who visited the clinic at least once then use truncated MLE. Count data may be topcoded e.g. record number of doctor visits as 0, 1, 2, 3, 4 0r more. then use censored (from above) MLE. Count data may be interval recorded. Count data even if fully observed often have too few zeroes e.g. fewer zeroes than predicted by a negative binomial model. then two approaches F hurdle model - one model for = 0 or > 0 and separate model given > 0 F with zeroes model Most natural extension of R 2 is R 2 Cor = ccor 2 [y i, by i ]. Compare nonnested parametric models using AC or BC. Quantile regression has been extended to counts. A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 26 / 65

27 5. Generalized linear models 5. Generalized linear models Generalized linear models (GLM) is the framework used in the statistics literature for nonlinear regression a brief introduction that may be skipped depending on time. Leading examples are OLS regression for y 2 (, ) Logit and probit regression for y 2 f0, 1g Poisson regression for y 2 f0, 1, 2, 3,...g Gamma regression including exponential for y 2 (0, ) n Stata use command glm specify the GLM family member: here poisson specify the link function (inverse of the conditional mean function): here log get robust standard errors: vce(robust) A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 27 / 65

28 Exactly same as poisson, vce(robust) A. Colin Cameron Univ. of Calif. - Davis... for Center of Labor Economics Norwegian School of Economics Advanced Microeconomet Count Regression: Part 1 Aug 28 - Sept 1, / Generalized linear models Poisson GLM with robust sandwich standard errors. glm docvis $xlist, family(poisson) link(log) vce(robust) nolog Generalized linear models No. of obs = 3677 Optimization : ML Residual df = 3669 Scale parameter = 1 Deviance = (1/df) Deviance = Pearson = (1/df) Pearson = Variance function: V(u) = u [Poisson] Link function : g(u) = ln(u) [Log] AC = Log pseudolikelihood = BC = Robust docvis Coef. Std. Err. z P> z [95% Conf. nterval] private medicaid age age educyr actlim totchr _cons

29 5. Generalized linear models ASDE: What is a generalized linear model? Class of models based on linear exponential family (LEF): normal, binomial, Bernoulli, gamma, exponential, Poisson. Speci cally for the LEF f (y i jµ i ) = expfa(µ i ) + b(y i ) + c(µ i )y i g E[y i ] = µ i = a 0 (µ i )/c(µ i ) V[y i ] = 1/c(µ i ) For regression specify a model of the mean µ i = µ i (β) = µ i (x i, β). Poisson is a member with a(µ) = µ; c(µ) = ln µ a 0 (µ) = 1 and c 0 (µ) = 1/µ E[y] = ( 1)/(1/µ) = µ and V[y] = 1/(1/µ) = µ. A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 29 / 65

30 5. Generalized linear models Quasi-MLE maximizes the log-likelihood ln L(β) = i ln f (y i jµ i (β)) = i fa(µ i (β)) + b(y i ) + c(µ i (β))y i g. The rst-order conditions are i fa 0 (µ i (β)) + c 0 (µ i (β))y i g µ i (β) β = 0 ) i c 0 (µ i (β) fy i a 0 (µ i (β))/c 0 (µ i (β))g µ i (β) β = 0 ) i 1 V[y i ] fy i µ i (β)g µ i (β) β = 0 MLE based on LEF with µ i = g(xi 0 β) shares the robustness properties of normal and Poisson MLE consistency requires correct speci cation of the mean (so E[fy i µ i (β)g] = 0). But correct standard errors should use a robust estimate of variance Robust sandwich s.e. s or Default ML s.e. s multiplied by p bα where V[y i jx i ] = α h(e[y i jx i ]). A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 30 / 65

31 5. Generalized linear models ASDE: Nonlinear least squares estimator Alternative estimator for counts that is not used, because Poisson is simpler and is usually more e cient. Specify same conditional mean as Poisson: E[y i jx i ] = exp(x 0 i β). Minimize sum of squared residuals: N i=1(y i exp(x 0 i β))2. NLS is consistent provided E[y i jx i ] = exp(xi 0β) a N [β, VMLE [ β]] b and use robust sandwich variance estimate: bβ NLS bv ROB [ β b 1 NLS ] = i bµ 2 i x i xi 0 i (y i bµ i ) 2 bµ 2 i x i x 0 i i bµ 2 i x i x 0 i 1. For doctor visits data nl (docvis = exp({xb: $xlist}+{b0})), vce(robust) NLS robust standard errors are 5-20% larger than those for Poisson A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 31 / 65

32 6. Diagnostics 6. Diagnostics: residuals and in uence measures Some diagnostics come out of the GLM literature. Residuals (for Poisson) Raw: r i = (y i bµ i ) Pearson: p i = (y i bµ i )/ p bµ i Deviance: d i = sign(y i bµ i ) p 2fy i ln(y i /bµ i ) (y i bµ i )g Anscombe: a i = 1.5(yi 2/3 µ 2/3 i )/µ 1/6 i Last three will be standardized if V[y i ] = µ i. Small-sample corrections (for Poisson) Hat matrix: H = W 1/2 X(X 0 WX) 1 X 0 W 1/2 ; W = Diag[bµ i ]. Studentized residual: p i = p i / p 1 h ii and d i = d i / p 1 h ii. n uential observations: Rule of thumb: h ii > 2K /N Cook s distance: C i = (pi )2 h i /K (1 h ii ) measures change in β b when observation i is omitted. A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 32 / 65

33 6. Diagnostics. summarize rraw rpearson rdeviance ranscombe hat cooksd, sep(10) Variable Obs Mean Std. Dev. Min Max rraw e rpearson rdeviance ranscombe hat cooksd e correlate rraw rpearson rdeviance ranscombe (obs=3677) rraw rpearson rdevia~e ransco~e rraw rpearson rdeviance ranscombe The various residuals are highly correlated. The raw residuals sum to zero due to f.o.c.. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 33 / 65

34 6. Diagnostics Diagnostics: R-squared measures Di erent interpretations of R 2 in linear model lead to di erent R 2 in nonlinear model. Most are di cult to interpret in nonlinear models. Simplest: squared correlation coe cient between y i and by i = bµ i R 2 Cor = ccor 2 [y i, by i ] Sums of squares measures di er in nonlinear models R 2 Res = 1 ResSS/TotalSS 6= R 2 Exp = ExpSS/TotalSS Relative gain in log-likelihood (L 0 is intercept model only) R 2 RG = ln L t ln L 0 ln L max ln L 0 = 1 ln L max ln L t. ln L max ln L 0 Works for Poisson as ln L max occurs when µ i = y i. Unlike others 0 RRG 2 < 1 and R2 RG always increases as add regressors. Stata measure is only applicable to binary and multinomial models R 2 Pseudo = 1 ln L t / ln L 0. A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 34 / 65

35 6. Diagnostics ASDE: Diagnostics: overdispersion test H 0 :V[y i jx i ] = E[y i jx i ] versus H 1 :V[y i jx i ] = E[y i jx i ]+ α(e[y i jx i ]) 2. Test H 0 : α = 0 against H 1 : α > 0. mplement by auxiliary regression ((y i bµ i ) 2 y i )/bµ i = αbµ i + error and do t test of whether the coe cient of bµ i is zero. n practice can skip this test and just do Poisson with robust s.e. s. Test is useful as can also use this test for test of underdispersion, whereas other tests (such as LM of Poisson vs. negative binomial) only test overdispersion. f we are just modelling the conditional mean, overdispersion is okay provided robust standard errors are calculated. A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 35 / 65

36 6. Diagnostics ASDE: Example of overdispersion test.. * Overdispersion test against V[y x] = E[y x] + a*(e[y x]^2). quietly poisson docvis $xlist, vce(robust). predict muhat, n. quietly generate ystar = ((docvis muhat)^2 docvis)/muhat. regress ystar muhat, noconstant noheader ystar Coef. Std. Err. t P> t [95% Conf. nterval] muhat Very strongly reject H 0. Data here are overdispersed.. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 36 / 65

37 6. Diagnostics Diagnostics: predicted probabilities Now suppose we want to predict probability of 0 doctor visits, 1 doctor visits,... Observed frequency p j (fraction of observations with y i = j). Fitted frequency bp j = N 1 N i=1 bp ij predicted probability bp ij = Pr[y i = j] = e bµ i bµ j i /j! for Poisson. Expect bp j close to p j, j = 0, 1, 2,... nformal statistic is Pearson s chi-square test j (np j nbp j ) 2 nbp j but this is not χ 2 distributed due to estimation to get bp j. A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 37 / 65

38 6. Diagnostics nstead do a formal chi-square goodness of t test. Assuming that the density is correctly speci ed (so more applicable to models more general than Poisson) this can be computed as NR 2 u (uncentered R 2 ) from the arti cial regression where 1 = s i (y i, x i, bθ) 0 γ + j (d ij (y i ) bp ij ) 0 δ j + error j denotes cells (e.g. values 0, 1, 2, 3, and 4 or more) d ij (y i ) equals 1 if y i is in cell j and 0 otherwise bp ij equals predicted probability for that cell s i (y i, x i, θ) = ln f (y i jx i, θ)/ θ (= (y i exp(xi 0 β)) for Poisson). Reject at level α if NR 2 u > χ 2 α (J 1) where J is number of cells. Use Stata add-on chi2gof: e.g. chi2gof, cells(11) table A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 38 / 65

39 6. Diagnostics Stata add-on chi2gof, cells(11) table yields χ 2 α (10) statistic equals Clearly problem as Poisson greatly underpredicts low counts e.g. for y = 0.. chi2gof, cells(11) table Chi square Goodness of Fit Test for Poisson Model: Chi square chi2(10) = Prob>chi2 = 0.00 Fitted Cells Abs. Freq. Rel. Freq. Rel. Freq. Abs. Dif or more (supersedes addon countfit: compares actual and tted frequencies but no test). A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 39 / 65

40 7. Negative binomial regression Motivation 7. Negative binomial regression: motivation Count data are often overdispersed with more zeros and more high values than a Poisson distribution predicts. Doctor visits: Frequencies with and grouped. tabulate dvrange dvrange Freq. Percent Cum Total 3, A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 40 / 65

41 7. Negative binomial regression Motivation Poisson (white) with λ = ȳ compared to actual data (grey) Density # Doctor visits versus Poisson Poisson clearly inappropriate: ȳ = 6.82, s y = 7.39, s 2 y = ' 8.01ȳ. A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 41 / 65

42 7. Negative binomial regression Negative binomial distribution Negative binomial distribution Negative binomial is a Poisson-gamma mixture. then y Poisson[λυ] υ Gamma[µ = 1, σ 2 = α] y Negative Binomial[µ = λ, σ 2 = λ + αλ 2 ]. Probability mass function: Pr[Y = yjλ, α] = Γ(α 1 α + y) α 1 Γ(α 1 )Γ(y + 1) α 1 + λ Mean and variance: Overdispersion: variance > mean. E[y] = λ V[y] = λ + αλ 2 1 λ λ + α 1 y. A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 42 / 65

43 7. Negative binomial regression Negative binomial distribution Negative binomial for λ = ȳ and α = compared to actual data Density # Doctor visits versus negative binomial Negative binomial much more appropriate than Poisson for these data. A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 43 / 65

44 7. Negative binomial regression Negative binomial distribution Negative binomial regression Negative binomial (Negbin 2) permits overdispersion. f (yjλ, α) = Γ(y + α 1 ) α 1 α 1 λ y Γ(y + 1)Γ(α 1 ) α 1 + λ α 1. + λ Same conditional mean but di erent conditional variance to Poisson E[yjx] = λ = exp(x 0 β) V[yjx] = λ + αλ 2 = exp(x 0 β) + α(exp(x 0 β)) 2. The ML rst-order conditions w.r.t. β and α are (with µ i = exp(x 0 i β)) N i=1 N i=1 1 α 2 ln(1 + αµ i ) y i 1 j=0 y i exp(xi 0β) 1 + α exp(xi 0β)x i = 0 1 (j + α 1 ) + y i µ i α (1 + αµ i ) = 0. Can additionally allow α = exp(x 0 γ) (generalized negative binomial). Can instead use Negbin 1: V[yjx] = (1 + α)λ = (1 + α) exp(x 0 β). Often little e ciency gain (if any) over Poisson with robust s.e s. A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 44 / 65

45 7. Negative binomial regression Negative binomial distribution Negative binomial MLE with ML default standard errors. nbreg docvis $xlist, nolog Negative binomial regression Number of obs = 3677 LR chi2(7) = Dispersion = mean Prob > chi2 = Log likelihood = Pseudo R2 = docvis Coef. Std. Err. z P> z [95% Conf. nterval] private medicaid age age educyr actlim totchr _cons /lnalpha alpha Likelihood ratio test of alpha=0: chibar2(01) = Prob>=chibar2 = Likelihood ratio test of α = 0 prefers NB to Poisson (p < 0.05) - where critical values use half χ 2 (1) as α = 0 is on boundary of NB. Fitted frequencies close to observed frequencies (from output not given) chi2gof, cells(11) table yields χ 2 α (10) statistic equals A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 45 / 65

46 7. Negative binomial regression Negative binomial distribution Poisson and negative binomial MLE with di erent standard error estimates docvis private *** *** *** *** *** (0.0143) (0.0364) (0.0360) (0.0332) (0.0369) medicaid *** * * (0.0189) (0.0568) (0.0475) (0.0454) (0.0567) age *** *** *** *** *** (0.0260) (0.0630) (0.0652) (0.0602) (0.0646) age *** *** *** *** *** (0.0002) (0.0004) (0.0004) (0.0004) (0.0004) educyr *** *** *** *** *** (0.0019) (0.0048) (0.0047) (0.0042) (0.0049) actlim *** *** *** *** *** (0.0146) (0.0397) (0.0366) (0.0348) (0.0394) totchr *** *** *** *** *** (0.0046) (0.0126) (0.0117) (0.0121) (0.0132) _cons *** *** *** *** *** (0.9720) (2.3692) (2.4415) (2.2474) (2.4241) lnalpha _cons *** *** (0.0307) (0.0378) N pseudo R sq Standard errors in parentheses * p<0.05, ** p<0.01, *** p<0.001 (1) (2) (3) (4) (5) PDEFAULT PROBUST PPEARSON NBDEFAULT NBROBUST A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 46 / 65

47 8. Richer parametric models 8. Richer parametric models Data frequently exhibit non-poisson features: Overdispersion: conditional variance exceeds conditional mean whereas Poisson imposes equality. Excess zeros: higher frequency of zeros than predicted by Poisson. This provides motivation for richer parametric models than basic Poisson. Some models still have E[yjx] = exp(x 0 β) Then richer model can provide more e cient estimates. Other models imply E[yjx] 6= exp(x 0 β) Then Poisson QMLE is inconsistent And marginal e ects and coe cient interpretation more di cult. Many of these models are fully parametric and require correct speci cation for consistency. A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 47 / 65

48 8. Richer parametric models Truncated data Counts left-truncated at zero Sampling rule is such that observe only y and x for y 1 i.e. only those who participate at least once are in sample. Truncated density (given untruncated density f (y jx, θ)) is f (yjx, θ, y 1) = f (yjx, θ) f (yjx, θ) = Pr[y 0jx, θ] [1 f (0jx, θ)]. MLE is inconsistent if any aspect of the parametric model is misspeci ed. Need to assume that the process for nonzeroes is the same as zeroes. e.g. f data are on annual number of hunting trips for only those who hunted this year, then a missing 0 is interpreted as being for a hunter who did not hunt this year (rather than for all people). Stata commands ztp (poisson) and ztnb (negative binomial) ztnb docvis $xlist if docvis>0, nolog A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 48 / 65

49 8. Richer parametric models Censored data Counts right-censored Sampling rule is that observe only 0, 1, 2,..., c 1, c or more i.e. Only record counts up to c and then any value above c. Censored density (given uncensored density f (yjx, θ) and cdf is F (yjx, θ)) f (yjx, θ) y c 1 1 F (c 1jx, θ) = 1 c j=0 1 f (jjx, θ) y = c Log-likelihood (where d i = 1 if uncensored and d i = 0 if censored) L(θ) = N i=1 fd i ln f (y i jx i, θ) + (1 d i ) ln(1 c 1 j=0 f (jjx i, θ))g MLE is inconsistent if any aspect of the parametric model is misspeci ed So pick a good density - at least negative binomial. Stata code up yourself using command ml (for user-de ned likelihood). A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 49 / 65

50 8. Richer parametric models Counts recorded in intervals Counts recorded in intervals Sampling rule is that observe only counts in ranges. e.g. 0, 1-4, 5-9, 10 and above. nterval density is simply Pr[a y b] = b f (jjx, θ). j=a Let interval ranges by [a 0, a 1 1], [a 1, a 2 1],..., [a m, a m+1 ), where a 0 = 0, a m+1 =. Let d k be binary indicators for whether in interval k (k = 0,..., m). Then ln L(θ) = N i=1 h i m k=0 d a ij ln k+1 1 f (jjx, θ). k=a k MLE is inconsistent if any aspect of the parametric model is misspeci ed. Stata has no command so need to code up. For convenience could instead use ordered logit or probit here. A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 50 / 65

51 8. Richer parametric models Hurdle model or two-part model Hurdle model or two-part model Suppose zero counts are determined by a di erent process to positive counts. Zeros: density f 1 (yjx 1, θ 1 ) so Pr[y = 0] = f 1 (0) and Pr[y > 0] = 1 f 1 (0). Positives: density f 2 (yjx 2, θ 2 ) so truncated density f 2 (y)/(1 f 2 (0)). e.g. First - do hunt this year or not? Second - given chose to hunt, how many times ( 1)? Combined density is 8 < f (yjx 1, x 1, θ 1, θ 2 ) = : f 1 (yjx 1, θ 1 ) y = 0 1 f 1 (0jx 1, θ 1 ) 1 f 2 (0jx 2, θ 2 ) f 2(yjx 2, θ 2 ) y 1 MLE is inconsistent if any aspect of model misspeci ed. A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 51 / 65

52 8. Richer parametric models Hurdle model or two-part model Conditional mean is now E[yjx] = Pr[y 1 > 0jx 1 ] E y2 >0[y 2 jy 2 > 0, x 2 ]. This makes marginal e ects more complicated. Example: f 1 () is logit and f 2 () is negative binomial. Then E[yjx] = Λ(x1β) 0 exp(x2β)/[1 0 (1 + α 2 exp(x2β)) 0 1/α 2 ], where Λ(z) = e z /(1 + e z ). A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 52 / 65

53 8. Richer parametric models Hurdle model or two-part model Hurdle model - logit and negative binomial: Stata addon hnblogit. hnblogit docvis $xlist, nolog Negative Binomial Logit Hurdle Regression Number of obs = 3677 Wald chi2(7) = Log likelihood = Prob > chi2 = logit private medicaid age age educyr actlim totchr _cons negbinomial private medicaid age age educyr actlim totchr _cons /lnalpha AC Statistic = Coef. Std. Err. z P> z [95% Conf. nterval] A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 53 / 65

54 8. Richer parametric models Zero-in ated model Zero-in ated model (or with-zeroes model) Suppose there is an additional reason for zero counts Extra model for 0: density f 1 (yjx 1, θ 1 ) Usual model for 0: realization of 0 from density f 2 (yjx 2, θ 2 ). e.g. Some zeroes are mismeasurement and some are true zeros. Zero-in ated model has density = f (yjx 1, x 1, θ 1, θ 2 ) f1 (0jx 1, θ 1 ) + [1 f 1 (0jx 1, θ 1 )] f 2 (0jx 2, θ 2 ) y = 0 [1 f 1 (0jx 1, θ 1 )] f 2 (yjx 2, θ 2 ) y 1 MLE is inconsistent if any aspect of model misspeci ed. Not used much in econometrics - hurdle model more popular. A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 54 / 65

55 8. Richer parametric models Zero-in ated model Zero-in ated negative binomial: Stata command zinb and zip. zinb docvis $xlist, inflate($xlist) vuong nolog Zero inflated negative binomial regression Number of obs = 3677 Nonzero obs = 3276 Zero obs = 401 nflation model = logit LR chi2(7) = Log likelihood = Prob > chi2 = Coef. Std. Err. z P> z [95% Conf. nterval] docvis private medicaid age age educyr actlim totchr _cons inflate private medicaid age age educyr actlim totchr _cons /lnalpha alpha Vuong test of zinb vs. standard negative binomial: z = 6.48 Pr>z = A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 55 / 65

56 8. Richer parametric models Continuous mixture models Continuous mixture models Mixture motivation for negative binomial assumes y jθ Poisson (θ) where θ = λυ is the product of two components: observed individual heterogeneity λ = exp(x 0 β) unobserved individual heterogeneity υ Gamma[1, α]. ntegrating out Z Z h(yjλ) = f (yjλ, υ)g(υ)dυ = [e λυ (λυ) y /y!] g(υ)dυ gives yjλ NB [λ, λ + αλ 2 ] if υ Gamma[1, α]. Di erent distributions of υ lead to di erent models e.g. Poisson-lognormal mixture (random e ects model) e.g. Poisson-nverse Gaussian. Even if no closed form solution can estimate using numerical integration (one-dimensional) e.g. Gaussian quadrature. Monte Carlo integration e.g. maximum simulated likelihood. A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 56 / 65

57 8. Richer parametric models Hierarchical models Hierarchical models For multi-level surveys cross-section data individuals i may be in cluster j e.g. patient i in hospital j e.g. individual i in household j or village j Hierarchical model or generalized linear mixed model example y i Poisson[µ ij = exp(x 0 ij β j + ε ij )] β j = W j γ + v j ε ij N [0, σ 2 ε ] v j N [0, Diag[σ 2 jk ]] Estimate by MLE or by Bayesian methods Stata command mepoisson A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 57 / 65

58 8. Richer parametric models Model comparison Model comparison for fully parametric models Choice between nested models using likelihood ratio tests e.g. Poisson versus negative binomial. Choice between non-nested models using Vuong s (1989) likelihood ratio test e.g. Zero-in ated NB versus NB Choice between non-nested mixture models using penalized log-likelihood Akaike s information criterion (AC) and extensions (q = # parameters) AC = 2 ln L + 2q BC = 2 ln L + qk ln N CAC = 2 ln L + q(1 + ln)n Prefer model with small AC or BC. AC penalty for larger model too small. Bayesian C (BC) better. A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 58 / 65

59 8. Richer parametric models Model comparison Compare predicted means: E[y jx, bθ]. Compare observed frequencies p j to average predicted frequencies where bp ij = bpr[y i = j]. bp j = N 1 N bp ij, i=1 A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 59 / 65

60 8. Richer parametric models Model comparison Compare AC, BC for regular NB, hurdle logit/nb and zero-in ated NB. Statistics NBREG HURDLENB ZNB N ll aic bic Hurdle NB and ZNB are big improvement on regular NB - lnl is approximately 100 higher than for NB - AC and BC is much smaller (with only 9 extra parameters) Little di erence between Hurdle NB and ZNB. A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 60 / 65

61 8. Richer parametric models Model comparison The conditional means from the three models are similar.. summarize docvis dvnbreg dvhurdle dvzinb Variable Obs Mean Std. Dev. Min Max docvis dvnbreg dvhurdle dvzinb correlate docvis dvnbreg dvhurdle dvzinb (obs=3677) docvis dvnbreg dvhurdle dvzinb docvis dvnbreg dvhurdle dvzinb A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 61 / 65

62 8. Richer parametric models Quantile regression Quantile regression The q th quantile regression estimator b β q minimizes over β q N N Q(β q ) = qjy i xi 0 β q j + (1 q)jy i xi 0 β q j, 0 < q < 1. i :y i xi 0 β i :y i <xi 0 β Example: median regression with q = 0.5. For count y adapt standard methods for continuous y by: Replace count y by continuous variable z = y + u where u Uniform[0, 1]. Then reconvert predicted z-quantile to y-quantile using ceiling function. Machado and Santos Silva (2005). Stata example qcount docvis $xlist, q(0.5) rep(50) qcount_mfx // MEM A. Colin Cameron Univ. of Calif. - Davis... for Center Count of Labor Regression: Economics PartNorwegian 1 School ofaug Economics 28 - Sept Advanced 1, 2017 Microeconomet 62 / 65

Day 1B Count Data Regression: Part 2

Day 1B Count Data Regression: Part 2 Day 1B Count Data Regression: Part 2 A. Colin Cameron Univ. of Calif. - Davis... for Center of Labor Economics Norwegian School of Economics Advanced Microeconometrics Aug 28 - Sept 1, 2017. Colin Cameron

More information

Day 4 Nonlinear methods

Day 4 Nonlinear methods Day 4 Nonlinear methods c A. Colin Cameron Univ. of Calif.- Davis Frontiers in Econometrics Bavarian Graduate Program in Economics. Based on A. Colin Cameron and Pravin K. Trivedi (2009, 2010), Microeconometrics

More information

ECON 594: Lecture #6

ECON 594: Lecture #6 ECON 594: Lecture #6 Thomas Lemieux Vancouver School of Economics, UBC May 2018 1 Limited dependent variables: introduction Up to now, we have been implicitly assuming that the dependent variable, y, was

More information

Day 3B Nonparametrics and Bootstrap

Day 3B Nonparametrics and Bootstrap Day 3B Nonparametrics and Bootstrap c A. Colin Cameron Univ. of Calif.- Davis Frontiers in Econometrics Bavarian Graduate Program in Economics. Based on A. Colin Cameron and Pravin K. Trivedi (2009,2010),

More information

Lab 3: Two levels Poisson models (taken from Multilevel and Longitudinal Modeling Using Stata, p )

Lab 3: Two levels Poisson models (taken from Multilevel and Longitudinal Modeling Using Stata, p ) Lab 3: Two levels Poisson models (taken from Multilevel and Longitudinal Modeling Using Stata, p. 376-390) BIO656 2009 Goal: To see if a major health-care reform which took place in 1997 in Germany was

More information

Day 2A Instrumental Variables, Two-stage Least Squares and Generalized Method of Moments

Day 2A Instrumental Variables, Two-stage Least Squares and Generalized Method of Moments Day 2A nstrumental Variables, Two-stage Least Squares and Generalized Method of Moments c A. Colin Cameron Univ. of Calif.- Davis Frontiers in Econometrics Bavarian Graduate Program in Economics. Based

More information

Generalized linear models

Generalized linear models Generalized linear models Christopher F Baum ECON 8823: Applied Econometrics Boston College, Spring 2016 Christopher F Baum (BC / DIW) Generalized linear models Boston College, Spring 2016 1 / 1 Introduction

More information

Day 4A Nonparametrics

Day 4A Nonparametrics Day 4A Nonparametrics A. Colin Cameron Univ. of Calif. - Davis... for Center of Labor Economics Norwegian School of Economics Advanced Microeconometrics Aug 28 - Sep 2, 2017. Colin Cameron Univ. of Calif.

More information

Discrete Choice Modeling

Discrete Choice Modeling [Part 6] 1/55 0 Introduction 1 Summary 2 Binary Choice 3 Panel Data 4 Bivariate Probit 5 Ordered Choice 6 7 Multinomial Choice 8 Nested Logit 9 Heterogeneity 10 Latent Class 11 Mixed Logit 12 Stated Preference

More information

Day 1A Ordinary Least Squares and GLS

Day 1A Ordinary Least Squares and GLS Day 1A Ordinary Least Squares and GLS c A. Colin Cameron Univ. of Calif.- Davis Frontiers in Econometrics Bavarian Graduate Program in Economics. Based on A. Colin Cameron and Pravin K. Trivedi (2009,2010),

More information

Estimation of hurdle models for overdispersed count data

Estimation of hurdle models for overdispersed count data The Stata Journal (2011) 11, Number 1, pp. 82 94 Estimation of hurdle models for overdispersed count data Helmut Farbmacher Department of Economics University of Munich, Germany helmut.farbmacher@lrz.uni-muenchen.de

More information

MODELING COUNT DATA Joseph M. Hilbe

MODELING COUNT DATA Joseph M. Hilbe MODELING COUNT DATA Joseph M. Hilbe Arizona State University Count models are a subset of discrete response regression models. Count data are distributed as non-negative integers, are intrinsically heteroskedastic,

More information

CRE METHODS FOR UNBALANCED PANELS Correlated Random Effects Panel Data Models IZA Summer School in Labor Economics May 13-19, 2013 Jeffrey M.

CRE METHODS FOR UNBALANCED PANELS Correlated Random Effects Panel Data Models IZA Summer School in Labor Economics May 13-19, 2013 Jeffrey M. CRE METHODS FOR UNBALANCED PANELS Correlated Random Effects Panel Data Models IZA Summer School in Labor Economics May 13-19, 2013 Jeffrey M. Wooldridge Michigan State University 1. Introduction 2. Linear

More information

Appendix A. Numeric example of Dimick Staiger Estimator and comparison between Dimick-Staiger Estimator and Hierarchical Poisson Estimator

Appendix A. Numeric example of Dimick Staiger Estimator and comparison between Dimick-Staiger Estimator and Hierarchical Poisson Estimator Appendix A. Numeric example of Dimick Staiger Estimator and comparison between Dimick-Staiger Estimator and Hierarchical Poisson Estimator As described in the manuscript, the Dimick-Staiger (DS) estimator

More information

GMM Estimation in Stata

GMM Estimation in Stata GMM Estimation in Stata Econometrics I Department of Economics Universidad Carlos III de Madrid Master in Industrial Economics and Markets 1 Outline Motivation 1 Motivation 2 3 4 2 Motivation 3 Stata and

More information

Parametric Modelling of Over-dispersed Count Data. Part III / MMath (Applied Statistics) 1

Parametric Modelling of Over-dispersed Count Data. Part III / MMath (Applied Statistics) 1 Parametric Modelling of Over-dispersed Count Data Part III / MMath (Applied Statistics) 1 Introduction Poisson regression is the de facto approach for handling count data What happens then when Poisson

More information

Control Function and Related Methods: Nonlinear Models

Control Function and Related Methods: Nonlinear Models Control Function and Related Methods: Nonlinear Models Jeff Wooldridge Michigan State University Programme Evaluation for Policy Analysis Institute for Fiscal Studies June 2012 1. General Approach 2. Nonlinear

More information

Poisson Regression. Ryan Godwin. ECON University of Manitoba

Poisson Regression. Ryan Godwin. ECON University of Manitoba Poisson Regression Ryan Godwin ECON 7010 - University of Manitoba Abstract. These lecture notes introduce Maximum Likelihood Estimation (MLE) of a Poisson regression model. 1 Motivating the Poisson Regression

More information

11. Bootstrap Methods

11. Bootstrap Methods 11. Bootstrap Methods c A. Colin Cameron & Pravin K. Trivedi 2006 These transparencies were prepared in 20043. They can be used as an adjunct to Chapter 11 of our subsequent book Microeconometrics: Methods

More information

Generalized Linear Models for Count, Skewed, and If and How Much Outcomes

Generalized Linear Models for Count, Skewed, and If and How Much Outcomes Generalized Linear Models for Count, Skewed, and If and How Much Outcomes Today s Class: Review of 3 parts of a generalized model Models for discrete count or continuous skewed outcomes Models for two-part

More information

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria SOLUTION TO FINAL EXAM Friday, April 12, 2013. From 9:00-12:00 (3 hours) INSTRUCTIONS:

More information

Föreläsning /31

Föreläsning /31 1/31 Föreläsning 10 090420 Chapter 13 Econometric Modeling: Model Speci cation and Diagnostic testing 2/31 Types of speci cation errors Consider the following models: Y i = β 1 + β 2 X i + β 3 X 2 i +

More information

2. We care about proportion for categorical variable, but average for numerical one.

2. We care about proportion for categorical variable, but average for numerical one. Probit Model 1. We apply Probit model to Bank data. The dependent variable is deny, a dummy variable equaling one if a mortgage application is denied, and equaling zero if accepted. The key regressor is

More information

options description set confidence level; default is level(95) maximum number of iterations post estimation results

options description set confidence level; default is level(95) maximum number of iterations post estimation results Title nlcom Nonlinear combinations of estimators Syntax Nonlinear combination of estimators one expression nlcom [ name: ] exp [, options ] Nonlinear combinations of estimators more than one expression

More information

Ninth ARTNeT Capacity Building Workshop for Trade Research "Trade Flows and Trade Policy Analysis"

Ninth ARTNeT Capacity Building Workshop for Trade Research Trade Flows and Trade Policy Analysis Ninth ARTNeT Capacity Building Workshop for Trade Research "Trade Flows and Trade Policy Analysis" June 2013 Bangkok, Thailand Cosimo Beverelli and Rainer Lanz (World Trade Organization) 1 Selected econometric

More information

Jeffrey M. Wooldridge Michigan State University

Jeffrey M. Wooldridge Michigan State University Fractional Response Models with Endogenous Explanatory Variables and Heterogeneity Jeffrey M. Wooldridge Michigan State University 1. Introduction 2. Fractional Probit with Heteroskedasticity 3. Fractional

More information

i (x i x) 2 1 N i x i(y i y) Var(x) = P (x 1 x) Var(x)

i (x i x) 2 1 N i x i(y i y) Var(x) = P (x 1 x) Var(x) ECO 6375 Prof Millimet Problem Set #2: Answer Key Stata problem 2 Q 3 Q (a) The sample average of the individual-specific marginal effects is 0039 for educw and -0054 for white Thus, on average, an extra

More information

Economics 671: Applied Econometrics Department of Economics, Finance and Legal Studies University of Alabama

Economics 671: Applied Econometrics Department of Economics, Finance and Legal Studies University of Alabama Problem Set #1 (Random Data Generation) 1. Generate =500random numbers from both the uniform 1 ( [0 1], uniformbetween zero and one) and exponential exp ( ) (set =2and let [0 1]) distributions. Plot the

More information

Lecture 3.1 Basic Logistic LDA

Lecture 3.1 Basic Logistic LDA y Lecture.1 Basic Logistic LDA 0.2.4.6.8 1 Outline Quick Refresher on Ordinary Logistic Regression and Stata Women s employment example Cross-Over Trial LDA Example -100-50 0 50 100 -- Longitudinal Data

More information

Binary Dependent Variables

Binary Dependent Variables Binary Dependent Variables In some cases the outcome of interest rather than one of the right hand side variables - is discrete rather than continuous Binary Dependent Variables In some cases the outcome

More information

Model and Working Correlation Structure Selection in GEE Analyses of Longitudinal Data

Model and Working Correlation Structure Selection in GEE Analyses of Longitudinal Data The 3rd Australian and New Zealand Stata Users Group Meeting, Sydney, 5 November 2009 1 Model and Working Correlation Structure Selection in GEE Analyses of Longitudinal Data Dr Jisheng Cui Public Health

More information

h=1 exp (X : J h=1 Even the direction of the e ect is not determined by jk. A simpler interpretation of j is given by the odds-ratio

h=1 exp (X : J h=1 Even the direction of the e ect is not determined by jk. A simpler interpretation of j is given by the odds-ratio Multivariate Response Models The response variable is unordered and takes more than two values. The term unordered refers to the fact that response 3 is not more favored than response 2. One choice from

More information

Mohammed. Research in Pharmacoepidemiology National School of Pharmacy, University of Otago

Mohammed. Research in Pharmacoepidemiology National School of Pharmacy, University of Otago Mohammed Research in Pharmacoepidemiology (RIPE) @ National School of Pharmacy, University of Otago What is zero inflation? Suppose you want to study hippos and the effect of habitat variables on their

More information

DEEP, University of Lausanne Lectures on Econometric Analysis of Count Data Pravin K. Trivedi May 2005

DEEP, University of Lausanne Lectures on Econometric Analysis of Count Data Pravin K. Trivedi May 2005 DEEP, University of Lausanne Lectures on Econometric Analysis of Count Data Pravin K. Trivedi May 2005 The lectures will survey the topic of count regression with emphasis on the role on unobserved heterogeneity.

More information

Generalized Linear Models

Generalized Linear Models York SPIDA John Fox Notes Generalized Linear Models Copyright 2010 by John Fox Generalized Linear Models 1 1. Topics I The structure of generalized linear models I Poisson and other generalized linear

More information

Linear Regression Models P8111

Linear Regression Models P8111 Linear Regression Models P8111 Lecture 25 Jeff Goldsmith April 26, 2016 1 of 37 Today s Lecture Logistic regression / GLMs Model framework Interpretation Estimation 2 of 37 Linear regression Course started

More information

EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7

EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7 Introduction to Generalized Univariate Models: Models for Binary Outcomes EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7 EPSY 905: Intro to Generalized In This Lecture A short review

More information

Review of Panel Data Model Types Next Steps. Panel GLMs. Department of Political Science and Government Aarhus University.

Review of Panel Data Model Types Next Steps. Panel GLMs. Department of Political Science and Government Aarhus University. Panel GLMs Department of Political Science and Government Aarhus University May 12, 2015 1 Review of Panel Data 2 Model Types 3 Review and Looking Forward 1 Review of Panel Data 2 Model Types 3 Review

More information

Estimating the Fractional Response Model with an Endogenous Count Variable

Estimating the Fractional Response Model with an Endogenous Count Variable Estimating the Fractional Response Model with an Endogenous Count Variable Estimating FRM with CEEV Hoa Bao Nguyen Minh Cong Nguyen Michigan State Universtiy American University July 2009 Nguyen and Nguyen

More information

Testing and Model Selection

Testing and Model Selection Testing and Model Selection This is another digression on general statistics: see PE App C.8.4. The EViews output for least squares, probit and logit includes some statistics relevant to testing hypotheses

More information

Generalized Linear Models: An Introduction

Generalized Linear Models: An Introduction Applied Statistics With R Generalized Linear Models: An Introduction John Fox WU Wien May/June 2006 2006 by John Fox Generalized Linear Models: An Introduction 1 A synthesis due to Nelder and Wedderburn,

More information

A simple bivariate count data regression model. Abstract

A simple bivariate count data regression model. Abstract A simple bivariate count data regression model Shiferaw Gurmu Georgia State University John Elder North Dakota State University Abstract This paper develops a simple bivariate count data regression model

More information

Panel Data. March 2, () Applied Economoetrics: Topic 6 March 2, / 43

Panel Data. March 2, () Applied Economoetrics: Topic 6 March 2, / 43 Panel Data March 2, 212 () Applied Economoetrics: Topic March 2, 212 1 / 43 Overview Many economic applications involve panel data. Panel data has both cross-sectional and time series aspects. Regression

More information

Generalized Linear Models

Generalized Linear Models Generalized Linear Models Lecture 3. Hypothesis testing. Goodness of Fit. Model diagnostics GLM (Spring, 2018) Lecture 3 1 / 34 Models Let M(X r ) be a model with design matrix X r (with r columns) r n

More information

Latent class analysis and finite mixture models with Stata

Latent class analysis and finite mixture models with Stata Latent class analysis and finite mixture models with Stata Isabel Canette Principal Mathematician and Statistician StataCorp LLC 2017 Stata Users Group Meeting Madrid, October 19th, 2017 Introduction Latent

More information

Stata tip 63: Modeling proportions

Stata tip 63: Modeling proportions The Stata Journal (2008) 8, Number 2, pp. 299 303 Stata tip 63: Modeling proportions Christopher F. Baum Department of Economics Boston College Chestnut Hill, MA baum@bc.edu You may often want to model

More information

Statistics 203: Introduction to Regression and Analysis of Variance Course review

Statistics 203: Introduction to Regression and Analysis of Variance Course review Statistics 203: Introduction to Regression and Analysis of Variance Course review Jonathan Taylor - p. 1/?? Today Review / overview of what we learned. - p. 2/?? General themes in regression models Specifying

More information

ESTIMATING AVERAGE TREATMENT EFFECTS: REGRESSION DISCONTINUITY DESIGNS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics

ESTIMATING AVERAGE TREATMENT EFFECTS: REGRESSION DISCONTINUITY DESIGNS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics ESTIMATING AVERAGE TREATMENT EFFECTS: REGRESSION DISCONTINUITY DESIGNS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics July 2009 1. Introduction 2. The Sharp RD Design 3.

More information

Chapter 11. Regression with a Binary Dependent Variable

Chapter 11. Regression with a Binary Dependent Variable Chapter 11 Regression with a Binary Dependent Variable 2 Regression with a Binary Dependent Variable (SW Chapter 11) So far the dependent variable (Y) has been continuous: district-wide average test score

More information

fhetprob: A fast QMLE Stata routine for fractional probit models with multiplicative heteroskedasticity

fhetprob: A fast QMLE Stata routine for fractional probit models with multiplicative heteroskedasticity fhetprob: A fast QMLE Stata routine for fractional probit models with multiplicative heteroskedasticity Richard Bluhm May 26, 2013 Introduction Stata can easily estimate a binary response probit models

More information

Marginal Effects for Continuous Variables Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised January 20, 2018

Marginal Effects for Continuous Variables Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised January 20, 2018 Marginal Effects for Continuous Variables Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised January 20, 2018 References: Long 1997, Long and Freese 2003 & 2006 & 2014,

More information

Longitudinal Data Analysis Using Stata Paul D. Allison, Ph.D. Upcoming Seminar: May 18-19, 2017, Chicago, Illinois

Longitudinal Data Analysis Using Stata Paul D. Allison, Ph.D. Upcoming Seminar: May 18-19, 2017, Chicago, Illinois Longitudinal Data Analysis Using Stata Paul D. Allison, Ph.D. Upcoming Seminar: May 18-19, 217, Chicago, Illinois Outline 1. Opportunities and challenges of panel data. a. Data requirements b. Control

More information

Generalized Linear. Mixed Models. Methods and Applications. Modern Concepts, Walter W. Stroup. Texts in Statistical Science.

Generalized Linear. Mixed Models. Methods and Applications. Modern Concepts, Walter W. Stroup. Texts in Statistical Science. Texts in Statistical Science Generalized Linear Mixed Models Modern Concepts, Methods and Applications Walter W. Stroup CRC Press Taylor & Francis Croup Boca Raton London New York CRC Press is an imprint

More information

Modelling Rates. Mark Lunt. Arthritis Research UK Epidemiology Unit University of Manchester

Modelling Rates. Mark Lunt. Arthritis Research UK Epidemiology Unit University of Manchester Modelling Rates Mark Lunt Arthritis Research UK Epidemiology Unit University of Manchester 05/12/2017 Modelling Rates Can model prevalence (proportion) with logistic regression Cannot model incidence in

More information

Nonlinear Econometric Analysis (ECO 722) : Homework 2 Answers. (1 θ) if y i = 0. which can be written in an analytically more convenient way as

Nonlinear Econometric Analysis (ECO 722) : Homework 2 Answers. (1 θ) if y i = 0. which can be written in an analytically more convenient way as Nonlinear Econometric Analysis (ECO 722) : Homework 2 Answers 1. Consider a binary random variable y i that describes a Bernoulli trial in which the probability of observing y i = 1 in any draw is given

More information

Models, Testing, and Correction of Heteroskedasticity. James L. Powell Department of Economics University of California, Berkeley

Models, Testing, and Correction of Heteroskedasticity. James L. Powell Department of Economics University of California, Berkeley Models, Testing, and Correction of Heteroskedasticity James L. Powell Department of Economics University of California, Berkeley Aitken s GLS and Weighted LS The Generalized Classical Regression Model

More information

Semiparametric Generalized Linear Models

Semiparametric Generalized Linear Models Semiparametric Generalized Linear Models North American Stata Users Group Meeting Chicago, Illinois Paul Rathouz Department of Health Studies University of Chicago prathouz@uchicago.edu Liping Gao MS Student

More information

7/28/15. Review Homework. Overview. Lecture 6: Logistic Regression Analysis

7/28/15. Review Homework. Overview. Lecture 6: Logistic Regression Analysis Lecture 6: Logistic Regression Analysis Christopher S. Hollenbeak, PhD Jane R. Schubart, PhD The Outcomes Research Toolbox Review Homework 2 Overview Logistic regression model conceptually Logistic regression

More information

Econometric Analysis of Cross Section and Panel Data

Econometric Analysis of Cross Section and Panel Data Econometric Analysis of Cross Section and Panel Data Jeffrey M. Wooldridge / The MIT Press Cambridge, Massachusetts London, England Contents Preface Acknowledgments xvii xxiii I INTRODUCTION AND BACKGROUND

More information

ECON Introductory Econometrics. Lecture 11: Binary dependent variables

ECON Introductory Econometrics. Lecture 11: Binary dependent variables ECON4150 - Introductory Econometrics Lecture 11: Binary dependent variables Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 11 Lecture Outline 2 The linear probability model Nonlinear probability

More information

LOGISTIC REGRESSION Joseph M. Hilbe

LOGISTIC REGRESSION Joseph M. Hilbe LOGISTIC REGRESSION Joseph M. Hilbe Arizona State University Logistic regression is the most common method used to model binary response data. When the response is binary, it typically takes the form of

More information

MC3: Econometric Theory and Methods. Course Notes 4

MC3: Econometric Theory and Methods. Course Notes 4 University College London Department of Economics M.Sc. in Economics MC3: Econometric Theory and Methods Course Notes 4 Notes on maximum likelihood methods Andrew Chesher 25/0/2005 Course Notes 4, Andrew

More information

Non-linear panel data modeling

Non-linear panel data modeling Non-linear panel data modeling Laura Magazzini University of Verona laura.magazzini@univr.it http://dse.univr.it/magazzini May 2010 Laura Magazzini (@univr.it) Non-linear panel data modeling May 2010 1

More information

GENERALIZED LINEAR MODELS Joseph M. Hilbe

GENERALIZED LINEAR MODELS Joseph M. Hilbe GENERALIZED LINEAR MODELS Joseph M. Hilbe Arizona State University 1. HISTORY Generalized Linear Models (GLM) is a covering algorithm allowing for the estimation of a number of otherwise distinct statistical

More information

Analyzing Proportions

Analyzing Proportions Institut für Soziologie Eberhard Karls Universität Tübingen www.maartenbuis.nl The problem A proportion is bounded between 0 and 1, this means that: the effect of explanatory variables tends to be non-linear,

More information

Generalized Linear Models 1

Generalized Linear Models 1 Generalized Linear Models 1 STA 2101/442: Fall 2012 1 See last slide for copyright information. 1 / 24 Suggested Reading: Davison s Statistical models Exponential families of distributions Sec. 5.2 Chapter

More information

Linear model A linear model assumes Y X N(µ(X),σ 2 I), And IE(Y X) = µ(x) = X β, 2/52

Linear model A linear model assumes Y X N(µ(X),σ 2 I), And IE(Y X) = µ(x) = X β, 2/52 Statistics for Applications Chapter 10: Generalized Linear Models (GLMs) 1/52 Linear model A linear model assumes Y X N(µ(X),σ 2 I), And IE(Y X) = µ(x) = X β, 2/52 Components of a linear model The two

More information

Estimating and Interpreting Effects for Nonlinear and Nonparametric Models

Estimating and Interpreting Effects for Nonlinear and Nonparametric Models Estimating and Interpreting Effects for Nonlinear and Nonparametric Models Enrique Pinzón September 18, 2018 September 18, 2018 1 / 112 Objective Build a unified framework to ask questions about model

More information

Hierarchical Generalized Linear Models. ERSH 8990 REMS Seminar on HLM Last Lecture!

Hierarchical Generalized Linear Models. ERSH 8990 REMS Seminar on HLM Last Lecture! Hierarchical Generalized Linear Models ERSH 8990 REMS Seminar on HLM Last Lecture! Hierarchical Generalized Linear Models Introduction to generalized models Models for binary outcomes Interpreting parameter

More information

Outline. Linear OLS Models vs: Linear Marginal Models Linear Conditional Models. Random Intercepts Random Intercepts & Slopes

Outline. Linear OLS Models vs: Linear Marginal Models Linear Conditional Models. Random Intercepts Random Intercepts & Slopes Lecture 2.1 Basic Linear LDA 1 Outline Linear OLS Models vs: Linear Marginal Models Linear Conditional Models Random Intercepts Random Intercepts & Slopes Cond l & Marginal Connections Empirical Bayes

More information

Environmental Econometrics

Environmental Econometrics Environmental Econometrics Syngjoo Choi Fall 2008 Environmental Econometrics (GR03) Fall 2008 1 / 37 Syllabus I This is an introductory econometrics course which assumes no prior knowledge on econometrics;

More information

Using generalized structural equation models to fit customized models without programming, and taking advantage of new features of -margins-

Using generalized structural equation models to fit customized models without programming, and taking advantage of new features of -margins- Using generalized structural equation models to fit customized models without programming, and taking advantage of new features of -margins- Isabel Canette Principal Mathematician and Statistician StataCorp

More information

Compare Predicted Counts between Groups of Zero Truncated Poisson Regression Model based on Recycled Predictions Method

Compare Predicted Counts between Groups of Zero Truncated Poisson Regression Model based on Recycled Predictions Method Compare Predicted Counts between Groups of Zero Truncated Poisson Regression Model based on Recycled Predictions Method Yan Wang 1, Michael Ong 2, Honghu Liu 1,2,3 1 Department of Biostatistics, UCLA School

More information

A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods. Jeff Wooldridge IRP Lectures, UW Madison, August 2008

A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods. Jeff Wooldridge IRP Lectures, UW Madison, August 2008 A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods Jeff Wooldridge IRP Lectures, UW Madison, August 2008 1. Linear-in-Parameters Models: IV versus Control Functions 2. Correlated

More information

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Winter 2014 Instructor: Victor Aguirregabiria

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Winter 2014 Instructor: Victor Aguirregabiria ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Winter 2014 Instructor: Victor guirregabiria SOLUTION TO FINL EXM Monday, pril 14, 2014. From 9:00am-12:00pm (3 hours) INSTRUCTIONS:

More information

Ordered Response and Multinomial Logit Estimation

Ordered Response and Multinomial Logit Estimation Ordered Response and Multinomial Logit Estimation Quantitative Microeconomics R. Mora Department of Economics Universidad Carlos III de Madrid Outline Introduction 1 Introduction 2 3 Introduction The Ordered

More information

The 2010 Medici Summer School in Management Studies. William Greene Department of Economics Stern School of Business

The 2010 Medici Summer School in Management Studies. William Greene Department of Economics Stern School of Business The 2010 Medici Summer School in Management Studies William Greene Department of Economics Stern School of Business Econometric Models When There Are Unusual Events Part 5: Binary Outcomes Agenda General

More information

1 Outline. Introduction: Econometricians assume data is from a simple random survey. This is never the case.

1 Outline. Introduction: Econometricians assume data is from a simple random survey. This is never the case. 24. Strati cation c A. Colin Cameron & Pravin K. Trivedi 2006 These transparencies were prepared in 2002. They can be used as an adjunct to Chapter 24 (sections 24.2-24.4 only) of our subsequent book Microeconometrics:

More information

Econometrics Lecture 5: Limited Dependent Variable Models: Logit and Probit

Econometrics Lecture 5: Limited Dependent Variable Models: Logit and Probit Econometrics Lecture 5: Limited Dependent Variable Models: Logit and Probit R. G. Pierse 1 Introduction In lecture 5 of last semester s course, we looked at the reasons for including dichotomous variables

More information

Overdispersion Workshop in generalized linear models Uppsala, June 11-12, Outline. Overdispersion

Overdispersion Workshop in generalized linear models Uppsala, June 11-12, Outline. Overdispersion Biostokastikum Overdispersion is not uncommon in practice. In fact, some would maintain that overdispersion is the norm in practice and nominal dispersion the exception McCullagh and Nelder (1989) Overdispersion

More information

Generalized Linear Models for Non-Normal Data

Generalized Linear Models for Non-Normal Data Generalized Linear Models for Non-Normal Data Today s Class: 3 parts of a generalized model Models for binary outcomes Complications for generalized multivariate or multilevel models SPLH 861: Lecture

More information

NELS 88. Latent Response Variable Formulation Versus Probability Curve Formulation

NELS 88. Latent Response Variable Formulation Versus Probability Curve Formulation NELS 88 Table 2.3 Adjusted odds ratios of eighth-grade students in 988 performing below basic levels of reading and mathematics in 988 and dropping out of school, 988 to 990, by basic demographics Variable

More information

11. Generalized Linear Models: An Introduction

11. Generalized Linear Models: An Introduction Sociology 740 John Fox Lecture Notes 11. Generalized Linear Models: An Introduction Copyright 2014 by John Fox Generalized Linear Models: An Introduction 1 1. Introduction I A synthesis due to Nelder and

More information

Monday 7 th Febraury 2005

Monday 7 th Febraury 2005 Monday 7 th Febraury 2 Analysis of Pigs data Data: Body weights of 48 pigs at 9 successive follow-up visits. This is an equally spaced data. It is always a good habit to reshape the data, so we can easily

More information

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data Ronald Heck Class Notes: Week 8 1 Class Notes: Week 8 Probit versus Logit Link Functions and Count Data This week we ll take up a couple of issues. The first is working with a probit link function. While

More information

Nonlinear Regression Functions

Nonlinear Regression Functions Nonlinear Regression Functions (SW Chapter 8) Outline 1. Nonlinear regression functions general comments 2. Nonlinear functions of one variable 3. Nonlinear functions of two variables: interactions 4.

More information

Dynamic Panels. Chapter Introduction Autoregressive Model

Dynamic Panels. Chapter Introduction Autoregressive Model Chapter 11 Dynamic Panels This chapter covers the econometrics methods to estimate dynamic panel data models, and presents examples in Stata to illustrate the use of these procedures. The topics in this

More information

9 Generalized Linear Models

9 Generalized Linear Models 9 Generalized Linear Models The Generalized Linear Model (GLM) is a model which has been built to include a wide range of different models you already know, e.g. ANOVA and multiple linear regression models

More information

Addition to PGLR Chap 6

Addition to PGLR Chap 6 Arizona State University From the SelectedWorks of Joseph M Hilbe August 27, 216 Addition to PGLR Chap 6 Joseph M Hilbe, Arizona State University Available at: https://works.bepress.com/joseph_hilbe/69/

More information

Partial effects in fixed effects models

Partial effects in fixed effects models 1 Partial effects in fixed effects models J.M.C. Santos Silva School of Economics, University of Surrey Gordon C.R. Kemp Department of Economics, University of Essex 22 nd London Stata Users Group Meeting

More information

Discrete Choice Modeling

Discrete Choice Modeling [Part 4] 1/43 Discrete Choice Modeling 0 Introduction 1 Summary 2 Binary Choice 3 Panel Data 4 Bivariate Probit 5 Ordered Choice 6 Count Data 7 Multinomial Choice 8 Nested Logit 9 Heterogeneity 10 Latent

More information

Wageningen Summer School in Econometrics. The Bayesian Approach in Theory and Practice

Wageningen Summer School in Econometrics. The Bayesian Approach in Theory and Practice Wageningen Summer School in Econometrics The Bayesian Approach in Theory and Practice September 2008 Slides for Lecture on Qualitative and Limited Dependent Variable Models Gary Koop, University of Strathclyde

More information

Simultaneous Equations with Error Components. Mike Bronner Marko Ledic Anja Breitwieser

Simultaneous Equations with Error Components. Mike Bronner Marko Ledic Anja Breitwieser Simultaneous Equations with Error Components Mike Bronner Marko Ledic Anja Breitwieser PRESENTATION OUTLINE Part I: - Simultaneous equation models: overview - Empirical example Part II: - Hausman and Taylor

More information

Generalized linear models

Generalized linear models Generalized linear models Douglas Bates November 01, 2010 Contents 1 Definition 1 2 Links 2 3 Estimating parameters 5 4 Example 6 5 Model building 8 6 Conclusions 8 7 Summary 9 1 Generalized Linear Models

More information

A Journey to Latent Class Analysis (LCA)

A Journey to Latent Class Analysis (LCA) A Journey to Latent Class Analysis (LCA) Jeff Pitblado StataCorp LLC 2017 Nordic and Baltic Stata Users Group Meeting Stockholm, Sweden Outline Motivation by: prefix if clause suest command Factor variables

More information

Generalized Linear Models I

Generalized Linear Models I Statistics 203: Introduction to Regression and Analysis of Variance Generalized Linear Models I Jonathan Taylor - p. 1/16 Today s class Poisson regression. Residuals for diagnostics. Exponential families.

More information

Workshop for empirical trade analysis. December 2015 Bangkok, Thailand

Workshop for empirical trade analysis. December 2015 Bangkok, Thailand Workshop for empirical trade analysis December 2015 Bangkok, Thailand Cosimo Beverelli (WTO) Rainer Lanz (WTO) Content (I) a. Classical regression model b. Introduction to panel data analysis c. Basic

More information

Lecture 4: Linear panel models

Lecture 4: Linear panel models Lecture 4: Linear panel models Luc Behaghel PSE February 2009 Luc Behaghel (PSE) Lecture 4 February 2009 1 / 47 Introduction Panel = repeated observations of the same individuals (e.g., rms, workers, countries)

More information

Mixed models in R using the lme4 package Part 5: Generalized linear mixed models

Mixed models in R using the lme4 package Part 5: Generalized linear mixed models Mixed models in R using the lme4 package Part 5: Generalized linear mixed models Douglas Bates 2011-03-16 Contents 1 Generalized Linear Mixed Models Generalized Linear Mixed Models When using linear mixed

More information

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data July 2012 Bangkok, Thailand Cosimo Beverelli (World Trade Organization) 1 Content a) Classical regression model b)

More information