GOV 2001 / 1002 / Stat E-200 Section 8: Ordered Probit and Zero-Inflated Logit
Solé Prillaman
Harvard University
March 25, 2015

LOGISTICS

Reading Assignment: Becker and Kennedy (1992); Harris and Zhao (2007), sections 1 and 2; and UPM ch. 5.4-5.10.

Re-replication: Due by 6pm Wednesday, April 1 on Canvas.

RE-REPLICATION

Re-replication is due April 1 at 6pm. You will receive all of the replication files from another team; it is your responsibility to hand off your own replication files. (Re-replication teams are posted on Canvas.) Go through the replication code and try to improve it in any way you can. Provide a short write-up of your thoughts on the replication and ideas for their final paper. Aim to be helpful, not critical!

OUTLINE

- The Ordered Probit Model
- Zero-Inflated Logistic Regression
- Binomial Model

ORDERED CATEGORICAL VARIABLES

Suppose our dependent variable is an ordered scale. For example:
- Customers tell you how much they like your product on a 5-point scale from "a lot" to "very little."
- Voters identify their ideology on a 7-point scale: very liberal, moderately liberal, somewhat liberal, neutral, somewhat conservative, moderately conservative, and very conservative.
- Foreign businesses rate their host country from "not corrupt" to "very corrupt."

What are the problems with using a linear model to study these processes? (For one thing, a linear model assumes the categories are equally spaced and can predict values outside the observed scale.)

AN EXAMPLE: COOPERATION ON SANCTIONS

Lisa Martin (1992) asks: what determines cooperation on sanctions? Her dependent variable, coop, measures cooperation on a four-point scale.

ORDERED PROBIT: COOPERATION ON SANCTIONS

Load the data in R:

library(Zelig)
data(sanction)
head(sanction)

  mil coop target import export cost num         ncost
1   1    4      3      1      1    4  15    major loss
2   0    2      3      0      1    3   4   modest loss
3   0    1      3      1      0    2   1 little effect
4   1    1      3      1      1    2   1 little effect
5   0    1      3      1      1    2   1 little effect
6   0    1      3      0      1    2   1 little effect

ORDERED PROBIT: COOPERATION ON SANCTIONS

We're going to look at three covariates:
- target: a measure of the economic health and political stability of the target country
- cost: a measure of the cost of the sanctions
- mil: a measure of whether or not there is military action in addition to the sanction

MAXIMUM LIKELIHOOD ESTIMATION

Steps to finding the MLE:
1. Write out the model.
2. Calculate the likelihood, L(θ | y), for all observations.
3. Take the log of the likelihood, ℓ(θ | y).
4. Plug in the systematic component for θ_i.
5. Bring in observed data.
6. Maximize ℓ(θ | y) with respect to θ and confirm that this is a maximum.
7. Find the variance of your estimate.
8. Calculate quantities of interest.

1. THE ORDERED PROBIT MODEL

How can we derive the ordered probit? Suppose there is a latent (unobserved) variable Y*_i ~ f_N(y*_i | µ_i, 1). This latent distribution has a systematic component, µ_i = x_iβ. The realizations y*_i are completely unobserved; what you do observe is which threshold parameters y*_i falls between.

The model:

Y*_i ~ f_N(y*_i | µ_i, 1)
µ_i = x_iβ

1. THE ORDERED PROBIT MODEL

Although y*_i is unobserved, we do observe which of the categories it falls into, i.e., which threshold parameters y*_i lies between. The threshold parameters are τ_j for j = 1, ..., m − 1 (with τ_0 = −∞ and τ_m = +∞).

[Figure: the standard normal density of Y*_i, with vertical lines at the thresholds τ_j dividing it into the observed categories.]

1. THE ORDERED PROBIT MODEL

In equation form,

y_ij = 1 if τ_{j−1} < y*_i ≤ τ_j
       0 otherwise

What does Y look like in our example? Each row i is an observation and each column j is a category:

Y = [ y_{1,1} y_{1,2} y_{1,3} y_{1,4}
      y_{2,1} y_{2,2} y_{2,3} y_{2,4}
      y_{3,1} y_{3,2} y_{3,3} y_{3,4}
      y_{4,1} y_{4,2} y_{4,3} y_{4,4} ]

1. THE ORDERED PROBIT MODEL

In equation form,

y_ij = 1 if τ_{j−1} < y*_i ≤ τ_j
       0 otherwise

Our stochastic component for y_ij is still Bernoulli, so:

Pr(Y_i | π_i) = π_{i1}^{y_{i1}} · π_{i2}^{y_{i2}} · π_{i3}^{y_{i3}} · ...

where Σ_{j=1}^{m} π_ij = 1.

1. THE ORDERED PROBIT MODEL

Like the regular probit and logit, the key here is deriving π_ij. You use this to derive the probability that y*_i will fall into category j:

π_ij = Pr(Y_ij = 1)
     = Pr(τ_{j−1} < Y*_i ≤ τ_j)
     = ∫_{τ_{j−1}}^{τ_j} f_N(y*_i | µ_i, σ²) dy*_i
     = ∫_{τ_{j−1}}^{τ_j} f_N(y*_i | µ_i = x_iβ, σ² = 1) dy*_i
     = F_N(τ_j | µ = x_iβ, σ² = 1) − F_N(τ_{j−1} | µ = x_iβ, σ² = 1)
     = Φ(τ_j − x_iβ) − Φ(τ_{j−1} − x_iβ)

where F_N is the cumulative distribution function of Y*_i and Φ is the CDF of the standard normal.

1. ORDERED PROBIT MODEL

The latent model:
1. Y*_i ~ f_stn(y*_i | µ_i)
2. µ_i = X_iβ
3. Y*_i and Y*_j are independent for all i ≠ j.

The observed model:
1. Y_ij ~ f_bern(y_ij | π_ij)
2. π_ij = Φ(τ_j − X_iβ) − Φ(τ_{j−1} − X_iβ)

Note: for the ordered logit, Y*_i is distributed logistic and

π_ij = e^{τ_j − X_iβ} / (1 + e^{τ_j − X_iβ}) − e^{τ_{j−1} − X_iβ} / (1 + e^{τ_{j−1} − X_iβ})
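As a quick sanity check (this sketch is mine, not from the slides), we can compute the category probabilities for a single observation with made-up values of β and τ, using pnorm for the probit version and plogis for the logit version:

b.ex   <- c(0.5, -0.2)   # hypothetical slopes
tau.ex <- c(-1, 0.5, 2)  # hypothetical thresholds (m = 4 categories)
x.ex   <- c(1.2, 0.7)    # one observation's covariates (no intercept)
mu.ex  <- sum(x.ex * b.ex)
cuts   <- c(-Inf, tau.ex, Inf)
# Each category probability is a difference of CDFs at adjacent thresholds
pi.probit <- pnorm(cuts[-1] - mu.ex) - pnorm(cuts[-length(cuts)] - mu.ex)
pi.logit  <- plogis(cuts[-1] - mu.ex) - plogis(cuts[-length(cuts)] - mu.ex)
sum(pi.probit)  # both sets of probabilities sum to 1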

MAXIMUM LIKELIHOOD ESTIMATION

Steps to finding the MLE:
1. Write out the model.
2. Calculate the likelihood, L(θ | y), for all observations.
3. Take the log of the likelihood, ℓ(θ | y).
4. Plug in the systematic component for θ_i.
5. Bring in observed data.
6. Maximize ℓ(θ | y) with respect to θ and confirm that this is a maximum.
7. Find the variance of your estimate.
8. Calculate quantities of interest.

2-4. ORDERED PROBIT: DERIVING THE LIKELIHOOD

We want to generalize to all observations and all categories:

L(τ, β | y) = ∏_{i=1}^{n} Pr(Y_i | π_i)
            = ∏_{i=1}^{n} π_{i1}^{y_{i1}} · π_{i2}^{y_{i2}} · π_{i3}^{y_{i3}} · ...
            = ∏_{i=1}^{n} ∏_{j=1}^{m} [π_ij]^{y_ij}
            = ∏_{i=1}^{n} ∏_{j=1}^{m} [Φ(τ_j − X_iβ) − Φ(τ_{j−1} − X_iβ)]^{y_ij}

3-4. ORDERED PROBIT: DERIVING THE LIKELIHOOD

Then we take the log to get the log-likelihood:

ℓ(τ, β | y) = ln( ∏_{i=1}^{n} ∏_{j=1}^{m} [Φ(τ_j − X_iβ) − Φ(τ_{j−1} − X_iβ)]^{y_ij} )
            = Σ_{i=1}^{n} Σ_{j=1}^{m} y_ij ln[Φ(τ_j − X_iβ) − Φ(τ_{j−1} − X_iβ)]

How many parameters are there to estimate in this model? There are m − 1 threshold parameters plus the k slope coefficients: m − 1 + k in total.

3-4. ORDERED PROBIT: DERIVING THE LIKELIHOOD

Create the log-likelihood function:

ll.oprobit <- function(par, Z, X){
  beta <- par[1:ncol(X)]
  tau <- par[(ncol(X)+1):length(par)]
  ystarmu <- X %*% beta
  m <- length(tau) + 1
  probs <- cprobs <- matrix(nrow=length(ystarmu), ncol=m)
  # Cumulative probabilities at each threshold
  for (j in 1:(m-1)){
    cprobs[,j] <- pnorm(tau[j] - ystarmu)
  }
  probs[,m] <- 1 - cprobs[,m-1]  # top category
  probs[,1] <- cprobs[,1]        # bottom category
  for (j in 2:(m-1)){
    probs[,j] <- cprobs[,j] - cprobs[,(j-1)]  # middle categories
  }
  # Z is a logical matrix, so probs[Z] picks out each observation's
  # observed-category probability
  sum(log(probs[Z]))
}

WHY DOES X NOT CONTAIN AN INTERCEPT?

In the binary probit model, we have one cutoff point, say τ_1:

Pr(Y = 1 | Xβ) = 1 − Pr(Y = 0 | Xβ) = 1 − Φ(τ_1 − Xβ)

Here, τ_1 is both the cutoff point and the intercept. By including an intercept in Xβ we are setting τ_1 to zero.

WHY DOES X NOT CONTAIN AN INTERCEPT?

Now in the ordered probit model, we have more than one cutoff point:

P(y_i1 = 1) = Pr(Xβ ≤ τ_1)
P(y_i2 = 1) = Pr(τ_1 < Xβ ≤ τ_2)

If we included an intercept β_0:

P(y_i1 = 1) = Pr(Xβ + β_0 ≤ τ_1)
P(y_i2 = 1) = Pr(τ_1 < Xβ + β_0 ≤ τ_2)

Or equivalently we could write this:

P(y_i1 = 1) = Pr(Xβ ≤ τ_1 − β_0)
P(y_i2 = 1) = Pr(τ_1 − β_0 < Xβ ≤ τ_2 − β_0)

By estimating a cutoff point, we are estimating an intercept.
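A quick numerical check of this point (my sketch, with made-up numbers): shifting an intercept b0 out of the linear predictor and into the thresholds leaves every category probability unchanged, so the two are not separately identified.

xb   <- 0.8             # hypothetical value of X beta
b0   <- 0.3             # hypothetical intercept
taus <- c(-1, 0.5, 2)   # hypothetical thresholds
p1 <- diff(pnorm(c(-Inf, taus, Inf) - (xb + b0)))  # intercept in the model
p2 <- diff(pnorm(c(-Inf, taus - b0, Inf) - xb))    # intercept folded into the taus
all.equal(p1, p2)  # TRUE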

MAXIMUM LIKELIHOOD ESTIMATION

Steps to finding the MLE:
1. Write out the model.
2. Calculate the likelihood, L(θ | y), for all observations.
3. Take the log of the likelihood, ℓ(θ | y).
4. Plug in the systematic component for θ_i.
5. Bring in observed data.
6. Maximize ℓ(θ | y) with respect to θ and confirm that this is a maximum.
7. Find the variance of your estimate.
8. Calculate quantities of interest.

5. ORDERED PROBIT: EXAMPLE DATA

Make a matrix for the y's indicating which category each observation is in:

y <- sanction$coop
# Find all of the unique categories of y
y0 <- sort(unique(y))
m <- length(y0)
Z <- matrix(NA, nrow(sanction), m)
# Fill in our matrix with logical values indicating whether
# the observed value is in each category.
# Remember R can treat logical values as 0/1s.
for (j in 1:m){ Z[,j] <- y == y0[j] }
X <- cbind(sanction$target, sanction$cost, sanction$mil)
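Before optimizing, it is worth verifying the bookkeeping (my addition): every row of Z should flag exactly one category.

all(rowSums(Z) == 1)  # TRUE: each observation is in exactly one category
table(y)              # how the cases spread across the four categories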

MAXIMUM LIKELIHOOD ESTIMATION

Steps to finding the MLE:
1. Write out the model.
2. Calculate the likelihood, L(θ | y), for all observations.
3. Take the log of the likelihood, ℓ(θ | y).
4. Plug in the systematic component for θ_i.
5. Bring in observed data.
6. Maximize ℓ(θ | y) with respect to θ and confirm that this is a maximum.
7. Find the variance of your estimate.
8. Calculate quantities of interest.

6-7. ORDERED PROBIT: CALCULATE MLE

Optimize:

par <- c(rep(1,3), 0, 1, 2)  # starting values: 3 betas, then 3 taus
out <- optim(par, ll.oprobit, Z=Z, X=X, method="BFGS",
             control=list(fnscale=-1), hessian=TRUE)
out$par
[1] -0.1602698 0.7672435 0.6389932 1.1196181 1.8708925 2.9159060
sqrt(diag(solve(-out$hessian)))
[1] 0.1879563 0.1907797 0.4165843 0.4434676 0.4645836 0.5249454
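As a cross-check (my addition, assuming the MASS package is installed), polr fits the same ordered probit; its slope and threshold estimates should match our optim results up to numerical error:

library(MASS)
polr.out <- polr(factor(coop) ~ target + cost + mil,
                 data = sanction, method = "probit", Hess = TRUE)
coef(polr.out)  # slopes: compare to out$par[1:3]
polr.out$zeta   # thresholds: compare to out$par[4:6]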

ORDERED PROBIT: USING ZELIG

We estimate the model using Zelig and the oprobit call:

z.out <- zelig(factor(coop) ~ target + cost + mil,
               model="oprobit", data=sanction)

Note that you could use model = "ologit" for the ordered logit and get similar inferences.

ORDERED PROBIT: USING ZELIG

What does the output look like?

z.out
Call:
zelig(formula = factor(coop) ~ 1 + target + cost + mil,
      model = "oprobit", data = sanction)

Coefficients:
            Value Std. Error    t value
target -0.1602725  0.1879563 -0.8527113
cost    0.7672444  0.1907797  4.0216242
mil     0.6389931  0.4165843  1.5338865

Intercepts:
     Value  Std. Error t value
1|2  1.1196 0.4435     2.5247
2|3  1.8709 0.4646     4.0270
3|4  2.9159 0.5249     5.5547

These are a little hard to interpret, so we turn to our bag of tricks...

MAXIMUM LIKELIHOOD ESTIMATION

Steps to finding the MLE:
1. Write out the model.
2. Calculate the likelihood, L(θ | y), for all observations.
3. Take the log of the likelihood, ℓ(θ | y).
4. Plug in the systematic component for θ_i.
5. Bring in observed data.
6. Maximize ℓ(θ | y) with respect to θ and confirm that this is a maximum.
7. Find the variance of your estimate.
8. Calculate quantities of interest.

8. ORDERED PROBIT: USING ZELIG FOR QOIS

Suppose we want to compare cooperation when there is or is not military action in addition to the sanction:

x.low <- setx(z.out, mil = 0)
x.high <- setx(z.out, mil = 1)

8. ORDERED PROBIT: USING ZELIG FOR QOIS

Now we can simulate values using these hypothetical military involvements:

s.out <- sim(z.out, x = x.low, x1 = x.high)
summary(s.out)

Model: oprobit
Number of simulations: 1000

Values of X
  (Intercept)   target     cost mil
1           1 2.141026 1.807692   0

Values of X1
  (Intercept)   target     cost mil
1           1 2.141026 1.807692   1

Expected Values: P(Y=j|X)
        mean         sd        2.5%      97.5%
1 0.53319553 0.06030198 0.420702369 0.65622723
2 0.26437778 0.05225064 0.173653596 0.36940997
3 0.17097079 0.04377157 0.093288499 0.26643960
4 0.03145590 0.01714663 0.007172707 0.07420374

8. ORDERED PROBIT: USING ZELIG FOR QOIS

And then you can use the plot(s.out) command to visualize the results.

[Figure: three panels. Predicted Values Y|X, the percentage of simulations in each category Y=1 through Y=4; Expected Values P(Y=j|X), densities of the simulated probabilities; and First Differences P(Y=j|X1) − P(Y=j|X), densities of the simulated differences.]

OUTLINE

- The Ordered Probit Model
- Zero-Inflated Logistic Regression
- Binomial Model

WHAT IS ZERO-INFLATION?

Let's return to binary data. What if we knew that something in our data was mismeasured? For example, what if we thought that some of our data were systematically zero rather than randomly zero? This could happen when:
1. Some data are spoiled or lost.
2. Survey respondents answer zero on an ordered survey question just to get it done.

If our data are mismeasured in some systematic way, our estimates will be off.

A WORKING EXAMPLE: FISHING

You're trying to figure out the probability of catching a fish in a lake from a survey. People were asked:
- How many children were in the group
- How many people were in the group
- Whether they caught a fish

A WORKING EXAMPLE: FISHING

The problem is, some people didn't even fish! These people have systematically zero fish.

ZERO-INFLATED LOGIT MODEL

We're going to assume that whether or not the person fished is the outcome of a Bernoulli trial:

Y_i = 0         with probability ψ_i
      Logistic  with probability 1 − ψ_i

ψ_i is the probability that you do not fish. This is a mixture model because our data are a mix of these two types of groups, each with its own data generation process.

ZERO-INFLATED LOGIT MODEL

Given that you fished, the logistic model is what we have done before:
1. Y_i ~ f_bern(y_i | π_i)
2. π_i = 1 / (1 + e^{−X_iβ})
3. Y_i and Y_j are independent for all i ≠ j.

So the probability that Y is 0:

P(Y_i = 0 | fished) = 1 − 1/(1 + e^{−X_iβ})

and the probability that Y is 1:

P(Y_i = 1 | fished) = 1/(1 + e^{−X_iβ})

ZERO-INFLATED LOGIT MODEL

Given that you did not fish, what is the model? The probability that Y is 0:

P(Y_i = 0 | not fished) = 1

and the probability that Y is 1:

P(Y_i = 1 | not fished) = 0

ZERO-INFLATED LOGIT MODEL

Let's say that there is a ψ_i probability that you did not fish. We can write out the distribution of Y_i as (stochastic component):

P(Y_i = y_i | β, ψ_i) = ψ_i + (1 − ψ_i)(1 − 1/(1 + e^{−X_iβ}))  if y_i = 0
                        (1 − ψ_i) · 1/(1 + e^{−X_iβ})           if y_i = 1

So, we can rewrite this as:

P(Y_i | β, ψ_i) = [ψ_i + (1 − ψ_i)(1 − 1/(1 + e^{−X_iβ}))]^{1−y_i} · [(1 − ψ_i) · 1/(1 + e^{−X_iβ})]^{y_i}

And we can put covariates on ψ (systematic component):

ψ_i = 1/(1 + e^{−z_iγ})
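To make the mixture concrete, here is a small simulation of this data generating process (my sketch; the coefficient values are made up):

set.seed(2138)
n  <- 1000
z1 <- rnorm(n)                   # covariate for the "did not fish" equation
w1 <- rnorm(n)                   # covariate for the catch equation
psi.sim <- plogis(0.5 * z1)      # Pr(did not fish); gamma = 0.5 is hypothetical
pie.sim <- plogis(1 + 0.8 * w1)  # Pr(catch | fished); beta = (1, 0.8) is hypothetical
fished  <- rbinom(n, 1, 1 - psi.sim)       # first Bernoulli: did the group fish?
y.sim   <- fished * rbinom(n, 1, pie.sim)  # structural zeros whenever fished == 0
mean(y.sim == 0)  # more zeros than the logit part alone would produce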

ZERO-INFLATED LOGIT: DERIVING THE LIKELIHOOD

The likelihood function is proportional to the probability of Y_i:

L(β, γ | Y_i) ∝ P(Y_i | β, γ)
  = [ψ_i + (1 − ψ_i)(1 − 1/(1 + e^{−X_iβ}))]^{1−Y_i} · [(1 − ψ_i) · 1/(1 + e^{−X_iβ})]^{Y_i}
  = [1/(1 + e^{−z_iγ}) + (1 − 1/(1 + e^{−z_iγ}))(1 − 1/(1 + e^{−X_iβ}))]^{1−Y_i}
    × [(1 − 1/(1 + e^{−z_iγ})) · 1/(1 + e^{−X_iβ})]^{Y_i}

ZERO-INFLATED LOGIT: DERIVING THE LIKELIHOOD

Multiplying over all observations we get:

L(β, γ | y) = ∏_{i=1}^{n} [1/(1 + e^{−z_iγ}) + (1 − 1/(1 + e^{−z_iγ}))(1 − 1/(1 + e^{−X_iβ}))]^{1−Y_i}
              × [(1 − 1/(1 + e^{−z_iγ})) · 1/(1 + e^{−X_iβ})]^{Y_i}

Taking the log we get:

ℓ(β, γ) = Σ_{i=1}^{n} { (1 − Y_i) ln[1/(1 + e^{−z_iγ}) + (1 − 1/(1 + e^{−z_iγ}))(1 − 1/(1 + e^{−X_iβ}))]
          + Y_i ln[(1 − 1/(1 + e^{−z_iγ})) · 1/(1 + e^{−X_iβ})] }

How many parameters do we need to estimate? One β for each column of X and one γ for each column of Z.

LET'S PROGRAM THIS IN R

Load and get the data ready:

fish <- read.table("http://www.ats.ucla.edu/stat/data/fish.csv",
                   sep=",", header=TRUE)
X <- fish[c("child", "persons")]
Z <- fish[c("persons")]
X <- as.matrix(cbind(1, X))
Z <- as.matrix(cbind(1, Z))
y <- ifelse(fish$count > 0, 1, 0)

LET'S PROGRAM THIS IN R

Write out the log-likelihood function:

ll.zilogit <- function(par, X, Z, y){
  beta <- par[1:ncol(X)]
  gamma <- par[(ncol(X)+1):length(par)]
  psi <- 1/(1 + exp(-Z %*% gamma))  # Pr(structural zero)
  pie <- 1/(1 + exp(-X %*% beta))   # Pr(success | not a structural zero)
  sum(y*log((1-psi)*pie) + (1-y)*log(psi + (1-psi)*(1-pie)))
}

LET'S PROGRAM THIS IN R

Optimize to get the results:

par <- rep(1, (ncol(X) + ncol(Z)))
out <- optim(par, ll.zilogit, Z=Z, X=X, y=y, method="BFGS",
             control=list(fnscale=-1), hessian=TRUE)
out$par
[1] 0.9215608 -2.6952940 0.5814261 1.5657467 -1.2020805
varcv.par <- solve(-out$hessian)
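Standard errors then come from the inverse of the negative Hessian, mirroring what we did for the ordered probit (this snippet is my addition):

se.par <- sqrt(diag(varcv.par))  # varcv.par = solve(-out$hessian) from above
cbind(estimate = out$par, se = se.par)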

PLOTTING TO SEE THE RELATIONSHIP

These numbers don't mean a lot to us, so we can plot the predicted probabilities of a group having not fished (i.e., predict ψ). First, we have to simulate our gammas:

library(mvtnorm)
sim.pars <- rmvnorm(10000, out$par, varcv.par)
# Subset to only the parameters we need (gammas).
# Better to simulate all of them, though.
sim.z <- sim.pars[,(ncol(X)+1):length(par)]

PLOTTING TO SEE THE RELATIONSHIP

We then generate predicted probabilities of not fishing for different sized groups:

person.vec <- seq(1, 4)
Zcovariates <- cbind(1, person.vec)
exp.holder <- matrix(NA, ncol=4, nrow=10000)
for(i in 1:length(person.vec)){
  exp.holder[,i] <- 1/(1 + exp(-Zcovariates[i,] %*% t(sim.z)))
}
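Before plotting, a quick numerical summary of the same simulations (my addition) gives point estimates and 95% intervals by group size:

colMeans(exp.holder)  # mean Pr(structural zero) for 1- to 4-person groups
apply(exp.holder, 2, quantile, probs = c(0.025, 0.975))  # 95% simulation intervals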

PLOTTING TO SEE THE RELATIONSHIP

Using these numbers, we can plot the densities of these probabilities to get a sense of the probability and the uncertainty:

plot(density(exp.holder[,4]), col="blue", xlim=c(0,1),
     main="Probability of a Structural Zero", xlab="Probability")
lines(density(exp.holder[,3]), col="red")
lines(density(exp.holder[,2]), col="green")
lines(density(exp.holder[,1]), col="black")
legend(.7, 12,
       legend=c("One Person", "Two People", "Three People", "Four People"),
       col=c("black", "green", "red", "blue"), lty=1)

PLOTTING TO SEE THE RELATIONSHIP

[Figure: "Probability of a Structural Zero." Density curves of the simulated probability of not fishing for one-, two-, three-, and four-person groups.]

OUTLINE

- The Ordered Probit Model
- Zero-Inflated Logistic Regression
- Binomial Model

BINOMIAL MODEL

Suppose our dependent variable is the number of successes in a series of independent trials. For example:
- The number of heads in 10 coin flips.
- The number of times you voted in the last six elections.
- The number of Supreme Court cases the government won in the last ten decisions.

We can use a generalization of the binary model to study these processes.

BINOMIAL MODEL

Stochastic component:

Y_i ~ Binomial(y_i | π_i)

P(Y_i = y_i | π_i) = (N choose y_i) π_i^{y_i} (1 − π_i)^{N − y_i}

- π_i^{y_i}: there are y_i successes, each with probability π_i
- (1 − π_i)^{N − y_i}: there are N − y_i failures, each with probability 1 − π_i
- (N choose y_i): the number of ways to distribute y_i successes in N trials; the order of the successes does not matter
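As a check on the algebra (my addition, with hypothetical N = 10 and π = 0.3), the pmf written above matches R's built-in dbinom:

y.check <- 0:10
by.hand <- choose(10, y.check) * 0.3^y.check * (1 - 0.3)^(10 - y.check)
all.equal(by.hand, dbinom(y.check, size = 10, prob = 0.3))  # TRUE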

BINOMIAL MODEL

Systematic component:

π_i = 1 / (1 + e^{−x_iβ})

Why? π_i is a probability, so it must lie between 0 and 1, and the inverse logit maps any real value of x_iβ into that interval.
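To see the shape of this mapping (my addition), plot the inverse logit; every real value of x_iβ is squeezed into (0, 1):

curve(plogis(x), from = -6, to = 6,
      xlab = "x_i beta", ylab = "pi_i",
      main = "Inverse logit maps the real line into (0, 1)")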

BINOMIAL MODEL

Derive the likelihood:

L(π_i | y_i) = P(y_i | π_i)
             = ∏_{i=1}^{n} (N choose y_i) π_i^{y_i} (1 − π_i)^{N − y_i}

ln L(π_i | y_i) = Σ_{i=1}^{n} [ ln(N choose y_i) + y_i ln π_i + (N − y_i) ln(1 − π_i) ]
                ∝ Σ_{i=1}^{n} [ y_i ln π_i + (N − y_i) ln(1 − π_i) ]

since ln(N choose y_i) does not depend on π_i.

BINOMIAL MODEL

We can operationalize this in R by coding the log-likelihood up ourselves. First, let's make up some data to play with:

library(boot)  # for inv.logit()
x1 <- rnorm(1000, 0, 1)
x2 <- rnorm(1000, 9, .5)
pi <- inv.logit(-5 + .4*x1 + .6*x2)
y <- rbinom(1000, 10, pi)

BINOMIAL MODEL

Write out the log-likelihood function:

ll.binom <- function(par, N, X, y){
  pi <- 1/(1 + exp(-1 * X %*% par))
  out <- sum(y * log(pi) + (N - y) * log(1 - pi))
  return(out)
}

BINOMIAL MODEL

Optimize to get the results:

my.optim <- optim(par = c(0,0,0), fn = ll.binom, y = y,
                  X = cbind(1, x1, x2), N = 10, method = "BFGS",
                  control = list(fnscale=-1), hessian = TRUE)
my.optim$par
[1] -4.6132799 0.3836413 0.5590359

Given that pi <- inv.logit(-5 + .4*x1 + .6*x2), the output doesn't look too bad.
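As a final cross-check (my addition), glm fits the same binomial logit when handed successes and failures out of N = 10 trials; its coefficients should be nearly identical to my.optim$par:

glm.out <- glm(cbind(y, 10 - y) ~ x1 + x2, family = binomial)
coef(glm.out)  # compare to my.optim$par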