GOV 2001/ 1002/ E-2001 Section 8 Ordered Probit and Zero-Inflated Logit
Solé Prillaman
Harvard University
March 26
LOGISTICS

- Reading Assignment: Becker and Kennedy (1992), Harris and Zhao (2007) (sections 1 and 2), and UPM ch
- Re-replication: Due by 6pm Wednesday, April 2 on Canvas.
- Class Party: Tentatively April 19 at Gary's house.
RE-REPLICATION

Re-replication is due April 2 at 6pm.

- You will receive all of the replication files from another team.
- It is your responsibility to hand off your replication files. (Re-replication teams are posted on Canvas.)
- Go through the replication code and try to improve it in any way you can.
- Provide a short write-up of thoughts on the replication and ideas for their final paper.
- Aim to be helpful, not critical!
OUTLINE

- The Ordered Probit Model
- Zero-Inflated Logistic Regression
- Binomial Model
ORDERED CATEGORICAL VARIABLES

Suppose our dependent variable is an ordered scale. For example:

- Customers tell you how much they like your product on a 5-point scale from a lot to very little.
- Voters identify their ideology on a 7-point scale: very liberal, moderately liberal, somewhat liberal, neutral, somewhat conservative, moderately conservative, and very conservative.
- Foreign businesses rate their host country from not corrupt to very corrupt.

What are the problems with using a linear model to study these processes?
ORDERED PROBIT: THE INTUITION

How can we derive the ordered probit?

- Suppose there is a latent (unobserved) data distribution, $Y_i^* \sim f_{stn}(y_i^* \mid \mu_i)$.
- This latent distribution has a systematic component, $\mu_i = x_i \beta$.
- Any realizations, $y_i^*$, are completely unobserved.
- What you do observe is whether $y_i^*$ is between some threshold parameters.

Threshold parameters: $\tau_j$ for $j = 1, \ldots, m$.

Although $y_i^*$ is unobserved, we do observe which of the categories it falls into.
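ORDERED PROBIT: DERIVING THE LIKELIHOOD

In equation form,

$$ y_{ij} = \begin{cases} 1 & \text{if } \tau_{j-1} < y_i^* \le \tau_j \\ 0 & \text{otherwise} \end{cases} $$

Our stochastic component is still Bernoulli:

$$ \Pr(Y_{ij} \mid \pi) = \pi_{i1}^{y_{i1}} \, \pi_{i2}^{y_{i2}} \, \pi_{i3}^{y_{i3}} \cdots $$

where $\sum_{j=1}^{M} \pi_{ij} = 1$.

What does $Y$ look like?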
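ORDERED PROBIT: DERIVING THE LIKELIHOOD

Like the regular probit and logit, the key here is deriving $\pi_{ij}$. You use this to derive the probability that $y_i^*$ will fall into category $j$:

$$
\begin{aligned}
\pi_{ij} = \Pr(Y_{ij} = 1) &= \Pr(\tau_{j-1} < y_i^* \le \tau_j) \\
&= \int_{\tau_{j-1}}^{\tau_j} f(y_i^* \mid \mu_i) \, dy_i^* \\
&= \int_{\tau_{j-1}}^{\tau_j} f(y_i^* \mid x_i \beta) \, dy_i^* \\
&= F(\tau_j \mid x_i \beta) - F(\tau_{j-1} \mid x_i \beta) \\
&= \Phi(\tau_j - x_i \beta) - \Phi(\tau_{j-1} - x_i \beta)
\end{aligned}
$$

where $F$ is the cumulative distribution function of $Y_i^*$ and $\Phi$ is the CDF of the standardized normal.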
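As a small numerical illustration of the last line of this derivation, the sketch below computes the category probabilities for a single observation in R. The cutoffs `tau` and the linear predictor `xb` are made-up values chosen for illustration, not estimates from any data.

```r
# Hypothetical cutoffs and linear predictor (illustrative values only)
tau <- c(-1, 0, 1.5)   # three finite cutoffs, so four ordered categories
xb  <- 0.3             # x_i %*% beta for a single observation

# Pad with -Inf and Inf so every category is a difference of two CDF values
cuts  <- c(-Inf, tau, Inf)
probs <- pnorm(cuts[-1] - xb) - pnorm(cuts[-length(cuts)] - xb)

probs        # probability of each of the four categories
sum(probs)   # the category probabilities sum to one
```

Padding the cutoffs with -Inf and Inf avoids special-casing the first and last categories.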
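ORDERED PROBIT

The latent model:

1. $Y_i^* \sim f_{stn}(y_i^* \mid \mu_i)$.
2. $\mu_i = X_i \beta$
3. $Y_i^*$ and $Y_j^*$ are independent for all $i \ne j$.

The observed model:

1. $Y_{ij} \sim f_{bern}(y_{ij} \mid \pi_{ij})$.
2. $\pi_{ij} = \Phi(\tau_j - X_i \beta) - \Phi(\tau_{j-1} - X_i \beta)$

Note: for the ordered logit, $Y_i^*$ is distributed logistic and

$$ \pi_{ij} = \frac{e^{\tau_j - X_i \beta}}{1 + e^{\tau_j - X_i \beta}} - \frac{e^{\tau_{j-1} - X_i \beta}}{1 + e^{\tau_{j-1} - X_i \beta}} $$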
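ORDERED PROBIT: DERIVING THE LIKELIHOOD

We want to generalize to all observations and all categories:

$$
\begin{aligned}
L(\tau, \beta \mid y) &= \prod_{i=1}^{n} \Pr(Y_{ij} \mid \pi) \\
&= \prod_{i=1}^{n} \pi_{i1}^{y_{i1}} \, \pi_{i2}^{y_{i2}} \, \pi_{i3}^{y_{i3}} \cdots \\
&= \prod_{i=1}^{n} \prod_{j=1}^{m} [\pi_{ij}]^{y_{ij}} \\
&= \prod_{i=1}^{n} \prod_{j=1}^{m} \left[ \Phi(\tau_j - X_i \beta) - \Phi(\tau_{j-1} - X_i \beta) \right]^{y_{ij}}
\end{aligned}
$$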
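ORDERED PROBIT: DERIVING THE LIKELIHOOD

Then we take the log to get the log-likelihood:

$$
\begin{aligned}
\ell(\tau, \beta \mid y) &= \ln \left( \prod_{i=1}^{n} \prod_{j=1}^{m} \left[ \Phi(\tau_j - X_i \beta) - \Phi(\tau_{j-1} - X_i \beta) \right]^{y_{ij}} \right) \\
&= \sum_{i=1}^{n} \sum_{j=1}^{m} y_{ij} \ln \left[ \Phi(\tau_j - X_i \beta) - \Phi(\tau_{j-1} - X_i \beta) \right]
\end{aligned}
$$

How many parameters are there to estimate in this model? $j + k$: one for each of the $j$ threshold parameters in $\tau$ and one for each of the $k$ coefficients in $\beta$.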
WHY DOES X NOT CONTAIN AN INTERCEPT?

In the binary probit model, we have one cutoff point, say $\tau_1$:

$$ \Pr(Y = 1 \mid X\beta) = 1 - \Pr(Y = 0 \mid X\beta) = 1 - \Phi(\tau_1 - X\beta) $$

Here, $\tau_1$ is both the cutoff point and the intercept. By including an intercept in $X\beta$ we are setting $\tau_1$ to zero.
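WHY DOES X NOT CONTAIN AN INTERCEPT?

Now in the ordered probit model, we have more than one cutoff point:

$$ P(y_{i1} = 1) = \Pr(X\beta \le \tau_1) $$
$$ P(y_{i2} = 1) = \Pr(\tau_1 \le X\beta \le \tau_2) $$

If we included an intercept $A$,

$$ P(y_{i1} = 1) = \Pr(X\beta + A \le \tau_1) $$
$$ P(y_{i2} = 1) = \Pr(\tau_1 \le X\beta + A \le \tau_2) $$

Or equivalently we could write this:

$$ P(y_{i1} = 1) = \Pr(X\beta \le \tau_1 - A) $$
$$ P(y_{i2} = 1) = \Pr(\tau_1 - A \le X\beta \le \tau_2 - A) $$

Only the differences $\tau_j - A$ enter the probabilities, so $A$ cannot be estimated separately from the cutoffs. By estimating a cutoff point, we are estimating an intercept.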
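The argument above can be checked numerically. The following is a minimal sketch with made-up cutoffs, linear predictor, and intercept; the helper `cat.probs` is a hypothetical name introduced only for this illustration.

```r
# Hypothetical values for illustration only
tau <- c(-1, 0, 1.5)   # cutoffs
xb  <- 0.3             # linear predictor without an intercept
A   <- 2               # an arbitrary intercept

# Category probabilities implied by a set of cutoffs and a linear predictor
cat.probs <- function(tau, mu) {
  cuts <- c(-Inf, tau, Inf)
  pnorm(cuts[-1] - mu) - pnorm(cuts[-length(cuts)] - mu)
}

p.original <- cat.probs(tau, xb)           # no intercept
p.shifted  <- cat.probs(tau + A, xb + A)   # intercept A, cutoffs shifted by A

# The two parameterizations give the same category probabilities
all.equal(p.original, p.shifted)
```

Because both parameterizations imply the same probabilities, the data cannot distinguish between them, which is why the intercept is left out of X.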
A WORKING EXAMPLE: COOPERATION ON SANCTIONS

Lisa Martin (1992) asks: what determines cooperation on sanctions?

Her dependent variable, Coop, measures cooperation on a four-point scale.
ORDERED PROBIT: COOPERATION ON SANCTIONS

Load the data in R:

    library(Zelig)   # the package name is capitalized
    data(sanction)
    head(sanction)

The data set has the columns mil, coop, target, import, export, cost, num, and ncost; ncost records the cost as a label such as "major loss", "modest loss", or "little effect".
ORDERED PROBIT: COOPERATION ON SANCTIONS

We're going to look at the covariates:

- target, which is a measure of the economic health and political stability of the target country
- cost, which is a measure of the cost of the sanctions
- mil, which is a measure of whether or not there is military action in addition to the sanction
ORDERED PROBIT: USING ZELIG

We estimate the model using Zelig's oprobit model:

    z.out <- zelig(factor(coop) ~ target + cost + mil,
                   model = "oprobit", data = sanction)

Note that you could use model = "ologit" for the ordered logit and get similar inferences.
ORDERED PROBIT: USING ZELIG

What does the output look like?

    z.out

The printout echoes the call and then reports two tables, each with the columns Value, Std. Error, and t value: Coefficients, with one row each for target, cost, and mil, and Intercepts, with one row per estimated cutoff.

These are a little hard to interpret, so we turn to our bag of tricks.
ORDERED PROBIT: USING ZELIG

Suppose we want to compare cooperation when there is or is not military action in addition to the sanction.

    x.low <- setx(z.out, mil = 0)
    x.high <- setx(z.out, mil = 1)
ORDERED PROBIT: USING ZELIG

Now we can simulate values using these hypothetical military involvements:

    s.out <- sim(z.out, x = x.low, x1 = x.high)
    summary(s.out)

The summary reports the model (oprobit), the number of simulations (1000), the values of X and X1 for (Intercept), target, cost, and mil, and the expected values P(Y=j|X) with their mean, sd, and 2.5% and 97.5% quantiles.
ORDERED PROBIT: USING ZELIG

And then you can use the plot(s.out) command to visualize:

[Figure: output of plot(s.out) with three panels. Predicted Values: Y|X (percentage of simulations for Y=1 to Y=4); Expected Values: P(Y=j|X) (density); First Differences: P(Y=j|X1)-P(Y=j|X) (density).]
ORDERED PROBIT: WITHOUT ZELIG

Make a matrix for the y's indicating what category each is in:

    y <- sanction$coop
    # Find all of the unique categories of y
    y0 <- sort(unique(y))
    m <- length(y0)
    Z <- matrix(NA, nrow(sanction), m)
    # Fill in our matrix with logical values if
    # the observed value is in each category
    # Remember R can treat logical values as 0/1s
    for (j in 1:m){Z[,j] <- y==y0[j]}
    X <- cbind(sanction$target, sanction$cost, sanction$mil)
ORDERED PROBIT: WITHOUT ZELIG

Create the log-likelihood function:

    ll.oprobit <- function(par, Z, X){
      beta <- par[1:ncol(X)]
      tau <- par[(ncol(X)+1):length(par)]
      ystarmu <- X %*% beta
      m <- length(tau) + 1
      probs <- cprobs <- matrix(nrow=length(ystarmu), ncol=m)
      # Cumulative probabilities at each cutoff
      for (j in 1:(m-1)){cprobs[,j] <- pnorm(tau[j] - ystarmu)}
      # Category probabilities are differences of cumulative probabilities
      probs[,m] <- 1 - cprobs[,m-1]
      probs[,1] <- cprobs[,1]
      for (j in 2:(m-1)){probs[,j] <- cprobs[,j] - cprobs[,(j-1)]}
      # Z is a logical matrix, so this keeps the observed category only
      sum(log(probs[Z]))
    }
ORDERED PROBIT: WITHOUT ZELIG

Optimize:

    par <- c(rep(1,3), 0, 1, 2)
    out <- optim(par, ll.oprobit, Z=Z, X=X, method="BFGS",
                 control=list(fnscale=-1))
    out$par
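One way to sanity-check a hand-written ordered probit likelihood is to compare it with an existing implementation. The sketch below is an illustration under assumptions, not part of the original analysis: it uses polr() from the MASS package on simulated data, with a slope (0.8) and cutoffs (-1, 0, 1) that are made up for the example.

```r
library(MASS)   # provides polr()

set.seed(26)
n     <- 2000
x     <- rnorm(n)
ystar <- 0.8 * x + rnorm(n)   # latent variable; no intercept in the linear predictor

# Observe only which interval the latent variable falls into
y <- cut(ystar, breaks = c(-Inf, -1, 0, 1, Inf),
         labels = 1:4, ordered_result = TRUE)

fit <- polr(y ~ x, method = "probit")
fit$coefficients   # estimate of the slope
fit$zeta           # estimates of the three cutoffs
```

The same simulated y and x could be passed to a hand-written likelihood to compare estimates.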
WHAT IS ZERO-INFLATION?

Let's return to binary data.

What if we knew that something in our data was mismeasured? For example, what if we thought that some of our data were systematically zero rather than randomly zero? This could be when:

1. Some data are spoiled or lost.
2. Survey respondents put zero for an ordered answer on a survey just to get it done.

If our data are mismeasured in some systematic way, our estimates will be off.
A WORKING EXAMPLE: FISHING

You're trying to figure out the probability of catching a fish in a lake from a survey. People were asked:

- How many children were in the group
- How many people were in the group
- Whether they caught a fish

The problem is, some people didn't even fish! These people have systematically zero fish.
ZERO-INFLATED LOGIT MODEL

We're going to assume that whether or not the person fished is the outcome of a Bernoulli trial.

$$ Y_i = \begin{cases} 0 & \text{with probability } \psi_i \\ \text{Logistic} & \text{with probability } 1 - \psi_i \end{cases} $$

$\psi_i$ is the probability that you do not fish.

This is a mixture model because our data are a mix of these two types of groups, each with its own data generation process.
ZERO-INFLATED LOGIT MODEL

Given that you fished, the logistic model is what we have done before:

1. $Y_i \sim f_{bern}(y_i \mid \pi_i)$.
2. $\pi_i = \frac{1}{1 + e^{-X_i \beta}}$
3. $Y_i$ and $Y_j$ are independent for all $i \ne j$.

So the probability that $Y$ is 0:

$$ P(Y_i = 0 \mid \text{fished}) = 1 - \frac{1}{1 + e^{-X_i \beta}} $$

and the probability that $Y$ is 1:

$$ P(Y_i = 1 \mid \text{fished}) = \frac{1}{1 + e^{-X_i \beta}} $$
ZERO-INFLATED LOGIT MODEL

Given that you did not fish, what is the model?

So the probability that $Y$ is 0:

$$ P(Y_i = 0 \mid \text{not fished}) = 1 $$

and the probability that $Y$ is 1:

$$ P(Y_i = 1 \mid \text{not fished}) = 0 $$
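ZERO-INFLATED LOGIT MODEL

We can write out the distribution of $Y_i$ as (stochastic component):

$$ P(Y_i = y_i \mid \beta, \psi_i) = \begin{cases} \psi_i + (1 - \psi_i) \left( 1 - \frac{1}{1 + e^{-X_i \beta}} \right) & \text{if } y_i = 0 \\ (1 - \psi_i) \left( \frac{1}{1 + e^{-X_i \beta}} \right) & \text{if } y_i = 1 \end{cases} $$

And we can put covariates on $\psi$ (systematic component):

$$ \psi_i = \frac{1}{1 + e^{-z_i \gamma}} $$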
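To make the mixture concrete, the sketch below evaluates the two probabilities for one observation and compares them with a simulation of the two-step process. The values of `psi` and `p` are made up for illustration.

```r
# Hypothetical parameter values for a single observation
psi <- 0.25   # probability of a structural zero (the group did not fish)
p   <- 0.60   # probability of catching a fish, given that the group fished

p0 <- psi + (1 - psi) * (1 - p)   # P(Y = 0)
p1 <- (1 - psi) * p               # P(Y = 1)
c(p0 = p0, p1 = p1)

# Simulate the two-step process and compare with the formula
set.seed(1)
n      <- 100000
fished <- rbinom(n, 1, 1 - psi)      # 1 if the group fished
y      <- fished * rbinom(n, 1, p)   # groups that did not fish are always zero
mean(y)                              # should be close to p1
```

The observed zeros combine structural zeros (probability psi) with ordinary failures among those who fished, which is why P(Y = 0) has two terms.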
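ZERO-INFLATED LOGIT: DERIVING THE LIKELIHOOD

The likelihood function is proportional to the probability of $Y_i$:

$$
\begin{aligned}
L(\beta, \gamma \mid Y_i) &\propto P(Y_i \mid \beta, \gamma) \\
&= \left[ \psi_i + (1 - \psi_i) \left( 1 - \frac{1}{1 + e^{-X_i \beta}} \right) \right]^{1 - Y_i} \left[ (1 - \psi_i) \left( \frac{1}{1 + e^{-X_i \beta}} \right) \right]^{Y_i} \\
&= \left[ \frac{1}{1 + e^{-z_i \gamma}} + \left( 1 - \frac{1}{1 + e^{-z_i \gamma}} \right) \left( 1 - \frac{1}{1 + e^{-X_i \beta}} \right) \right]^{1 - Y_i} \left[ \left( 1 - \frac{1}{1 + e^{-z_i \gamma}} \right) \left( \frac{1}{1 + e^{-X_i \beta}} \right) \right]^{Y_i}
\end{aligned}
$$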
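ZERO-INFLATED LOGIT: DERIVING THE LIKELIHOOD

Multiplying over all observations we get:

$$ L(\beta, \gamma \mid Y) = \prod_{i=1}^{n} \left[ \frac{1}{1 + e^{-z_i \gamma}} + \left( 1 - \frac{1}{1 + e^{-z_i \gamma}} \right) \left( 1 - \frac{1}{1 + e^{-X_i \beta}} \right) \right]^{1 - Y_i} \left[ \left( 1 - \frac{1}{1 + e^{-z_i \gamma}} \right) \left( \frac{1}{1 + e^{-X_i \beta}} \right) \right]^{Y_i} $$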
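ZERO-INFLATED LOGIT: DERIVING THE LIKELIHOOD

Taking the log we get:

$$
\begin{aligned}
\ell(\beta, \gamma) &= \sum_{i=1}^{n} \left\{ Y_i \ln \left[ (1 - \psi_i) \left( \frac{1}{1 + e^{-X_i \beta}} \right) \right] + (1 - Y_i) \ln \left[ \psi_i + (1 - \psi_i) \left( 1 - \frac{1}{1 + e^{-X_i \beta}} \right) \right] \right\} \\
&= \sum_{i=1}^{n} \left\{ Y_i \ln \left[ \left( 1 - \frac{1}{1 + e^{-z_i \gamma}} \right) \left( \frac{1}{1 + e^{-X_i \beta}} \right) \right] + (1 - Y_i) \ln \left[ \frac{1}{1 + e^{-z_i \gamma}} + \left( 1 - \frac{1}{1 + e^{-z_i \gamma}} \right) \left( 1 - \frac{1}{1 + e^{-X_i \beta}} \right) \right] \right\}
\end{aligned}
$$

How many parameters do we need to estimate?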
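LET'S PROGRAM THIS IN R

Load and get the data ready:

    fish <- read.table("...", sep=",", header=TRUE)   # "..." is the location of the fish data
    X <- fish[c("child", "persons")]
    Z <- fish[c("persons")]
    # Add a column of ones for the intercepts
    X <- as.matrix(cbind(1, X))
    Z <- as.matrix(cbind(1, Z))
    # Binary outcome: did the group catch at least one fish?
    y <- ifelse(fish$count > 0, 1, 0)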
LET'S PROGRAM THIS IN R

Write out the log-likelihood function:

    ll.zilogit <- function(par, X, Z, y){
      beta <- par[1:ncol(X)]
      gamma <- par[(ncol(X)+1):length(par)]
      phi <- 1/(1+exp(-Z%*%gamma))   # psi: probability of a structural zero
      pie <- 1/(1+exp(-X%*%beta))    # pi: probability of a fish, given fishing
      sum(y*log((1-phi)*pie) + (1-y)*(log(phi + (1-phi)*(1-pie))))
    }
LET'S PROGRAM THIS IN R

Optimize to get the results:

    par <- rep(1, (ncol(X)+ncol(Z)))
    out <- optim(par, ll.zilogit, Z=Z, X=X, y=y, method="BFGS",
                 control=list(fnscale=-1), hessian=TRUE)
    out$par
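To try this estimator without the fishing data, the same likelihood can be run on simulated data. The sketch below is illustrative only: the covariates x and z, the sample size, and the true values in `beta.true` and `gamma.true` are all made up, and the standard errors come from the inverse of the negative Hessian returned by optim().

```r
set.seed(2001)
n <- 5000
x <- rnorm(n)                  # hypothetical covariate for catching a fish
z <- rnorm(n)                  # hypothetical covariate for structural zeros
X <- cbind(1, x)
Z <- cbind(1, z)
beta.true  <- c(0.5, 1.0)      # made-up true values
gamma.true <- c(-1.0, 0.8)

psi <- 1 / (1 + exp(-Z %*% gamma.true))   # P(structural zero)
pie <- 1 / (1 + exp(-X %*% beta.true))    # P(Y = 1 | not a structural zero)
y   <- rbinom(n, 1, 1 - psi) * rbinom(n, 1, pie)

# Same log-likelihood as in the section
ll.zilogit <- function(par, X, Z, y){
  beta  <- par[1:ncol(X)]
  gamma <- par[(ncol(X) + 1):length(par)]
  phi <- 1 / (1 + exp(-Z %*% gamma))
  pie <- 1 / (1 + exp(-X %*% beta))
  sum(y * log((1 - phi) * pie) + (1 - y) * log(phi + (1 - phi) * (1 - pie)))
}

out <- optim(rep(0, 4), ll.zilogit, X = X, Z = Z, y = y, method = "BFGS",
             control = list(fnscale = -1), hessian = TRUE)

# Standard errors from the inverse of the negative Hessian
se <- sqrt(diag(solve(-out$hessian)))
cbind(truth = c(beta.true, gamma.true), estimate = out$par, se = se)
```

Simulating from known parameters makes it possible to see whether the estimates land within a couple of standard errors of the values used to generate the data.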
PLOTTING TO SEE THE RELATIONSHIP

These numbers don't mean a lot to us, so we can plot the predicted probabilities of a group having not fished (i.e., predict $\psi$).

First, we have to simulate our gammas:

    varcv.par <- solve(-out$hessian)
    library(mvtnorm)
    sim.pars <- rmvnorm(10000, out$par, varcv.par)
    # Subset to only the parameters we need (gammas)
    # Better to simulate all though
    sim.z <- sim.pars[,(ncol(X)+1):length(par)]
PLOTTING TO SEE THE RELATIONSHIP

We then generate predicted probabilities of not fishing for different sized groups.

    person.vec <- seq(1,4)
    Zcovariates <- cbind(1, person.vec)
    exp.holder <- matrix(NA, ncol=4, nrow=10000)
    for(i in 1:length(person.vec)){
      exp.holder[,i] <- 1/(1+exp(-Zcovariates[i,]%*%t(sim.z)))
    }
PLOTTING TO SEE THE RELATIONSHIP

Using these numbers, we can plot the densities of probabilities, to get a sense of the probability and the uncertainty.

    plot(density(exp.holder[,4]), col="blue", xlim=c(0,1),
         main="Probability of a Structural Zero", xlab="Probability")
    lines(density(exp.holder[,3]), col="red")
    lines(density(exp.holder[,2]), col="green")
    lines(density(exp.holder[,1]), col="black")
    legend(.7, 12, legend=c("One Person", "Two People",
                            "Three People", "Four People"),
           col=c("black", "green", "red", "blue"), lty=1)
PLOTTING TO SEE THE RELATIONSHIP

[Figure: Probability of a Structural Zero. Density curves of the simulated probability for One Person, Two People, Three People, and Four People; x-axis: Probability, y-axis: Density.]
BINOMIAL MODEL

Suppose our dependent variable is the number of successes in a series of independent trials. For example:

- The number of heads in 10 coin flips.
- The number of times you voted in the last six elections.
- The number of Supreme Court cases the government won in the last ten decisions.

We can use a generalization of the binary model to study these processes.
BINOMIAL MODEL

Stochastic Component

$$ Y_i \sim \text{Binomial}(y_i \mid \pi_i) $$

$$ P(Y_i = y_i \mid \pi_i) = \binom{N}{y_i} \pi_i^{y_i} (1 - \pi_i)^{N - y_i} $$

- $\pi_i^{y_i}$: There are $y_i$ successes, each with probability $\pi_i$.
- $(1 - \pi_i)^{N - y_i}$: There are $N - y_i$ failures, each with probability $1 - \pi_i$.
- $\binom{N}{y_i}$: Number of ways to distribute $y_i$ successes in $N$ trials; order of successes does not matter.
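The three pieces of this formula can be checked against R's built-in binomial density. This is a minimal sketch; the number of trials and the success probability are arbitrary illustrative values.

```r
N <- 10     # number of trials
p <- 0.3    # hypothetical success probability
y <- 0:N    # every possible number of successes

# Ways to arrange y successes * P(successes) * P(failures)
by.hand <- choose(N, y) * p^y * (1 - p)^(N - y)

all.equal(by.hand, dbinom(y, size = N, prob = p))   # matches dbinom()
sum(by.hand)                                        # probabilities sum to one
```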
BINOMIAL MODEL

Systematic Component

$$ \pi_i = \frac{1}{1 + e^{-x_i \beta}} $$

Why?
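BINOMIAL MODEL

Derive the likelihood:

$$
\begin{aligned}
L(\pi_i \mid y_i) &= P(y_i \mid \pi_i) \\
&= \prod_{i=1}^{n} \binom{N}{y_i} \pi_i^{y_i} (1 - \pi_i)^{N - y_i}
\end{aligned}
$$

$$ \ln L(\pi_i \mid y_i) = \sum_{i=1}^{n} \left[ \ln \binom{N}{y_i} + \ln \pi_i^{y_i} + \ln (1 - \pi_i)^{N - y_i} \right] $$

Dropping the term $\ln \binom{N}{y_i}$, which does not depend on $\pi_i$, leaves

$$ \sum_{i=1}^{n} \left[ y_i \ln \pi_i + (N - y_i) \ln (1 - \pi_i) \right] $$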
BINOMIAL MODEL

We can operationalize this in R by coding the log-likelihood up ourselves. First, let's make up some data to play with:

    library(boot)   # provides inv.logit()
    x1 <- rnorm(1000, 0, 1)
    x2 <- rnorm(1000, 9, .5)
    pi <- inv.logit(-5 + .4*x1 + .6*x2)
    y <- rbinom(1000, 10, pi)
BINOMIAL MODEL

Write out the log-likelihood function:

    ll.binom <- function(par, N, X, y){
      pi <- 1/(1 + exp(-1*X%*%par))
      out <- sum(y * log(pi) + (N - y)*log(1-pi))
      return(out)
    }
BINOMIAL MODEL

Optimize to get the results:

    my.optim <- optim(par = c(0,0,0), fn = ll.binom, y = y,
                      X = cbind(1,x1,x2), N = 10, method = "BFGS",
                      control=list(fnscale=-1), hessian=TRUE)
    my.optim$par

Given that the data were generated with pi <- inv.logit(-5 + .4*x1 + .6*x2), the output doesn't look too bad.
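As a cross-check on a hand-coded binomial likelihood, the same kind of model can be fit with R's built-in glm(). The sketch below regenerates data along the lines described above; the seed is arbitrary, and plogis() (base R's inverse logit) stands in for inv.logit() so that no extra package is needed.

```r
set.seed(8)
n  <- 1000
N  <- 10                                   # trials per observation
x1 <- rnorm(n, 0, 1)
x2 <- rnorm(n, 9, 0.5)
p  <- plogis(-5 + 0.4 * x1 + 0.6 * x2)     # plogis() is the inverse logit
y  <- rbinom(n, N, p)

# A two-column response gives (successes, failures) for each observation
fit <- glm(cbind(y, N - y) ~ x1 + x2, family = binomial(link = "logit"))
coef(fit)   # compare with the generating values -5, 0.4, 0.6
```

With data generated this way, the optim() estimates and coef(fit) should agree closely, since both maximize the same log-likelihood up to a constant.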
UNIVERSITY OF TORONTO Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS Duration - 3 hours Aids Allowed: Calculator LAST NAME: FIRST NAME: STUDENT NUMBER: There are 27 pages
More information2. We care about proportion for categorical variable, but average for numerical one.
Probit Model 1. We apply Probit model to Bank data. The dependent variable is deny, a dummy variable equaling one if a mortgage application is denied, and equaling zero if accepted. The key regressor is
More informationMultiple regression: Categorical dependent variables
Multiple : Categorical Johan A. Elkink School of Politics & International Relations University College Dublin 28 November 2016 1 2 3 4 Outline 1 2 3 4 models models have a variable consisting of two categories.
More informationBinary Logistic Regression
The coefficients of the multiple regression model are estimated using sample data with k independent variables Estimated (or predicted) value of Y Estimated intercept Estimated slope coefficients Ŷ = b
More informationLatent class analysis and finite mixture models with Stata
Latent class analysis and finite mixture models with Stata Isabel Canette Principal Mathematician and Statistician StataCorp LLC 2017 Stata Users Group Meeting Madrid, October 19th, 2017 Introduction Latent
More informationAdvanced Quantitative Methods: maximum likelihood
Advanced Quantitative Methods: Maximum Likelihood University College Dublin 4 March 2014 1 2 3 4 5 6 Outline 1 2 3 4 5 6 of straight lines y = 1 2 x + 2 dy dx = 1 2 of curves y = x 2 4x + 5 of curves y
More information12 Modelling Binomial Response Data
c 2005, Anthony C. Brooms Statistical Modelling and Data Analysis 12 Modelling Binomial Response Data 12.1 Examples of Binary Response Data Binary response data arise when an observation on an individual
More informationMS&E 226: Small Data
MS&E 226: Small Data Lecture 9: Logistic regression (v2) Ramesh Johari ramesh.johari@stanford.edu 1 / 28 Regression methods for binary outcomes 2 / 28 Binary outcomes For the duration of this lecture suppose
More informationBinary Dependent Variable. Regression with a
Beykent University Faculty of Business and Economics Department of Economics Econometrics II Yrd.Doç.Dr. Özgür Ömer Ersin Regression with a Binary Dependent Variable (SW Chapter 11) SW Ch. 11 1/59 Regression
More informationISQS 5349 Spring 2013 Final Exam
ISQS 5349 Spring 2013 Final Exam Name: General Instructions: Closed books, notes, no electronic devices. Points (out of 200) are in parentheses. Put written answers on separate paper; multiple choices
More informationLogistic Regression. Advanced Methods for Data Analysis (36-402/36-608) Spring 2014
Logistic Regression Advanced Methods for Data Analysis (36-402/36-608 Spring 204 Classification. Introduction to classification Classification, like regression, is a predictive task, but one in which the
More informationChapter 5: Logistic Regression-I
: Logistic Regression-I Dipankar Bandyopadhyay Department of Biostatistics, Virginia Commonwealth University BIOS 625: Categorical Data & GLM [Acknowledgements to Tim Hanson and Haitao Chu] D. Bandyopadhyay
More informationIntroduction Fitting logistic regression models Results. Logistic Regression. Patrick Breheny. March 29
Logistic Regression March 29 Introduction Binary outcomes are quite common in medicine and public health: alive/dead, diseased/healthy, infected/not infected, case/control Assuming that these outcomes
More informationLOGISTIC REGRESSION Joseph M. Hilbe
LOGISTIC REGRESSION Joseph M. Hilbe Arizona State University Logistic regression is the most common method used to model binary response data. When the response is binary, it typically takes the form of
More informationAGEC 661 Note Fourteen
AGEC 661 Note Fourteen Ximing Wu 1 Selection bias 1.1 Heckman s two-step model Consider the model in Heckman (1979) Y i = X iβ + ε i, D i = I {Z iγ + η i > 0}. For a random sample from the population,
More informationGov Multiple Random Variables
Gov 2000-4. Multiple Random Variables Matthew Blackwell September 29, 2015 Where are we? Where are we going? We described a formal way to talk about uncertain outcomes, probability. We ve talked about
More informationChapter 14 Logistic Regression, Poisson Regression, and Generalized Linear Models
Chapter 14 Logistic Regression, Poisson Regression, and Generalized Linear Models 許湘伶 Applied Linear Regression Models (Kutner, Nachtsheim, Neter, Li) hsuhl (NUK) LR Chap 10 1 / 29 14.1 Regression Models
More informationGeneralized Linear Models
Generalized Linear Models Advanced Methods for Data Analysis (36-402/36-608 Spring 2014 1 Generalized linear models 1.1 Introduction: two regressions So far we ve seen two canonical settings for regression.
More informationLink to Paper. The latest iteration can be found at:
Link to Paper Introduction The latest iteration can be found at: http://learneconometrics.com/pdf/gc2017/collin_gretl_170523.pdf BKW dignostics in GRETL: Interpretation and Performance Oklahoma State University
More informationBinary Choice Models Probit & Logit. = 0 with Pr = 0 = 1. decision-making purchase of durable consumer products unemployment
BINARY CHOICE MODELS Y ( Y ) ( Y ) 1 with Pr = 1 = P = 0 with Pr = 0 = 1 P Examples: decision-making purchase of durable consumer products unemployment Estimation with OLS? Yi = Xiβ + εi Problems: nonsense
More informationPOLI 7050 Spring 2008 March 5, 2008 Unordered Response Models II
POLI 7050 Spring 2008 March 5, 2008 Unordered Response Models II Introduction Today we ll talk about interpreting MNL and CL models. We ll start with general issues of model fit, and then get to variable
More informationECON Introductory Econometrics. Lecture 11: Binary dependent variables
ECON4150 - Introductory Econometrics Lecture 11: Binary dependent variables Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 11 Lecture Outline 2 The linear probability model Nonlinear probability
More information1. Hypothesis testing through analysis of deviance. 3. Model & variable selection - stepwise aproaches
Sta 216, Lecture 4 Last Time: Logistic regression example, existence/uniqueness of MLEs Today s Class: 1. Hypothesis testing through analysis of deviance 2. Standard errors & confidence intervals 3. Model
More informationInterpreting and using heterogeneous choice & generalized ordered logit models
Interpreting and using heterogeneous choice & generalized ordered logit models Richard Williams Department of Sociology University of Notre Dame July 2006 http://www.nd.edu/~rwilliam/ The gologit/gologit2
More informationRandom Variable. Discrete Random Variable. Continuous Random Variable. Discrete Random Variable. Discrete Probability Distribution
Random Variable Theoretical Probability Distribution Random Variable Discrete Probability Distributions A variable that assumes a numerical description for the outcome of a random eperiment (by chance).
More informationEconometric Modelling Prof. Rudra P. Pradhan Department of Management Indian Institute of Technology, Kharagpur
Econometric Modelling Prof. Rudra P. Pradhan Department of Management Indian Institute of Technology, Kharagpur Module No. # 01 Lecture No. # 28 LOGIT and PROBIT Model Good afternoon, this is doctor Pradhan
More informationLecture 12: Effect modification, and confounding in logistic regression
Lecture 12: Effect modification, and confounding in logistic regression Ani Manichaikul amanicha@jhsph.edu 4 May 2007 Today Categorical predictor create dummy variables just like for linear regression
More informationBayesian Inference in GLMs. Frequentists typically base inferences on MLEs, asymptotic confidence
Bayesian Inference in GLMs Frequentists typically base inferences on MLEs, asymptotic confidence limits, and log-likelihood ratio tests Bayesians base inferences on the posterior distribution of the unknowns
More informationVisualizing Inference
CSSS 569: Visualizing Data Visualizing Inference Christopher Adolph University of Washington, Seattle February 23, 2011 Assistant Professor, Department of Political Science and Center for Statistics and
More informationLISA Short Course Series Generalized Linear Models (GLMs) & Categorical Data Analysis (CDA) in R. Liang (Sally) Shan Nov. 4, 2014
LISA Short Course Series Generalized Linear Models (GLMs) & Categorical Data Analysis (CDA) in R Liang (Sally) Shan Nov. 4, 2014 L Laboratory for Interdisciplinary Statistical Analysis LISA helps VT researchers
More informationGraduate Econometrics I: What is econometrics?
Graduate Econometrics I: What is econometrics? Yves Dominicy Université libre de Bruxelles Solvay Brussels School of Economics and Management ECARES Yves Dominicy Graduate Econometrics I: What is econometrics?
More information