Advanced Quantitative Methods: limited dependent variables

Size: px
Start display at page:

Download "Advanced Quantitative Methods: limited dependent variables"

Transcription

1 Advanced Quantitative Methods: Limited Dependent Variables I University College Dublin 2 April 2013

2

3 Outline Model Measurement levels

4 Components Model Measurement levels Two components of the model: y N(µ,σ 2 ) Stochastic µ = Xβ Systematic

5 Components Model Measurement levels Two components of the model: y N(µ,σ 2 ) Stochastic µ = Xβ Systematic Generalised version (not necessarily linear): y f(µ,α) Stochastic µ = g(x,β) Systematic (King 1998, 8)

6 Components Model Measurement levels y f(µ,α) Stochastic µ = g(x,β) Systematic Stochastic component: varies over repeated (hypothetical) observations on the same unit. Systematic component: varies across units, but constant given X. (King 1998, 8)

7 Uncertainty Model Measurement levels y f(µ,α) Stochastic µ = g(x,β) Systematic Two types of uncertainty: Estimation uncertainty: lack of knowledge about α and β; can be reduced by increasing n.

8 Uncertainty Model Measurement levels y f(µ,α) Stochastic µ = g(x,β) Systematic Two types of uncertainty: Estimation uncertainty: lack of knowledge about α and β; can be reduced by increasing n. Fundamental uncertainty: represented by stochastic component and exists independent of researcher.

9 Levels of measurement Model Measurement levels Discreet Continuous party choice Interval Ratio Categories in no particular order (examples in cells)

10 Levels of measurement Model Measurement levels Discreet Continuous party choice - education level - - Interval Ratio Categories in a specific order (examples in cells)

11 Levels of measurement Model Measurement levels Discreet Continuous party choice - education level - - Interval how likely to vote... temperature Ratio All values possible (examples in cells)

12 Levels of measurement Model Measurement levels Discreet Continuous party choice - education level - - Interval how likely to vote... temperature Ratio deaths in war ideological distance All values possible, with a meaningful zero point (examples in cells)

13 Levels of measurement Model Measurement levels Discreet Continuous party choice - education level - turnout - Interval how likely to vote... temperature Ratio deaths in war ideological distance Two categories, coded as 0 and 1 (examples in cells)

14 Limited dependent variables Model Measurement levels When a dependent variable is not continuous, or is truncated for some reason, a linear model would lead to implausible predictions.

15 Limited dependent variables Model Measurement levels When a dependent variable is not continuous, or is truncated for some reason, a linear model would lead to implausible predictions. E.g. regressing whether someone voted on a set of independent variables will give somewhat reasonable estimates (see previous homework), but using these estimates to calculate predictions leads to meaningless predictions.

16 Limited dependent variables Model Measurement levels When a dependent variable is not continuous, or is truncated for some reason, a linear model would lead to implausible predictions. E.g. regressing whether someone voted on a set of independent variables will give somewhat reasonable estimates (see previous homework), but using these estimates to calculate predictions leads to meaningless predictions. Furthermore, estimating limited dependent variable data with a linear model implies serious heteroscedasticity.

17 Outline Logistic regression Interpretation Probit regression

18 models Logistic regression Interpretation Probit regression models have a dependent variable consisting of two categories.

19 models Logistic regression Interpretation Probit regression models have a dependent variable consisting of two categories. For example, Vote on a particular law Turning out in an election Approval in a referendum Bankrupt or not

20 Logistic regression Logistic regression Interpretation Probit regression Stochastic: y i = Bernoulli(π i ) = π y i i (1 π i ) 1 y i

21 Logistic regression Logistic regression Interpretation Probit regression Stochastic: y i = Bernoulli(π i ) = π y i i (1 π i ) 1 y i Variance of stochastic component: V(y i ) = π i (1 π i )

22 Logistic regression Logistic regression Interpretation Probit regression Stochastic: y i = Bernoulli(π i ) = π y i i (1 π i ) 1 y i Variance of stochastic component: V(y i ) = π i (1 π i ) Systematic: π i = g(x i β) = 1 1+e x iβ

23 Logistic distribution Logistic regression Interpretation Probit regression 1/(1 + exp( x)) x x 0.5x x

24 Logistic regression Logistic regression Interpretation Probit regression Resulting predictions: are bounded between 0 and 1

25 Logistic regression Logistic regression Interpretation Probit regression Resulting predictions: are bounded between 0 and 1 show limited effect at extremes

26 Logistic regression Logistic regression Interpretation Probit regression Resulting predictions: are bounded between 0 and 1 show limited effect at extremes are a smooth, monotone translation from linear prediction X i β

27 Loglikelihood function Logistic regression Interpretation Probit regression Systematic part: π i = 1 1+e x iβ Stochastic part: P(y i = 1) = π y i i (1 π i ) 1 y i n f(y π i ) = π y i i (1 π i ) 1 y i i=1 l(y) = log = n i=1 π y i i (1 π i ) 1 y i n y i logπ i + i=1 n (1 y i )log(1 π i ) i=1

28 Logistic regression in R Logistic regression Interpretation Probit regression model <- glm(y ~ x1 + x2 + x3, data=data, family=binomial(link="logit")) summary(model)

29 Exercise Logistic regression Interpretation Probit regression Using the asiabaro.dta data set, estimate a logistic regression explaining abstention in elections by trust in the government, satisfaction with democracy, gender, urbanisation, and preference for democracy.

30 Graphical interpretation Logistic regression Interpretation Probit regression Useful for plotting relationship between one x and π, holding the other values of X constant

31 Graphical interpretation Logistic regression Interpretation Probit regression Useful for plotting relationship between one x and π, holding the other values of X constant (e.g. at the mean, median, etc.).

32 Graphical interpretation Logistic regression Interpretation Probit regression Useful for plotting relationship between one x and π, holding the other values of X constant (e.g. at the mean, median, etc.). Because the link function g(xβ) is not linear in this case (but instead g(xβ) = 1 1+e Xβ ), the effect of X on y depends on all X.

33 Graphical: example in R Logistic regression Interpretation Probit regression m <- glm(y ~ x1 + x2 + x3, family=binomial(link="logit")) curve(1/(1+exp(-(coef(m)[1] + coef(m)[2]*x + coef(m)[3]*mean(x2) + coef(m([4]*mean(x3)))), from=0, to=1)

34 Graphical: example Logistic regression Interpretation Probit regression jitter(vote)

35 Fitted values Logistic regression Interpretation Probit regression A second useful way of interpreting logit regression coefficients is by describing typical cases or interesting examples.

36 Fitted values Logistic regression Interpretation Probit regression A second useful way of interpreting logit regression coefficients is by describing typical cases or interesting examples. Party Amount P(Vote = 1) 95% C.I. Republican $ [.69,.95] Democrat $ [.27,.65] Republican $ [.84,.99] Democrat $ [.44,.95]

37 First differences Logistic regression Interpretation Probit regression Basic idea is to calculate and present: ˆπ = g(xβ) g(x β), whereby X differs only in one variable from X.

38 First differences Logistic regression Interpretation Probit regression Basic idea is to calculate and present: ˆπ = g(xβ) g(x β), whereby X differs only in one variable from X. Variable Values Diff 95% C.I. Party Rep, Dem -.34 [-.53,-.11] Amount $10000, $ [.03,.20]

39 Derivatives Logistic regression Interpretation Probit regression δˆπ X j = β jˆπ(1 ˆπ)

40 Derivatives Logistic regression Interpretation Probit regression δˆπ X j = β jˆπ(1 ˆπ) Hence a quick method to interpret logit coefficients is to divide them by 4 to get the slope at ˆπ =.5.

41 Presentation Logistic regression Interpretation Probit regression Bottom line: it is much better to present interpretable and understandable inferences, with an indication of the level of uncertainty, than to present simply estimated coefficients.

42 Presentation Logistic regression Interpretation Probit regression E.g. An increase in automobile support for a Republican senator from $10000 to $20000 in total increases his or her probability to vote for the Corporate Average Fuel Economy standard bill by 11%, give or take 7%, all else equal.

43 Predictions check Logistic regression Interpretation Probit regression One method of checking the predictions is to Group the cases in bins based on their predicted probabilities

44 Predictions check Logistic regression Interpretation Probit regression One method of checking the predictions is to Group the cases in bins based on their predicted probabilities, e.g. 0-10%, 10-20%, etc.

45 Predictions check Logistic regression Interpretation Probit regression One method of checking the predictions is to Group the cases in bins based on their predicted probabilities, e.g. 0-10%, 10-20%, etc. Plot the bins against the number of ones found in this bin.

46 Predictions check Logistic regression Interpretation Probit regression One method of checking the predictions is to Group the cases in bins based on their predicted probabilities, e.g. 0-10%, 10-20%, etc. Plot the bins against the number of ones found in this bin. The closer this is to the 45 degree line, the more accurate the predictions of the probabilities.

47 Predictions check Logistic regression Interpretation Probit regression One method of checking the predictions is to Group the cases in bins based on their predicted probabilities, e.g. 0-10%, 10-20%, etc. Plot the bins against the number of ones found in this bin. The closer this is to the 45 degree line, the more accurate the predictions of the probabilities. In R, you can use the custom function plot.predicted().

48 Logistic regression Interpretation Probit regression m <- glm(..., family = binomial(link = "logit")) phat <- predict(m, type = "response") pgroups <- cut(phat, seq(0, 1,.1)) hist(phat, xlim = c(0,1)) barplot(prop.table(table(pgroups)), ylim = c(0,1))

49 Predictions check Logistic regression Interpretation Probit regression Proportion of outcomes positive Predicted probability

50 Predictions check Logistic regression Interpretation Probit regression 0 s correctly predicted s correctly predicted

51 Exercise Logistic regression Interpretation Probit regression Using the logistic previous logistic regression: 1 What is the slope of the regression line with respect to preference for democracy at ˆπ =.5? 2 Calculate predicted differences for urban vs rural respondent. 3 Calculate first differences for urban vs rural and female vs male. 4 Plot predicted probabilities as a function of preference for democracy.

52 Threshold models Logistic regression Interpretation Probit regression Imagine we have a latent, unobserved variable: y f(µ) µ = Xβ

53 Threshold models Logistic regression Interpretation Probit regression Imagine we have a latent, unobserved variable: y f(µ) µ = Xβ With the following observation mechanism: { 1 if yi > 0 y i = 0 if yi 0.

54 Threshold models Logistic regression Interpretation Probit regression If f(µ i ) is the standardised logistic distribution, we replicate the logit model.

55 Threshold models Logistic regression Interpretation Probit regression If f(µ i ) is the standardised logistic distribution, we replicate the logit model. f(x i β) = e y i µ i (1+e y i µ i ) 2

56 Threshold models Logistic regression Interpretation Probit regression Alternatively, if f(µ i ) is the cumulative standard normal distribution (σ 2 = 1), we have the probit model.

57 Threshold models Logistic regression Interpretation Probit regression Alternatively, if f(µ i ) is the cumulative standard normal distribution (σ 2 = 1), we have the probit model. Hence β can now be interpreted as the effect of an increase by 1 in x on y, whereby the unit of y is one standard deviation.

58 Threshold models Logistic regression Interpretation Probit regression Alternatively, if f(µ i ) is the cumulative standard normal distribution (σ 2 = 1), we have the probit model. Hence β can now be interpreted as the effect of an increase by 1 in x on y, whereby the unit of y is one standard deviation. P(y i = 1 β,x i ) = Φ(x i β), whereby Φ(x) is the cumulative standard normal distribution (i.e. the surface between and x under a normal distribution with µ = 0 and σ 2 = 1).

59 Threshold models Logistic regression Interpretation Probit regression Alternatively, if f(µ i ) is the cumulative standard normal distribution (σ 2 = 1), we have the probit model. Hence β can now be interpreted as the effect of an increase by 1 in x on y, whereby the unit of y is one standard deviation. P(y i = 1 β,x i ) = Φ(x i β), whereby Φ(x) is the cumulative standard normal distribution (i.e. the surface between and x under a normal distribution with µ = 0 and σ 2 = 1). pnorm(xb); predict(m, type="response")

60 Threshold models Logistic regression Interpretation Probit regression

61 Loglikelihood function Logistic regression Interpretation Probit regression Systematic part: π i = Φ(x i β), where Φ(x) is the cumulative standard normal distribution. Stochastic part: P(y i = 1) = π y i i (1 π i ) 1 y i n f(y π i ) = π y i i (1 π i ) 1 y i l(y) = i=1 n y i logπ i + i=1 n (1 y i )log(1 π i ) i=1

62 Exercise Logistic regression Interpretation Probit regression Repeat the previous estimation using probit instead of logit. 1 Plot predicted probabilities with respect to preference for democracy. 2 Evaluate the predictive performance of this model. 3 Add suitdemoc as a dependent variable to the model. Does the predictive performance improve?

63 Outline

64 Ordered data Models where the dependent variable is categorical, and the categories are in a particular order.

65 Ordered data Models where the dependent variable is categorical, and the categories are in a particular order. We can thus take a similar latent variable approach.

66 Ordered probit y i N(µ i,1) µ i = x i β

67 Ordered probit y i N(µ i,1) µ i = x i β With the following observation mechanism: y i = j if τ j 1 y i τ j

68 Ordered probit y i N(µ i,1) µ i = x i β With the following observation mechanism: y i = j if τ j 1 y i τ j Or, alternatively, when we use dummy variables for each category: { 1 if τ j 1 yi τ j y ij = 0 otherwise.

69 Ordered probit y ij = { 1 if τ j 1 y i τ j 0 otherwise.

70 Ordered probit y ij = { 1 if τ j 1 y i τ j 0 otherwise. P(y i = j β,x i ) = Φ(τ j x i β) Φ(τ j 1 x i β)

71 Ordered probit

72 Ordered probit To estimate in R: library(mass) m <- polr(y ~ x1 + x2, method="probit")

73 Ordered probit To estimate in R: library(mass) m <- polr(y ~ x1 + x2, method="probit") To get predicted probabilities: pnorm(m$zeta[2] - xb) - pnorm(m$zeta[1] - xb) predict(m, type="probs")

74 Loglikelihood function Assuming y i can take the values 0,1,2,...,p. P(y i = 0) = Φ(τ 1 x i β) P(y i = p) = Φ(x i β τ p ) P(y i = j 0 < y i < p) = Φ(τ j+1 x i β) Φ(τ j x i β)

75 Loglikelihood function Assuming y i can take the values 0,1,2,...,p. P(y i = 0) = Φ(τ 1 x i β) P(y i = p) = Φ(x i β τ p ) P(y i = j 0 < y i < p) = Φ(τ j+1 x i β) Φ(τ j x i β) l(y) = log(φ(τ 1 x i β))+ log(φ(x i β τ p )) y i =0 y i =p + 0<y i <p log(φ(τ j+1 x i β) Φ(τ j x i β))

76 Exercise Using the asiabaro.dta data set, estimate a model explaining interest in politics by gender, education, and reliance on fate. 1 Perform t-tests for each of the independent variables. Use: se <- sqrt(diag(vcov(m))) 2 Check the standard errors of the τ values are the categories clearly separated? 3 Calculate the predicted probability for males and females of the respondent being very interested in politics.

77 Outline

78 Multinomial model Here the dependent variable consists of multiple categories, without particular order.

79 Multinomial model Here the dependent variable consists of multiple categories, without particular order. A latent variable approach will thus not work.

80 Multinomial model Here the dependent variable consists of multiple categories, without particular order. A latent variable approach will thus not work. The probabilities for a particular case to be in any of those categories still has to be one.

81 Multinomial logit R function: multinom, in package nnet.

82 Multinomial logit R function: multinom, in package nnet. m <- multinom(y ~ x1 + x2) summary(m)

83 Multinomial logit R function: multinom, in package nnet. m <- multinom(y ~ x1 + x2) summary(m) predict(m, type="probs")

84 Multinomial logit 1 p if j = 1 k=1 P(y i = j β,x i ) = ex i β k e x i β j p if j > 1 k=1 ex i β k

85 Multinomial logit 1 p if j = 1 k=1 P(y i = j β,x i ) = ex i β k e x i β j p if j > 1 k=1 ex i β k To get predicted probabilities for a new dataset (X ) - e.g. one that is constant on all variables but one:

86 Multinomial logit 1 p if j = 1 k=1 P(y i = j β,x i ) = ex i β k e x i β j p if j > 1 k=1 ex i β k To get predicted probabilities for a new dataset (X ) - e.g. one that is constant on all variables but one: m <- multinom(y ~ x1 + x2 + x3 + x4) Xb <- X %*% t(coef(m)) denominator <- 1 - rowsums(exp(xb)) probs <- exp(xb) / denominator baseline <- 1 - rowsums(p) p <- cbind(baseline, p)

87 Loglikelihood function 1 p if j = 1 k=1 P(y i = j β,x i ) = ex i β k e x i β j p if j > 1 k=1 ex i β k

88 Loglikelihood function 1 p if j = 1 k=1 P(y i = j β,x i ) = ex i β k e x i β j p if j > 1 k=1 ex i β k l(y) = n p p I(y i = j)x i β j log i=1 j=0 j=0 e x iβ j

89 Exercise Using data set asiabaro.dta, selecting only data from Taiwan, explain party choice by urbanisation, gender, age, and trust in the police. Estimate a multinomial model and interpret the results.

90 Outline

91 models Here the dependent variable is a count (of events). For example, The number of conflicts in a particular period The number of coups d état in the 1980s The number of visits to a psychologist for a respondent

92 models Here the dependent variable is a count (of events). For example, The number of conflicts in a particular period The number of coups d état in the 1980s The number of visits to a psychologist for a respondent Note: Truncated at zero (no negative outcome possible)

93 models Here the dependent variable is a count (of events). For example, The number of conflicts in a particular period The number of coups d état in the 1980s The number of visits to a psychologist for a respondent Note: Truncated at zero (no negative outcome possible) No upper limit

94 models Here the dependent variable is a count (of events). For example, The number of conflicts in a particular period The number of coups d état in the 1980s The number of visits to a psychologist for a respondent Note: Truncated at zero (no negative outcome possible) No upper limit Limited by time, place, or both

95 Poisson regression Stochastic: y i Poisson(λ i )

96 Poisson regression Stochastic: y i Poisson(λ i ) Systematic: λ i = e x iβ

97 Poisson regression Stochastic: y i Poisson(λ i ) Systematic: λ i = e x iβ Poisson distribution: f(k,λ) = λk e λ k!

98 Poisson regression Stochastic: y i Poisson(λ i ) Systematic: λ i = e x iβ Poisson distribution: f(k,λ) = λk e λ k! Note that there is no parameter for the variance. This leads to regular underestimation of the variance (overdispersion, most common) or overestimation of the variance (underdispersion).

99 Poisson regression Stochastic: y i Poisson(λ i ) Systematic: λ i = e x iβ Poisson distribution: f(k,λ) = λk e λ k! Note that there is no parameter for the variance. This leads to regular underestimation of the variance (overdispersion, most common) or overestimation of the variance (underdispersion). glm(y ~ x1 + x2 + x3, family=poisson)

100 Poisson regression The exponential of the coefficients can be interpreted in a multiplicative sense. E.g. if the coefficient of x 1 is 0.012, then e = implies that an increase of x 1 by 1 increases y by 1.2%.

101 Estimating overdispersion The estimated overdispersion of a Poisson model is 1 n k n ( y i ŷ i sd(ŷ i ) )2 = 1 n k i n ( y i e X iβ e X i β )2, i which has a χ 2 n k distribution.

102 Negative binomial The negative binomial model is useful for overdispersed data: Stochastic: y i NegBin(φ,σ 2 )

103 Negative binomial The negative binomial model is useful for overdispersed data: Stochastic: y i NegBin(φ,σ 2 ) Systematic: φ = e x iβ = E(y i )

104 Negative binomial The negative binomial model is useful for overdispersed data: Stochastic: y i NegBin(φ,σ 2 ) Systematic: φ = e xiβ = E(y i ) Variance: V(y X) = φσ 2

105 Negative binomial The negative binomial model is useful for overdispersed data: Stochastic: y i NegBin(φ,σ 2 ) Systematic: φ = e x iβ = E(y i ) Variance: V(y X) = φσ 2 When σ 2 approaches 1, the negative binomial approaches the Poisson distribution.

106 Negative binomial The negative binomial model is useful for overdispersed data: Stochastic: y i NegBin(φ,σ 2 ) Systematic: φ = e x iβ = E(y i ) Variance: V(y X) = φσ 2 When σ 2 approaches 1, the negative binomial approaches the Poisson distribution. library(mass) glm.nb(y ~ x1 + x2 + x3)

Multiple regression: Categorical dependent variables

Multiple regression: Categorical dependent variables Multiple : Categorical Johan A. Elkink School of Politics & International Relations University College Dublin 28 November 2016 1 2 3 4 Outline 1 2 3 4 models models have a variable consisting of two categories.

More information

Data Analytics for Social Science

Data Analytics for Social Science Data Analytics for Social Science Johan A. Elkink School of Politics & International Relations University College Dublin 17 October 2017 Outline 1 2 3 4 5 6 Levels of measurement Discreet Continuous Nominal

More information

Single-level Models for Binary Responses

Single-level Models for Binary Responses Single-level Models for Binary Responses Distribution of Binary Data y i response for individual i (i = 1,..., n), coded 0 or 1 Denote by r the number in the sample with y = 1 Mean and variance E(y) =

More information

Linear Regression With Special Variables

Linear Regression With Special Variables Linear Regression With Special Variables Junhui Qian December 21, 2014 Outline Standardized Scores Quadratic Terms Interaction Terms Binary Explanatory Variables Binary Choice Models Standardized Scores:

More information

Econometrics Lecture 5: Limited Dependent Variable Models: Logit and Probit

Econometrics Lecture 5: Limited Dependent Variable Models: Logit and Probit Econometrics Lecture 5: Limited Dependent Variable Models: Logit and Probit R. G. Pierse 1 Introduction In lecture 5 of last semester s course, we looked at the reasons for including dichotomous variables

More information

Homework 1 Solutions

Homework 1 Solutions 36-720 Homework 1 Solutions Problem 3.4 (a) X 2 79.43 and G 2 90.33. We should compare each to a χ 2 distribution with (2 1)(3 1) 2 degrees of freedom. For each, the p-value is so small that S-plus reports

More information

STAT5044: Regression and Anova

STAT5044: Regression and Anova STAT5044: Regression and Anova Inyoung Kim 1 / 18 Outline 1 Logistic regression for Binary data 2 Poisson regression for Count data 2 / 18 GLM Let Y denote a binary response variable. Each observation

More information

ST3241 Categorical Data Analysis I Generalized Linear Models. Introduction and Some Examples

ST3241 Categorical Data Analysis I Generalized Linear Models. Introduction and Some Examples ST3241 Categorical Data Analysis I Generalized Linear Models Introduction and Some Examples 1 Introduction We have discussed methods for analyzing associations in two-way and three-way tables. Now we will

More information

Review of Multinomial Distribution If n trials are performed: in each trial there are J > 2 possible outcomes (categories) Multicategory Logit Models

Review of Multinomial Distribution If n trials are performed: in each trial there are J > 2 possible outcomes (categories) Multicategory Logit Models Chapter 6 Multicategory Logit Models Response Y has J > 2 categories. Extensions of logistic regression for nominal and ordinal Y assume a multinomial distribution for Y. 6.1 Logit Models for Nominal Responses

More information

Generalized Linear Models Introduction

Generalized Linear Models Introduction Generalized Linear Models Introduction Statistics 135 Autumn 2005 Copyright c 2005 by Mark E. Irwin Generalized Linear Models For many problems, standard linear regression approaches don t work. Sometimes,

More information

Generalized Linear Models. Last time: Background & motivation for moving beyond linear

Generalized Linear Models. Last time: Background & motivation for moving beyond linear Generalized Linear Models Last time: Background & motivation for moving beyond linear regression - non-normal/non-linear cases, binary, categorical data Today s class: 1. Examples of count and ordered

More information

Logit Regression and Quantities of Interest

Logit Regression and Quantities of Interest Logit Regression and Quantities of Interest Stephen Pettigrew March 4, 2015 Stephen Pettigrew Logit Regression and Quantities of Interest March 4, 2015 1 / 57 Outline 1 Logistics 2 Generalized Linear Models

More information

STA216: Generalized Linear Models. Lecture 1. Review and Introduction

STA216: Generalized Linear Models. Lecture 1. Review and Introduction STA216: Generalized Linear Models Lecture 1. Review and Introduction Let y 1,..., y n denote n independent observations on a response Treat y i as a realization of a random variable Y i In the general

More information

Linear Regression Models P8111

Linear Regression Models P8111 Linear Regression Models P8111 Lecture 25 Jeff Goldsmith April 26, 2016 1 of 37 Today s Lecture Logistic regression / GLMs Model framework Interpretation Estimation 2 of 37 Linear regression Course started

More information

Logit Regression and Quantities of Interest

Logit Regression and Quantities of Interest Logit Regression and Quantities of Interest Stephen Pettigrew March 5, 2014 Stephen Pettigrew Logit Regression and Quantities of Interest March 5, 2014 1 / 59 Outline 1 Logistics 2 Generalized Linear Models

More information

disc choice5.tex; April 11, ffl See: King - Unifying Political Methodology ffl See: King/Tomz/Wittenberg (1998, APSA Meeting). ffl See: Alvarez

disc choice5.tex; April 11, ffl See: King - Unifying Political Methodology ffl See: King/Tomz/Wittenberg (1998, APSA Meeting). ffl See: Alvarez disc choice5.tex; April 11, 2001 1 Lecture Notes on Discrete Choice Models Copyright, April 11, 2001 Jonathan Nagler 1 Topics 1. Review the Latent Varible Setup For Binary Choice ffl Logit ffl Likelihood

More information

ST3241 Categorical Data Analysis I Multicategory Logit Models. Logit Models For Nominal Responses

ST3241 Categorical Data Analysis I Multicategory Logit Models. Logit Models For Nominal Responses ST3241 Categorical Data Analysis I Multicategory Logit Models Logit Models For Nominal Responses 1 Models For Nominal Responses Y is nominal with J categories. Let {π 1,, π J } denote the response probabilities

More information

Using the Delta Method to Construct Confidence Intervals for Predicted Probabilities, Rates, and Discrete Changes 1

Using the Delta Method to Construct Confidence Intervals for Predicted Probabilities, Rates, and Discrete Changes 1 Using the Delta Method to Construct Confidence Intervals for Predicted Probabilities, Rates, Discrete Changes 1 JunXuJ.ScottLong Indiana University 2005-02-03 1 General Formula The delta method is a general

More information

Generalized logit models for nominal multinomial responses. Local odds ratios

Generalized logit models for nominal multinomial responses. Local odds ratios Generalized logit models for nominal multinomial responses Categorical Data Analysis, Summer 2015 1/17 Local odds ratios Y 1 2 3 4 1 π 11 π 12 π 13 π 14 π 1+ X 2 π 21 π 22 π 23 π 24 π 2+ 3 π 31 π 32 π

More information

Goals. PSCI6000 Maximum Likelihood Estimation Multiple Response Model 1. Multinomial Dependent Variable. Random Utility Model

Goals. PSCI6000 Maximum Likelihood Estimation Multiple Response Model 1. Multinomial Dependent Variable. Random Utility Model Goals PSCI6000 Maximum Likelihood Estimation Multiple Response Model 1 Tetsuya Matsubayashi University of North Texas November 2, 2010 Random utility model Multinomial logit model Conditional logit model

More information

STA 303 H1S / 1002 HS Winter 2011 Test March 7, ab 1cde 2abcde 2fghij 3

STA 303 H1S / 1002 HS Winter 2011 Test March 7, ab 1cde 2abcde 2fghij 3 STA 303 H1S / 1002 HS Winter 2011 Test March 7, 2011 LAST NAME: FIRST NAME: STUDENT NUMBER: ENROLLED IN: (circle one) STA 303 STA 1002 INSTRUCTIONS: Time: 90 minutes Aids allowed: calculator. Some formulae

More information

Generalized Linear Models I

Generalized Linear Models I Statistics 203: Introduction to Regression and Analysis of Variance Generalized Linear Models I Jonathan Taylor - p. 1/16 Today s class Poisson regression. Residuals for diagnostics. Exponential families.

More information

Classification. Chapter Introduction. 6.2 The Bayes classifier

Classification. Chapter Introduction. 6.2 The Bayes classifier Chapter 6 Classification 6.1 Introduction Often encountered in applications is the situation where the response variable Y takes values in a finite set of labels. For example, the response Y could encode

More information

Econometrics II. Seppo Pynnönen. Spring Department of Mathematics and Statistics, University of Vaasa, Finland

Econometrics II. Seppo Pynnönen. Spring Department of Mathematics and Statistics, University of Vaasa, Finland Department of Mathematics and Statistics, University of Vaasa, Finland Spring 2018 Part III Limited Dependent Variable Models As of Jan 30, 2017 1 Background 2 Binary Dependent Variable The Linear Probability

More information

Lecture 10: Alternatives to OLS with limited dependent variables. PEA vs APE Logit/Probit Poisson

Lecture 10: Alternatives to OLS with limited dependent variables. PEA vs APE Logit/Probit Poisson Lecture 10: Alternatives to OLS with limited dependent variables PEA vs APE Logit/Probit Poisson PEA vs APE PEA: partial effect at the average The effect of some x on y for a hypothetical case with sample

More information

Advanced Quantitative Methods: ordinary least squares

Advanced Quantitative Methods: ordinary least squares Advanced Quantitative Methods: Ordinary Least Squares University College Dublin 31 January 2012 1 2 3 4 5 Terminology y is the dependent variable referred to also (by Greene) as a regressand X are the

More information

Ch 6: Multicategory Logit Models

Ch 6: Multicategory Logit Models 293 Ch 6: Multicategory Logit Models Y has J categories, J>2. Extensions of logistic regression for nominal and ordinal Y assume a multinomial distribution for Y. In R, we will fit these models using the

More information

Linear Regression. Data Model. β, σ 2. Process Model. ,V β. ,s 2. s 1. Parameter Model

Linear Regression. Data Model. β, σ 2. Process Model. ,V β. ,s 2. s 1. Parameter Model Regression: Part II Linear Regression y~n X, 2 X Y Data Model β, σ 2 Process Model Β 0,V β s 1,s 2 Parameter Model Assumptions of Linear Model Homoskedasticity No error in X variables Error in Y variables

More information

Can a Pseudo Panel be a Substitute for a Genuine Panel?

Can a Pseudo Panel be a Substitute for a Genuine Panel? Can a Pseudo Panel be a Substitute for a Genuine Panel? Min Hee Seo Washington University in St. Louis minheeseo@wustl.edu February 16th 1 / 20 Outline Motivation: gauging mechanism of changes Introduce

More information

Latent Variable Models for Binary Data. Suppose that for a given vector of explanatory variables x, the latent

Latent Variable Models for Binary Data. Suppose that for a given vector of explanatory variables x, the latent Latent Variable Models for Binary Data Suppose that for a given vector of explanatory variables x, the latent variable, U, has a continuous cumulative distribution function F (u; x) and that the binary

More information

Generalised linear models. Response variable can take a number of different formats

Generalised linear models. Response variable can take a number of different formats Generalised linear models Response variable can take a number of different formats Structure Limitations of linear models and GLM theory GLM for count data GLM for presence \ absence data GLM for proportion

More information

Linear model A linear model assumes Y X N(µ(X),σ 2 I), And IE(Y X) = µ(x) = X β, 2/52

Linear model A linear model assumes Y X N(µ(X),σ 2 I), And IE(Y X) = µ(x) = X β, 2/52 Statistics for Applications Chapter 10: Generalized Linear Models (GLMs) 1/52 Linear model A linear model assumes Y X N(µ(X),σ 2 I), And IE(Y X) = µ(x) = X β, 2/52 Components of a linear model The two

More information

Statistical Methods III Statistics 212. Problem Set 2 - Answer Key

Statistical Methods III Statistics 212. Problem Set 2 - Answer Key Statistical Methods III Statistics 212 Problem Set 2 - Answer Key 1. (Analysis to be turned in and discussed on Tuesday, April 24th) The data for this problem are taken from long-term followup of 1423

More information

STA 216: GENERALIZED LINEAR MODELS. Lecture 1. Review and Introduction. Much of statistics is based on the assumption that random

STA 216: GENERALIZED LINEAR MODELS. Lecture 1. Review and Introduction. Much of statistics is based on the assumption that random STA 216: GENERALIZED LINEAR MODELS Lecture 1. Review and Introduction Much of statistics is based on the assumption that random variables are continuous & normally distributed. Normal linear regression

More information

Post-Estimation Uncertainty

Post-Estimation Uncertainty Post-Estimation Uncertainty Brad 1 1 Department of Political Science University of California, Davis May 12, 2009 Simulation Methods and Estimation Uncertainty Common approach to presenting statistical

More information

NELS 88. Latent Response Variable Formulation Versus Probability Curve Formulation

NELS 88. Latent Response Variable Formulation Versus Probability Curve Formulation NELS 88 Table 2.3 Adjusted odds ratios of eighth-grade students in 988 performing below basic levels of reading and mathematics in 988 and dropping out of school, 988 to 990, by basic demographics Variable

More information

GOV 2001/ 1002/ E-200 Section 7 Zero-Inflated models and Intro to Multilevel Modeling 1

GOV 2001/ 1002/ E-200 Section 7 Zero-Inflated models and Intro to Multilevel Modeling 1 GOV 2001/ 1002/ E-200 Section 7 Zero-Inflated models and Intro to Multilevel Modeling 1 Anton Strezhnev Harvard University March 23, 2016 1 These section notes are heavily indebted to past Gov 2001 TFs

More information

Review: what is a linear model. Y = β 0 + β 1 X 1 + β 2 X 2 + A model of the following form:

Review: what is a linear model. Y = β 0 + β 1 X 1 + β 2 X 2 + A model of the following form: Outline for today What is a generalized linear model Linear predictors and link functions Example: fit a constant (the proportion) Analysis of deviance table Example: fit dose-response data using logistic

More information

Standard Errors & Confidence Intervals. N(0, I( β) 1 ), I( β) = [ 2 l(β, φ; y) β i β β= β j

Standard Errors & Confidence Intervals. N(0, I( β) 1 ), I( β) = [ 2 l(β, φ; y) β i β β= β j Standard Errors & Confidence Intervals β β asy N(0, I( β) 1 ), where I( β) = [ 2 l(β, φ; y) ] β i β β= β j We can obtain asymptotic 100(1 α)% confidence intervals for β j using: β j ± Z 1 α/2 se( β j )

More information

Models for Heterogeneous Choices

Models for Heterogeneous Choices APPENDIX B Models for Heterogeneous Choices Heteroskedastic Choice Models In the empirical chapters of the printed book we are interested in testing two different types of propositions about the beliefs

More information

Logistic & Tobit Regression

Logistic & Tobit Regression Logistic & Tobit Regression Different Types of Regression Binary Regression (D) Logistic transformation + e P( y x) = 1 + e! " x! + " x " P( y x) % ln$ ' = ( + ) x # 1! P( y x) & logit of P(y x){ P(y

More information

STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F).

STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F). STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis 1. Indicate whether each of the following is true (T) or false (F). (a) T In 2 2 tables, statistical independence is equivalent to a population

More information

Categorical data analysis Chapter 5

Categorical data analysis Chapter 5 Categorical data analysis Chapter 5 Interpreting parameters in logistic regression The sign of β determines whether π(x) is increasing or decreasing as x increases. The rate of climb or descent increases

More information

Lecture 14: Introduction to Poisson Regression

Lecture 14: Introduction to Poisson Regression Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu 8 May 2007 1 / 52 Overview Modelling counts Contingency tables Poisson regression models 2 / 52 Modelling counts I Why

More information

Modelling counts. Lecture 14: Introduction to Poisson Regression. Overview

Modelling counts. Lecture 14: Introduction to Poisson Regression. Overview Modelling counts I Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu Why count data? Number of traffic accidents per day Mortality counts in a given neighborhood, per week

More information

A Generalized Linear Model for Binomial Response Data. Copyright c 2017 Dan Nettleton (Iowa State University) Statistics / 46

A Generalized Linear Model for Binomial Response Data. Copyright c 2017 Dan Nettleton (Iowa State University) Statistics / 46 A Generalized Linear Model for Binomial Response Data Copyright c 2017 Dan Nettleton (Iowa State University) Statistics 510 1 / 46 Now suppose that instead of a Bernoulli response, we have a binomial response

More information

Generalized linear models

Generalized linear models CHAPTER 6 Generalized linear models 6.1 Introduction Generalized linear modeling is a framework for statistical analysis that includes linear and logistic regression as special cases. Linear regression

More information

Applied Health Economics (for B.Sc.)

Applied Health Economics (for B.Sc.) Applied Health Economics (for B.Sc.) Helmut Farbmacher Department of Economics University of Mannheim Autumn Semester 2017 Outlook 1 Linear models (OLS, Omitted variables, 2SLS) 2 Limited and qualitative

More information

Applied Econometrics Lecture 1

Applied Econometrics Lecture 1 Lecture 1 1 1 Università di Urbino Università di Urbino PhD Programme in Global Studies Spring 2018 Outline of this module Beyond OLS (very brief sketch) Regression and causality: sources of endogeneity

More information

Poisson regression: Further topics

Poisson regression: Further topics Poisson regression: Further topics April 21 Overdispersion One of the defining characteristics of Poisson regression is its lack of a scale parameter: E(Y ) = Var(Y ), and no parameter is available to

More information

Maximum Likelihood and. Limited Dependent Variable Models

Maximum Likelihood and. Limited Dependent Variable Models Maximum Likelihood and Limited Dependent Variable Models Michele Pellizzari IGIER-Bocconi, IZA and frdb May 24, 2010 These notes are largely based on the textbook by Jeffrey M. Wooldridge. 2002. Econometric

More information

Logistic Regression. James H. Steiger. Department of Psychology and Human Development Vanderbilt University

Logistic Regression. James H. Steiger. Department of Psychology and Human Development Vanderbilt University Logistic Regression James H. Steiger Department of Psychology and Human Development Vanderbilt University James H. Steiger (Vanderbilt University) Logistic Regression 1 / 38 Logistic Regression 1 Introduction

More information

BIOS 625 Fall 2015 Homework Set 3 Solutions

BIOS 625 Fall 2015 Homework Set 3 Solutions BIOS 65 Fall 015 Homework Set 3 Solutions 1. Agresti.0 Table.1 is from an early study on the death penalty in Florida. Analyze these data and show that Simpson s Paradox occurs. Death Penalty Victim's

More information

Ninth ARTNeT Capacity Building Workshop for Trade Research "Trade Flows and Trade Policy Analysis"

Ninth ARTNeT Capacity Building Workshop for Trade Research Trade Flows and Trade Policy Analysis Ninth ARTNeT Capacity Building Workshop for Trade Research "Trade Flows and Trade Policy Analysis" June 2013 Bangkok, Thailand Cosimo Beverelli and Rainer Lanz (World Trade Organization) 1 Selected econometric

More information

Regression with Qualitative Information. Part VI. Regression with Qualitative Information

Regression with Qualitative Information. Part VI. Regression with Qualitative Information Part VI Regression with Qualitative Information As of Oct 17, 2017 1 Regression with Qualitative Information Single Dummy Independent Variable Multiple Categories Ordinal Information Interaction Involving

More information

Visualizing Inference

Visualizing Inference CSSS 569: Visualizing Data Visualizing Inference Christopher Adolph University of Washington, Seattle February 23, 2011 Assistant Professor, Department of Political Science and Center for Statistics and

More information

12 Modelling Binomial Response Data

12 Modelling Binomial Response Data c 2005, Anthony C. Brooms Statistical Modelling and Data Analysis 12 Modelling Binomial Response Data 12.1 Examples of Binary Response Data Binary response data arise when an observation on an individual

More information

Visualizing Inference

Visualizing Inference CSSS 569: Visualizing Data Visualizing Inference Christopher Adolph University of Washington, Seattle February 23, 2011 Assistant Professor, Department of Political Science and Center for Statistics and

More information

Semiparametric Generalized Linear Models

Semiparametric Generalized Linear Models Semiparametric Generalized Linear Models North American Stata Users Group Meeting Chicago, Illinois Paul Rathouz Department of Health Studies University of Chicago prathouz@uchicago.edu Liping Gao MS Student

More information

STA 216, GLM, Lecture 16. October 29, 2007

STA 216, GLM, Lecture 16. October 29, 2007 STA 216, GLM, Lecture 16 October 29, 2007 Efficient Posterior Computation in Factor Models Underlying Normal Models Generalized Latent Trait Models Formulation Genetic Epidemiology Illustration Structural

More information

Lecture-19: Modeling Count Data II

Lecture-19: Modeling Count Data II Lecture-19: Modeling Count Data II 1 In Today s Class Recap of Count data models Truncated count data models Zero-inflated models Panel count data models R-implementation 2 Count Data In many a phenomena

More information

GOV 2001/ 1002/ E-2001 Section 10 1 Duration II and Matching

GOV 2001/ 1002/ E-2001 Section 10 1 Duration II and Matching GOV 2001/ 1002/ E-2001 Section 10 1 Duration II and Matching Mayya Komisarchik Harvard University April 13, 2016 1 Heartfelt thanks to all of the Gov 2001 TFs of yesteryear; this section draws heavily

More information

High-Throughput Sequencing Course

High-Throughput Sequencing Course High-Throughput Sequencing Course DESeq Model for RNA-Seq Biostatistics and Bioinformatics Summer 2017 Outline Review: Standard linear regression model (e.g., to model gene expression as function of an

More information

Generalized Linear Models for Non-Normal Data

Generalized Linear Models for Non-Normal Data Generalized Linear Models for Non-Normal Data Today s Class: 3 parts of a generalized model Models for binary outcomes Complications for generalized multivariate or multilevel models SPLH 861: Lecture

More information

12 Generalized linear models

12 Generalized linear models 12 Generalized linear models In this chapter, we combine regression models with other parametric probability models like the binomial and Poisson distributions. Binary responses In many situations, we

More information

Lecture 41 Sections Mon, Apr 7, 2008

Lecture 41 Sections Mon, Apr 7, 2008 Lecture 41 Sections 14.1-14.3 Hampden-Sydney College Mon, Apr 7, 2008 Outline 1 2 3 4 5 one-proportion test that we just studied allows us to test a hypothesis concerning one proportion, or two categories,

More information

Practice Examination # 3

Practice Examination # 3 Practice Examination # 3 Sta 23: Probability December 13, 212 This is a closed-book exam so do not refer to your notes, the text, or any other books (please put them on the floor). You may use a single

More information

Poisson Regression. Gelman & Hill Chapter 6. February 6, 2017

Poisson Regression. Gelman & Hill Chapter 6. February 6, 2017 Poisson Regression Gelman & Hill Chapter 6 February 6, 2017 Military Coups Background: Sub-Sahara Africa has experienced a high proportion of regime changes due to military takeover of governments for

More information

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data Ronald Heck Class Notes: Week 8 1 Class Notes: Week 8 Probit versus Logit Link Functions and Count Data This week we ll take up a couple of issues. The first is working with a probit link function. While

More information

Generalized Linear Modeling - Logistic Regression

Generalized Linear Modeling - Logistic Regression 1 Generalized Linear Modeling - Logistic Regression Binary outcomes The logit and inverse logit interpreting coefficients and odds ratios Maximum likelihood estimation Problem of separation Evaluating

More information

Generalized Linear Models (GLZ)

Generalized Linear Models (GLZ) Generalized Linear Models (GLZ) Generalized Linear Models (GLZ) are an extension of the linear modeling process that allows models to be fit to data that follow probability distributions other than the

More information

Loglinear models. STAT 526 Professor Olga Vitek

Loglinear models. STAT 526 Professor Olga Vitek Loglinear models STAT 526 Professor Olga Vitek April 19, 2011 8 Can Use Poisson Likelihood To Model Both Poisson and Multinomial Counts 8-1 Recall: Poisson Distribution Probability distribution: Y - number

More information

STAT 7030: Categorical Data Analysis

STAT 7030: Categorical Data Analysis STAT 7030: Categorical Data Analysis 5. Logistic Regression Peng Zeng Department of Mathematics and Statistics Auburn University Fall 2012 Peng Zeng (Auburn University) STAT 7030 Lecture Notes Fall 2012

More information

Splitting a predictor at the upper quarter or third and the lower quarter or third

Splitting a predictor at the upper quarter or third and the lower quarter or third Splitting a predictor at the upper quarter or third and the lower quarter or third Andrew Gelman David K. Park July 31, 2007 Abstract A linear regression of y on x can be approximated by a simple difference:

More information

Generalized linear models

Generalized linear models Generalized linear models Outline for today What is a generalized linear model Linear predictors and link functions Example: estimate a proportion Analysis of deviance Example: fit dose- response data

More information

Gauge Plots. Gauge Plots JAPANESE BEETLE DATA MAXIMUM LIKELIHOOD FOR SPATIALLY CORRELATED DISCRETE DATA JAPANESE BEETLE DATA

Gauge Plots. Gauge Plots JAPANESE BEETLE DATA MAXIMUM LIKELIHOOD FOR SPATIALLY CORRELATED DISCRETE DATA JAPANESE BEETLE DATA JAPANESE BEETLE DATA 6 MAXIMUM LIKELIHOOD FOR SPATIALLY CORRELATED DISCRETE DATA Gauge Plots TuscaroraLisa Central Madsen Fairways, 996 January 9, 7 Grubs Adult Activity Grub Counts 6 8 Organic Matter

More information

Chapter 11: Models for Matched Pairs

Chapter 11: Models for Matched Pairs : Models for Matched Pairs Dipankar Bandyopadhyay Department of Biostatistics, Virginia Commonwealth University BIOS 625: Categorical Data & GLM [Acknowledgements to Tim Hanson and Haitao Chu] D. Bandyopadhyay

More information

Chapter 9 Regression with a Binary Dependent Variable. Multiple Choice. 1) The binary dependent variable model is an example of a

Chapter 9 Regression with a Binary Dependent Variable. Multiple Choice. 1) The binary dependent variable model is an example of a Chapter 9 Regression with a Binary Dependent Variable Multiple Choice ) The binary dependent variable model is an example of a a. regression model, which has as a regressor, among others, a binary variable.

More information

[Chapter 6. Functions of Random Variables]

[Chapter 6. Functions of Random Variables] [Chapter 6. Functions of Random Variables] 6.1 Introduction 6.2 Finding the probability distribution of a function of random variables 6.3 The method of distribution functions 6.5 The method of Moment-generating

More information

Generalised Linear Mixed Models

Generalised Linear Mixed Models University of Edinburgh November 4, 2014 ANCOVA Bradley-Terry models MANCOVA Meta-analysis Multi-membership models Pedigree analysis: animal models Phylogenetic analysis: comparative approach Random Regression

More information

Introduction To Logistic Regression

Introduction To Logistic Regression Introduction To Lecture 22 April 28, 2005 Applied Regression Analysis Lecture #22-4/28/2005 Slide 1 of 28 Today s Lecture Logistic regression. Today s Lecture Lecture #22-4/28/2005 Slide 2 of 28 Background

More information

Contents. Part I: Fundamentals of Bayesian Inference 1

Contents. Part I: Fundamentals of Bayesian Inference 1 Contents Preface xiii Part I: Fundamentals of Bayesian Inference 1 1 Probability and inference 3 1.1 The three steps of Bayesian data analysis 3 1.2 General notation for statistical inference 4 1.3 Bayesian

More information

GOV 2001/ 1002/ Stat E-200 Section 8 Ordered Probit and Zero-Inflated Logit

GOV 2001/ 1002/ Stat E-200 Section 8 Ordered Probit and Zero-Inflated Logit GOV 2001/ 1002/ Stat E-200 Section 8 Ordered Probit and Zero-Inflated Logit Solé Prillaman Harvard University March 25, 2015 1 / 56 LOGISTICS Reading Assignment- Becker and Kennedy (1992), Harris and Zhao

More information

Binary Dependent Variables

Binary Dependent Variables Binary Dependent Variables In some cases the outcome of interest rather than one of the right hand side variables - is discrete rather than continuous Binary Dependent Variables In some cases the outcome

More information

Generalized Linear Models

Generalized Linear Models Generalized Linear Models Assumptions of Linear Model Homoskedasticity Model variance No error in X variables Errors in variables No missing data Missing data model Normally distributed error Error in

More information

Consider Table 1 (Note connection to start-stop process).

Consider Table 1 (Note connection to start-stop process). Discrete-Time Data and Models Discretized duration data are still duration data! Consider Table 1 (Note connection to start-stop process). Table 1: Example of Discrete-Time Event History Data Case Event

More information

1. Hypothesis testing through analysis of deviance. 3. Model & variable selection - stepwise aproaches

1. Hypothesis testing through analysis of deviance. 3. Model & variable selection - stepwise aproaches Sta 216, Lecture 4 Last Time: Logistic regression example, existence/uniqueness of MLEs Today s Class: 1. Hypothesis testing through analysis of deviance 2. Standard errors & confidence intervals 3. Model

More information

CHAPTER 5. Logistic regression

CHAPTER 5. Logistic regression CHAPTER 5 Logistic regression Logistic regression is the standard way to model binary outcomes (that is, data y i that take on the values 0 or 1). Section 5.1 introduces logistic regression in a simple

More information

Week 7: Binary Outcomes (Scott Long Chapter 3 Part 2)

Week 7: Binary Outcomes (Scott Long Chapter 3 Part 2) Week 7: (Scott Long Chapter 3 Part 2) Tsun-Feng Chiang* *School of Economics, Henan University, Kaifeng, China April 29, 2014 1 / 38 ML Estimation for Probit and Logit ML Estimation for Probit and Logit

More information

8 Nominal and Ordinal Logistic Regression

8 Nominal and Ordinal Logistic Regression 8 Nominal and Ordinal Logistic Regression 8.1 Introduction If the response variable is categorical, with more then two categories, then there are two options for generalized linear models. One relies on

More information

Count and Duration Models

Count and Duration Models Count and Duration Models Stephen Pettigrew April 2, 2015 Stephen Pettigrew Count and Duration Models April 2, 2015 1 / 1 Outline Stephen Pettigrew Count and Duration Models April 2, 2015 2 / 1 Logistics

More information

Machine Learning. Lecture 3: Logistic Regression. Feng Li.

Machine Learning. Lecture 3: Logistic Regression. Feng Li. Machine Learning Lecture 3: Logistic Regression Feng Li fli@sdu.edu.cn https://funglee.github.io School of Computer Science and Technology Shandong University Fall 2016 Logistic Regression Classification

More information

Clinical Trials. Olli Saarela. September 18, Dalla Lana School of Public Health University of Toronto.

Clinical Trials. Olli Saarela. September 18, Dalla Lana School of Public Health University of Toronto. Introduction to Dalla Lana School of Public Health University of Toronto olli.saarela@utoronto.ca September 18, 2014 38-1 : a review 38-2 Evidence Ideal: to advance the knowledge-base of clinical medicine,

More information

Bayesian Inference in GLMs. Frequentists typically base inferences on MLEs, asymptotic confidence

Bayesian Inference in GLMs. Frequentists typically base inferences on MLEs, asymptotic confidence Bayesian Inference in GLMs Frequentists typically base inferences on MLEs, asymptotic confidence limits, and log-likelihood ratio tests Bayesians base inferences on the posterior distribution of the unknowns

More information

Introduction to Statistical Analysis

Introduction to Statistical Analysis Introduction to Statistical Analysis Changyu Shen Richard A. and Susan F. Smith Center for Outcomes Research in Cardiology Beth Israel Deaconess Medical Center Harvard Medical School Objectives Descriptive

More information

Regression Discontinuity Designs

Regression Discontinuity Designs Regression Discontinuity Designs Kosuke Imai Harvard University STAT186/GOV2002 CAUSAL INFERENCE Fall 2018 Kosuke Imai (Harvard) Regression Discontinuity Design Stat186/Gov2002 Fall 2018 1 / 1 Observational

More information

Estimated Precision for Predictions from Generalized Linear Models in Sociological Research

Estimated Precision for Predictions from Generalized Linear Models in Sociological Research Quality & Quantity 34: 137 152, 2000. 2000 Kluwer Academic Publishers. Printed in the Netherlands. 137 Estimated Precision for Predictions from Generalized Linear Models in Sociological Research TIM FUTING

More information

Review. Timothy Hanson. Department of Statistics, University of South Carolina. Stat 770: Categorical Data Analysis

Review. Timothy Hanson. Department of Statistics, University of South Carolina. Stat 770: Categorical Data Analysis Review Timothy Hanson Department of Statistics, University of South Carolina Stat 770: Categorical Data Analysis 1 / 22 Chapter 1: background Nominal, ordinal, interval data. Distributions: Poisson, binomial,

More information

MULTIVARIATE DISTRIBUTIONS

MULTIVARIATE DISTRIBUTIONS Chapter 9 MULTIVARIATE DISTRIBUTIONS John Wishart (1898-1956) British statistician. Wishart was an assistant to Pearson at University College and to Fisher at Rothamsted. In 1928 he derived the distribution

More information

Applied Linear Statistical Methods

Applied Linear Statistical Methods Applied Linear Statistical Methods (short lecturenotes) Prof. Rozenn Dahyot School of Computer Science and Statistics Trinity College Dublin Ireland www.scss.tcd.ie/rozenn.dahyot Hilary Term 2016 1. Introduction

More information