Maximum likelihood families: negative binomial, ordered logit/probit


1 Maximum likelihood families: negative binomial, ordered logit/probit

2 The story so far...
We have some data.
We want to predict it, using other data.
We venture a hypothesis.
We write that hypothesis as a mathematical function.
We fit the function to the data.

3 $y_i \sim \mathrm{Zink}(u = \alpha + \beta x_i,\ v = \gamma)$

4 $y_i \sim \mathrm{Zink}(u = \alpha + \beta x_i,\ v = \gamma)$
$\Pr(y_i \mid u, v) = \mathrm{Zink}(u, v)$

5 previous regressions

Gaussian: $y_i \sim \mathcal{N}(\alpha + \beta x_i,\ \sigma)$

    mle2( y ~ dnorm( mean=a+b*x, sd=s ), start=list(a=mean(y),b=0,s=1) )
    lm( y ~ x )

Binomial: $y_i \sim \mathrm{Binom}\left(\frac{1}{1 + \exp(\alpha + \beta x_i)},\ N\right)$

    mle2( y ~ dbinom( prob=1/(1+exp(a+b*x)), size=n ), start=list(a=0,b=0) )

Poisson: $y_i \sim \mathrm{Pois}(\alpha + \beta x_i)$

    mle2( y ~ dpois( lambda=a+b*x ), start=list(a=mean(y),b=0) )

6 new regressions
Negative binomial: common in ecology.
Ordered logit/probit: common in economics, political science, and sociology.

7 negative binomial
$y_i \sim \mathrm{Nbinom}(\mu, n)$
Model of counts (like Poisson and Binomial).
No upper bound (like Poisson).
mu: mean
n: dispersion (smaller means more dispersed)
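To make "smaller means more dispersed" concrete: under R's `dnbinom(mu=, size=)` parameterization the variance is mu + mu^2/size, so it exceeds the Poisson variance (mu) by more as size shrinks. The sketch below checks this numerically. The helper name `nb_moments` and the values passed to it are illustrative assumptions, not part of the slides.

```r
# Compute the mean and variance of a negative binomial numerically
# by summing over its probability mass function.
# nb_moments is a hypothetical helper name used only for this sketch.
nb_moments <- function(mu, size, upper = 2000) {
  y <- 0:upper
  p <- dnbinom(y, mu = mu, size = size)
  m <- sum(y * p)              # mean
  v <- sum((y - m)^2 * p)      # variance
  c(mean = m, var = v)
}

nb_moments(mu = 2, size = 10)  # variance close to 2 + 4/10
nb_moments(mu = 2, size = 2)   # variance close to 2 + 4/2
```

With the same mean, the smaller size gives the larger variance, which is the sense in which n controls dispersion.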

8 negative binomial
[Figure: four histograms of negative binomial samples (Frequency vs. y), with panels mu=2, size=10; mu=2, size=2; mu=6, size=10; mu=6, size=2.]

9 Nbinom converges to Poisson, as n goes to infinity.
[Figure: histograms (Frequency vs. y) for nbinom with size=1, size=10, and size=1000, each paired with a Poisson histogram of the same mean.]
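The convergence can also be checked without plots. This is a minimal sketch that measures the largest gap between the negative binomial and Poisson probabilities at the same mean. The helper name `pois_gap`, the mean of 2, and the range 0 to 30 are assumptions chosen for illustration.

```r
# Largest absolute difference between nbinom and Poisson probabilities
# over y = 0..30, holding the mean fixed.
# pois_gap is a hypothetical helper name.
pois_gap <- function(size, mu = 2, y = 0:30) {
  max(abs(dnbinom(y, mu = mu, size = size) - dpois(y, lambda = mu)))
}

for (size in c(1, 10, 1000, 1e6)) {
  cat("size =", size, " max abs difference =", pois_gap(size), "\n")
}
```

The gap shrinks as size grows, which matches the histograms: for large n the two distributions are practically indistinguishable.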

10 negative binomial
As a result, ecologists and social scientists almost always use nbinom as a flexible version of Poisson.
nbinom is sometimes called overdispersed Poisson.

11 negative binomial
Back to del Norte salamanders.
[Figure: histogram of salamander density (x axis: salamander density; y axis: frequency).]

12 negative binomial
Q: Poisson or neg binom better?
A: Let the data decide.

    mp <- mle2( d$salaman ~ dpois( lambda=a ), start=list(a=mean(d$salaman)) )
    mnb <- mle2( d$salaman ~ dnbinom( mu=a, size=exp(n) ), start=list(a=mean(d$salaman),n=0) )
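One common way to "let the data decide" is to compare the two fits by AIC. The sketch below is an assumption-laden stand-in, not the slides' analysis: it uses simulated counts instead of the salamander data, and base R `optim` instead of `mle2` so that it runs without extra packages. Both the mean and the size are estimated on the log scale to keep them positive, in the same spirit as `size=exp(n)` above.

```r
# Simulated overdispersed counts (a stand-in for d$salaman, which is not available here)
set.seed(1)
y <- rnbinom(200, mu = 3, size = 0.8)

# Negative log-likelihoods; parameters live on the log scale
nll_pois <- function(par) -sum(dpois(y, lambda = exp(par[1]), log = TRUE))
nll_nb <- function(par) {
  -sum(dnbinom(y, mu = exp(par[1]), size = exp(par[2]), log = TRUE))
}

fit_p  <- optim(log(mean(y)), nll_pois, method = "BFGS")
fit_nb <- optim(c(log(mean(y)), 0), nll_nb, method = "BFGS")

# AIC = 2 * (minimized negative log-likelihood) + 2 * (number of parameters)
aic_p  <- 2 * fit_p$value  + 2 * 1
aic_nb <- 2 * fit_nb$value + 2 * 2
c(poisson = aic_p, negbinom = aic_nb)
```

Because these simulated counts are strongly overdispersed, the negative binomial should come out with the lower AIC despite its extra parameter. With real data the comparison could go either way.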

13 negative binomial
[Figure: histogram of salamander density (y axis: frequency) with the Poisson prediction and the negative binomial prediction overlaid.]

14 Sometimes we have ordered categories:
cold, hot
small, medium, large
strongly disagree, disagree, agree, strongly agree
1, 2, 3, 4, 5, 6, 7
Not Gaussian, because discrete and bounded.
Not counts (binomial, Poisson, negative binomial), because the distance between each category is unknown.

15 The usual way to model this problem is by slicing up a probability distribution into n slices, where n is the number of categories.
[Figure: a probability axis divided into slices.]

16 Easiest thing to do is compute: the probability that a value y is less than or equal to a category value k, for a set of cutoffs z. (Cumulative probability.)
$\Pr(y_i \le k \mid k, z) = \frac{1}{1 + \exp(z_k)}$
$\Pr(y_i \le k \mid k, z) = p(z_k)$

17 4 categories: 1, 2, 3, 4. We need four z values to define the slices.
[Figure labels: $p(z_1)$, $p(z_2)$, $p(z_3)$, $p(z_4)$.]

18 z_4 you get for free! Must choose the value of z_4 so that $\Pr(y_i \le k \mid k = 4, z) = 1$ (y_i is always less than or equal to 4, because 4 is the max).
[Figure labels: $p(z_1)$, $p(z_2)$, $p(z_3)$, $p(z_4)$.]

19 Other z values are fit to the data. Example: equally likely to observe 1, 2, 3 or 4.
[Figure labels: $p(z_1)$, $p(z_2)$, $p(z_3)$, $p(z_4)$.]

20 Example: more likely to observe
[Figure labels: $p(z_1)$, $p(z_2)$, $p(z_3)$, $p(z_4)$.]

21 Example: more likely to observe
[Figure labels: $p(z_1)$, $p(z_2)$, $p(z_3)$, $p(z_4)$.]

22 Example: more likely to observe 3 and
[Figure labels: $p(z_1)$, $p(z_2)$, $p(z_3)$, $p(z_4)$.]

23 How to choose those z values? If we use the logistic for probability, then:
$p(z_k) = \Pr(y_i \le k \mid k, z) = \frac{1}{1 + \exp(z_k)}$
If you know the p(z_k) you want, then you can use algebra to solve for z_k:
$z_k = \ln\left(\frac{1}{p(z_k)} - 1\right)$
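As a small worked case of that algebra, take four equally likely categories. The cumulative probabilities are then 0.25, 0.50, 0.75 and 1, and the formula gives the cutoffs directly. The variable name `p_target` is an assumption made for this sketch; `logit` follows the slides' definition 1/(1+exp(x)).

```r
# The slides' cumulative-probability function
logit <- function(x) 1 / (1 + exp(x))

# Cumulative probabilities for four equally likely categories
p_target <- c(0.25, 0.50, 0.75, 1.00)

# Solve for the cutoffs: z_k = ln(1/p(z_k) - 1)
z <- log(1 / p_target - 1)
z          # the last cutoff is -Inf, because p(z_4) must equal 1

# Round trip: the cutoffs reproduce the target probabilities
logit(z)
```

Note that the cutoffs decrease as k increases. That follows from the sign convention in 1/(1+exp(z)): a smaller z means a larger cumulative probability.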

24 In practice, we use maximum likelihood to find the values of z_1, z_2, ..., z_n. But it is still important to understand what they correspond to in the data.

25 An example: ratings (1 to 4) of satisfaction with course content.

    > stem(y)
    The decimal point is at the

26 First define a function to make our lives easier:

    logit <- function(x) 1/(1+exp(x))

27 Now make a list of z values (just some starting values, for example):

    z <- c( 1, 0, -1, -Inf )

Probability y_i <= k (k in 1 to 4) is:

    > logit(z)
    [1] 0.2689414 0.5000000 0.7310586 1.0000000

28 Why logit(z)? Because we said so. We just need some (cumulative) probability distribution.

    > logit(z)
    [1] 0.2689414 0.5000000 0.7310586 1.0000000

29 Now, to fit our z values to the data, we need to write a probability density function (like dbinom, dpois, dnbinom). This function takes a list of y values and parameters and returns the likelihood of each value, given the parameters.

30 making your own density function
For example, write your own Poisson density:
$\Pr(x \mid \lambda) = \frac{\lambda^x \exp(-\lambda)}{x!}$

    my.dpois <- function( x, lambda ) {
        lambda^x * exp(-lambda) / factorial(x)
    }

31 making your own density function
Produces the same results as the built-in function.

    my.dpois <- function( x, lambda ) {
        lambda^x * exp(-lambda) / factorial(x)
    }
    > my.dpois(1,2)
    [1] 0.2706706
    > dpois(1,2)
    [1] 0.2706706

32 making your own density function
Structure of a density function:

    dsomename <- function( x, parameters, log=FALSE ) {
        code that computes likelihood or log likelihood
    }

x: values to compute likelihood of; must be called x because mle2 expects it.
parameters: whatever parameters it uses.
log: a parameter that tells the function whether to return likelihood or log likelihood.

33 making your own density function

    my.dpois <- function( x, lambda, log=FALSE ) {
        p <- lambda^x * exp(-lambda) / factorial(x)
        if (log==TRUE) p <- log(p)
        p
    }
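One practical caveat with this direct formula: `factorial(x)` overflows to `Inf` once x exceeds 170, so the direct formula breaks down for large counts. A hedged alternative, not from the slides, is to compute the log density first with `lgamma` and exponentiate only when asked. The name `my.dpois2` is hypothetical.

```r
# Poisson density computed on the log scale for numerical stability.
# my.dpois2 is a hypothetical name; lgamma(x + 1) equals log(factorial(x)).
my.dpois2 <- function(x, lambda, log = FALSE) {
  lp <- x * log(lambda) - lambda - lgamma(x + 1)
  if (log) lp else exp(lp)
}

# Agrees with the built-in density for small counts ...
all.equal(my.dpois2(0:10, 2), dpois(0:10, 2))

# ... and remains finite for a count where factorial() overflows
my.dpois2(200, 150, log = TRUE)
dpois(200, 150, log = TRUE)
```

Since `mle2` works with log likelihoods, returning the log directly also avoids taking `log()` of a probability that has already underflowed to zero.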

34 z <- c(z1,z2,z3,z4)
[Figure: cumulative probability logit(z_1) = $p(z_1)$ marked, with slice dorderedlogit(1,z); labels $p(z_1)$, $p(z_2)$, $p(z_3)$, $p(z_4)$.]

35 z <- c(z1,z2,z3,z4)
[Figure: logit(z_1) and logit(z_2) marked, with slices dorderedlogit(1,z) and dorderedlogit(2,z); labels $p(z_1)$, $p(z_2)$, $p(z_3)$, $p(z_4)$.]

36 z <- c(z1,z2,z3,z4)
[Figure: logit(z_1), logit(z_2) and logit(z_3) marked, with slices dorderedlogit(1,z), dorderedlogit(2,z) and dorderedlogit(3,z); labels $p(z_1)$, $p(z_2)$, $p(z_3)$, $p(z_4)$.]

37 z <- c(z1,z2,z3,z4)
[Figure: logit(z_1), logit(z_2), logit(z_3) and logit(z_4) marked, with slices dorderedlogit(1,z), dorderedlogit(2,z), dorderedlogit(3,z) and dorderedlogit(4,z); labels $p(z_1)$, $p(z_2)$, $p(z_3)$, $p(z_4)$.]

38

    dorderlogit <- function( x, z, log=FALSE ) {
        p <- logit( z[x] )      # prob y <= k
        nz <- c( Inf, z )
        np <- logit( nz[x] )    # prob y <= k-1
        p <- p - np             # subtract to get likelihood of y==k
        if ( log==TRUE ) p <- log(p)
        p
    }

x: values to compute likelihood of; must be called x because mle2 expects it.
z: our list of z values.
log: a parameter that tells the function whether to return likelihood or log likelihood.
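A quick sanity check on such a density is that the slice probabilities are all positive and sum to one. The sketch below re-declares `logit` and `dorderlogit` as on the slides so that it runs on its own, and evaluates them at the example starting values.

```r
# Definitions repeated from the slides so this block is self-contained
logit <- function(x) 1 / (1 + exp(x))

dorderlogit <- function(x, z, log = FALSE) {
  p <- logit(z[x])       # prob y <= k
  nz <- c(Inf, z)        # shifted cutoffs; logit(Inf) = 0 for k = 1
  np <- logit(nz[x])     # prob y <= k-1
  p <- p - np            # prob y == k
  if (log == TRUE) p <- log(p)
  p
}

z <- c(1, 0, -1, -Inf)
slice_probs <- dorderlogit(1:4, z)
slice_probs        # probability of each category 1..4
sum(slice_probs)   # should equal 1
```

Prepending `Inf` to z is what makes the k = 1 case work: the "previous" cumulative probability is logit(Inf) = 0, so the first slice is just p(z_1).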

39 Now we can fit, as usual:

    m <- mle2( y ~ dorderlogit( z=c(z1,z2,z3,-Inf) ), start=list( z1=1, z2=0, z3=-1 ) )

    Coefficients: z1 z2 z3
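To connect the fitted cutoffs back to the data, note that with no predictors the maximum likelihood cutoffs should reproduce the observed cumulative proportions. The sketch below checks this on made-up ratings (not the course data), using base R `optim` in place of `mle2` so it needs no extra packages. The names `nll`, `fit`, `cum` and `z_closed` are assumptions for this illustration.

```r
logit <- function(x) 1 / (1 + exp(x))

dorderlogit <- function(x, z, log = FALSE) {
  p <- logit(z[x]) - logit(c(Inf, z)[x])   # prob y == k
  if (log == TRUE) p <- log(p)
  p
}

# Made-up ratings: 5 ones, 10 twos, 20 threes, 15 fours
y <- c(rep(1, 5), rep(2, 10), rep(3, 20), rep(4, 15))

# Negative log-likelihood in the three free cutoffs; z4 is fixed at -Inf
nll <- function(par) {
  p <- dorderlogit(y, c(par, -Inf))
  if (any(p <= 0)) return(1e10)   # penalize cutoffs that are out of order
  -sum(log(p))
}

fit <- optim(c(1, 0, -1), nll)    # same starting values as the slides

# Closed form from the observed cumulative proportions: z_k = ln(1/p_k - 1)
cum <- cumsum(table(y)) / length(y)
z_closed <- log(1 / cum[1:3] - 1)

rbind(optim = fit$par, closed_form = z_closed)
```

The two rows should agree to within the optimizer's tolerance, which is a useful way to see what z_1, z_2 and z_3 "correspond to, in the data".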

40 Plotting the results:

    plot.new()   # frame() is an alias for plot.new()
    lines( c(0,1), c(.5,.5) )
    lines( c( logit(coef(m)[1]), logit(coef(m)[1]) ), c(0,1) )  # z1
    lines( c( logit(coef(m)[2]), logit(coef(m)[2]) ), c(0,1) )  # z2
    lines( c( logit(coef(m)[3]), logit(coef(m)[3]) ), c(0,1) )  # z3
    lines( c( logit(-Inf), logit(-Inf) ), c(0,1) )              # z4 = -Inf

41 That's nice, but we want to predict the ordinal outcome with some other variable. Suppose we have the gender of each respondent.

    > y
    > f

[Console listings of y and f: values not preserved in the transcription.]

42 The usual solution is to treat the z values as category-specific intercepts and add a linear model to each:
$\Pr(y_i \le k \mid k, z, \beta, x_i) = \mathrm{logit}(z_k + \beta x_i)$

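43 Just the intercepts:

    > z <- c( 1, 0, -1, -Inf )
    > logit( z )
    [1] 0.2689414 0.5000000 0.7310586 1.0000000

With covariate:

    > logit( z - 1 )
    [1] 0.5000000 0.7310586 0.8807971 1.0000000
    > logit( z + 1 )
    [1] 0.1192029 0.2689414 0.5000000 1.0000000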

44 New density function:

    dorderlogit2 <- function( x, z, model, log=FALSE ) {
        p <- logit( z[x] + model )      # prob y <= k
        nz <- c( Inf, z )
        np <- logit( nz[x] + model )    # prob y <= k-1
        p <- p - np                     # subtract to get likelihood of y==k
        if ( log==TRUE ) p <- log(p)
        p
    }
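To see what the linear model term does, the sketch below evaluates the slice probabilities at `model = 0` and `model = 1` for the example cutoffs. It is only an illustration with assumed values. Under the slides' convention logit(x) = 1/(1+exp(x)), a positive term lowers every cumulative probability and so moves mass toward the higher categories.

```r
logit <- function(x) 1 / (1 + exp(x))

dorderlogit2 <- function(x, z, model, log = FALSE) {
  p <- logit(z[x] + model)        # prob y <= k
  nz <- c(Inf, z)
  np <- logit(nz[x] + model)      # prob y <= k-1
  p <- p - np                     # prob y == k
  if (log == TRUE) p <- log(p)
  p
}

z <- c(1, 0, -1, -Inf)
p0 <- dorderlogit2(1:4, z, model = 0)   # baseline category probabilities
p1 <- dorderlogit2(1:4, z, model = 1)   # after adding 1 to every cutoff

round(rbind(model_0 = p0, model_1 = p1), 3)
```

Both rows sum to one; only the way the probability is divided among the four categories changes.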

45 New fit:

    m2 <- mle2( y ~ dorderlogit2( z=c(z1,z2,z3,-Inf), model=b*f ), start=list( z1=1, z2=0, z3=-1, b=0 ) )

    Coefficients: z1 z2 z3 b

46 Visualize as before:
[Figure: cutoff plots for males and for females.]
Coefficients: z1 z2 z3 b

47 Another way to visualize:
[Figure: probability plotted against Gender (female=1).]

48 [Figure: two panels of probability plotted against Gender (female=1).]

49 You can compute proportional odds, as with regular logit. But now it is the proportional change in cumulative probability.

    > # proportional cumulative odds calculation
    > z2 <- coef(m2)[2]
    > b <- coef(m2)[4]
    > # original odds
    > o1 <- logit(z2)/(1-logit(z2))
    > # female odds
    > o2 <- logit(z2+b)/(1-logit(z2+b))
    > # show same as exp(-b)
    > o2/o1 # proportional cumulative odds for category 2
    > exp(-b)
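The identity holds for any cutoff and any b, not just the fitted ones: with p = 1/(1+exp(z)), the cumulative odds p/(1-p) simplify to exp(-z), so the ratio of odds at z+b and at z is exp(-b). A minimal numerical check, using arbitrary made-up values rather than fitted coefficients:

```r
logit <- function(x) 1 / (1 + exp(x))

# Arbitrary illustrative values (not estimates from any model)
z2 <- 0.3
b  <- 0.7

o1 <- logit(z2) / (1 - logit(z2))           # cumulative odds at baseline
o2 <- logit(z2 + b) / (1 - logit(z2 + b))   # cumulative odds with the covariate

odds_ratio <- o2 / o1
c(ratio = odds_ratio, exp_minus_b = exp(-b))
```

Because the ratio does not depend on z, the same multiplier applies to the cumulative odds at every cutoff, which is the "proportional odds" property.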

50 The easy way is to use the polr() function, which does almost exactly what we just did manually. polr() just uses logit(-z+b) where we used logit(z+b).

    library(MASS)
    mpolr <- polr( as.ordered(y) ~ f )

    Coefficients: f
    Intercepts:

51 polr() just uses logit(-z+b) where we used logit(z+b).

    library(MASS)
    mpolr <- polr( as.ordered(y) ~ f )

    Coefficients: f
    Intercepts:

    mle2(minuslogl = y ~ dorderlogit2(z = c(z1, z2, z3, -Inf), model = b * f),
         start = list(z1 = 1, z2 = 0, z3 = -1, b = 0))

    Coefficients: z1 z2 z3 b
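The sign relationship can be checked on made-up data. The sketch below assumes the MASS package is installed, and uses base R `optim` instead of `mle2` for the manual fit. The data and the name `nll2` are hypothetical. If the manual model is 1/(1+exp(z_k + b*f)) and polr models logit P(y <= k) = zeta_k - beta*f, then the polr intercepts should be approximately the negatives of the manual cutoffs, while the slope keeps its sign.

```r
library(MASS)

logit <- function(x) 1 / (1 + exp(x))

dorderlogit2 <- function(x, z, model, log = FALSE) {
  p <- logit(z[x] + model) - logit(c(Inf, z)[x] + model)   # prob y == k
  if (log == TRUE) p <- log(p)
  p
}

# Made-up ratings for two groups (f = 0 and f = 1); not the course data
y <- c(rep(1:4, times = c(8, 12, 10, 5)), rep(1:4, times = c(4, 9, 12, 12)))
f <- c(rep(0, 35), rep(1, 37))

# Manual fit: three free cutoffs plus one slope
nll2 <- function(par) {
  p <- dorderlogit2(y, z = c(par[1:3], -Inf), model = par[4] * f)
  if (any(p <= 0)) return(1e10)   # penalize cutoffs that are out of order
  -sum(log(p))
}
fit <- optim(c(1, 0, -1, 0), nll2)

# Packaged fit
mpolr <- polr(as.ordered(y) ~ f)

rbind(manual_cutoffs_negated = -fit$par[1:3], polr_intercepts = mpolr$zeta)
c(manual_b = fit$par[4], polr_f = unname(coef(mpolr)))
```

Small differences between the two fits reflect optimizer tolerance only; the models are the same up to the sign of the cutoffs.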

52 Ordered logit:
Lets you model ordered discrete data.
Derives histograms that change as predictor variables change.
Is hard to interpret without visualizing.

53 ordered _obit
Other probability distributions work fine.
probit => gaussian density ("probit" is just the name statisticians give to the cumulative normal.)
A Tobit regression is not ordered at all; it's a kind of censored gaussian model. We won't talk about it, but it's common in Econ and PoliSci.


More information

Statistical Distribution Assumptions of General Linear Models

Statistical Distribution Assumptions of General Linear Models Statistical Distribution Assumptions of General Linear Models Applied Multilevel Models for Cross Sectional Data Lecture 4 ICPSR Summer Workshop University of Colorado Boulder Lecture 4: Statistical Distributions

More information

Generalized linear models

Generalized linear models Generalized linear models Douglas Bates November 01, 2010 Contents 1 Definition 1 2 Links 2 3 Estimating parameters 5 4 Example 6 5 Model building 8 6 Conclusions 8 7 Summary 9 1 Generalized Linear Models

More information

TMA 4275 Lifetime Analysis June 2004 Solution

TMA 4275 Lifetime Analysis June 2004 Solution TMA 4275 Lifetime Analysis June 2004 Solution Problem 1 a) Observation of the outcome is censored, if the time of the outcome is not known exactly and only the last time when it was observed being intact,

More information

STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F).

STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F). STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis 1. Indicate whether each of the following is true (T) or false (F). (a) (b) (c) (d) (e) In 2 2 tables, statistical independence is equivalent

More information

Bayesian Multivariate Logistic Regression

Bayesian Multivariate Logistic Regression Bayesian Multivariate Logistic Regression Sean M. O Brien and David B. Dunson Biostatistics Branch National Institute of Environmental Health Sciences Research Triangle Park, NC 1 Goals Brief review of

More information

Introduction to Statistical Data Analysis Lecture 3: Probability Distributions

Introduction to Statistical Data Analysis Lecture 3: Probability Distributions Introduction to Statistical Data Analysis Lecture 3: Probability Distributions James V. Lambers Department of Mathematics The University of Southern Mississippi James V. Lambers Statistical Data Analysis

More information

Logit Regression and Quantities of Interest

Logit Regression and Quantities of Interest Logit Regression and Quantities of Interest Stephen Pettigrew March 5, 2014 Stephen Pettigrew Logit Regression and Quantities of Interest March 5, 2014 1 / 59 Outline 1 Logistics 2 Generalized Linear Models

More information

7/28/15. Review Homework. Overview. Lecture 6: Logistic Regression Analysis

7/28/15. Review Homework. Overview. Lecture 6: Logistic Regression Analysis Lecture 6: Logistic Regression Analysis Christopher S. Hollenbeak, PhD Jane R. Schubart, PhD The Outcomes Research Toolbox Review Homework 2 Overview Logistic regression model conceptually Logistic regression

More information

POLI 618 Notes. Stuart Soroka, Department of Political Science, McGill University. March 2010

POLI 618 Notes. Stuart Soroka, Department of Political Science, McGill University. March 2010 POLI 618 Notes Stuart Soroka, Department of Political Science, McGill University March 2010 These pages were written originally as my own lecture notes, but are now designed to be distributed to students

More information

Statistics 3858 : Contingency Tables

Statistics 3858 : Contingency Tables Statistics 3858 : Contingency Tables 1 Introduction Before proceeding with this topic the student should review generalized likelihood ratios ΛX) for multinomial distributions, its relation to Pearson

More information

Linear, Generalized Linear, and Mixed-Effects Models in R. Linear and Generalized Linear Models in R Topics

Linear, Generalized Linear, and Mixed-Effects Models in R. Linear and Generalized Linear Models in R Topics Linear, Generalized Linear, and Mixed-Effects Models in R John Fox McMaster University ICPSR 2018 John Fox (McMaster University) Statistical Models in R ICPSR 2018 1 / 19 Linear and Generalized Linear

More information

Applied Econometrics Lecture 1

Applied Econometrics Lecture 1 Lecture 1 1 1 Università di Urbino Università di Urbino PhD Programme in Global Studies Spring 2018 Outline of this module Beyond OLS (very brief sketch) Regression and causality: sources of endogeneity

More information

Lecture 12: Application of Maximum Likelihood Estimation:Truncation, Censoring, and Corner Solutions

Lecture 12: Application of Maximum Likelihood Estimation:Truncation, Censoring, and Corner Solutions Econ 513, USC, Department of Economics Lecture 12: Application of Maximum Likelihood Estimation:Truncation, Censoring, and Corner Solutions I Introduction Here we look at a set of complications with the

More information

Model Estimation Example

Model Estimation Example Ronald H. Heck 1 EDEP 606: Multivariate Methods (S2013) April 7, 2013 Model Estimation Example As we have moved through the course this semester, we have encountered the concept of model estimation. Discussions

More information

Econ 5410 (former 641) Econometrics I. Prof. J. Huston McCulloch

Econ 5410 (former 641) Econometrics I. Prof. J. Huston McCulloch Econ 5410 (former 641) Econometrics I Prof. J. Huston McCulloch Lecture 1 Introduction Chapter 1 Elementary Statistics Random Variables, distributions Mean, variance Appendi B.1 B.3. Levels of Econometrics

More information

Analysis of Categorical Data. Nick Jackson University of Southern California Department of Psychology 10/11/2013

Analysis of Categorical Data. Nick Jackson University of Southern California Department of Psychology 10/11/2013 Analysis of Categorical Data Nick Jackson University of Southern California Department of Psychology 10/11/2013 1 Overview Data Types Contingency Tables Logit Models Binomial Ordinal Nominal 2 Things not

More information

Review of Poisson Distributions. Section 3.3 Generalized Linear Models For Count Data. Example (Fatalities From Horse Kicks)

Review of Poisson Distributions. Section 3.3 Generalized Linear Models For Count Data. Example (Fatalities From Horse Kicks) Section 3.3 Generalized Linear Models For Count Data Review of Poisson Distributions Outline Review of Poisson Distributions GLMs for Poisson Response Data Models for Rates Overdispersion and Negative

More information

Statistical Methods III Statistics 212. Problem Set 2 - Answer Key

Statistical Methods III Statistics 212. Problem Set 2 - Answer Key Statistical Methods III Statistics 212 Problem Set 2 - Answer Key 1. (Analysis to be turned in and discussed on Tuesday, April 24th) The data for this problem are taken from long-term followup of 1423

More information

Lecture 5: Poisson and logistic regression

Lecture 5: Poisson and logistic regression Dankmar Böhning Southampton Statistical Sciences Research Institute University of Southampton, UK S 3 RI, 3-5 March 2014 introduction to Poisson regression application to the BELCAP study introduction

More information

Tento projekt je spolufinancován Evropským sociálním fondem a Státním rozpočtem ČR InoBio CZ.1.07/2.2.00/

Tento projekt je spolufinancován Evropským sociálním fondem a Státním rozpočtem ČR InoBio CZ.1.07/2.2.00/ Tento projekt je spolufinancován Evropským sociálním fondem a Státním rozpočtem ČR InoBio CZ.1.07/2.2.00/28.0018 Statistical Analysis in Ecology using R Linear Models/GLM Ing. Daniel Volařík, Ph.D. 13.

More information

Logistic Regression in R. by Kerry Machemer 12/04/2015

Logistic Regression in R. by Kerry Machemer 12/04/2015 Logistic Regression in R by Kerry Machemer 12/04/2015 Linear Regression {y i, x i1,, x ip } Linear Regression y i = dependent variable & x i = independent variable(s) y i = α + β 1 x i1 + + β p x ip +

More information

Standard Errors & Confidence Intervals. N(0, I( β) 1 ), I( β) = [ 2 l(β, φ; y) β i β β= β j

Standard Errors & Confidence Intervals. N(0, I( β) 1 ), I( β) = [ 2 l(β, φ; y) β i β β= β j Standard Errors & Confidence Intervals β β asy N(0, I( β) 1 ), where I( β) = [ 2 l(β, φ; y) ] β i β β= β j We can obtain asymptotic 100(1 α)% confidence intervals for β j using: β j ± Z 1 α/2 se( β j )

More information

Logistic Regression. Some slides from Craig Burkett. STA303/STA1002: Methods of Data Analysis II, Summer 2016 Michael Guerzhoy

Logistic Regression. Some slides from Craig Burkett. STA303/STA1002: Methods of Data Analysis II, Summer 2016 Michael Guerzhoy Logistic Regression Some slides from Craig Burkett STA303/STA1002: Methods of Data Analysis II, Summer 2016 Michael Guerzhoy Titanic Survival Case Study The RMS Titanic A British passenger liner Collided

More information

Maximum Likelihood Methods

Maximum Likelihood Methods Maximum Likelihood Methods Some of the models used in econometrics specify the complete probability distribution of the outcomes of interest rather than just a regression function. Sometimes this is because

More information

Generalized Linear Models for Count, Skewed, and If and How Much Outcomes

Generalized Linear Models for Count, Skewed, and If and How Much Outcomes Generalized Linear Models for Count, Skewed, and If and How Much Outcomes Today s Class: Review of 3 parts of a generalized model Models for discrete count or continuous skewed outcomes Models for two-part

More information

Lecture 10: Alternatives to OLS with limited dependent variables. PEA vs APE Logit/Probit Poisson

Lecture 10: Alternatives to OLS with limited dependent variables. PEA vs APE Logit/Probit Poisson Lecture 10: Alternatives to OLS with limited dependent variables PEA vs APE Logit/Probit Poisson PEA vs APE PEA: partial effect at the average The effect of some x on y for a hypothetical case with sample

More information