Reparametrization of COM-Poisson Regression Models with Applications in the Analysis of Experimental Count Data

Size: px
Start display at page:

Download "Reparametrization of COM-Poisson Regression Models with Applications in the Analysis of Experimental Count Data"

Transcription

1 Reparametrization of COM-Poisson Regression Models with Applications in the Analysis of Experimental Count Data Eduardo Elias Ribeiro Junior 1 2 Walmes Marques Zeviani 1 Wagner Hugo Bonat 1 Clarice Garcia Borges Demétrio 2 John Hinde 3 1 Statistics and Geoinformation Laboratory (LEG-UFPR) 2 Department of Exact Sciences (ESALQ-USP) 3 School of Mathematics, Statistics and Applied Mathematics (NUI-Galway) 27th June 2018 jreduardo@usp.br edujrrib@gmail.com

2 Outline 1. Background 2. Reparametrization 3. Simulation study Final remarks Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 0

3 Background 1 Background Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 0

4 Count data Background Number of times an event occurs in the observation unit. Random variables that assume non-negative integer values. Let Y be a counting random variable, so that y = 0, 1, 2,... Examples in experimental researches: number of grains produced by a plant; number of fruits produced by a tree; number of insects on a particular cell; others. Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 1

5 Background Poisson model and limitations GLM framework (Nelder & Wedderburn 1972) Provide suitable distribution for a counting random variables; Efficient algorithm for estimation and inference; Implemented in many software. Poisson model Relationship between mean and variance, E(Y) = Var(Y); Main limitations Overdispersion (more common), E(Y) < Var(Y) Underdispersion (less common), E(Y) > Var(Y) Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 2

6 Background COM-Poisson distribution Probability mass function (Shmueli et al. 2005) takes the form Pr(Y = y λ, ν) = where λ > 0 and ν 0. Moments are not available in closed form; λ y (y!) ν Z(λ, ν), Z(λ, ν) = λ j (j!) j=0 ν, Expectation and variance can be closely approximated by E(Y) λ 1/ν ν 1 2ν and Var(Y) λ1/ν ν with accurate approximations for ν 1 or λ > 10 ν (Shmueli et al. 2005, Sellers et al. 2012). Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 3

7 Background COM-Poisson regression models Model definition Modelling the relationship between E(Y i ) and x i indirectly (Sellers & Shmueli 2010); Y i x i COM-Poisson(λ i, ν) η(e(y i x i )) = log(λ i ) = x i β Main goals Study distribution properties in terms of i) modelling real count data and ii) inference aspects. Propose a reparametrization in order to model the expectation of the response variable as a function of the covariate values directly. Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 4

8 Reparametrization 2 Reparametrization Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 4

9 Reparametrization Reparametrized COM-Poisson Reparametrization Introduced new parameter µ, using the mean approximation µ = λ 1/ν ν 1 ( ) (ν 1) ν λ = µ + ; 2ν 2ν Precision parameter is taken on the log scale to avoid restrictions on the parameter space Probability mass function φ = log(ν) φ R; Replacing λ and ν as function of µ and φ in the pmf of COM-Poisson ( ) ye φ Pr(Y = y µ, φ) = µ + eφ 1 (y!) e φ 2e φ Z(µ, φ). Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 5

10 Reparametrization Study of the moments approximations µ 15 µ (a) φ 5 (b) φ Figure: Quadratic errors for the approximation of the (a) expectation and (b) variance. Dotted lines represent the restriction for suitable approximations given by Shmueli et al. (2005). Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 6

11 Reparametrization COM-Poisson µ distribution φ = 0.7 φ = 0.0 φ = 0.7 µ = 2 µ = 8 µ = P(Y = y) y Figure: Shapes of the COM-Poisson distribution for different parameter values. Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 7

12 Reparametrization Properties of COM-Poisson distribution To explore the flexibility of the COM-Poisson distribution, we consider the follow indexes: Dispersion index: DI = Var(Y)/E(Y); Zero-inflation index: ZI = 1 + log Pr(Y = 0)/E(Y); Heavy-tail index: HT = Pr(Y = y + 1)/ Pr(Y = y), for y. These indexes are interpreted in relation to the Poisson distribution: over- (DI > 1), under- (DI < 1) and equidispersion (DI = 1); zero-inflation (ZI > 0) and zero-deflation (ZI < 0) and heavy-tail distribution for HT 1 when y. Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 8

13 Reparametrization Properties of COM-Poisson distribution φ φ φ φ Mean Variance Dispersion index Zero inflation index Heavy tail index µ (a) µ (b) µ (c) y Figure: Indexes for COM-Poisson distribution. (a) Mean and variance relationship, (b d) dispersion, zero-inflation and heavy-tail indexes for different parameter values. Dotted lines represents the Poisson special case. (d) Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 9

14 Reparametrization COM-Poisson µ regression models Let y i a set of independent observations from the COM-Poisson and x i = (x i1, x i2,..., x ip ) is a vector of known covariates, i = 1, 2,..., n. Model definition Modelling relationship between E(Y i ) and x i directly Y i x i COM-Poisson µ (µ i, φ) log(e(y i x i )) = log(µ i ) = x i β Log-likelihood function (l = l(β, φ y)) [ n ( ) ] l = e φ y i log µ i + eφ n 1 2e i=1 φ log(y i!) where µ i = exp(x i β) i=1 n i=1 log(z(µ i, φ)) Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 10

15 Reparametrization Estimation and inference The estimation and inference is based on the method of maximum likelihood. Let θ = (β, φ) the model parameters. Parameter estimates are obtained by numerical maximization of the log-likelihood function (by BFGS algorithm); l( ˆθ) = max l(θ), θ R p+1 ; Standard errors for regression coefficients are obtained based on the observed information matrix; Var( ˆθ) = H 1, where H is the matrix of second partial derivatives at ˆθ; Confidence intervals for ˆµ i are obtained by delta method.. Var[g( ˆθ)] = G Var( ˆθ)G, where G = ( g/ β 1,..., g/ β p ) ; The Hessian matrix H is obtained numerically by finite differences. Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 11

16 Simulation study 3 Simulation study Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 11

17 Simulation study Definitions on the simulation study Objective: assess the properties of maximum likelihood estimators and orthogonality in the reparametrized model; Simulation: we consider counts generated according a regression model with a continuous and categorical covariates and different dispersion scenarios. Algorithm 1: Steps in simulation study. for n {50, 100, 300, 1000} do set x 1 as a sequence, with n elements, between 0 and 1; set x 2 as a repetition, with n elements, of three categories; compute µ using µ = exp(β 0 + β 1 x 1 + β 21 x 21 + β 22 x 22 ); for φ { 1.6, 1.0, 0.0, 1.8} do repeat simulate y from COM-Poisson distribution with µ and φ parameters; fit COM-Poisson µ regression model to simulated y; get ˆθ = ( ˆφ, ˆβ 0, ˆβ 1, ˆβ 21, ˆβ 22 ); get confidence intervals for ˆθ based on the observed information matrix. until 1000 times; Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 12

18 Simulation study Definitions on the simulation study Level of categorical variable (x 2 ) a b c φ = φ = Average count Dispersion index 4 3 φ = 0 φ = Values of continous variable (x 1 ) Values of continous variable (x 1 ) Figure: Average counts (left) and dispersion indexes (right) for each scenario considered in the simulation study. Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 13

19 Simulation study Bias of the estimators Sample size n=50 n=100 n=300 n= φ = 1.6 φ = 1 φ = 0 φ = 1.8 ^ β22 ^ β21 ^ β1 ^ β0 ^ φ Standardized Bias Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 14

20 Simulation study Coverage rate of the confidence intervals φ = 1.6 φ = 1 φ = 0 φ = 1.8 ^ φ ^ β0 ^ β ^ β21 ^ β22 Coverage rate Sample size Figure: Coverage rate based on confidence intervals obtained by quadratic approximation for different sample sizes and dispersion levels. Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 15

21 Simulation study Orthogonality property of the MLEs φ φ Original parametrization Proposed parametrization φ = φ = φ = φ = λ µ φ = φ = φ = φ = Figure: Deviance surfaces contour plots under original and proposed parametrization. The ellipses are confidence regions (90, 95 and 99%), dotted lines are the maximum likelihood estimates, and points are the real parameters used in the simulation. Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 16

22 4 Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 16

23 Motivating data sets and data analysis Three illustrative examples of count data analysis are reported. Assessing toxicity of nitrofen in aquatic systems, an equidispersed example; Soil moisture and potassium doses on soybean culture, an overdispersed example; and Artificial defoliation in cotton phenology, an underdispersed example. In the data analysis, we consider the COM-Poisson model in the two forms (original and new parametrization) and the quasi-poisson regression model as alternative models for the standard Poisson regression model. Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 17

24 Artificial defoliation in cotton phenology 4.1 Artificial defoliation in cotton phenology Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 17

25 Cotton bolls data Artificial defoliation in cotton phenology Aim: to assess the effects of five defoliation levels on the bolls produced at five growth stages; Design: factorial 5 5, with 5 replicates; Experimental unit: a plot with 2 plants; Factors: Artificial defoliation (des): Growth stage (est): Response variable: Total number of cotton bolls; Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 18

26 Model specification Artificial defoliation in cotton phenology Linear predictor: following Zeviani et al. (2014) log(µ ij ) = β 0 + β 1j def i + β 2j def 2 i i varies in the levels of artificial defoliation; j varies in the levels of growth stages. Alternative models: Poisson (µ ij ); COM-Poisson (λ ij = η(µ ij ), φ) COM-Poisson µ (µ ij, φ) Quasi-Poisson (var(y ij ) = σµ ij ) Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 19

27 Parameter estimates Artificial defoliation in cotton phenology Table: Parameter estimates (Est) and ratio between estimate and standard error (SE). Poisson COM-Poisson COM-Poisson µ Quasi-Poisson Est Est/SE Est Est/SE Est Est/SE Est Est/SE φ, σ β β β β β β β β β β β LogLik AIC BIC Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 20

28 Fitted curves Artificial defoliation in cotton phenology vegetative Poisson COM Poisson COM Poisson µ Quasi Poisson flower bud blossom boll boll open Number of bolls produced Artificial defoliation level Figure: Scatterplots of the observed data and curves of fitted values with 95% confidence intervals as functions of the defoliation level for each growth stage. Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 21

29 Soil moisture and potassium doses on soybean culture 4.2 Soil moisture and potassium doses on soybean culture Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 21

30 Soybean data Soil moisture and potassium doses on soybean culture Aim: evaluate the effects of potassium doses applied to soil in different soil moisture levels; Design: factorial 5 3 in a randomized complete block design (5 blocks); Experimental unit: a pot with a plant; Factors: Potassium fertilization dose (K): Soil moisture level (umid): Response variable: Total number of bean seeds per pot; Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 22

31 Model specification Soil moisture and potassium doses on soybean culture Linear predictor: based on descriptive analysis, log(µ ijk ) = β 0 + γ i + τ j + β 1 K k + β 2 K 2 k + β 3jK k i varies according the blocks; j varies in the levels of soil moisture; k varies in the levels of potassium fertilization. Alternative models: Poisson (µ ij ); COM-Poisson (λ ij = η(µ ij ), φ) COM-Poisson µ (µ ij, φ) Quasi-Poisson (var(y ij ) = σµ ij ) Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 23

32 Parameter estimates Soil moisture and potassium doses on soybean culture Table: Parameter estimates (Est) and ratio between estimate and standard error (SE). Poisson COM-Poisson COM-Poisson µ Quasi-Poisson Est Est/SE Est Est/SE Est Est/SE Est Est/SE φ, σ β γ γ γ γ τ τ β β β β LogLik AIC BIC Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 24

33 Fitted curves Soil moisture and potassium doses on soybean culture Poisson COM Poisson COM Poisson µ Quasi Poisson moisture : 37.5% moisture : 50% moisture : 62.5% Number of grains per pot Potassium fertilization level Figure: Dispersion diagrams of been seeds counts as function of potassium doses and humidity levels with fitted curves and confidence intervals (95%). Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 25

34 Assessing toxicity of nitrofen in aquatic systems 4.3 Assessing toxicity of nitrofen in aquatic systems Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 25

35 Nitrofen data Assessing toxicity of nitrofen in aquatic systems Aim: measure the reproductive toxicity of the herbicide nitrofen on a species of zooplankton (Ceriodaphnia dubia); Design: completely randomized design, with 10 replicates; Experimental unit: zooplankton animal; Factors: herbicide nitrofen dose (dose); Response variable: Total number of live offspring; Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 26

36 Model specification Assessing toxicity of nitrofen in aquatic systems Linear predictors: Linear: log(µ i ) = β 0 + β 1 dose i, Quadratic: log(µ i ) = β 0 + β 1 dose i + β 2 dose 2 i and Cubic: log(µ i ) = β 0 + β 1 dose i + β 2 dose 2 i + β 3dose 3 i. Alternative models: Poisson (µ ij ); COM-Poisson (λ ij = η(µ ij ), φ) COM-Poisson µ (µ ij, φ) Quasi-Poisson (var(y ij ) = σµ ij ) Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 27

37 Likelihood ratio tests Assessing toxicity of nitrofen in aquatic systems Table: Model fit measures and comparisons between linear predictors. Poisson np l AIC 2(diff l) diff np P(> χ 2 ) Linear Quadratic E 16 Cubic E 02 COM-Poisson np l AIC 2(diff l) diff np P(> χ 2 ) ˆφ Linear Quadratic E Cubic E COM-Poisson µ np l AIC 2(diff l) diff np P(> χ 2 ) ˆφ Linear Quadratic E Cubic E Quasi-Poisson np QDev AIC F diff np P(> F) ˆσ Linear Quadratic E Cubic E Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 28

38 Assessing toxicity of nitrofen in aquatic systems Parameter estimates and fitted values Table: Parameter estimates (Est) and ratio between estimate and standard error (SE). Poisson COM-Poisson COM-Poisson µ Quasi-Poisson Est Est/SE Est Est/SE Est Est/SE Est Est/SE β β β β Number of live offspring Poisson COM Poisson COM Poisson µ Quasi Poisson Nitrofen concentration level Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 29

39 4.4 Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 29

40 Additional results To compare the computational times on the two parametrizations we repeat the fitting 50 times. Time (seconds) 1.5e e e e e+09 Cotton (under) COM Poisson COM Poisson µ Nitrofen (equi) COM Poisson COM Poisson µ 1.0e e e+10 Soybean (over) COM Poisson COM Poisson µ Figure: Computational times to fit the models under original and reparametrized versions based on the fifty repetitions. Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 30

41 Final remarks 5 Final remarks Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 30

42 Final remarks Concluding remarks Summary Over/under-dispersion needs caution; COM-Poisson is a suitable choice for these situations; The proposed reparametrization, COM-Poisson µ has some advantages: Simple transformation of the parameter space; Leads to the orthogonality of the parameters (seen empirically); Full parametric approach; Empirical correlation between the estimators was practically null; Faster for fitting; Allows interpretation of the coefficients directly (like GLM-Poisson model). Future work Simulation study to assess model robustness against distribution miss specification; Assess theoretical approximations for Z(λ, ν) (or Z(µ, φ)), in order to avoid the selection of sum s upper bound; Propose a mixed GLM based on the COM-Poisson µ model. Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 31

43 Notes Full-text article is available on arxiv All codes (in R) and source files are available on GitHub Acknowledgements National Council for Scientific and Technological Development (CNPq), for their support. Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 32

44 References Bibliography Nelder, J. A. & Wedderburn, R. W. M. (1972), Generalized Linear Models, Journal of the Royal Statistical Society. Series A (General) 135, Sellers, K. F., Borle, S. & Shmueli, G. (2012), The com-poisson model for count data: a survey of methods and applications, Applied Stochastic Models in Business and Industry 28(2), Sellers, K. F. & Shmueli, G. (2010), A flexible regression model for count data, Annals of Applied Statistics 4(2), Shmueli, G., Minka, T. P., Kadane, J. B., Borle, S. & Boatwright, P. (2005), A useful distribution for fitting discrete data: Revival of the Conway-Maxwell-Poisson distribution, Journal of the Royal Statistical Society. Series C: Applied Statistics 54(1), Zeviani, W. M., Ribeiro Jr, P. J., Bonat, W. H., Shimakura, S. E. & Muniz, J. A. (2014), The Gamma-count distribution in the analysis of experimental underdispersed data, Journal of Applied Statistics pp Clarice Garcia Borges Demétrio Reparametrization of COM-Poisson Regression Models Slide 33

Extended Poisson Tweedie: Properties and regression models for count data

Extended Poisson Tweedie: Properties and regression models for count data Extended Poisson Tweedie: Properties and regression models for count data Wagner H. Bonat 1,2, Bent Jørgensen 2,Célestin C. Kokonendji 3, John Hinde 4 and Clarice G. B. Demétrio 5 1 Laboratory of Statistics

More information

Extended Poisson-Tweedie: properties and regression models for count data

Extended Poisson-Tweedie: properties and regression models for count data Extended Poisson-Tweedie: properties and regression models for count data arxiv:1608.06888v2 [stat.me] 11 Sep 2016 Wagner H. Bonat and Bent Jørgensen and Célestin C. Kokonendji and John Hinde and Clarice

More information

The Gamma-count distribution in the analysis of. experimental underdispersed data

The Gamma-count distribution in the analysis of. experimental underdispersed data The Gamma-count distribution in the analysis of experimental underdispersed data arxiv:1312.2423v1 [stat.ap] 9 Dec 2013 Walmes Marques Zeviani 1, Paulo Justiniano Ribeiro Jr 1, Wagner Hugo Bonat 1, Silvia

More information

Key Words: Conway-Maxwell-Poisson (COM-Poisson) regression; mixture model; apparent dispersion; over-dispersion; under-dispersion

Key Words: Conway-Maxwell-Poisson (COM-Poisson) regression; mixture model; apparent dispersion; over-dispersion; under-dispersion DATA DISPERSION: NOW YOU SEE IT... NOW YOU DON T Kimberly F. Sellers Department of Mathematics and Statistics Georgetown University Washington, DC 20057 kfs7@georgetown.edu Galit Shmueli Indian School

More information

Characterizing the Performance of the Conway-Maxwell Poisson Generalized Linear Model

Characterizing the Performance of the Conway-Maxwell Poisson Generalized Linear Model Characterizing the Performance of the Conway-Maxwell Poisson Generalized Linear Model Royce A. Francis 1,2, Srinivas Reddy Geedipally 3, Seth D. Guikema 2, Soma Sekhar Dhavala 5, Dominique Lord 4, Sarah

More information

Approximating the Conway-Maxwell-Poisson normalizing constant

Approximating the Conway-Maxwell-Poisson normalizing constant Filomat 30:4 016, 953 960 DOI 10.98/FIL1604953S Published by Faculty of Sciences and Mathematics, University of Niš, Serbia Available at: http://www.pmf.ni.ac.rs/filomat Approximating the Conway-Maxwell-Poisson

More information

Poisson regression 1/15

Poisson regression 1/15 Poisson regression 1/15 2/15 Counts data Examples of counts data: Number of hospitalizations over a period of time Number of passengers in a bus station Blood cells number in a blood sample Number of typos

More information

Outline of GLMs. Definitions

Outline of GLMs. Definitions Outline of GLMs Definitions This is a short outline of GLM details, adapted from the book Nonparametric Regression and Generalized Linear Models, by Green and Silverman. The responses Y i have density

More information

Multivariate Regression Models in R: The mcglm package

Multivariate Regression Models in R: The mcglm package Multivariate Regression Models in R: The mcglm package Prof. Wagner Hugo Bonat R Day Laboratório de Estatística e Geoinformação - LEG Universidade Federal do Paraná - UFPR 15 de maio de 2018 Introduction

More information

The LmB Conferences on Multivariate Count Analysis

The LmB Conferences on Multivariate Count Analysis The LmB Conferences on Multivariate Count Analysis Title: On Poisson-exponential-Tweedie regression models for ultra-overdispersed count data Rahma ABID, C.C. Kokonendji & A. Masmoudi Email Address: rahma.abid.ch@gmail.com

More information

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A. 1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n

More information

Flexiblity of Using Com-Poisson Regression Model for Count Data

Flexiblity of Using Com-Poisson Regression Model for Count Data STATISTICS, OPTIMIZATION AND INFORMATION COMPUTING Stat., Optim. Inf. Comput., Vol. 6, June 2018, pp 278 285. Published online in International Academic Press (www.iapress.org) Flexiblity of Using Com-Poisson

More information

Generalized Linear Mixed-Effects Models. Copyright c 2015 Dan Nettleton (Iowa State University) Statistics / 58

Generalized Linear Mixed-Effects Models. Copyright c 2015 Dan Nettleton (Iowa State University) Statistics / 58 Generalized Linear Mixed-Effects Models Copyright c 2015 Dan Nettleton (Iowa State University) Statistics 510 1 / 58 Reconsideration of the Plant Fungus Example Consider again the experiment designed to

More information

Parametric Modelling of Over-dispersed Count Data. Part III / MMath (Applied Statistics) 1

Parametric Modelling of Over-dispersed Count Data. Part III / MMath (Applied Statistics) 1 Parametric Modelling of Over-dispersed Count Data Part III / MMath (Applied Statistics) 1 Introduction Poisson regression is the de facto approach for handling count data What happens then when Poisson

More information

Semiparametric Generalized Linear Models

Semiparametric Generalized Linear Models Semiparametric Generalized Linear Models North American Stata Users Group Meeting Chicago, Illinois Paul Rathouz Department of Health Studies University of Chicago prathouz@uchicago.edu Liping Gao MS Student

More information

arxiv: v1 [stat.ap] 9 Nov 2010

arxiv: v1 [stat.ap] 9 Nov 2010 The Annals of Applied Statistics 2010, Vol. 4, No. 2, 943 961 DOI: 10.1214/09-AOAS306 c Institute of Mathematical Statistics, 2010 A FLEXIBLE REGRESSION MODEL FOR COUNT DATA arxiv:1011.2077v1 [stat.ap]

More information

Poisson regression: Further topics

Poisson regression: Further topics Poisson regression: Further topics April 21 Overdispersion One of the defining characteristics of Poisson regression is its lack of a scale parameter: E(Y ) = Var(Y ), and no parameter is available to

More information

Generalized Linear Models. Kurt Hornik

Generalized Linear Models. Kurt Hornik Generalized Linear Models Kurt Hornik Motivation Assuming normality, the linear model y = Xβ + e has y = β + ε, ε N(0, σ 2 ) such that y N(μ, σ 2 ), E(y ) = μ = β. Various generalizations, including general

More information

Generalized Linear Models and Extensions

Generalized Linear Models and Extensions Session 1 1 Generalized Linear Models and Extensions Clarice Garcia Borges Demétrio ESALQ/USP Piracicaba, SP, Brasil March 2013 email: Clarice.demetrio@usp.br Session 1 2 Course Outline Session 1 - Generalized

More information

Gauge Plots. Gauge Plots JAPANESE BEETLE DATA MAXIMUM LIKELIHOOD FOR SPATIALLY CORRELATED DISCRETE DATA JAPANESE BEETLE DATA

Gauge Plots. Gauge Plots JAPANESE BEETLE DATA MAXIMUM LIKELIHOOD FOR SPATIALLY CORRELATED DISCRETE DATA JAPANESE BEETLE DATA JAPANESE BEETLE DATA 6 MAXIMUM LIKELIHOOD FOR SPATIALLY CORRELATED DISCRETE DATA Gauge Plots TuscaroraLisa Central Madsen Fairways, 996 January 9, 7 Grubs Adult Activity Grub Counts 6 8 Organic Matter

More information

Standard Errors & Confidence Intervals. N(0, I( β) 1 ), I( β) = [ 2 l(β, φ; y) β i β β= β j

Standard Errors & Confidence Intervals. N(0, I( β) 1 ), I( β) = [ 2 l(β, φ; y) β i β β= β j Standard Errors & Confidence Intervals β β asy N(0, I( β) 1 ), where I( β) = [ 2 l(β, φ; y) ] β i β β= β j We can obtain asymptotic 100(1 α)% confidence intervals for β j using: β j ± Z 1 α/2 se( β j )

More information

Linear Regression Models P8111

Linear Regression Models P8111 Linear Regression Models P8111 Lecture 25 Jeff Goldsmith April 26, 2016 1 of 37 Today s Lecture Logistic regression / GLMs Model framework Interpretation Estimation 2 of 37 Linear regression Course started

More information

Generalized Linear Models

Generalized Linear Models Generalized Linear Models Advanced Methods for Data Analysis (36-402/36-608 Spring 2014 1 Generalized linear models 1.1 Introduction: two regressions So far we ve seen two canonical settings for regression.

More information

if n is large, Z i are weakly dependent 0-1-variables, p i = P(Z i = 1) small, and Then n approx i=1 i=1 n i=1

if n is large, Z i are weakly dependent 0-1-variables, p i = P(Z i = 1) small, and Then n approx i=1 i=1 n i=1 Count models A classical, theoretical argument for the Poisson distribution is the approximation Binom(n, p) Pois(λ) for large n and small p and λ = np. This can be extended considerably to n approx Z

More information

1/15. Over or under dispersion Problem

1/15. Over or under dispersion Problem 1/15 Over or under dispersion Problem 2/15 Example 1: dogs and owners data set In the dogs and owners example, we had some concerns about the dependence among the measurements from each individual. Let

More information

Poisson Regression. Ryan Godwin. ECON University of Manitoba

Poisson Regression. Ryan Godwin. ECON University of Manitoba Poisson Regression Ryan Godwin ECON 7010 - University of Manitoba Abstract. These lecture notes introduce Maximum Likelihood Estimation (MLE) of a Poisson regression model. 1 Motivating the Poisson Regression

More information

Generalized Linear Models

Generalized Linear Models Generalized Linear Models Lecture 3. Hypothesis testing. Goodness of Fit. Model diagnostics GLM (Spring, 2018) Lecture 3 1 / 34 Models Let M(X r ) be a model with design matrix X r (with r columns) r n

More information

Generalized Linear Models. Last time: Background & motivation for moving beyond linear

Generalized Linear Models. Last time: Background & motivation for moving beyond linear Generalized Linear Models Last time: Background & motivation for moving beyond linear regression - non-normal/non-linear cases, binary, categorical data Today s class: 1. Examples of count and ordered

More information

The Conway Maxwell Poisson Model for Analyzing Crash Data

The Conway Maxwell Poisson Model for Analyzing Crash Data The Conway Maxwell Poisson Model for Analyzing Crash Data (Discussion paper associated with The COM Poisson Model for Count Data: A Survey of Methods and Applications by Sellers, K., Borle, S., and Shmueli,

More information

Generalized Linear Models Introduction

Generalized Linear Models Introduction Generalized Linear Models Introduction Statistics 135 Autumn 2005 Copyright c 2005 by Mark E. Irwin Generalized Linear Models For many problems, standard linear regression approaches don t work. Sometimes,

More information

Bayesian density regression for count data

Bayesian density regression for count data Bayesian density regression for count data Charalampos Chanialidis 1, Ludger Evers 2, and Tereza Neocleous 3 arxiv:1406.1882v1 [stat.me] 7 Jun 2014 1 University of Glasgow, c.chanialidis1@research.gla.ac.uk

More information

Review. Timothy Hanson. Department of Statistics, University of South Carolina. Stat 770: Categorical Data Analysis

Review. Timothy Hanson. Department of Statistics, University of South Carolina. Stat 770: Categorical Data Analysis Review Timothy Hanson Department of Statistics, University of South Carolina Stat 770: Categorical Data Analysis 1 / 22 Chapter 1: background Nominal, ordinal, interval data. Distributions: Poisson, binomial,

More information

High-Throughput Sequencing Course

High-Throughput Sequencing Course High-Throughput Sequencing Course DESeq Model for RNA-Seq Biostatistics and Bioinformatics Summer 2017 Outline Review: Standard linear regression model (e.g., to model gene expression as function of an

More information

STA216: Generalized Linear Models. Lecture 1. Review and Introduction

STA216: Generalized Linear Models. Lecture 1. Review and Introduction STA216: Generalized Linear Models Lecture 1. Review and Introduction Let y 1,..., y n denote n independent observations on a response Treat y i as a realization of a random variable Y i In the general

More information

Ph.D. Qualifying Exam Friday Saturday, January 3 4, 2014

Ph.D. Qualifying Exam Friday Saturday, January 3 4, 2014 Ph.D. Qualifying Exam Friday Saturday, January 3 4, 2014 Put your solution to each problem on a separate sheet of paper. Problem 1. (5166) Assume that two random samples {x i } and {y i } are independently

More information

Fractional Imputation in Survey Sampling: A Comparative Review

Fractional Imputation in Survey Sampling: A Comparative Review Fractional Imputation in Survey Sampling: A Comparative Review Shu Yang Jae-Kwang Kim Iowa State University Joint Statistical Meetings, August 2015 Outline Introduction Fractional imputation Features Numerical

More information

Using Historical Experimental Information in the Bayesian Analysis of Reproduction Toxicological Experimental Results

Using Historical Experimental Information in the Bayesian Analysis of Reproduction Toxicological Experimental Results Using Historical Experimental Information in the Bayesian Analysis of Reproduction Toxicological Experimental Results Jing Zhang Miami University August 12, 2014 Jing Zhang (Miami University) Using Historical

More information

Quasi-likelihood Scan Statistics for Detection of

Quasi-likelihood Scan Statistics for Detection of for Quasi-likelihood for Division of Biostatistics and Bioinformatics, National Health Research Institutes & Department of Mathematics, National Chung Cheng University 17 December 2011 1 / 25 Outline for

More information

ST3241 Categorical Data Analysis I Generalized Linear Models. Introduction and Some Examples

ST3241 Categorical Data Analysis I Generalized Linear Models. Introduction and Some Examples ST3241 Categorical Data Analysis I Generalized Linear Models Introduction and Some Examples 1 Introduction We have discussed methods for analyzing associations in two-way and three-way tables. Now we will

More information

Poisson Regression. James H. Steiger. Department of Psychology and Human Development Vanderbilt University

Poisson Regression. James H. Steiger. Department of Psychology and Human Development Vanderbilt University Poisson Regression James H. Steiger Department of Psychology and Human Development Vanderbilt University James H. Steiger (Vanderbilt University) Poisson Regression 1 / 49 Poisson Regression 1 Introduction

More information

Modeling Overdispersion

Modeling Overdispersion James H. Steiger Department of Psychology and Human Development Vanderbilt University Regression Modeling, 2009 1 Introduction 2 Introduction In this lecture we discuss the problem of overdispersion in

More information

MSH3 Generalized linear model Ch. 6 Count data models

MSH3 Generalized linear model Ch. 6 Count data models Contents MSH3 Generalized linear model Ch. 6 Count data models 6 Count data model 208 6.1 Introduction: The Children Ever Born Data....... 208 6.2 The Poisson Distribution................. 210 6.3 Log-Linear

More information

Model Selection for Semiparametric Bayesian Models with Application to Overdispersion

Model Selection for Semiparametric Bayesian Models with Application to Overdispersion Proceedings 59th ISI World Statistics Congress, 25-30 August 2013, Hong Kong (Session CPS020) p.3863 Model Selection for Semiparametric Bayesian Models with Application to Overdispersion Jinfang Wang and

More information

A strategy for modelling count data which may have extra zeros

A strategy for modelling count data which may have extra zeros A strategy for modelling count data which may have extra zeros Alan Welsh Centre for Mathematics and its Applications Australian National University The Data Response is the number of Leadbeater s possum

More information

Statistical Methods III Statistics 212. Problem Set 2 - Answer Key

Statistical Methods III Statistics 212. Problem Set 2 - Answer Key Statistical Methods III Statistics 212 Problem Set 2 - Answer Key 1. (Analysis to be turned in and discussed on Tuesday, April 24th) The data for this problem are taken from long-term followup of 1423

More information

A brief introduction to mixed models

A brief introduction to mixed models A brief introduction to mixed models University of Gothenburg Gothenburg April 6, 2017 Outline An introduction to mixed models based on a few examples: Definition of standard mixed models. Parameter estimation.

More information

Lattice Data. Tonglin Zhang. Spatial Statistics for Point and Lattice Data (Part III)

Lattice Data. Tonglin Zhang. Spatial Statistics for Point and Lattice Data (Part III) Title: Spatial Statistics for Point Processes and Lattice Data (Part III) Lattice Data Tonglin Zhang Outline Description Research Problems Global Clustering and Local Clusters Permutation Test Spatial

More information

TRB Paper # Examining the Crash Variances Estimated by the Poisson-Gamma and Conway-Maxwell-Poisson Models

TRB Paper # Examining the Crash Variances Estimated by the Poisson-Gamma and Conway-Maxwell-Poisson Models TRB Paper #11-2877 Examining the Crash Variances Estimated by the Poisson-Gamma and Conway-Maxwell-Poisson Models Srinivas Reddy Geedipally 1 Engineering Research Associate Texas Transportation Instute

More information

Generalized Linear Models and Extensions

Generalized Linear Models and Extensions Session 1 1 Generalized Linear Models and Extensions Clarice Garcia Borges Demétrio ESALQ/USP Piracicaba, SP, Brasil email: Clarice.demetrio@usp.br March 2013 International Year of Statistics http://www.statistics2013.org/

More information

Chapter 4: Generalized Linear Models-II

Chapter 4: Generalized Linear Models-II : Generalized Linear Models-II Dipankar Bandyopadhyay Department of Biostatistics, Virginia Commonwealth University BIOS 625: Categorical Data & GLM [Acknowledgements to Tim Hanson and Haitao Chu] D. Bandyopadhyay

More information

Homework 5: Answer Key. Plausible Model: E(y) = µt. The expected number of arrests arrests equals a constant times the number who attend the game.

Homework 5: Answer Key. Plausible Model: E(y) = µt. The expected number of arrests arrests equals a constant times the number who attend the game. EdPsych/Psych/Soc 589 C.J. Anderson Homework 5: Answer Key 1. Probelm 3.18 (page 96 of Agresti). (a) Y assume Poisson random variable. Plausible Model: E(y) = µt. The expected number of arrests arrests

More information

Generalized linear models

Generalized linear models Generalized linear models Douglas Bates November 01, 2010 Contents 1 Definition 1 2 Links 2 3 Estimating parameters 5 4 Example 6 5 Model building 8 6 Conclusions 8 7 Summary 9 1 Generalized Linear Models

More information

Generalized Linear. Mixed Models. Methods and Applications. Modern Concepts, Walter W. Stroup. Texts in Statistical Science.

Generalized Linear. Mixed Models. Methods and Applications. Modern Concepts, Walter W. Stroup. Texts in Statistical Science. Texts in Statistical Science Generalized Linear Mixed Models Modern Concepts, Methods and Applications Walter W. Stroup CRC Press Taylor & Francis Croup Boca Raton London New York CRC Press is an imprint

More information

A Useful Distribution for Fitting Discrete Data: Revival of the COM-Poisson

A Useful Distribution for Fitting Discrete Data: Revival of the COM-Poisson A Useful Distribution for Fitting Discrete Data: Revival of the COM-Poisson Galit Shmueli Department of Decision and Information Technologies, Robert H. Smith School of Business, University of Maryland,

More information

A Generalized Linear Model for Binomial Response Data. Copyright c 2017 Dan Nettleton (Iowa State University) Statistics / 46

A Generalized Linear Model for Binomial Response Data. Copyright c 2017 Dan Nettleton (Iowa State University) Statistics / 46 A Generalized Linear Model for Binomial Response Data Copyright c 2017 Dan Nettleton (Iowa State University) Statistics 510 1 / 46 Now suppose that instead of a Bernoulli response, we have a binomial response

More information

Lecture 14: Introduction to Poisson Regression

Lecture 14: Introduction to Poisson Regression Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu 8 May 2007 1 / 52 Overview Modelling counts Contingency tables Poisson regression models 2 / 52 Modelling counts I Why

More information

Modelling counts. Lecture 14: Introduction to Poisson Regression. Overview

Modelling counts. Lecture 14: Introduction to Poisson Regression. Overview Modelling counts I Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu Why count data? Number of traffic accidents per day Mortality counts in a given neighborhood, per week

More information

Generalized Quasi-likelihood versus Hierarchical Likelihood Inferences in Generalized Linear Mixed Models for Count Data

Generalized Quasi-likelihood versus Hierarchical Likelihood Inferences in Generalized Linear Mixed Models for Count Data Sankhyā : The Indian Journal of Statistics 2009, Volume 71-B, Part 1, pp. 55-78 c 2009, Indian Statistical Institute Generalized Quasi-likelihood versus Hierarchical Likelihood Inferences in Generalized

More information

Mixed models in R using the lme4 package Part 5: Generalized linear mixed models

Mixed models in R using the lme4 package Part 5: Generalized linear mixed models Mixed models in R using the lme4 package Part 5: Generalized linear mixed models Douglas Bates Madison January 11, 2011 Contents 1 Definition 1 2 Links 2 3 Example 7 4 Model building 9 5 Conclusions 14

More information

Stat 5101 Lecture Notes

Stat 5101 Lecture Notes Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random

More information

Analysis of Extra Zero Counts using Zero-inflated Poisson Models

Analysis of Extra Zero Counts using Zero-inflated Poisson Models Analysis of Extra Zero Counts using Zero-inflated Poisson Models Saranya Numna A Thesis Submitted in Partial Fulfillment of the Requirements for the Degree of Master of Science in Mathematics and Statistics

More information

UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Basic Exam - Applied Statistics Thursday, August 30, 2018

UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Basic Exam - Applied Statistics Thursday, August 30, 2018 UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Basic Exam - Applied Statistics Thursday, August 30, 2018 Work all problems. 60 points are needed to pass at the Masters Level and 75

More information

Model comparison and selection

Model comparison and selection BS2 Statistical Inference, Lectures 9 and 10, Hilary Term 2008 March 2, 2008 Hypothesis testing Consider two alternative models M 1 = {f (x; θ), θ Θ 1 } and M 2 = {f (x; θ), θ Θ 2 } for a sample (X = x)

More information

Mixtures of Negative Binomial distributions for modelling overdispersion in RNA-Seq data

Mixtures of Negative Binomial distributions for modelling overdispersion in RNA-Seq data Mixtures of Negative Binomial distributions for modelling overdispersion in RNA-Seq data Cinzia Viroli 1 joint with E. Bonafede 1, S. Robin 2 & F. Picard 3 1 Department of Statistical Sciences, University

More information

Generalized Linear Models I

Generalized Linear Models I Statistics 203: Introduction to Regression and Analysis of Variance Generalized Linear Models I Jonathan Taylor - p. 1/16 Today s class Poisson regression. Residuals for diagnostics. Exponential families.

More information

BIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation

BIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation BIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation Yujin Chung November 29th, 2016 Fall 2016 Yujin Chung Lec13: MLE Fall 2016 1/24 Previous Parametric tests Mean comparisons (normality assumption)

More information

Likelihood Inference in the Presence of Nuisance Parameters

Likelihood Inference in the Presence of Nuisance Parameters PHYSTAT2003, SLAC, September 8-11, 2003 1 Likelihood Inference in the Presence of Nuance Parameters N. Reid, D.A.S. Fraser Department of Stattics, University of Toronto, Toronto Canada M5S 3G3 We describe

More information

Now consider the case where E(Y) = µ = Xβ and V (Y) = σ 2 G, where G is diagonal, but unknown.

Now consider the case where E(Y) = µ = Xβ and V (Y) = σ 2 G, where G is diagonal, but unknown. Weighting We have seen that if E(Y) = Xβ and V (Y) = σ 2 G, where G is known, the model can be rewritten as a linear model. This is known as generalized least squares or, if G is diagonal, with trace(g)

More information

Likelihood Inference in the Presence of Nuisance Parameters

Likelihood Inference in the Presence of Nuisance Parameters Likelihood Inference in the Presence of Nuance Parameters N Reid, DAS Fraser Department of Stattics, University of Toronto, Toronto Canada M5S 3G3 We describe some recent approaches to likelihood based

More information

Mixed models in R using the lme4 package Part 5: Generalized linear mixed models

Mixed models in R using the lme4 package Part 5: Generalized linear mixed models Mixed models in R using the lme4 package Part 5: Generalized linear mixed models Douglas Bates 2011-03-16 Contents 1 Generalized Linear Mixed Models Generalized Linear Mixed Models When using linear mixed

More information

Approximate Median Regression via the Box-Cox Transformation

Approximate Median Regression via the Box-Cox Transformation Approximate Median Regression via the Box-Cox Transformation Garrett M. Fitzmaurice,StuartR.Lipsitz, and Michael Parzen Median regression is used increasingly in many different areas of applications. The

More information

DELTA METHOD and RESERVING

DELTA METHOD and RESERVING XXXVI th ASTIN COLLOQUIUM Zurich, 4 6 September 2005 DELTA METHOD and RESERVING C.PARTRAT, Lyon 1 university (ISFA) N.PEY, AXA Canada J.SCHILLING, GIE AXA Introduction Presentation of methods based on

More information

Statistics 203: Introduction to Regression and Analysis of Variance Course review

Statistics 203: Introduction to Regression and Analysis of Variance Course review Statistics 203: Introduction to Regression and Analysis of Variance Course review Jonathan Taylor - p. 1/?? Today Review / overview of what we learned. - p. 2/?? General themes in regression models Specifying

More information

Using Estimating Equations for Spatially Correlated A

Using Estimating Equations for Spatially Correlated A Using Estimating Equations for Spatially Correlated Areal Data December 8, 2009 Introduction GEEs Spatial Estimating Equations Implementation Simulation Conclusion Typical Problem Assess the relationship

More information

The combined model: A tool for simulating correlated counts with overdispersion. George KALEMA Samuel IDDI Geert MOLENBERGHS

The combined model: A tool for simulating correlated counts with overdispersion. George KALEMA Samuel IDDI Geert MOLENBERGHS The combined model: A tool for simulating correlated counts with overdispersion George KALEMA Samuel IDDI Geert MOLENBERGHS Interuniversity Institute for Biostatistics and statistical Bioinformatics George

More information

Generalized Estimating Equations

Generalized Estimating Equations Outline Review of Generalized Linear Models (GLM) Generalized Linear Model Exponential Family Components of GLM MLE for GLM, Iterative Weighted Least Squares Measuring Goodness of Fit - Deviance and Pearson

More information

Empirical Bayes Analysis for a Hierarchical Negative Binomial Generalized Linear Model

Empirical Bayes Analysis for a Hierarchical Negative Binomial Generalized Linear Model International Journal of Contemporary Mathematical Sciences Vol. 9, 2014, no. 12, 591-612 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ijcms.2014.413 Empirical Bayes Analysis for a Hierarchical

More information

MISCELLANEOUS TOPICS RELATED TO LIKELIHOOD. Copyright c 2012 (Iowa State University) Statistics / 30

MISCELLANEOUS TOPICS RELATED TO LIKELIHOOD. Copyright c 2012 (Iowa State University) Statistics / 30 MISCELLANEOUS TOPICS RELATED TO LIKELIHOOD Copyright c 2012 (Iowa State University) Statistics 511 1 / 30 INFORMATION CRITERIA Akaike s Information criterion is given by AIC = 2l(ˆθ) + 2k, where l(ˆθ)

More information

Outline. Statistical inference for linear mixed models. One-way ANOVA in matrix-vector form

Outline. Statistical inference for linear mixed models. One-way ANOVA in matrix-vector form Outline Statistical inference for linear mixed models Rasmus Waagepetersen Department of Mathematics Aalborg University Denmark general form of linear mixed models examples of analyses using linear mixed

More information

MS&E 226: Small Data. Lecture 11: Maximum likelihood (v2) Ramesh Johari

MS&E 226: Small Data. Lecture 11: Maximum likelihood (v2) Ramesh Johari MS&E 226: Small Data Lecture 11: Maximum likelihood (v2) Ramesh Johari ramesh.johari@stanford.edu 1 / 18 The likelihood function 2 / 18 Estimating the parameter This lecture develops the methodology behind

More information

Modeling the scale parameter ϕ A note on modeling correlation of binary responses Using marginal odds ratios to model association for binary responses

Modeling the scale parameter ϕ A note on modeling correlation of binary responses Using marginal odds ratios to model association for binary responses Outline Marginal model Examples of marginal model GEE1 Augmented GEE GEE1.5 GEE2 Modeling the scale parameter ϕ A note on modeling correlation of binary responses Using marginal odds ratios to model association

More information

Statistics - Lecture One. Outline. Charlotte Wickham 1. Basic ideas about estimation

Statistics - Lecture One. Outline. Charlotte Wickham  1. Basic ideas about estimation Statistics - Lecture One Charlotte Wickham wickham@stat.berkeley.edu http://www.stat.berkeley.edu/~wickham/ Outline 1. Basic ideas about estimation 2. Method of Moments 3. Maximum Likelihood 4. Confidence

More information

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Jae-Kwang Kim Department of Statistics, Iowa State University Outline 1 Introduction 2 Observed likelihood 3 Mean Score

More information

Chapter 22: Log-linear regression for Poisson counts

Chapter 22: Log-linear regression for Poisson counts Chapter 22: Log-linear regression for Poisson counts Exposure to ionizing radiation is recognized as a cancer risk. In the United States, EPA sets guidelines specifying upper limits on the amount of exposure

More information

Analytics Software. Beyond deterministic chain ladder reserves. Neil Covington Director of Solutions Management GI

Analytics Software. Beyond deterministic chain ladder reserves. Neil Covington Director of Solutions Management GI Analytics Software Beyond deterministic chain ladder reserves Neil Covington Director of Solutions Management GI Objectives 2 Contents 01 Background 02 Formulaic Stochastic Reserving Methods 03 Bootstrapping

More information

STA 216: GENERALIZED LINEAR MODELS. Lecture 1. Review and Introduction. Much of statistics is based on the assumption that random

STA 216: GENERALIZED LINEAR MODELS. Lecture 1. Review and Introduction. Much of statistics is based on the assumption that random STA 216: GENERALIZED LINEAR MODELS Lecture 1. Review and Introduction Much of statistics is based on the assumption that random variables are continuous & normally distributed. Normal linear regression

More information

Ph.D. Qualifying Exam Friday Saturday, January 6 7, 2017

Ph.D. Qualifying Exam Friday Saturday, January 6 7, 2017 Ph.D. Qualifying Exam Friday Saturday, January 6 7, 2017 Put your solution to each problem on a separate sheet of paper. Problem 1. (5106) Let X 1, X 2,, X n be a sequence of i.i.d. observations from a

More information

What is a meta-analysis? How is a meta-analysis conducted? Model Selection Approaches to Inference. Meta-analysis. Combining Data

What is a meta-analysis? How is a meta-analysis conducted? Model Selection Approaches to Inference. Meta-analysis. Combining Data Combining Data IB/NRES 509 Statistical Modeling What is a? A quantitative synthesis of previous research Studies as individual observations, weighted by n, σ 2, quality, etc. Can combine heterogeneous

More information

TECHNICAL REPORT # 59 MAY Interim sample size recalculation for linear and logistic regression models: a comprehensive Monte-Carlo study

TECHNICAL REPORT # 59 MAY Interim sample size recalculation for linear and logistic regression models: a comprehensive Monte-Carlo study TECHNICAL REPORT # 59 MAY 2013 Interim sample size recalculation for linear and logistic regression models: a comprehensive Monte-Carlo study Sergey Tarima, Peng He, Tao Wang, Aniko Szabo Division of Biostatistics,

More information

Categorical Predictor Variables

Categorical Predictor Variables Categorical Predictor Variables We often wish to use categorical (or qualitative) variables as covariates in a regression model. For binary variables (taking on only 2 values, e.g. sex), it is relatively

More information

Fractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling

Fractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling Fractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling Jae-Kwang Kim 1 Iowa State University June 26, 2013 1 Joint work with Shu Yang Introduction 1 Introduction

More information

Generalized linear mixed models (GLMMs) for dependent compound risk models

Generalized linear mixed models (GLMMs) for dependent compound risk models Generalized linear mixed models (GLMMs) for dependent compound risk models Emiliano A. Valdez joint work with H. Jeong, J. Ahn and S. Park University of Connecticut 52nd Actuarial Research Conference Georgia

More information

Tento projekt je spolufinancován Evropským sociálním fondem a Státním rozpočtem ČR InoBio CZ.1.07/2.2.00/

Tento projekt je spolufinancován Evropským sociálním fondem a Státním rozpočtem ČR InoBio CZ.1.07/2.2.00/ Tento projekt je spolufinancován Evropským sociálním fondem a Státním rozpočtem ČR InoBio CZ.1.07/2.2.00/28.0018 Statistical Analysis in Ecology using R Linear Models/GLM Ing. Daniel Volařík, Ph.D. 13.

More information

Lecture 5: LDA and Logistic Regression

Lecture 5: LDA and Logistic Regression Lecture 5: and Logistic Regression Hao Helen Zhang Hao Helen Zhang Lecture 5: and Logistic Regression 1 / 39 Outline Linear Classification Methods Two Popular Linear Models for Classification Linear Discriminant

More information

REGRESSION WITH SPATIALLY MISALIGNED DATA. Lisa Madsen Oregon State University David Ruppert Cornell University

REGRESSION WITH SPATIALLY MISALIGNED DATA. Lisa Madsen Oregon State University David Ruppert Cornell University REGRESSION ITH SPATIALL MISALIGNED DATA Lisa Madsen Oregon State University David Ruppert Cornell University SPATIALL MISALIGNED DATA 10 X X X X X X X X 5 X X X X X 0 X 0 5 10 OUTLINE 1. Introduction 2.

More information

Lecture 8. Poisson models for counts

Lecture 8. Poisson models for counts Lecture 8. Poisson models for counts Jesper Rydén Department of Mathematics, Uppsala University jesper.ryden@math.uu.se Statistical Risk Analysis Spring 2014 Absolute risks The failure intensity λ(t) describes

More information

Generalized Linear Models (GLZ)

Generalized Linear Models (GLZ) Generalized Linear Models (GLZ) Generalized Linear Models (GLZ) are an extension of the linear modeling process that allows models to be fit to data that follow probability distributions other than the

More information

Extending the Linear Model

Extending the Linear Model Extending the Linear Model G.Janacek School of Computing Sciences, UEA Part I generalized linear models 1 Contents I generalized linear models 1 0.1 Introduction................................... 4 0.1.1

More information

On the Importance of Dispersion Modeling for Claims Reserving: Application of the Double GLM Theory

On the Importance of Dispersion Modeling for Claims Reserving: Application of the Double GLM Theory On the Importance of Dispersion Modeling for Claims Reserving: Application of the Double GLM Theory Danaïl Davidov under the supervision of Jean-Philippe Boucher Département de mathématiques Université

More information

Count and Duration Models

Count and Duration Models Count and Duration Models Stephen Pettigrew April 2, 2015 Stephen Pettigrew Count and Duration Models April 2, 2015 1 / 1 Outline Stephen Pettigrew Count and Duration Models April 2, 2015 2 / 1 Logistics

More information