Entropy Coefficient of Determination and Its Application


Nobuoki Eshima
Department of Biostatistics, Faculty of Medicine, Oita University, Oita, Japan.

The objective of this seminar is to introduce the entropy coefficient of determination (ECD) for measuring the explanatory or predictive power of GLMs and to consider how ECD is used in data analysis. In the first section, the classical regression and GLM frameworks are compared, and properties of GLMs concerning entropy are discussed. First, the information of an event and the entropy of a random variable are explained, and the Kullback-Leibler information that describes the difference between two distributions is treated. Second, the log odds ratio and the mean in the GLM are considered from the viewpoint of entropy. ECD is interpreted as the proportion of variation of a response variable explained by the explanatory variables, and is compared with some other explanatory power measures with respect to the following properties: (i) interpretability; (ii) being the multiple correlation coefficient or the coefficient of determination in normal linear regression models; (iii) entropy-based property; (iv) applicability

to all GLMs; in addition to these, it may be appropriate for a measure to have the following property: (v) monotonicity in the complexity of the linear predictor. In Section 2, the asymptotic properties of the maximum likelihood estimator (MLE) of ECD are discussed first. The confidence interval of ECD is considered on the basis of the approximate normality of the non-central chi-square distribution. Second, in canonical link GLMs the contributions of factors are treated according to the decomposition of ECD. Numerical examples are also given.

1 Coefficient of determination for generalized linear models

1.1 Information Theory

Let $X$ be a categorical variable with categories $\{C_1, C_2, \ldots, C_K\}$; in what follows the categories are formally written as $\{1, 2, \ldots, K\}$. Then, the information of $X = k$ is defined by

$$I(X = k) = \log \frac{1}{\Pr(X = k)}. \tag{1.1}$$

In the above information, the base of the logarithm is $e$ and the unit is the nat. If the base is 2, the unit is the bit. In this seminar, the base of the logarithm is $e$. The mean of the above information, which is called entropy, is defined by

$$H(X) = \sum_{k=1}^{K} \Pr(X = k)\, I(X = k) = \sum_{k=1}^{K} \Pr(X = k) \log \frac{1}{\Pr(X = k)}. \tag{1.2}$$

Entropy is a measure of the uncertainty in random variable $X$ or in the sample space. Let $p_k = \Pr(X = k)$ $(k = 1, 2, \ldots, K)$. Then, we have the following theorem.

Theorem 1.1. Let $p = \{p_1, p_2, \ldots, p_K\}$ and $q = \{q_1, q_2, \ldots, q_K\}$ be two distributions. Then,

$$\sum_{k=1}^{K} p_k \log p_k \ge \sum_{k=1}^{K} p_k \log q_k. \tag{1.3}$$

Proof: Since $\log x \le x - 1$, setting $x = q_k / p_k$ gives

$$\sum_{k=1}^{K} p_k \log p_k - \sum_{k=1}^{K} p_k \log q_k = -\sum_{k=1}^{K} p_k \log \frac{q_k}{p_k} \ge -\sum_{k=1}^{K} p_k \left( \frac{q_k}{p_k} - 1 \right) = 0.$$

Equality holds if and only if $p_k = q_k$ $(k = 1, 2, \ldots, K)$.

From (1.3) it follows that

$$H(p) = -\sum_{k=1}^{K} p_k \log p_k \le -\sum_{k=1}^{K} p_k \log q_k,$$

where $H(p)$ denotes the entropy of distribution $p$. Setting $q_k = \frac{1}{K}$ $(k = 1, 2, \ldots, K)$, we have

$$H(p) \le -\sum_{k=1}^{K} p_k \log \frac{1}{K} = \log K.$$

The following quantity is referred to as the Kullback-Leibler (KL) information or divergence.

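$$D(p\|q) = \sum_{k=1}^{K} p_k \log \frac{p_k}{q_k} \quad (\ge 0). \tag{1.4}$$

This information is interpreted as the difference, or loss of information, incurred by using distribution $q$ instead of the true distribution $p$.

Example 1.1. Let $X$ follow the binomial distribution $B_N(3, \frac{1}{2})$, with its four outcomes identified with the categories 1 to 4, and let $Y$ be uniformly distributed, $\Pr(Y = k) = \frac{1}{4}$ $(k = 1, 2, 3, 4)$. Then, we have

$$D(X\|Y) = \frac{1}{8} \log \frac{1/8}{1/4} + \frac{1}{8} \log \frac{1/8}{1/4} + \frac{3}{8} \log \frac{3/8}{1/4} + \frac{3}{8} \log \frac{3/8}{1/4} = \frac{3}{4} \ln 3 - \ln 2 = 0.1308.$$

The above quantity is the difference between $B_N(3, \frac{1}{2})$ and the uniform distribution on $\{1, 2, 3, 4\}$. The reciprocal KL information is

$$D(Y\|X) = \frac{2}{4} \log \frac{1/4}{1/8} + \frac{2}{4} \log \frac{1/4}{3/8} = 0.1438.$$

In this case, $D(X\|Y) \ne D(Y\|X)$.

For continuous distributions, the KL information can be defined similarly as

$$D(f(x)\|g(x)) = \int f(x) \log \frac{f(x)}{g(x)}\, dx \quad (\ge 0), \tag{1.5}$$

where $f(x)$ and $g(x)$ are density functions.

Example 1.2. Let $f(x)$ and $g(x)$ be the densities of $N(\mu_1, \sigma^2)$ and $N(\mu_2, \sigma^2)$, respectively. Then,

$$D(f(x)\|g(x)) = \frac{(\mu_1 - \mu_2)^2}{2\sigma^2} \quad (= D(g(x)\|f(x))).$$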
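The definitions (1.2) and (1.4) can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names `entropy` and `kl_divergence` are illustrative choices, not notation from the seminar. It evaluates both directions of the KL information for the binomial and uniform distributions of Example 1.1 and checks the bound $H(p) \le \log K$.

```python
import math

def entropy(p):
    """Shannon entropy in nats, H(p) = -sum p_k log p_k (zero-probability terms contribute 0)."""
    return -sum(pk * math.log(pk) for pk in p if pk > 0)

def kl_divergence(p, q):
    """Kullback-Leibler information D(p||q) = sum p_k log(p_k / q_k), in nats."""
    return sum(pk * math.log(pk / qk) for pk, qk in zip(p, q) if pk > 0)

# Example 1.1: binomial(3, 1/2) probabilities against the uniform distribution on four categories.
binom = [math.comb(3, k) / 8 for k in range(4)]   # 1/8, 3/8, 3/8, 1/8
unif = [0.25] * 4

d_xy = kl_divergence(binom, unif)   # equals (3/4) ln 3 - ln 2
d_yx = kl_divergence(unif, binom)   # equals ln 2 - (1/2) ln 3

print(round(d_xy, 4), round(d_yx, 4))   # 0.1308 0.1438
print(entropy(binom) <= math.log(4))    # True: H(p) <= log K
```

The two printed values differ, which illustrates that the KL information is not symmetric in its arguments.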

1.2 Entropy in GLMs

Let $X$ and $Y$ be a $p \times 1$ explanatory variable vector and a response variable, respectively, and let $f(y|x)$ be the conditional probability or density function of $Y$ given $X = x$. The function $f(y|x)$ is assumed to be a member of the following exponential family of distributions:

$$f(y|x) = \exp\left\{ \frac{y\theta - b(\theta)}{a(\varphi)} + c(y, \varphi) \right\}, \tag{1.6}$$

where $\theta$ and $\varphi$ are parameters, and $a(\varphi)$, $b(\theta)$ $(> 0)$ and $c(y, \varphi)$ are specific functions. This is the random component. Let $\beta = (\beta_1, \beta_2, \ldots, \beta_p)^T$. For a link function $h(u)$ (link component) and the linear predictor $\eta = \beta^T x$ (systematic component), the conditional expectation of $Y$ given $X = x$ is described as follows:

$$E(Y|X = x) = \frac{db(\theta)}{d\theta} = h^{-1}(\beta^T x). \tag{1.7}$$

Let us assume that the link function $h(u)$ is a strictly increasing differentiable function. The conditional variance of response $Y$ given $X = x$ is as follows:

$$\mathrm{Var}(Y|X = x) = a(\varphi) \frac{d^2 b(\theta)}{d\theta^2}.$$

From this, $a(\varphi)$ relates to the dispersion of $Y$, so it is referred to as a dispersion parameter. Since $\theta$ is a function of $\eta = \beta^T x$, for simplification the function is denoted by $\theta = \theta(\beta^T x)$. Let us consider the following log odds ratio:

$$\log \mathrm{OR}(x, x_0; y, y_0) = \log \frac{f(y|x)/f(y_0|x)}{f(y|x_0)/f(y_0|x_0)} = \log \frac{f(y|x) f(y_0|x_0)}{f(y_0|x) f(y|x_0)} = \frac{1}{a(\varphi)} (y - y_0) \left\{ \theta(\beta^T x) - \theta(\beta^T x_0) \right\}, \tag{1.8}$$

where $x_0$ and $y_0$ are baselines of $X$ and $Y$, respectively. The above log odds ratio is viewed as an inner product of $\theta(\beta^T x)$ and $y$ with respect to the dispersion parameter $a(\varphi)$. Since

$$\log \mathrm{OR}(x, x_0; y, y_0) = \left\{ (-\log f(y_0|x)) - (-\log f(y|x)) \right\} - \left\{ (-\log f(y_0|x_0)) - (-\log f(y|x_0)) \right\},$$

the log odds ratio (1.8) is the change of the uncertainty of response $Y$ in explanatory variable vector $X$, and as seen in the above log odds ratio, predictor $\beta^T x$ is related to the reduction of uncertainty of response $Y$ through link functions.

For levels of the factor vector $X = x_1, x_2, \ldots, x_K$, the averages of $\theta$, $Y$ and $\theta Y$ are defined as follows:

$$E(\theta) = \frac{1}{K} \sum_{k=1}^{K} \theta(\beta^T x_k), \quad E(Y) = \frac{1}{K} \sum_{k=1}^{K} E(Y|X = x_k), \quad E(\theta Y) = \frac{1}{K} \sum_{k=1}^{K} E(Y|X = x_k)\, \theta(\beta^T x_k).$$

Remark 1.1. Let $n_k$ be the sample sizes at factor levels $x_k$ $(k = 1, 2, \ldots, K)$, and $n = \sum_{k=1}^{K} n_k$. Then, the above averages (expectations) are replaced by the weighted ones, e.g.

$$E(\theta) = \sum_{k=1}^{K} \frac{n_k}{n}\, \theta(\beta^T x_k).$$
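As an illustration of the log odds ratio (1.8), the sketch below evaluates both sides of the identity for a Poisson response with the canonical log link, for which $\theta = \log \mu$ and $a(\varphi) = 1$. The Poisson model, the coefficients `alpha` and `beta`, and the chosen points are hypothetical values picked for the example; they do not come from the seminar.

```python
import math

def poisson_logpmf(y, mu):
    """log f(y|x) for a Poisson response with mean mu."""
    return y * math.log(mu) - mu - math.lgamma(y + 1)

def log_odds_ratio(y, y0, mu, mu0):
    """Log odds ratio computed directly from the four probabilities in (1.8)."""
    return (poisson_logpmf(y, mu) - poisson_logpmf(y0, mu)) - (
        poisson_logpmf(y, mu0) - poisson_logpmf(y0, mu0))

# Hypothetical log-linear model: theta = log(mu) = alpha + beta * x, a(phi) = 1.
alpha, beta = 0.2, 0.7
x, x0 = 1.5, 0.0      # level of interest and baseline of X
y, y0 = 4, 1          # value of interest and baseline of Y

theta = alpha + beta * x
theta0 = alpha + beta * x0
lhs = log_odds_ratio(y, y0, math.exp(theta), math.exp(theta0))
rhs = (y - y0) * (theta - theta0)   # inner-product form of (1.8) with a(phi) = 1

print(abs(lhs - rhs) < 1e-9)        # True
```

The factorial terms and the means cancel in the double difference, which is why only the product of the two increments remains.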

When we take the expectation of the inner product (1.8), we have

$$\frac{\mathrm{Cov}(\theta, Y)}{a(\varphi)} + \frac{\left( E(\theta) - \theta(\beta^T x_0) \right) \left( E(Y) - y_0 \right)}{a(\varphi)},$$

where $\mathrm{Cov}(\theta, Y) = E(\theta Y) - E(\theta) E(Y)$. For $y_0 = E(Y)$, the quantity becomes

$$\frac{\mathrm{Cov}(\theta, Y)}{a(\varphi)} \tag{1.9}$$

and it can be viewed as the average change of uncertainty of response variable $Y$ in explanatory variable vector $X$. We have Theorem 1.1 and Corollary 1.1.

Theorem 1.1. In the GLM with (1.6), the quantity (1.9) is expressed by the Kullback-Leibler information:

$$\frac{\mathrm{Cov}(\theta, Y)}{a(\varphi)} = \frac{1}{K} \sum_{k=1}^{K} \mathrm{KL}(f(y), f(y|x_k)), \tag{1.10}$$

where $f(y) = \frac{1}{K} \sum_{k=1}^{K} f(y|x_k)$ and

$$\mathrm{KL}(f(y), f(y|x_k)) = \int f(y|x_k) \log \frac{f(y|x_k)}{f(y)}\, dy + \int f(y) \log \frac{f(y)}{f(y|x_k)}\, dy = D(f(y|x_k)\|f(y)) + D(f(y)\|f(y|x_k)) \quad (k = 1, 2, \ldots, K).$$

Corollary 1.1. In the GLM with (1.6), the covariance of $Y$ and $\theta(\beta^T X)$ is nonnegative, and it is zero if and only if $X$ and $Y$ are independent, i.e. $f(y|x_k) = f(y)$ $(k = 1, 2, \ldots, K)$.

Example 1.3. An ordinary linear regression model is

$$Y = \alpha + \beta^T x + e,$$

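where $e$ is a normal error with mean 0 and variance $\sigma^2$. Let $f(y|x)$ be a normal density function with mean $\mu$ and variance $\sigma^2$. Then, the random component is

$$f(y|x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{ -\frac{(y - \mu)^2}{2\sigma^2} \right\} = \exp\left\{ \frac{y\mu - \frac{1}{2}\mu^2}{\sigma^2} - \frac{y^2}{2\sigma^2} - \log \sqrt{2\pi\sigma^2} \right\}.$$

In this expression, setting $\theta = \mu$, $a(\varphi) = \sigma^2$, $b(\theta) = \frac{1}{2}\theta^2$ and $c(y, \varphi) = -\frac{y^2}{2\sigma^2} - \log \sqrt{2\pi\sigma^2}$, and for linear predictor $\eta = \alpha + \beta^T x$ and the identity link $\mu = \eta$, the normal linear regression model can be viewed as a GLM.

Example 1.4. Let $Y$ be a binary variable with $p = \Pr(Y = 1)$ $(= \mu)$. The random component is

$$f(y|x) = p^y (1 - p)^{1 - y} = \left\{ p (1 - p)^{-1} \right\}^y (1 - p) = \exp\left\{ y \log \frac{p}{1 - p} + \log(1 - p) \right\}.$$

Then, $\theta = \log \frac{p}{1 - p}$, $a(\varphi) = 1$, $b(\theta) = -\log(1 - p)$ and $c(y, \varphi) = 0$. Setting the link function $h(p) = \log \frac{p}{1 - p}$, we have the following logistic regression (logit) model: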

$$f(y|x) = \exp\left\{ (\alpha + \beta x) y + \log(1 - p) \right\} = \frac{\exp\{(\alpha + \beta x) y\}}{1 + \exp(\alpha + \beta x)}.$$

For the logit model with explanatory variables $X_1, X_2, \ldots, X_p$, the model is expressed as

$$f(y|x) = \frac{\exp\left\{ \left( \alpha + \sum_{i=1}^{p} \beta_i x_i \right) y \right\}}{1 + \exp\left( \alpha + \sum_{i=1}^{p} \beta_i x_i \right)}. \tag{1.11}$$

1.3 Basic Predictive Power Measures for GLMs

In the sense of the previous discussion, it may be appropriate to assess the predictive or explanatory power of factors based on entropy. In the GLM framework, predictive power measures are compared, and the advantage of ECD is mentioned. First, some predictive power measures for regression models are briefly discussed. In general regression models, for a variation function $D$, the predictive power can be measured as follows:

$$R_D^2 = \frac{D(Y) - D(Y|X)}{D(Y)}, \tag{1.12}$$

where $D(Y)$ and $D(Y|X)$ denote a variation function of $Y$ and a conditional or error variation function given $X$, respectively (Efron, 1978; Agresti, 1986; Korn & Simon, 1991). Predictive power measures based on the likelihood function (Theil, 1970; Goodman, 1971) are constructed from powers of the likelihood function, i.e.

$$R_L^2 = 1 - \left( \frac{l(0)}{l(\beta)} \right)^{2/n},$$

where $l(\beta)$ is the likelihood function and $n$ is the sample size. Let $R$ be the multiple correlation coefficient in the ordinary linear regression model. The above measure becomes $R^2$ in ordinary linear regression cases and increases with model complexity; however, it is difficult to interpret the measure in general (Zheng & Agresti, 2000). The entropy measure (Haberman, 1982) for categorical responses is based on the entropy of $Y$, $H(Y)$, and the conditional entropy $H(Y|X)$, i.e.

$$R_E^2 = \frac{H(Y) - H(Y|X)}{H(Y)}.$$

The above measure is included in (1.12). The correlation coefficient of response $Y$ and its conditional expectation given factor $X$, $\mathrm{Corr}(E(Y|X), Y)$, is recommended for measuring the predictive power of GLMs, because the correlation measure can be applied to all types of GLMs except polytomous response cases (Zheng & Agresti, 2000). This measure is the correlation coefficient between response $Y$ and the regression on $X$, referred to as the regression correlation coefficient. With respect to entropy, $R_L^2$ and $R_E^2$ may be suitable for GLMs. By considering the average change of the log odds ratio, Eshima & Tabata (2007) proposed the following basic predictive power measure:

$$m_{PP}(Y|X) = \frac{\mathrm{Cov}(\theta, Y)}{a(\varphi)}. \tag{1.13}$$

The above measure is expressed by the Kullback-Leibler information (Eshima & Tabata, 2007), and it is increasing in $\mathrm{Cov}(\theta, Y)$ and decreasing in $a(\varphi)$. Since

$$\mathrm{Var}(Y|X = x) = a(\varphi) \frac{d^2 b(\theta)}{d\theta^2},$$

function $a(\varphi)$ may be interpreted as the error variation of $Y$ in entropy, i.e. the residual randomness of $Y$ given $X$. From this, $\mathrm{Cov}(\theta, Y)$ can be interpreted as the entropy of $Y$ explained by $X$. Hence, measure (1.13) is the ratio of the explained variation of $Y$ to the error variation of $Y$ in entropy. The entropy variation function $D_E$ is defined by

$$D_E(Y) = \mathrm{Cov}(\theta, Y) + a(\varphi).$$

Since $\theta$ is a function of $X$, $\mathrm{Cov}(\theta, Y|X) = 0$. From this, the conditional entropy variation of $Y$ given $X$ is

$$D_E(Y|X) = a(\varphi).$$

Considering this, ECD is defined as follows:

$$\mathrm{ECD}(X, Y) = \frac{\mathrm{Cov}(\theta, Y)}{\mathrm{Cov}(\theta, Y) + a(\varphi)} = \frac{m_{PP}(Y|X)}{m_{PP}(Y|X) + 1} = \frac{D_E(Y) - D_E(Y|X)}{D_E(Y)}. \tag{1.14}$$

From (1.14), ECD is included in (1.12), and ECD can be viewed as the proportion of the variation of $Y$ in entropy that is explained. For the normal linear regression model, it follows that $\mathrm{ECD}(X, Y) = R^2$.

Let $\beta = (\beta_1, \beta_2, \ldots, \beta_p)^T$ be a regression coefficient vector. For canonical links $\theta = \sum_{i=1}^{p} \beta_i X_i$, ECD (1.14) and the entropy correlation coefficient (ECC) are decomposed as follows:

$$\mathrm{ECD}(X, Y) = \frac{\sum_{i=1}^{p} \beta_i \mathrm{Cov}(X_i, Y)}{\sum_{i=1}^{p} \beta_i \mathrm{Cov}(X_i, Y) + a(\varphi)}, \qquad \mathrm{ECorr}(X, Y) = \frac{\sum_{i=1}^{p} \beta_i \mathrm{Cov}(X_i, Y)}{\sqrt{\mathrm{Var}(\theta)} \sqrt{\mathrm{Var}(Y)}}. \tag{1.15}$$

None of the predictive power measures except ECD and ECC admits the above type of decomposition for GLMs with canonical links. In addition to the desirable properties of measures for GLMs, (i) to (v), decomposability such as (1.15) may also be a suitable property for a predictive power measure, because the relative importance of $X_i$ may be assessed by $\beta_i \mathrm{Cov}(X_i, Y)$. Moreover, ECD is scale-invariant in GLMs with multivariate responses; however, ECC is not. In this respect, ECD is superior to ECC.

Table 1.1. Properties of five predictive power measures in GLMs
Measures (columns): $\mathrm{Corr}(E(Y|X), Y)$, $R_L^2$, $R_E^2$, ECC, ECD
Properties (rows): (i) interpretability; (ii) $R$ or $R^2$; (iii) entropy; (iv) all GLMs; (v) monotonicity; (vi) decomposition

Table 1.1 summarizes the properties of the five measures mentioned above. Measures $\mathrm{Corr}(E(Y|X), Y)$ and $\mathrm{ECorr}(X, Y)$ may have property (v) in most cases; however, it is not easy to prove the property in general. From this table, $\mathrm{ECD}(X, Y)$ is the most desirable predictive power measure for GLMs.
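The statement that ECD coincides with $R^2$ in the normal linear regression model, together with the decomposition (1.15), can be illustrated numerically. The sketch below assumes NumPy and uses simulated data with hypothetical coefficients; covariances are computed with divisor $n$ and $a(\varphi) = \sigma^2$ is replaced by its ML estimate, which is what makes the agreement exact.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 500, 3
X = rng.normal(size=(n, p))
y = 1.0 + X @ np.array([0.8, -0.5, 0.3]) + rng.normal(scale=1.5, size=n)

# Least-squares (= ML) fit with an intercept.
design = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
beta = coef[1:]
resid = y - design @ coef
sigma2 = np.mean(resid ** 2)                  # ML estimate of a(phi) = sigma^2

# Terms beta_i * Cov(X_i, Y) of decomposition (1.15), using 1/n covariances.
yc = y - y.mean()
cov_xy = (X - X.mean(axis=0)).T @ yc / n
terms = beta * cov_xy

ecd = terms.sum() / (terms.sum() + sigma2)
r2 = 1.0 - np.sum(resid ** 2) / np.sum(yc ** 2)
print(np.isclose(ecd, r2))                    # True
print(terms / terms.sum())                    # relative size of each beta_i Cov(X_i, Y)
```

The last line prints the shares of the individual terms, which is the quantity the text proposes for assessing the relative importance of each $X_i$.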

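Example 1.5. Let $X$ and $Y$ be $p$ and $q$ dimensional random vectors, respectively, and let the joint distribution be a $(p + q)$-variate normal distribution with the following covariance matrix:

$$\begin{pmatrix} \Sigma_{XX} & \Sigma_{XY} \\ \Sigma_{YX} & \Sigma_{YY} \end{pmatrix}.$$

Let the inverse of the above matrix be denoted by

$$\begin{pmatrix} \Sigma^{XX} & \Sigma^{XY} \\ \Sigma^{YX} & \Sigma^{YY} \end{pmatrix}.$$

Then, $\theta = -\Sigma^{YX} X$ and $a(\varphi) = 1$, so we have

$$m_{PP}(Y|X) = -\mathrm{tr}\, \Sigma^{YX} \Sigma_{XY}.$$

From this,

$$\mathrm{ECD}(X, Y) = \frac{-\mathrm{tr}\, \Sigma^{YX} \Sigma_{XY}}{-\mathrm{tr}\, \Sigma^{YX} \Sigma_{XY} + 1}.$$

Let $\rho_i^2$ $(i = 1, 2, \ldots, \min\{p, q\})$ be the squared canonical correlation coefficients. Then,

$$\mathrm{ECD}(X, Y) = \frac{\sum_{i=1}^{\min\{p, q\}} \frac{\rho_i^2}{1 - \rho_i^2}}{\sum_{i=1}^{\min\{p, q\}} \frac{\rho_i^2}{1 - \rho_i^2} + 1}.$$

For $q = 1$, ECD is reduced to the usual coefficient of determination $\rho_1^2$ $(= R^2)$.

Example 1.6. In the logistic regression model (1.11), we have

$$\mathrm{ECD}(X, Y) = \frac{\sum_{i=1}^{p} \beta_i \mathrm{Cov}(X_i, Y)}{\sum_{i=1}^{p} \beta_i \mathrm{Cov}(X_i, Y) + 1}.$$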
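The two expressions in Example 1.5 can be compared numerically. The sketch below assumes NumPy, draws an arbitrary positive-definite covariance matrix, and takes the measure to be the negative trace of the lower-left block of the inverse covariance matrix multiplied by $\Sigma_{XY}$; that sign convention is an assumption about the notation. It then recomputes the same quantity from the squared canonical correlations.

```python
import numpy as np

rng = np.random.default_rng(1)
p, q = 3, 2
A = rng.normal(size=(p + q, p + q))
Sigma = A @ A.T + (p + q) * np.eye(p + q)      # an arbitrary positive-definite covariance matrix

Sxx, Sxy = Sigma[:p, :p], Sigma[:p, p:]
Syx, Syy = Sigma[p:, :p], Sigma[p:, p:]

# Route 1: lower-left block of the inverse covariance matrix.
Sigma_inv = np.linalg.inv(Sigma)
Sinv_yx = Sigma_inv[p:, :p]
m_pp_inverse = -np.trace(Sinv_yx @ Sxy)

# Route 2: squared canonical correlations = eigenvalues of Syy^{-1} Syx Sxx^{-1} Sxy.
M = np.linalg.solve(Syy, Syx @ np.linalg.solve(Sxx, Sxy))
rho2 = np.sort(np.linalg.eigvals(M).real)[::-1][: min(p, q)]
m_pp_canonical = np.sum(rho2 / (1.0 - rho2))

ecd = m_pp_canonical / (m_pp_canonical + 1.0)
print(np.isclose(m_pp_inverse, m_pp_canonical))   # True
print(0.0 < ecd < 1.0)                            # True
```

With a single response ($q = 1$) the sum has one term and the ratio collapses to the squared multiple correlation, in line with the remark in the example.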

2 Application of ECD

2.1 Asymptotic Property of the ML Estimator of ECD

Let $f(y)$ and $g(x)$ be the marginal density or probability functions of $Y$ and $X$, respectively. Then, the association measure is expressed as

$$m_{PP}(Y|X) = \iint f(y|x) g(x) \log \frac{f(y|x)}{f(y)}\, dx\, dy + \iint f(y) g(x) \log \frac{f(y)}{f(y|x)}\, dx\, dy. \tag{2.1}$$

If $Y$ is discrete, the integral is replaced with the summation. If $X$ is not random and takes values $x_k$ $(k = 1, 2, \ldots, K)$, the above measure can be modified as follows:

$$m_{PP}(Y|X) = \sum_{k=1}^{K} \frac{n_k}{n} \left\{ \int f(y|x_k) \log \frac{f(y|x_k)}{f(y)}\, dy + \int f(y) \log \frac{f(y)}{f(y|x_k)}\, dy \right\}, \tag{2.2}$$

where $n_k$ are the sample sizes at levels $x_k$ $(k = 1, 2, \ldots, K)$, and $n = \sum_{k=1}^{K} n_k$. We have the following theorem.

Theorem 2.1. Let $\hat{m}_{PP}(Y|X)$, $\hat{f}(y|x_k)$, and $\hat{f}(y)$ be the ML estimators of $m_{PP}(Y|X)$, $f(y|x_k)$, and $f(y)$, respectively. If the null model, i.e. $\beta = 0$, holds, the ML estimator of (2.2) multiplied by sample size $n$, i.e.

$$n\, \hat{m}_{PP}(Y|X) = \sum_{k=1}^{K} n_k \left\{ \int \hat{f}(y|x_k) \log \frac{\hat{f}(y|x_k)}{\hat{f}(y)}\, dy + \int \hat{f}(y) \log \frac{\hat{f}(y)}{\hat{f}(y|x_k)}\, dy \right\},$$

is asymptotically distributed according to the chi-square distribution with $p$ degrees of freedom as the sample sizes $n_k$ tend to infinity.

Proof. For simplicity of the discussion, the theorem is proven in the case where $Y$ is a polytomous variable with levels or categories $\{1, 2, \ldots, J\}$. Let $\pi_{j|k} = \Pr(Y = j | X = x_k)$ and $\pi_j = \Pr(Y = j)$, and let $\hat{\pi}_{j|k}$ and $\hat{\pi}_j$ be the ML estimators of $\pi_{j|k}$ and $\pi_j$, respectively. Under the null hypothesis and for sufficiently large $n_k$, we have

$$n\, \hat{m}_{PP}(Y|X) = \sum_{k=1}^{K} n_k \sum_{j=1}^{J} \left\{ \hat{\pi}_{j|k} \log \frac{\hat{\pi}_{j|k}}{\hat{\pi}_j} + \hat{\pi}_j \log \frac{\hat{\pi}_j}{\hat{\pi}_{j|k}} \right\}$$

$$= \sum_{k=1}^{K} \sum_{j=1}^{J} \frac{(n_k \hat{\pi}_{j|k} - n_k \hat{\pi}_j)^2}{2 n_k \hat{\pi}_{j|k}} + \sum_{k=1}^{K} \sum_{j=1}^{J} \frac{(n_k \hat{\pi}_{j|k} - n_k \hat{\pi}_j)^2}{2 n_k \hat{\pi}_j} + o(n) = \sum_{k=1}^{K} \sum_{j=1}^{J} \frac{(n_k \hat{\pi}_{j|k} - n_k \hat{\pi}_j)^2}{n_k \hat{\pi}_{j|k}} + o(n),$$

where

$$\frac{o(n)}{n} \xrightarrow{P} 0 \quad (n \to \infty).$$

Hence, the theorem follows.

When the explanatory variables $X$ are random, the following theorem holds similarly.

Theorem 2.2. If the null model, i.e. $\beta = 0$, holds, the ML estimator of (2.1) multiplied by sample size $n$, i.e.

$$n\, \hat{m}_{PP}(Y|X) = n \left\{ \iint \hat{f}(y|x) \hat{g}(x) \log \frac{\hat{f}(y|x)}{\hat{f}(y)}\, dx\, dy + \iint \hat{f}(y) \hat{g}(x) \log \frac{\hat{f}(y)}{\hat{f}(y|x)}\, dx\, dy \right\},$$

is asymptotically distributed according to the chi-square distribution with $p$ degrees of freedom as the sample size $n$ tends to infinity.
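The first step of the proof of Theorem 2.1 can be illustrated for a polytomous response. The sketch below assumes NumPy and a hypothetical table of counts that is close to the null model; it computes $n\, \hat{m}_{PP}$ from the two KL terms in (2.2), with the marginal distribution weighted by $n_k / n$ as in Remark 1.1, and compares it with the sum of the two quadratic forms.

```python
import numpy as np

# Hypothetical counts: rows are factor levels x_k, columns are response categories j.
counts = np.array([[200, 310, 490],
                   [215, 290, 495],
                   [190, 300, 510]], dtype=float)

n_k = counts.sum(axis=1)                 # sample size at each level
n = counts.sum()
pi_jk = counts / n_k[:, None]            # ML estimates of Pr(Y = j | X = x_k)
pi_j = counts.sum(axis=0) / n            # ML estimates of Pr(Y = j)

# n * m_PP: weighted sum of the two KL terms in (2.2).
kl_forward = np.sum(pi_jk * np.log(pi_jk / pi_j), axis=1)
kl_reverse = np.sum(pi_j * np.log(pi_j / pi_jk), axis=1)
stat = np.sum(n_k * (kl_forward + kl_reverse))

# Sum of the two quadratic forms appearing in the proof.
diff2 = (n_k[:, None] * (pi_jk - pi_j)) ** 2
approx = (np.sum(diff2 / (2 * n_k[:, None] * pi_jk))
          + np.sum(diff2 / (2 * n_k[:, None] * pi_j)))

print(stat, approx)
print(abs(stat - approx) / stat < 0.01)   # True: the two agree closely near the null model
```

The agreement deteriorates as the conditional distributions move away from the marginal one, which is consistent with the approximation being stated under the null hypothesis.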

Since

$$\mathrm{ECD}(X, Y) = \frac{\mathrm{Cov}(\theta, Y)/a(\varphi)}{\mathrm{Cov}(\theta, Y)/a(\varphi) + 1} = \frac{m_{PP}(Y|X)}{m_{PP}(Y|X) + 1},$$

the ML estimator of $\mathrm{ECD}(X, Y)$ is

$$\widehat{\mathrm{ECD}}(X, Y) = \frac{\hat{m}_{PP}(Y|X)}{\hat{m}_{PP}(Y|X) + 1}.$$

From this, we can test the hypothesis $\mathrm{ECD}(X, Y) = 0$ based on the following statistic:

$$\chi^2 = n\, \hat{m}_{PP}(Y|X) = n\, \frac{\widehat{\mathrm{Cov}}(\theta, Y)}{a(\hat{\varphi})}. \tag{2.3}$$

The above statistic is asymptotically distributed according to a non-central chi-square distribution with non-centrality $\lambda = n\, m_{PP}(Y|X)$ and $p$ degrees of freedom. Let

$$c = 1 + \frac{\lambda}{p + \lambda} \quad \text{and} \quad \nu' = p + \frac{\lambda^2}{p + 2\lambda}.$$

Statistic $\chi^2 / c$ is asymptotically distributed according to the chi-square distribution with $\nu'$ degrees of freedom. As $\nu'$ becomes large, the chi-square distribution tends to a normal distribution with mean $\nu'$ and variance $2\nu'$. From this, for sufficiently large sample size $n$, statistic (2.3)

divided by $n$, i.e. $\chi^2 / n = \hat{m}_{PP}(Y|X)$, is asymptotically normally distributed with mean $\frac{c \nu'}{n}$ and variance $\frac{2 c^2 \nu'}{n^2}$ (Patnaik, 1949). For sufficiently large $n$, we have

$$\frac{c \nu'}{n} \approx m_{PP}(Y|X), \qquad \frac{2 c^2 \nu'}{n^2} \approx \frac{2c}{n}\, m_{PP}(Y|X).$$

From this, the asymptotic standard error (ASE) of $\hat{m}_{PP}(Y|X)$ is

$$\sqrt{\frac{2c}{n}\, \hat{m}_{PP}(Y|X)}.$$

Example 2.1. Agresti (2002) analyzed the beetle mortality data with a complementary log-log model. Beetles were exposed to gaseous carbon disulfide at various concentrations, and the numbers of beetles killed after 5 hours of exposure were observed (Table 2.1). Let $X$ be the gaseous carbon disulfide concentration, and $\alpha$ and $\beta$ the model parameters. Then, for the complementary log-log link we have

$$\theta = \log \frac{1 - \exp\{-\exp(\alpha + \beta x)\}}{\exp\{-\exp(\alpha + \beta x)\}}.$$

For the maximum likelihood estimates $\hat{\alpha} = -39.52$ (ASE = 3.23) and $\hat{\beta} = 22.01$ (ASE = 1.80), ECD is calculated as

$$\mathrm{ECD} = 0.475 \quad (\mathrm{ASE} = 0.024).$$

The ECD indicates that 47.5% of the variation of response variable $Y$ in entropy is explained by the gaseous carbon disulfide concentration $X$.
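The quantities $c$, $\nu'$ and the ASE of $\hat{m}_{PP}$ described above are simple to compute. The sketch below uses only the Python standard library; the function name `patnaik_summary` and the input values are hypothetical, and the standard error of ECD is obtained by the delta method, which is an assumption made here rather than a step stated in the text.

```python
import math

def patnaik_summary(m_pp, n, p):
    """Patnaik-type approximation for chi2 = n * m_PP with p degrees of freedom."""
    lam = n * m_pp                          # non-centrality parameter
    c = 1.0 + lam / (p + lam)               # scale factor
    nu = p + lam ** 2 / (p + 2.0 * lam)     # adjusted degrees of freedom
    ase_m_pp = math.sqrt(2.0 * c * m_pp / n)
    ecd = m_pp / (m_pp + 1.0)
    # Assumption: delta method, d ECD / d m_PP = 1 / (1 + m_PP)^2.
    ase_ecd = ase_m_pp / (1.0 + m_pp) ** 2
    return {"c": c, "nu": nu, "ase_m_pp": ase_m_pp, "ecd": ecd, "ase_ecd": ase_ecd}

# Hypothetical inputs, for illustration only.
out = patnaik_summary(m_pp=0.25, n=400, p=3)
print(out)
```

Note that $c \nu' = p + \lambda$, so the approximating distribution has the same mean as the non-central chi-square distribution, and that $c$ always lies between 1 and 2.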

Table 2.1. Beetles Killed after Exposure to Carbon Disulfide
Columns: Log Dose; No. of Beetles; No. of Beetles Killed

2.2 GLMs with canonical links

Most regression analyses with GLMs are performed using canonical links. Let $X = (X_1, X_2, \ldots, X_p)^T$ be a $p \times 1$ factor or explanatory variable vector; let $Y$ be a response variable; let $\beta = (\beta_1, \beta_2, \ldots, \beta_p)^T$ be a regression parameter vector; and let $\theta = \sum_{i=1}^{p} \beta_i X_i$ be the canonical link. Then, ECD is decomposed as

$$\mathrm{ECD}(X, Y) = \sum_{i=1}^{p} \frac{\beta_i \mathrm{Cov}(X_i, Y)}{\sum_{j=1}^{p} \beta_j \mathrm{Cov}(X_j, Y) + a(\varphi)}. \tag{2.4}$$

The above decomposition consists of components that relate to the regression coefficients $\beta_i$, and the contribution of $X_i$ to $Y$ may be defined by using

$\beta_i \mathrm{Cov}(X_i, Y)$. If the $X_i$ are independent or the experimental design is a multiway layout experiment, the contribution ratio of $X_i$ to the response is defined by

$$\mathrm{CR}(X_i) = \frac{\beta_i \mathrm{Cov}(X_i, Y)}{\sum_{k=1}^{p} \beta_k \mathrm{Cov}(X_k, Y)}.$$

In general, the $X_i$ are correlated or the experimental model has higher-order interactions. Then, the contribution ratio of $X_i$ is defined by

$$\mathrm{CR}(X_i) = \frac{\mathrm{Cov}(\theta, Y) - \mathrm{Cov}(\theta, Y | X_i)}{\mathrm{Cov}(\theta, Y)}.$$

Example 2.2. The present discussion is applied to the ordinary two-way layout experimental design model. Let $X_1$ and $X_2$ be factors with levels $\{1, 2, \ldots, I\}$ and $\{1, 2, \ldots, J\}$, respectively. Then, the linear predictor is a function of $(X_1, X_2) = (i, j)$, i.e.

$$\theta = \alpha_i + \beta_j + (\alpha\beta)_{ij}.$$

For model identification, the following constraints are placed on these parameters:

$$\sum_{i=1}^{I} \alpha_i = \sum_{j=1}^{J} \beta_j = \sum_{i=1}^{I} (\alpha\beta)_{ij} = \sum_{j=1}^{J} (\alpha\beta)_{ij} = 0.$$

Let

$$X_{ki} = \begin{cases} 1 & (X_k = i) \\ 0 & (X_k \ne i) \end{cases} \quad (k = 1, 2).$$

Then, dummy vectors

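$X_1 = (X_{11}, X_{12}, \ldots, X_{1I})^T$ and $X_2 = (X_{21}, X_{22}, \ldots, X_{2J})^T$ are identified with factors $X_1$ and $X_2$, respectively. From this, the systematic component of the above model can be written as follows:

$$\theta = \alpha^T X_1 + \beta^T X_2 + \gamma^T (X_1 \otimes X_2),$$

where $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_I)^T$, $\beta = (\beta_1, \beta_2, \ldots, \beta_J)^T$, $\gamma = ((\alpha\beta)_{11}, (\alpha\beta)_{12}, \ldots, (\alpha\beta)_{1J}, \ldots, (\alpha\beta)_{IJ})^T$, and $X_1 \otimes X_2 = (X_{11} X_{21}, X_{11} X_{22}, \ldots, X_{11} X_{2J}, \ldots, X_{1I} X_{2J})^T$. Let $\mathrm{Cov}(X_1, Y)$, $\mathrm{Cov}(X_2, Y)$ and $\mathrm{Cov}(X_1 \otimes X_2, Y)$ be covariance matrices. Then the total effect of $X_1$ and $X_2$ is

$$m_{PP}(Y | (X_1, X_2)) = \frac{\mathrm{Cov}(\theta, Y)}{\sigma^2} = \frac{\mathrm{tr}\, \alpha^T \mathrm{Cov}(X_1, Y) + \mathrm{tr}\, \beta^T \mathrm{Cov}(X_2, Y) + \mathrm{tr}\, \gamma^T \mathrm{Cov}(X_1 \otimes X_2, Y)}{\sigma^2} = \frac{\frac{1}{I} \sum_{i=1}^{I} \alpha_i^2 + \frac{1}{J} \sum_{j=1}^{J} \beta_j^2 + \frac{1}{IJ} \sum_{j=1}^{J} \sum_{i=1}^{I} (\alpha\beta)_{ij}^2}{\sigma^2}.$$

The three terms in the numerator are referred to as the main effect of $X_1$, that of $X_2$, and the interactive effect, respectively. Then, ECD is calculated as follows:

$$\mathrm{ECD}((X_1, X_2), Y) = \frac{\frac{1}{I} \sum_{i=1}^{I} \alpha_i^2 + \frac{1}{J} \sum_{j=1}^{J} \beta_j^2 + \frac{1}{IJ} \sum_{j=1}^{J} \sum_{i=1}^{I} (\alpha\beta)_{ij}^2}{\frac{1}{I} \sum_{i=1}^{I} \alpha_i^2 + \frac{1}{J} \sum_{j=1}^{J} \beta_j^2 + \frac{1}{IJ} \sum_{j=1}^{J} \sum_{i=1}^{I} (\alpha\beta)_{ij}^2 + \sigma^2}.$$

In this case,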

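$$\mathrm{CR}(X_1) = \frac{\mathrm{Cov}(\theta, Y) - \mathrm{Cov}(\theta, Y | X_1)}{\mathrm{Cov}(\theta, Y)} = \frac{\frac{1}{I} \sum_{i=1}^{I} \alpha_i^2}{\mathrm{Cov}(\theta, Y)}, \qquad \mathrm{CR}(X_2) = \frac{\mathrm{Cov}(\theta, Y) - \mathrm{Cov}(\theta, Y | X_2)}{\mathrm{Cov}(\theta, Y)} = \frac{\frac{1}{J} \sum_{j=1}^{J} \beta_j^2}{\mathrm{Cov}(\theta, Y)}.$$

The rest,

$$1 - \mathrm{CR}(X_1) - \mathrm{CR}(X_2) = \frac{\frac{1}{IJ} \sum_{j=1}^{J} \sum_{i=1}^{I} (\alpha\beta)_{ij}^2}{\mathrm{Cov}(\theta, Y)},$$

is due to the effect of the interaction.

Table 2.2. Length of Home Visit in Minutes by Public Health Nurses, by Nurse's Age Group and Type of Patient
Factor $X_2$ (nurse's age group, years old): 1 (20 to 29), 2 (30 to 39), 3 (40 to 49), 4 (50 and over)
Factor $X_1$ (type of patient): 1 (Cardiac), 2 (Cancer), 3 (C.V.A.), 4 (Tuberculosis)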

Table 2.2 shows two-way layout experiment data from a study of the length of time spent on individual home visits by public health nurses (Daniel, 1999). In this example, the effects of the factors, i.e. the type of patient and the age of the nurse, on the nurses' behavior are of interest. Let $Y$ be the length of a home visit, and let factors $X_1$ and $X_2$ denote the type of patient and the age of the nurse, respectively. The results of the two-way analysis of variance are shown in Table 2.3. The main and interactive effects of the factors are significant. In this case, the levels of factor vector $X = (X_1, X_2)$ are $(i, j)$ $(i = 1, 2, 3, 4;\ j = 1, 2, 3, 4)$. Although factors $X_1$ and $X_2$ are independent (orthogonal), the model has interaction terms between them. Applying the present approach to the variance decomposition in Table 2.3, with the sums of squares for $X_1$, $X_2$ and the interaction in the numerator and the total sum of squares in the denominator, we have

$$\mathrm{ECD}((X_1, X_2), Y) = 0.796 \quad (\mathrm{SE} = 0.018).$$

From this, 79.6% of the entropy is explained by the two variables (factors). The contributions of the factors are calculated as follows:

$$\mathrm{CR}(X_1) = 0.502 \quad \text{and} \quad \mathrm{CR}(X_2) = 0.184.$$

The contribution of $X_1$ to $Y$ is about three times greater than that of $X_2$.
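The decomposition used in Example 2.2 can be reproduced on simulated data. The sketch below assumes NumPy and a balanced layout with hypothetical effects; it does not use the nurse data. The contribution ratios follow the definition given with Example 2.2, in which each component is divided by the explained part $\mathrm{Cov}(\theta, Y)$.

```python
import numpy as np

rng = np.random.default_rng(2)
I, J, r = 4, 4, 5                                   # levels of X1, X2 and replicates per cell
a_eff = np.array([3.0, -1.0, -2.0, 0.0])            # hypothetical main effects of X1
b_eff = np.array([1.0, 0.5, -0.5, -1.0])            # hypothetical main effects of X2
y = (30 + a_eff[:, None, None] + b_eff[None, :, None]
     + rng.normal(scale=4.0, size=(I, J, r)))

grand = y.mean()
row = y.mean(axis=(1, 2))                           # means at the levels of X1
col = y.mean(axis=(0, 2))                           # means at the levels of X2
cell = y.mean(axis=2)                               # cell means

alpha = row - grand
beta = col - grand
inter = cell - row[:, None] - col[None, :] + grand

main1 = np.mean(alpha ** 2)                         # (1/I) sum alpha_i^2
main2 = np.mean(beta ** 2)                          # (1/J) sum beta_j^2
interaction = np.mean(inter ** 2)                   # (1/IJ) sum (alpha beta)_ij^2
sigma2 = np.mean((y - cell[:, :, None]) ** 2)       # ML estimate of the error variance

explained = main1 + main2 + interaction
ecd = explained / (explained + sigma2)
cr = {"X1": main1 / explained, "X2": main2 / explained, "X1*X2": interaction / explained}
print(round(ecd, 3), cr)
print(np.isclose(ecd, 1 - sigma2 / y.var()))        # True: ECD equals R^2 for this model
```

In a balanced design the four components add up to the total variance of $Y$, so the ECD here is the explained sum of squares divided by the total sum of squares.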

Table 2.3. Analysis of variance of length of time spent on individual home visits by public health nurses

Source       SS        df   MS   F        p
X_1                              52.641   0.000
X_2                              19.334   0.000
X_1 X_2                          3.831    0.000
Residual
Total        6423.55   79

Example 2.3. Table 2.4 shows death penalty data used to study the effects of the defendant's and victim's racial characteristics on whether persons convicted of homicide received the death penalty (Agresti, 2002), and the data were analyzed with the following logit model:

$$f(y|x) = \frac{\exp\{(\alpha + \beta_d x_d + \beta_v x_v) y\}}{1 + \exp(\alpha + \beta_d x_d + \beta_v x_v)},$$

where $X_d$ and $X_v$ denote the defendant's and victim's race, i.e. 0 = black and 1 = white; the response death penalty $Y$ takes the value 0 = no or 1 = yes. The estimated parameters were $\hat{\alpha} = -3.596$ (SE = 0.507), $\hat{\beta}_d = -0.868$ (SE = 0.367) and $\hat{\beta}_v = 2.404$ (SE = 0.601) (Agresti, p. 201). From the results, the odds of the death penalty for a white defendant, given the victim's race, are $\exp(-0.868) = 0.420$ times those for a black defendant, and the odds of the death penalty for a white victim, given the defendant's race, are $\exp(2.404) = 11.067$ times those for a black victim. Since the

predictive power measured with ECD is $\mathrm{ECD} = 0.036$ (SE = 0.014), the effects of the defendant's and victim's races on the death penalty are small. The contribution ratios of the explanatory variables to the response are calculated according to (2.4) as follows:

$$\mathrm{CR}(X_d) = 0.603 \quad \text{and} \quad \mathrm{CR}(X_v) = 0.813.$$

Table 2.4. Death Penalty Data

Victim's Race   Defendant's Race   Death Penalty: Yes   No
White           White
White           Black
Black           White              0                    16
Black           Black

2.3 Application to a generalized logit model

A baseline-category logit model is considered. Let $X_1$ and $X_2$ be categorical factors that take levels $\{1, 2, \ldots, I\}$ and $\{1, 2, \ldots, J\}$, respectively, and let $Y$ be a categorical response variable with levels $\{1, 2, \ldots, K\}$. Let

$$X_{ai} = \begin{cases} 1 & (X_a = i) \\ 0 & (X_a \ne i) \end{cases} \quad (a = 1, 2) \quad \text{and} \quad Y_k = \begin{cases} 1 & (Y = k) \\ 0 & (Y \ne k). \end{cases} \tag{2.5}$$

Then, dummy variable vectors

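$X_1 = (X_{11}, X_{12}, \ldots, X_{1I})^T$, $X_2 = (X_{21}, X_{22}, \ldots, X_{2J})^T$ and $Y = (Y_1, Y_2, \ldots, Y_K)^T$ are identified with factors $X_1$, $X_2$ and response $Y$, respectively. From this, the systematic component of the baseline-category logit model is assumed as follows:

$$\theta = \alpha + B_{(1)} X_1 + B_{(2)} X_2,$$

where

$$\alpha = \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_K \end{pmatrix}, \quad B_{(1)} = \begin{pmatrix} \beta_{(1)11} & \beta_{(1)12} & \cdots & \beta_{(1)1I} \\ \beta_{(1)21} & \beta_{(1)22} & \cdots & \beta_{(1)2I} \\ \vdots & \vdots & & \vdots \\ \beta_{(1)K1} & \beta_{(1)K2} & \cdots & \beta_{(1)KI} \end{pmatrix}, \quad \text{and} \quad B_{(2)} = \begin{pmatrix} \beta_{(2)11} & \beta_{(2)12} & \cdots & \beta_{(2)1J} \\ \beta_{(2)21} & \beta_{(2)22} & \cdots & \beta_{(2)2J} \\ \vdots & \vdots & & \vdots \\ \beta_{(2)K1} & \beta_{(2)K2} & \cdots & \beta_{(2)KJ} \end{pmatrix}.$$

Then, the logit model is described as

$$\Pr(Y = y | x_1, x_2) = \frac{\exp(y^T B_{(1)} x_1 + y^T B_{(2)} x_2 + y^T \alpha)}{\sum_z \exp(z^T B_{(1)} x_1 + z^T B_{(2)} x_2 + z^T \alpha)},$$

where $\sum_z$ denotes the summation over all $z$. In this model, we have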

$$\mathrm{ECD}(X, Y) = \frac{\mathrm{tr}\, B_{(1)} \mathrm{Cov}(X_1, Y) + \mathrm{tr}\, B_{(2)} \mathrm{Cov}(X_2, Y)}{\mathrm{tr}\, B_{(1)} \mathrm{Cov}(X_1, Y) + \mathrm{tr}\, B_{(2)} \mathrm{Cov}(X_2, Y) + 1}.$$

In the following example, the ECD approach is demonstrated.

Example 2.4. The data from an investigation of factors influencing the primary food choice of alligators (Table 2.5) are analyzed (Agresti, 2002). In this example, the explanatory variables are $X_1$: lakes where alligators live, {1. Hancock, 2. Oklawaha, 3. Trafford, 4. George}; and $X_2$: sizes of alligators, {1. Small, 2. Large}; and the response variable is $Y$: primary food choice of alligators, {1. Fish, 2. Invertebrate, 3. Reptile, 4. Bird, 5. Other}. In this analysis, the generalized logit model described in this section is used, and we set $I = 4$, $J = 2$, and $K = 5$ in (2.5). From this model, the estimates $\hat{B}_1$ and $\hat{B}_2$ of the regression coefficient matrices are obtained (Agresti, 2002). By using these estimates, we have

$$\mathrm{tr}\, \hat{B}_1 \widehat{\mathrm{Cov}}(X_1, Y) = 0.258 \quad \text{and} \quad \mathrm{tr}\, \hat{B}_2 \widehat{\mathrm{Cov}}(X_2, Y) = 0.107,$$

$$n\, \hat{m}_{PP}(Y | X_1, X_2) = 219\, (0.258 + 0.107) = 79.935 \quad (df = 16;\ P = 0.000).$$
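The arithmetic of Example 2.4 can be retraced from the two trace values, the sample size and the degrees of freedom given in the text. The sketch below uses only the Python standard library. The standard error of ECD is obtained from the Patnaik-type ASE of $\hat{m}_{PP}$ followed by the delta method; the delta-method step is an assumption made here, although it reproduces the standard error reported for this example.

```python
import math

n, df = 219, 16
tr_lake, tr_size = 0.258, 0.107          # tr(B1 Cov(X1, Y)) and tr(B2 Cov(X2, Y)) from the text

m_pp = tr_lake + tr_size
chi2 = n * m_pp                          # test statistic (2.3)
ecd = m_pp / (m_pp + 1.0)

# Patnaik-type standard error of m_PP, then delta method for ECD (the delta step is an assumption).
c = 1.0 + chi2 / (df + chi2)
ase_m_pp = math.sqrt(2.0 * c * m_pp / n)
ase_ecd = ase_m_pp / (1.0 + m_pp) ** 2

print(round(chi2, 3), round(ecd, 3), round(ase_ecd, 3), round(tr_lake / tr_size, 1))
# 79.935 0.267 0.042 2.4
```

The last printed value is the ratio of the two trace terms, i.e. the relative size of the Lake and Size components.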

From this, the effect of $X_1$ and $X_2$ is significant. From the ECD expression for this model, we can calculate the ECD as follows:

$$\widehat{\mathrm{ECD}}(X, Y) = \frac{0.258 + 0.107}{0.258 + 0.107 + 1} = 0.267 \quad (\mathrm{SE} = 0.042).$$

Although the effects of the factors are statistically significant, the predictive power of the logit model may be small, i.e. only 26.7% of the variation of the response variable in entropy is explained by the explanatory variables. The effect of Lake on Food is about 2.4 times greater than that of Size.

Table 2.5. Alligator Food Choice Data
Columns: Lake; Size of Alligator; Primary Food Choice (Fish, Invertebrate, Reptile, Bird, Other)
Lakes: Hancock, Oklawaha, Trafford, George; sizes within each lake: ≤ 2.3 m (S), > 2.3 m (L)

3 Conclusion

In the GLM framework, regression models are described with random, systematic and link components, and GLMs are widely applied in data analyses;

however, the explanatory power of GLMs has not been measured in practical data analyses except for the ordinary linear regression model. In this seminar, GLMs have been discussed from the viewpoint of entropy, ECD for measuring the explanatory or predictive power of GLMs has been introduced, and the utility of ECD has been shown for practical data analyses.

Acknowledgement

The author would like to thank Prof. Claudio Borroni and all the members of the Department of Quantitative Methods for Economics and Business Sciences, the University of Milan (Università degli Studi di Milano), for giving him a valuable opportunity to present this seminar.

References

[1] Agresti, A. (1986). Applying R^2-type measures to ordered categorical data, Technometrics, 28.
[2] Agresti, A. (2002). Categorical Data Analysis, Second Edition, John Wiley & Sons, Inc.: New York.
[3] Ash, A. & Shwartz, M. (1999). R^2: A useful measure of model performance when predicting a dichotomous outcome, Statistics in Medicine, 18.
[4] Daniel, W. W. (1999). Biostatistics: A Foundation for Analysis in the Health Sciences, Seventh Edition, John Wiley & Sons, Inc.: New York.
[5] Efron, B. (1978). Regression and ANOVA with zero-one data: measures of residual variation, Journal of the American Statistical Association, 73.
[6] Eshima, N. & Tabata, M. (2007). Entropy correlation coefficient for measuring predictive power of generalized linear models, Statistics and Probability Letters, 77.
[7] Eshima, N. & Tabata, M. (2010). Entropy coefficient of determination for generalized linear models, Computational Statistics and Data Analysis, 54.
[8] Eshima, N. & Tabata, M. (2011). Three predictive power measures for generalized linear models: Entropy coefficient of determination, entropy correlation coefficient and regression correlation coefficient, Computational Statistics and Data Analysis, 55.
[9] Goodman, L. A. (1971). The analysis of multinomial contingency tables: stepwise procedures and direct estimation methods for building models for multiple classifications, Technometrics, 13.
[10] Haberman, S. J. (1982). Analysis of dispersion of multinomial responses, Journal of the American Statistical Association, 77.
[11] Kent, J. T. (1983). Information gain and a general measure of correlation, Biometrika, 70.
[12] Korn, E. L. & Simon, R. (1991). Explained residual variation, explained risk and goodness of fit, American Statistician, 45.
[13] McCullagh, P. & Nelder, J. A. (1989). Generalized Linear Models, 2nd Ed., Chapman and Hall: London.
[14] Mittlböck, M. & Schemper, M. (1996). Explained variation for logistic regression, Statistics in Medicine, 15.
[15] Nelder, J. A. & Wedderburn, R. W. M. (1972). Generalized linear models, Journal of the Royal Statistical Society A, 135.
[16] Patnaik, P. B. (1949). The non-central χ^2 and F-distributions and their applications, Biometrika, 36.
[17] Theil, H. (1970). On the estimation of relationships involving qualitative variables, American Journal of Sociology, 76.
[18] Zheng, B. & Agresti, A. (2000). Summarizing the predictive power of a generalized linear model, Statistics in Medicine, 19.


ECONOMET RICS P RELIM EXAM August 24, 2010 Department of Economics, Michigan State University ECONOMET RICS P RELIM EXAM August 24, 2010 Department of Economics, Michigan State University Instructions: Answer all four (4) questions. Be sure to show your work or provide su cient justi cation for

More information

Statistics in medicine

Statistics in medicine Statistics in medicine Lecture 4: and multivariable regression Fatma Shebl, MD, MS, MPH, PhD Assistant Professor Chronic Disease Epidemiology Department Yale School of Public Health Fatma.shebl@yale.edu

More information

MC3: Econometric Theory and Methods. Course Notes 4

MC3: Econometric Theory and Methods. Course Notes 4 University College London Department of Economics M.Sc. in Economics MC3: Econometric Theory and Methods Course Notes 4 Notes on maximum likelihood methods Andrew Chesher 25/0/2005 Course Notes 4, Andrew

More information

LOGISTIC REGRESSION Joseph M. Hilbe

LOGISTIC REGRESSION Joseph M. Hilbe LOGISTIC REGRESSION Joseph M. Hilbe Arizona State University Logistic regression is the most common method used to model binary response data. When the response is binary, it typically takes the form of

More information

1 A Non-technical Introduction to Regression

1 A Non-technical Introduction to Regression 1 A Non-technical Introduction to Regression Chapters 1 and Chapter 2 of the textbook are reviews of material you should know from your previous study (e.g. in your second year course). They cover, in

More information

Stat 642, Lecture notes for 04/12/05 96

Stat 642, Lecture notes for 04/12/05 96 Stat 642, Lecture notes for 04/12/05 96 Hosmer-Lemeshow Statistic The Hosmer-Lemeshow Statistic is another measure of lack of fit. Hosmer and Lemeshow recommend partitioning the observations into 10 equal

More information

Multivariate Regression

Multivariate Regression Multivariate Regression The so-called supervised learning problem is the following: we want to approximate the random variable Y with an appropriate function of the random variables X 1,..., X p with the

More information

LOGISTICS REGRESSION FOR SAMPLE SURVEYS

LOGISTICS REGRESSION FOR SAMPLE SURVEYS 4 LOGISTICS REGRESSION FOR SAMPLE SURVEYS Hukum Chandra Indian Agricultural Statistics Research Institute, New Delhi-002 4. INTRODUCTION Researchers use sample survey methodology to obtain information

More information

ST3241 Categorical Data Analysis I Multicategory Logit Models. Logit Models For Nominal Responses

ST3241 Categorical Data Analysis I Multicategory Logit Models. Logit Models For Nominal Responses ST3241 Categorical Data Analysis I Multicategory Logit Models Logit Models For Nominal Responses 1 Models For Nominal Responses Y is nominal with J categories. Let {π 1,, π J } denote the response probabilities

More information

Repeated ordinal measurements: a generalised estimating equation approach

Repeated ordinal measurements: a generalised estimating equation approach Repeated ordinal measurements: a generalised estimating equation approach David Clayton MRC Biostatistics Unit 5, Shaftesbury Road Cambridge CB2 2BW April 7, 1992 Abstract Cumulative logit and related

More information

MULTIVARIATE POPULATIONS

MULTIVARIATE POPULATIONS CHAPTER 5 MULTIVARIATE POPULATIONS 5. INTRODUCTION In the following chapters we will be dealing with a variety of problems concerning multivariate populations. The purpose of this chapter is to provide

More information

Speci cation of Conditional Expectation Functions

Speci cation of Conditional Expectation Functions Speci cation of Conditional Expectation Functions Econometrics Douglas G. Steigerwald UC Santa Barbara D. Steigerwald (UCSB) Specifying Expectation Functions 1 / 24 Overview Reference: B. Hansen Econometrics

More information

Prediction of ordinal outcomes when the association between predictors and outcome diers between outcome levels

Prediction of ordinal outcomes when the association between predictors and outcome diers between outcome levels STATISTICS IN MEDICINE Statist. Med. 2005; 24:1357 1369 Published online 26 November 2004 in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/sim.2009 Prediction of ordinal outcomes when the

More information

Treatment Variables INTUB duration of endotracheal intubation (hrs) VENTL duration of assisted ventilation (hrs) LOWO2 hours of exposure to 22 49% lev

Treatment Variables INTUB duration of endotracheal intubation (hrs) VENTL duration of assisted ventilation (hrs) LOWO2 hours of exposure to 22 49% lev Variable selection: Suppose for the i-th observational unit (case) you record ( failure Y i = 1 success and explanatory variabales Z 1i Z 2i Z ri Variable (or model) selection: subject matter theory and

More information

GLM I An Introduction to Generalized Linear Models

GLM I An Introduction to Generalized Linear Models GLM I An Introduction to Generalized Linear Models CAS Ratemaking and Product Management Seminar March Presented by: Tanya D. Havlicek, ACAS, MAAA ANTITRUST Notice The Casualty Actuarial Society is committed

More information

UNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator

UNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator UNIVERSITY OF TORONTO Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS Duration - 3 hours Aids Allowed: Calculator LAST NAME: FIRST NAME: STUDENT NUMBER: There are 27 pages

More information

Logistic Regression. Fitting the Logistic Regression Model BAL040-A.A.-10-MAJ

Logistic Regression. Fitting the Logistic Regression Model BAL040-A.A.-10-MAJ Logistic Regression The goal of a logistic regression analysis is to find the best fitting and most parsimonious, yet biologically reasonable, model to describe the relationship between an outcome (dependent

More information

Generalized linear models for binary data. A better graphical exploratory data analysis. The simple linear logistic regression model

Generalized linear models for binary data. A better graphical exploratory data analysis. The simple linear logistic regression model Stat 3302 (Spring 2017) Peter F. Craigmile Simple linear logistic regression (part 1) [Dobson and Barnett, 2008, Sections 7.1 7.3] Generalized linear models for binary data Beetles dose-response example

More information

Categorical data analysis Chapter 5

Categorical data analysis Chapter 5 Categorical data analysis Chapter 5 Interpreting parameters in logistic regression The sign of β determines whether π(x) is increasing or decreasing as x increases. The rate of climb or descent increases

More information

h=1 exp (X : J h=1 Even the direction of the e ect is not determined by jk. A simpler interpretation of j is given by the odds-ratio

h=1 exp (X : J h=1 Even the direction of the e ect is not determined by jk. A simpler interpretation of j is given by the odds-ratio Multivariate Response Models The response variable is unordered and takes more than two values. The term unordered refers to the fact that response 3 is not more favored than response 2. One choice from

More information

Introduction to Econometrics

Introduction to Econometrics Introduction to Econometrics Michael Bar October 3, 08 San Francisco State University, department of economics. ii Contents Preliminaries. Probability Spaces................................. Random Variables.................................

More information

Logistic Regression. Continued Psy 524 Ainsworth

Logistic Regression. Continued Psy 524 Ainsworth Logistic Regression Continued Psy 524 Ainsworth Equations Regression Equation Y e = 1 + A+ B X + B X + B X 1 1 2 2 3 3 i A+ B X + B X + B X e 1 1 2 2 3 3 Equations The linear part of the logistic regression

More information

8 Nominal and Ordinal Logistic Regression

8 Nominal and Ordinal Logistic Regression 8 Nominal and Ordinal Logistic Regression 8.1 Introduction If the response variable is categorical, with more then two categories, then there are two options for generalized linear models. One relies on

More information

1. The Multivariate Classical Linear Regression Model

1. The Multivariate Classical Linear Regression Model Business School, Brunel University MSc. EC550/5509 Modelling Financial Decisions and Markets/Introduction to Quantitative Methods Prof. Menelaos Karanasos (Room SS69, Tel. 08956584) Lecture Notes 5. The

More information

Faculty of Health Sciences. Regression models. Counts, Poisson regression, Lene Theil Skovgaard. Dept. of Biostatistics

Faculty of Health Sciences. Regression models. Counts, Poisson regression, Lene Theil Skovgaard. Dept. of Biostatistics Faculty of Health Sciences Regression models Counts, Poisson regression, 27-5-2013 Lene Theil Skovgaard Dept. of Biostatistics 1 / 36 Count outcome PKA & LTS, Sect. 7.2 Poisson regression The Binomial

More information

ST3241 Categorical Data Analysis I Generalized Linear Models. Introduction and Some Examples

ST3241 Categorical Data Analysis I Generalized Linear Models. Introduction and Some Examples ST3241 Categorical Data Analysis I Generalized Linear Models Introduction and Some Examples 1 Introduction We have discussed methods for analyzing associations in two-way and three-way tables. Now we will

More information

Review: what is a linear model. Y = β 0 + β 1 X 1 + β 2 X 2 + A model of the following form:

Review: what is a linear model. Y = β 0 + β 1 X 1 + β 2 X 2 + A model of the following form: Outline for today What is a generalized linear model Linear predictors and link functions Example: fit a constant (the proportion) Analysis of deviance table Example: fit dose-response data using logistic

More information

Online Appendix to: Marijuana on Main Street? Estimating Demand in Markets with Limited Access

Online Appendix to: Marijuana on Main Street? Estimating Demand in Markets with Limited Access Online Appendix to: Marijuana on Main Street? Estating Demand in Markets with Lited Access By Liana Jacobi and Michelle Sovinsky This appendix provides details on the estation methodology for various speci

More information

Limited Dependent Variable Models II

Limited Dependent Variable Models II Limited Dependent Variable Models II Fall 2008 Environmental Econometrics (GR03) LDV Fall 2008 1 / 15 Models with Multiple Choices The binary response model was dealing with a decision problem with two

More information

Basic Medical Statistics Course

Basic Medical Statistics Course Basic Medical Statistics Course S7 Logistic Regression November 2015 Wilma Heemsbergen w.heemsbergen@nki.nl Logistic Regression The concept of a relationship between the distribution of a dependent variable

More information

Standard Errors & Confidence Intervals. N(0, I( β) 1 ), I( β) = [ 2 l(β, φ; y) β i β β= β j

Standard Errors & Confidence Intervals. N(0, I( β) 1 ), I( β) = [ 2 l(β, φ; y) β i β β= β j Standard Errors & Confidence Intervals β β asy N(0, I( β) 1 ), where I( β) = [ 2 l(β, φ; y) ] β i β β= β j We can obtain asymptotic 100(1 α)% confidence intervals for β j using: β j ± Z 1 α/2 se( β j )

More information

Supplemental Material 1 for On Optimal Inference in the Linear IV Model

Supplemental Material 1 for On Optimal Inference in the Linear IV Model Supplemental Material 1 for On Optimal Inference in the Linear IV Model Donald W. K. Andrews Cowles Foundation for Research in Economics Yale University Vadim Marmer Vancouver School of Economics University

More information

Lecture 14: Introduction to Poisson Regression

Lecture 14: Introduction to Poisson Regression Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu 8 May 2007 1 / 52 Overview Modelling counts Contingency tables Poisson regression models 2 / 52 Modelling counts I Why

More information

Modelling counts. Lecture 14: Introduction to Poisson Regression. Overview

Modelling counts. Lecture 14: Introduction to Poisson Regression. Overview Modelling counts I Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu Why count data? Number of traffic accidents per day Mortality counts in a given neighborhood, per week

More information

11. Generalized Linear Models: An Introduction

11. Generalized Linear Models: An Introduction Sociology 740 John Fox Lecture Notes 11. Generalized Linear Models: An Introduction Copyright 2014 by John Fox Generalized Linear Models: An Introduction 1 1. Introduction I A synthesis due to Nelder and

More information

Linear Regression Models P8111

Linear Regression Models P8111 Linear Regression Models P8111 Lecture 25 Jeff Goldsmith April 26, 2016 1 of 37 Today s Lecture Logistic regression / GLMs Model framework Interpretation Estimation 2 of 37 Linear regression Course started

More information

BIOS 625 Fall 2015 Homework Set 3 Solutions

BIOS 625 Fall 2015 Homework Set 3 Solutions BIOS 65 Fall 015 Homework Set 3 Solutions 1. Agresti.0 Table.1 is from an early study on the death penalty in Florida. Analyze these data and show that Simpson s Paradox occurs. Death Penalty Victim's

More information

Hierarchical Generalized Linear Models. ERSH 8990 REMS Seminar on HLM Last Lecture!

Hierarchical Generalized Linear Models. ERSH 8990 REMS Seminar on HLM Last Lecture! Hierarchical Generalized Linear Models ERSH 8990 REMS Seminar on HLM Last Lecture! Hierarchical Generalized Linear Models Introduction to generalized models Models for binary outcomes Interpreting parameter

More information

A note on R 2 measures for Poisson and logistic regression models when both models are applicable

A note on R 2 measures for Poisson and logistic regression models when both models are applicable Journal of Clinical Epidemiology 54 (001) 99 103 A note on R measures for oisson and logistic regression models when both models are applicable Martina Mittlböck, Harald Heinzl* Department of Medical Computer

More information

Econometrics Homework 1

Econometrics Homework 1 Econometrics Homework Due Date: March, 24. by This problem set includes questions for Lecture -4 covered before midterm exam. Question Let z be a random column vector of size 3 : z = @ (a) Write out z

More information

Generalized linear models

Generalized linear models Generalized linear models Outline for today What is a generalized linear model Linear predictors and link functions Example: estimate a proportion Analysis of deviance Example: fit dose- response data

More information

" M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2

 M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2 Notation and Equations for Final Exam Symbol Definition X The variable we measure in a scientific study n The size of the sample N The size of the population M The mean of the sample µ The mean of the

More information

PANEL DATA RANDOM AND FIXED EFFECTS MODEL. Professor Menelaos Karanasos. December Panel Data (Institute) PANEL DATA December / 1

PANEL DATA RANDOM AND FIXED EFFECTS MODEL. Professor Menelaos Karanasos. December Panel Data (Institute) PANEL DATA December / 1 PANEL DATA RANDOM AND FIXED EFFECTS MODEL Professor Menelaos Karanasos December 2011 PANEL DATA Notation y it is the value of the dependent variable for cross-section unit i at time t where i = 1,...,

More information

Time Series Models and Inference. James L. Powell Department of Economics University of California, Berkeley

Time Series Models and Inference. James L. Powell Department of Economics University of California, Berkeley Time Series Models and Inference James L. Powell Department of Economics University of California, Berkeley Overview In contrast to the classical linear regression model, in which the components of the

More information

Sample size calculations for logistic and Poisson regression models

Sample size calculations for logistic and Poisson regression models Biometrika (2), 88, 4, pp. 93 99 2 Biometrika Trust Printed in Great Britain Sample size calculations for logistic and Poisson regression models BY GWOWEN SHIEH Department of Management Science, National

More information

Analysis of Categorical Data. Nick Jackson University of Southern California Department of Psychology 10/11/2013

Analysis of Categorical Data. Nick Jackson University of Southern California Department of Psychology 10/11/2013 Analysis of Categorical Data Nick Jackson University of Southern California Department of Psychology 10/11/2013 1 Overview Data Types Contingency Tables Logit Models Binomial Ordinal Nominal 2 Things not

More information

McGill University. Faculty of Science. Department of Mathematics and Statistics. Statistics Part A Comprehensive Exam Methodology Paper

McGill University. Faculty of Science. Department of Mathematics and Statistics. Statistics Part A Comprehensive Exam Methodology Paper Student Name: ID: McGill University Faculty of Science Department of Mathematics and Statistics Statistics Part A Comprehensive Exam Methodology Paper Date: Friday, May 13, 2016 Time: 13:00 17:00 Instructions

More information

Stat 8053, Fall 2013: Multinomial Logistic Models

Stat 8053, Fall 2013: Multinomial Logistic Models Stat 8053, Fall 2013: Multinomial Logistic Models Here is the example on page 269 of Agresti on food preference of alligators: s is size class, g is sex of the alligator, l is name of the lake, and f is

More information

A COEFFICIENT OF DETERMINATION FOR LOGISTIC REGRESSION MODELS

A COEFFICIENT OF DETERMINATION FOR LOGISTIC REGRESSION MODELS A COEFFICIENT OF DETEMINATION FO LOGISTIC EGESSION MODELS ENATO MICELI UNIVESITY OF TOINO After a brief presentation of the main extensions of the classical coefficient of determination ( ), a new index

More information

Testing for Regime Switching: A Comment

Testing for Regime Switching: A Comment Testing for Regime Switching: A Comment Andrew V. Carter Department of Statistics University of California, Santa Barbara Douglas G. Steigerwald Department of Economics University of California Santa Barbara

More information

ONE MORE TIME ABOUT R 2 MEASURES OF FIT IN LOGISTIC REGRESSION

ONE MORE TIME ABOUT R 2 MEASURES OF FIT IN LOGISTIC REGRESSION ONE MORE TIME ABOUT R 2 MEASURES OF FIT IN LOGISTIC REGRESSION Ernest S. Shtatland, Ken Kleinman, Emily M. Cain Harvard Medical School, Harvard Pilgrim Health Care, Boston, MA ABSTRACT In logistic regression,

More information

Stat 5101 Lecture Notes

Stat 5101 Lecture Notes Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random

More information

Generalized Linear Models. Last time: Background & motivation for moving beyond linear

Generalized Linear Models. Last time: Background & motivation for moving beyond linear Generalized Linear Models Last time: Background & motivation for moving beyond linear regression - non-normal/non-linear cases, binary, categorical data Today s class: 1. Examples of count and ordered

More information

Econometrics Lecture 5: Limited Dependent Variable Models: Logit and Probit

Econometrics Lecture 5: Limited Dependent Variable Models: Logit and Probit Econometrics Lecture 5: Limited Dependent Variable Models: Logit and Probit R. G. Pierse 1 Introduction In lecture 5 of last semester s course, we looked at the reasons for including dichotomous variables

More information

BIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY

BIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY BIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY Ingo Langner 1, Ralf Bender 2, Rebecca Lenz-Tönjes 1, Helmut Küchenhoff 2, Maria Blettner 2 1

More information

LISA Short Course Series Generalized Linear Models (GLMs) & Categorical Data Analysis (CDA) in R. Liang (Sally) Shan Nov. 4, 2014

LISA Short Course Series Generalized Linear Models (GLMs) & Categorical Data Analysis (CDA) in R. Liang (Sally) Shan Nov. 4, 2014 LISA Short Course Series Generalized Linear Models (GLMs) & Categorical Data Analysis (CDA) in R Liang (Sally) Shan Nov. 4, 2014 L Laboratory for Interdisciplinary Statistical Analysis LISA helps VT researchers

More information

Model Selection in GLMs. (should be able to implement frequentist GLM analyses!) Today: standard frequentist methods for model selection

Model Selection in GLMs. (should be able to implement frequentist GLM analyses!) Today: standard frequentist methods for model selection Model Selection in GLMs Last class: estimability/identifiability, analysis of deviance, standard errors & confidence intervals (should be able to implement frequentist GLM analyses!) Today: standard frequentist

More information

Measures of Association and Variance Estimation

Measures of Association and Variance Estimation Measures of Association and Variance Estimation Dipankar Bandyopadhyay, Ph.D. Department of Biostatistics, Virginia Commonwealth University D. Bandyopadhyay (VCU) BIOS 625: Categorical Data & GLM 1 / 35

More information

Multinomial Regression Models

Multinomial Regression Models Multinomial Regression Models Objectives: Multinomial distribution and likelihood Ordinal data: Cumulative link models (POM). Ordinal data: Continuation models (CRM). 84 Heagerty, Bio/Stat 571 Models for

More information

On Properties of QIC in Generalized. Estimating Equations. Shinpei Imori

On Properties of QIC in Generalized. Estimating Equations. Shinpei Imori On Properties of QIC in Generalized Estimating Equations Shinpei Imori Graduate School of Engineering Science, Osaka University 1-3 Machikaneyama-cho, Toyonaka, Osaka 560-8531, Japan E-mail: imori.stat@gmail.com

More information

2 Describing Contingency Tables

2 Describing Contingency Tables 2 Describing Contingency Tables I. Probability structure of a 2-way contingency table I.1 Contingency Tables X, Y : cat. var. Y usually random (except in a case-control study), response; X can be random

More information

Lecture 3.1 Basic Logistic LDA

Lecture 3.1 Basic Logistic LDA y Lecture.1 Basic Logistic LDA 0.2.4.6.8 1 Outline Quick Refresher on Ordinary Logistic Regression and Stata Women s employment example Cross-Over Trial LDA Example -100-50 0 50 100 -- Longitudinal Data

More information

Comparison of Estimators in GLM with Binary Data

Comparison of Estimators in GLM with Binary Data Journal of Modern Applied Statistical Methods Volume 13 Issue 2 Article 10 11-2014 Comparison of Estimators in GLM with Binary Data D. M. Sakate Shivaji University, Kolhapur, India, dms.stats@gmail.com

More information

Suggested Solution for PS #5

Suggested Solution for PS #5 Cornell University Department of Economics Econ 62 Spring 28 TA: Jae Ho Yun Suggested Solution for S #5. (Measurement Error, IV) (a) This is a measurement error problem. y i x i + t i + " i t i t i + i

More information

Strati cation in Multivariate Modeling

Strati cation in Multivariate Modeling Strati cation in Multivariate Modeling Tihomir Asparouhov Muthen & Muthen Mplus Web Notes: No. 9 Version 2, December 16, 2004 1 The author is thankful to Bengt Muthen for his guidance, to Linda Muthen

More information

NATIONAL UNIVERSITY OF SINGAPORE EXAMINATION (SOLUTIONS) ST3241 Categorical Data Analysis. (Semester II: )

NATIONAL UNIVERSITY OF SINGAPORE EXAMINATION (SOLUTIONS) ST3241 Categorical Data Analysis. (Semester II: ) NATIONAL UNIVERSITY OF SINGAPORE EXAMINATION (SOLUTIONS) Categorical Data Analysis (Semester II: 2010 2011) April/May, 2011 Time Allowed : 2 Hours Matriculation No: Seat No: Grade Table Question 1 2 3

More information

The consequences of misspecifying the random effects distribution when fitting generalized linear mixed models

The consequences of misspecifying the random effects distribution when fitting generalized linear mixed models The consequences of misspecifying the random effects distribution when fitting generalized linear mixed models John M. Neuhaus Charles E. McCulloch Division of Biostatistics University of California, San

More information

Introduction to General and Generalized Linear Models

Introduction to General and Generalized Linear Models Introduction to General and Generalized Linear Models Generalized Linear Models - part II Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs.

More information

Lecture 13: More on Binary Data

Lecture 13: More on Binary Data Lecture 1: More on Binary Data Link functions for Binomial models Link η = g(π) π = g 1 (η) identity π η logarithmic log π e η logistic log ( π 1 π probit Φ 1 (π) Φ(η) log-log log( log π) exp( e η ) complementary

More information

,..., θ(2),..., θ(n)

,..., θ(2),..., θ(n) Likelihoods for Multivariate Binary Data Log-Linear Model We have 2 n 1 distinct probabilities, but we wish to consider formulations that allow more parsimonious descriptions as a function of covariates.

More information

Model Selection for Semiparametric Bayesian Models with Application to Overdispersion

Model Selection for Semiparametric Bayesian Models with Application to Overdispersion Proceedings 59th ISI World Statistics Congress, 25-30 August 2013, Hong Kong (Session CPS020) p.3863 Model Selection for Semiparametric Bayesian Models with Application to Overdispersion Jinfang Wang and

More information

Chapter 12: Bivariate & Conditional Distributions

Chapter 12: Bivariate & Conditional Distributions Chapter 12: Bivariate & Conditional Distributions James B. Ramsey March 2007 James B. Ramsey () Chapter 12 26/07 1 / 26 Introduction Key relationships between joint, conditional, and marginal distributions.

More information

Comparing groups using predicted probabilities

Comparing groups using predicted probabilities Comparing groups using predicted probabilities J. Scott Long Indiana University May 9, 2006 MAPSS - May 9, 2006 - Page 1 The problem Allison (1999): Di erences in the estimated coe cients tell us nothing

More information

Bias-corrected AIC for selecting variables in Poisson regression models

Bias-corrected AIC for selecting variables in Poisson regression models Bias-corrected AIC for selecting variables in Poisson regression models Ken-ichi Kamo (a), Hirokazu Yanagihara (b) and Kenichi Satoh (c) (a) Corresponding author: Department of Liberal Arts and Sciences,

More information

Good Confidence Intervals for Categorical Data Analyses. Alan Agresti

Good Confidence Intervals for Categorical Data Analyses. Alan Agresti Good Confidence Intervals for Categorical Data Analyses Alan Agresti Department of Statistics, University of Florida visiting Statistics Department, Harvard University LSHTM, July 22, 2011 p. 1/36 Outline

More information

Chapter 2: Describing Contingency Tables - II

Chapter 2: Describing Contingency Tables - II : Describing Contingency Tables - II Dipankar Bandyopadhyay Department of Biostatistics, Virginia Commonwealth University BIOS 625: Categorical Data & GLM [Acknowledgements to Tim Hanson and Haitao Chu]

More information

Generalized Linear Models: An Introduction

Generalized Linear Models: An Introduction Applied Statistics With R Generalized Linear Models: An Introduction John Fox WU Wien May/June 2006 2006 by John Fox Generalized Linear Models: An Introduction 1 A synthesis due to Nelder and Wedderburn,

More information

Föreläsning /31

Föreläsning /31 1/31 Föreläsning 10 090420 Chapter 13 Econometric Modeling: Model Speci cation and Diagnostic testing 2/31 Types of speci cation errors Consider the following models: Y i = β 1 + β 2 X i + β 3 X 2 i +

More information

STA216: Generalized Linear Models. Lecture 1. Review and Introduction

STA216: Generalized Linear Models. Lecture 1. Review and Introduction STA216: Generalized Linear Models Lecture 1. Review and Introduction Let y 1,..., y n denote n independent observations on a response Treat y i as a realization of a random variable Y i In the general

More information

Introduction: structural econometrics. Jean-Marc Robin

Introduction: structural econometrics. Jean-Marc Robin Introduction: structural econometrics Jean-Marc Robin Abstract 1. Descriptive vs structural models 2. Correlation is not causality a. Simultaneity b. Heterogeneity c. Selectivity Descriptive models Consider

More information

Appendix A. Math Reviews 03Jan2007. A.1 From Simple to Complex. Objectives. 1. Review tools that are needed for studying models for CLDVs.

Appendix A. Math Reviews 03Jan2007. A.1 From Simple to Complex. Objectives. 1. Review tools that are needed for studying models for CLDVs. Appendix A Math Reviews 03Jan007 Objectives. Review tools that are needed for studying models for CLDVs.. Get you used to the notation that will be used. Readings. Read this appendix before class.. Pay

More information

Lab 07 Introduction to Econometrics

Lab 07 Introduction to Econometrics Lab 07 Introduction to Econometrics Learning outcomes for this lab: Introduce the different typologies of data and the econometric models that can be used Understand the rationale behind econometrics Understand

More information

GLM models and OLS regression

GLM models and OLS regression GLM models and OLS regression Graeme Hutcheson, University of Manchester These lecture notes are based on material published in... Hutcheson, G. D. and Sofroniou, N. (1999). The Multivariate Social Scientist:

More information

Economics Introduction to Econometrics - Fall 2007 Final Exam - Answers

Economics Introduction to Econometrics - Fall 2007 Final Exam - Answers Student Name: Economics 4818 - Introduction to Econometrics - Fall 2007 Final Exam - Answers SHOW ALL WORK! Evaluation: Problems: 3, 4C, 5C and 5F are worth 4 points. All other questions are worth 3 points.

More information

Regression models for multivariate ordered responses via the Plackett distribution

Regression models for multivariate ordered responses via the Plackett distribution Journal of Multivariate Analysis 99 (2008) 2472 2478 www.elsevier.com/locate/jmva Regression models for multivariate ordered responses via the Plackett distribution A. Forcina a,, V. Dardanoni b a Dipartimento

More information

Simple ways to interpret effects in modeling ordinal categorical data

Simple ways to interpret effects in modeling ordinal categorical data DOI: 10.1111/stan.12130 ORIGINAL ARTICLE Simple ways to interpret effects in modeling ordinal categorical data Alan Agresti 1 Claudia Tarantola 2 1 Department of Statistics, University of Florida, Gainesville,

More information