Entropy Coefficient of Determination and Its Application
Nobuoki Eshima
Department of Biostatistics, Faculty of Medicine, Oita University, Oita, Japan

The objective of this seminar is to introduce the entropy coefficient of determination (ECD) for measuring the explanatory or predictive power of GLMs and to consider how ECD is used in data analysis. In the first section, the classical regression and GLM frameworks are compared, and properties of GLMs concerning entropy are discussed. First, the information of an event and the entropy of a random variable are explained, and the Kullback-Leibler information, which describes the difference between two distributions, is treated. Second, the log odds ratio and the mean in the GLM are considered from a viewpoint of entropy. ECD is interpreted as the ratio of variation of a response variable explained by the explanatory variables, and is compared with some other explanatory power measures with respect to the following properties: (i) interpretability; (ii) reducing to the multiple correlation coefficient or the coefficient of determination in normal linear regression models; (iii) being entropy-based; (iv) applicability to all GLMs. In addition to these, it may be appropriate for a measure to have the following property: (v) monotonicity in the complexity of the linear predictor. In Section 2, first the asymptotic properties of the maximum likelihood estimator (MLE) of ECD are discussed. The confidence interval of ECD is considered on the basis of an approximate normality of the non-central chi-square distribution. Second, in canonical-link GLMs the contributions of factors are treated according to the decomposition of ECD. Numerical examples are also given.

1 Coefficient of determination for generalized linear models

1.1 Information Theory

Let X be a categorical variable with categories {C_1, C_2, ..., C_K}; in what follows the categories are formally described as {1, 2, ..., K}. Then the information of the event X = k is defined by

I(X = k) = log(1/Pr(X = k)).  (1.1)

In (1.1) the base of the logarithm is e and the unit is the nat; if the base is 2, the unit is the bit. In this seminar the base of the logarithm is e. The mean of the above information, which is called entropy, is defined by

H(X) = Σ_{k=1}^K Pr(X = k) I(X = k) = Σ_{k=1}^K Pr(X = k) log(1/Pr(X = k)).  (1.2)

Entropy is a measure of uncertainty in random variable X, or in the sample space. Let p_k = Pr(X = k) (k = 1, 2, ..., K). Then we have the following theorem.

Theorem 1.1. Let p = {p_1, p_2, ..., p_K} and q = {q_1, q_2, ..., q_K} be two distributions. Then

−Σ_{k=1}^K p_k log p_k ≤ −Σ_{k=1}^K p_k log q_k.  (1.3)

Proof. Using log x ≤ x − 1 with x = q_k/p_k,

−Σ_{k=1}^K p_k log p_k + Σ_{k=1}^K p_k log q_k = Σ_{k=1}^K p_k log(q_k/p_k) ≤ Σ_{k=1}^K p_k (q_k/p_k − 1) = Σ_{k=1}^K q_k − Σ_{k=1}^K p_k = 0.

The equality holds if and only if p_k = q_k (k = 1, 2, ..., K). □

From (1.3) it follows that H(p) = −Σ_k p_k log p_k ≤ −Σ_k p_k log q_k, where H(p) denotes the entropy of distribution p. Setting q_k = 1/K (k = 1, 2, ..., K), we have H(p) ≤ −Σ_k p_k log(1/K) = log K.

The following quantity is referred to as the Kullback-Leibler (KL) information or divergence:

D(p||q) = Σ_{k=1}^K p_k log(p_k/q_k) (≥ 0).  (1.4)

This information is interpreted as the difference, or loss, of information from using distribution q instead of the true distribution p.

Example 1.1. Let X have the binomial Bin(3, 1/2) probabilities (1/8, 3/8, 3/8, 1/8) on the four points {1, 2, 3, 4}, and let Y be uniform with Pr(Y = k) = 1/4 (k = 1, 2, 3, 4). Then

D(X||Y) = (1/8) log((1/8)/(1/4)) + (3/8) log((3/8)/(1/4)) + (3/8) log((3/8)/(1/4)) + (1/8) log((1/8)/(1/4)) = (3/4) log 3 − log 2 ≈ 0.131.

This quantity is the difference between Bin(3, 1/2) and the uniform distribution on {1, 2, 3, 4}. The reverse KL information is

D(Y||X) = 2 × (1/4) log((1/4)/(1/8)) + 2 × (1/4) log((1/4)/(3/8)) = (1/2) log 2 + (1/2) log(2/3) ≈ 0.1438.

In this case, D(X||Y) ≠ D(Y||X).

For continuous distributions, the KL information is defined similarly as

D(f(x)||g(x)) = ∫ f(x) log(f(x)/g(x)) dx (≥ 0),  (1.5)

where f(x) and g(x) are density functions.

Example 1.2. Let f(x) be the N(μ_1, σ²) density and g(x) the N(μ_2, σ²) density. Then D(f(x)||g(x)) = (μ_1 − μ_2)²/(2σ²) (= D(g(x)||f(x))).
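The definitions (1.2) and (1.4) are easy to check numerically. The following sketch (a minimal illustration, not part of the source) computes entropy and KL divergence and reproduces the two divergences of Example 1.1.

```python
import math

def entropy(p):
    """Entropy H(p) = -sum p_k log p_k in nats, per (1.2)."""
    return -sum(pk * math.log(pk) for pk in p if pk > 0)

def kl(p, q):
    """Kullback-Leibler divergence D(p||q) = sum p_k log(p_k/q_k), per (1.4)."""
    return sum(pk * math.log(pk / qk) for pk, qk in zip(p, q) if pk > 0)

# Example 1.1: Bin(3, 1/2) probabilities versus the uniform distribution on 4 points
p = [1/8, 3/8, 3/8, 1/8]   # binomial probabilities
q = [1/4, 1/4, 1/4, 1/4]   # uniform distribution

print(entropy(q))   # log 4, the maximal entropy on 4 categories
print(kl(p, q))     # (3/4) log 3 - log 2, about 0.131
print(kl(q, p))     # about 0.144, showing D(p||q) != D(q||p)
```

The asymmetry of the two printed divergences illustrates why the symmetrized version used later in Theorem 1.1 of Section 1.2 is convenient.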
1.2 Entropy in GLMs

Let X and Y be a p × 1 explanatory variable vector and a response variable, respectively, and let f(y|x) be the conditional probability or density function of Y given X = x. The function f(y|x) is assumed to be a member of the following exponential family of distributions:

f(y|x) = exp{ (yθ − b(θ))/a(φ) + c(y, φ) },  (1.6)

where θ and φ are parameters, and a(φ) (> 0), b(θ) and c(y, φ) are specific functions. This is the random component. Let β = (β_1, β_2, ..., β_p)^T. For a link function h(u) (link component) and the linear predictor η = β^T x (systematic component), the conditional expectation of Y given X = x is described as follows:

E(Y|X = x) = db(θ)/dθ = h^{−1}(β^T x).  (1.7)

Let us assume that the link function h(u) is a strictly increasing differentiable function. The conditional variance of response Y given X = x is

Var(Y|X = x) = a(φ) d²b(θ)/dθ².

From this, a(φ) relates to the dispersion of Y, so it is referred to as a dispersion parameter. Since θ is a function of η = β^T x, for simplification the function is denoted by θ = θ(β^T x).

Let us consider the following log odds ratio:

logOR(x, x_0; y, y_0) = log [ f(y|x) f(y_0|x_0) / ( f(y_0|x) f(y|x_0) ) ] = (1/a(φ)) (y − y_0)(θ(β^T x) − θ(β^T x_0)),  (1.8)

where x_0 and y_0 are baselines of X and Y, respectively. The above log odds ratio can be viewed as an inner product of θ(β^T x) and y with respect to the dispersion parameter a(φ). Since

logOR(x, x_0; y, y_0) = {(−log f(y_0|x)) − (−log f(y|x))} − {(−log f(y_0|x_0)) − (−log f(y|x_0))},

the log odds ratio (1.8) is the change of the uncertainty of response Y with respect to the explanatory variable vector X; as seen above, the predictor β^T x is related to the reduction of uncertainty of Y through the link function.

For levels x_1, x_2, ..., x_K of the factor vector X, the averages of θ, Y and θY are defined as follows:

E(θ) = Σ_{k=1}^K θ(β^T x_k)/K, E(Y) = Σ_{k=1}^K E(Y|X = x_k)/K, and E(θY) = Σ_{k=1}^K E(Y|X = x_k) θ(β^T x_k)/K.

Remark 1.1. Let n_k be the sample sizes at factor levels x_k (k = 1, 2, ..., K), and n = Σ_{k=1}^K n_k. Then the above averages (expectations) are replaced by the weighted ones, e.g. E(θ) = Σ_{k=1}^K (n_k/n) θ(β^T x_k).
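The inner-product form of (1.8) can be verified concretely in the Bernoulli case, where θ = log(p/(1 − p)) and a(φ) = 1. The sketch below (an illustration with made-up probabilities, not from the source) checks that the log odds ratio computed from the four conditional probabilities equals (y − y_0)(θ(x) − θ(x_0)).

```python
import math

def theta(p):
    # canonical parameter of the Bernoulli distribution: the logit
    return math.log(p / (1 - p))

def f(y, p):
    # Bernoulli probability function f(y) = p^y (1-p)^(1-y)
    return p if y == 1 else 1 - p

# two levels of the explanatory variable, with assumed success probabilities
p_x, p_x0 = 0.7, 0.3   # Pr(Y=1|x) and Pr(Y=1|x0), illustrative values
y, y0 = 1, 0           # response value and its baseline

# left-hand side of (1.8): log of the odds ratio of the four probabilities
lhs = math.log(f(y, p_x) * f(y0, p_x0) / (f(y0, p_x) * f(y, p_x0)))

# right-hand side: inner product (y - y0)(theta(x) - theta(x0)), with a(phi) = 1
rhs = (y - y0) * (theta(p_x) - theta(p_x0))

print(lhs, rhs)   # both equal 2 log(7/3)
```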
When we take the expectation of the inner product (1.8), we have

Cov(θ, Y)/a(φ) + (E(θ) − θ(β^T x_0))(E(Y) − y_0)/a(φ),

where Cov(θ, Y) = E(θY) − E(θ)E(Y). For y_0 = E(Y), the quantity becomes

Cov(θ, Y)/a(φ)  (1.9)

and it can be viewed as the average change of uncertainty of response variable Y with respect to explanatory variable vector X. We have Theorem 1.1 and Corollary 1.1.

Theorem 1.1. In the GLM with (1.6), the quantity (1.9) is expressed by the Kullback-Leibler information:

Cov(θ, Y)/a(φ) = Σ_{k=1}^K KL(f(y), f(y|x_k))/K,  (1.10)

where f(y) = Σ_{k=1}^K f(y|x_k)/K and

KL(f(y), f(y|x_k)) = ∫ f(y|x_k) log(f(y|x_k)/f(y)) dy + ∫ f(y) log(f(y)/f(y|x_k)) dy = D(f(y|x_k)||f(y)) + D(f(y)||f(y|x_k)) (k = 1, 2, ..., K).

Corollary 1.1. In the GLM with (1.6), the covariance of Y and θ(β^T X) is nonnegative, and it is zero if and only if X and Y are independent, i.e. f(y|x_k) = f(y) (k = 1, 2, ..., K).

Example 1.3. An ordinary linear regression model is Y = α + β^T x + e, where e is a normal error with mean 0 and variance σ². Let f(y|x) be a normal density function with mean μ and variance σ². Then the random component is

f(y|x) = (1/√(2πσ²)) exp{ −(y − μ)²/(2σ²) } = exp{ (yμ − μ²/2)/σ² − y²/(2σ²) − log √(2πσ²) }.

In this expression, setting θ = μ, a(φ) = σ², b(θ) = θ²/2 and c(y, φ) = −y²/(2σ²) − log √(2πσ²), and for linear predictor η = α + β^T x and identity link θ = η, the normal linear regression model can be viewed as a GLM.

Example 1.4. Let Y be a binary variable with p = Pr(Y = 1) (= μ). The random component is

f(y|x) = p^y (1 − p)^{1−y} = {p/(1 − p)}^y (1 − p) = exp{ y log(p/(1 − p)) + log(1 − p) }.

Then θ = log(p/(1 − p)), a(φ) = 1, b(θ) = −log(1 − p) and c(y, φ) = 0. Setting the link function h(p) = log(p/(1 − p)), we have the following logistic regression (logit) model:

f(y|x) = exp{(α + βx)y + log(1 − p)} = exp{(α + βx)y}/(1 + exp(α + βx)).

For the logit model with explanatory variables X_1, X_2, ..., X_p, the model is expressed as

f(y|x) = exp{(α + Σ_{i=1}^p β_i x_i)y} / (1 + exp(α + Σ_{i=1}^p β_i x_i)).  (1.11)

1.3 Basic Predictive Power Measures for GLMs

In the sense of the previous discussion, it may be appropriate to assess the predictive or explanatory power of factors based on entropy. In the GLM framework, predictive power measures are compared, and the advantage of ECD is pointed out. First, some predictive power measures for regression models are briefly discussed. In general regression models, for a variation function D, the predictive power can be measured as follows:

R²_D = (D(Y) − D(Y|X)) / D(Y),  (1.12)

where D(Y) and D(Y|X) denote a variation function of Y and a conditional or error variation function given X, respectively (Efron, 1978; Agresti, 1986; Korn & Simon, 1991). Predictive power measures based on the likelihood function (Theil, 1970; Goodman, 1971) are constructed from powers of the likelihood function, i.e.

R²_L = 1 − (l(0)/l(β̂))^{2/n},

where l(·) is the likelihood function and n is the sample size. Let R be the multiple correlation coefficient in the ordinary linear regression model. The above measure becomes R² in ordinary linear regression cases and increases with model complexity; however, it is difficult to interpret the measure in general (Zheng & Agresti, 2000). The entropy measure (Haberman, 1982) for categorical responses is based on the entropy of Y, H(Y), and the conditional entropy H(Y|X), i.e.

R²_E = (H(Y) − H(Y|X)) / H(Y).

The above measure is included in (1.12). The correlation coefficient of response Y and its conditional expectation given factor X, Corr(E(Y|X), Y), is recommended for measuring the predictive power of GLMs, because the correlation measure can be applied to all types of GLMs except polytomous response cases (Zheng & Agresti, 2000). This measure is the correlation coefficient between response Y and the regression on X, referred to as the regression correlation coefficient. With respect to entropy, R²_L and R²_E may be suitable for GLMs.

By considering the average change of the log odds ratio, Eshima & Tabata (2007) proposed the following basic predictive power measure:

m_PP(Y|X) = Cov(θ, Y)/a(φ).  (1.13)

The above measure is expressed by the Kullback-Leibler information (Eshima & Tabata, 2007), and it is increasing in Cov(θ, Y) and decreasing in a(φ). Since Var(Y|X = x) = a(φ) d²b(θ)/dθ², the function a(φ) may be interpreted as the error variation of Y in entropy, i.e. the residual randomness of Y given X. From this, Cov(θ, Y) can be interpreted as the entropy of Y explained by X. Hence, measure (1.13) is the ratio of the explained variation of Y to the error variation of Y in entropy. The entropy variation function D_E is defined by

D_E(Y) = Cov(θ, Y) + a(φ).

Since θ is a function of X, Cov(θ, Y|X) = 0. From this, the conditional entropy variation of Y given X is D_E(Y|X) = a(φ). Considering this, ECD is defined as follows:

ECD(X, Y) = Cov(θ, Y)/(Cov(θ, Y) + a(φ)) = m_PP(Y|X)/(m_PP(Y|X) + 1) = (D_E(Y) − D_E(Y|X))/D_E(Y).  (1.14)

From (1.14), ECD is included in (1.12), and ECD can be viewed as the proportion of explained variation of Y in entropy. For the normal linear regression model, it follows that ECD(X, Y) = R².

Let β = (β_1, β_2, ..., β_p)^T be a regression coefficient vector. For canonical links θ = Σ_{i=1}^p β_i X_i, ECD (1.14) and the entropy correlation coefficient (ECC) are decomposed as follows:

ECD(X, Y) = Σ_{i=1}^p β_i Cov(X_i, Y) / ( Σ_{i=1}^p β_i Cov(X_i, Y) + a(φ) ),  (1.15)
ECorr(X, Y) = Σ_{i=1}^p β_i Cov(X_i, Y) / ( √Var(θ) √Var(Y) ).

None of the predictive power measures except ECD and ECC admit the above type of decomposition for GLMs with canonical links. In addition to the desirable properties (i) to (v) of measures for GLMs, decomposability such as (1.15) may also be a suitable property for a predictive power measure, because the relative importance of X_i may be assessed by β_i Cov(X_i, Y). Moreover, ECD is scale-invariant in GLMs with multivariate responses, whereas ECC is not; in this respect, ECD is superior to ECC.

Table 1.1 summarizes the properties of the five measures mentioned above, Corr(E(Y|X), Y), R²_L, R²_E, ECC and ECD, with respect to (i) interpretability, (ii) being R or R² in normal linear regression, (iii) entropy basis, (iv) applicability to all GLMs, (v) monotonicity, and (vi) decomposability. Measures Corr(E(Y|X), Y) and ECorr(X, Y) may have property (v) in most cases; however, it is not easy to prove the property in general. From this table, ECD(X, Y) is the most desirable predictive power measure for GLMs.
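Property (ii), that ECD reduces to R² in the normal linear model where θ = μ = α + βx and a(φ) = σ², can be checked on simulated data. The sketch below (illustrative, with arbitrary parameter values) compares the plug-in estimate Cov(η, Y)/(Cov(η, Y) + σ̂²) with the ordinary coefficient of determination.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(scale=1.5, size=n)   # assumed true model

# ordinary least squares fit
b, a = np.polyfit(x, y, 1)          # slope, intercept
eta = a + b * x                     # fitted linear predictor (= theta, identity link)
resid = y - eta
sigma2_hat = np.mean(resid**2)      # ML estimate of the dispersion a(phi) = sigma^2

# plug-in ECD per (1.14): Cov(theta, Y) / (Cov(theta, Y) + a(phi))
cov_theta_y = np.cov(eta, y, ddof=0)[0, 1]
ecd = cov_theta_y / (cov_theta_y + sigma2_hat)

# classical coefficient of determination
r2 = 1 - np.sum(resid**2) / np.sum((y - y.mean())**2)

print(ecd, r2)   # the two agree up to floating point, illustrating ECD = R^2
```

The agreement is exact rather than approximate here, because for OLS with an intercept the fitted values are uncorrelated with the residuals, so Cov(η, Y) equals the regression sum of squares divided by n.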
Example 1.5. Let X and Y be p- and q-dimensional random vectors, respectively, and assume the joint distribution is a (p + q)-variate normal distribution with covariance matrix

( Σ_XX Σ_XY ; Σ_YX Σ_YY ).

Let the inverse of the above matrix be denoted by

( Λ^XX Λ^XY ; Λ^YX Λ^YY ).

Then θ = −Λ^{YX} x and a(φ) = 1, so we have

m_PP(Y|X) = −tr(Λ^{YX} Σ_XY).

From this,

ECD(X, Y) = −tr(Λ^{YX} Σ_XY) / ( −tr(Λ^{YX} Σ_XY) + 1 ).

Let λ_i (i = 1, 2, ..., min{p, q}) be the squared canonical correlation coefficients. Then

ECD(X, Y) = Σ_{i=1}^{min{p,q}} λ_i/(1 − λ_i) / ( Σ_{i=1}^{min{p,q}} λ_i/(1 − λ_i) + 1 ).

For q = 1, ECD is reduced to the usual coefficient of determination λ_1 (= R²).

Example 1.6. In the logistic regression model (1.11), we have

ECD(X, Y) = Σ_{i=1}^p β_i Cov(X_i, Y) / ( Σ_{i=1}^p β_i Cov(X_i, Y) + 1 ).
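The closed form of Example 1.5 can be evaluated directly from a covariance matrix. The sketch below (an illustration with an assumed covariance matrix, not from the source) computes the squared canonical correlations and the resulting ECD, and checks that for q = 1 the formula collapses to R².

```python
import numpy as np

def ecd_mvnormal(S_xx, S_xy, S_yy):
    """ECD for jointly normal (X, Y) via squared canonical correlations
    (Example 1.5): ECD = s / (s + 1) with s = sum_i lambda_i / (1 - lambda_i)."""
    M = np.linalg.inv(S_yy) @ S_xy.T @ np.linalg.inv(S_xx) @ S_xy
    lams = np.linalg.eigvals(M).real      # squared canonical correlations
    s = np.sum(lams / (1 - lams))
    return s / (s + 1)

# assumed covariance blocks for p = 2 explanatory variables and q = 1 response
S_xx = np.array([[1.0, 0.5], [0.5, 1.0]])
S_xy = np.array([[0.6], [0.3]])           # Cov(X, Y)
S_yy = np.array([[1.0]])

# for q = 1 the only squared canonical correlation is the usual R^2
r2 = (S_xy.T @ np.linalg.inv(S_xx) @ S_xy).item() / S_yy[0, 0]
print(ecd_mvnormal(S_xx, S_xy, S_yy), r2)   # both equal 0.36 (up to floating point)
```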
2 Application of ECD

2.1 Asymptotic Property of the ML Estimator of ECD

Let f(y) and g(x) be the marginal density or probability functions of Y and X, respectively. Then the association measure is expressed as

m_PP(Y|X) = ∫∫ f(y|x) g(x) log(f(y|x)/f(y)) dx dy + ∫∫ f(y) g(x) log(f(y)/f(y|x)) dx dy.  (2.1)

If Y is discrete, the integrals are replaced with summations. If X is not random and takes values x_k (k = 1, 2, ..., K), the above measure can be modified as follows:

m_PP(Y|X) = Σ_{k=1}^K (n_k/n) [ ∫ f(y|x_k) log(f(y|x_k)/f(y)) dy + ∫ f(y) log(f(y)/f(y|x_k)) dy ],  (2.2)

where n_k are the sample sizes at levels x_k (k = 1, 2, ..., K), and n = Σ_{k=1}^K n_k. We have the following theorem.

Theorem 2.1. Let m̂_PP(Y|X), f̂(y|x_k) and f̂(y) be the ML estimators of m_PP(Y|X), f(y|x_k) and f(y), respectively. If the null model, i.e. β = 0, holds, the ML estimator of (2.2) multiplied by sample size n, i.e.

n m̂_PP(Y|X) = Σ_{k=1}^K n_k [ ∫ f̂(y|x_k) log(f̂(y|x_k)/f̂(y)) dy + ∫ f̂(y) log(f̂(y)/f̂(y|x_k)) dy ],

is asymptotically distributed according to the chi-square distribution with p degrees of freedom as the sample sizes n_k tend to infinity.

Proof. For simplicity of the discussion, the theorem is proven in the case where Y is a polytomous variable with levels or categories {1, 2, ..., J}. Let π_{j|k} = Pr(Y = j|X = x_k) and π_j = Pr(Y = j), and let π̂_{j|k} and π̂_j be the ML estimators of π_{j|k} and π_j, respectively. Under the null hypothesis and for sufficiently large n_k, we have

n m̂_PP(Y|X) = Σ_{k=1}^K n_k Σ_{j=1}^J { π̂_{j|k} log(π̂_{j|k}/π̂_j) + π̂_j log(π̂_j/π̂_{j|k}) }
= Σ_{k=1}^K Σ_{j=1}^J (n_k π̂_{j|k} − n_k π̂_j)²/(2 n_k π̂_{j|k}) + Σ_{k=1}^K Σ_{j=1}^J (n_k π̂_{j|k} − n_k π̂_j)²/(2 n_k π̂_j) + o(n)
= Σ_{k=1}^K Σ_{j=1}^J (n_k π̂_{j|k} − n_k π̂_j)²/(n_k π̂_{j|k}) + o(n),

where o(n)/n → 0 in probability as n → ∞. Hence the theorem follows. □

When the explanatory variables X are random, the following theorem holds similarly.

Theorem 2.2. If the null model, i.e. β = 0, holds, the ML estimator of (2.1) multiplied by sample size n, i.e.

n m̂_PP(Y|X) = n [ ∫∫ f̂(y|x) ĝ(x) log(f̂(y|x)/f̂(y)) dx dy + ∫∫ f̂(y) ĝ(x) log(f̂(y)/f̂(y|x)) dx dy ],

is asymptotically distributed according to the chi-square distribution with p degrees of freedom as the sample size n tends to infinity.

Since

ECD(X, Y) = (Cov(θ, Y)/a(φ)) / (Cov(θ, Y)/a(φ) + 1) = m_PP(Y|X)/(m_PP(Y|X) + 1),

the ML estimator of ECD(X, Y) is

ECD̂(X, Y) = m̂_PP(Y|X)/(m̂_PP(Y|X) + 1).

From this, we can test the hypothesis ECD(X, Y) = 0 based on the following statistic:

χ² = n m̂_PP(Y|X) = n Cov̂(θ, Y)/a(φ̂).  (2.3)

The above statistic is asymptotically distributed according to a non-central chi-square distribution with non-centrality λ = n m_PP(Y|X) and p degrees of freedom. Let

c = (p + 2λ)/(p + λ) = 1 + λ/(p + λ) and ν_0 = (p + λ)²/(p + 2λ).

The statistic χ²/c is asymptotically distributed according to the chi-square distribution with ν_0 degrees of freedom (Patnaik, 1949). As ν_0 becomes large, this chi-square distribution tends to a normal distribution with mean ν_0 and variance 2ν_0. From this, for sufficiently large sample size n, statistic (2.3) divided by n, i.e. χ²/n = m̂_PP(Y|X), is asymptotically normally distributed with mean cν_0/n and variance 2c²ν_0/n². For sufficiently large n, we have

cν_0/n ≈ m_PP(Y|X), 2c²ν_0/n² ≈ (2c/n) m_PP(Y|X).

From this, the asymptotic standard error (ASE) of m̂_PP(Y|X) is √((2c/n) m̂_PP(Y|X)).

Example 2.1. Agresti (2002) analyzed the beetle mortality data with a complementary log-log model, in which beetles were exposed to gaseous carbon disulfide at various concentrations, and the numbers of beetles killed after the 5-hour exposure were observed (Table 2.1). Let X be the gaseous carbon disulfide concentration, and α and β the model parameters. Then, for the complementary log-log link we have

θ = log [ (1 − exp{−exp(α + βx)}) / exp{−exp(α + βx)} ].

For the maximum likelihood estimates α̂ = −39.52 (ASE = 3.23) and β̂ = 22.01 (ASE = 1.80), ECD is calculated as ECD̂ = 0.475 (ASE = 0.024). The ECD indicates that 47.5% of the variation of response variable Y in entropy is explained by the gaseous carbon disulfide concentration X.
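The confidence-interval machinery above rests on Patnaik's two-moment approximation of the non-central chi-square. The sketch below (illustrative, not from the source) computes c and ν_0 and verifies that the approximating distribution c · χ²_{ν_0} matches the exact mean p + λ and variance 2(p + 2λ); it also evaluates the ASE formula for given n and m̂_PP.

```python
import math

def patnaik(p, lam):
    """Scale c and degrees of freedom nu0 of Patnaik's central chi-square
    approximation c * chi2(nu0) to the non-central chi2(p, lam)."""
    c = (p + 2 * lam) / (p + lam)          # = 1 + lam / (p + lam)
    nu0 = (p + lam) ** 2 / (p + 2 * lam)
    return c, nu0

def ase_mpp(m_hat, n, p):
    """Asymptotic standard error of the ML estimator of m_PP(Y|X)."""
    lam = n * m_hat                        # estimated non-centrality
    c, _ = patnaik(p, lam)
    return math.sqrt(2 * c * m_hat / n)

p, lam = 1, 50.0
c, nu0 = patnaik(p, lam)
print(c * nu0, p + lam)                    # approximate and exact means agree
print(2 * c**2 * nu0, 2 * (p + 2 * lam))   # approximate and exact variances agree
```

The moment matching is exact by construction, which is why the normal limit of c · χ²_{ν_0} yields the simple ASE formula above.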
Table 2.1. Beetles Killed after Exposure to Carbon Disulfide (columns: log dose, number of beetles, number of beetles killed; eight dose groups; see Agresti, 2002).

2.2 GLMs with canonical links

Most regression analyses with GLMs are performed using canonical links. Let X = (X_1, X_2, ..., X_p)^T be a p × 1 factor or explanatory variable vector; let Y be a response variable; let β = (β_1, β_2, ..., β_p)^T be a regression parameter vector; and let θ = Σ_{i=1}^p β_i X_i be the canonical link. Then, ECD is decomposed as

ECD(X, Y) = Σ_{i=1}^p β_i Cov(X_i, Y) / ( Σ_{j=1}^p β_j Cov(X_j, Y) + a(φ) ).  (2.4)

The above decomposition consists of components that relate to the regression coefficients β_i, and the contribution of X_i to Y may be defined by using β_i Cov(X_i, Y). If the X_i are independent, or the experimental design is a multi-way layout experiment, the contribution ratio of X_i to the response is defined by

CR(X_i) = β_i Cov(X_i, Y) / Σ_{k=1}^p β_k Cov(X_k, Y).

In general, the X_i are correlated, or the model has higher-order interactions. Then the contribution ratio of X_i is defined by

CR(X_i) = (Cov(θ, Y) − Cov(θ, Y|X_i)) / Cov(θ, Y).

Example 2.2. The present discussion is applied to the ordinary two-way layout experimental design model. Let X_1 and X_2 be factors with levels {1, 2, ..., I} and {1, 2, ..., J}, respectively. Then the linear predictor is a function of (X_1, X_2) = (i, j), i.e.

η = α_i + β_j + (αβ)_{ij}.

For model identification, the following constraints are placed on these parameters:

Σ_{i=1}^I α_i = Σ_{j=1}^J β_j = Σ_{i=1}^I (αβ)_{ij} = Σ_{j=1}^J (αβ)_{ij} = 0.

Let

X_{ki} = 1 (X_k = i), 0 (X_k ≠ i) (k = 1, 2).

Then the dummy vectors X_1 = (X_11, X_12, ..., X_1I)^T and X_2 = (X_21, X_22, ..., X_2J)^T are identified with factors X_1 and X_2, respectively. From this, the systematic component of the above model can be written as follows:

η = α^T X_1 + β^T X_2 + γ^T (X_1 ⊗ X_2),

where α = (α_1, α_2, ..., α_I)^T, β = (β_1, β_2, ..., β_J)^T, γ = ((αβ)_11, (αβ)_12, ..., (αβ)_1J, ..., (αβ)_IJ)^T, and X_1 ⊗ X_2 = (X_11 X_21, X_11 X_22, ..., X_11 X_2J, ..., X_1I X_2J)^T.

Let Cov(X_1, Y), Cov(X_2, Y) and Cov(X_1 ⊗ X_2, Y) be the covariance matrices. Then the total effect of X_1 and X_2 is

m_PP(Y|(X_1, X_2)) = Cov(η, Y)/σ² = [ α^T Cov(X_1, Y) + β^T Cov(X_2, Y) + γ^T Cov(X_1 ⊗ X_2, Y) ]/σ²
= [ (1/I) Σ_{i=1}^I α_i² + (1/J) Σ_{j=1}^J β_j² + (1/(IJ)) Σ_{j=1}^J Σ_{i=1}^I (αβ)_{ij}² ]/σ².

The above three terms are referred to as the main effect of X_1, the main effect of X_2 and the interactive effect, respectively. Then, ECD is calculated as follows:

ECD((X_1, X_2), Y) = [ (1/I) Σ α_i² + (1/J) Σ β_j² + (1/(IJ)) ΣΣ (αβ)_{ij}² ] / [ (1/I) Σ α_i² + (1/J) Σ β_j² + (1/(IJ)) ΣΣ (αβ)_{ij}² + σ² ].

In this case,

CR(X_1) = (Cov(η, Y) − Cov(η, Y|X_1))/Cov(η, Y) = (1/I) Σ_{i=1}^I α_i² / Cov(η, Y),
CR(X_2) = (Cov(η, Y) − Cov(η, Y|X_2))/Cov(η, Y) = (1/J) Σ_{j=1}^J β_j² / Cov(η, Y).

The rest,

1 − CR(X_1) − CR(X_2) = (1/(IJ)) Σ_{j=1}^J Σ_{i=1}^I (αβ)_{ij}² / Cov(η, Y),

is due to the effect of the interaction.

Table 2.2. Length of Home Visit in Minutes by Public Health Nurses, by Nurse's Age Group and Type of Patient. Factor X_2 (nurse's age group): 1 (20 to 29), 2 (30 to 39), 3 (40 to 49), 4 (50 and over); factor X_1 (type of patient): 1 (cardiac), 2 (cancer), 3 (C.V.A.), 4 (tuberculosis). (Cell observations as in Daniel, 1999.)

Table 2.2 shows the two-way layout experimental data in a study of the length of time spent on individual home visits by public health nurses (Daniel, 1999). In this example, analysis of the effects of the factors, i.e. the type of patient and the age of the nurse, on the nurses' behavior is of interest. Let Y be the length of home visit, and let factors X_1 and X_2 denote the type of patient and the age group of the nurse, respectively. The results of the two-way analysis of variance are shown in Table 2.3. The main and interactive effects of the factors are significant. In this case, the levels of the factor vector X = (X_1, X_2) are (i, j) (i = 1, 2, 3, 4; j = 1, 2, 3, 4). Although factors X_1 and X_2 are independent (orthogonal), the model has interaction terms between them. By using the present approach and the variance decomposition in Table 2.3, we have

ECD((X_1, X_2), Y) = (3226.56 + 1185.05 + 704.45)/6423.55 = 0.796 (SE = 0.018).

From this, 79.6% of the entropy of the response is explained by the two factors. The contributions of the factors are calculated as follows:

CR(X_1) = 3226.56/6423.55 = 0.502 and CR(X_2) = 1185.05/6423.55 = 0.184.

The contribution of X_1 to Y is about three times greater than that of X_2.
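The calculation above can be reproduced directly from the sums of squares of the analysis of variance in Table 2.3. The sketch below uses those values; note that the interaction sum of squares (704.45) is reconstructed here from the reported mean square, since that digit did not survive cleanly in the source.

```python
# Sums of squares from Table 2.3 (two-way ANOVA of the home-visit data);
# the interaction SS is reconstructed from its mean square (78.272 * 9)
ss_x1, ss_x2, ss_int, ss_total = 3226.56, 1185.05, 704.45, 6423.55

# ECD: explained variation (main effects + interaction) over total variation
ecd = (ss_x1 + ss_x2 + ss_int) / ss_total

# contribution ratios of the two factors, as computed in the text
cr_x1 = ss_x1 / ss_total
cr_x2 = ss_x2 / ss_total

print(round(ecd, 3), round(cr_x1, 3), round(cr_x2, 3))  # 0.796 0.502 0.184
```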
Table 2.3. Analysis of variance of length of time spent on individual home visits by public health nurses

Source | SS | df | MS | F | p
X_1 | 3226.56 | 3 | 1075.52 | 52.641 | 0.000
X_2 | 1185.05 | 3 | 395.02 | 19.334 | 0.000
X_1 × X_2 | 704.45 | 9 | 78.272 | 3.831 | 0.000
Residual | 1307.49 | 64 | 20.431 | |
Total | 6423.55 | 79 | | |

Example 2.3. Table 2.4 shows death penalty data used to study the effects of the defendant's and victim's racial characteristics on whether persons convicted of homicide received the death penalty (Agresti, 2002), and the data were analyzed with the following logit model:

f(y|x) = exp{(α + β_d x_d + β_v x_v)y} / (1 + exp(α + β_d x_d + β_v x_v)),

where X_d and X_v denote the defendant's and victim's race, i.e. 0 = black and 1 = white, and the response, death penalty Y, takes the value 0 (no) or 1 (yes). The estimated parameters were α̂ = −3.596 (SE = 0.507), β̂_d = −0.868 (SE = 0.367) and β̂_v = 2.404 (SE = 0.601) (Agresti, 2002, p. 201). From the results, the odds of the death penalty for a white defendant, given the victim's race, are exp(−0.868) = 0.420 times those for a black defendant, and the odds of the death penalty in a white victim's case, given the defendant's race, are exp(2.404) = 11.07 times those in a black victim's case. Since the predictive power measured with ECD is ECD̂ = 0.036 (SE = 0.014), the effects of the defendant's and victim's races on the death penalty are small. The contribution ratios of the explanatory variables on the response, calculated according to (2.4), are

CR(X_d) = 0.603 and CR(X_v) = 0.813.

Table 2.4. Death Penalty Data

Victim's Race | Defendant's Race | Death Penalty Yes | Death Penalty No
White | White | 53 | 414
White | Black | 11 | 37
Black | White | 0 | 16
Black | Black | 4 | 139

2.3 Application to a generalized logit model

A baseline-category logit model is considered. Let X_1 and X_2 be categorical factors that take levels {1, 2, ..., I} and {1, 2, ..., J}, respectively, and let Y be a categorical response variable with levels {1, 2, ..., K}. Let

X_{ai} = 1 (X_a = i), 0 (X_a ≠ i) (a = 1, 2) and Y_k = 1 (Y = k), 0 (Y ≠ k).  (2.5)

Then the dummy variable vectors X_1 = (X_11, X_12, ..., X_1I)^T, X_2 = (X_21, X_22, ..., X_2J)^T and Y = (Y_1, Y_2, ..., Y_K)^T are identified with factors X_1, X_2 and response Y, respectively. From this, the systematic component of the baseline-category logit model is assumed as follows:

η = α + B_(1) X_1 + B_(2) X_2,

where α = (α_1, α_2, ..., α_K)^T, B_(1) = (β_(1)ki) is the K × I matrix of coefficients of X_1, and B_(2) = (β_(2)kj) is the K × J matrix of coefficients of X_2. Then the logit model is described as

Pr(Y = y|x_1, x_2) = exp(y^T B_(1) x_1 + y^T B_(2) x_2 + y^T α) / Σ_z exp(z^T B_(1) x_1 + z^T B_(2) x_2 + z^T α),

where Σ_z denotes the summation over all z. In this model, we have

ECD(X, Y) = [tr B_(1) Cov(X_1, Y) + tr B_(2) Cov(X_2, Y)] / [tr B_(1) Cov(X_1, Y) + tr B_(2) Cov(X_2, Y) + 1].

In the following example, the ECD approach is demonstrated.
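The baseline-category logit probabilities above are a softmax over the K category scores α_k + (B_(1) x_1)_k + (B_(2) x_2)_k. The sketch below (with small made-up coefficient matrices, not the estimates of the example that follows) computes the response distribution for one combination of factor levels.

```python
import numpy as np

def baseline_logit_probs(alpha, B1, B2, x1, x2):
    """Response probabilities of the baseline-category logit model:
    softmax of the K category scores alpha_k + (B1 x1)_k + (B2 x2)_k."""
    scores = alpha + B1 @ x1 + B2 @ x2
    e = np.exp(scores - scores.max())   # subtract the max for numerical stability
    return e / e.sum()

# made-up example: K = 3 response categories, I = 2 and J = 2 factor levels
alpha = np.array([0.2, -0.1, 0.0])
B1 = np.array([[0.5, 0.0], [1.0, 0.0], [0.0, 0.0]])    # K x I coefficients
B2 = np.array([[0.3, 0.0], [-0.2, 0.0], [0.0, 0.0]])   # K x J coefficients

x1 = np.array([1, 0])   # dummy vector: factor 1 at level 1
x2 = np.array([0, 1])   # dummy vector: factor 2 at level 2
probs = baseline_logit_probs(alpha, B1, B2, x1, x2)
print(probs, probs.sum())   # a proper distribution over the K categories
```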
Example 2.4. The data from an investigation of factors influencing the primary food choice of alligators (Table 2.5) are analyzed (Agresti, 2002). In this example, the explanatory variables are X_1, the lake where the alligators live {1. Hancock, 2. Oklawaha, 3. Trafford, 4. George}, and X_2, the size of the alligators {1. small, 2. large}; the response variable is Y, the primary food choice of the alligators {1. fish, 2. invertebrate, 3. reptile, 4. bird, 5. other}. In this analysis, the generalized logit model described in this section is used, and we set I = 4, J = 2 and K = 5 in (2.5). From this model, the ML estimates B̂_1 (5 × 4) and B̂_2 (5 × 2) of the regression coefficient matrices are obtained (Agresti, 2002). By using these estimates, we have

tr B̂_1 Cov̂(X_1, Y) = 0.258 and tr B̂_2 Cov̂(X_2, Y) = 0.107,

n m̂_PP(Y|X_1, X_2) = 219 × (0.258 + 0.107) = 79.935 (df = 16, P = 0.000).

From this, the effect of X_1 and X_2 is significant. ECD is calculated as follows:

ECD̂(X, Y) = (0.258 + 0.107)/(0.258 + 0.107 + 1) = 0.267 (SE = 0.042).

Although the effects of the factors are statistically significant, the predictive power of the logit model may be small, i.e. only 26.7% of the variation of the response variable in entropy is explained by the explanatory variables. The effect of Lake on Food is about 2.4 times greater than that of Size.

Table 2.5. Alligator Food Choice Data: counts of primary food choice (fish, invertebrate, reptile, bird, other) by lake (Hancock, Oklawaha, Trafford, George) and alligator size (S: ≤ 2.3 m; L: > 2.3 m); see Agresti (2002).

Conclusion

In the GLM framework, regression models are described with random, systematic and link components, and GLMs are widely applied in data analyses; however, the explanatory powers of GLMs have not been measured in practical data analyses except for the ordinary linear regression model. In this seminar, GLMs have been discussed from a viewpoint of entropy, ECD for measuring the explanatory or predictive power of GLMs has been introduced, and the utility of ECD has been shown for practical data analyses.

Acknowledgement

The author would like to thank Prof. Claudio Borroni and all the members of the Department of Quantitative Methods for Economics and Business Sciences, University of Milan (Universita degli Studi di Milano), for giving me a valuable opportunity to present this seminar.

References

[1] Agresti, A. (1986). Applying R²-type measures to ordered categorical data. Technometrics, 28.
[2] Agresti, A. (2002). Categorical Data Analysis, Second Edition. John Wiley & Sons, Inc.: New York.
[3] Ash, A. & Shwartz, M. (1999). R²: a useful measure of model performance when predicting a dichotomous outcome. Statistics in Medicine, 18.
[4] Daniel, W. W. (1999). Biostatistics: A Foundation for Analysis in the Health Sciences, Seventh Edition. John Wiley & Sons, Inc.: New York.
[5] Efron, B. (1978). Regression and ANOVA with zero-one data: measures of residual variation. Journal of the American Statistical Association, 73.
[6] Eshima, N. & Tabata, M. (2007). Entropy correlation coefficient for measuring predictive power of generalized linear models. Statistics and Probability Letters, 77.
[7] Eshima, N. & Tabata, M. (2010). Entropy coefficient of determination for generalized linear models. Computational Statistics and Data Analysis, 54.
[8] Eshima, N. & Tabata, M. (2011). Three predictive power measures for generalized linear models: entropy coefficient of determination, entropy correlation coefficient and regression correlation coefficient. Computational Statistics and Data Analysis, 55.
[9] Goodman, L. A. (1971). The analysis of multinomial contingency tables: stepwise procedures and direct estimation methods for building models for multiple classifications. Technometrics, 13.
[10] Haberman, S. J. (1982). Analysis of dispersion of multinomial responses. Journal of the American Statistical Association, 77.
[11] Kent, J. T. (1983). Information gain and a general measure of correlation. Biometrika, 70.
[12] Korn, E. L. & Simon, R. (1991). Explained residual variation, explained risk and goodness of fit. American Statistician, 45.
[13] McCullagh, P. & Nelder, J. A. (1989). Generalized Linear Models, 2nd Ed. Chapman and Hall: London.
[14] Mittlböck, M. & Schemper, M. (1996). Explained variation for logistic regression. Statistics in Medicine, 15.
[15] Nelder, J. A. & Wedderburn, R. W. M. (1972). Generalized linear models. Journal of the Royal Statistical Society A, 135.
[16] Patnaik, P. B. (1949). The non-central χ²- and F-distributions and their applications. Biometrika, 36.
[17] Theil, H. (1970). On the estimation of relationships involving qualitative variables. American Journal of Sociology, 76.
[18] Zheng, B. & Agresti, A. (2000). Summarizing the predictive power of a generalized linear model. Statistics in Medicine, 19.
More information1 A Non-technical Introduction to Regression
1 A Non-technical Introduction to Regression Chapters 1 and Chapter 2 of the textbook are reviews of material you should know from your previous study (e.g. in your second year course). They cover, in
More informationStat 642, Lecture notes for 04/12/05 96
Stat 642, Lecture notes for 04/12/05 96 Hosmer-Lemeshow Statistic The Hosmer-Lemeshow Statistic is another measure of lack of fit. Hosmer and Lemeshow recommend partitioning the observations into 10 equal
More informationMultivariate Regression
Multivariate Regression The so-called supervised learning problem is the following: we want to approximate the random variable Y with an appropriate function of the random variables X 1,..., X p with the
More informationLOGISTICS REGRESSION FOR SAMPLE SURVEYS
4 LOGISTICS REGRESSION FOR SAMPLE SURVEYS Hukum Chandra Indian Agricultural Statistics Research Institute, New Delhi-002 4. INTRODUCTION Researchers use sample survey methodology to obtain information
More informationST3241 Categorical Data Analysis I Multicategory Logit Models. Logit Models For Nominal Responses
ST3241 Categorical Data Analysis I Multicategory Logit Models Logit Models For Nominal Responses 1 Models For Nominal Responses Y is nominal with J categories. Let {π 1,, π J } denote the response probabilities
More informationRepeated ordinal measurements: a generalised estimating equation approach
Repeated ordinal measurements: a generalised estimating equation approach David Clayton MRC Biostatistics Unit 5, Shaftesbury Road Cambridge CB2 2BW April 7, 1992 Abstract Cumulative logit and related
More informationMULTIVARIATE POPULATIONS
CHAPTER 5 MULTIVARIATE POPULATIONS 5. INTRODUCTION In the following chapters we will be dealing with a variety of problems concerning multivariate populations. The purpose of this chapter is to provide
More informationSpeci cation of Conditional Expectation Functions
Speci cation of Conditional Expectation Functions Econometrics Douglas G. Steigerwald UC Santa Barbara D. Steigerwald (UCSB) Specifying Expectation Functions 1 / 24 Overview Reference: B. Hansen Econometrics
More informationPrediction of ordinal outcomes when the association between predictors and outcome diers between outcome levels
STATISTICS IN MEDICINE Statist. Med. 2005; 24:1357 1369 Published online 26 November 2004 in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/sim.2009 Prediction of ordinal outcomes when the
More informationTreatment Variables INTUB duration of endotracheal intubation (hrs) VENTL duration of assisted ventilation (hrs) LOWO2 hours of exposure to 22 49% lev
Variable selection: Suppose for the i-th observational unit (case) you record ( failure Y i = 1 success and explanatory variabales Z 1i Z 2i Z ri Variable (or model) selection: subject matter theory and
More informationGLM I An Introduction to Generalized Linear Models
GLM I An Introduction to Generalized Linear Models CAS Ratemaking and Product Management Seminar March Presented by: Tanya D. Havlicek, ACAS, MAAA ANTITRUST Notice The Casualty Actuarial Society is committed
More informationUNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator
UNIVERSITY OF TORONTO Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS Duration - 3 hours Aids Allowed: Calculator LAST NAME: FIRST NAME: STUDENT NUMBER: There are 27 pages
More informationLogistic Regression. Fitting the Logistic Regression Model BAL040-A.A.-10-MAJ
Logistic Regression The goal of a logistic regression analysis is to find the best fitting and most parsimonious, yet biologically reasonable, model to describe the relationship between an outcome (dependent
More informationGeneralized linear models for binary data. A better graphical exploratory data analysis. The simple linear logistic regression model
Stat 3302 (Spring 2017) Peter F. Craigmile Simple linear logistic regression (part 1) [Dobson and Barnett, 2008, Sections 7.1 7.3] Generalized linear models for binary data Beetles dose-response example
More informationCategorical data analysis Chapter 5
Categorical data analysis Chapter 5 Interpreting parameters in logistic regression The sign of β determines whether π(x) is increasing or decreasing as x increases. The rate of climb or descent increases
More informationh=1 exp (X : J h=1 Even the direction of the e ect is not determined by jk. A simpler interpretation of j is given by the odds-ratio
Multivariate Response Models The response variable is unordered and takes more than two values. The term unordered refers to the fact that response 3 is not more favored than response 2. One choice from
More informationIntroduction to Econometrics
Introduction to Econometrics Michael Bar October 3, 08 San Francisco State University, department of economics. ii Contents Preliminaries. Probability Spaces................................. Random Variables.................................
More informationLogistic Regression. Continued Psy 524 Ainsworth
Logistic Regression Continued Psy 524 Ainsworth Equations Regression Equation Y e = 1 + A+ B X + B X + B X 1 1 2 2 3 3 i A+ B X + B X + B X e 1 1 2 2 3 3 Equations The linear part of the logistic regression
More information8 Nominal and Ordinal Logistic Regression
8 Nominal and Ordinal Logistic Regression 8.1 Introduction If the response variable is categorical, with more then two categories, then there are two options for generalized linear models. One relies on
More information1. The Multivariate Classical Linear Regression Model
Business School, Brunel University MSc. EC550/5509 Modelling Financial Decisions and Markets/Introduction to Quantitative Methods Prof. Menelaos Karanasos (Room SS69, Tel. 08956584) Lecture Notes 5. The
More informationFaculty of Health Sciences. Regression models. Counts, Poisson regression, Lene Theil Skovgaard. Dept. of Biostatistics
Faculty of Health Sciences Regression models Counts, Poisson regression, 27-5-2013 Lene Theil Skovgaard Dept. of Biostatistics 1 / 36 Count outcome PKA & LTS, Sect. 7.2 Poisson regression The Binomial
More informationST3241 Categorical Data Analysis I Generalized Linear Models. Introduction and Some Examples
ST3241 Categorical Data Analysis I Generalized Linear Models Introduction and Some Examples 1 Introduction We have discussed methods for analyzing associations in two-way and three-way tables. Now we will
More informationReview: what is a linear model. Y = β 0 + β 1 X 1 + β 2 X 2 + A model of the following form:
Outline for today What is a generalized linear model Linear predictors and link functions Example: fit a constant (the proportion) Analysis of deviance table Example: fit dose-response data using logistic
More informationOnline Appendix to: Marijuana on Main Street? Estimating Demand in Markets with Limited Access
Online Appendix to: Marijuana on Main Street? Estating Demand in Markets with Lited Access By Liana Jacobi and Michelle Sovinsky This appendix provides details on the estation methodology for various speci
More informationLimited Dependent Variable Models II
Limited Dependent Variable Models II Fall 2008 Environmental Econometrics (GR03) LDV Fall 2008 1 / 15 Models with Multiple Choices The binary response model was dealing with a decision problem with two
More informationBasic Medical Statistics Course
Basic Medical Statistics Course S7 Logistic Regression November 2015 Wilma Heemsbergen w.heemsbergen@nki.nl Logistic Regression The concept of a relationship between the distribution of a dependent variable
More informationStandard Errors & Confidence Intervals. N(0, I( β) 1 ), I( β) = [ 2 l(β, φ; y) β i β β= β j
Standard Errors & Confidence Intervals β β asy N(0, I( β) 1 ), where I( β) = [ 2 l(β, φ; y) ] β i β β= β j We can obtain asymptotic 100(1 α)% confidence intervals for β j using: β j ± Z 1 α/2 se( β j )
More informationSupplemental Material 1 for On Optimal Inference in the Linear IV Model
Supplemental Material 1 for On Optimal Inference in the Linear IV Model Donald W. K. Andrews Cowles Foundation for Research in Economics Yale University Vadim Marmer Vancouver School of Economics University
More informationLecture 14: Introduction to Poisson Regression
Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu 8 May 2007 1 / 52 Overview Modelling counts Contingency tables Poisson regression models 2 / 52 Modelling counts I Why
More informationModelling counts. Lecture 14: Introduction to Poisson Regression. Overview
Modelling counts I Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu Why count data? Number of traffic accidents per day Mortality counts in a given neighborhood, per week
More information11. Generalized Linear Models: An Introduction
Sociology 740 John Fox Lecture Notes 11. Generalized Linear Models: An Introduction Copyright 2014 by John Fox Generalized Linear Models: An Introduction 1 1. Introduction I A synthesis due to Nelder and
More informationLinear Regression Models P8111
Linear Regression Models P8111 Lecture 25 Jeff Goldsmith April 26, 2016 1 of 37 Today s Lecture Logistic regression / GLMs Model framework Interpretation Estimation 2 of 37 Linear regression Course started
More informationBIOS 625 Fall 2015 Homework Set 3 Solutions
BIOS 65 Fall 015 Homework Set 3 Solutions 1. Agresti.0 Table.1 is from an early study on the death penalty in Florida. Analyze these data and show that Simpson s Paradox occurs. Death Penalty Victim's
More informationHierarchical Generalized Linear Models. ERSH 8990 REMS Seminar on HLM Last Lecture!
Hierarchical Generalized Linear Models ERSH 8990 REMS Seminar on HLM Last Lecture! Hierarchical Generalized Linear Models Introduction to generalized models Models for binary outcomes Interpreting parameter
More informationA note on R 2 measures for Poisson and logistic regression models when both models are applicable
Journal of Clinical Epidemiology 54 (001) 99 103 A note on R measures for oisson and logistic regression models when both models are applicable Martina Mittlböck, Harald Heinzl* Department of Medical Computer
More informationEconometrics Homework 1
Econometrics Homework Due Date: March, 24. by This problem set includes questions for Lecture -4 covered before midterm exam. Question Let z be a random column vector of size 3 : z = @ (a) Write out z
More informationGeneralized linear models
Generalized linear models Outline for today What is a generalized linear model Linear predictors and link functions Example: estimate a proportion Analysis of deviance Example: fit dose- response data
More information" M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2
Notation and Equations for Final Exam Symbol Definition X The variable we measure in a scientific study n The size of the sample N The size of the population M The mean of the sample µ The mean of the
More informationPANEL DATA RANDOM AND FIXED EFFECTS MODEL. Professor Menelaos Karanasos. December Panel Data (Institute) PANEL DATA December / 1
PANEL DATA RANDOM AND FIXED EFFECTS MODEL Professor Menelaos Karanasos December 2011 PANEL DATA Notation y it is the value of the dependent variable for cross-section unit i at time t where i = 1,...,
More informationTime Series Models and Inference. James L. Powell Department of Economics University of California, Berkeley
Time Series Models and Inference James L. Powell Department of Economics University of California, Berkeley Overview In contrast to the classical linear regression model, in which the components of the
More informationSample size calculations for logistic and Poisson regression models
Biometrika (2), 88, 4, pp. 93 99 2 Biometrika Trust Printed in Great Britain Sample size calculations for logistic and Poisson regression models BY GWOWEN SHIEH Department of Management Science, National
More informationAnalysis of Categorical Data. Nick Jackson University of Southern California Department of Psychology 10/11/2013
Analysis of Categorical Data Nick Jackson University of Southern California Department of Psychology 10/11/2013 1 Overview Data Types Contingency Tables Logit Models Binomial Ordinal Nominal 2 Things not
More informationMcGill University. Faculty of Science. Department of Mathematics and Statistics. Statistics Part A Comprehensive Exam Methodology Paper
Student Name: ID: McGill University Faculty of Science Department of Mathematics and Statistics Statistics Part A Comprehensive Exam Methodology Paper Date: Friday, May 13, 2016 Time: 13:00 17:00 Instructions
More informationStat 8053, Fall 2013: Multinomial Logistic Models
Stat 8053, Fall 2013: Multinomial Logistic Models Here is the example on page 269 of Agresti on food preference of alligators: s is size class, g is sex of the alligator, l is name of the lake, and f is
More informationA COEFFICIENT OF DETERMINATION FOR LOGISTIC REGRESSION MODELS
A COEFFICIENT OF DETEMINATION FO LOGISTIC EGESSION MODELS ENATO MICELI UNIVESITY OF TOINO After a brief presentation of the main extensions of the classical coefficient of determination ( ), a new index
More informationTesting for Regime Switching: A Comment
Testing for Regime Switching: A Comment Andrew V. Carter Department of Statistics University of California, Santa Barbara Douglas G. Steigerwald Department of Economics University of California Santa Barbara
More informationONE MORE TIME ABOUT R 2 MEASURES OF FIT IN LOGISTIC REGRESSION
ONE MORE TIME ABOUT R 2 MEASURES OF FIT IN LOGISTIC REGRESSION Ernest S. Shtatland, Ken Kleinman, Emily M. Cain Harvard Medical School, Harvard Pilgrim Health Care, Boston, MA ABSTRACT In logistic regression,
More informationStat 5101 Lecture Notes
Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random
More informationGeneralized Linear Models. Last time: Background & motivation for moving beyond linear
Generalized Linear Models Last time: Background & motivation for moving beyond linear regression - non-normal/non-linear cases, binary, categorical data Today s class: 1. Examples of count and ordered
More informationEconometrics Lecture 5: Limited Dependent Variable Models: Logit and Probit
Econometrics Lecture 5: Limited Dependent Variable Models: Logit and Probit R. G. Pierse 1 Introduction In lecture 5 of last semester s course, we looked at the reasons for including dichotomous variables
More informationBIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY
BIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY Ingo Langner 1, Ralf Bender 2, Rebecca Lenz-Tönjes 1, Helmut Küchenhoff 2, Maria Blettner 2 1
More informationLISA Short Course Series Generalized Linear Models (GLMs) & Categorical Data Analysis (CDA) in R. Liang (Sally) Shan Nov. 4, 2014
LISA Short Course Series Generalized Linear Models (GLMs) & Categorical Data Analysis (CDA) in R Liang (Sally) Shan Nov. 4, 2014 L Laboratory for Interdisciplinary Statistical Analysis LISA helps VT researchers
More informationModel Selection in GLMs. (should be able to implement frequentist GLM analyses!) Today: standard frequentist methods for model selection
Model Selection in GLMs Last class: estimability/identifiability, analysis of deviance, standard errors & confidence intervals (should be able to implement frequentist GLM analyses!) Today: standard frequentist
More informationMeasures of Association and Variance Estimation
Measures of Association and Variance Estimation Dipankar Bandyopadhyay, Ph.D. Department of Biostatistics, Virginia Commonwealth University D. Bandyopadhyay (VCU) BIOS 625: Categorical Data & GLM 1 / 35
More informationMultinomial Regression Models
Multinomial Regression Models Objectives: Multinomial distribution and likelihood Ordinal data: Cumulative link models (POM). Ordinal data: Continuation models (CRM). 84 Heagerty, Bio/Stat 571 Models for
More informationOn Properties of QIC in Generalized. Estimating Equations. Shinpei Imori
On Properties of QIC in Generalized Estimating Equations Shinpei Imori Graduate School of Engineering Science, Osaka University 1-3 Machikaneyama-cho, Toyonaka, Osaka 560-8531, Japan E-mail: imori.stat@gmail.com
More information2 Describing Contingency Tables
2 Describing Contingency Tables I. Probability structure of a 2-way contingency table I.1 Contingency Tables X, Y : cat. var. Y usually random (except in a case-control study), response; X can be random
More informationLecture 3.1 Basic Logistic LDA
y Lecture.1 Basic Logistic LDA 0.2.4.6.8 1 Outline Quick Refresher on Ordinary Logistic Regression and Stata Women s employment example Cross-Over Trial LDA Example -100-50 0 50 100 -- Longitudinal Data
More informationComparison of Estimators in GLM with Binary Data
Journal of Modern Applied Statistical Methods Volume 13 Issue 2 Article 10 11-2014 Comparison of Estimators in GLM with Binary Data D. M. Sakate Shivaji University, Kolhapur, India, dms.stats@gmail.com
More informationSuggested Solution for PS #5
Cornell University Department of Economics Econ 62 Spring 28 TA: Jae Ho Yun Suggested Solution for S #5. (Measurement Error, IV) (a) This is a measurement error problem. y i x i + t i + " i t i t i + i
More informationStrati cation in Multivariate Modeling
Strati cation in Multivariate Modeling Tihomir Asparouhov Muthen & Muthen Mplus Web Notes: No. 9 Version 2, December 16, 2004 1 The author is thankful to Bengt Muthen for his guidance, to Linda Muthen
More informationNATIONAL UNIVERSITY OF SINGAPORE EXAMINATION (SOLUTIONS) ST3241 Categorical Data Analysis. (Semester II: )
NATIONAL UNIVERSITY OF SINGAPORE EXAMINATION (SOLUTIONS) Categorical Data Analysis (Semester II: 2010 2011) April/May, 2011 Time Allowed : 2 Hours Matriculation No: Seat No: Grade Table Question 1 2 3
More informationThe consequences of misspecifying the random effects distribution when fitting generalized linear mixed models
The consequences of misspecifying the random effects distribution when fitting generalized linear mixed models John M. Neuhaus Charles E. McCulloch Division of Biostatistics University of California, San
More informationIntroduction to General and Generalized Linear Models
Introduction to General and Generalized Linear Models Generalized Linear Models - part II Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs.
More informationLecture 13: More on Binary Data
Lecture 1: More on Binary Data Link functions for Binomial models Link η = g(π) π = g 1 (η) identity π η logarithmic log π e η logistic log ( π 1 π probit Φ 1 (π) Φ(η) log-log log( log π) exp( e η ) complementary
More information,..., θ(2),..., θ(n)
Likelihoods for Multivariate Binary Data Log-Linear Model We have 2 n 1 distinct probabilities, but we wish to consider formulations that allow more parsimonious descriptions as a function of covariates.
More informationModel Selection for Semiparametric Bayesian Models with Application to Overdispersion
Proceedings 59th ISI World Statistics Congress, 25-30 August 2013, Hong Kong (Session CPS020) p.3863 Model Selection for Semiparametric Bayesian Models with Application to Overdispersion Jinfang Wang and
More informationChapter 12: Bivariate & Conditional Distributions
Chapter 12: Bivariate & Conditional Distributions James B. Ramsey March 2007 James B. Ramsey () Chapter 12 26/07 1 / 26 Introduction Key relationships between joint, conditional, and marginal distributions.
More informationComparing groups using predicted probabilities
Comparing groups using predicted probabilities J. Scott Long Indiana University May 9, 2006 MAPSS - May 9, 2006 - Page 1 The problem Allison (1999): Di erences in the estimated coe cients tell us nothing
More informationBias-corrected AIC for selecting variables in Poisson regression models
Bias-corrected AIC for selecting variables in Poisson regression models Ken-ichi Kamo (a), Hirokazu Yanagihara (b) and Kenichi Satoh (c) (a) Corresponding author: Department of Liberal Arts and Sciences,
More informationGood Confidence Intervals for Categorical Data Analyses. Alan Agresti
Good Confidence Intervals for Categorical Data Analyses Alan Agresti Department of Statistics, University of Florida visiting Statistics Department, Harvard University LSHTM, July 22, 2011 p. 1/36 Outline
More informationChapter 2: Describing Contingency Tables - II
: Describing Contingency Tables - II Dipankar Bandyopadhyay Department of Biostatistics, Virginia Commonwealth University BIOS 625: Categorical Data & GLM [Acknowledgements to Tim Hanson and Haitao Chu]
More informationGeneralized Linear Models: An Introduction
Applied Statistics With R Generalized Linear Models: An Introduction John Fox WU Wien May/June 2006 2006 by John Fox Generalized Linear Models: An Introduction 1 A synthesis due to Nelder and Wedderburn,
More informationFöreläsning /31
1/31 Föreläsning 10 090420 Chapter 13 Econometric Modeling: Model Speci cation and Diagnostic testing 2/31 Types of speci cation errors Consider the following models: Y i = β 1 + β 2 X i + β 3 X 2 i +
More informationSTA216: Generalized Linear Models. Lecture 1. Review and Introduction
STA216: Generalized Linear Models Lecture 1. Review and Introduction Let y 1,..., y n denote n independent observations on a response Treat y i as a realization of a random variable Y i In the general
More informationIntroduction: structural econometrics. Jean-Marc Robin
Introduction: structural econometrics Jean-Marc Robin Abstract 1. Descriptive vs structural models 2. Correlation is not causality a. Simultaneity b. Heterogeneity c. Selectivity Descriptive models Consider
More informationAppendix A. Math Reviews 03Jan2007. A.1 From Simple to Complex. Objectives. 1. Review tools that are needed for studying models for CLDVs.
Appendix A Math Reviews 03Jan007 Objectives. Review tools that are needed for studying models for CLDVs.. Get you used to the notation that will be used. Readings. Read this appendix before class.. Pay
More informationLab 07 Introduction to Econometrics
Lab 07 Introduction to Econometrics Learning outcomes for this lab: Introduce the different typologies of data and the econometric models that can be used Understand the rationale behind econometrics Understand
More informationGLM models and OLS regression
GLM models and OLS regression Graeme Hutcheson, University of Manchester These lecture notes are based on material published in... Hutcheson, G. D. and Sofroniou, N. (1999). The Multivariate Social Scientist:
More informationEconomics Introduction to Econometrics - Fall 2007 Final Exam - Answers
Student Name: Economics 4818 - Introduction to Econometrics - Fall 2007 Final Exam - Answers SHOW ALL WORK! Evaluation: Problems: 3, 4C, 5C and 5F are worth 4 points. All other questions are worth 3 points.
More informationRegression models for multivariate ordered responses via the Plackett distribution
Journal of Multivariate Analysis 99 (2008) 2472 2478 www.elsevier.com/locate/jmva Regression models for multivariate ordered responses via the Plackett distribution A. Forcina a,, V. Dardanoni b a Dipartimento
More informationSimple ways to interpret effects in modeling ordinal categorical data
DOI: 10.1111/stan.12130 ORIGINAL ARTICLE Simple ways to interpret effects in modeling ordinal categorical data Alan Agresti 1 Claudia Tarantola 2 1 Department of Statistics, University of Florida, Gainesville,
More information