# A Bivariate Weibull Regression Model


David D. Hanagal

© Heldermann Verlag, Economic Quality Control, Vol 20 (2005), No. 1

Abstract: In this paper, we propose a new bivariate Weibull regression model based on censored samples with common covariates. There are some interesting biometrical situations which motivate the study of a bivariate Weibull regression model of the proposed type. A procedure for obtaining the maximum likelihood estimators of the model parameters is derived, and a test of significance for the regression parameters is sketched.

Key words: Bivariate Weibull model, parametric regression, survival times.

## 1 Introduction

We introduce a new bivariate Weibull regression model based on censored samples with common covariates. Freund [1] proposed a bivariate exponential model (BVE), and Proschan-Sullo [7] modified Freund's BVE by allowing simultaneous failures. A modified bivariate Weibull (BVW) model is obtained by a simple power transformation of the BVE of Proschan-Sullo [7]. Hanagal [4] proposed a multivariate Weibull distribution which generalizes the multivariate exponential model of Marshall-Olkin [6]. We choose the BVW model because its additional shape parameter $\sigma$ makes it more flexible than the BVE of Proschan-Sullo [7], which it contains as the special case $\sigma = 1$.

Some situations arising in biometry motivate the study of a BVW regression model of a particular type. For example, paired organs such as kidneys, eyes or ears of an individual (or patient) may be regarded as a two-component system. Failure of one organ increases the risk of failure of the other. We assume here that the lifetime of an individual is independent of the lifetimes of the paired organs and use the univariate censoring scheme given by Hanagal [2, 3], since the death of an individual censors the lifetimes of both organs. Hanagal [5] proposed a bivariate Weibull regression model for the situation described above by extending Marshall-Olkin's bivariate exponential distribution.
The covariates may be the age or sex of the patient, smoking or drinking habits, diabetic or non-diabetic condition, specific diseases of the patient, and so on. Treating the lifetimes of the paired organs of every patient as identically distributed BVW variables is unrealistic in such situations and is therefore not advisable. Each patient has individual characteristics, so it is necessary to incorporate covariates on which the lifetimes of the paired organs may depend. These covariates represent individual properties of a patient relevant to the lifetimes and can thus be assumed to be the same for both organs of a pair, i.e., we may assume common covariates. Unfortunately, at this stage of the investigation no real data were available for evaluating the proposed model.

In Section 2, we introduce the BVW regression model, and in Section 3 we derive estimators for the parameters of the proposed model. In Section 4, we present a test procedure for checking the significance of the regression parameters.

## 2 Bivariate Weibull Regression Model

The BVE of Freund [1], as modified by Proschan-Sullo [7] to allow simultaneous failures, is given by the following joint probability density function (pdf):

$$
f(x_1, x_2) =
\begin{cases}
\lambda_1 \lambda_{22}\, e^{-\lambda_{22} x_2 - (\lambda - \lambda_{22}) x_1} & \text{for } 0 < x_1 < x_2 < \infty \\
\lambda_2 \lambda_{11}\, e^{-\lambda_{11} x_1 - (\lambda - \lambda_{11}) x_2} & \text{for } 0 < x_2 < x_1 < \infty \\
\lambda_3\, e^{-\lambda x} & \text{for } 0 < x_1 = x_2 = x < \infty
\end{cases}
\tag{1}
$$

where $\lambda = \lambda_1 + \lambda_2 + \lambda_3$ and $\lambda_1, \lambda_2, \lambda_3, \lambda_{11}, \lambda_{22} > 0$.

By means of the transformations $T_1 = X_1^{1/\sigma}$ and $T_2 = X_2^{1/\sigma}$, $\sigma > 0$, we get a bivariate Weibull model with joint pdf as follows:

$$
f(t_1, t_2) =
\begin{cases}
\lambda_1 \lambda_{22}\, \sigma^2 (t_1 t_2)^{\sigma - 1} e^{-\lambda_{22} t_2^{\sigma} - (\lambda - \lambda_{22}) t_1^{\sigma}} & \text{for } 0 < t_1 < t_2 < \infty \\
\lambda_2 \lambda_{11}\, \sigma^2 (t_1 t_2)^{\sigma - 1} e^{-\lambda_{11} t_1^{\sigma} - (\lambda - \lambda_{11}) t_2^{\sigma}} & \text{for } 0 < t_2 < t_1 < \infty \\
\lambda_3\, \sigma t^{\sigma - 1} e^{-\lambda t^{\sigma}} & \text{for } 0 < t_1 = t_2 = t < \infty
\end{cases}
\tag{2}
$$

As is well known for the BVE of Proschan-Sullo [7], the marginals are weighted combinations of two exponential distributions. In the BVW case, too, the marginals are weighted combinations of two Weibull distributions with the same weights. The minimum of the two lifetimes, $\min(T_1, T_2)$, is Weibull distributed with scale parameter $(\lambda_1 + \lambda_2 + \lambda_3)$ and shape parameter $\sigma$. The lifetimes $T_1$ and $T_2$ are independent whenever $\lambda_{11} = \lambda_1$ and $\lambda_{22} = \lambda_2$ hold (and, in addition, $\lambda_3 = 0$, so that no simultaneous failures occur). The probabilities of the three regions are given by

$$
P[T_1 < T_2] = \frac{\lambda_1}{\lambda}, \qquad
P[T_1 > T_2] = \frac{\lambda_2}{\lambda}, \qquad
P[T_1 = T_2] = \frac{\lambda_3}{\lambda}
\tag{3}
$$

Taking logarithms of the BVE variates in (1), i.e., $Y_1 = \log X_1$ and $Y_2 = \log X_2$, we get a bivariate extreme value (BVEV) distribution with pdf given by:

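$$
f(y_1, y_2) =
\begin{cases}
\exp\left\{ y_1 + \log\lambda_1 + y_2 + \log\lambda_{22} - e^{y_1 + \log(\lambda - \lambda_{22})} - e^{y_2 + \log\lambda_{22}} \right\} & \text{for } -\infty < y_1 < y_2 < \infty \\
\exp\left\{ y_2 + \log\lambda_2 + y_1 + \log\lambda_{11} - e^{y_2 + \log(\lambda - \lambda_{11})} - e^{y_1 + \log\lambda_{11}} \right\} & \text{for } -\infty < y_2 < y_1 < \infty \\
\exp\left\{ y + \log\lambda_3 - e^{y + \log\lambda} \right\} & \text{for } -\infty < y_1 = y_2 = y < \infty
\end{cases}
\tag{4}
$$

The two joint density functions (1) and (4) show that the parameters $(\lambda_1, \lambda_2, \lambda_3, \lambda_{11}, \lambda_{22})$ are not proper scale parameters for the BVE of Proschan-Sullo [7]. It follows that the BVE of Proschan-Sullo and the corresponding BVEV do not belong to the location-scale family.

The regression model for the two-component system of interest is given by:

$$
\begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix}
=
\begin{pmatrix} \beta_1' X_1 \\ \beta_2' X_2 \end{pmatrix}
+ \frac{1}{\sigma}
\begin{pmatrix} U_1 \\ U_2 \end{pmatrix}
\tag{5}
$$

where $X_1$ and $X_2$ are $m$-dimensional vectors of regressor variables or covariates, $\beta_1$ and $\beta_2$ are $m$-dimensional vectors of regression coefficients, and $(U_1, U_2)$ are random variables with the joint density given in (4). For $Y_1 = \log T_1$ and $Y_2 = \log T_2$, we get

$$
\begin{pmatrix} T_1 \\ T_2 \end{pmatrix}
=
\begin{pmatrix} e^{\beta_1' X_1}\, V_1^{1/\sigma} \\ e^{\beta_2' X_2}\, V_2^{1/\sigma} \end{pmatrix}
\tag{6}
$$

where $(V_1 = e^{U_1}, V_2 = e^{U_2})$ is BVE of Proschan-Sullo [7]. Alternatively, we can write

$$
\begin{pmatrix} V_1 \\ V_2 \end{pmatrix}
=
\begin{pmatrix} e^{-\sigma \beta_1' X_1}\, T_1^{\sigma} \\ e^{-\sigma \beta_2' X_2}\, T_2^{\sigma} \end{pmatrix}
$$

Assuming additionally that both components have not only common covariates but also common regression parameters, i.e., $X_1 = X_2 = X$ and $\beta_1 = \beta_2 = \beta$, the preceding representation simplifies to:

$$
\begin{pmatrix} V_1 \\ V_2 \end{pmatrix}
=
\begin{pmatrix} T_1^{\sigma} \\ T_2^{\sigma} \end{pmatrix}
e^{-\sigma \beta' X}
$$

## 3 Estimation of the Parameters

Suppose that the study is based on an independent sample of size $n$, and let the $i$th pair of components have lifetimes $(T_{1i}, T_{2i})$ and censoring time $Z_i$. We assume the censoring time $Z$ to be independent of the lifetimes $(T_1, T_2)$. The observed lifetimes associated with the $i$th pair of organs are given by

$$
(T_{1i}, T_{2i}) =
\begin{cases}
(T_{1i}, T_{2i}) & \text{for } \max(T_{1i}, T_{2i}) < Z_i \\
(T_{1i}, Z_i) & \text{for } T_{1i} < Z_i < T_{2i} \\
(Z_i, T_{2i}) & \text{for } T_{2i} < Z_i < T_{1i} \\
(Z_i, Z_i) & \text{for } Z_i < \min(T_{1i}, T_{2i})
\end{cases}
\tag{7}
$$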
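The data-generating mechanism described by (1), (6) and (7) can be illustrated with a short simulation. The following is only a minimal sketch under stated assumptions, not a procedure given in the paper: it draws the Proschan-Sullo BVE pair by first sampling the minimum as an exponential variable with rate $\lambda$ and then choosing which component fails first with the probabilities in (3), a construction that reproduces the density (1). The function names (`sample_bve`, `sample_bvw_pair`, `censor`) and all parameter values are hypothetical.

```python
import math
import random


def sample_bve(lam1, lam2, lam3, lam11, lam22, rng):
    """Draw (V1, V2) from the Proschan-Sullo BVE with density (1).

    Assumed construction: the minimum is Exp(lam); component 1 fails first
    with probability lam1/lam, component 2 with lam2/lam, both with lam3/lam.
    The survivor then lives an extra Exp(lam22) or Exp(lam11) time.
    """
    lam = lam1 + lam2 + lam3
    first = rng.expovariate(lam)
    u = rng.random() * lam
    if u < lam1:                      # component 1 fails first
        return first, first + rng.expovariate(lam22)
    if u < lam1 + lam2:               # component 2 fails first
        return first + rng.expovariate(lam11), first
    return first, first               # simultaneous failure


def sample_bvw_pair(params, sigma, beta, x, rng):
    """Apply (6) with common covariates: T_j = exp(beta'x) * V_j**(1/sigma)."""
    v1, v2 = sample_bve(*params, rng)
    scale = math.exp(sum(b * xi for b, xi in zip(beta, x)))
    return scale * v1 ** (1.0 / sigma), scale * v2 ** (1.0 / sigma)


def censor(t1, t2, z):
    """Univariate censoring as in (7); returns (obs1, obs2, category 1..6).

    Categories 1..6 correspond to the regions of f1, f2, f3, f4, f5 and the
    joint survival term in the likelihood of Section 3.
    """
    if max(t1, t2) < z:
        if t1 < t2:
            return t1, t2, 1
        if t2 < t1:
            return t1, t2, 2
        return t1, t2, 3
    if t1 < z:
        return t1, z, 4
    if t2 < z:
        return z, t2, 5
    return z, z, 6


if __name__ == "__main__":
    rng = random.Random(2005)
    params = (1.0, 2.0, 1.0, 3.0, 4.0)   # hypothetical lam1, lam2, lam3, lam11, lam22
    counts = [0] * 6
    for _ in range(1000):
        x = [1.0, rng.random()]           # hypothetical covariates (intercept, one regressor)
        t1, t2 = sample_bvw_pair(params, 1.5, [0.2, -0.5], x, rng)
        z = rng.expovariate(0.5)          # hypothetical independent censoring time
        counts[censor(t1, t2, z)[2] - 1] += 1
    print("n1..n6 =", counts)
```

The six counts printed by the demo play the role of $n_1, \dots, n_6$ in the likelihood that follows; the censoring distribution used here is an arbitrary choice for illustration.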

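The likelihood function of the sample of size $n$ is given by

$$
L = \left( \prod_{i=1}^{n_1} f_{1,i} \right)
\left( \prod_{i=1}^{n_2} f_{2,i} \right)
\left( \prod_{i=1}^{n_3} f_{3,i} \right)
\left( \prod_{i=1}^{n_4} f_{4,i} \right)
\left( \prod_{i=1}^{n_5} f_{5,i} \right)
\left( \prod_{i=1}^{n_6} \bar{F}_i \right)
\tag{8}
$$

where

$$
f_{1,i}(t_{1i}, t_{2i}) = \lambda_1 \lambda_{22}\, \sigma^2 (t_{1i} t_{2i})^{\sigma - 1} e^{-2\sigma X_i'\beta}
\exp\left\{ -\left[ (\lambda - \lambda_{22})\, t_{1i}^{\sigma} + \lambda_{22}\, t_{2i}^{\sigma} \right] e^{-\sigma X_i'\beta} \right\}
\quad \text{for } 0 < t_{1i} < t_{2i} < z_i
\tag{9}
$$

$$
f_{2,i}(t_{1i}, t_{2i}) = \lambda_2 \lambda_{11}\, \sigma^2 (t_{1i} t_{2i})^{\sigma - 1} e^{-2\sigma X_i'\beta}
\exp\left\{ -\left[ (\lambda - \lambda_{11})\, t_{2i}^{\sigma} + \lambda_{11}\, t_{1i}^{\sigma} \right] e^{-\sigma X_i'\beta} \right\}
\quad \text{for } 0 < t_{2i} < t_{1i} < z_i
\tag{10}
$$

$$
f_{3,i}(t_{1i}, t_{2i}) = \lambda_3\, \sigma\, t_i^{\sigma - 1} e^{-\sigma X_i'\beta}
\exp\left\{ -\lambda\, t_i^{\sigma} e^{-\sigma X_i'\beta} \right\}
\quad \text{for } 0 < t_{1i} = t_{2i} = t_i < z_i
\tag{11}
$$

$$
\begin{aligned}
f_{4,i}(t_{1i}, t_{2i}) &= \lim_{\delta t_i \to 0} \frac{P[t_{1i} < T_{1i} < t_{1i} + \delta t_i \mid T_{2i} > z_i]\, P[T_{2i} > z_i]}{\delta t_i} \\
&= \lambda_1\, \sigma\, t_{1i}^{\sigma - 1} e^{-\sigma X_i'\beta}
\exp\left\{ -\left[ (\lambda - \lambda_{22})\, t_{1i}^{\sigma} + \lambda_{22}\, z_i^{\sigma} \right] e^{-\sigma X_i'\beta} \right\}
\quad \text{for } 0 < t_{1i} < z_i < t_{2i}
\end{aligned}
\tag{12}
$$

$$
\begin{aligned}
f_{5,i}(t_{1i}, t_{2i}) &= \lim_{\delta t_i \to 0} \frac{P[t_{2i} < T_{2i} < t_{2i} + \delta t_i \mid T_{1i} > z_i]\, P[T_{1i} > z_i]}{\delta t_i} \\
&= \lambda_2\, \sigma\, t_{2i}^{\sigma - 1} e^{-\sigma X_i'\beta}
\exp\left\{ -\left[ (\lambda - \lambda_{11})\, t_{2i}^{\sigma} + \lambda_{11}\, z_i^{\sigma} \right] e^{-\sigma X_i'\beta} \right\}
\quad \text{for } 0 < t_{2i} < z_i < t_{1i}
\end{aligned}
\tag{13}
$$

$$
\bar{F}_i(z_i) = P[T_{1i} > z_i, T_{2i} > z_i] = \exp\left\{ -\lambda\, z_i^{\sigma} e^{-\sigma X_i'\beta} \right\}
\tag{14}
$$

$$
X_i = (X_{1i}, \dots, X_{mi})'
\tag{15}
$$

$$
\beta = (\beta_1, \dots, \beta_m)'
\tag{16}
$$

The integers $n_1, n_2, n_3, n_4, n_5$ and $n_6$ represent the numbers of observations falling in the ranges corresponding to $f_1, f_2, f_3, f_4, f_5$ and $\bar{F}$, respectively. As can be seen from the above formulas, the densities $f_1$ and $f_2$ refer to Lebesgue measure in $R^2$, while $f_3$, $f_4$ and $f_5$ refer to Lebesgue measure in $R^1$.

The logarithm of the likelihood function is given by:

$$
\begin{aligned}
\log L ={}& (2n_1 + 2n_2 + n_3 + n_4 + n_5) \log\sigma + (n_1 + n_4) \log\lambda_1 + (n_2 + n_5) \log\lambda_2 \\
&+ n_3 \log\lambda_3 + n_1 \log\lambda_{22} + n_2 \log\lambda_{11}
+ (\sigma - 1) \sum_{i \in A} \log t_{1i} + (\sigma - 1) \sum_{i \in B} \log t_{2i} \\
&- \sigma \sum_{i \in C} X_i'\beta - \sigma \sum_{i \in D} X_i'\beta
- (\lambda_1 + \lambda_2 + \lambda_3) \sum_{i} W_i^{\sigma} \exp(-\sigma X_i'\beta) \\
&- \lambda_{11} \sum_{i \in F} (T_{1i}^{\sigma} - T_{2i}^{\sigma}) \exp(-\sigma X_i'\beta)
- \lambda_{22} \sum_{i \in G} (T_{2i}^{\sigma} - T_{1i}^{\sigma}) \exp(-\sigma X_i'\beta)
\end{aligned}
\tag{17}
$$

where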

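$$
\begin{aligned}
A &= \{ t_{1i} \mid t_{1i} < z_i \} \\
B &= \{ t_{2i} \mid t_{2i} < z_i \} \\
C &= \{ (t_{1i}, t_{2i}) \mid 0 < t_{1i}, t_{2i} < z_i \} \\
D &= \{ (t_{1i}, t_{2i}) \mid t_{1i} < z_i \text{ or } t_{2i} < z_i \} \\
F &= \{ (T_{1i}, T_{2i}) \mid T_{2i} < T_{1i} \} \\
G &= \{ (T_{1i}, T_{2i}) \mid T_{1i} < T_{2i} \} \\
W_i &= \min(T_{1i}, T_{2i})
\end{aligned}
\tag{18}
$$

From (17) the likelihood equations are obtained:

$$
\frac{n_1 + n_4}{\lambda_1} - \sum_{i} W_i^{\sigma} \exp(-\sigma X_i'\beta) = 0
\tag{19}
$$

$$
\frac{n_2 + n_5}{\lambda_2} - \sum_{i} W_i^{\sigma} \exp(-\sigma X_i'\beta) = 0
\tag{20}
$$

$$
\frac{n_3}{\lambda_3} - \sum_{i} W_i^{\sigma} \exp(-\sigma X_i'\beta) = 0
\tag{21}
$$

$$
\frac{n_2}{\lambda_{11}} - \sum_{i \in F} (T_{1i}^{\sigma} - T_{2i}^{\sigma}) \exp(-\sigma X_i'\beta) = 0
\tag{22}
$$

$$
\frac{n_1}{\lambda_{22}} - \sum_{i \in G} (T_{2i}^{\sigma} - T_{1i}^{\sigma}) \exp(-\sigma X_i'\beta) = 0
\tag{23}
$$

$$
\begin{aligned}
&\frac{2n_1 + 2n_2 + n_3 + n_4 + n_5}{\sigma} + \sum_{i \in A} \log t_{1i} + \sum_{i \in B} \log t_{2i} - \sum_{i \in C} X_i'\beta - \sum_{i \in D} X_i'\beta \\
&\quad - (\lambda_1 + \lambda_2 + \lambda_3) \sum_{i} W_i^{\sigma} \exp(-\sigma X_i'\beta) \left[ \log W_i - X_i'\beta \right] \\
&\quad - \lambda_{11} \sum_{i \in F} \left[ T_{1i}^{\sigma} (\log T_{1i} - X_i'\beta) - T_{2i}^{\sigma} (\log T_{2i} - X_i'\beta) \right] \exp(-\sigma X_i'\beta) \\
&\quad - \lambda_{22} \sum_{i \in G} \left[ T_{2i}^{\sigma} (\log T_{2i} - X_i'\beta) - T_{1i}^{\sigma} (\log T_{1i} - X_i'\beta) \right] \exp(-\sigma X_i'\beta) = 0
\end{aligned}
\tag{24}
$$

$$
\begin{aligned}
&-\sigma \sum_{i \in C} X_{ji} - \sigma \sum_{i \in D} X_{ji}
+ \sigma (\lambda_1 + \lambda_2 + \lambda_3) \sum_{i} X_{ji} W_i^{\sigma} \exp(-\sigma X_i'\beta) \\
&\quad + \sigma \lambda_{11} \sum_{i \in F} X_{ji} (T_{1i}^{\sigma} - T_{2i}^{\sigma}) \exp(-\sigma X_i'\beta)
+ \sigma \lambda_{22} \sum_{i \in G} X_{ji} (T_{2i}^{\sigma} - T_{1i}^{\sigma}) \exp(-\sigma X_i'\beta) = 0
\quad \text{for } j = 1, \dots, m
\end{aligned}
$$

The above likelihood equations cannot be solved analytically to obtain explicit expressions for the maximum likelihood estimators (MLEs). However, they can be solved numerically, for example by the Newton-Raphson procedure. The second-order partial derivatives of the log-likelihood function are as follows:

$$
\frac{\partial^2 \log L}{\partial \lambda_1^2} = -\frac{n_1 + n_4}{\lambda_1^2}
\tag{25}
$$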

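$$
\frac{\partial^2 \log L}{\partial \lambda_2^2} = -\frac{n_2 + n_5}{\lambda_2^2}
\tag{26}
$$

$$
\frac{\partial^2 \log L}{\partial \lambda_3^2} = -\frac{n_3}{\lambda_3^2}
\tag{27}
$$

$$
\frac{\partial^2 \log L}{\partial \lambda_{11}^2} = -\frac{n_2}{\lambda_{11}^2}
\tag{28}
$$

$$
\frac{\partial^2 \log L}{\partial \lambda_{22}^2} = -\frac{n_1}{\lambda_{22}^2}
\tag{29}
$$

$$
\begin{aligned}
\frac{\partial^2 \log L}{\partial \sigma^2} ={}& -\frac{2n_1 + 2n_2 + n_3 + n_4 + n_5}{\sigma^2}
- (\lambda_1 + \lambda_2 + \lambda_3) \sum_{i} W_i^{\sigma} \exp(-\sigma X_i'\beta) (\log W_i - X_i'\beta)^2 \\
&- \lambda_{11} \sum_{i \in F} \exp(-\sigma X_i'\beta) \left[ T_{1i}^{\sigma} (\log T_{1i} - X_i'\beta)^2 - T_{2i}^{\sigma} (\log T_{2i} - X_i'\beta)^2 \right] \\
&- \lambda_{22} \sum_{i \in G} \exp(-\sigma X_i'\beta) \left[ T_{2i}^{\sigma} (\log T_{2i} - X_i'\beta)^2 - T_{1i}^{\sigma} (\log T_{1i} - X_i'\beta)^2 \right]
\end{aligned}
\tag{30}
$$

$$
\begin{aligned}
\frac{\partial^2 \log L}{\partial \beta_j \partial \beta_k} ={}& -(\lambda_1 + \lambda_2 + \lambda_3)\, \sigma^2 \sum_{i} W_i^{\sigma} X_{ji} X_{ki} \exp(-\sigma X_i'\beta) \\
&- \lambda_{11} \sigma^2 \sum_{i \in F} (T_{1i}^{\sigma} - T_{2i}^{\sigma}) X_{ji} X_{ki} \exp(-\sigma X_i'\beta) \\
&- \lambda_{22} \sigma^2 \sum_{i \in G} (T_{2i}^{\sigma} - T_{1i}^{\sigma}) X_{ji} X_{ki} \exp(-\sigma X_i'\beta)
\quad \text{for } j, k = 1, \dots, m
\end{aligned}
\tag{31}
$$

$$
\frac{\partial^2 \log L}{\partial \lambda_i \partial \lambda_j}
= \frac{\partial^2 \log L}{\partial \lambda_{ii} \partial \lambda_{jj}}
= \frac{\partial^2 \log L}{\partial \lambda_i \partial \lambda_{jj}} = 0
\quad \text{for } i \neq j,\ ii \neq jj;\ i, j = 1, 2, 3;\ ii, jj = 11, 22
\tag{32}
$$

$$
\frac{\partial^2 \log L}{\partial \sigma \partial \lambda_j} = -\sum_{i} W_i^{\sigma} \exp(-\sigma X_i'\beta) (\log W_i - X_i'\beta)
\quad \text{for } j = 1, 2, 3
\tag{33}
$$

$$
\frac{\partial^2 \log L}{\partial \sigma \partial \lambda_{11}} = -\sum_{i \in F} \exp(-\sigma X_i'\beta) \left[ T_{1i}^{\sigma} (\log T_{1i} - X_i'\beta) - T_{2i}^{\sigma} (\log T_{2i} - X_i'\beta) \right]
\tag{34}
$$

$$
\frac{\partial^2 \log L}{\partial \sigma \partial \lambda_{22}} = -\sum_{i \in G} \exp(-\sigma X_i'\beta) \left[ T_{2i}^{\sigma} (\log T_{2i} - X_i'\beta) - T_{1i}^{\sigma} (\log T_{1i} - X_i'\beta) \right]
\tag{35}
$$

$$
\frac{\partial^2 \log L}{\partial \beta_j \partial \lambda_k} = \sigma \sum_{i} W_i^{\sigma} X_{ji} \exp(-\sigma X_i'\beta)
\quad \text{for } k = 1, 2, 3;\ j = 1, \dots, m
\tag{36}
$$

$$
\frac{\partial^2 \log L}{\partial \beta_j \partial \lambda_{11}} = \sigma \sum_{i \in F} X_{ji} \exp(-\sigma X_i'\beta) \left[ T_{1i}^{\sigma} - T_{2i}^{\sigma} \right]
\quad \text{for } j = 1, \dots, m
\tag{37}
$$

$$
\frac{\partial^2 \log L}{\partial \beta_j \partial \lambda_{22}} = \sigma \sum_{i \in G} X_{ji} \exp(-\sigma X_i'\beta) \left[ T_{2i}^{\sigma} - T_{1i}^{\sigma} \right]
\quad \text{for } j = 1, \dots, m
\tag{38}
$$

$$
\begin{aligned}
\frac{\partial^2 \log L}{\partial \beta_j \partial \sigma} ={}& (\lambda_1 + \lambda_2 + \lambda_3) \sum_{i} W_i^{\sigma} X_{ji} \exp(-\sigma X_i'\beta) \left[ 1 + \sigma (\log W_i - X_i'\beta) \right] \\
&+ \lambda_{11} \sum_{i \in F} X_{ji} \exp(-\sigma X_i'\beta) \left[ T_{1i}^{\sigma} \left[ 1 + \sigma (\log T_{1i} - X_i'\beta) \right] - T_{2i}^{\sigma} \left[ 1 + \sigma (\log T_{2i} - X_i'\beta) \right] \right] \\
&+ \lambda_{22} \sum_{i \in G} X_{ji} \exp(-\sigma X_i'\beta) \left[ T_{2i}^{\sigma} \left[ 1 + \sigma (\log T_{2i} - X_i'\beta) \right] - T_{1i}^{\sigma} \left[ 1 + \sigma (\log T_{1i} - X_i'\beta) \right] \right] \\
&- \sum_{i \in C} X_{ji} - \sum_{i \in D} X_{ji}
\quad \text{for } j = 1, \dots, m
\end{aligned}
\tag{39}
$$

The Fisher information matrix $I$ is of type $(m + 6) \times (m + 6)$ and has the following form:

$$
I = -\mathrm{E}
\begin{pmatrix}
\dfrac{\partial^2 \log L}{\partial \lambda_i \partial \lambda_j} & 0 & \dfrac{\partial^2 \log L}{\partial \lambda_i \partial \sigma} & \dfrac{\partial^2 \log L}{\partial \lambda_i \partial \beta_l} \\
0 & \dfrac{\partial^2 \log L}{\partial \lambda_{ii} \partial \lambda_{jj}} & \dfrac{\partial^2 \log L}{\partial \lambda_{ii} \partial \sigma} & \dfrac{\partial^2 \log L}{\partial \lambda_{ii} \partial \beta_l} \\
\dfrac{\partial^2 \log L}{\partial \sigma \partial \lambda_j} & \dfrac{\partial^2 \log L}{\partial \sigma \partial \lambda_{jj}} & \dfrac{\partial^2 \log L}{\partial \sigma^2} & \dfrac{\partial^2 \log L}{\partial \sigma \partial \beta_l} \\
\dfrac{\partial^2 \log L}{\partial \beta_k \partial \lambda_j} & \dfrac{\partial^2 \log L}{\partial \beta_k \partial \lambda_{jj}} & \dfrac{\partial^2 \log L}{\partial \beta_k \partial \sigma} & \dfrac{\partial^2 \log L}{\partial \beta_k \partial \beta_l}
\end{pmatrix}
\tag{40}
$$

with the second-order partial derivatives given above. The inverse of the Fisher information matrix is the variance-covariance matrix ($\Sigma = I^{-1}$) of the maximum likelihood estimators $\hat{\theta} = (\hat{\lambda}_1, \hat{\lambda}_2, \hat{\lambda}_3, \hat{\lambda}_{11}, \hat{\lambda}_{22}, \hat{\sigma}, \hat{\beta}_1, \dots, \hat{\beta}_m)$ of the distribution parameter $\theta = (\lambda_1, \lambda_2, \lambda_3, \lambda_{11}, \lambda_{22}, \sigma, \beta_1, \dots, \beta_m)$. Thus, the sample statistic

$$
\sqrt{n}\, (\hat{\theta} - \theta)
\tag{41}
$$

follows asymptotically a multivariate normal distribution with mean vector zero and variance-covariance matrix $\Sigma$.

## 4 Test for Regression Coefficients

In order to confirm that certain covariates are relevant, the hypothesis about $\beta$ is put in the form $H_0: \boldsymbol{\beta}_1 = 0$, with $\beta$ partitioned as $\beta = (\boldsymbol{\beta}_1, \boldsymbol{\beta}_2)$, where $\boldsymbol{\beta}_1$ is a $k$-dimensional subvector. To test $H_0$ at significance level $\alpha$ one can use

$$
\Lambda_1 = \hat{\boldsymbol{\beta}}_1' \, \Sigma_{11}^{-1} \, \hat{\boldsymbol{\beta}}_1
\tag{42}
$$

where $\Sigma_{11}$ is the $k \times k$ empirical variance-covariance matrix referring to $\hat{\boldsymbol{\beta}}_1$. Under $H_0$, $\Lambda_1$ follows asymptotically a $\chi^2$-distribution with $k$ degrees of freedom. If the value of $\Lambda_1$ exceeds the corresponding $(1 - \alpha)$-quantile, the null hypothesis $H_0$ can be rejected.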
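The test based on (42) can be sketched numerically as follows. This is a minimal illustration using only the Python standard library, not code from the paper: the helper names (`solve`, `chi2_sf`, `wald_test`) and the numbers in the usage example are hypothetical. Instead of looking up the $(1 - \alpha)$-quantile, the sketch computes the $\chi^2$ upper-tail probability from the series expansion of the incomplete gamma function and rejects when that p-value is below $\alpha$, which is the same decision rule.

```python
import math


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= f * m[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][j] * x[j] for j in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def chi2_sf(x, k):
    """Upper-tail probability P[chi2_k > x] via the lower incomplete gamma series."""
    if x <= 0.0:
        return 1.0
    a, y = k / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-16 * total and n < 100000:
        term *= y / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-y + a * math.log(y) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def wald_test(beta1_hat, sigma11, alpha=0.05):
    """Return (Lambda_1, p-value, reject) for H0: beta_1 = 0 as in (42)."""
    w = solve(sigma11, beta1_hat)                 # Sigma_11^{-1} beta1_hat
    stat = sum(b * wi for b, wi in zip(beta1_hat, w))
    p_value = chi2_sf(stat, len(beta1_hat))
    return stat, p_value, p_value < alpha


if __name__ == "__main__":
    # Hypothetical estimates for k = 2 regression coefficients.
    stat, p, reject = wald_test([0.8, -0.3], [[0.09, 0.01], [0.01, 0.04]])
    print(stat, p, reject)
```

In practice `sigma11` would be the block of $I^{-1}$ belonging to $\hat{\boldsymbol{\beta}}_1$, evaluated at the maximum likelihood estimates.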

## 5 Conclusions

Statistical analysis is often reduced to one dimension because multidimensional models and procedures are hardly available. In this paper, a model and a method are proposed which can be applied to analyse phenomena for which it makes no sense to investigate the aspects of interest one by one using a one-dimensional model. This is the case for paired organs in biometry, or for systems with a number of identical components operating in parallel as hot redundancy in technical equipment.

## References

[1] Freund, J.E. (1961): A bivariate extension of the exponential distribution. Journal of Amer. Statist. Assoc. 56.

[2] Hanagal, D.D. (1992): Some inference results in bivariate exponential distributions based on censored samples. Comm. Statistics, Theory and Methods 21.

[3] Hanagal, D.D. (1992): Some inference results in modified Freund's bivariate exponential distribution. Biometrical Journal 34(6).

[4] Hanagal, D.D. (1996): A multivariate Weibull distribution. Economic Quality Control 11.

[5] Hanagal, D.D. (2004): Parametric bivariate regression analysis based on censored samples. Economic Quality Control 19.

[6] Marshall, A.W. and Olkin, I. (1967): A multivariate exponential distribution. Journal of Amer. Statist. Assoc. 62.

[7] Proschan, F. and Sullo, P. (1974): Estimating the parameters of bivariate exponential distributions in several sampling situations. In: Reliability and Biometry. Eds. F. Proschan and R.J. Serfling, Philadelphia: SIAM.

David D. Hanagal, Department of Statistics, University of Pune, Pune, India