A Bivariate Weibull Regression Model


© Heldermann Verlag, Economic Quality Control, ISSN 0940-5151, Vol 20 (2005), No. 1, 1

David D. Hanagal

Abstract: In this paper, we propose a new bivariate Weibull regression model based on censored samples with common covariates. There are some interesting biometrical situations which motivate the study of a bivariate Weibull regression model of the proposed type. A procedure for obtaining the maximum likelihood estimators for the parameters in the model is derived and a test of significance for the regression parameters is sketched.

Key words: Bivariate Weibull model, parametric regression, survival times.

1 Introduction

We introduce a new bivariate Weibull regression model based on censored samples with common covariates. Freund [1] proposed a bivariate exponential model (BVE), and Proschan-Sullo [7] modified Freund's BVE by allowing simultaneous failures. A modified bivariate Weibull (BVW) model is obtained by a simple transformation of the BVE of Proschan-Sullo [7]. Hanagal [4] proposed a multivariate Weibull distribution which is a generalization of the multivariate exponential model of Marshall-Olkin [6]. We choose the BVW model because it is superior to the BVE model of Proschan-Sullo [7].

There are some situations arising in biometry which motivate the study of a BVW regression model of a particular type. For example, paired organs of an individual (or patient), such as kidneys, eyes or ears, may be regarded as a two-component system. Failure of one organ increases the risk of failure of the other. We assume here that the lifetime of an individual is independent of the lifetimes of the paired organs and use the univariate censoring given by Hanagal [2, 3], since the death of an individual censors the lifetimes of both organs.
Hanagal [5] proposed a bivariate Weibull regression model for the situation described above by extending Marshall-Olkin's bivariate exponential distribution. The covariates may be the age of the patient, the sex of the patient, smoking or alcohol habits, diabetic or non-diabetic condition, specific diseases of the patient, etc. In such situations, treating the lifetimes of the paired organs of every patient as identically distributed BVW is unrealistic and therefore not advisable. Each patient has certain characteristics and, hence, it is necessary to incorporate covariates on which the lifetimes of the paired organs may depend. These covariates represent individual properties of a patient relevant to the lifetimes and, thus, can be assumed to be the same for both organs of a pair, i.e., we may assume common covariates. Unfortunately, at this stage of the investigations there were no real data available for evaluating our proposed model.

In Section 2, we introduce the BVW regression model and in Section 3, we derive estimators for the parameters of the proposed model. In Section 4, we present a test procedure for checking the significance of the regression parameters.

2 Bivariate Weibull Regression Model

The BVE of Freund [1], in the form modified by Proschan-Sullo [7], is given by the following joint probability density function (pdf):

$$
f(x_1,x_2) =
\begin{cases}
\lambda_1\lambda_{22}\, e^{-\lambda_{22}x_2-(\lambda-\lambda_{22})x_1} & \text{for } 0<x_1<x_2<\infty \\
\lambda_2\lambda_{11}\, e^{-\lambda_{11}x_1-(\lambda-\lambda_{11})x_2} & \text{for } 0<x_2<x_1<\infty \\
\lambda_3\, e^{-\lambda x} & \text{for } 0<x_1=x_2=x<\infty
\end{cases}
\tag{1}
$$

where $\lambda=(\lambda_1+\lambda_2+\lambda_3)$ and $\lambda_1,\lambda_2,\lambda_3,\lambda_{11},\lambda_{22}>0$.

By means of the transformations $T_1=X_1^{1/\sigma}$ and $T_2=X_2^{1/\sigma}$, $\sigma>0$, we get a bivariate Weibull model with joint pdf as follows:

$$
f(t_1,t_2) =
\begin{cases}
\lambda_1\lambda_{22}\,\sigma^2 (t_1t_2)^{\sigma-1}\, e^{-\lambda_{22}t_2^{\sigma}-(\lambda-\lambda_{22})t_1^{\sigma}} & \text{for } 0<t_1<t_2<\infty \\
\lambda_2\lambda_{11}\,\sigma^2 (t_1t_2)^{\sigma-1}\, e^{-\lambda_{11}t_1^{\sigma}-(\lambda-\lambda_{11})t_2^{\sigma}} & \text{for } 0<t_2<t_1<\infty \\
\lambda_3\,\sigma t^{\sigma-1}\, e^{-\lambda t^{\sigma}} & \text{for } 0<t_1=t_2=t<\infty
\end{cases}
\tag{2}
$$

As is well known for the BVE of Proschan-Sullo [7], the marginals are weighted combinations of two exponential distributions. In the BVW case, too, the marginals are weighted combinations of two Weibull distributions with the same weights. The minimum of the two lifetimes, $\min(T_1,T_2)$, is Weibull distributed with scale parameter $(\lambda_1+\lambda_2+\lambda_3)$ and shape parameter $\sigma$. The lifetimes $T_1$ and $T_2$ are independent whenever $\lambda_{11}=\lambda_1$ and $\lambda_{22}=\lambda_2$ hold. The probabilities of the three regions are given by

$$
P[T_1<T_2]=\frac{\lambda_1}{\lambda},\qquad
P[T_1>T_2]=\frac{\lambda_2}{\lambda},\qquad
P[T_1=T_2]=\frac{\lambda_3}{\lambda}
\tag{3}
$$

Taking the logarithm of the BVE variates in (1), i.e., $Y_1=\log X_1$ and $Y_2=\log X_2$, we get a bivariate extreme value (BVEV) distribution with pdf given by:

$$
f(y_1,y_2) =
\begin{cases}
\exp\!\left\{y_1+\log\lambda_1+y_2+\log\lambda_{22}-e^{\,y_1+\log(\lambda-\lambda_{22})}-e^{\,y_2+\log\lambda_{22}}\right\} & \text{for } -\infty<y_1<y_2<\infty \\
\exp\!\left\{y_2+\log\lambda_2+y_1+\log\lambda_{11}-e^{\,y_2+\log(\lambda-\lambda_{11})}-e^{\,y_1+\log\lambda_{11}}\right\} & \text{for } -\infty<y_2<y_1<\infty \\
\exp\!\left\{y+\log\lambda_3-e^{\,y+\log\lambda}\right\} & \text{for } -\infty<y_1=y_2=y<\infty
\end{cases}
\tag{4}
$$

The two joint density functions (1) and (4) show that the parameters $(\lambda_1,\lambda_2,\lambda_3,\lambda_{11},\lambda_{22})$ are not proper scale parameters for the BVE of Proschan-Sullo [7]. It follows that the BVE of Proschan-Sullo and the corresponding BVEV do not belong to the location-scale family.

The regression model for the two-component system of interest is given by:

$$
\begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix}
=
\begin{pmatrix} \beta_1'X_1 \\ \beta_2'X_2 \end{pmatrix}
+\frac{1}{\sigma}
\begin{pmatrix} U_1 \\ U_2 \end{pmatrix}
\tag{5}
$$

where $X_1$ and $X_2$ are m-dimensional vectors of regressor variables or covariates, $\beta_1$ and $\beta_2$ are m-dimensional vectors of regression coefficients, and $(U_1,U_2)$ are random variables having the joint density given in (4). For $Y_1=\log T_1$ and $Y_2=\log T_2$, we get

$$
\begin{pmatrix} T_1 \\ T_2 \end{pmatrix}
=
\begin{pmatrix} e^{\beta_1'X_1}\,V_1^{1/\sigma} \\ e^{\beta_2'X_2}\,V_2^{1/\sigma} \end{pmatrix}
\tag{6}
$$

where $(V_1=e^{U_1},\,V_2=e^{U_2})$ is BVE of Proschan-Sullo [7]. Alternatively, we can write

$$
\begin{pmatrix} V_1 \\ V_2 \end{pmatrix}
=
\begin{pmatrix} e^{-\sigma\beta_1'X_1}\,T_1^{\sigma} \\ e^{-\sigma\beta_2'X_2}\,T_2^{\sigma} \end{pmatrix}
\tag{7}
$$

Assuming additionally that both components have not only common covariates but also common regression parameters, i.e., $X_1=X_2=X$ and $\beta_1=\beta_2=\beta$, (7) simplifies to:

$$
\begin{pmatrix} V_1 \\ V_2 \end{pmatrix}
=
\begin{pmatrix} T_1^{\sigma} \\ T_2^{\sigma} \end{pmatrix}
e^{-\sigma\beta'X}
$$

3 Estimation of the Parameters

Suppose that the study is based on an independent sample of size n, and let the ith pair of components have lifetimes $(T_{1i},T_{2i})$ and censoring time $Z_i$. We assume the censoring time $Z$ to be independent of the lifetimes $(T_1,T_2)$. The observed lifetimes associated with the ith pair of organs are given by

$$
(T_{1i},T_{2i}) =
\begin{cases}
(T_{1i},T_{2i}) & \text{for } \max(T_{1i},T_{2i})<Z_i \\
(T_{1i},Z_i) & \text{for } T_{1i}<Z_i<T_{2i} \\
(Z_i,T_{2i}) & \text{for } T_{2i}<Z_i<T_{1i} \\
(Z_i,Z_i) & \text{for } Z_i<\min(T_{1i},T_{2i})
\end{cases}
$$
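Since no real data were available, simulated data may help in getting a feel for the model and the censoring scheme. The following is a minimal sketch, not the author's procedure: it assumes that a BVE pair can be drawn by first generating the time of the first failure (exponential with rate λ1 + λ2 + λ3), then choosing which component failed with the probabilities in (3), and finally adding an exponential residual life with rate λ22 or λ11 for the survivor, which reproduces the density (1). The function names `simulate_pair` and `censor`, and all parameter values in the usage note, are hypothetical.

```python
import math
import random


def simulate_pair(lam1, lam2, lam3, lam11, lam22, sigma, beta, x, rng):
    """Draw one uncensored pair (T1, T2) under common covariates x.

    Assumption: the BVE pair (V1, V2) is built from the first failure
    time plus an exponential residual life of the surviving component.
    """
    lam = lam1 + lam2 + lam3
    w = rng.expovariate(lam)              # time of the first failure (BVE scale)
    u = rng.random() * lam                # picks the region with probs lam_k / lam
    if u < lam1:                          # component 1 fails first
        v1, v2 = w, w + rng.expovariate(lam22)
    elif u < lam1 + lam2:                 # component 2 fails first
        v1, v2 = w + rng.expovariate(lam11), w
    else:                                 # simultaneous failure
        v1 = v2 = w
    scale = math.exp(sum(b * xi for b, xi in zip(beta, x)))   # e^{beta'X}
    # Transformation (6) with common covariates and coefficients.
    return scale * v1 ** (1.0 / sigma), scale * v2 ** (1.0 / sigma)


def censor(t1, t2, z):
    """Apply univariate censoring at z.

    Returns the observed pair and a case label 1..6 matching the
    likelihood contributions f1, f2, f3, f4, f5 and the survival term.
    """
    if max(t1, t2) < z:                   # both failures observed
        case = 1 if t1 < t2 else 2 if t2 < t1 else 3
        return t1, t2, case
    if t1 < z:                            # only the first organ observed to fail
        return t1, z, 4
    if t2 < z:                            # only the second organ observed to fail
        return z, t2, 5
    return z, z, 6                        # both censored
```

For example, `censor(*simulate_pair(1.0, 2.0, 1.0, 3.0, 4.0, 1.5, [0.5], [1.0], random.Random(0)), z=1.0)` returns one observed pair together with its case label; counting the labels over a simulated sample gives the integers n1, ..., n6 that enter the likelihood.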

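The likelihood function of the sample of size n is given by

$$
L=\Big(\prod_{i=1}^{n_1} f_{1,i}\Big)\Big(\prod_{i=1}^{n_2} f_{2,i}\Big)\Big(\prod_{i=1}^{n_3} f_{3,i}\Big)\Big(\prod_{i=1}^{n_4} f_{4,i}\Big)\Big(\prod_{i=1}^{n_5} f_{5,i}\Big)\Big(\prod_{i=1}^{n_6} \bar F_i\Big)
\tag{8}
$$

where

$$
f_{1,i}(t_{1i},t_{2i})=\sigma^2\lambda_1\lambda_{22}(t_{1i}t_{2i})^{\sigma-1}e^{-2\sigma X_i'\beta}
\exp\!\left\{-\left[(\lambda_1+\lambda_2+\lambda_3-\lambda_{22})t_{1i}^{\sigma}+\lambda_{22}t_{2i}^{\sigma}\right]e^{-\sigma X_i'\beta}\right\}
\tag{9}
$$

for $0<t_{1i}<t_{2i}<z_i$,

$$
f_{2,i}(t_{1i},t_{2i})=\sigma^2\lambda_2\lambda_{11}(t_{1i}t_{2i})^{\sigma-1}e^{-2\sigma X_i'\beta}
\exp\!\left\{-\left[(\lambda_1+\lambda_2+\lambda_3-\lambda_{11})t_{2i}^{\sigma}+\lambda_{11}t_{1i}^{\sigma}\right]e^{-\sigma X_i'\beta}\right\}
\tag{10}
$$

for $0<t_{2i}<t_{1i}<z_i$,

$$
f_{3,i}(t_{1i},t_{2i})=\sigma\lambda_3 t_i^{\sigma-1}e^{-\sigma X_i'\beta}
\exp\!\left\{-(\lambda_1+\lambda_2+\lambda_3)t_i^{\sigma}e^{-\sigma X_i'\beta}\right\}
\tag{11}
$$

for $0<t_{1i}=t_{2i}=t_i<z_i$,

$$
\begin{aligned}
f_{4,i}(t_{1i},t_{2i})&=\lim_{\delta t_i\to 0}\frac{P[t_{1i}<T_{1i}<t_{1i}+\delta t_i\mid T_{2i}>z_i]\,P[T_{2i}>z_i]}{\delta t_i}\\
&=\sigma\lambda_1 t_{1i}^{\sigma-1}e^{-\sigma X_i'\beta}
\exp\!\left\{-\left[(\lambda_1+\lambda_2+\lambda_3-\lambda_{22})t_{1i}^{\sigma}+\lambda_{22}z_i^{\sigma}\right]e^{-\sigma X_i'\beta}\right\}
\end{aligned}
\tag{12}
$$

for $0<t_{1i}<z_i<t_{2i}$,

$$
\begin{aligned}
f_{5,i}(t_{1i},t_{2i})&=\lim_{\delta t_i\to 0}\frac{P[t_{2i}<T_{2i}<t_{2i}+\delta t_i\mid T_{1i}>z_i]\,P[T_{1i}>z_i]}{\delta t_i}\\
&=\sigma\lambda_2 t_{2i}^{\sigma-1}e^{-\sigma X_i'\beta}
\exp\!\left\{-\left[(\lambda_1+\lambda_2+\lambda_3-\lambda_{11})t_{2i}^{\sigma}+\lambda_{11}z_i^{\sigma}\right]e^{-\sigma X_i'\beta}\right\}
\end{aligned}
\tag{13}
$$

for $0<t_{2i}<z_i<t_{1i}$, and

$$
\bar F_i(z_i)=P[T_{1i}>z_i,\,T_{2i}>z_i]=\exp\!\left\{-(\lambda_1+\lambda_2+\lambda_3)z_i^{\sigma}e^{-\sigma X_i'\beta}\right\}
\tag{14}
$$

$$
X=(X_1,\ldots,X_m)'
\tag{15}
$$

$$
\beta=(\beta_1,\ldots,\beta_m)'
\tag{16}
$$

Here $X_i$ denotes the covariate vector of the ith pair, with components $X_{ji}$, $j=1,\ldots,m$. The integers $n_1$, $n_2$, $n_3$, $n_4$, $n_5$ and $n_6$ represent the numbers of observations falling in the ranges corresponding to $f_1$, $f_2$, $f_3$, $f_4$, $f_5$ and $\bar F$, respectively. As can be seen from the above formulas, the densities $f_1$ and $f_2$ refer to Lebesgue measure in $R^2$, while $f_3$, $f_4$ and $f_5$ refer to Lebesgue measure in $R^1$.

The logarithm of the likelihood function is given by:

$$
\begin{aligned}
\log L ={}& (2n_1+2n_2+n_3+n_4+n_5)\log\sigma+(n_1+n_4)\log\lambda_1+(n_2+n_5)\log\lambda_2\\
&+n_3\log\lambda_3+n_1\log\lambda_{22}+n_2\log\lambda_{11}
+(\sigma-1)\sum_{i\in A}\log t_{1i}+(\sigma-1)\sum_{i\in B}\log t_{2i}\\
&-\sigma\sum_{i\in C}X_i'\beta-\sigma\sum_{i\in D}X_i'\beta
-(\lambda_1+\lambda_2+\lambda_3)\sum_{i}W_i^{\sigma}\exp(-\sigma X_i'\beta)\\
&-\lambda_{11}\sum_{i\in F}(T_{1i}^{\sigma}-T_{2i}^{\sigma})\exp(-\sigma X_i'\beta)
-\lambda_{22}\sum_{i\in G}(T_{2i}^{\sigma}-T_{1i}^{\sigma})\exp(-\sigma X_i'\beta)
\end{aligned}
\tag{17}
$$

where

$$
\begin{aligned}
A&=\{t_{1i}\mid t_{1i}<z_i\}, & B&=\{t_{2i}\mid t_{2i}<z_i\},\\
C&=\{(t_{1i},t_{2i})\mid 0<t_{1i},t_{2i}<z_i\}, & D&=\{(t_{1i},t_{2i})\mid t_{1i}<z_i \text{ or } t_{2i}<z_i\},\\
F&=\{(T_{1i},T_{2i})\mid T_{2i}<T_{1i}\}, & G&=\{(T_{1i},T_{2i})\mid T_{1i}<T_{2i}\},
\end{aligned}
$$

$$
W_i=\min(T_{1i},T_{2i})
\tag{18}
$$

From (17) the likelihood equations are obtained:

$$
\frac{n_1+n_4}{\lambda_1}-\sum_i W_i^{\sigma}\exp(-\sigma X_i'\beta)=0
\tag{19}
$$

$$
\frac{n_2+n_5}{\lambda_2}-\sum_i W_i^{\sigma}\exp(-\sigma X_i'\beta)=0
\tag{20}
$$

$$
\frac{n_3}{\lambda_3}-\sum_i W_i^{\sigma}\exp(-\sigma X_i'\beta)=0
\tag{21}
$$

$$
\frac{n_2}{\lambda_{11}}-\sum_{i\in F}(T_{1i}^{\sigma}-T_{2i}^{\sigma})\exp(-\sigma X_i'\beta)=0
\tag{22}
$$

$$
\frac{n_1}{\lambda_{22}}-\sum_{i\in G}(T_{2i}^{\sigma}-T_{1i}^{\sigma})\exp(-\sigma X_i'\beta)=0
\tag{23}
$$

$$
\begin{aligned}
&\frac{2n_1+2n_2+n_3+n_4+n_5}{\sigma}+\sum_{i\in A}\log t_{1i}+\sum_{i\in B}\log t_{2i}-\sum_{i\in C}X_i'\beta-\sum_{i\in D}X_i'\beta\\
&\quad-(\lambda_1+\lambda_2+\lambda_3)\sum_i W_i^{\sigma}\exp(-\sigma X_i'\beta)\,[\log W_i-X_i'\beta]\\
&\quad-\lambda_{11}\sum_{i\in F}\left[T_{1i}^{\sigma}(\log T_{1i}-X_i'\beta)-T_{2i}^{\sigma}(\log T_{2i}-X_i'\beta)\right]\exp(-\sigma X_i'\beta)\\
&\quad-\lambda_{22}\sum_{i\in G}\left[T_{2i}^{\sigma}(\log T_{2i}-X_i'\beta)-T_{1i}^{\sigma}(\log T_{1i}-X_i'\beta)\right]\exp(-\sigma X_i'\beta)=0
\end{aligned}
\tag{24}
$$

$$
\begin{aligned}
&-\sigma\sum_{i\in C}X_{ji}-\sigma\sum_{i\in D}X_{ji}+\sigma(\lambda_1+\lambda_2+\lambda_3)\sum_i X_{ji}W_i^{\sigma}\exp(-\sigma X_i'\beta)\\
&\quad+\sigma\lambda_{11}\sum_{i\in F}X_{ji}(T_{1i}^{\sigma}-T_{2i}^{\sigma})\exp(-\sigma X_i'\beta)
+\sigma\lambda_{22}\sum_{i\in G}X_{ji}(T_{2i}^{\sigma}-T_{1i}^{\sigma})\exp(-\sigma X_i'\beta)=0
\end{aligned}
$$

for $j=1,\ldots,m$.

The above likelihood equations cannot be solved analytically to obtain explicit expressions for the maximum likelihood estimators (MLEs). However, they can be solved numerically, for example by the Newton-Raphson procedure. The second order partial derivatives of the log-likelihood function are as follows:

$$
\frac{\partial^2\log L}{\partial\lambda_1^2}=-\frac{n_1+n_4}{\lambda_1^2}
\tag{25}
$$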

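$$
\frac{\partial^2\log L}{\partial\lambda_2^2}=-\frac{n_2+n_5}{\lambda_2^2}
\tag{26}
$$

$$
\frac{\partial^2\log L}{\partial\lambda_3^2}=-\frac{n_3}{\lambda_3^2}
\tag{27}
$$

$$
\frac{\partial^2\log L}{\partial\lambda_{11}^2}=-\frac{n_2}{\lambda_{11}^2}
\tag{28}
$$

$$
\frac{\partial^2\log L}{\partial\lambda_{22}^2}=-\frac{n_1}{\lambda_{22}^2}
\tag{29}
$$

$$
\begin{aligned}
\frac{\partial^2\log L}{\partial\sigma^2}={}&-\frac{2n_1+2n_2+n_3+n_4+n_5}{\sigma^2}
-(\lambda_1+\lambda_2+\lambda_3)\sum_i W_i^{\sigma}\exp(-\sigma X_i'\beta)\,(\log W_i-X_i'\beta)^2\\
&-\lambda_{11}\sum_{i\in F}\exp(-\sigma X_i'\beta)\left[T_{1i}^{\sigma}(\log T_{1i}-X_i'\beta)^2-T_{2i}^{\sigma}(\log T_{2i}-X_i'\beta)^2\right]\\
&-\lambda_{22}\sum_{i\in G}\exp(-\sigma X_i'\beta)\left[T_{2i}^{\sigma}(\log T_{2i}-X_i'\beta)^2-T_{1i}^{\sigma}(\log T_{1i}-X_i'\beta)^2\right]
\end{aligned}
\tag{30}
$$

$$
\begin{aligned}
\frac{\partial^2\log L}{\partial\beta_j\,\partial\beta_k}={}&-(\lambda_1+\lambda_2+\lambda_3)\,\sigma^2\sum_i W_i^{\sigma}X_{ji}X_{ki}\exp(-\sigma X_i'\beta)\\
&-\lambda_{11}\sigma^2\sum_{i\in F}(T_{1i}^{\sigma}-T_{2i}^{\sigma})X_{ji}X_{ki}\exp(-\sigma X_i'\beta)\\
&-\lambda_{22}\sigma^2\sum_{i\in G}(T_{2i}^{\sigma}-T_{1i}^{\sigma})X_{ji}X_{ki}\exp(-\sigma X_i'\beta)
\end{aligned}
\tag{31}
$$

for $j,k=1,\ldots,m$,

$$
\frac{\partial^2\log L}{\partial\lambda_i\,\partial\lambda_j}
=\frac{\partial^2\log L}{\partial\lambda_{ii}\,\partial\lambda_{jj}}
=\frac{\partial^2\log L}{\partial\lambda_i\,\partial\lambda_{jj}}=0
\tag{32}
$$

for $i,j=1,2,3$ with $i\neq j$ and $ii,jj=11,22$ with $ii\neq jj$,

$$
\frac{\partial^2\log L}{\partial\sigma\,\partial\lambda_j}=-\sum_i W_i^{\sigma}\exp(-\sigma X_i'\beta)\,(\log W_i-X_i'\beta)
\quad\text{for } j=1,2,3
\tag{33}
$$

$$
\frac{\partial^2\log L}{\partial\sigma\,\partial\lambda_{11}}=-\sum_{i\in F}\exp(-\sigma X_i'\beta)\left[T_{1i}^{\sigma}(\log T_{1i}-X_i'\beta)-T_{2i}^{\sigma}(\log T_{2i}-X_i'\beta)\right]
\tag{34}
$$

$$
\frac{\partial^2\log L}{\partial\sigma\,\partial\lambda_{22}}=-\sum_{i\in G}\exp(-\sigma X_i'\beta)\left[T_{2i}^{\sigma}(\log T_{2i}-X_i'\beta)-T_{1i}^{\sigma}(\log T_{1i}-X_i'\beta)\right]
\tag{35}
$$

$$
\frac{\partial^2\log L}{\partial\beta_j\,\partial\lambda_k}=\sigma\sum_i W_i^{\sigma}X_{ji}\exp(-\sigma X_i'\beta)
\quad\text{for } k=1,2,3;\ j=1,\ldots,m
\tag{36}
$$

$$
\frac{\partial^2\log L}{\partial\beta_j\,\partial\lambda_{11}}=\sigma\sum_{i\in F}X_{ji}\exp(-\sigma X_i'\beta)\left[T_{1i}^{\sigma}-T_{2i}^{\sigma}\right]
\quad\text{for } j=1,\ldots,m
\tag{37}
$$

$$
\frac{\partial^2\log L}{\partial\beta_j\,\partial\lambda_{22}}=\sigma\sum_{i\in G}X_{ji}\exp(-\sigma X_i'\beta)\left[T_{2i}^{\sigma}-T_{1i}^{\sigma}\right]
\quad\text{for } j=1,\ldots,m
\tag{38}
$$

$$
\begin{aligned}
\frac{\partial^2\log L}{\partial\beta_j\,\partial\sigma}={}&(\lambda_1+\lambda_2+\lambda_3)\sum_i X_{ji}W_i^{\sigma}\exp(-\sigma X_i'\beta)\left[1+\sigma(\log W_i-X_i'\beta)\right]\\
&+\lambda_{11}\sum_{i\in F}X_{ji}\exp(-\sigma X_i'\beta)\left[T_{1i}^{\sigma}\left[1+\sigma(\log T_{1i}-X_i'\beta)\right]-T_{2i}^{\sigma}\left[1+\sigma(\log T_{2i}-X_i'\beta)\right]\right]\\
&+\lambda_{22}\sum_{i\in G}X_{ji}\exp(-\sigma X_i'\beta)\left[T_{2i}^{\sigma}\left[1+\sigma(\log T_{2i}-X_i'\beta)\right]-T_{1i}^{\sigma}\left[1+\sigma(\log T_{1i}-X_i'\beta)\right]\right]\\
&-\sum_{i\in C}X_{ji}-\sum_{i\in D}X_{ji}
\end{aligned}
\tag{39}
$$

for $j=1,\ldots,m$.

The Fisher information matrix $I$ is of type $(m+6)\times(m+6)$ and has the following block form:

$$
I=-\begin{pmatrix}
\dfrac{\partial^2\log L}{\partial\lambda_i\,\partial\lambda_j} & 0 & \dfrac{\partial^2\log L}{\partial\lambda_i\,\partial\sigma} & \dfrac{\partial^2\log L}{\partial\lambda_i\,\partial\beta_k}\\[2ex]
0 & \dfrac{\partial^2\log L}{\partial\lambda_{ii}\,\partial\lambda_{jj}} & \dfrac{\partial^2\log L}{\partial\lambda_{ii}\,\partial\sigma} & \dfrac{\partial^2\log L}{\partial\lambda_{ii}\,\partial\beta_k}\\[2ex]
\dfrac{\partial^2\log L}{\partial\lambda_j\,\partial\sigma} & \dfrac{\partial^2\log L}{\partial\lambda_{jj}\,\partial\sigma} & \dfrac{\partial^2\log L}{\partial\sigma^2} & \dfrac{\partial^2\log L}{\partial\sigma\,\partial\beta_k}\\[2ex]
\dfrac{\partial^2\log L}{\partial\lambda_j\,\partial\beta_l} & \dfrac{\partial^2\log L}{\partial\lambda_{jj}\,\partial\beta_l} & \dfrac{\partial^2\log L}{\partial\sigma\,\partial\beta_l} & \dfrac{\partial^2\log L}{\partial\beta_l\,\partial\beta_k}
\end{pmatrix}
\tag{40}
$$

with the second order partial derivatives given above. The inverse of the Fisher information matrix is the variance-covariance matrix ($\Sigma=I^{-1}$) of the maximum likelihood estimators $\hat{\underline{\lambda}}=(\hat\lambda_1,\hat\lambda_2,\hat\lambda_3,\hat\lambda_{11},\hat\lambda_{22},\hat\sigma,\hat\beta_1,\ldots,\hat\beta_m)$ of the distribution parameter $\underline{\lambda}=(\lambda_1,\lambda_2,\lambda_3,\lambda_{11},\lambda_{22},\sigma,\beta_1,\ldots,\beta_m)$. Thus, the sample statistic

$$
\sqrt{n}\,(\hat{\underline{\lambda}}-\underline{\lambda})
\tag{41}
$$

follows asymptotically a multivariate normal distribution with mean vector zero and variance-covariance matrix $\Sigma$.

4 Test for Regression Coefficients

In order to confirm that certain covariates are relevant, the hypothesis about $\beta$ is put in the form $H_0:\beta_1=0$, with $\beta$ partitioned as $\beta=(\beta_1,\beta_2)$, where $\beta_1$ is a k-dimensional vector. To test $H_0$ at significance level $\alpha$ one can use

$$
\Lambda_1=\hat\beta_1'\,\Sigma_{11}^{-1}\,\hat\beta_1
\tag{42}
$$

where $\Sigma_{11}$ is the $k\times k$ empirical variance-covariance matrix referring to $\hat\beta_1$. Under $H_0$, $\Lambda_1$ follows asymptotically a $\chi^2$-distribution with k degrees of freedom. If the value of $\Lambda_1$ exceeds the corresponding $(1-\alpha)$-quantile, the null hypothesis $H_0$ can be rejected.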
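The test statistic (42) is simple to evaluate once the estimate of β1 and its k × k covariance block are available. The following is a minimal sketch using only the Python standard library; the helper names `solve`, `wald_statistic`, `chi2_sf` and `wald_test` are hypothetical. It assumes that the matrix passed in is the estimated covariance of the estimator on its own scale (for instance the corresponding block of the inverse information matrix), and it computes the χ² tail probability from the standard series for the regularized incomplete gamma function, which is adequate for moderate values of the statistic.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                        # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def wald_statistic(beta1_hat, sigma11):
    """Quadratic form beta1' Sigma11^{-1} beta1 from (42)."""
    y = solve(sigma11, beta1_hat)
    return sum(bi * yi for bi, yi in zip(beta1_hat, y))


def chi2_sf(x, k):
    """Upper tail probability P[chi2_k > x] via the incomplete gamma series."""
    if x <= 0:
        return 1.0
    a, z = k / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-15 * abs(total):
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def wald_test(beta1_hat, sigma11, alpha=0.05):
    """Return the statistic, its asymptotic p-value and the reject decision."""
    stat = wald_statistic(beta1_hat, sigma11)
    p_value = chi2_sf(stat, len(beta1_hat))
    return stat, p_value, p_value < alpha
```

Rejecting when the p-value falls below α is equivalent to the rule stated above, namely rejecting when the statistic exceeds the (1 − α)-quantile of the χ² distribution with k degrees of freedom. For instance, `wald_test([3.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])` gives a statistic of 9 on two degrees of freedom and rejects at the 5% level.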

5 Conclusions

Statistical analysis is often reduced to one dimension because multidimensional models and procedures are hardly available. In this paper, a model and a method are proposed which can be applied for analysing phenomena for which it makes no sense to investigate the aspects of interest one by one using a one-dimensional model. This is the case for paired organs in biometry, or for systems with a number of identical components operating in parallel as hot redundancy in technical equipment.

References

[1] Freund, J.E. (1961): A bivariate extension of the exponential distribution. Journal of Amer. Statist. Assoc. 56, 971-77.

[2] Hanagal, D.D. (1992): Some inference results in bivariate exponential distributions based on censored samples. Comm. Statistics, Theory and Methods 21, 1273-95.

[3] Hanagal, D.D. (1992): Some inference results in modified Freund's bivariate exponential distribution. Biometrical Journal 34(6), 745-56.

[4] Hanagal, D.D. (1996): A multivariate Weibull distribution. Economic Quality Control 11, 193-200.

[5] Hanagal, D.D. (2004): Parametric bivariate regression analysis based on censored samples. Economic Quality Control 19, 83-90.

[6] Marshall, A.W. and Olkin, I. (1967): A multivariate exponential distribution. Journal of Amer. Statist. Assoc. 62, 30-44.

[7] Proschan, F. and Sullo, P. (1974): Estimating the parameters of bivariate exponential distributions in several sampling situations. In Reliability and Biometry. Eds. F. Proschan and R.J. Serfling, Philadelphia: SIAM, 423-40.

David D. Hanagal
Department of Statistics
University of Pune
Pune-411007, India