# Estimation of Conditional Kendall s Tau for Bivariate Interval Censored Data

Save this PDF as:

Size: px
Start display at page:

Download "Estimation of Conditional Kendall s Tau for Bivariate Interval Censored Data"

## Transcription

1 Communications for Statistical Applications and Methods 2015, Vol. 22, No. 6, DOI: Print ISSN / Online ISSN Estimation of Conditional Kendall s Tau for Bivariate Interval Censored Data Yang-Jin Kim 1,a a Department of Statistics, Sookmyung Women s University, Korea Abstract Kendall s tau statistic has been applied to test an association of bivariate random variables. However, incomplete bivariate data with a truncation and a censoring results in incomparable or unorderable pairs. With such a partial information, Tsai (1990 suggested a conditional tau statistic and a test procedure for a quasi independence that was extended to more diverse cases such as double truncation and a semi-competing risk data. In this paper, we also employed a conditional tau statistic to estimate an association of bivariate interval censored data. The suggested method shows a better result in simulation studies than Betensky and Finkelstein s multiple imputation method except a case in cases with strong associations. The association of incubation time and infection time from an AIDS cohort study is estimated as a real data example. Keywords: AIDS, bivariate interval censored data, conditional Kendall s tau, jackknife variance, quasi-independence, unorderable pairs 1. Introduction There are several approaches to measure associations between two variables. Among them, Kendall s tau is commonly used because of rank invariant property and has powerful asymptotic properties such as a U-statistic. With a pair of bivariate data (T 1i, T 2i and (T 1 j, T 2 j, a Kendall s tau is defined as τ = E{sgn((T 1i T 1 j (T 2i T 2 j } where sgn(x = 1 for x > 0, sgn(x = 1 for x < 0, and sgn(x = 0 for x = 0. It is also interpreted as the difference between concordance rate and discordance rate. The value of τ is between 1 and 1. When T 1 and T 2 are independent, the probabilities of concordance and discordance are same and implies τ = 0. For complete data, it is estimated as, 1 i j n a i j b i j ˆτ = ( n, (1.1 2 where a score a i j = 1 if T 1i > T 1 j ; 1 if T 1i < T 1 j and b i j = 1 if T 2i > T 2 j ; 1 if T 2i < T 2 j. Then a i j b i j = 1 for (T 1i T 1 j (T 2i T 2 j > 0 which is a concordant pair and a i j b i j = 1 for (T 1i T 1 j (T 2i T 2 j < 0 which is a discordant pair. Then ˆτ is an unbiased estimator of τ and n 1/2 (ˆτ τ is known to have asymptotically normal distribution (Hoeffding, However, for an incomplete data with censoring and truncation, the estimation of τ has been challenging. For a right censored data, in order to compensate incomparable pairs, Brown et al. (1974 This research was supported by the Sookmyung Women s University research grants Department of Statistics, Sookmyung Women s University, Chengpa-Dong, Yonnsan-Gu, Seoul , Korea. Published 30 November 2015 / journal homepage: c 2015 The Korean Statistical Society, and Korean International Statistical Society. All rights reserved.

2 600 Yang-Jin Kim suggested weighted scores. Weier and Basu (1980 modified the estimator to reflect the admissible information of comparable pairs. Oakes (1982 suggested an estimator using only comparable pair and tested a null hypothesis, H 0 : T 1 T 2. Wang and Wells (2000 utilized an integral form with an estimated bivariate survival function and expressed it as a V-statistic. Hsieh (2010 considered several types of imputed times to replace a censored data and calculated τ estimates based on imputed values. Tsai (1990 has alternatively extended Oakes idea and considered a comparable pair for a left truncation data. In more detail, when T 1 and T 2 are denoted as left truncation time and failure time, respectively, the comparable pair satisfies T 1 < T 2. For this restricted data, he suggested a conditional tau statistic in that uses only partial information. A comparable pair with a left truncation is then defined as Ω i j = { max ( ( } T 1i, T 1 j min T2i, T 2 j (1.2 and a corresponding conditional tau is estimated as sgn (( ( ( T 1i T 1 j T2i T 2 j I Ωi j ˆτ c = I (. Ω i j This restriction also derives a test statistic for a quasi-independence, H 0 : T 1 Q T 2. Martin and Betensky (2005 consider more general truncation schemes and suggest several estimators for a conditional tau. They represented these estimators with U-statistics and derived corresponding variances. For example, the available data includes censoring information when a right censoring data is added to a left truncation, C i, C j, δ i = I(T 2i < C i and δ j = I(T 2 j < C j. Then a comparable or orderable pair (1.2 is redefined with Y i = min(t 2i, C i and Y j = min(t 2 j, C j, A i j = { max ( ( } { T 1i, T 1 j min Yi, Y j δi δ j = 1 δ i sgn ( Y j Y i = 1 δ j sgn ( } Y i Y j = 1 (1.3 and a corresponding conditional tau is estimated by sgn (( ( ( Y i Y j T1i T 1 j I Ai j τ c = I (. A i j They also extended to test a quasi-independence for doubly truncation data and an interval censored data with a truncation. However, the suggested estimators can make poor performance when a true τ gets away from zero. To correct a selection bias from censoring, Lakhal et al. (2009 suggested an IPCW estimator under a bivariate right censoring without truncation. Under a semi-competing risk data, Hsieh and Huang (2015 applied the IPCW technique to estimate a conditional tau. Denote X i and Y i as time to non-terminal event and time to terminal event, respectively. Therefore, X i is censored by Y i but does not censor Y i. Define the observable variables Z i = min(x i, Y i, C i, T i = min(y i, C i, δ Xi = I(X i Y i C i and δ Yi = I(Y i C i. Then a comparable and orderable pair is defined as S i j = { min ( X i, X j min (Yi, Y i } { δ Xi δ X j =1 δ Xi sgn ( Y j Y i =1 δx j sgn ( Y i Y j =1 }, (1.4 and a corresponding conditional tau is ˆτ c,w = sgn (( ( ( Y i Y j Zi Z j I S i j / ˆpi j I (, S i j / ˆpi j

3 Bivariate Interval Censored Data 601 Figure 1: Bivariate interval censored data. where and ˆp i j = [ ˆP(min(C i, C j min(x i, X j ] 2. To estimate ˆP(C > t, the Kaplan-Meier estimator is implemented with (Y i, 1 δ i. Our interest in this paper is to estimate a conditional tau under bivariate interval censored data and test a quasi-independence between two interval censored data. In Section 2, we extend a conditional tau to bivariate interval censored data. In Section 3, finite sample properties of the suggested method are evaluated through some simulation studies and AIDS cohort study is analyzed in Section 4. We remark some issues in Section Conditional Kendall s Taus for Interval Censored Data We propose a new estimator for τ c under bivariate interval censored data where it is more burdensome to calculate both concordance and discordance because of overlapping rectangles and of uncertain orderings of even non-overlapping rectangles. Figure 1 shows a graphic representing bivariate interval censored data with concordant, discordant and incomparable pairs. Top left is a concordant pair, top right is a discordant, bottom left is non-overlapping but incomparable and bottom right shows an overlapping case. There are several approaches to calculate an association for interval censored data. Betensky and Finkelstein (1999 implemented a multiple imputation technique using the estimated bivariate distribution in order to replace unknown bivariate failure times. However, the derived imputed results reproduce rectangles which can result in incomparable pairs. Therefore, these are again excluded from the calculation. As another method, Bogaerts and Lesaffre (2008 utilized the following formulae, τ = 4 F(x, ydf(x, y 1, where a bivariate distribution function was estimated with a smoothing technique based on a normal approximation. They also suggested a cross-hazard function to measure a local association; however, diverse simulation studies were not discussed in their paper. Denote {(L 1 i, R1 i, (L2 i, R2 i, δ 1i, δ 2i, i = 1,..., n} as an observable data consisting of bivariate interval censored data and censoring indicators, respectively. Here, instead of observing failure times T 1i and T 2i, we observe interval censoring times satisfying L 1 i < T 1i < R 1 i and L 2 i < T 2i < R 2 i, respectively. Censoring indicators are also defined as δ 1i = (L 1 i < R 1 i <, that is, δ 1i = 0 if R i1 =. Similar definition is defined with δ 2i. We assume that interval censoring times are independent of both failure times. In order to extend the comparable and orderable pair like (1.2, (1.3 and (1.4 to a bivariate interval censored data, define L 1 i j = max(l 1 i, L1 j, L 2 i j = max(l 2 i, L2 j, R 1 i j = min(r 1 i, R1 j and R 2 i j = min(r 2 i, R2 j. Furthermore, pairwise censoring indicators are defined as δ1 i j = 1 (1 δ 1i(1 δ 1 j and

4 602 Yang-Jin Kim δ 2 i j = 1 (1 δ 2i(1 δ 2 j. Then the comparable and orderable set is a conditional tau is calculated as ˆτ B c = B i j = {( R 1 i j < L 1 i j ( R 2 i j < L 2 i j sgn (( R 1 i j ( L i j 1 R 2 i j L i 2 j I (. B i j } I ( Bi j The jackknife approach is applied since the asymptotic variance of ˆτ B c is difficult to derive, ˆσ 2 c = n 1 n n (ˆτ B(i c ˆτ c B( 2, where ˆτ B(i c 3. Simulation is an estimate based on the data deleting the i th subject and ˆτ B( c = n 1 n ˆτ B(i. In this section, simulation studies are performed to examine the finite sample performance of the suggested estimators. 300 simulation samples were generated and bivariate failure times are generated from Clayton s copula model where the association parameter can be expressed as α = 2τ/(1 τ, that is, (T 1, T 2 has a following joint survival function, S (t 1, t 2 = ( S 1 (t 1 α + S 2 (t 2 α 1 1 α. In order to make an interval censored data for T 1, generate m s u from a uniform distribution U(0, Then set Li 1 = j k=1 u j = ũ j and R 1 i = Li 1 + u = ũ j+1 satisfying Li 1 < T 1 < R 1 i. Here m is set depending on right censoring rate. Similar method is applied to make an interval censored data for T 2. With each setup described above, two sample sizes (n = 100, 200 are considered. In this simulation, we set τ = 0.0, 0.3, 0.5, 0.7. Where, τ = 0 means T 1 and T 2 are independent. We consider two different high right censoring rates since it is related with the amount of orderable pairs. Table 1 shows the simulation result under moderate right censoring rate and Table 2 is one for a high right censoring rate. Under a moderate right censoring rate, the suggested method brings in unbiased estimators and satisfies 95% coverage probabilities (CP at most cases except τ = 0.7. The averaged standard error based on the Jackknife method (SSE = ˆσ c is also close to the empirical standard deviation (SEE. However, a high right censoring case (i.e. a high rate of unorderable pair provides under-estimated results except τ = 0. In particular, a stronger association produces a more underestimated estimation. Table 3 shows biases and MSEs under two different censoring rates that compares the suggested method and the Betensky and Finkelstein s estimator. Under moderate censoring, the suggested estimate has a smaller bias and smaller MSE but at a high right censoring, the conditional tau has larger bias at τ = Data Analysis The suggested method is applied to the datasets from AIDS Clinical Trials Group Protocol 181. A cohort study of 257 individuals with Type A or Type B hemophilia was analyzed by DeGruttola and Lagakos (1989 and Kim et al. (1993 to estimate the distribution and regression coefficient,

5 Bivariate Interval Censored Data 603 Table 1: Estimation of τ B c with moderate right censored data τ ˆτ c B n = 100 n = 200 SSE(ˆτ c B SEE(ˆτ c B CP ˆτ c B SSE(ˆτ c B SEE(ˆτ c B CP Table 2: Estimation of τ B c with high right censored data τ ˆτ c B n = 100 n = 200 SSE(ˆτ c B SEE(ˆτ c B CP ˆτ c B SSE(ˆτ c B SEE(ˆτ c B CP Table 3: Bias and MSE of τ B c and τ m at moderate right and high right censored case Moderate right censored High right censored τ = 0.0 τ = 0.3 τ = 0.5 Bias MSE Bias MSE Bias MSE ˆτ c B ˆτ m ˆτ c B ˆτ m respectively. As the application of the suggested method, we investigate the association between HIV infection time and AIDS incubation time (or diagnosis time using a null hypothesis, H 0 : τ c = 0. HIV infection is determined from a blood test; therefore, the exact HIV infection time is unavailable and is observed with an interval form such as XL X XR. Here, XL is the last inspection time with negative infection and XR is the first inspection time with positive infection, respectively. Therefore, a corresponding incubation time is also an interval censored with (T L, TR where T denotes an AIDS diagnosis time, then T L = T XR and TR = T XL, respectively. Among heavily treated 97 HIV positive patients, 29 patients got AIDS positive results. When applying the suggested method, a conditional tau τ B c = (se = ˆσ c = is estimated and a corresponding p-value = is derived based on an asymptotic normality. It indicates that infection time and incubation time are negatively correlated and patients with an early infection time have a higher chance to extend the AIDS diagnosis time than patients with a late infection time. 5. Discussion In this paper, we investigate an association measure for bivariate interval censored data. However, incomparable pairs occur and an ordinary tau statistic is not suitable due to the existence of both overlapping rectangles and uncertain ordering for even non-overlapping rectangles. Tsai (1990 suggested a quasi-independence based on the comparable set for truncated data and Martin and Betensky (2005 applied it to data with doubly truncation and interval censored with truncation. We defined an orderable set and calculated a conditional tau by extending their method to bivariate interval censored data. According to simulation results, the suggested method provides unbiased result under a moderate right censoring and a jackknife variance estimator works well for several tau values. However, extra simulation study shows the conditional tau estimator results in under-estimation under a high right

6 604 Yang-Jin Kim censoring. In order to solve this problem, we can consider a weighting method following Hsieh and Huang (2015 who considered a semi-competing risk data. However, while their weight is based on the distribution of right censoring times, our case is not so simple since interval censoring data occurs with observation process; consequently, modeling such process requires several assumptions and models. Finkelstein et al. (2002 derived a joint likelihood of observation process and event process and applied an EM algorithm in order to estimate related parameters. Therefore, the estimated observation process can be implemented to reflect a high right censoring rate. Another alternative weighting method is related with the recovery of the loss of information. Bivariate interval censoring data has pairs with overlapping rectangles and uncertain orderings of even non-overlapping rectangles. Therefore, it would bring a more efficient result by assigning some weight instead of dropping these pairs from the estimation procedure. We also consider to extend Brown et al. (1974 s method. References Betensky, R. and Finkelstein, D. F. (1999. An extension of Kendall s coefficient of concordance to bivariate interval censored data, Statistics in Medicine, 18, Bogaerts, K. and Lesaffre, E. (2008. Estimating local and global measures of association for bivariate interval censored data with a smooth estimate of density, Statistics in Medicine, 27, Brown, B. W., Hollander, M. and Korwar, R. M. (1974. Nonparametric Tests of Independence for Censored Data, with Applications to Heart Transplant Studies, Reliability and Biometry, DeGruttola, V. and Lagakos, S. W. (1989. Analysis of doubly-censored survival data, with application to AIDS, Biometrics, 45, Finkelstein, D. M., Giggins, W. B. and Schoenfeld, D. A. (2002. Analysis of failure time data with dependent interval censoring, Biometrics, 58, Hoeffding, W. (1948. A class of statistics with asymptotic normal distributions, The Annals of Mathematical Statistics, 19, Hsieh, J-J. (2010. Estimation of Kendall s tau from censored data, Computational Statistics and Data Analysis, 54, Hsieh, J. J. and Huang, W. C. (2015. Nonparametric estimation and test of conditional Kendall s tau under semi-competing risk data and truncated data, Journal of Applied Statistics, 42, Kim, M. Y., De Gruttola, V. and Lagakos, S. W. (1993. Analyzing doubly censored data with covariates, with application to AIDS, Biometrics, 49, Lakhal, L., Rivest, L. P. and Beaudoin, D. (2009. IPCW estimator for Kendall s tau under bivariate censoring, International Journal of Biostatistics, 5, Martin, E. C. and Betensky, R. A. (2005. Testing quasi-independence of failure and truncation times bias conditional Kendall s tau, Journal of American Statistical Association, 100, Oakes, D. (1982. A concordance test for independence in the presence of censoring, Biometrics, 38, Tsai, T. W (1990. Testing the assumption of independence of truncation time and failure time, Biometrika, 77, Wang, W. and Wells, M. (2000. Estimation of Kendall s tau under censoring, Statistica Sinica, 10, Weier, D. R. and Basu, A. P. (1980. An investigation of Kendall s τ modified for censored data with applications, Journal of Statistics and Planning Inference, 4, Received July 13, 2015; Revised October 2, 2015; Accepted October 6, 2015

### AFT Models and Empirical Likelihood

AFT Models and Empirical Likelihood Mai Zhou Department of Statistics, University of Kentucky Collaborators: Gang Li (UCLA); A. Bathke; M. Kim (Kentucky) Accelerated Failure Time (AFT) models: Y = log(t

### A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky

A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky Empirical likelihood with right censored data were studied by Thomas and Grunkmier (1975), Li (1995),

### Semiparametric maximum likelihood estimation in normal transformation models for bivariate survival data

Biometrika (28), 95, 4,pp. 947 96 C 28 Biometrika Trust Printed in Great Britain doi: 1.193/biomet/asn49 Semiparametric maximum likelihood estimation in normal transformation models for bivariate survival

### SAMPLE SIZE ESTIMATION FOR SURVIVAL OUTCOMES IN CLUSTER-RANDOMIZED STUDIES WITH SMALL CLUSTER SIZES BIOMETRICS (JUNE 2000)

SAMPLE SIZE ESTIMATION FOR SURVIVAL OUTCOMES IN CLUSTER-RANDOMIZED STUDIES WITH SMALL CLUSTER SIZES BIOMETRICS (JUNE 2000) AMITA K. MANATUNGA THE ROLLINS SCHOOL OF PUBLIC HEALTH OF EMORY UNIVERSITY SHANDE

### Multivariate Survival Data With Censoring.

1 Multivariate Survival Data With Censoring. Shulamith Gross and Catherine Huber-Carol Baruch College of the City University of New York, Dept of Statistics and CIS, Box 11-220, 1 Baruch way, 10010 NY.

### Harvard University. Harvard University Biostatistics Working Paper Series. Survival Analysis with Change Point Hazard Functions

Harvard University Harvard University Biostatistics Working Paper Series Year 2006 Paper 40 Survival Analysis with Change Point Hazard Functions Melody S. Goodman Yi Li Ram C. Tiwari Harvard University,

### Multivariate Assays With Values Below the Lower Limit of Quantitation: Parametric Estimation By Imputation and Maximum Likelihood

Multivariate Assays With Values Below the Lower Limit of Quantitation: Parametric Estimation By Imputation and Maximum Likelihood Robert E. Johnson and Heather J. Hoffman 2* Department of Biostatistics,

### Frailty Models and Copulas: Similarities and Differences

Frailty Models and Copulas: Similarities and Differences KLARA GOETHALS, PAUL JANSSEN & LUC DUCHATEAU Department of Physiology and Biometrics, Ghent University, Belgium; Center for Statistics, Hasselt

### Optimal Treatment Regimes for Survival Endpoints from a Classification Perspective. Anastasios (Butch) Tsiatis and Xiaofei Bai

Optimal Treatment Regimes for Survival Endpoints from a Classification Perspective Anastasios (Butch) Tsiatis and Xiaofei Bai Department of Statistics North Carolina State University 1/35 Optimal Treatment

### IDR: Irreproducible discovery rate

IDR: Irreproducible discovery rate Sündüz Keleş Department of Statistics Department of Biostatistics and Medical Informatics University of Wisconsin, Madison April 18, 2017 Stat 877 (Spring 17) 04/11-04/18

### Model Selection Procedures

Model Selection Procedures Statistics 135 Autumn 2005 Copyright c 2005 by Mark E. Irwin Model Selection Procedures Consider a regression setting with K potential predictor variables and you wish to explore

### Continuous Time Survival in Latent Variable Models

Continuous Time Survival in Latent Variable Models Tihomir Asparouhov 1, Katherine Masyn 2, Bengt Muthen 3 Muthen & Muthen 1 University of California, Davis 2 University of California, Los Angeles 3 Abstract

### Survival Distributions, Hazard Functions, Cumulative Hazards

BIO 244: Unit 1 Survival Distributions, Hazard Functions, Cumulative Hazards 1.1 Definitions: The goals of this unit are to introduce notation, discuss ways of probabilistically describing the distribution

### Topics in Current Status Data. Karen Michelle McKeown. A dissertation submitted in partial satisfaction of the. requirements for the degree of

Topics in Current Status Data by Karen Michelle McKeown A dissertation submitted in partial satisfaction of the requirements for the degree of Doctor in Philosophy in Biostatistics in the Graduate Division

### Ef ciency considerations in the additive hazards model with current status data

367 Statistica Neerlandica (2001) Vol. 55, nr. 3, pp. 367±376 Ef ciency considerations in the additive hazards model with current status data D. Ghosh 1 Department of Biostatics, University of Michigan,

### Extensions of Cox Model for Non-Proportional Hazards Purpose

PhUSE 2013 Paper SP07 Extensions of Cox Model for Non-Proportional Hazards Purpose Jadwiga Borucka, PAREXEL, Warsaw, Poland ABSTRACT Cox proportional hazard model is one of the most common methods used

### Testing Statistical Hypotheses

E.L. Lehmann Joseph P. Romano Testing Statistical Hypotheses Third Edition 4y Springer Preface vii I Small-Sample Theory 1 1 The General Decision Problem 3 1.1 Statistical Inference and Statistical Decisions

### Testing Statistical Hypotheses

E.L. Lehmann Joseph P. Romano, 02LEu1 ttd ~Lt~S Testing Statistical Hypotheses Third Edition With 6 Illustrations ~Springer 2 The Probability Background 28 2.1 Probability and Measure 28 2.2 Integration.........

### Duke University. Duke Biostatistics and Bioinformatics (B&B) Working Paper Series. Randomized Phase II Clinical Trials using Fisher s Exact Test

Duke University Duke Biostatistics and Bioinformatics (B&B) Working Paper Series Year 2010 Paper 7 Randomized Phase II Clinical Trials using Fisher s Exact Test Sin-Ho Jung sinho.jung@duke.edu This working

### Latent Variable Model for Weight Gain Prevention Data with Informative Intermittent Missingness

Journal of Modern Applied Statistical Methods Volume 15 Issue 2 Article 36 11-1-2016 Latent Variable Model for Weight Gain Prevention Data with Informative Intermittent Missingness Li Qin Yale University,

### Resampling and the Bootstrap

Resampling and the Bootstrap Axel Benner Biostatistics, German Cancer Research Center INF 280, D-69120 Heidelberg benner@dkfz.de Resampling and the Bootstrap 2 Topics Estimation and Statistical Testing

### Some Monte Carlo Evidence for Adaptive Estimation of Unit-Time Varying Heteroscedastic Panel Data Models

Some Monte Carlo Evidence for Adaptive Estimation of Unit-Time Varying Heteroscedastic Panel Data Models G. R. Pasha Department of Statistics, Bahauddin Zakariya University Multan, Pakistan E-mail: drpasha@bzu.edu.pk

### Multilevel Statistical Models: 3 rd edition, 2003 Contents

Multilevel Statistical Models: 3 rd edition, 2003 Contents Preface Acknowledgements Notation Two and three level models. A general classification notation and diagram Glossary Chapter 1 An introduction

### A COMPARISON OF HETEROSCEDASTICITY ROBUST STANDARD ERRORS AND NONPARAMETRIC GENERALIZED LEAST SQUARES

A COMPARISON OF HETEROSCEDASTICITY ROBUST STANDARD ERRORS AND NONPARAMETRIC GENERALIZED LEAST SQUARES MICHAEL O HARA AND CHRISTOPHER F. PARMETER Abstract. This paper presents a Monte Carlo comparison of

### CTDL-Positive Stable Frailty Model

CTDL-Positive Stable Frailty Model M. Blagojevic 1, G. MacKenzie 2 1 Department of Mathematics, Keele University, Staffordshire ST5 5BG,UK and 2 Centre of Biostatistics, University of Limerick, Ireland

### A Regression Model for the Copula Graphic Estimator

Discussion Papers in Economics Discussion Paper No. 11/04 A Regression Model for the Copula Graphic Estimator S.M.S. Lo and R.A. Wilke April 2011 2011 DP 11/04 A Regression Model for the Copula Graphic

### BIOS 312: Precision of Statistical Inference

and Power/Sample Size and Standard Errors BIOS 312: of Statistical Inference Chris Slaughter Department of Biostatistics, Vanderbilt University School of Medicine January 3, 2013 Outline Overview and Power/Sample

### Semiparametric Regression Analysis of Bivariate Interval-Censored Data

University of South Carolina Scholar Commons Theses and Dissertations 12-15-2014 Semiparametric Regression Analysis of Bivariate Interval-Censored Data Naichen Wang University of South Carolina - Columbia

### Exam C Solutions Spring 2005

Exam C Solutions Spring 005 Question # The CDF is F( x) = 4 ( + x) Observation (x) F(x) compare to: Maximum difference 0. 0.58 0, 0. 0.58 0.7 0.880 0., 0.4 0.680 0.9 0.93 0.4, 0.6 0.53. 0.949 0.6, 0.8

### The International Journal of Biostatistics

The International Journal of Biostatistics Volume 2, Issue 1 2006 Article 2 Statistical Inference for Variable Importance Mark J. van der Laan, Division of Biostatistics, School of Public Health, University

### On assessing the association for bivariate current status data

Biometrika (2), 87, 4, pp. 879 893 2 Biometrika Trust Printed in Great Britain On assessing the association for bivariate current status data BY WEIJING WANG Institute of Statistics, National Chiao-T ung

### Statistical inference on the penetrances of rare genetic mutations based on a case family design

Biostatistics (2010), 11, 3, pp. 519 532 doi:10.1093/biostatistics/kxq009 Advance Access publication on February 23, 2010 Statistical inference on the penetrances of rare genetic mutations based on a case

### Chapter 3: Maximum Likelihood Theory

Chapter 3: Maximum Likelihood Theory Florian Pelgrin HEC September-December, 2010 Florian Pelgrin (HEC) Maximum Likelihood Theory September-December, 2010 1 / 40 1 Introduction Example 2 Maximum likelihood

### CURE MODEL WITH CURRENT STATUS DATA

Statistica Sinica 19 (2009), 233-249 CURE MODEL WITH CURRENT STATUS DATA Shuangge Ma Yale University Abstract: Current status data arise when only random censoring time and event status at censoring are

### Analysis of Cure Rate Survival Data Under Proportional Odds Model

Analysis of Cure Rate Survival Data Under Proportional Odds Model Yu Gu 1,, Debajyoti Sinha 1, and Sudipto Banerjee 2, 1 Department of Statistics, Florida State University, Tallahassee, Florida 32310 5608,

### Mixed model analysis of censored longitudinal data with flexible random-effects density

Biostatistics (2012), 13, 1, pp. 61 73 doi:10.1093/biostatistics/kxr026 Advance Access publication on September 13, 2011 Mixed model analysis of censored longitudinal data with flexible random-effects

### Journal of Statistical Software

JSS Journal of Statistical Software January 2011, Volume 38, Issue 2. http://www.jstatsoft.org/ Analyzing Competing Risk Data Using the R timereg Package Thomas H. Scheike University of Copenhagen Mei-Jie

### In Defence of Score Intervals for Proportions and their Differences

In Defence of Score Intervals for Proportions and their Differences Robert G. Newcombe a ; Markku M. Nurminen b a Department of Primary Care & Public Health, Cardiff University, Cardiff, United Kingdom

### Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Lecture 26 (MWF) Tests and CI based on two proportions Suhasini Subba Rao Comparing proportions in

### Compare Predicted Counts between Groups of Zero Truncated Poisson Regression Model based on Recycled Predictions Method

Compare Predicted Counts between Groups of Zero Truncated Poisson Regression Model based on Recycled Predictions Method Yan Wang 1, Michael Ong 2, Honghu Liu 1,2,3 1 Department of Biostatistics, UCLA School

### Nonparametric estimation for current status data with competing risks

Nonparametric estimation for current status data with competing risks Marloes Henriëtte Maathuis A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy

### The Design and Analysis of Benchmark Experiments Part II: Analysis

The Design and Analysis of Benchmark Experiments Part II: Analysis Torsten Hothorn Achim Zeileis Friedrich Leisch Kurt Hornik Friedrich Alexander Universität Erlangen Nürnberg http://www.imbe.med.uni-erlangen.de/~hothorn/

### Analysis of competing risks data and simulation of data following predened subdistribution hazards

Analysis of competing risks data and simulation of data following predened subdistribution hazards Bernhard Haller Institut für Medizinische Statistik und Epidemiologie Technische Universität München 27.05.2013

### 18Ï È² 7( &: ÄuANOVAp.O`û5 571 Based on this ANOVA model representation, Sobol (1993) proposed global sensitivity index, S i1...i s = D i1...i s /D, w

A^VÇÚO 1 Êò 18Ï 2013c12 Chinese Journal of Applied Probability and Statistics Vol.29 No.6 Dec. 2013 Optimal Properties of Orthogonal Arrays Based on ANOVA High-Dimensional Model Representation Chen Xueping

### From semi- to non-parametric inference in general time scale models

From semi- to non-parametric inference in general time scale models Thierry DUCHESNE duchesne@matulavalca Département de mathématiques et de statistique Université Laval Québec, Québec, Canada Research

### Chapter 4 Regression Models

23.August 2010 Chapter 4 Regression Models The target variable T denotes failure time We let x = (x (1),..., x (m) ) represent a vector of available covariates. Also called regression variables, regressors,

### ANALYSIS OF COMPETING RISKS DATA WITH MISSING CAUSE OF FAILURE UNDER ADDITIVE HAZARDS MODEL

Statistica Sinica 18(28, 219-234 ANALYSIS OF COMPETING RISKS DATA WITH MISSING CAUSE OF FAILURE UNDER ADDITIVE HAZARDS MODEL Wenbin Lu and Yu Liang North Carolina State University and SAS Institute Inc.

### Likelihood-based testing and model selection for hazard functions with unknown change-points

Likelihood-based testing and model selection for hazard functions with unknown change-points Matthew R. Williams Dissertation submitted to the Faculty of the Virginia Polytechnic Institute and State University

### Problems with Penalised Maximum Likelihood and Jeffrey s Priors to Account For Separation in Large Datasets with Rare Events

Problems with Penalised Maximum Likelihood and Jeffrey s Priors to Account For Separation in Large Datasets with Rare Events Liam F. McGrath September 15, 215 Abstract When separation is a problem in binary

### ECON3150/4150 Spring 2015

ECON3150/4150 Spring 2015 Lecture 3&4 - The linear regression model Siv-Elisabeth Skjelbred University of Oslo January 29, 2015 1 / 67 Chapter 4 in S&W Section 17.1 in S&W (extended OLS assumptions) 2

### Repeated ordinal measurements: a generalised estimating equation approach

Repeated ordinal measurements: a generalised estimating equation approach David Clayton MRC Biostatistics Unit 5, Shaftesbury Road Cambridge CB2 2BW April 7, 1992 Abstract Cumulative logit and related

### An Analysis of Record Statistics based on an Exponentiated Gumbel Model

Communications for Statistical Applications and Methods 2013, Vol. 20, No. 5, 405 416 DOI: http://dx.doi.org/10.5351/csam.2013.20.5.405 An Analysis of Record Statistics based on an Exponentiated Gumbel

### Understanding the Cox Regression Models with Time-Change Covariates

Understanding the Cox Regression Models with Time-Change Covariates Mai Zhou University of Kentucky The Cox regression model is a cornerstone of modern survival analysis and is widely used in many other

### Reliability Analysis of Repairable Systems with Dependent Component Failures. under Partially Perfect Repair

Reliability Analysis of Repairable Systems with Dependent Component Failures under Partially Perfect Repair Qingyu Yang & Nailong Zhang Department of Industrial and System Engineering Wayne State University

### Replicability and meta-analysis in systematic reviews for medical research

Tel-Aviv University Raymond and Beverly Sackler Faculty of Exact Sciences Replicability and meta-analysis in systematic reviews for medical research Thesis submitted in partial fulfillment of the requirements

### A Method for Inferring Label Sampling Mechanisms in Semi-Supervised Learning

A Method for Inferring Label Sampling Mechanisms in Semi-Supervised Learning Saharon Rosset Data Analytics Research Group IBM T.J. Watson Research Center Yorktown Heights, NY 1598 srosset@us.ibm.com Hui

### arxiv: v1 [math.st] 24 Aug 2017

Submitted to the Annals of Statistics MULTIVARIATE DEPENDENCY MEASURE BASED ON COPULA AND GAUSSIAN KERNEL arxiv:1708.07485v1 math.st] 24 Aug 2017 By AngshumanRoy and Alok Goswami and C. A. Murthy We propose

### Copulas and dependence measurement

Copulas and dependence measurement Thorsten Schmidt. Chemnitz University of Technology, Mathematical Institute, Reichenhainer Str. 41, Chemnitz. thorsten.schmidt@mathematik.tu-chemnitz.de Keywords: copulas,

### Using Estimating Equations for Spatially Correlated A

Using Estimating Equations for Spatially Correlated Areal Data December 8, 2009 Introduction GEEs Spatial Estimating Equations Implementation Simulation Conclusion Typical Problem Assess the relationship

### HETEROSKEDASTICITY, TEMPORAL AND SPATIAL CORRELATION MATTER

ACTA UNIVERSITATIS AGRICULTURAE ET SILVICULTURAE MENDELIANAE BRUNENSIS Volume LXI 239 Number 7, 2013 http://dx.doi.org/10.11118/actaun201361072151 HETEROSKEDASTICITY, TEMPORAL AND SPATIAL CORRELATION MATTER

### PASS Sample Size Software. Poisson Regression

Chapter 870 Introduction Poisson regression is used when the dependent variable is a count. Following the results of Signorini (99), this procedure calculates power and sample size for testing the hypothesis

### Analysis of recurrent event data under the case-crossover design. with applications to elderly falls

STATISTICS IN MEDICINE Statist. Med. 2007; 00:1 22 [Version: 2002/09/18 v1.11] Analysis of recurrent event data under the case-crossover design with applications to elderly falls Xianghua Luo 1,, and Gary

### ST495: Survival Analysis: Maximum likelihood

ST495: Survival Analysis: Maximum likelihood Eric B. Laber Department of Statistics, North Carolina State University February 11, 2014 Everything is deception: seeking the minimum of illusion, keeping

### Sigmaplot di Systat Software

Sigmaplot di Systat Software SigmaPlot Has Extensive Statistical Analysis Features SigmaPlot is now bundled with SigmaStat as an easy-to-use package for complete graphing and data analysis. The statistical

### Numerical Comparisons for Various Estimators of Slope Parameters for Unreplicated Linear Functional Model

Matematika, 2004, Jilid 20, bil., hlm. 9 30 c Jabatan Matematik, UTM. Numerical Comparisons for Various Estimators of Slope Parameters for Unreplicated Linear Functional Model Abdul Ghapor Hussin Centre

### Regression models for multivariate ordered responses via the Plackett distribution

Journal of Multivariate Analysis 99 (2008) 2472 2478 www.elsevier.com/locate/jmva Regression models for multivariate ordered responses via the Plackett distribution A. Forcina a,, V. Dardanoni b a Dipartimento

### A Study on the Correlation of Bivariate And Trivariate Normal Models

Florida International University FIU Digital Commons FIU Electronic Theses and Dissertations University Graduate School 11-1-2013 A Study on the Correlation of Bivariate And Trivariate Normal Models Maria

### Least Squares Estimation-Finite-Sample Properties

Least Squares Estimation-Finite-Sample Properties Ping Yu School of Economics and Finance The University of Hong Kong Ping Yu (HKU) Finite-Sample 1 / 29 Terminology and Assumptions 1 Terminology and Assumptions

### What s New in Econometrics. Lecture 13

What s New in Econometrics Lecture 13 Weak Instruments and Many Instruments Guido Imbens NBER Summer Institute, 2007 Outline 1. Introduction 2. Motivation 3. Weak Instruments 4. Many Weak) Instruments

### Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon

Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon Jianqing Fan Department of Statistics Chinese University of Hong Kong AND Department of Statistics

### Chapter 12 REML and ML Estimation

Chapter 12 REML and ML Estimation C. R. Henderson 1984 - Guelph 1 Iterative MIVQUE The restricted maximum likelihood estimator (REML) of Patterson and Thompson (1971) can be obtained by iterating on MIVQUE,

### A Unified Theory of Empirical Likelihood Confidence Intervals for Survey Data with Unequal Probabilities and Non Negligible Sampling Fractions

A Unified Theory of Empirical Likelihood Confidence Intervals for Survey Data with Unequal Probabilities and Non Negligible Sampling Fractions Y.G. Berger O. De La Riva Torres Abstract We propose a new

### A NEW METHOD FOR APPROXIMATING THE ASYMPTOTIC VARIANCE OF SPEARMAN S RANK CORRELATION

Statistica Sinica 9(1999), 535-558 A NEW METHOD FOR APPROXIMATING THE ASYMPTOTIC VARIANCE OF SPEARMAN S RANK CORRELATION Craig B. Borkowf National Heart, Lung and Blood Institute Abstract: Epidemiologists

### Lecture 14 Simple Linear Regression

Lecture 4 Simple Linear Regression Ordinary Least Squares (OLS) Consider the following simple linear regression model where, for each unit i, Y i is the dependent variable (response). X i is the independent

### Kernel density estimation in R

Kernel density estimation in R Kernel density estimation can be done in R using the density() function in R. The default is a Guassian kernel, but others are possible also. It uses it s own algorithm to

### Nonparametric Regression and Bonferroni joint confidence intervals. Yang Feng

Nonparametric Regression and Bonferroni joint confidence intervals Yang Feng Simultaneous Inferences In chapter 2, we know how to construct confidence interval for β 0 and β 1. If we want a confidence

### 1 Random walks and data

Inference, Models and Simulation for Complex Systems CSCI 7-1 Lecture 7 15 September 11 Prof. Aaron Clauset 1 Random walks and data Supposeyou have some time-series data x 1,x,x 3,...,x T and you want

### Survival models and health sequences

Survival models and health sequences Walter Dempsey University of Michigan July 27, 2015 Survival Data Problem Description Survival data is commonplace in medical studies, consisting of failure time information

### OPTIMIZING SEARCH ENGINES USING CLICKTHROUGH DATA. Paper By: Thorsten Joachims (Cornell University) Presented By: Roy Levin (Technion)

OPTIMIZING SEARCH ENGINES USING CLICKTHROUGH DATA Paper By: Thorsten Joachims (Cornell University) Presented By: Roy Levin (Technion) Outline The idea The model Learning a ranking function Experimental

### Large Sample Properties of Estimators in the Classical Linear Regression Model

Large Sample Properties of Estimators in the Classical Linear Regression Model 7 October 004 A. Statement of the classical linear regression model The classical linear regression model can be written in

### Introduction to Empirical Processes and Semiparametric Inference Lecture 25: Semiparametric Models

Introduction to Empirical Processes and Semiparametric Inference Lecture 25: Semiparametric Models Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations

### Statistics - Lecture One. Outline. Charlotte Wickham 1. Basic ideas about estimation

Statistics - Lecture One Charlotte Wickham wickham@stat.berkeley.edu http://www.stat.berkeley.edu/~wickham/ Outline 1. Basic ideas about estimation 2. Method of Moments 3. Maximum Likelihood 4. Confidence

### arxiv: v1 [stat.ml] 1 Jul 2017

Some methods for heterogeneous treatment effect estimation in high-dimensions arxiv:1707.00102v1 [stat.ml] 1 Jul 2017 Scott Powers, Junyang Qian, Kenneth Jung, Alejandro Schuler, Nigam H. Shah, Trevor

### ingestion of selenium tablets for plasma levels to rise; another explanation may be that selenium aects only initiation not promotion of tumors, so tu

Likelihood Ratio Tests for a Change Point with Survival Data By XIAOLONG LUO Department of Biostatistics, St. Jude Children's Research Hospital, P.O. Box 318, Memphis, Tennessee 38101, U.S.A. BRUCE W.

### Unbiased Estimation. Binomial problem shows general phenomenon. An estimator can be good for some values of θ and bad for others.

Unbiased Estimation Binomial problem shows general phenomenon. An estimator can be good for some values of θ and bad for others. To compare ˆθ and θ, two estimators of θ: Say ˆθ is better than θ if it

### Graybill Conference Poster Session Introductions

Graybill Conference Poster Session Introductions 2013 Graybill Conference in Modern Survey Statistics Colorado State University Fort Collins, CO June 10, 2013 Small Area Estimation with Incomplete Auxiliary

### Comparison between conditional and marginal maximum likelihood for a class of item response models

(1/24) Comparison between conditional and marginal maximum likelihood for a class of item response models Francesco Bartolucci, University of Perugia (IT) Silvia Bacci, University of Perugia (IT) Claudia

### Monitoring risk-adjusted medical outcomes allowing for changes over time

Biostatistics (2014), 15,4,pp. 665 676 doi:10.1093/biostatistics/kxt057 Advance Access publication on December 29, 2013 Monitoring risk-adjusted medical outcomes allowing for changes over time STEFAN H.

### Frailty models. November 30, Programme

Frailty models November 30, 2004 9.30h-10.30h Programme Veronique Rondeau (INSERM, Bordeaux) Title: A three-level nested frailty model for clustered survival data using a penalized likelihood 10.30h-11.00h

### Asymptotic statistics using the Functional Delta Method

Quantiles, Order Statistics and L-Statsitics TU Kaiserslautern 15. Februar 2015 Motivation Functional The delta method introduced in chapter 3 is an useful technique to turn the weak convergence of random

### Modeling with Bivariate Geometric Distributions

Modeling with Bivariate Geometric Distributions JING LI AND SUNIL DHAR Department of Mathematical Sciences, New Jersey Institute of Technology, Newark, New Jersey, USA Abstract We study systems with several

### Generalized Linear Models with Functional Predictors

Generalized Linear Models with Functional Predictors GARETH M. JAMES Marshall School of Business, University of Southern California Abstract In this paper we present a technique for extending generalized

### Regression I: Mean Squared Error and Measuring Quality of Fit

Regression I: Mean Squared Error and Measuring Quality of Fit -Applied Multivariate Analysis- Lecturer: Darren Homrighausen, PhD 1 The Setup Suppose there is a scientific problem we are interested in solving

### Mean squared error matrix comparison of least aquares and Stein-rule estimators for regression coefficients under non-normal disturbances

METRON - International Journal of Statistics 2008, vol. LXVI, n. 3, pp. 285-298 SHALABH HELGE TOUTENBURG CHRISTIAN HEUMANN Mean squared error matrix comparison of least aquares and Stein-rule estimators

### Accounting for Baseline Observations in Randomized Clinical Trials

Accounting for Baseline Observations in Randomized Clinical Trials Scott S Emerson, MD, PhD Department of Biostatistics, University of Washington, Seattle, WA 9895, USA October 6, 0 Abstract In clinical

### The scatterplot is the basic tool for graphically displaying bivariate quantitative data.

Bivariate Data: Graphical Display The scatterplot is the basic tool for graphically displaying bivariate quantitative data. Example: Some investors think that the performance of the stock market in January

### Previous lecture. Single variant association. Use genome-wide SNPs to account for confounding (population substructure)

Previous lecture Single variant association Use genome-wide SNPs to account for confounding (population substructure) Estimation of effect size and winner s curse Meta-Analysis Today s outline P-value

### A Note on Bayesian Inference After Multiple Imputation

A Note on Bayesian Inference After Multiple Imputation Xiang Zhou and Jerome P. Reiter Abstract This article is aimed at practitioners who plan to use Bayesian inference on multiplyimputed datasets in

### Maximally selected chi-square statistics for ordinal variables

bimj header will be provided by the publisher Maximally selected chi-square statistics for ordinal variables Anne-Laure Boulesteix 1 Department of Statistics, University of Munich, Akademiestrasse 1, D-80799