The coxvc_111 package


 Homer Allison
Appendix A The coxvc_111 package

A.1 Introduction

The coxvc_111 package is a set of functions for survival analysis that run under R 2.1.1 [81]. It contains routines to fit Cox models [24] with time-varying effects of the covariates, and reduced-rank models [77]. What makes these two modelling approaches special is that an expanded data set normally has to be created before fitting. This makes the task computationally demanding, since even small data sets explode when all the possible risk sets are stacked together. With coxvc the models can be fitted on the original data, using a very fast and efficient algorithm described in [76]. The package also contains some small utility functions that the authors often use when fitting survival models.

The coxvc package requires the packages MASS, splines and survival [64], which are loaded automatically when you use the package. Please refer to the manuals of those packages for more information. The MASS [102] package is loaded for the command ginv, which is essential for computing the generalized inverse of the information matrix from a reduced-rank model. The splines package is loaded in order to transform some of the covariates when running the models. It is not essential (although the built-in examples of the coxvc package use splines), but it is useful in many applications. Finally, the survival package is the core of the package, since it is needed for creating the survival objects used in our examples.

A.2 Statistical background

The Cox proportional hazards model is the most common method for analyzing survival data. However, its main assumption of proportionality (the hazard ratio of two different cases remains constant regardless of time) is often violated, especially in studies with long follow-up. The most straightforward way to extend the model is to include interactions of the covariates with time functions. A non-proportional Cox model may be written as:

$$h(t \mid X) = h_0(t)\,\exp\{X \Theta F'(t)\} \qquad \text{(A.1)}$$

where $h_0(t)$ is the unspecified baseline hazard, $X$ is a $1 \times p$ matrix of $p$ covariates, $F$ is an $n \times q$ matrix of $q$ time functions (with $F(t)$ its row at time $t$), and $\Theta$ is a $p \times q$ matrix of estimable coefficients.

Perperoglou, le Cessie and van Houwelingen [77] introduced reduced-rank regression to survival analysis with time-varying coefficients. A reduced-rank model requires the matrix of regression coefficients $\Theta$ to be written as the product of two submatrices, $B$ of size $p \times r$ and $\Gamma$ of size $q \times r$, so that $\Theta = B\Gamma'$ is a matrix of reduced rank $r$, smaller than the number of covariates $p$ or the number of time functions $q$. To fit the full model, $r$ must be set equal to $\min(p, q)$, in which case the structure matrix $\Theta$ is of full rank.

This package was created to meet the demand for fitting reduced-rank hazards models quickly and efficiently. For the motivation behind the package, refer to [76]. The new version of the package contains an additional set of small functions that the author has found useful in several cases when analyzing survival data.

A.3 Examples

First load the coxvc library:

> library(coxvc)

The sample data in this library come from a study of ovarian cancer patients [104]. There are 358 patients in total, with information on the following variables:

time   The number of days from enrollment until death or censoring.
death  An indicator of death (1) or censoring (0).
karn   The Karnofsky index, measuring the ability of the patients to perform several tasks.
diam   The diameter of the residual tumor.
figo   The Figo index, denoting the site of the metastasis.
x      Patient id.
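As an aside on the data expansion mentioned in section A.1, the following is a minimal sketch of what stacking the risk sets involves, using only the survival package. It is not coxvc code: the veteran data shipped with survival stands in for ova (an assumption made so the sketch runs without coxvc), and the object names event.times, expanded, n.original and n.expanded are made up for the illustration.

```r
library(survival)

# Stand-in data: 'veteran' ships with survival; ova is only available in coxvc.
# Distinct event times define the intervals at which risk sets change.
event.times <- sort(unique(veteran$time[veteran$status == 1]))

# Split every subject's follow-up at each event time: one row per subject
# and per interval in which that subject is still at risk.
expanded <- survSplit(Surv(time, status) ~ ., data = veteran,
                      cut = event.times, episode = "interval")

# Compare the size of the original and the expanded data set.
n.original <- nrow(veteran)
n.expanded <- nrow(expanded)
c(original = n.original, expanded = n.expanded)
```

A time-varying coefficient model fitted with coxph would need interactions with time functions evaluated on every row of such an expanded data set, which is the cost that fitting on the original data avoids.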
Table A.1: Definitions of variables and patient frequencies (X_k: Karnofsky; X_f: Figo, 0 = III, 1 = IV; X_d: Diameter, Micro to > 5; n: number of patients).

For more information refer to table A.1. First attach the data:

> data(ova)
> attach(ova)

A short summary of the data follows:

> summary(ova)
      time           death            karn            figo
 Min.   : 7.0   Min.   :0.000   Min.   :0.000   Min.   :
 1st Qu.:       1st Qu.:        1st Qu.:        1st Qu.:
 Median :       Median :1.000   Median :1.000   Median :
 Mean   :       Mean   :0.743   Mean   :1.173   Mean   :
 3rd Qu.:       3rd Qu.:        3rd Qu.:        3rd Qu.:
 Max.   :       Max.   :1.000   Max.   :4.000   Max.   :
      diam             x
 Min.   :0.000   Min.   :
 1st Qu.:        1st Qu.:
 Median :3.000   Median :
 Mean   :2.651   Mean   :
 3rd Qu.:        3rd Qu.:
 Max.   :4.000   Max.   :

A simple Cox proportional hazards model can be fitted in the usual way using the coxph command from the survival library:

> fit.ph <- coxph(Surv(time, death) ~ karn + diam + figo)
> fit.ph
Call:
coxph(formula = Surv(time, death) ~ karn + diam + figo)

      coef exp(coef) se(coef) z p
karn                            e-03
diam                            e-05
figo                            e-05

Likelihood ratio test=64.1 on 3 df, p=7.68e-14  n= 358

A test of proportionality based on Schoenfeld residuals [92] reveals that there are in fact deviations from proportional hazards in the data:

> cox.zph(fit.ph)
        rho chisq p
karn
diam
figo
GLOBAL   NA

as indicated by the small global p-value in the output above. A graphical inspection is given by:

> par(mfrow = c(3, 1))
> plot(cox.zph(fit.ph))

The results are shown in figure A.1 and suggest that there may be an interaction of time with the covariates.

Figure A.1: Test of proportionality based on scaled Schoenfeld residuals along with a spline smooth with 90% confidence intervals. (Panels: Beta(t) for karn, diam and figo, each plotted against time.)

A first approach is to fit a full rank model, which includes the full Θ matrix. We transform time using B-splines, so that the F matrix contains a constant, F_1(t) = 1, and cubic B-spline functions on 3 degrees of freedom:

> Ft <- cbind(rep(1, nrow(ova)), bs(time, df = 3))

Then the full rank model is given by:

> fit.r3 <- coxvc(Surv(time, death) ~ karn + diam + figo, Ft, rank = 3,
+     data = ova)
> fit.r3
call:
coxvc(formula = Surv(time, death) ~ karn + diam + figo, Ft = Ft, rank = 3, data = ova)

            coef exp(coef) se(coef) z p
karn
diam
figo
karn:f1(t)
diam:f1(t)
figo:f1(t)
karn:f2(t)
diam:f2(t)
figo:f2(t)
karn:f3(t)
diam:f3(t)
figo:f3(t)

loglikelihood=
algorithm converged in 5 iterations

The class of the object fit.r3 is coxvc. The generic function printcoxvc is included in the package for printing results from the full model. The model has 21 parameters, and in practice the results are identical to those from fitting a coxph model on the expanded data set. However, the fit here took 5 iterations on the original data set, which makes the routine much faster and more efficient.

There are 266 events in total in the ovarian data set. The object fit.r3 also contains the baseline hazard evaluated at these event time points. The function expand.haz can be used to expand either the baseline or the cumulative baseline hazard:

> haz <- fit.r3$hazard
> length(haz)
[1] 266
> haz.exp <- expand.haz(haz, death, fun = "baseline")
> length(haz.exp)
[1] 358

When expanding the baseline hazard, the function assigns a value of zero at the time points of censoring. When expanding a cumulative baseline hazard, it assigns to each censored case the value of the cumulative baseline at the time of the previous event:

> cum.haz <- cumsum(haz)
> cum.haz.exp <- expand.haz(cum.haz, death, fun = "cumulative")

The function plotcoxvc is included in the package to draw figures of the time-varying behavior of the covariates:

> plotcoxvc(fit.r3, fun = "effects", xlab = "time in days")

The same function can also be used for plotting the survival function. Since the object fit.r3 is of class coxvc, plot(survfit(...)) will not produce the survival plot. Instead, the function plotcoxvc can be used:
> plotcoxvc(fit.r3, fun = "survival", xlab = "time in days")

Figure A.2: Estimated effects of the covariates over time, for the full rank model. (Panels: karn, diam and figo against time in days.)

Figure A.3: Survival function for the full rank model. (x-axis: time in days.)

Figure A.2 shows that the estimated time-varying effects of the covariates are too flexible, especially towards the end of the follow-up. We fit a rank=2 model to the data to see whether the fit improves:

> fit.r2 <- coxvc(Surv(time, death) ~ karn + diam + figo, Ft, rank = 2,
+     data = ova)
> fit.r2
call:
coxvc(formula = Surv(time, death) ~ karn + diam + figo, Ft = Ft, rank = 2, data = ova)

            coef exp(coef) se(coef)
karn
diam
figo
karn:f1(t)
diam:f1(t)
figo:f1(t)
karn:f2(t)
diam:f2(t)
figo:f2(t)
karn:f3(t)
diam:f3(t)
figo:f3(t)

loglikelihood= , Rank= 2
algorithm converged in 12 iterations

Beta :              Gamma:
     [,1] [,2]           [,1] [,2]
[1,]                [1,]
[2,]                [2,]
[3,]                [3,]
                    [4,]

> summary(fit.r2)
call:
coxvc(formula = Surv(time, death) ~ karn + diam + figo, Ft = Ft, rank = 2, data = ova)

Beta :              Gamma:
     [,1] [,2]           [,1] [,2]
[1,]                [1,]
[2,]                [2,]
[3,]                [3,]
                    [4,]

The class of fit.r2 is coxrr. For reduced-rank models the generic function print.coxrr prints the estimated coefficients of the model along with their standard errors, as well as the factors B and Γ of the Θ matrix. The function summary.coxrr additionally provides a summary of the B and Γ matrices. We see that the rank=2 model, with 16 parameters in total, gives a more reasonable fit of the covariate effects:

> plotcoxvc(fit.r2, fun = "effects", xlab = "time in days")

Figure A.4: Estimated effects of the covariates over time, for the rank=2 model. (Panels: karn, diam and figo against time in days.)

while the rank=1 model, with 9 free parameters, is much more rigid:

> fit.r1 <- coxvc(Surv(time, death) ~ karn + diam + figo, Ft, rank = 1,
+     data = ova)
> fit.r1
call:
coxvc(formula = Surv(time, death) ~ karn + diam + figo, Ft = Ft, rank = 1, data = ova)

            coef exp(coef) se(coef)
karn
diam
figo
karn:f1(t)
diam:f1(t)
figo:f1(t)
karn:f2(t)
diam:f2(t)
figo:f2(t)
karn:f3(t)
diam:f3(t)
figo:f3(t)

loglikelihood= , Rank= 1
algorithm converged in 5 iterations

Beta :         Gamma:
     [,1]           [,1]
[1,]           [1,]
[2,]           [2,]
[3,]           [3,]
               [4,]

> plotcoxvc(fit.r1, fun = "effects", xlab = "time in days")

Figure A.5: Estimated effects of the covariates over time, for the rank=1 model. (Panels: karn, diam and figo against time in days.)

The package also contains a small function, calc.h0, to compute the baseline hazard from a Cox model, evaluated for a case with all covariate values equal to zero. For example, consider the simple proportional hazards model fit.ph. To get an estimate of the baseline hazard, the function coxph.detail can be used:

> haz.ph <- coxph.detail(fit.ph)$hazard
> haz.ph0 <- calc.h0(fit.ph)

The object haz.ph is the baseline hazard evaluated at the mean value of the covariates, while the object haz.ph0 is the baseline hazard evaluated for all covariate values equal to zero. This can be seen in figure A.6:

> plot(time[death == 1], exp(-cumsum(haz.ph)), ylim = c(0, 1),
+     ylab = "", type = "l")
> lines(time[death == 1], exp(-cumsum(haz.ph0)), col = 2)

Figure A.6: Survival for an average person (black line) and for a person with covariates X = 0. (x-axis: time[death == 1].)
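The relationship between the two baselines can be checked without coxvc. The sketch below is an illustration under stated assumptions, not the package's own code: it uses the veteran data shipped with survival in place of ova, and basehaz() with its centered argument in place of coxph.detail and calc.h0. Note that basehaz() returns the cumulative baseline hazard directly, so no cumsum() is needed. The names fit, H.mean, H.zero and shift are made up for the example.

```r
library(survival)

# Stand-in data and model: 'veteran' ships with survival, ova only with coxvc.
fit <- coxph(Surv(time, status) ~ karno + age, data = veteran)

# Cumulative baseline hazard at the reference (mean) covariate values
# and at all covariate values equal to zero.
H.mean <- basehaz(fit, centered = TRUE)
H.zero <- basehaz(fit, centered = FALSE)

# The two curves differ only by the constant factor exp(-sum(coef * means)).
shift <- sum(coef(fit) * fit$means)
all.equal(H.zero$hazard, H.mean$hazard * exp(-shift))

# Survival curves implied by each baseline: S(t) = exp(-H(t)).
plot(H.mean$time, exp(-H.mean$hazard), type = "s", ylim = c(0, 1),
     xlab = "time in days", ylab = "survival")
lines(H.zero$time, exp(-H.zero$hazard), type = "s", col = 2)
```

Because the two cumulative hazards are proportional, the curve for covariates equal to zero is the curve for the average case raised to a fixed power, which is why the two lines never cross.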
More informationAttributable Risk Function in the Proportional Hazards Model
UW Biostatistics Working Paper Series 5312005 Attributable Risk Function in the Proportional Hazards Model Ying Qing Chen Fred Hutchinson Cancer Research Center, yqchen@u.washington.edu Chengcheng Hu
More informationCOMPUTER SESSION: ARMA PROCESSES
UPPSALA UNIVERSITY Department of Mathematics Jesper Rydén Stationary Stochastic Processes 1MS025 Autumn 2010 COMPUTER SESSION: ARMA PROCESSES 1 Introduction In this computer session, we work within the
More informationModel Fitting. Jean Yves Le Boudec
Model Fitting Jean Yves Le Boudec 0 Contents 1. What is model fitting? 2. Linear Regression 3. Linear regression with norm minimization 4. Choosing a distribution 5. Heavy Tail 1 Virus Infection Data We
More informationThe GLM really is different than OLS, even with a Normally distributed dependent variable, when the link function g is not the identity.
GLM with a Gammadistributed Dependent Variable. 1 Introduction I started out to write about why the Gamma distribution in a GLM is useful. I ve found it difficult to find an example which proves that
More informationIntroductory Statistics with R: Simple Inferences for continuous data
Introductory Statistics with R: Simple Inferences for continuous data Statistical Packages STAT 1301 / 2300, Fall 2014 Sungkyu Jung Department of Statistics University of Pittsburgh Email: sungkyu@pitt.edu
More informationLecture 12: Interactions and Splines
Lecture 12: Interactions and Splines Sandy Eckel seckel@jhsph.edu 12 May 2007 1 Definition Effect Modification The phenomenon in which the relationship between the primary predictor and outcome varies
More informationIntroduction to SAS proc mixed
Faculty of Health Sciences Introduction to SAS proc mixed Analysis of repeated measurements, 2017 Julie Forman Department of Biostatistics, University of Copenhagen 2 / 28 Preparing data for analysis The
More informationSearch for Blazar FluxCorrelated TeV Neutrinos in IceCube 40String Data
Search for Blazar FluxCorrelated TeV Neutrinos in IceCube 40String Data Derek Fox, Colin Turley Fourth AMON Workshop 4 December 2015 Penn State 2 Background and Outline Two models for blazar emission:
More informationLecture 2. October 21, Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University.
Lecture 2 Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University October 21, 2007 1 2 3 4 5 6 Define probability calculus Basic axioms of probability Define
More information3 Results. Part I. 3.1 Base/primary model
3 Results Part I 3.1 Base/primary model For the development of the base/primary population model the development dataset (for data details see Table 5 and sections 2.1 and 2.2), which included 1256 serum
More informationOptimal Treatment Regimes for Survival Endpoints from a Classification Perspective. Anastasios (Butch) Tsiatis and Xiaofei Bai
Optimal Treatment Regimes for Survival Endpoints from a Classification Perspective Anastasios (Butch) Tsiatis and Xiaofei Bai Department of Statistics North Carolina State University 1/35 Optimal Treatment
More informationStatistics 135 Fall 2008 Final Exam
Name: SID: Statistics 135 Fall 2008 Final Exam Show your work. The number of points each question is worth is shown at the beginning of the question. There are 10 problems. 1. [2] The normal equations
More informationS The OverReliance on the Central Limit Theorem
S042008 The OverReliance on the Central Limit Theorem Abstract The objective is to demonstrate the theoretical and practical implication of the central limit theorem. The theorem states that as n approaches
More information0.1 gamma.mixed: Mixed effects gamma regression
0. gamma.mixed: Mixed effects gamma regression Use generalized multilevel linear regression if you have covariates that are grouped according to one or more classification factors. Gamma regression models
More informationHazards, Densities, Repeated Events for Predictive Marketing. Bruce Lund
Hazards, Densities, Repeated Events for Predictive Marketing Bruce Lund 1 A Proposal for Predicting Customer Behavior A Company wants to predict whether its customers will buy a product or obtain service
More informationFailure rate in the continuous sense. Figure. Exponential failure density functions [f(t)] 1
Failure rate (Updated and Adapted from Notes by Dr. A.K. Nema) Part 1: Failure rate is the frequency with which an engineered system or component fails, expressed for example in failures per hour. It is
More informationLab 11. Multilevel Models. Description of Data
Lab 11 Multilevel Models Henian Chen, M.D., Ph.D. Description of Data MULTILEVEL.TXT is clustered data for 386 women distributed across 40 groups. ID: 386 women, id from 1 to 386, individual level (level
More informationHighdimensional Ordinary Leastsquares Projection for Screening Variables
1 / 38 Highdimensional Ordinary Leastsquares Projection for Screening Variables Chenlei Leng Joint with Xiangyu Wang (Duke) Conference on Nonparametric Statistics for Big Data and Celebration to Honor
More information