# Description Syntax for predict Menu for predict Options for predict Remarks and examples Methods and formulas References Also see

Save this PDF as:

Size: px
Start display at page:

Download "Description Syntax for predict Menu for predict Options for predict Remarks and examples Methods and formulas References Also see"

## Transcription

1 Title stata.com stcrreg postestimation Postestimation tools for stcrreg Description Syntax for predict Menu for predict Options for predict Remarks and examples Methods and formulas References Also see Description The following postestimation command is of special interest after stcrreg: Command Description stcurve plot the cumulative subhazard and cumulative incidence functions For information on stcurve, see [ST] stcurve. The following standard postestimation commands are also available: Command Description contrast contrasts and ANOVA-style joint tests of estimates estat ic Akaike s and Schwarz s Bayesian information criteria (AIC and BIC) estat summarize summary statistics for the estimation sample estat vce variance covariance matrix of the estimators (VCE) estimates cataloging estimation results lincom point estimates, standard errors, testing, and inference for linear combinations of coefficients margins marginal means, predictive margins, marginal effects, and average marginal effects marginsplot graph the results from margins (profile plots, interaction plots, etc.) nlcom point estimates, standard errors, testing, and inference for nonlinear combinations of coefficients predict predictions, residuals, influence statistics, and other diagnostic measures predictnl point estimates, standard errors, testing, and inference for generalized predictions pwcompare pairwise comparisons of estimates test Wald tests of simple and composite linear hypotheses testnl Wald tests of nonlinear hypotheses Syntax for predict predict [ type ] newvar [ if ] [ in ] [, sv statistic nooffset ] predict [ type ] { stub* newvarlist } [ if ] [ in ], mv statistic [ partial ] 1

2 2 stcrreg postestimation Postestimation tools for stcrreg sv statistic Description Main shr predicted subhazard ratio, also known as the relative subhazard; the default xb linear prediction x j β stdp standard error of the linear prediction; SE(x j β) basecif baseline cumulative incidence function (CIF) basecshazard baseline cumulative subhazard function kmcensor Kaplan Meier survivor curve for the censoring distribution mv statistic Main scores esr dfbeta schoenfeld Description pseudolikelihood scores efficient score residuals DFBETA measures of influence Schoenfeld residuals Unstarred statistics are available both in and out of sample; type predict... if e(sample)... if wanted only for the estimation sample. Starred statistics are calculated only for the estimation sample, even when if e(sample) is not specified. nooffset is allowed only with unstarred statistics. Menu for predict Statistics > Postestimation > Predictions, residuals, etc. Options for predict Main shr, the default, calculates the relative subhazard (subhazard ratio), that is, the exponentiated linear prediction, exp(x j β). xb calculates the linear prediction from the fitted model. That is, you fit the model by estimating a set of parameters, β 1, β 2,..., β k, and the linear prediction is β 1 x 1j + β 2 x 2j + + β k x kj, often written in matrix notation as x j β. The x 1j, x 2j,..., x kj used in the calculation are obtained from the data currently in memory and do not have to correspond to the data on the independent variables used in estimating β. stdp calculates the standard error of the prediction, that is, the standard error of x j β. basecif calculates the baseline CIF. This is the CIF of the subdistribution for the cause-specific failure process. basecshazard calculates the baseline cumulative subhazard function. This is the cumulative hazard function of the subdistribution for the cause-specific failure process.

3 stcrreg postestimation Postestimation tools for stcrreg 3 kmcensor calculates the Kaplan Meier survivor function for the censoring distribution. These estimates are used to weight within risk pools observations that have experienced a competing event. As such, these values are not predictions or diagnostics in the strict sense, but are provided for those who wish to reproduce the pseudolikelihood calculations performed by stcrreg; see [ST] stcrreg. nooffset is allowed only with shr, xb, and stdp, and is relevant only if you specified offset(varname) for stcrreg. It modifies the calculations made by predict so that they ignore the offset variable; the linear prediction is treated as x j β rather than xj β + offsetj. scores calculates the pseudolikelihood scores for each regressor in the model. These scores are components of the robust estimate of variance. For multiple-record data, by default only one score per subject is calculated and it is placed on the last record for the subject. Adding the partial option will produce partial scores, one for each record within subject; see partial below. Partial pseudolikelihood scores are the additive contributions to a subject s overall pseudolikelihood score. In single-record data, the partial pseudolikelihood scores are the pseudolikelihood scores. One score variable is created for each regressor in the model; the first new variable corresponds to the first regressor, the second to the second, and so on. esr calculates the efficient score residuals for each regressor in the model. Efficient score residuals are diagnostic measures equivalent to pseudolikelihood scores, with the exception that efficient score residuals treat the censoring distribution (that used for weighting) as known rather than estimated. For multiple-record data, by default only one score per subject is calculated and it is placed on the last record for the subject. Adding the partial option will produce partial efficient score residuals, one for each record within subject; see partial below. Partial efficient score residuals are the additive contributions to a subject s overall efficient score residual. In single-record data, the partial efficient scores are the efficient scores. One efficient variable is created for each regressor in the model; the first new variable corresponds to the first regressor, the second to the second, and so on. dfbeta calculates the DFBETA measures of influence for each regressor of in the model. The DFBETA value for a subject estimates the change in the regressor s coefficient due to deletion of that subject. For multiple-record data, by default only one value per subject is calculated and it is placed on the last record for the subject. Adding the partial option will produce partial DFBETAs, one for each record within subject; see partial below. Partial DFBETAs are interpreted as effects due to deletion of individual records rather than deletion of individual subjects. In single-record data, the partial DFBETAs are the DFBETAs. One DFBETA variable is created for each regressor in the model; the first new variable corresponds to the first regressor, the second to the second, and so on. schoenfeld calculates the Schoenfeld-like residuals. Schoenfeld-like residuals are diagnostic measures analogous to Schoenfeld residuals in Cox regression. They compare a failed observation s covariate values to the (weighted) average covariate values for all those at risk at the time of failure. Schoenfeld-like residuals are calculated only for those observations that end in failure; missing values are produced otherwise. One Schoenfeld residual variable is created for each regressor in the model; the first new variable corresponds to the first regressor, the second to the second, and so on.

4 4 stcrreg postestimation Postestimation tools for stcrreg Note: The easiest way to use the preceding four options is, for example,. predict double stub*, scores where stub is a short name of your choosing. Stata then creates variables stub1, stub2, etc. You may also specify each variable name explicitly, in which case there must be as many (and no more) variables specified as there are regressors in the model. partial is relevant only for multiple-record data and is valid with scores, esr, and dfbeta. Specifying partial will produce partial versions of these statistics, where one value is calculated for each record instead of one for each subject. The subjects are determined by the id() option to stset. Specify partial if you wish to perform diagnostics on individual records rather than on individual subjects. For example, a partial DFBETA would be interpreted as the effect on a coefficient due to deletion of one record, rather than the effect due to deletion of all records for a given subject. Remarks and examples stata.com Remarks are presented under the following headings: Baseline functions Null models Measures of influence Baseline functions Example 1: Cervical cancer study In example 1 of [ST] stcrreg, we fit a proportional subhazards model on data from a cervical cancer study.. use (Hypoxia study). stset dftime, failure(failtype == 1) (output omitted ). stcrreg ifp tumsize pelnode, compete(failtype == 2) (output omitted ) Competing-risks regression No. of obs = 109 No. of subjects = 109 Failure event : failtype == 1 No. failed = 33 Competing event: failtype == 2 No. competing = 17 No. censored = 59 Wald chi2(3) = Log pseudolikelihood = Prob > chi2 = Robust _t SHR Std. Err. z P> z [95% Conf. Interval] ifp tumsize pelnode

5 stcrreg postestimation Postestimation tools for stcrreg 5 After fitting the model, we can predict the baseline cumulative subhazard, H 1,0 (t), and the baseline CIF, CIF 1,0 (t):. predict bch, basecsh. predict bcif, basecif. list dftime failtype ifp tumsize pelnode bch bcif in 1/15 dftime failtype ifp tumsize pelnode bch bcif The baseline functions are for subjects who have zero-valued covariates, which in this example are not representative of the data. If baseline is an extreme departure from the covariate patterns in your data, then we recommend recentering your covariates to avoid numerical overflows when predicting baseline functions; see Making baseline reasonable in [ST] stcox postestimation for more details. For our data, baseline is close enough to not cause any numerical problems, but far enough to not be of scientific interest (zero tumor size?). You can transform the baseline functions to those for other covariate patterns according to the relationships H 1 (t) = exp(xβ)h 1,0 (t) and CIF 1 (t) = 1 exp{ exp(xβ)h 1,0 (t)} but it is rare that you will ever have to do that. stcurve will predict, transform, and graph these functions for you. When you use stcurve, you specify the covariate settings, and any you leave unspecified are set at the mean over the data used in the estimation.

6 6 stcrreg postestimation Postestimation tools for stcrreg. stcurve, cif at1(ifp = 5 pelnode = 0) at2(ifp = 20 pelnode = 0) Competing risks regression Cumulative Incidence analysis time ifp = 5 pelnode = 0 ifp = 20 pelnode = 0 Because they were left unspecified, the cumulative incidence curves are for mean tumor size. If you wish to graph cumulative subhazards instead of CIFs, use the stcurve option cumhaz in place of cif. Null models Predicting baseline functions after fitting a null model (one without covariates) yields nonparametric estimates of the cumulative subhazard and the CIF. Example 2: HIV and SI as competing events In example 4 of [ST] stcrreg, we analyzed the incidence of appearance of the SI HIV phenotype, where a diagnosis of AIDS is a competing event. We modeled SI incidence in reference to a genetic mutation indicated by the covariate ccr5. We compared two approaches: one that used stcrreg and assumed that the subhazard of SI was proportional with respect to ccr5 versus one that used stcox and assumed that the cause-specific hazards for both SI and AIDS were each proportional with respect to ccr5. For both approaches, we produced cumulative incidence curves for SI comparing those who did not have the mutation (ccr5==0) to those who did (ccr5==1). To see which approach better fits these data, we now produce cumulative incidence curves that make no model assumption about the effect of ccr5. We do this by fitting null models on the two subsets of the data defined by ccr5 and predicting the baseline CIF for each. Because the models have no covariates, the estimated baseline CIFs are nonparametric estimators.. use clear (HIV and SI as competing risks). stset time, failure(status == 2) // SI is the event of interest (output omitted )

7 stcrreg postestimation Postestimation tools for stcrreg 7. stcrreg if!ccr5, compete(status == 1) noshow // AIDS is the competing event Competing-risks regression No. of obs = 259 No. of subjects = 259 Failure event : status == 2 No. failed = 84 Competing event: status == 1 No. competing = 101 No. censored = 74 Wald chi2(0) = 0.00 Log pseudolikelihood = Prob > chi2 =. Robust _t SHR Std. Err. z P> z [95% Conf. Interval]. predict cif_si_0, basecif (65 missing values generated). label var cif_si_0 "ccr5 = 0". stcrreg if ccr5, compete(status == 1) noshow Competing-risks regression No. of obs = 65 No. of subjects = 65 Failure event : status == 2 No. failed = 23 Competing event: status == 1 No. competing = 12 No. censored = 30 Wald chi2(0) = 0.00 Log pseudolikelihood = Prob > chi2 =. Robust _t SHR Std. Err. z P> z [95% Conf. Interval]. predict cif_si_1, basecif (259 missing values generated). label var cif_si_1 "ccr5 = 1". twoway line cif_si* _t if _t<13, connect(j J) sort yscale(range(0 0.5)) > title(si) ytitle(cumulative Incidence) xtitle(analysis time) SI Cumulative Incidence analysis time ccr5 = 0 ccr5 = 1 After comparing with the graphs produced in [ST] stcrreg, we find that the nonparametric analysis favors the stcox approach over the stcrreg approach.

8 8 stcrreg postestimation Postestimation tools for stcrreg Technical note Predicting the baseline CIF after fitting a null model with stcrreg produces a nonparametric CIF estimator that is asymptotically equivalent, but not exactly equal, to an alternate estimator that is often used; see Coviello and Boggess (2004) for the details of that estimator. The estimator used by predict after stcrreg is a competing-risks extension of the Nelson Aalen estimator (Nelson 1972; Aalen 1978); see Methods and formulas. The other is a competing-risks extension of the Kaplan Meier (1958) estimator. In large samples with many failures, the difference is negligible. Measures of influence With predict after stcrreg, you can obtain pseudolikelihood scores that are used to calculate robust estimates of variance, Schoenfeld residuals that reflect each failure s contribution to the gradient of the log pseudolikelihood, efficient score residuals that represent each subject s (observation s) contribution to the gradient, and DFBETAs that measure the change in coefficients due to deletion of a subject or observation. Example 3: DFBETAs Returning to our cervical cancer study, we obtain DFBETAs for each of the three coefficients in the model and graph those for the first with respect to analysis time.. use clear (Hypoxia study). stset dftime, failure(failtype == 1) (output omitted ). stcrreg ifp tumsize pelnode, compete(failtype == 2) (output omitted ). predict df*, dfbeta. generate obs = _n. twoway scatter df1 dftime, yline(0) mlabel(obs) DFBETA ifp Time from diagnosis to first failure or last follow up (yrs) 84 8

9 stcrreg postestimation Postestimation tools for stcrreg 9 predict created the variables df1, df2, and df3, holding DFBETA values for variables ifp, tumsize, and pelnode, respectively. Based on the graph, we see that subject 4 is the most influential on the coefficient for ifp, the first covariate in the model. In the previous example, we had single-record data. If you have data with multiple records per subject, then by default DFBETAs will be calculated at the subject level, with one value representing each subject and measuring the effect of deleting all records for that subject. If you instead want record-level DFBETAs that measure the change due to deleting single records within subjects, add the partial option; see [ST] stcox postestimation for further details. Methods and formulas Continuing the discussion from Methods and formulas in [ST] stcrreg, the baseline cumulative subhazard function is calculated as Ĥ 1,0 (t) = j:t j t The baseline CIF is ĈIF 1,0 (t) = 1 exp{ Ĥ1,0(t)}. δ j l R j w l π lj exp(z l ) The Kaplan Meier survivor curve for the censoring distribution is Ŝ c (t) = { i 1 γ } ii(t i = t (j) ) r(t (j) ) t (j) <t where t (j) indexes the times at which censorings occur. Both the pseudolikelihood scores, û i, and the efficient score residuals, η i, are as defined previously. DFBETAs are calculated according to Collett (2003): DFBETA i = η i Var ( β) where Var ( β) is the model-based variance estimator, that is, the inverse of the negative Hessian. Schoenfeld residuals are r i = ( r 1i,..., r mi ) with r ki = δ i (x ki a ki ) References Aalen, O. O Nonparametric inference for a family of counting processes. Annals of Statistics 6: Cleves, M. A., W. W. Gould, R. G. Gutierrez, and Y. V. Marchenko An Introduction to Survival Analysis Using Stata. 3rd ed. College Station, TX: Stata Press. Collett, D Modelling Survival Data in Medical Research. 2nd ed. London: Chapman & Hall/CRC. Coviello, V., and M. M. Boggess Cumulative incidence estimation in the presence of competing risks. Stata Journal 4: Fyles, A., M. Milosevic, D. Hedley, M. Pintilie, W. Levin, L. Manchul, and R. P. Hill Tumor hypoxia has independent predictor impact only in patients with node-negative cervix cancer. Journal of Clinical Oncology 20:

10 10 stcrreg postestimation Postestimation tools for stcrreg Geskus, R. B On the inclusion of prevalent cases in HIV/AIDS natural history studies through a marker-based estimate of time since seroconversion. Statistics in Medicine 19: Kaplan, E. L., and P. Meier Nonparametric estimation from incomplete observations. Journal of the American Statistical Association 53: Milosevic, M., A. Fyles, D. Hedley, M. Pintilie, W. Levin, L. Manchul, and R. P. Hill Interstitial fluid pressure predicts survival in patients with cervix cancer independent of clinical prognostic factors and tumor oxygen measurements. Cancer Research 61: Nelson, W Theory and applications of hazard plotting for censored failure data. Technometrics 14: Pintilie, M Competing Risks: A Practical Perspective. Chichester, UK: Wiley. Putter, H., M. Fiocco, and R. B. Geskus Tutorial in biostatistics: Competing risks and multi-state models. Statistics in Medicine 26: Also see [ST] stcrreg Competing-risks regression [ST] stcurve Plot survivor, hazard, cumulative hazard, or cumulative incidence function [U] 20 Estimation and postestimation commands

### Description Syntax for predict Menu for predict Options for predict Remarks and examples Methods and formulas References Also see

Title stata.com logistic postestimation Postestimation tools for logistic Description Syntax for predict Menu for predict Options for predict Remarks and examples Methods and formulas References Also see

### Title. Description. Special-interest postestimation commands. xtmelogit postestimation Postestimation tools for xtmelogit

Title xtmelogit postestimation Postestimation tools for xtmelogit Description The following postestimation commands are of special interest after xtmelogit: Command Description estat group summarize the

### Fractional Polynomials and Model Averaging

Fractional Polynomials and Model Averaging Paul C Lambert Center for Biostatistics and Genetic Epidemiology University of Leicester UK paul.lambert@le.ac.uk Nordic and Baltic Stata Users Group Meeting,

### Chapter 4 Regression Models

23.August 2010 Chapter 4 Regression Models The target variable T denotes failure time We let x = (x (1),..., x (m) ) represent a vector of available covariates. Also called regression variables, regressors,

### The following postestimation commands are of special interest after ca and camat: biplot of row and column points

Title stata.com ca postestimation Postestimation tools for ca and camat Description Syntax for predict Menu for predict Options for predict Syntax for estat Menu for estat Options for estat Remarks and

### Estimation of Conditional Kendall s Tau for Bivariate Interval Censored Data

Communications for Statistical Applications and Methods 2015, Vol. 22, No. 6, 599 604 DOI: http://dx.doi.org/10.5351/csam.2015.22.6.599 Print ISSN 2287-7843 / Online ISSN 2383-4757 Estimation of Conditional

### Marginal Effects in Multiply Imputed Datasets

Marginal Effects in Multiply Imputed Datasets Daniel Klein daniel.klein@uni-kassel.de University of Kassel 14th German Stata Users Group meeting GESIS Cologne June 10, 2016 1 / 25 1 Motivation 2 Marginal

### raise Coef. Std. Err. z P> z [95% Conf. Interval]

1 We will use real-world data, but a very simple and naive model to keep the example easy to understand. What is interesting about the example is that the outcome of interest, perhaps the probability or

### Meta-analysis of epidemiological dose-response studies

Meta-analysis of epidemiological dose-response studies Nicola Orsini 2nd Italian Stata Users Group meeting October 10-11, 2005 Institute Environmental Medicine, Karolinska Institutet Rino Bellocco Dept.

### Lab 3: Two levels Poisson models (taken from Multilevel and Longitudinal Modeling Using Stata, p )

Lab 3: Two levels Poisson models (taken from Multilevel and Longitudinal Modeling Using Stata, p. 376-390) BIO656 2009 Goal: To see if a major health-care reform which took place in 1997 in Germany was

### From the help desk: Comparing areas under receiver operating characteristic curves from two or more probit or logit models

The Stata Journal (2002) 2, Number 3, pp. 301 313 From the help desk: Comparing areas under receiver operating characteristic curves from two or more probit or logit models Mario A. Cleves, Ph.D. Department

### Chapter 1 Statistical Inference

Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations

### Journal of Statistical Software

JSS Journal of Statistical Software January 2011, Volume 38, Issue 2. http://www.jstatsoft.org/ Analyzing Competing Risk Data Using the R timereg Package Thomas H. Scheike University of Copenhagen Mei-Jie

### Vector Autoregressive Model. Vector Autoregressions II. Estimation of Vector Autoregressions II. Estimation of Vector Autoregressions I.

Vector Autoregressive Model Vector Autoregressions II Empirical Macroeconomics - Lect 2 Dr. Ana Beatriz Galvao Queen Mary University of London January 2012 A VAR(p) model of the m 1 vector of time series

### Answers to Problem Set #4

Answers to Problem Set #4 Problems. Suppose that, from a sample of 63 observations, the least squares estimates and the corresponding estimated variance covariance matrix are given by: bβ bβ 2 bβ 3 = 2

### Multistate Modeling and Applications

Multistate Modeling and Applications Yang Yang Department of Statistics University of Michigan, Ann Arbor IBM Research Graduate Student Workshop: Statistics for a Smarter Planet Yang Yang (UM, Ann Arbor)

### The coxvc_1-1-1 package

Appendix A The coxvc_1-1-1 package A.1 Introduction The coxvc_1-1-1 package is a set of functions for survival analysis that run under R2.1.1 [81]. This package contains a set of routines to fit Cox models

### CHOOSING AMONG GENERALIZED LINEAR MODELS APPLIED TO MEDICAL DATA

STATISTICS IN MEDICINE, VOL. 17, 59 68 (1998) CHOOSING AMONG GENERALIZED LINEAR MODELS APPLIED TO MEDICAL DATA J. K. LINDSEY AND B. JONES* Department of Medical Statistics, School of Computing Sciences,

### STAT331. Cox s Proportional Hazards Model

STAT331 Cox s Proportional Hazards Model In this unit we introduce Cox s proportional hazards (Cox s PH) model, give a heuristic development of the partial likelihood function, and discuss adaptations

### Lecture 3. Truncation, length-bias and prevalence sampling

Lecture 3. Truncation, length-bias and prevalence sampling 3.1 Prevalent sampling Statistical techniques for truncated data have been integrated into survival analysis in last two decades. Truncation in

### Müller: Goodness-of-fit criteria for survival data

Müller: Goodness-of-fit criteria for survival data Sonderforschungsbereich 386, Paper 382 (2004) Online unter: http://epub.ub.uni-muenchen.de/ Projektpartner Goodness of fit criteria for survival data

### Statistical Modelling with Stata: Binary Outcomes

Statistical Modelling with Stata: Binary Outcomes Mark Lunt Arthritis Research UK Epidemiology Unit University of Manchester 21/11/2017 Cross-tabulation Exposed Unexposed Total Cases a b a + b Controls

### Measures of Fit from AR(p)

Measures of Fit from AR(p) Residual Sum of Squared Errors Residual Mean Squared Error Root MSE (Standard Error of Regression) R-squared R-bar-squared = = T t e t SSR 1 2 ˆ = = T t e t p T s 1 2 2 ˆ 1 1

### Nonlinear mixed-effects models using Stata

Nonlinear mixed-effects models using Stata Yulia Marchenko Executive Director of Statistics StataCorp LP 2017 German Stata Users Group meeting Yulia Marchenko (StataCorp) 1 / 48 Outline What is NLMEM?

### 8 Analysis of Covariance

8 Analysis of Covariance Let us recall our previous one-way ANOVA problem, where we compared the mean birth weight (weight) for children in three groups defined by the mother s smoking habits. The three

### A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky

A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky Empirical likelihood with right censored data were studied by Thomas and Grunkmier (1975), Li (1995),

### Logistic regression model for survival time analysis using time-varying coefficients

Logistic regression model for survival time analysis using time-varying coefficients Accepted in American Journal of Mathematical and Management Sciences, 2016 Kenichi SATOH ksatoh@hiroshima-u.ac.jp Research

### Tied survival times; estimation of survival probabilities

Tied survival times; estimation of survival probabilities Patrick Breheny November 5 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/22 Introduction Tied survival times Introduction Breslow approximation

### Inference for Binomial Parameters

Inference for Binomial Parameters Dipankar Bandyopadhyay, Ph.D. Department of Biostatistics, Virginia Commonwealth University D. Bandyopadhyay (VCU) BIOS 625: Categorical Data & GLM 1 / 58 Inference for

### Modelling Binary Outcomes 21/11/2017

Modelling Binary Outcomes 21/11/2017 Contents 1 Modelling Binary Outcomes 5 1.1 Cross-tabulation.................................... 5 1.1.1 Measures of Effect............................... 6 1.1.2 Limitations

### Analysis of competing risks data and simulation of data following predened subdistribution hazards

Analysis of competing risks data and simulation of data following predened subdistribution hazards Bernhard Haller Institut für Medizinische Statistik und Epidemiologie Technische Universität München 27.05.2013

### PASS Sample Size Software. Poisson Regression

Chapter 870 Introduction Poisson regression is used when the dependent variable is a count. Following the results of Signorini (99), this procedure calculates power and sample size for testing the hypothesis

### 8. Nonstandard standard error issues 8.1. The bias of robust standard errors

8.1. The bias of robust standard errors Bias Robust standard errors are now easily obtained using e.g. Stata option robust Robust standard errors are preferable to normal standard errors when residuals

### Poisson Regression. Ryan Godwin. ECON University of Manitoba

Poisson Regression Ryan Godwin ECON 7010 - University of Manitoba Abstract. These lecture notes introduce Maximum Likelihood Estimation (MLE) of a Poisson regression model. 1 Motivating the Poisson Regression

### Introduction to SAS proc mixed

Faculty of Health Sciences Introduction to SAS proc mixed Analysis of repeated measurements, 2017 Julie Forman Department of Biostatistics, University of Copenhagen 2 / 28 Preparing data for analysis The

### Regression analysis of interval censored competing risk data using a pseudo-value approach

Communications for Statistical Applications and Methods 2016, Vol. 23, No. 6, 555 562 http://dx.doi.org/10.5351/csam.2016.23.6.555 Print ISSN 2287-7843 / Online ISSN 2383-4757 Regression analysis of interval

### Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features. Yangxin Huang

Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features Yangxin Huang Department of Epidemiology and Biostatistics, COPH, USF, Tampa, FL yhuang@health.usf.edu January

### Multivariable Fractional Polynomials

Multivariable Fractional Polynomials Axel Benner September 7, 2015 Contents 1 Introduction 1 2 Inventory of functions 1 3 Usage in R 2 3.1 Model selection........................................ 3 4 Example

### Frailty Models and Copulas: Similarities and Differences

Frailty Models and Copulas: Similarities and Differences KLARA GOETHALS, PAUL JANSSEN & LUC DUCHATEAU Department of Physiology and Biometrics, Ghent University, Belgium; Center for Statistics, Hasselt

### Thursday Morning. Growth Modelling in Mplus. Using a set of repeated continuous measures of bodyweight

Thursday Morning Growth Modelling in Mplus Using a set of repeated continuous measures of bodyweight 1 Growth modelling Continuous Data Mplus model syntax refresher ALSPAC Confirmatory Factor Analysis

### Model Estimation Example

Ronald H. Heck 1 EDEP 606: Multivariate Methods (S2013) April 7, 2013 Model Estimation Example As we have moved through the course this semester, we have encountered the concept of model estimation. Discussions

### ECON Introductory Econometrics. Lecture 16: Instrumental variables

ECON4150 - Introductory Econometrics Lecture 16: Instrumental variables Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 12 Lecture outline 2 OLS assumptions and when they are violated Instrumental

### Compare Predicted Counts between Groups of Zero Truncated Poisson Regression Model based on Recycled Predictions Method

Compare Predicted Counts between Groups of Zero Truncated Poisson Regression Model based on Recycled Predictions Method Yan Wang 1, Michael Ong 2, Honghu Liu 1,2,3 1 Department of Biostatistics, UCLA School

### Extensions of Cox Model for Non-Proportional Hazards Purpose

PhUSE 2013 Paper SP07 Extensions of Cox Model for Non-Proportional Hazards Purpose Jadwiga Borucka, PAREXEL, Warsaw, Poland ABSTRACT Cox proportional hazard model is one of the most common methods used

### CRE METHODS FOR UNBALANCED PANELS Correlated Random Effects Panel Data Models IZA Summer School in Labor Economics May 13-19, 2013 Jeffrey M.

CRE METHODS FOR UNBALANCED PANELS Correlated Random Effects Panel Data Models IZA Summer School in Labor Economics May 13-19, 2013 Jeffrey M. Wooldridge Michigan State University 1. Introduction 2. Linear

### Title. Syntax. Description. Options. stata.com. forecast estimates Add estimation results to a forecast model

Title stata.com forecast estimates Add estimation results to a forecast model Syntax Description Options Remarks and examples References Also see Syntax Add estimation result currently in memory to model

### Nonrecursive Models Highlights Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised April 6, 2015

Nonrecursive Models Highlights Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised April 6, 2015 This lecture borrows heavily from Duncan s Introduction to Structural

### Ignoring the matching variables in cohort studies - when is it valid, and why?

Ignoring the matching variables in cohort studies - when is it valid, and why? Arvid Sjölander Abstract In observational studies of the effect of an exposure on an outcome, the exposure-outcome association

### Using the same data as before, here is part of the output we get in Stata when we do a logistic regression of Grade on Gpa, Tuce and Psi.

Logistic Regression, Part III: Hypothesis Testing, Comparisons to OLS Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised January 14, 2018 This handout steals heavily

### Unit 2 Regression and Correlation Practice Problems. SOLUTIONS Version STATA

PubHlth 640. Regression and Correlation Page 1 of 19 Unit Regression and Correlation Practice Problems SOLUTIONS Version STATA 1. A regression analysis of measurements of a dependent variable Y on an independent

### Investigating Models with Two or Three Categories

Ronald H. Heck and Lynn N. Tabata 1 Investigating Models with Two or Three Categories For the past few weeks we have been working with discriminant analysis. Let s now see what the same sort of model might

### Nonlinear relationships Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised February 20, 2015

Nonlinear relationships Richard Williams, University of Notre Dame, https://www.nd.edu/~rwilliam/ Last revised February, 5 Sources: Berry & Feldman s Multiple Regression in Practice 985; Pindyck and Rubinfeld

### A NOTE ON ROBUST ESTIMATION IN LOGISTIC REGRESSION MODEL

Discussiones Mathematicae Probability and Statistics 36 206 43 5 doi:0.75/dmps.80 A NOTE ON ROBUST ESTIMATION IN LOGISTIC REGRESSION MODEL Tadeusz Bednarski Wroclaw University e-mail: t.bednarski@prawo.uni.wroc.pl

### Lab 10 - Binary Variables

Lab 10 - Binary Variables Spring 2017 Contents 1 Introduction 1 2 SLR on a Dummy 2 3 MLR with binary independent variables 3 3.1 MLR with a Dummy: different intercepts, same slope................. 4 3.2

### Log-linearity for Cox s regression model. Thesis for the Degree Master of Science

Log-linearity for Cox s regression model Thesis for the Degree Master of Science Zaki Amini Master s Thesis, Spring 2015 i Abstract Cox s regression model is one of the most applied methods in medical

### Analysis of Covariance. The following example illustrates a case where the covariate is affected by the treatments.

Analysis of Covariance In some experiments, the experimental units (subjects) are nonhomogeneous or there is variation in the experimental conditions that are not due to the treatments. For example, a

### Attributable Risk Function in the Proportional Hazards Model

UW Biostatistics Working Paper Series 5-31-2005 Attributable Risk Function in the Proportional Hazards Model Ying Qing Chen Fred Hutchinson Cancer Research Center, yqchen@u.washington.edu Chengcheng Hu

### sociology 362 regression

sociology 36 regression Regression is a means of modeling how the conditional distribution of a response variable (say, Y) varies for different values of one or more independent explanatory variables (say,

### Analysis of repeated measurements (KLMED8008)

Analysis of repeated measurements (KLMED8008) Eirik Skogvoll, MD PhD Professor and Consultant Institute of Circulation and Medical Imaging Dept. of Anaesthesiology and Intensive Care 1 Repeated measurements

### SplineLinear.doc 1 # 9 Last save: Saturday, 9. December 2006

SplineLinear.doc 1 # 9 Problem:... 2 Objective... 2 Reformulate... 2 Wording... 2 Simulating an example... 3 SPSS 13... 4 Substituting the indicator function... 4 SPSS-Syntax... 4 Remark... 4 Result...

### 2. We care about proportion for categorical variable, but average for numerical one.

Probit Model 1. We apply Probit model to Bank data. The dependent variable is deny, a dummy variable equaling one if a mortgage application is denied, and equaling zero if accepted. The key regressor is

### A Regression Model for the Copula Graphic Estimator

Discussion Papers in Economics Discussion Paper No. 11/04 A Regression Model for the Copula Graphic Estimator S.M.S. Lo and R.A. Wilke April 2011 2011 DP 11/04 A Regression Model for the Copula Graphic

### SAMPLE SIZE ESTIMATION FOR SURVIVAL OUTCOMES IN CLUSTER-RANDOMIZED STUDIES WITH SMALL CLUSTER SIZES BIOMETRICS (JUNE 2000)

SAMPLE SIZE ESTIMATION FOR SURVIVAL OUTCOMES IN CLUSTER-RANDOMIZED STUDIES WITH SMALL CLUSTER SIZES BIOMETRICS (JUNE 2000) AMITA K. MANATUNGA THE ROLLINS SCHOOL OF PUBLIC HEALTH OF EMORY UNIVERSITY SHANDE

### Multivariate Survival Data With Censoring.

1 Multivariate Survival Data With Censoring. Shulamith Gross and Catherine Huber-Carol Baruch College of the City University of New York, Dept of Statistics and CIS, Box 11-220, 1 Baruch way, 10010 NY.

### Typical Survival Data Arising From a Clinical Trial. Censoring. The Survivor Function. Mathematical Definitions Introduction

Outline CHL 5225H Advanced Statistical Methods for Clinical Trials: Survival Analysis Prof. Kevin E. Thorpe Defining Survival Data Mathematical Definitions Non-parametric Estimates of Survival Comparing

### Group Comparisons: Differences in Composition Versus Differences in Models and Effects

Group Comparisons: Differences in Composition Versus Differences in Models and Effects Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised February 15, 2015 Overview.

### Categorical and Zero Inflated Growth Models

Categorical and Zero Inflated Growth Models Alan C. Acock* Summer, 2009 *Alan C. Acock, Department of Human Development and Family Sciences, Oregon State University, Corvallis OR 97331 (alan.acock@oregonstate.edu).

### MLMED. User Guide. Nicholas J. Rockwood The Ohio State University Beta Version May, 2017

MLMED User Guide Nicholas J. Rockwood The Ohio State University rockwood.19@osu.edu Beta Version May, 2017 MLmed is a computational macro for SPSS that simplifies the fitting of multilevel mediation and

### A new strategy for meta-analysis of continuous covariates in observational studies with IPD. Willi Sauerbrei & Patrick Royston

A new strategy for meta-analysis of continuous covariates in observational studies with IPD Willi Sauerbrei & Patrick Royston Overview Motivation Continuous variables functional form Fractional polynomials

### Step 2: Select Analyze, Mixed Models, and Linear.

Example 1a. 20 employees were given a mood questionnaire on Monday, Wednesday and again on Friday. The data will be first be analyzed using a Covariance Pattern model. Step 1: Copy Example1.sav data file

### Analysis of Cure Rate Survival Data Under Proportional Odds Model

Analysis of Cure Rate Survival Data Under Proportional Odds Model Yu Gu 1,, Debajyoti Sinha 1, and Sudipto Banerjee 2, 1 Department of Statistics, Florida State University, Tallahassee, Florida 32310 5608,

### Nonparametric rank based estimation of bivariate densities given censored data conditional on marginal probabilities

Hutson Journal of Statistical Distributions and Applications (26 3:9 DOI.86/s4488-6-47-y RESEARCH Open Access Nonparametric rank based estimation of bivariate densities given censored data conditional

### Statistics 262: Intermediate Biostatistics Regression & Survival Analysis

Statistics 262: Intermediate Biostatistics Regression & Survival Analysis Jonathan Taylor & Kristin Cobb Statistics 262: Intermediate Biostatistics p.1/?? Introduction This course is an applied course,

### 11. Generalized Linear Models: An Introduction

Sociology 740 John Fox Lecture Notes 11. Generalized Linear Models: An Introduction Copyright 2014 by John Fox Generalized Linear Models: An Introduction 1 1. Introduction I A synthesis due to Nelder and

### BIOS 312: Precision of Statistical Inference

and Power/Sample Size and Standard Errors BIOS 312: of Statistical Inference Chris Slaughter Department of Biostatistics, Vanderbilt University School of Medicine January 3, 2013 Outline Overview and Power/Sample

### EDF 7405 Advanced Quantitative Methods in Educational Research. Data are available on IQ of the child and seven potential predictors.

EDF 7405 Advanced Quantitative Methods in Educational Research Data are available on IQ of the child and seven potential predictors. Four are medical variables available at the birth of the child: Birthweight

### AFT Models and Empirical Likelihood

AFT Models and Empirical Likelihood Mai Zhou Department of Statistics, University of Kentucky Collaborators: Gang Li (UCLA); A. Bathke; M. Kim (Kentucky) Accelerated Failure Time (AFT) models: Y = log(t

### Handout 12. Endogeneity & Simultaneous Equation Models

Handout 12. Endogeneity & Simultaneous Equation Models In which you learn about another potential source of endogeneity caused by the simultaneous determination of economic variables, and learn how to

### POLI 7050 Spring 2008 March 5, 2008 Unordered Response Models II

POLI 7050 Spring 2008 March 5, 2008 Unordered Response Models II Introduction Today we ll talk about interpreting MNL and CL models. We ll start with general issues of model fit, and then get to variable

### BIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY

BIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY Ingo Langner 1, Ralf Bender 2, Rebecca Lenz-Tönjes 1, Helmut Küchenhoff 2, Maria Blettner 2 1

### Longitudinal Data Analysis Using Stata Paul D. Allison, Ph.D. Upcoming Seminar: May 18-19, 2017, Chicago, Illinois

Longitudinal Data Analysis Using Stata Paul D. Allison, Ph.D. Upcoming Seminar: May 18-19, 217, Chicago, Illinois Outline 1. Opportunities and challenges of panel data. a. Data requirements b. Control

### Models for Binary Outcomes

Models for Binary Outcomes Introduction The simple or binary response (for example, success or failure) analysis models the relationship between a binary response variable and one or more explanatory variables.

### Accommodating covariates in receiver operating characteristic analysis

The Stata Journal (2009) 9, Number 1, pp. 17 39 Accommodating covariates in receiver operating characteristic analysis Holly Janes Fred Hutchinson Cancer Research Center Seattle, WA hjanes@fhcrc.org Gary

### leebounds: Lee s (2009) treatment effects bounds for non-random sample selection for Stata

leebounds: Lee s (2009) treatment effects bounds for non-random sample selection for Stata Harald Tauchmann (RWI & CINCH) Rheinisch-Westfälisches Institut für Wirtschaftsforschung (RWI) & CINCH Health

### Optimal Treatment Regimes for Survival Endpoints from a Classification Perspective. Anastasios (Butch) Tsiatis and Xiaofei Bai

Optimal Treatment Regimes for Survival Endpoints from a Classification Perspective Anastasios (Butch) Tsiatis and Xiaofei Bai Department of Statistics North Carolina State University 1/35 Optimal Treatment

### Biostatistics Quantitative Data

Biostatistics Quantitative Data Descriptive Statistics Statistical Models One-sample and Two-Sample Tests Introduction to SAS-ANALYST T- and Rank-Tests using ANALYST Thomas Scheike Quantitative Data This

### Postgraduate course: Anova and Repeated measurements Day 2 (part 2) Mogens Erlandsen, Department of Biostatistics, Aarhus University, November 2010

30 CVP (mean and sd) Postgraduate course in ANOVA and Repeated Measurements Day Repeated measurements (part ) Mogens Erlandsen Deptartment of Biostatistics Aarhus University 5 0 15 10 0 1 3 4 5 6 7 8 9

### Hierarchical Generalized Linear Models. ERSH 8990 REMS Seminar on HLM Last Lecture!

Hierarchical Generalized Linear Models ERSH 8990 REMS Seminar on HLM Last Lecture! Hierarchical Generalized Linear Models Introduction to generalized models Models for binary outcomes Interpreting parameter

### Lecture 11. Interval Censored and. Discrete-Time Data. Statistics Survival Analysis. Presented March 3, 2016

Statistics 255 - Survival Analysis Presented March 3, 2016 Motivating Dan Gillen Department of Statistics University of California, Irvine 11.1 First question: Are the data truly discrete? : Number of

### Assessing the effect of a partly unobserved, exogenous, binary time-dependent covariate on -APPENDIX-

Assessing the effect of a partly unobserved, exogenous, binary time-dependent covariate on survival probabilities using generalised pseudo-values Ulrike Pötschger,2, Harald Heinzl 2, Maria Grazia Valsecchi

### Monitoring clinical trial outcomes with delayed response: incorporating pipeline data in group sequential designs. Christopher Jennison

Monitoring clinical trial outcomes with delayed response: incorporating pipeline data in group sequential designs Christopher Jennison Department of Mathematical Sciences, University of Bath http://people.bath.ac.uk/mascj

### Generating survival data for fitting marginal structural Cox models using Stata Stata Conference in San Diego, California

Generating survival data for fitting marginal structural Cox models using Stata 2012 Stata Conference in San Diego, California Outline Idea of MSM Various weights Fitting MSM in Stata using pooled logistic

### 8 Nominal and Ordinal Logistic Regression

8 Nominal and Ordinal Logistic Regression 8.1 Introduction If the response variable is categorical, with more then two categories, then there are two options for generalized linear models. One relies on

### Lecture 8 Stat D. Gillen

Statistics 255 - Survival Analysis Presented February 23, 2016 Dan Gillen Department of Statistics University of California, Irvine 8.1 Example of two ways to stratify Suppose a confounder C has 3 levels

### Mixed effects models

Mixed effects models The basic theory and application in R Mitchel van Loon Research Paper Business Analytics Mixed effects models The basic theory and application in R Author: Mitchel van Loon Research

### 4:3 LEC - PLANNED COMPARISONS AND REGRESSION ANALYSES

4:3 LEC - PLANNED COMPARISONS AND REGRESSION ANALYSES FOR SINGLE FACTOR BETWEEN-S DESIGNS Planned or A Priori Comparisons We previously showed various ways to test all possible pairwise comparisons for

### Lecture 3 Linear random intercept models

Lecture 3 Linear random intercept models Example: Weight of Guinea Pigs Body weights of 48 pigs in 9 successive weeks of follow-up (Table 3.1 DLZ) The response is measures at n different times, or under

### M3 Symposium: Multilevel Multivariate Survival Models For Analysis of Dyadic Social Interaction

M3 Symposium: Multilevel Multivariate Survival Models For Analysis of Dyadic Social Interaction Mike Stoolmiller: stoolmil@uoregon.edu University of Oregon 5/21/2013 Outline Example Research Questions