Variable Selection and Model Choice in Survival Models with Time-Varying Effects
|
|
- Brice Wright
- 6 years ago
- Views:
Transcription
1 Variable Selection and Model Choice in Survival Models with Time-Varying Effects Boosting Survival Models Benjamin Hofner 1 Department of Medical Informatics, Biometry and Epidemiology (IMBE) Friedrich-Alexander-Universität Erlangen-Nürnberg joint work with Thomas Kneib and Torsten Hothorn Department of Statistics Ludwig-Maximilians-Universität München user! benjamin.hofner@imbe.med.uni-erlangen.de
2 Introduction Cox PH model: λ i (t) = λ(t, x i ) = λ 0 (t) exp(x iβ) with λ i (t) hazard rate of observation i [i = 1,..., n] λ 0 (t) baseline hazard rate x i vector of covariates for observation i [i = 1,..., n] β vector of regression coefficients Problem: restrictive model, not allowing for non-proportional hazards (e.g., time-varying effects) non-linear effects
3 Additive Hazard Regression Generalisation: Additive Hazard Regression (Kneib & Fahrmeir, 2007) λ i (t) = exp(η i (t)) with J η i (t) = f j (x i (t)), j=1 generic representation of covariate effects f j (x i ) a) linear effects: f j (x i (t)) = f linear ( x i ) = x i β b) smooth effects: f j (x i (t)) = f smooth ( x i ) c) time-varying effects: f j (x i (t)) = f smooth (t) x i where x i x i (t). Note: c) includes log-baseline for x i 1
4 P-Splines flexible terms can be represented using P-splines (Eilers & Marx, 1996) with model term (x can be either x i or t): M f j (x) = β jm B jm (x) (j = 1,..., J) penalty: m=1 pen j (β j ) = { κj β j Kβ j cases b),c) 0 case a) K = D ( D (i.e., cross product of ) difference matrix D) D e.g = κ j smoothing parameter (larger κ j more penalization smoother fit)
5 Inference Penalized Likelihood Criterion: (NB: this is the full log-likelihood) L pen (β) = n i=1 [ δ i η i (t i ) ti 0 ] exp(η i (t)) dt J pen j (β j ) j=0 T i true survival time C i censoring time t i = min(t i, C i ) observed survival time (right censoring) δ i = 1(T i C i ) indicator for non-censoring Problem: Estimation and in particular model choice
6 Cox flex Boost Aim: Maximization of a (potentially) high-dimensional log-likelihood with different modeling alternatives Thus, we use: Iterative algorithm Likelihood-based boosting algorithm Component-wise base-learners Therefore: Use one base-learner g j ( ) for each covariate (or each model component) [ j {1,..., J} ] Component-Wise Boosting as a means of estimation and variable selection combined with model choice.
7 Cox flex Boost Aim: Maximization of a (potentially) high-dimensional log-likelihood with different modeling alternatives Thus, we use: Iterative algorithm Likelihood-based boosting algorithm Component-wise base-learners Therefore: Use one base-learner g j ( ) for each covariate (or each model component) [ j {1,..., J} ] Component-Wise Boosting as a means of estimation and variable selection combined with model choice.
8 Cox flex Boost Aim: Maximization of a (potentially) high-dimensional log-likelihood with different modeling alternatives Thus, we use: Iterative algorithm Likelihood-based boosting algorithm Component-wise base-learners Therefore: Use one base-learner g j ( ) for each covariate (or each model component) [ j {1,..., J} ] Component-Wise Boosting as a means of estimation and variable selection combined with model choice.
9 Cox flex Boost Algorithm (i) Initialization: Iteration index m := 0. Function estimates (for all j {1,..., J}): ˆf [0] j ( ) 0 Offset (MLE for constant log hazard): ( n ˆη [0] i=1 ( ) log δ ) i n i=1 t i
10 (ii) Estimation: m := m + 1. Fit all (linear/p-spline) base-learners separately by penalized MLE, i.e., ĝ j = g j ( ; ˆβ j ), j {1,..., J}, ˆβ j = arg max β L[m] j,pen (β) with the penalized log-likelihood ( analogously as above ) n [ L [m] j,pen (β) = δ i (ˆη [m 1] i + g j (x i (t i ); β)) i=1 with the additive predictor η i split ti { } ] exp ˆη [m 1] i ( t) + g j (x i ( t); β) d t pen j (β), 0 into the estimate from previous iteration ˆη [m 1] i and the current base-learner g j ( ; β)
11 (iii) Selection: Choose base-learner ĝ j with j = arg max j {1,...,J} L[m] j,unpen (ˆβ j ) (iv) Update: Function estimates (for all j {1,..., J}): { [m 1] ˆf j + ν ĝ j j = j ˆf [m] j = Additive predictor (= fit): ˆf [m 1] j j j ˆη [m] = ˆη [m 1] + ν ĝ j with step-length ν (0, 1] (here: ν = 0.1) (v) Stopping rule: Continue iterating steps (ii) to (iv) until m = m stop
12 Some Aspects of Cox flex Boost Estimation Selection Base-Learners full penalized MLE ν (step-length) based on unpenalized log-likelihood L [m] j,unpen specified by (initial) degrees of freedom, i.e., df j = df j Likelihood-based boosting (in general): See, e.g., Tutz and Binder (2006) Above aspects in Cox flex Boost: See, e.g., model based boosting (Bühlmann & Hothorn, 2007)
13 Degrees of Freedom Specifying df more intuitive than specifying smoothing parameter κ Comparable to other modeling components, e.g., linear effects Problem: Not constant over the (boosting) iterations But simulation studies showed: No big deviation from the initial df j = df j bbs(x 3) df(m) Estimated degrees of freedom traced over the boosting steps for the flexible base-learners of x 3 (in 200 replicates) and initially specified degrees of freedom (dashed line) boosting iteration m
14 Model Choice Recall from generic representation: f j ( x i ) can be a a) linear effect: f j (x i (t)) = f linear ( x i ) = x i β b) smooth effect: f j (x i (t)) = f smooth ( x i ) c) time-varying effect: f j (x i (t)) = f smooth (t) x i We see: x i can enter the model in 3 different ways But how? Add all possibilities as base-learners to the model. Boosting can chose between the possibilities But the df must be comparable! Otherwise: more flexible base-learners are preferred
15 Model Choice Recall from generic representation: f j ( x i ) can be a a) linear effect: f j (x i (t)) = f linear ( x i ) = x i β b) smooth effect: f j (x i (t)) = f smooth ( x i ) c) time-varying effect: f j (x i (t)) = f smooth (t) x i We see: x i can enter the model in 3 different ways But how? Add all possibilities as base-learners to the model. Boosting can chose between the possibilities But the df must be comparable! Otherwise: more flexible base-learners are preferred
16 For higher order differences (d 2): df > 1 (κ ) Polynomial of order d 1 remains unpenalized Solution: Decomposition (based on Kneib, Hothorn, & Tutz, 2008) g(x) = β 0 + β 1 x β d 1 x d 1 }{{} unpenalized, parametric part + g centered (x) }{{} deviation from polynomial Add unpenalized part as separate, parametric base-learners Assign df = 1 to the centered effect (and add as P-spline base-learner) Analogously for time-varying effects Technical realization (see Fahrmeir, Kneib, & Lang, 2004): decomposing the vector of regression coefficients β into ( β unpen, β pen ) utilizing a spectral decomposition of the penalty matrix
17 Early Stopping 1 Run the algorithm m stop -times (previously defined). 2 Determine new m stop,opt m stop :... based on out-of-bag sample (with simulations easy to use)... based on information criterion, e.g., AIC Prevents algorithm to stop in a local maximum (of the log-likelihood) Early stopping prevents overfitting
18 Variable Selection and Model Choice... is achieved by selection of base-learner (in step (iii) of Cox flex Boost), i.e., component-wise boosting and early stopping Simulation-Results (in Short) Good variable selection strategy Good model choice strategy if only linear and smooth effects are used Selection bias in favor of time-varying base-learners (if present) standardizing time could be a solution Estimates are better if model choice is performed
19 Computational Aspects Cox flex Boost is implemented using R Crucial computation: Integral in L [m] j,pen (β): ti 0 { } exp ˆη [m 1] i ( t) + g j (x i ( t); β) d t time consuming very often evaluated (maximization of L [m] j,pen (β)) R-function integrate() slow in this context (specialized) vectorized trapezoid integration implemented 100 times quicker Efficient storage of matrices can reduce computational burden recycling of results
20 Computational Aspects Cox flex Boost is implemented using R Crucial computation: Integral in L [m] j,pen (β): ti 0 { } exp ˆη [m 1] i ( t) + g j (x i ( t); β) d t time consuming very often evaluated (maximization of L [m] j,pen (β)) R-function integrate() slow in this context (specialized) vectorized trapezoid integration implemented 100 times quicker Efficient storage of matrices can reduce computational burden recycling of results
21 Computational Aspects Cox flex Boost is implemented using R Crucial computation: Integral in L [m] j,pen (β): ti 0 { } exp ˆη [m 1] i ( t) + g j (x i ( t); β) d t time consuming very often evaluated (maximization of L [m] j,pen (β)) R-function integrate() slow in this context (specialized) vectorized trapezoid integration implemented 100 times quicker Efficient storage of matrices can reduce computational burden recycling of results
22 Summary & Outlook Cox flex Boost allows for variable selection and model choice.... allows for flexible modeling flexible, non-linear effects time-varying effects (i.e., non-proportional hazards)... provides functions to manipulate and show results (summary(), plot(), subset(),... ) To be continued... Formula for AIC (for Boosting in Survival Models) Include mandatory covariates (update in each step) Measure for variable importance: e.g., ˆf [mstop] j ( )
23 Summary & Outlook Cox flex Boost allows for variable selection and model choice.... allows for flexible modeling flexible, non-linear effects time-varying effects (i.e., non-proportional hazards)... provides functions to manipulate and show results (summary(), plot(), subset(),... ) To be continued... Formula for AIC (for Boosting in Survival Models) Include mandatory covariates (update in each step) Measure for variable importance: e.g., ˆf [mstop] j ( )
24 Literature Bühlmann, P., & Hothorn, T. (2007). Boosting algorithms: Regularization, prediction and model fitting. Statistical Science, 22(4), Eilers, P. H. C., & Marx, B. D. (1996). Flexible smoothing with B-splines and penalties. Statistical Science, 11(2), Fahrmeir, L., Kneib, T., & Lang, S. (2004). Penalized structured additive regression: A Bayesian perspective. Statistica Sinica, 14, Kneib, T., & Fahrmeir, L. (2007). A mixed model approach for geoadditive hazard regression. Scand. J. Statist., 34, Kneib, T., Hothorn, T., & Tutz, G. (2008). Variable selection and model choice in geoadditive regression. Biometrics (accepted). Tutz, G., & Binder, H. (2006). Generalized additive modelling with implicit variable selection by likelihood-based boosting. Biometrics, 62,
25 l with model choice l without model choice log(hazard rate) log(hazard rate) x x 1 log(hazard rate) log(hazard rate)
26 log(hazard rate) log(hazard rate) time time
Variable Selection and Model Choice in Structured Survival Models
Benjamin Hofner, Torsten Hothorn and Thomas Kneib Variable Selection and Model Choice in Structured Survival Models Technical Report Number 043, 2008 Department of Statistics University of Munich http://www.stat.uni-muenchen.de
More informationmboost - Componentwise Boosting for Generalised Regression Models
mboost - Componentwise Boosting for Generalised Regression Models Thomas Kneib & Torsten Hothorn Department of Statistics Ludwig-Maximilians-University Munich 13.8.2008 Boosting in a Nutshell Boosting
More informationModelling geoadditive survival data
Modelling geoadditive survival data Thomas Kneib & Ludwig Fahrmeir Department of Statistics, Ludwig-Maximilians-University Munich 1. Leukemia survival data 2. Structured hazard regression 3. Mixed model
More informationAnalysing geoadditive regression data: a mixed model approach
Analysing geoadditive regression data: a mixed model approach Institut für Statistik, Ludwig-Maximilians-Universität München Joint work with Ludwig Fahrmeir & Stefan Lang 25.11.2005 Spatio-temporal regression
More informationA Framework for Unbiased Model Selection Based on Boosting
Benjamin Hofner, Torsten Hothorn, Thomas Kneib & Matthias Schmid A Framework for Unbiased Model Selection Based on Boosting Technical Report Number 072, 2009 Department of Statistics University of Munich
More informationgamboostlss: boosting generalized additive models for location, scale and shape
gamboostlss: boosting generalized additive models for location, scale and shape Benjamin Hofner Joint work with Andreas Mayr, Nora Fenske, Thomas Kneib and Matthias Schmid Department of Medical Informatics,
More informationA general mixed model approach for spatio-temporal regression data
A general mixed model approach for spatio-temporal regression data Thomas Kneib, Ludwig Fahrmeir & Stefan Lang Department of Statistics, Ludwig-Maximilians-University Munich 1. Spatio-temporal regression
More informationVariable Selection for Generalized Additive Mixed Models by Likelihood-based Boosting
Variable Selection for Generalized Additive Mixed Models by Likelihood-based Boosting Andreas Groll 1 and Gerhard Tutz 2 1 Department of Statistics, University of Munich, Akademiestrasse 1, D-80799, Munich,
More informationRegularization in Cox Frailty Models
Regularization in Cox Frailty Models Andreas Groll 1, Trevor Hastie 2, Gerhard Tutz 3 1 Ludwig-Maximilians-Universität Munich, Department of Mathematics, Theresienstraße 39, 80333 Munich, Germany 2 University
More informationKneib, Fahrmeir: Supplement to "Structured additive regression for categorical space-time data: A mixed model approach"
Kneib, Fahrmeir: Supplement to "Structured additive regression for categorical space-time data: A mixed model approach" Sonderforschungsbereich 386, Paper 43 (25) Online unter: http://epub.ub.uni-muenchen.de/
More informationBeyond Mean Regression
Beyond Mean Regression Thomas Kneib Lehrstuhl für Statistik Georg-August-Universität Göttingen 8.3.2013 Innsbruck Introduction Introduction One of the top ten reasons to become statistician (according
More informationEstimating prediction error in mixed models
Estimating prediction error in mixed models benjamin saefken, thomas kneib georg-august university goettingen sonja greven ludwig-maximilians-university munich 1 / 12 GLMM - Generalized linear mixed models
More informationModel-based Boosting in R: A Hands-on Tutorial Using the R Package mboost
Benjamin Hofner, Andreas Mayr, Nikolay Robinzonov, Matthias Schmid Model-based Boosting in R: A Hands-on Tutorial Using the R Package mboost Technical Report Number 120, 2012 Department of Statistics University
More informationBoosting in Cox regression: a comparison between the likelihood-based and the model-based approaches with focus on the R-packages CoxBoost and mboost
Riccardo De Bin Boosting in Cox regression: a comparison between the likelihood-based and the model-based approaches with focus on the R-packages CoxBoost and mboost Technical Report Number 180, 2015 Department
More informationModel-based boosting in R
Model-based boosting in R Families in mboost Nikolay Robinzonov nikolay.robinzonov@stat.uni-muenchen.de Institut für Statistik Ludwig-Maximilians-Universität München Statistical Computing 2011 Model-based
More informationBoosting Techniques for Nonlinear Time Series Models
Ludwig-Maximilians-Universität München Department of Statistics Master-Thesis Boosting Techniques for Nonlinear Time Series Models Supervisors: Prof. Dr. Gerhard Tutz Dr. Florian Leitenstorfer May 2008
More informationPENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA
PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA Kasun Rathnayake ; A/Prof Jun Ma Department of Statistics Faculty of Science and Engineering Macquarie University
More informationOn the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models
On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models Thomas Kneib Institute of Statistics and Econometrics Georg-August-University Göttingen Department of Statistics
More informationP -spline ANOVA-type interaction models for spatio-temporal smoothing
P -spline ANOVA-type interaction models for spatio-temporal smoothing Dae-Jin Lee 1 and María Durbán 1 1 Department of Statistics, Universidad Carlos III de Madrid, SPAIN. e-mail: dae-jin.lee@uc3m.es and
More informationLASSO-Type Penalization in the Framework of Generalized Additive Models for Location, Scale and Shape
LASSO-Type Penalization in the Framework of Generalized Additive Models for Location, Scale and Shape Nikolaus Umlauf https://eeecon.uibk.ac.at/~umlauf/ Overview Joint work with Andreas Groll, Julien Hambuckers
More informationRegression, Ridge Regression, Lasso
Regression, Ridge Regression, Lasso Fabio G. Cozman - fgcozman@usp.br October 2, 2018 A general definition Regression studies the relationship between a response variable Y and covariates X 1,..., X n.
More information[Part 2] Model Development for the Prediction of Survival Times using Longitudinal Measurements
[Part 2] Model Development for the Prediction of Survival Times using Longitudinal Measurements Aasthaa Bansal PhD Pharmaceutical Outcomes Research & Policy Program University of Washington 69 Biomarkers
More informationApproximation of Survival Function by Taylor Series for General Partly Interval Censored Data
Malaysian Journal of Mathematical Sciences 11(3): 33 315 (217) MALAYSIAN JOURNAL OF MATHEMATICAL SCIENCES Journal homepage: http://einspem.upm.edu.my/journal Approximation of Survival Function by Taylor
More informationBoosting Methods: Why They Can Be Useful for High-Dimensional Data
New URL: http://www.r-project.org/conferences/dsc-2003/ Proceedings of the 3rd International Workshop on Distributed Statistical Computing (DSC 2003) March 20 22, Vienna, Austria ISSN 1609-395X Kurt Hornik,
More informationNow consider the case where E(Y) = µ = Xβ and V (Y) = σ 2 G, where G is diagonal, but unknown.
Weighting We have seen that if E(Y) = Xβ and V (Y) = σ 2 G, where G is known, the model can be rewritten as a linear model. This is known as generalized least squares or, if G is diagonal, with trace(g)
More informationLecture 3: Introduction to Complexity Regularization
ECE90 Spring 2007 Statistical Learning Theory Instructor: R. Nowak Lecture 3: Introduction to Complexity Regularization We ended the previous lecture with a brief discussion of overfitting. Recall that,
More informationFahrmeir: Discrete failure time models
Fahrmeir: Discrete failure time models Sonderforschungsbereich 386, Paper 91 (1997) Online unter: http://epub.ub.uni-muenchen.de/ Projektpartner Discrete failure time models Ludwig Fahrmeir, Universitat
More informationStatistics 203: Introduction to Regression and Analysis of Variance Course review
Statistics 203: Introduction to Regression and Analysis of Variance Course review Jonathan Taylor - p. 1/?? Today Review / overview of what we learned. - p. 2/?? General themes in regression models Specifying
More informationPackage penmsm. R topics documented: February 20, Type Package
Type Package Package penmsm February 20, 2015 Title Estimating Regularized Multi-state Models Using L1 Penalties Version 0.99 Date 2015-01-12 Author Maintainer Structured fusion
More informationLecture 3: Statistical Decision Theory (Part II)
Lecture 3: Statistical Decision Theory (Part II) Hao Helen Zhang Hao Helen Zhang Lecture 3: Statistical Decision Theory (Part II) 1 / 27 Outline of This Note Part I: Statistics Decision Theory (Classical
More informationSelection of Effects in Cox Frailty Models by Regularization Methods
Biometrics DOI: 101111/biom12637 Selection of Effects in Cox Frailty Models by Regularization Methods Andreas Groll, 1, * Trevor Hastie, 2, ** and Gerhard Tutz 1, *** 1 Department of Statistics, Ludwig-Maximilians-University
More informationConditional Transformation Models
IFSPM Institut für Sozial- und Präventivmedizin Conditional Transformation Models Or: More Than Means Can Say Torsten Hothorn, Universität Zürich Thomas Kneib, Universität Göttingen Peter Bühlmann, Eidgenössische
More informationAnalysis of competing risks data and simulation of data following predened subdistribution hazards
Analysis of competing risks data and simulation of data following predened subdistribution hazards Bernhard Haller Institut für Medizinische Statistik und Epidemiologie Technische Universität München 27.05.2013
More informationOn the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models
On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models Thomas Kneib Department of Mathematics Carl von Ossietzky University Oldenburg Sonja Greven Department of
More informationYou know I m not goin diss you on the internet Cause my mama taught me better than that I m a survivor (What?) I m not goin give up (What?
You know I m not goin diss you on the internet Cause my mama taught me better than that I m a survivor (What?) I m not goin give up (What?) I m not goin stop (What?) I m goin work harder (What?) Sir David
More informationEfficient Semiparametric Estimators via Modified Profile Likelihood in Frailty & Accelerated-Failure Models
NIH Talk, September 03 Efficient Semiparametric Estimators via Modified Profile Likelihood in Frailty & Accelerated-Failure Models Eric Slud, Math Dept, Univ of Maryland Ongoing joint project with Ilia
More informationSurvival Analysis I (CHL5209H)
Survival Analysis Dalla Lana School of Public Health University of Toronto olli.saarela@utoronto.ca January 7, 2015 31-1 Literature Clayton D & Hills M (1993): Statistical Models in Epidemiology. Not really
More information9. Model Selection. statistical models. overview of model selection. information criteria. goodness-of-fit measures
FE661 - Statistical Methods for Financial Engineering 9. Model Selection Jitkomut Songsiri statistical models overview of model selection information criteria goodness-of-fit measures 9-1 Statistical models
More informationGradient boosting for distributional regression: faster tuning and improved variable selection via noncyclical updates
DOI 1.17/s11-17-975- Gradient boosting for distributional regression: faster tuning and improved variable selection via non updates Janek Thomas 1 Andreas Mayr,3 Bernd Bischl 1 Matthias Schmid 3 Adam Smith
More informationIntroduction to Empirical Processes and Semiparametric Inference Lecture 25: Semiparametric Models
Introduction to Empirical Processes and Semiparametric Inference Lecture 25: Semiparametric Models Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations
More informationtwinsir: Individual-level epidemic modeling for a fixed population with known distances
This introduction to the twinsir modeling framework of the R package surveillance is based on a publication in the Journal of Statistical Software Meyer, Held, and Höhle (2017, Section 4) which is the
More informationSTAT 6350 Analysis of Lifetime Data. Failure-time Regression Analysis
STAT 6350 Analysis of Lifetime Data Failure-time Regression Analysis Explanatory Variables for Failure Times Usually explanatory variables explain/predict why some units fail quickly and some units survive
More informationFULL LIKELIHOOD INFERENCES IN THE COX MODEL
October 20, 2007 FULL LIKELIHOOD INFERENCES IN THE COX MODEL BY JIAN-JIAN REN 1 AND MAI ZHOU 2 University of Central Florida and University of Kentucky Abstract We use the empirical likelihood approach
More informationCART Classification and Regression Trees Trees can be viewed as basis expansions of simple functions. f(x) = c m 1(x R m )
CART Classification and Regression Trees Trees can be viewed as basis expansions of simple functions with R 1,..., R m R p disjoint. f(x) = M c m 1(x R m ) m=1 The CART algorithm is a heuristic, adaptive
More informationAdapting Prediction Error Estimates for Biased Complexity Selection in High-Dimensional Bootstrap Samples. Harald Binder & Martin Schumacher
Adapting Prediction Error Estimates for Biased Complexity Selection in High-Dimensional Bootstrap Samples Harald Binder & Martin Schumacher Universität Freiburg i. Br. Nr. 100 December 2007 Zentrum für
More informationChapter 7: Model Assessment and Selection
Chapter 7: Model Assessment and Selection DD3364 April 20, 2012 Introduction Regression: Review of our problem Have target variable Y to estimate from a vector of inputs X. A prediction model ˆf(X) has
More informationSTAT331. Cox s Proportional Hazards Model
STAT331 Cox s Proportional Hazards Model In this unit we introduce Cox s proportional hazards (Cox s PH) model, give a heuristic development of the partial likelihood function, and discuss adaptations
More informationEnsemble Methods and Random Forests
Ensemble Methods and Random Forests Vaishnavi S May 2017 1 Introduction We have seen various analysis for classification and regression in the course. One of the common methods to reduce the generalization
More informationReduced-rank hazard regression
Chapter 2 Reduced-rank hazard regression Abstract The Cox proportional hazards model is the most common method to analyze survival data. However, the proportional hazards assumption might not hold. The
More informationIntroduction to Empirical Processes and Semiparametric Inference Lecture 01: Introduction and Overview
Introduction to Empirical Processes and Semiparametric Inference Lecture 01: Introduction and Overview Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations
More informationOn the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models
On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models Thomas Kneib Department of Mathematics Carl von Ossietzky University Oldenburg Sonja Greven Department of
More informationStatistics 262: Intermediate Biostatistics Model selection
Statistics 262: Intermediate Biostatistics Model selection Jonathan Taylor & Kristin Cobb Statistics 262: Intermediate Biostatistics p.1/?? Today s class Model selection. Strategies for model selection.
More informationGeneralized Boosted Models: A guide to the gbm package
Generalized Boosted Models: A guide to the gbm package Greg Ridgeway April 15, 2006 Boosting takes on various forms with different programs using different loss functions, different base models, and different
More informationAn Introduction to GAMs based on penalized regression splines. Simon Wood Mathematical Sciences, University of Bath, U.K.
An Introduction to GAMs based on penalied regression splines Simon Wood Mathematical Sciences, University of Bath, U.K. Generalied Additive Models (GAM) A GAM has a form something like: g{e(y i )} = η
More informationDynamic Models Part 1
Dynamic Models Part 1 Christopher Taber University of Wisconsin December 5, 2016 Survival analysis This is especially useful for variables of interest measured in lengths of time: Length of life after
More informationSTAT 740: B-splines & Additive Models
1 / 49 STAT 740: B-splines & Additive Models Timothy Hanson Department of Statistics, University of South Carolina 2 / 49 Outline 1 2 3 3 / 49 Bayesian linear model Functional form of predictor Non-normal
More informationGeneralized additive modelling of hydrological sample extremes
Generalized additive modelling of hydrological sample extremes Valérie Chavez-Demoulin 1 Joint work with A.C. Davison (EPFL) and Marius Hofert (ETHZ) 1 Faculty of Business and Economics, University of
More informationImproving Efficiency of Inferences in Randomized Clinical Trials Using Auxiliary Covariates
Improving Efficiency of Inferences in Randomized Clinical Trials Using Auxiliary Covariates Anastasios (Butch) Tsiatis Department of Statistics North Carolina State University http://www.stat.ncsu.edu/
More informationGeoadditive regression modeling of stream biological condition
Environ Ecol Stat (2011) 18:709 733 DOI 10.1007/s10651-010-0158-4 Geoadditive regression modeling of stream biological condition Matthias Schmid Torsten Hothorn Kelly O. Maloney Donald E. Weller Sergej
More informationBIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY
BIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY Ingo Langner 1, Ralf Bender 2, Rebecca Lenz-Tönjes 1, Helmut Küchenhoff 2, Maria Blettner 2 1
More informationMultivariate Calibration with Robust Signal Regression
Multivariate Calibration with Robust Signal Regression Bin Li and Brian Marx from Louisiana State University Somsubhra Chakraborty from Indian Institute of Technology Kharagpur David C Weindorf from Texas
More informationModelling spatial patterns of distribution and abundance of mussel seed using Structured Additive Regression models
Statistics & Operations Research Transactions SORT 34 (1) January-June 2010, 67-78 ISSN: 1696-2281 www.idescat.net/sort Statistics & Operations Research c Institut d Estadística de Catalunya Transactions
More informationCURE MODEL WITH CURRENT STATUS DATA
Statistica Sinica 19 (2009), 233-249 CURE MODEL WITH CURRENT STATUS DATA Shuangge Ma Yale University Abstract: Current status data arise when only random censoring time and event status at censoring are
More informationTutz, Binder: Flexible Modelling of Discrete Failure Time Including Time-Varying Smooth Effects
Tutz, Binder: Flexible Modelling of Discrete Failure Time Including Time-Varying Smooth Effects Sonderforschungsbereich 386, Paper 294 (2002) Online unter: http://epub.ub.uni-muenchen.de/ Projektpartner
More informationECE521 week 3: 23/26 January 2017
ECE521 week 3: 23/26 January 2017 Outline Probabilistic interpretation of linear regression - Maximum likelihood estimation (MLE) - Maximum a posteriori (MAP) estimation Bias-variance trade-off Linear
More informationLecture 6: Methods for high-dimensional problems
Lecture 6: Methods for high-dimensional problems Hector Corrada Bravo and Rafael A. Irizarry March, 2010 In this Section we will discuss methods where data lies on high-dimensional spaces. In particular,
More informationApplication of density estimation methods to quantal analysis
Application of density estimation methods to quantal analysis Koichi Yoshioka Tokyo Medical and Dental University Summary There has been controversy for the quantal nature of neurotransmission of mammalian
More informationDynamic survival models for credit risks.
Dynamic survival models for credit risks. Viani Djeundje and Jonathan Crook Credit Research Centre, University of Edinburgh Abstract Single event survival models predict the probability that an event will
More informationSurvival Regression Models
Survival Regression Models David M. Rocke May 18, 2017 David M. Rocke Survival Regression Models May 18, 2017 1 / 32 Background on the Proportional Hazards Model The exponential distribution has constant
More informationA Blockwise Descent Algorithm for Group-penalized Multiresponse and Multinomial Regression
A Blockwise Descent Algorithm for Group-penalized Multiresponse and Multinomial Regression Noah Simon Jerome Friedman Trevor Hastie November 5, 013 Abstract In this paper we purpose a blockwise descent
More informationModeling Real Estate Data using Quantile Regression
Modeling Real Estate Data using Semiparametric Quantile Regression Department of Statistics University of Innsbruck September 9th, 2011 Overview 1 Application: 2 3 4 Hedonic regression data for house prices
More informationEffective Computation for Odds Ratio Estimation in Nonparametric Logistic Regression
Communications of the Korean Statistical Society 2009, Vol. 16, No. 4, 713 722 Effective Computation for Odds Ratio Estimation in Nonparametric Logistic Regression Young-Ju Kim 1,a a Department of Information
More informationStat 5101 Lecture Notes
Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random
More informationChapter 3. Linear Models for Regression
Chapter 3. Linear Models for Regression Wei Pan Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 55455 Email: weip@biostat.umn.edu PubH 7475/8475 c Wei Pan Linear
More informationSTK-IN4300 Statistical Learning Methods in Data Science
Outline of the lecture STK-IN4300 Statistical Learning Methods in Data Science Riccardo De Bin debin@math.uio.no AdaBoost Introduction algorithm Statistical Boosting Boosting as a forward stagewise additive
More informationIntroduction to Statistical modeling: handout for Math 489/583
Introduction to Statistical modeling: handout for Math 489/583 Statistical modeling occurs when we are trying to model some data using statistical tools. From the start, we recognize that no model is perfect
More informationPrimal-dual Covariate Balance and Minimal Double Robustness via Entropy Balancing
Primal-dual Covariate Balance and Minimal Double Robustness via (Joint work with Daniel Percival) Department of Statistics, Stanford University JSM, August 9, 2015 Outline 1 2 3 1/18 Setting Rubin s causal
More informationPenalized Splines, Mixed Models, and Recent Large-Sample Results
Penalized Splines, Mixed Models, and Recent Large-Sample Results David Ruppert Operations Research & Information Engineering, Cornell University Feb 4, 2011 Collaborators Matt Wand, University of Wollongong
More informationTGDR: An Introduction
TGDR: An Introduction Julian Wolfson Student Seminar March 28, 2007 1 Variable Selection 2 Penalization, Solution Paths and TGDR 3 Applying TGDR 4 Extensions 5 Final Thoughts Some motivating examples We
More informationPENALIZED STRUCTURED ADDITIVE REGRESSION FOR SPACE-TIME DATA: A BAYESIAN PERSPECTIVE
Statistica Sinica 14(2004), 715-745 PENALIZED STRUCTURED ADDITIVE REGRESSION FOR SPACE-TIME DATA: A BAYESIAN PERSPECTIVE Ludwig Fahrmeir, Thomas Kneib and Stefan Lang University of Munich Abstract: We
More informationCS6220: DATA MINING TECHNIQUES
CS6220: DATA MINING TECHNIQUES Matrix Data: Prediction Instructor: Yizhou Sun yzsun@ccs.neu.edu September 14, 2014 Today s Schedule Course Project Introduction Linear Regression Model Decision Tree 2 Methods
More informationFlexible Spatio-temporal smoothing with array methods
Int. Statistical Inst.: Proc. 58th World Statistical Congress, 2011, Dublin (Session IPS046) p.849 Flexible Spatio-temporal smoothing with array methods Dae-Jin Lee CSIRO, Mathematics, Informatics and
More informationA Penalty Approach to Differential Item Functioning in Rasch Models
Gerhard Tutz & Gunther Schauberger A Penalty Approach to Differential Item Functioning in Rasch Models Technical Report Number 134, 2012 Department of Statistics University of Munich http://www.stat.uni-muenchen.de
More informationModel checking overview. Checking & Selecting GAMs. Residual checking. Distribution checking
Model checking overview Checking & Selecting GAMs Simon Wood Mathematical Sciences, University of Bath, U.K. Since a GAM is just a penalized GLM, residual plots should be checked exactly as for a GLM.
More informationDATA-ADAPTIVE VARIABLE SELECTION FOR
DATA-ADAPTIVE VARIABLE SELECTION FOR CAUSAL INFERENCE Group Health Research Institute Department of Biostatistics, University of Washington shortreed.s@ghc.org joint work with Ashkan Ertefaie Department
More informationEstimating Timber Volume using Airborne Laser Scanning Data based on Bayesian Methods J. Breidenbach 1 and E. Kublin 2
Estimating Timber Volume using Airborne Laser Scanning Data based on Bayesian Methods J. Breidenbach 1 and E. Kublin 2 1 Norwegian University of Life Sciences, Department of Ecology and Natural Resource
More informationLinear Models in Machine Learning
CS540 Intro to AI Linear Models in Machine Learning Lecturer: Xiaojin Zhu jerryzhu@cs.wisc.edu We briefly go over two linear models frequently used in machine learning: linear regression for, well, regression,
More informationChoosing the Summary Statistics and the Acceptance Rate in Approximate Bayesian Computation
Choosing the Summary Statistics and the Acceptance Rate in Approximate Bayesian Computation COMPSTAT 2010 Revised version; August 13, 2010 Michael G.B. Blum 1 Laboratoire TIMC-IMAG, CNRS, UJF Grenoble
More informationLecture 22 Survival Analysis: An Introduction
University of Illinois Department of Economics Spring 2017 Econ 574 Roger Koenker Lecture 22 Survival Analysis: An Introduction There is considerable interest among economists in models of durations, which
More informationModel comparison and selection
BS2 Statistical Inference, Lectures 9 and 10, Hilary Term 2008 March 2, 2008 Hypothesis testing Consider two alternative models M 1 = {f (x; θ), θ Θ 1 } and M 2 = {f (x; θ), θ Θ 2 } for a sample (X = x)
More informationA Variational Model for Data Fitting on Manifolds by Minimizing the Acceleration of a Bézier Curve
A Variational Model for Data Fitting on Manifolds by Minimizing the Acceleration of a Bézier Curve Ronny Bergmann a Technische Universität Chemnitz Kolloquium, Department Mathematik, FAU Erlangen-Nürnberg,
More informationBuilding a Prognostic Biomarker
Building a Prognostic Biomarker Noah Simon and Richard Simon July 2016 1 / 44 Prognostic Biomarker for a Continuous Measure On each of n patients measure y i - single continuous outcome (eg. blood pressure,
More informationDynamic Prediction of Disease Progression Using Longitudinal Biomarker Data
Dynamic Prediction of Disease Progression Using Longitudinal Biomarker Data Xuelin Huang Department of Biostatistics M. D. Anderson Cancer Center The University of Texas Joint Work with Jing Ning, Sangbum
More informationPh.D. course: Regression models
Ph.D. course: Regression models Non-linear effect of a quantitative covariate PKA & LTS Sect. 4.2.1, 4.2.2 8 May 2017 www.biostat.ku.dk/~pka/regrmodels17 Per Kragh Andersen 1 Linear effects We have studied
More informationSurvival Analysis Math 434 Fall 2011
Survival Analysis Math 434 Fall 2011 Part IV: Chap. 8,9.2,9.3,11: Semiparametric Proportional Hazards Regression Jimin Ding Math Dept. www.math.wustl.edu/ jmding/math434/fall09/index.html Basic Model Setup
More informationAssessment of time varying long term effects of therapies and prognostic factors
Assessment of time varying long term effects of therapies and prognostic factors Dissertation by Anika Buchholz Submitted to Fakultät Statistik, Technische Universität Dortmund in Fulfillment of the Requirements
More informationSpectral Regularization
Spectral Regularization Lorenzo Rosasco 9.520 Class 07 February 27, 2008 About this class Goal To discuss how a class of regularization methods originally designed for solving ill-posed inverse problems,
More informationStatistical Machine Learning from Data
Samy Bengio Statistical Machine Learning from Data 1 Statistical Machine Learning from Data Ensembles Samy Bengio IDIAP Research Institute, Martigny, Switzerland, and Ecole Polytechnique Fédérale de Lausanne
More informationBoosting. Ryan Tibshirani Data Mining: / April Optional reading: ISL 8.2, ESL , 10.7, 10.13
Boosting Ryan Tibshirani Data Mining: 36-462/36-662 April 25 2013 Optional reading: ISL 8.2, ESL 10.1 10.4, 10.7, 10.13 1 Reminder: classification trees Suppose that we are given training data (x i, y
More informationBasis Penalty Smoothers. Simon Wood Mathematical Sciences, University of Bath, U.K.
Basis Penalty Smoothers Simon Wood Mathematical Sciences, University of Bath, U.K. Estimating functions It is sometimes useful to estimate smooth functions from data, without being too precise about the
More information