Support Vector Hazard Regression (SVHR) for Predicting Survival Outcomes
Donglin Zeng, Department of Biostatistics, University of North Carolina


Outline: Introduction, Method, Theoretical Results, Simulation Studies, Application, Conclusions.

Introduction

For survival data, one important goal is to use a subject's baseline information to predict the exact timing of an event. A variety of regression models focus on estimating the survival function and evaluating covariate effects, but not on predicting event times. We therefore consider supervised learning algorithms for prediction: they aim directly at prediction; they are nonparametric; and they are powerful and flexible in handling a large number of predictors.

Many learning methods exist for predicting non-censored outcomes, among which support vector machines (SVMs) are commonly used for binary outcomes. Their appeal includes: a simple geometric interpretation; a convex quadratic programming problem; inclusion of non-linearity through kernels; and adaptation to regression with a continuous response via the ε-insensitive loss.
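As a quick illustration of the two loss functions just mentioned, the minimal Python sketch below contrasts the hinge loss used by SVM classification with the ε-insensitive loss used by support vector regression; the function names and sample values are ours, not from the slides.

```python
import numpy as np

def hinge_loss(margin):
    """Hinge loss for classification: penalizes margins below 1."""
    return np.maximum(0.0, 1.0 - margin)

def eps_insensitive_loss(residual, eps=0.1):
    """SVR loss: residuals inside the epsilon tube cost nothing."""
    return np.maximum(0.0, np.abs(residual) - eps)

print(hinge_loss(np.array([-0.5, 0.3, 1.2])))        # [1.5 0.7 0. ]
print(eps_insensitive_loss(np.array([-0.3, 0.05])))  # [0.2 0. ]
```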

For survival outcomes, censoring poses a new challenge for SVMs. Most existing methods focus on modifying support vector regression: Shivaswamy et al. (2007) and Khan and Zubek (2008) generalized the ε-insensitive loss function.

Van Belle et al. (2009, 2011a) adopted the concept of the concordance index to handle censored data, considering all comparable pairs: for observed times $y_i < y_j$, ranking constraints are added to penalize misranking of the predicted values,
$$f(x_j) - f(x_i) \ge 1 - \zeta_{ij}, \qquad i < j.$$
Van Belle et al. (2011b) included both regression (loss-function modification) and ranking constraints. Limitations of these approaches: the prediction rule is not clear; there is no theoretical justification; the observed information may not be fully used; and censoring is assumed to be completely random.
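To make the "comparable pairs" notion concrete, here is a hedged Python sketch of how such pairs are typically enumerated under right censoring: a pair is comparable when the smaller observed time corresponds to an event, so its ordering relative to the other subject is known. The function is illustrative, not the authors' code.

```python
import numpy as np

def comparable_pairs(time, event):
    """Return pairs (i, j) with time[i] < time[j] and event[i] == 1."""
    pairs = []
    n = len(time)
    for i in range(n):
        for j in range(n):
            if i != j and time[i] < time[j] and event[i] == 1:
                pairs.append((i, j))
    return pairs

time = np.array([2.0, 5.0, 3.0])
event = np.array([1, 0, 0])   # only subject 0 has an observed event
print(comparable_pairs(time, event))  # [(0, 1), (0, 2)]
```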

Goldberg and Kosorok (2013) used inverse probability of censoring weighting (IPCW) to adapt standard support vector methods. Their method may suffer from severe bias when the censoring distribution is misspecified; the learning uses only the uncensored observations; and large weights (under heavy censoring) often make the algorithms numerically unstable or even infeasible.
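For reference, a minimal numpy sketch of the general IPCW idea is given below, using the Kaplan-Meier estimate of the censoring distribution: each uncensored subject is weighted by $1/\hat G(T_i-)$ and censored subjects get weight zero. This is our own illustration, not Goldberg and Kosorok's implementation; note how the event occurring after a censoring receives an inflated weight.

```python
import numpy as np

def ipcw_weights(time, event):
    """Weight uncensored subject i by 1/G_hat(T_i-), censored subjects by 0,
    where G_hat is the Kaplan-Meier estimator of the censoring survival."""
    n = len(time)
    order = np.argsort(time, kind="stable")
    G = np.ones(n)
    surv = 1.0
    for rank, idx in enumerate(order):
        G[idx] = surv                                 # left-continuous G_hat(t-)
        surv *= 1.0 - (1 - event[idx]) / (n - rank)   # censorings are "events" here
    return np.where(event == 1, 1.0 / G, 0.0)

time = np.array([1.0, 2.0, 3.0])
event = np.array([1, 0, 1])
print(ipcw_weights(time, event))  # [1. 0. 2.]
```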

Method

We represent the survival times in the counting-process framework: at each event time, a support vector machine is used to separate the event from the non-events. A risk score
$$f(t, X) = \alpha(t) + g(X)$$
is used to classify the binary outcomes (event vs. non-event) in a maximal-separation sense, where
- $\alpha(t)$ depends on $t$ and is used to stratify times into risk sets;
- $g(x)$ is a function of the covariates used for the actual prediction, e.g. $g(x) = x^T\beta$ for the linear kernel;
- there is severe imbalance between events and non-events at each $t$ (1 vs. $n$).

Notation: counting process $N_i(t) = I(T_i \le t,\ T_i \le C_i)$; at-risk process $Y_i(t) = I(T_i \wedge C_i \ge t)$; total number of events $d = \sum_{i=1}^n I(T_i \le C_i)$.

Primal form:
$$\min_{\alpha,\, g}\ \frac{1}{2}\|g\|^2 + C_n \sum_{i=1}^n \sum_{j=1}^d Y_i(t_j)\, w_i(t_j)\, \zeta_i(t_j)$$
$$\text{s.t.}\quad Y_i(t_j)\,\delta N_i(t_j)\{\alpha(t_j) + g(X_i)\} \ge Y_i(t_j)\{1 - \zeta_i(t_j)\}, \qquad Y_i(t_j)\,\zeta_i(t_j) \ge 0,$$
where $\delta N_i(t_j) \equiv 2\{N_i(t_j) - N_i(t_j-)\} - 1$ and
$$w_i(t_j) = I\{\delta N_i(t_j) = 1\}\left\{1 - \frac{1}{\sum_{i=1}^n Y_i(t_j)}\right\} + I\{\delta N_i(t_j) = -1\}\left\{\frac{1}{\sum_{i=1}^n Y_i(t_j)}\right\}.$$
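To fix ideas, the numpy sketch below builds the arrays entering the primal problem: the at-risk indicators $Y_i(t_j)$, the ±1 labels $\delta N_i(t_j)$, and the rebalancing weights $w_i(t_j)$ at the distinct event times. It is one plausible reading of the construction above (ties in event times are collapsed), and the variable names are ours.

```python
import numpy as np

def svhr_design(time, event):
    """At-risk indicators, +/-1 event labels, and rebalancing weights
    at the distinct event times t_1 < ... < t_d."""
    t_event = np.unique(time[event == 1])                   # distinct event times
    Y = (time[:, None] >= t_event[None, :]).astype(float)   # Y_i(t_j)
    dN = np.where((time[:, None] == t_event[None, :]) & (event[:, None] == 1),
                  1.0, -1.0)                                # delta N_i(t_j)
    n_risk = Y.sum(axis=0)                                  # risk-set size at t_j
    w = np.where(dN > 0, 1.0 - 1.0 / n_risk, 1.0 / n_risk)  # w_i(t_j)
    return Y, dN, w

time = np.array([2.0, 5.0, 3.0, 4.0])
event = np.array([1, 0, 1, 1])
Y, dN, w = svhr_design(time, event)
print(Y.shape)  # (4, 3): 4 subjects at 3 distinct event times
```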

Dual form (to be maximized over $\gamma$):
$$L_D = \sum_{i=1}^n \sum_{j=1}^d \gamma_{ij} Y_i(t_j) - \frac{1}{2} \sum_{i=1}^n \sum_{j=1}^d \sum_{i'=1}^n \sum_{j'=1}^d \gamma_{ij}\gamma_{i'j'}\, Y_i(t_j) Y_{i'}(t_{j'})\, \delta N_i(t_j)\, \delta N_{i'}(t_{j'})\, K(X_i, X_{i'})$$
$$\text{s.t.}\quad 0 \le \gamma_{ij} \le w_i(t_j)\, C_n, \qquad \sum_{i=1}^n \gamma_{ij} Y_i(t_j)\, \delta N_i(t_j) = 0 \ \text{for each } j.$$
The kernel function is $K(X_i, X_{i'}) = g(X_i)^T g(X_{i'})$, and the predictive score is
$$\hat g(x) = \sum_{i=1}^n \sum_{j=1}^d \hat\gamma_{ij}\, Y_i(t_j)\, \delta N_i(t_j)\, K(X_i, x).$$
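A hedged optimization sketch follows, solving this dual with the generic QP modeling package cvxpy over the flattened index pairs with $Y_i(t_j) = 1$; it reuses the design arrays from the previous sketch. This is our illustration, not the authors' solver, and a tiny ridge is added to the Gram matrix purely to satisfy numerical positive-semidefiniteness checks.

```python
import numpy as np
import cvxpy as cp

def solve_svhr_dual(K, Y, dN, w, Cn):
    """Solve the dual QP over gamma_ij, restricted to pairs with Y_i(t_j) = 1."""
    ii, jj = np.nonzero(Y > 0)                      # flatten the (i, j) index pairs
    m = len(ii)
    z = dN[ii, jj]                                  # labels delta N_i(t_j)
    Q = np.outer(z, z) * K[np.ix_(ii, ii)]          # PSD by construction
    Q += 1e-8 * np.eye(m)                           # tiny ridge for the PSD check
    gamma = cp.Variable(m)
    objective = cp.Maximize(cp.sum(gamma) - 0.5 * cp.quad_form(gamma, Q))
    constraints = [gamma >= 0, gamma <= w[ii, jj] * Cn]
    for j in range(Y.shape[1]):                     # balance constraint at each t_j
        idx = np.where(jj == j)[0]
        constraints.append(z[idx] @ gamma[idx] == 0)
    cp.Problem(objective, constraints).solve()
    return gamma.value, ii, jj
```

Note that the number of dual variables is the sum of the risk-set sizes over the event times, which foreshadows the scalability concern raised in the concluding remarks.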

Prediction: use the predictive scores $\hat g(x)$ and the distinct event times in the training data set to predict the survival outcome of a future subject. For example, with training scores paired to event times as $\hat g(x_3) \leftrightarrow T_1$, $\hat g(x_2) \leftrightarrow T_2$, $\hat g(x_1) \leftrightarrow T_3, \ldots$, if the score $\hat g(x_{\mathrm{new}})$ of a future subject is closest to $\hat g(x_2)$, we predict that subject's survival outcome to be $T_2$.
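The prediction step itself reduces to a nearest-score lookup, as in this short sketch (the scores and times are made-up values for illustration):

```python
import numpy as np

def predict_time(g_new, g_train, t_train):
    """Predict the event time of the training subject whose score is nearest."""
    return t_train[np.argmin(np.abs(g_train - g_new))]

g_train = np.array([1.8, 0.9, 0.2])   # hat g(x_1), hat g(x_2), hat g(x_3)
t_train = np.array([2.1, 4.0, 7.5])   # corresponding training event times
print(predict_time(1.0, g_train, t_train))  # 4.0: closest score is hat g(x_2)
```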

Theoretical Results

Empirical risk. Regularization form: $\min_f R_n(f) + \lambda_n \|g\|^2$, where
$$R_n(f) = n^{-1} \sum_{i=1}^n \sum_{j=1}^d w_i(t_j)\, Y_i(t_j)\, \big[1 - \{\alpha(t_j) + g(X_i)\}\,\delta N_i(t_j)\big]_+.$$
After substituting the profiled $\hat\alpha(t_j)$ into $R_n(f)$, we obtain a profile empirical risk $PR_n(g)$ for $g(\cdot)$:
$$PR_n(g) = \frac{1}{n} \sum_{i=1}^n \Delta_i\, \frac{\sum_{k=1}^n I(Y_k \ge Y_i)\,[2 - g(X_i) + g(X_k)]_+}{\sum_{k=1}^n I(Y_k \ge Y_i)} \;-\; \frac{2}{n} \sum_{i=1}^n \frac{\Delta_i}{\sum_{k=1}^n I(Y_k \ge Y_i)},$$
where $Y_i = T_i \wedge C_i$ denotes the observed time and $\Delta_i = I(T_i \le C_i)$. The estimator $\hat g(x)$ minimizes $PR_n(g) + \lambda_n \|g\|^2$, and $R_n(\hat f) = PR_n(\hat g)$. $PR_n(g)$ takes a form similar to the partial likelihood in survival analysis, but under a different loss function.
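Under the reconstruction above, $PR_n(g)$ can be computed directly, as in this hedged numpy sketch; the subtracted $2/n_i$ term removes the constant contributed by pairing each event subject with itself.

```python
import numpy as np

def profile_risk(g, y, delta):
    """Profile empirical risk PR_n(g) for scores g = g(X_i),
    observed times y, and event indicators delta."""
    n = len(y)
    total = 0.0
    for i in range(n):
        if delta[i] == 0:
            continue
        at_risk = y >= y[i]                       # risk set {k : Y_k >= Y_i}
        n_i = at_risk.sum()
        hinge = np.maximum(0.0, 2.0 - g[i] + g[at_risk])
        total += hinge.sum() / n_i - 2.0 / n_i    # remove the self-pair constant
    return total / n
```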

Risk function and optimal decision rule. We consider another empirical risk $R_{0n}(f)$, comparable to the 0-1 loss in standard SVMs:
$$R_{0n}(f) = n^{-1} \sum_{i=1}^n \sum_{j=1}^d w_i(t_j)\, Y_i(t_j)\, I\{\delta N_i(t_j)\, f(t_j, X_i) < 0\}.$$
Denoting the asymptotic limits of $R_n(f)$ and $R_{0n}(f)$ by $R(f)$ and $R_0(f)$, we find the optimal decision rule $f^*(t, x)$ that minimizes both $R(f)$ and $R_0(f)$, and the minimal risk $R_0(f^*)$.

Theorem. Let $h(t, x)$ denote the conditional hazard rate of $T$ at $t$ given $X = x$, and let $\bar h(t) = E[dN(t)/dt]/E[Y(t)] = E[h(t, X) \mid Y(t) = 1]$ be the average hazard rate at time $t$. Then $f^*(t, x) = \mathrm{sign}\{h(t, x) - \bar h(t)\}$ minimizes $R(f)$. Furthermore, $f^*(t, x)$ also minimizes $R_0(f)$, and
$$R_0(f^*) = P(T \le C) - \frac{1}{2}\, E\left[\int E\{Y(t) \mid X = x\}\, \big|h(t, x) - \bar h(t)\big|\, dt\right].$$
In addition, for any $f(t, x) \in [-1, 1]$,
$$R_0(f) - R_0(f^*) \le c\,\{R(f) - R(f^*)\}$$
for some constant $c$.

Asymptotic properties. We consider the profile risk $PR(g)$ and the reproducing kernel Hilbert space $H_n$ generated by a Gaussian kernel, and derive the asymptotic learning rate.

Theorem. Assume that the support of $X$ is compact and that $E[Y(\tau) \mid X]$ is bounded away from zero, where $\tau$ is the study duration. Furthermore, assume $\lambda_n$ and $\sigma_n$ satisfy $\lambda_n \to 0$, $\sigma_n \to 0$, and $n \lambda_n \sigma_n^{(2/p - 1/2)d} \to \infty$ for some $p \in (0, 2)$. Then
$$\lambda_n \|\hat g\|^2_{H_n} + PR(\hat g) \le \inf_g PR(g) + O_p\!\left(\lambda_n + \sigma_n^{d/2} + \frac{\lambda_n^{-1/2}\, \sigma_n^{-(1/p - 1/4)d}}{n}\right).$$

Simulation Studies

Simulation Setup
Our method is compared with Van Belle et al. (2011; Modified SVR) and Goldberg and Kosorok (2013; IPCW). Five baseline covariates $X$, marginally normal with mean 0; survival times are generated from a Cox model with a Weibull baseline distribution and $\beta = (2, 1.6, 1.2, 0.8, 0.4)$. Censoring times depend on the covariates $X$, with censoring ratios of 40% and 60%: Case 1 uses AFT models for the censoring distribution; Case 2 uses Cox models.

Linear kernel; sample sizes 100 and 200; 500 replicates. The tuning parameter $C_n$ is selected by 5-fold cross-validation over the grid $2^{16}, 2^{15}, \ldots$. Prediction performance is evaluated on an independent test data set using correlation and RMSE.
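As a concrete (and deliberately simplified) version of this setup, the sketch below generates Cox-model survival times with a Weibull baseline by inverse-transform sampling from $S(t \mid x) = \exp\{-(t/b)^a e^{x^T\beta}\}$. The Weibull shape and scale, and the covariate-independent exponential censoring, are our illustrative choices; in the slides the censoring actually depends on $X$ through AFT or Cox models.

```python
import numpy as np

rng = np.random.default_rng(0)
n, shape, scale = 200, 2.0, 1.0                 # illustrative Weibull parameters
beta = np.array([2.0, 1.6, 1.2, 0.8, 0.4])
X = rng.normal(size=(n, 5))                     # five baseline covariates
u = rng.uniform(size=n)
# Invert S(t|x): T = b * (-log U * exp(-x'beta))^(1/a)
T = scale * (-np.log(u) * np.exp(-X @ beta)) ** (1.0 / shape)
C = rng.exponential(scale=np.quantile(T, 0.6), size=n)  # crude censoring mechanism
y, delta = np.minimum(T, C), (T <= C).astype(int)
print(f"censoring ratio: {1 - delta.mean():.2f}")
```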

Simulation Results. Table: Case 1, censoring times following the AFT model, 40% censoring. For each method (Modified SVR, IPCW-KM, IPCW-Cox, SVHR), correlation and RMSE (standard errors in parentheses) and the RMSE ratio relative to SVHR are reported at n = 100 and n = 200, across increasing numbers of noise covariates. SVHR is the reference (ratio 1.00); the competing methods' RMSE ratios all exceed 1, i.e., SVHR attains the smallest RMSE throughout.

Table: Case 1, censoring times following the AFT model, 60% censoring. Layout as in the previous table; SVHR again serves as the reference (ratio 1.00), with all competing RMSE ratios above 1.

Table: Case 2, censoring times following the Cox model, 40% censoring. Layout as in the previous tables; SVHR is the reference (ratio 1.00), with all competing RMSE ratios above 1.

Table: Case 2, censoring times following the Cox model, 60% censoring. Layout as in the previous tables; SVHR is the reference (ratio 1.00), with all competing RMSE ratios above 1.

Application

Huntington's Disease Study Data
The study aims to identify and combine clinical and biological markers to detect early indicators of disease progression. The outcome is the age observed at the event or at censoring. There are 705 subjects, of whom 126 are non-censored. We study the capability of 15 covariates to predict the age at onset of Huntington's disease. Three-fold cross-validation is used to choose the tuning parameter $C_n$, and both linear and Gaussian kernels are considered. To evaluate predictive capability, we assess the usefulness of the combined score for risk stratification.
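The risk-stratification evaluation can be sketched as follows: split the subjects at a percentile of the predicted score, then compare the two groups with a log-rank test and a hazard ratio from a one-covariate Cox model. The sketch below uses the lifelines package; the function and its inputs are illustrative, not the authors' code.

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter
from lifelines.statistics import logrank_test

def stratify_and_test(score, y, delta, pct=50):
    """Split at a percentile of the predicted score; return the log-rank
    chi-square statistic and the hazard ratio between the two groups."""
    high = score >= np.percentile(score, pct)
    lr = logrank_test(y[high], y[~high],
                      event_observed_A=delta[high],
                      event_observed_B=delta[~high])
    df = pd.DataFrame({"time": y, "event": delta, "high": high.astype(int)})
    cph = CoxPHFitter().fit(df, duration_col="time", event_col="event")
    hr = float(np.exp(cph.params_["high"]))       # hazard ratio for the split
    return lr.test_statistic, hr
```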

Table: Normalized coefficient estimates using the linear kernel for the 15 markers: Total Motor Score*, CAP*, Stroop Color, Stroop Word, SDMT, Stroop Interference, FRSBE Total, UHDRS Psychiatric, SCL90 Depression, SCL90 GSI, SCL90 PST, SCL90 PSDI, TFC, Education, and Male Gender*. Markers whose Cox-model estimates have significant p-values (p < 0.05) are marked with *.

Table: Comparison of predictive capability across methods (Modified SVR, IPCW-KM, IPCW-Cox, SVHR) with linear and Gaussian kernels. Reported: C-index, plus log-rank χ² statistics and hazard ratios (HR) comparing the two groups formed by splitting the predicted values at the 25th, 50th, and 75th percentiles.

Figure: Hazard ratios comparing the two groups formed at each percentile cut point of the predicted values, for the linear kernel (left panel) and the Gaussian kernel (right panel). Dotted curve: Modified SVR; dashed curve: IPCW-KM; dash-dotted curve: IPCW-Cox; solid black curve: SVHR.

Atherosclerosis Risk in Communities Study Data
A prospective epidemiologic study investigating the etiology of atherosclerosis and cardiovascular risk factors. The baseline examination enrolled 15,792 participants from four U.S. communities. In this example, we apply our method to part of the baseline data: African-American male participants with hypertension living in Jackson, Mississippi. There are 624 participants, of whom 133 are non-censored. We assess the capability of common cardiovascular risk factors to predict incident heart failure, analyzing the data with the same procedure as in the previous example.

Table: Normalized coefficient estimates using the linear kernel for the covariates: Age (years)*, Diabetes*, BMI (kg/m²), SBP (mm Hg), Fasting glucose (mg/dL), Serum albumin (g/dL)*, Serum creatinine (mg/dL), Heart rate (beats/minute), Left ventricular hypertrophy*, Bundle branch block*, Prevalent CHD*, Valvular heart disease*, HDL (mg/dL)*, LDL (mg/dL), Pack-years of smoking*, Current smoking status, and Former smoking status*. Covariates whose Cox-model estimates have significant p-values (p < 0.05) are marked with *.

Table: Comparison of predictive capability across methods (Modified SVR, IPCW-KM, IPCW-Cox, and our method, SVHR) with linear and Gaussian kernels. Reported: C-index, plus log-rank χ² statistics and hazard ratios (HR) comparing the two groups formed by splitting the predicted values at the 25th, 50th, and 75th percentiles.

Figure: Hazard ratios comparing the two groups formed at each percentile cut point of the predicted values, for the linear kernel (left panel) and the Gaussian kernel (right panel). Dotted curve: Modified SVR; dashed curve: IPCW-KM; dash-dotted curve: IPCW-Cox; solid black curve: SVHR.

Conclusions

Concluding Remarks
We adapted the SVM learning algorithm to predict event times within the counting-process framework. The proposed method is optimal in discriminating the covariate-specific hazard from the population-average hazard, and it handles censored data appropriately without specifying the censoring distribution. Numerical studies showed the superiority of SVHR, especially under high censoring ratios and in the presence of noise variables. One remaining challenge is that the dimension of the quadratic programming problem grows rapidly with the sample size.
