
Model-based boosting in R: Families in mboost

Nikolay Robinzonov (nikolay.robinzonov@stat.uni-muenchen.de)
Institut für Statistik, Ludwig-Maximilians-Universität München
Statistical Computing 2011

Model-based boosting

    ξ(y | X = x) = Σ_{j=1}^p f_j(x)

The right-hand side is a sum of components, each taking a subset of the covariates into account through a base-learner. The left-hand side describes some characteristic of the conditional distribution of the response, specified through the loss function (family).

Families: Gaussian, QuantReg, Binomial, Poisson, CoxPH, ...
Modular nature; base-learners: bols, bbs, btree, bspatial, brandom
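As a minimal sketch of this modularity (using the mtcars data that appears on later slides as a stand-in example), any family can be paired with any combination of base-learners:

R> library("mboost")
R> ## linear effect of wt, smooth P-spline effect of hp;
R> ## changing the family swaps the loss, the base-learners stay the same
R> fit <- gamboost(100/mpg ~ bols(wt) + bbs(hp), data = mtcars,
+                  family = Gaussian())   # or e.g. QuantReg(tau = 0.5)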

Families for Continuous Response

1. L2-loss, Gaussian()

    ρ(y, f) = (y - f)²/2

R> Gaussian()

	 Squared Error (Regression) 

Loss function: (y - f)^2 

R> Gaussian()@ngradient
function (y, f, w = 1) 
y - f

[Figure: the L2 loss plotted against y - f.]

L2-Boosting. The default family in glmboost() and gamboost(). Suited to a normally distributed response.
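A quick sketch of the default behaviour (again with mtcars as a stand-in): leaving the family argument out is the same as spelling out Gaussian().

R> ## identical fits: Gaussian() is the default family
R> fit1 <- glmboost(100/mpg ~ wt + hp, data = mtcars)
R> fit2 <- glmboost(100/mpg ~ wt + hp, data = mtcars, family = Gaussian())
R> coef(fit1)   # componentwise boosting selects and shrinks coefficients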

2. L1-loss, Laplace()

    ρ(y, f) = |y - f|

[Figure: the L1 loss plotted against y - f.]

A median regression approach. Resistant to long-tailed error distributions and outliers (robustness).

R> fit <- gamboost(100/mpg ~ bols(wt) + bols(hp), data = mtcars,
+                  family = Laplace())

3. Huber loss, Huber()

    ρ(y, f; δ) = (y - f)²/2           if |y - f| ≤ δ,
                 δ(|y - f| - δ/2)     if |y - f| > δ

R> fit <- gamboost(100/mpg ~ bols(wt) + bols(hp), data = mtcars,
+                  family = Huber(d = NULL))

A compromise between the L1 and L2 losses. δ defines which residuals count as outliers and are subjected to the absolute-error loss. Friedman (2001) proposed a strategy for adaptively changing δ at each boosting step (the default in mboost):

    δ^[m] = median(|y_i - f̂^[m-1](x_i)|, i = 1, ..., n)

[Figure: the Huber loss plotted against y - f for δ = 0.2, 0.5, 1, 2, 10.]
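A hedged sketch of the two ways to set δ: d = NULL (the default) requests Friedman's adaptive rule above, while a numeric value holds δ fixed over all boosting steps.

R> ## adaptive delta, recomputed from the current residuals at each step
R> fit_ad <- gamboost(100/mpg ~ bols(wt) + bols(hp), data = mtcars,
+                     family = Huber(d = NULL))
R> ## fixed delta: residuals beyond 0.5 get the absolute-error branch
R> fit_fix <- gamboost(100/mpg ~ bols(wt) + bols(hp), data = mtcars,
+                      family = Huber(d = 0.5))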

4. Check-function loss, QuantReg()

    ρ(y, f; τ) = (y - f) τ          if (y - f) ≥ 0,
                 (y - f)(τ - 1)     if (y - f) < 0.

R> fit <- gamboost(100/mpg ~ bols(wt) + bols(hp), data = mtcars,
+                  family = QuantReg(tau = 0.5))

One needs to specify the quantile τ. Note that Laplace() is essentially the same as QuantReg(tau = 0.5) with nu = 0.2. For further details see the next talk and Fenske et al. (2011).

[Figure: the check-function loss plotted against y - f for τ = 0.5, 0.6, 0.7, 0.8, 0.9.]
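Since τ is the family's only tuning constant, several conditional quantiles can be fitted with one loop; a minimal sketch:

R> taus <- c(0.1, 0.5, 0.9)
R> fits <- lapply(taus, function(tau)
+     gamboost(100/mpg ~ bols(wt) + bols(hp), data = mtcars,
+              family = QuantReg(tau = tau)))
R> ## one column of fitted values per conditional quantile
R> preds <- sapply(fits, fitted)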

Families for Binary Response

Binary response

The Binomial() family uses the negative binomial log-likelihood as a loss function. The fit f̂ = Σ_{j=1}^p f̂_j(x) equals half of the log-odds ratio:

    π(f̂) = P(Y = 1 | x) = logit⁻¹(2f̂) = exp(2f̂) / (1 + exp(2f̂))

    ρ(y, f) = -[y log(π(f̂)) + (1 - y) log(1 - π(f̂))] = log(1 + exp(-2 ỹ f̂))

The binary response Y ∈ {0, 1} is internally re-coded to Ỹ = 2Y - 1 ∈ {-1, 1}. f̂ equals half of the logits (see the next talks). You can choose between Binomial(link = "logit") and Binomial(link = "probit").
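A minimal sketch using the binary transmission indicator am in mtcars as a stand-in response; mboost expects a two-level factor, and predictions on the response scale are the probabilities π(f̂):

R> mtcars$am <- factor(mtcars$am)
R> fit <- gamboost(am ~ bols(wt) + bols(hp), data = mtcars,
+                  family = Binomial(link = "logit"))
R> ## P(Y = 1 | x); the half-logit scaling of f-hat is handled internally
R> head(predict(fit, type = "response"))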

Binary response, AdaExp()

    ρ(y, f) = exp(-ỹf)

This essentially leads to the AdaBoost algorithm of Freund and Schapire (1996). Similar to Binomial().

[Figure: the Binomial and AdaExp losses plotted against ỹf.]
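Switching to the exponential loss is, again, only a change of family; a sketch with the same stand-in data as above (response coded as a two-level factor):

R> fit_ada <- gamboost(am ~ bols(wt) + bols(hp), data = mtcars,
+                      family = AdaExp())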

Families for Count & Censored Responses

Count response

Poisson() implements a family for fitting count data. The loss function is the negative Poisson log-likelihood, with the natural log link, log(µ) = η.

NBinomial() leads to regression models with a negative binomial conditional distribution of the response, suitable for overdispersed count data (Schmid et al., 2010).
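A hedged sketch on simulated overdispersed counts (the data are made up purely for illustration):

R> set.seed(29)
R> d <- data.frame(x = runif(200))
R> ## negative binomial counts: mean exp(1 + 2x), dispersion size = 2
R> d$y <- rnbinom(200, mu = exp(1 + 2 * d$x), size = 2)
R> fit_pois <- gamboost(y ~ bbs(x), data = d, family = Poisson())
R> ## NBinomial() also estimates the overdispersion parameter
R> fit_nb <- gamboost(y ~ bbs(x), data = d, family = NBinomial())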

Censored response

CoxPH() is suitable for survival models. This family implements the negative partial log-likelihood for Cox models, resulting in a proportional hazards model.

Weibull() implements the negative log-likelihood of accelerated failure time models with Weibull-distributed outcomes (see Schmid and Hothorn, 2008, for further details).

Loglog() and Lognormal() are analogous to Weibull() but designed for log-logistic and log-normal outcomes, respectively.
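A minimal sketch on simulated right-censored survival times (made up for illustration); the response is a survival::Surv object:

R> library("survival")
R> set.seed(29)
R> d <- data.frame(x1 = runif(200), x2 = runif(200))
R> ev <- rexp(200, rate = exp(0.5 * d$x1 - 0.5 * d$x2))   # event times
R> cens <- rexp(200, rate = 0.2)                          # censoring times
R> d$y <- Surv(pmin(ev, cens), ev <= cens)                # time + status
R> fit <- gamboost(y ~ bols(x1) + bols(x2), data = d, family = CoxPH())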

Take-home notes

A flexible framework addressing a rich set of scientific questions. A large number of possible combinations of families and base-learners. New insights into our data.

References

Nora Fenske, Thomas Kneib, and Torsten Hothorn. Identifying risk factors for severe childhood malnutrition by boosting additive quantile regression. Journal of the American Statistical Association, to appear, 2011.

Yoav Freund and Robert E. Schapire. Experiments with a new boosting algorithm. In Machine Learning: Proceedings of the Thirteenth International Conference, pages 148-156, 1996.

Jerome H. Friedman. Greedy function approximation: A gradient boosting machine. The Annals of Statistics, 29(5):1189-1232, 2001.

Matthias Schmid and Torsten Hothorn. Flexible boosting of accelerated failure time models. BMC Bioinformatics, 9(269), 2008.

Matthias Schmid, Sergej Potapov, Annette Pfahlberg, and Torsten Hothorn. Estimation and regularization techniques for regression models with multidimensional prediction functions. Statistics and Computing, 20:139-150, 2010.