COMPSTAT2010 in Paris. Hiroki Motogaito. Masashi Goto


1 COMPSTAT2010 in Paris. Ensembled Multivariate Adaptive Regression Splines with Nonnegative Garrote Estimator. Hiroki Motogaito (Osaka University) and Masashi Goto (Biostatistical Research Association, NPO, Japan)

2 Agenda: Introduction and motivation; Tree methods: Multivariate Adaptive Regression Splines (MARS), Bagging MARS; Our proposed method: Ensembled MARS with nonnegative garrote; Example and simulation; Concluding remarks

3 Agenda: Introduction and motivation; Tree methods: Multivariate Adaptive Regression Splines (MARS), Bagging MARS; Our proposed method: Ensembled MARS with nonnegative garrote; Example and simulation; Concluding remarks

4 Introduction and motivation. MARS (Friedman, 1991) produces a single fitted function $\hat f(x)$ that is unstable. Bagging (Breiman, 1996) stabilizes $\hat f(x)$ by averaging many fits, but the averaged model is less interpretable. Motivation: a new version of MARS that has both stability and interpretability.

5 Agenda: Introduction and motivation; Tree methods: Multivariate Adaptive Regression Splines (MARS), Bagging MARS; Our proposed method: Ensembled MARS with nonnegative garrote; Example and simulation; Concluding remarks

6 Multivariate Adaptive Regression Splines (Friedman, 1991)
Model form (regression model): $\hat f(x) = \hat\beta_0 + \sum_{m=1}^{M} \hat\beta_m B_m(x)$
Basis function: $B_m(x) = \prod_{k=1}^{K_m} \big[\, s_{k,m}\,(x_{p(k,m)} - t_{k,m}) \,\big]_+$
Algorithm: a forward stepwise pass increases the number of basis functions; a backward stepwise pass prunes them off and selects the best model.
[Figure: values of a mirrored pair of hinge basis functions of the form $[\pm(x_p - t)]_+$ with knot $t = 0.5$; vertical axis: value of the basis function.]
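To make the basis-function notation concrete, here is a minimal Python sketch of the truncated hinge functions that MARS multiplies together; the function and variable names are illustrative choices, not the authors' implementation.

import numpy as np

def hinge(x, knot, sign=1.0):
    """Truncated linear basis [sign * (x - knot)]_+ used by MARS."""
    return np.maximum(sign * (x - knot), 0.0)

def mars_basis(X, terms):
    """Evaluate one MARS basis function B_m(x) on the rows of X.

    `terms` is a list of (variable index, knot, sign) triples, i.e. the
    (p(k,m), t_{k,m}, s_{k,m}) appearing in the product above.
    """
    value = np.ones(X.shape[0])
    for var, knot, sign in terms:
        value *= hinge(X[:, var], knot, sign)
    return value

# Example: the mirrored pair [x_1 - 0.5]_+ and [-(x_1 - 0.5)]_+ from the figure
X = np.random.default_rng(0).uniform(size=(5, 4))
print(mars_basis(X, [(0, 0.5, +1.0)]), mars_basis(X, [(0, 0.5, -1.0)]))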

7 Bagging (Breiman, 1996)
Model form (Bagging MARS, regression model): $\hat f(x) = \frac{1}{E} \sum_{e=1}^{E} \hat f_e(x)$, where each $\hat f_e(x)$ is a MARS model.
Algorithm: draw E bootstrap samples from the data, fit a MARS model $\hat f_e(x)$ to each bootstrap sample, and average the E fitted models to obtain $\hat f(x)$.
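A minimal sketch of the Bagging MARS estimator $\hat f(x) = \frac{1}{E}\sum_{e=1}^{E}\hat f_e(x)$, assuming a scikit-learn-style MARS implementation such as the py-earth package's Earth class is available; any regressor with fit/predict methods could stand in for it.

import numpy as np
from pyearth import Earth  # assumed MARS implementation (py-earth package)

def bagging_mars(X, y, E=50, random_state=0):
    """Fit E MARS models, one per bootstrap sample, and return them as a list."""
    rng = np.random.default_rng(random_state)
    n = X.shape[0]
    models = []
    for _ in range(E):
        idx = rng.integers(0, n, size=n)            # bootstrap sample of size n
        models.append(Earth().fit(X[idx], y[idx]))
    return models

def predict_bagging(models, X_new):
    """Average the E MARS predictions: f_hat(x) = (1/E) * sum_e f_hat_e(x)."""
    return np.mean([m.predict(X_new) for m in models], axis=0)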

8 Agenda: Introduction and motivation; Previous research: Multivariate Adaptive Regression Splines (MARS), Bagging MARS; Our proposed method: Ensembled MARS with nonnegative garrote; Example and simulation; Concluding remarks

9 Proposed method. Motivation: a new version of MARS that has both stability and interpretability. Bagging is stable but less interpretable; the goal is a method that is both stable and interpretable. [Diagram: Bagging + nonnegative garrote (Breiman, 1995) → selection and ranking of trees → typical tree → proposed method.]

10 Ensembled MARS with non-negative garrote (1/2)
Model form (regression model): $\hat f(x) = \sum_{e=1}^{E} \hat c_e \hat f_e(x)$, where $\hat f_e(x)$ is a MARS model and $\hat c_e$ is a non-negative garrote estimator.
Algorithm:
1. Generate bagging trees.
2. Attach a coefficient $c_e$ to each tree and estimate $c_e$ using the nonnegative garrote (Breiman, 1995).
3. Select candidate trees (if $\hat c_e = 0$, the tree is removed).
4. Obtain $\hat f(x) = \sum_{e=1}^{E} \hat c_e \hat f_e(x)$.
The structure is interpreted through the typical tree, the tree with the largest $\hat c_e$.
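Read as pseudocode, the four steps above might look roughly as follows; this is only a sketch, reusing the hypothetical bagging_mars helper from the Bagging slide and the garrote solver sketched with the next slide.

import numpy as np

def ensembled_mars_nng(X, y, E=50):
    """Sketch of the proposed procedure: bag MARS trees, then weight them by NNG."""
    models = bagging_mars(X, y, E=E)                      # 1. generate bagging trees
    F = np.column_stack([m.predict(X) for m in models])   #    N x E matrix of tree predictions
    c_hat = nonnegative_garrote_weights(F, y)             # 2. estimate c_e by the garrote
    kept = [e for e in range(E) if c_hat[e] > 0]          # 3. trees with c_e = 0 are removed
    typical = max(kept, key=lambda e: c_hat[e])           #    typical tree: the largest c_e

    def predict(X_new):                                   # 4. f_hat(x) = sum_e c_hat_e f_hat_e(x)
        F_new = np.column_stack([models[e].predict(X_new) for e in kept])
        return F_new @ c_hat[kept]

    return predict, c_hat, typical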

11 Ensembled MARS with non-negative garrote (2/2)
Non-negative garrote (Breiman, 1995):
$\{\hat c_p\}_{p=1}^{P} = \arg\min_{\{c_p\}} \sum_{n=1}^{N} \Big( Y_n - \sum_{p=1}^{P} c_p \hat\beta_p x_{np} \Big)^2$, subject to $c_p \ge 0$ and $\sum_{p=1}^{P} c_p \le s$,
where $\hat\beta_p$ is the least squares estimator and $s > 0$.
Ensembled MARS with non-negative garrote:
$\{\hat c_e\}_{e=1}^{E} = \arg\min_{\{c_e\}} \sum_{n=1}^{N} \Big( Y_n - \sum_{e=1}^{E} c_e \hat f_e(x_n) \Big)^2$, subject to $c_e \ge 0$ and $\sum_{e=1}^{E} c_e \le 1$,
where $\hat f_e(x_n)$ is a MARS model.
Characteristics: setting all $c_e = 1/E$ recovers Bagging, and selection of an optimal $s$ is unnecessary ($s = 1$).
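One way to compute the weights defined above (least squares in the tree predictions subject to $c_e \ge 0$ and $\sum_e c_e \le 1$) is as a small constrained quadratic program; the sketch below uses SciPy's SLSQP solver and is an illustration of the criterion, not the authors' code.

import numpy as np
from scipy.optimize import minimize

def nonnegative_garrote_weights(F, y, s=1.0):
    """Minimise sum_n (y_n - sum_e c_e F[n, e])^2 subject to c_e >= 0 and sum_e c_e <= s.

    F is the N x E matrix of per-tree predictions f_hat_e(x_n); s = 1 by default,
    matching the slide's remark that no search over s is needed."""
    E = F.shape[1]
    objective = lambda c: np.sum((y - F @ c) ** 2)
    constraints = [{"type": "ineq", "fun": lambda c: s - np.sum(c)}]  # sum_e c_e <= s
    bounds = [(0.0, None)] * E                                        # c_e >= 0
    c0 = np.full(E, 1.0 / E)                                          # bagging weights as the start
    res = minimize(objective, c0, method="SLSQP", bounds=bounds, constraints=constraints)
    return np.where(res.x < 1e-8, 0.0, res.x)   # clip numerically tiny weights to exact zeros

Starting from c_e = 1/E is natural because that choice reproduces plain Bagging; the garrote then moves weight onto the more useful trees and zeroes out the rest.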

12 Agenda: Introduction and motivation; Previous research: Multivariate Adaptive Regression Splines (MARS), Bagging MARS; Our proposed method: Ensembled MARS with non-negative garrote; Example and simulation; Concluding remarks

13 Literature example: prostate cancer data (Stamey et al., 1989; Tibshirani, 1996)
y: level of prostate-specific antigen
x = (x_1, ..., x_8)^T: clinical measures
x_1: log of tumor size
x_2: weight of the prostate
x_3: patient's age
x_4: log of benign prostatic hyperplasia amount
x_5: dummy variable for whether the cancer has metastasized to the seminal vesicle
x_6: log of capsular penetration
x_7: Gleason score
x_8: percentage of Gleason scores 4 or 5
Sample size: N = 97
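The data are the widely used Stamey et al. (1989) prostate measurements. A minimal loading sketch is given below; the URL, separator, and column names refer to the copy distributed with The Elements of Statistical Learning and are assumptions, not details taken from the slides.

import pandas as pd

# ESL copy of the Stamey et al. (1989) prostate data (assumed location and format)
URL = "https://web.stanford.edu/~hastie/ElemStatLearn/datasets/prostate.data"
prostate = pd.read_csv(URL, sep="\t", index_col=0)
y = prostate["lpsa"]                          # (log) prostate-specific antigen level
X = prostate.drop(columns=["lpsa", "train"])  # the eight clinical covariates x_1, ..., x_8
print(X.shape)                                # expected: (97, 8)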

14 Literature example. [Figure: GCV comparison of Ensembled MARS-NNG and Bagging MARS.]

15 Literature example. Number of trees retained: Bagging 97, Ensembled MARS-NNG 9. [Figure: structure of the candidate trees and the typical tree selected by Ensembled MARS-NNG; the displayed basis functions involve x_1, x_2, and x_4.]

16 Small simulation
Design. Model (Friedman, 1991): $y = 10\sin(\pi x_1 x_2) + 20(x_3 - 0.5)^2 + 10 x_4 + 5 x_5 + \varepsilon$, where $\varepsilon \sim N(0,1)$.
Training sample size: 100. Testing sample size: 1,000. Number of simulations: 100.
Methods: MARS, Bagging MARS, Ensembled MARS-NNG.
Evaluation: MSESTD (standardized mean squared error).
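A minimal sketch of this simulation design, assuming the model is Friedman's (1991) benchmark function as reconstructed above and that MSESTD is the test mean squared error divided by the variance of the test responses (one common definition; the slide does not spell it out):

import numpy as np

def friedman_data(n, rng):
    """Draw (X, y) with y = 10 sin(pi x1 x2) + 20 (x3 - 0.5)^2 + 10 x4 + 5 x5 + eps."""
    X = rng.uniform(size=(n, 5))
    f = (10 * np.sin(np.pi * X[:, 0] * X[:, 1])
         + 20 * (X[:, 2] - 0.5) ** 2 + 10 * X[:, 3] + 5 * X[:, 4])
    return X, f + rng.normal(size=n)              # eps ~ N(0, 1)

def mse_std(y_true, y_pred):
    """Standardized MSE: mean squared error scaled by the variance of y_true."""
    return np.mean((y_true - y_pred) ** 2) / np.var(y_true)

rng = np.random.default_rng(2010)
X_train, y_train = friedman_data(100, rng)        # training sample size: 100
X_test, y_test = friedman_data(1000, rng)         # testing sample size: 1,000
# Repeat 100 times, fitting MARS, Bagging MARS, and Ensembled MARS-NNG to each
# training set and recording mse_std on the corresponding test set.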

17 Small simulation. [Figure: MSESTD for MARS, Bagging MARS, and Ensembled MARS-NNG; average number of trees retained by Ensembled MARS-NNG: 11.6.]

18 Agenda: Introduction and motivation; Previous research: Multivariate Adaptive Regression Splines (MARS), Bagging MARS; Our proposed method: Ensembled MARS with non-negative garrote; Example and simulation; Concluding remarks

19 Concluding remarks
We proposed a new ensembled method for MARS.
The proposed method is stable and interpretable.
Ensembled MARS-NNG provided results superior or comparable to MARS and Bagging MARS.

20 References
Breiman, L. (1995). Better subset regression using the nonnegative garrote. Technometrics, 37(4), 373-384.
Breiman, L. (1996). Bagging predictors. Machine Learning, 24, 123-140.
Breiman, L., Friedman, J. H., Olshen, R. A. & Stone, C. J. (1984). Classification And Regression Trees. Wadsworth.
Friedman, J. H. (1991). Multivariate adaptive regression splines (with discussion). Annals of Statistics, 19, 1-67.
Friedman, J. H. (2001). Greedy function approximation: a gradient boosting machine. Annals of Statistics, 29(5), 1189-1232.
Meinshausen, N. (2009). Node Harvest: simple and interpretable regression and classification. arXiv preprint.
Motogaito, H., Sugimoto, T. & Goto, M. (2007). Multivariate Adaptive Regression Splines with Non-negative Garrote Estimator. Japanese Journal of Applied Statistics, 36 (in Japanese).
Yuan, M. & Lin, Y. (2007). On the non-negative garrote estimator. Journal of the Royal Statistical Society, Series B, 69(2), 143-161.

21 Thank you very much for your attention. (Contact: osaka-u.ac.jp)

22 Backup slides

23 Small simulation (backup). [Figure: additional simulation results for Ensembled MARS-NNG, Bagging MARS, and MARS.]

24 Literature example (backup). [Figure: structure of the fitted MARS model; the displayed basis functions involve x_1, x_2, x_3, x_6, and x_8.]

25 Literature example (backup). [Figure: structure of the fitted Ensembled MARS-NNG model; the displayed basis functions involve x_1, x_2, and x_4.]
