Breiman, L., Friedman, J., Olshen, R. and Stone, C. (1984), Classification and Regression Trees, Wadsworth, Belmont, CA.
- Aubrey Melton
Bibliography

Breiman, L., Friedman, J., Olshen, R. and Stone, C. (1984), Classification and Regression Trees, Wadsworth, Belmont, CA.
Breiman, L. (1999), Random Forests, Statistics Department, University of California, Berkeley, CA.
Breiman, L. (2001), Statistical modeling: The two cultures, Statist. Sci. 16, No. 3.
Breiman, L. (2005), Correspondence.
Gamel, J., McLean, I. and Greenberg, R. (1988), Interval-by-interval Cox model analysis of 3680 cases of intraocular melanomas shows a decline in the prognostic value of size and cell type over time and tumor excision, Cancer 61.
Gamel, J., Greenberg, R. and McLean, I. (1998), A stable linear algorithm for fitting the lognormal model to survival data, Computers and Biomedical Research, No. 31: 38-47.
Gamel, J., George, S., Edwards, M. and Seigler, H. (2002), The long-term clinical course of patients with cutaneous melanoma, Cancer 95, No. 6.
Hofstadter, D. (1979), Gödel, Escher, Bach: an Eternal Golden Braid, Basic Books Inc., New York.
Seigler (2005), Personal communication.
Slingluff, C., Vollmer, R., Reintgen, D. and Seigler, H. (1988), Lethal thin malignant melanoma: Identifying patients at risk, Ann. Surg. 208, No. 2.
Stadelmann, W., Rapaport, D., Soong, S. et al. (1998), Prognostic factors that influence melanoma outcome. In: Balch, C., Houghton, A., Sober, A., eds., Cutaneous Melanoma, 3rd ed., St. Louis, MO: Quality Medical Publishing.
Venables, W. and Ripley, B., Modern Applied Statistics with S-PLUS, Springer-Verlag New York, Inc.
Vollmer, R. and Seigler, H. (2001A), A model for pretest probability of lymph node metastasis from cutaneous melanoma, Am. J. Clin. Pathol. 114.
Vollmer, R. and Seigler, H. (2001B), Using a continuous transformation of the Breslow thickness for prognosis in cutaneous melanoma, Am. J. Clin. Pathol. 115.
R 2.01 (2005), The R Foundation for Statistical Computing, cran@r-project.org

Packages

Sarkar, D. (2004), Lattice Graphics, Implementation of Trellis Graphics.
Breiman, L., Cutler, A., Liaw, A. and Wiener, M. (2005), randomForest: Breiman and Cutler's random forests for classification and regression.
Ripley, B. (2005), tree: Classification and regression trees.
APPENDIX
Fig. 1. Cross Validation Deviance Plot (axes: size, deviance).
Fig. 2. Cross Validation Misclassification Plot (axes: size, misclass).
Fig. 3. Plot of Full Tree without Text, All Patients, All Variables and Any Recurrence.
Fig. 4. Plot of Tree, All Patients, All Variables and Any Recurrence. Pruned Tree k=30. Split labels: dextent:abc, stggrp:ab, typcas:a.
Fig. 5. Plot of Tree, All Patients, All Variables and Any Recurrence. Pruned Tree k=8. Split labels: dextent:abc, stggrp:ab, stggrp:bc, typcas:a, prisite:bcde, clark:f, anyimm:a, side:b, prisite:abcfghi, hist:abcegh, clark:ac, AGE < 37.2, prisite:bcg, anyimm:a, clark:bc, clark:bce, AGE <
Fig. 6. The Variable Importance Plot of All Patients, All Variables and Any Recurrence (L). The Variable Importance Plot of All Patients, All Variables and Recurrence More Than Local (R). Axis: MeanDecreaseAccuracy (Importance).
Variables as plotted (L): dextent, stggrp, hist, histgrp, clark, satel, race, prisite, ulcer, AGE, THICK, side, typcas, sex, anyimm.
Variables as plotted (R): dextent, stggrp, histgrp, hist, race, satel, prisite, clark, AGE, ulcer, side, typcas, sex, anyimm, THICK.

Fig. 7. The Variable Importance Plot of All Patients, Leave Out STGGRP, DEXTENT and TYPCAS, Any Recurrence (L). The Variable Importance Plot of All Patients, Leave Out STGGRP, DEXTENT and TYPCAS, Recurrence More Than Local (R). Axis: MeanDecreaseAccuracy (Importance).
Variables as plotted (L): prisite, hist, histgrp, clark, race, side, ulcer, AGE, sex, anyimm, THICK, satel.
Variables as plotted (R): prisite, AGE, hist, race, side, histgrp, ulcer, sex, anyimm, satel, clark, THICK.

Fig. 8. The Variable Importance Plot of Limited Patients, All Variables and Any Recurrence (L). The Variable Importance Plot of Limited Patients, All Variables and Recurrence More Than Local (R). Axis: MeanDecreaseAccuracy (Importance).
Variables as plotted (L): dextent, hist, histgrp, clark, AGE, race, prisite, ulcer, satel, side, THICK, typcas, sex, anyimm, stggrp.
Variables as plotted (R): dextent, hist, histgrp, ulcer, race, side, prisite, satel, sex, stggrp, anyimm, AGE, THICK, clark.
Fig. 9. Text Full Tree, All Patients, All Variables, Any Recurrence

1) root ( )
2) dextent: 0,1, ( )
4) stggrp: 0, ( )
8) typcas: ( )
16) anyimm: ( )
32) prisite: 1,2,3,6,7,12, ( )
64) AGE < ( )
128) satel: ( ) *
129) satel: ( )
258) AGE < ( )
516) AGE < ( ) *
517) AGE > ( ) *
259) AGE > ( ) *
65) AGE > ( )
130) AGE < ( ) *
131) AGE > ( )
262) histgrp: 1,2,3, ( )
524) prisite: 1,2,3,7, ( )
1048) AGE < ( )
2096) AGE < ( ) *
2097) AGE > ( )
4194) side: ( ) *
4195) side: ( ) *
1049) AGE > ( ) *
525) prisite: ( )
1050) sex: ( ) *
1051) sex: ( )
2102) AGE < ( ) *
2103) AGE > ( ) *
263) histgrp: 4, ( )
526) sex: ( )
1052) AGE < ( ) *
1053) AGE > ( ) *
527) sex: ( ) *
33) prisite: 4, ( )
66) histgrp: 2,4, ( )
132) AGE < ( ) *
133) AGE > ( ) *
67) histgrp: 1, ( )
134) clark: ( ) *
135) clark: 1, ( ) *
17) anyimm: ( )
34) hist: 1,2,3,6,12, ( )
68) prisite: 2,3, ( )
136) clark: 2,4,5, ( )
272) side: ( )
544) prisite: 2, ( ) *
545) prisite: ( )
1090) THICK < ( ) *
1091) THICK > ( )
2182) THICK < ( ) *
2183) THICK > ( )
4366) AGE < ( ) *
4367) AGE > ( )
8734) AGE < ( ) *
8735) AGE > ( ) *
273) side: ( ) *
137) clark: 1, ( )
274) THICK < ( )
548) sex: ( ) *
549) sex: ( )
1098) AGE < ( )
2196) AGE < ( ) *
2197) AGE > ( ) *
1099) AGE > ( ) *
275) THICK > ( )
550) THICK < ( ) *
551) THICK > ( )
1102) histgrp: 1,3, ( ) *
1103) histgrp: ( )
2206) AGE < ( )
4412) AGE < ( )
8824) AGE < ( )
17648) AGE < ( )
35296) AGE < ( )
70592) side: ( )
) THICK < ( )
) satel: 1, ( ) *
) satel: ( ) *
) THICK > ( )
) ulcer: 1, ( )
) THICK < ( ) *
) THICK > ( )
) THICK < ( ) *
) THICK > ( ) *
) ulcer: ( )
) THICK < ( ) *
) THICK > ( ) *
70593) side: ( )
) THICK < ( )
) THICK < ( )
) THICK < ( ) *
) THICK > ( )
) AGE < ( ) *
) AGE > ( ) *
282373) THICK > ( ) *
) THICK > ( )
) THICK < ( )
) AGE < ( ) *
) AGE > ( ) *
) THICK > ( )
) ulcer: 2, ( )
) sex: ( )
) THICK < ( ) *
) THICK > ( )
) THICK < ( )
) THICK < ( )
) AGE < ( ) *
) AGE > ( ) *
) THICK > ( ) *
) THICK > ( ) *
) sex: ( )
) THICK < ( )
) AGE < ( ) *
) AGE > ( )
) AGE < ( )
) AGE < ( )
) AGE < ( )
) AGE > ( ) *
) AGE > ( ) *
) AGE > ( )
) AGE < ( ) *
) AGE > ( ) *
) THICK > ( ) *
) ulcer: ( ) *
35297) AGE > ( ) *
17649) AGE > ( ) *
8825) AGE > ( ) *
4413) AGE > ( ) *
2207) AGE > ( )
4414) AGE < ( ) *
4415) AGE > ( ) *
69) prisite: 1,6,12,13, ( )
138) clark: 2,3, ( )
276) prisite: 1,13, ( )
552) AGE < ( ) *
553) AGE > ( )
1106) AGE < ( )
2212) prisite: ( )
4424) histgrp: ( )
8848) clark: ( ) *
8849) clark: ( ) *
4425) histgrp: ( ) *
2213) prisite: ( ) *
1107) AGE > ( )
2214) side: 0, ( )
4428) THICK < ( )
8856) AGE < ( ) *
8857) AGE > ( )
17714) AGE < ( )
35428) THICK < ( ) *
35429) THICK > ( )
70858) AGE < ( )
) AGE < ( )
) sex: ( ) *
) sex: ( ) *
) AGE > ( ) *
70859) AGE > ( ) *
17715) AGE > ( ) *
4429) THICK > ( )
8858) histgrp: 1,2,3, ( )
17716) AGE < ( )
35432) AGE < ( )
70864) sex: ( ) *
70865) sex: ( )
) AGE < ( ) *
) AGE > ( )
) histgrp: ( ) *
) histgrp: ( )
) AGE < ( ) *
) AGE > ( ) *
35433) AGE > ( )
70866) AGE < ( ) *
70867) AGE > ( )
) AGE < ( ) *
) AGE > ( ) *
17717) AGE > ( )
35434) AGE < ( ) *
35435) AGE > ( )
70870) AGE < ( ) *
70871) AGE > ( ) *
8859) histgrp: ( ) *
2215) side: ( )
4430) AGE < ( )
8860) AGE < ( )
17720) THICK < ( )
35440) sex: ( )
70880) AGE < ( ) *
70881) AGE > ( )
) THICK < ( ) *
) THICK > ( ) *
35441) sex: ( )
70882) AGE < ( ) *
70883) AGE > ( )
) AGE < ( )
) ulcer: ( ) *
) ulcer: 1, ( ) *
) AGE > ( ) *
17721) THICK > ( ) *
8861) AGE > ( )
17722) prisite: ( )
35444) satel: ( ) *
35445) satel: ( )
70890) AGE < ( ) *
70891) AGE > ( ) *
17723) prisite: ( ) *
4431) AGE > ( )
8862) sex: ( )
17724) AGE < ( )
35448) AGE < ( )
70896) AGE < ( )
) ulcer: 2, ( )
) prisite: 1, ( ) *
) prisite: ( )
) AGE < ( ) *
) AGE > ( ) *
) ulcer: ( ) *
70897) AGE > ( ) *
35449) AGE > ( ) *
17725) AGE > ( )
35450) AGE < ( )
70900) AGE < ( ) *
70901) AGE > ( )
) AGE < ( ) *
) AGE > ( ) *
35451) AGE > ( ) *
8863) sex: ( ) *
277) prisite: 6, ( )
554) AGE < ( ) *
555) AGE > ( )
1110) THICK < ( ) *
1111) THICK > ( ) *
139) clark: 1, ( )
278) AGE < ( ) *
279) AGE > ( )
558) clark: ( )
1116) ulcer: 1, ( ) *
1117) ulcer: ( ) *
559) clark: ( )
1118) ulcer: ( )
2236) THICK < ( )
4472) histgrp: 1,2, ( )
8944) prisite: 1,12, ( )
17888) AGE < ( )
35776) sex: ( )
71552) THICK < ( )
) THICK < ( ) *
) THICK > ( )
) AGE < ( ) *
) AGE > ( ) *
71553) THICK > ( ) *
35777) sex: ( ) *
17889) AGE > ( ) *
8945) prisite: ( ) *
4473) histgrp: ( ) *
2237) THICK > ( )
4474) AGE < ( ) *
4475) AGE > ( )
8950) prisite: 1, ( ) *
8951) prisite: ( ) *
1119) ulcer: 1, ( )
2238) sex: ( ) *
2239) sex: ( ) *
35) hist: 4,10, ( )
70) clark: 2,4, ( )
140) THICK < ( ) *
141) THICK > ( ) *
9) typcas: ( )
18) side: ( )
36) clark: 1, ( )
72) AGE < ( ) *
73) AGE > ( )
146) hist: 2, ( )
292) THICK < ( )
584) AGE < ( ) *
585) AGE > ( ) *
293) THICK > ( )
586) AGE < ( )
1172) sex: ( ) *
1173) sex: ( ) *
587) AGE > ( ) *
147) hist: 3, ( ) *
37) clark: 2, ( )
74) anyimm: ( ) *
75) anyimm: ( ) *
19) side: 0, ( )
38) AGE < ( )
76) clark: 2, ( )
152) AGE < ( ) *
153) AGE > ( ) *
77) clark: 1, ( ) *
39) AGE > ( )
78) AGE < ( )
156) AGE < ( ) *
157) AGE > ( )
314) prisite: 3, ( ) *
315) prisite: 1, ( )
630) AGE < ( ) *
631) AGE > ( ) *
79) AGE > ( ) *
5) stggrp: 4, ( )
10) prisite: 2,3,4, ( )
20) THICK < ( )
40) sex: ( )
80) clark: 2,3, ( ) *
81) clark: ( ) *
41) sex: ( ) *
21) THICK > ( ) *
11) prisite: 1,12,13, ( )
22) satel: ( )
44) prisite: 1, ( )
88) AGE < ( )
176) AGE < ( )
352) AGE < ( )
704) THICK < ( ) *
705) THICK > ( )
1410) AGE < ( ) *
1411) AGE > ( )
2822) THICK < ( )
5644) anyimm: ( ) *
5645) anyimm: ( )
11290) AGE < ( )
22580) AGE < ( ) *
22581) AGE > ( ) *
11291) AGE > ( ) *
2823) THICK > ( ) *
353) AGE > ( ) *
177) AGE > ( ) *
89) AGE > ( )
178) AGE < ( ) *
179) AGE > ( )
358) AGE < ( ) *
359) AGE > ( )
718) AGE < ( ) *
719) AGE > ( ) *
45) prisite: 12, ( ) *
23) satel: 1, ( ) *
3) dextent: 3,4,5,6, ( )
6) stggrp: 1, ( )
12) prisite: ( )
24) AGE < ( ) *
25) AGE > ( ) *
13) prisite: 2,3,13, ( ) *
7) stggrp: ( )
14) clark: ( )
28) prisite: ( ) *
29) prisite: 2, ( ) *
15) clark: 1,2,3,4, ( ) *
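The node labels in the Fig. 9 listing (1, 2, 4, 8, 16, ... with 3 as the sibling of 2) are consistent with the usual binary-tree numbering in which node n has children 2n and 2n+1. Assuming that convention holds throughout, a small Python sketch can recover a node's depth, parent, and path from the root. The function names below are illustrative and not part of the original analysis.

```python
def node_depth(node: int) -> int:
    """Depth of a node under heap-style numbering (root = 1 is depth 0)."""
    if node < 1:
        raise ValueError("node numbers start at 1")
    return node.bit_length() - 1


def parent(node: int) -> int:
    """Parent of a non-root node: integer division by two."""
    if node < 2:
        raise ValueError("the root has no parent")
    return node // 2


def children(node: int) -> tuple[int, int]:
    """Left and right child numbers of a node."""
    return 2 * node, 2 * node + 1


def path_from_root(node: int) -> list[int]:
    """Node numbers visited from the root down to the given node."""
    path = []
    while node >= 1:
        path.append(node)
        node //= 2
    return path[::-1]


if __name__ == "__main__":
    # Node 17 (anyimm) sits under 8 (typcas), 4 (stggrp), 2 (dextent), 1 (root).
    print(path_from_root(17))
    print(children(272), node_depth(70592))
```

Under the same assumption, the children of node 70592 would be numbered 141184 and 141185, which is one way to place entries whose labels are missing from the listing.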
Table. Consecutive Random Forests, 10 Trees. MTRY 4. Columns: SAMPLE #; VOTES (N, Y); VOTES (N, Y); TOTAL PERCENT ERROR.
Table. Consecutive Random Forests, 25 Trees. MTRY 4. Columns: SAMPLE #; VOTES (N, Y); VOTES (N, Y); TOTAL PERCENT ERROR.
Table. Consecutive Random Forests, 50 Trees. MTRY 4. Columns: SAMPLE #; VOTES (N, Y); VOTES (N, Y); TOTAL PERCENT ERROR.
Table. Consecutive Random Forests, 100 Trees. MTRY 4. Columns: SAMPLE #; VOTES (N, Y); VOTES (N, Y); TOTAL PERCENT ERROR.
Table. Consecutive Random Forests, 200 Trees. MTRY 4. Columns: SAMPLE #; VOTES (N, Y); VOTES (N, Y); TOTAL PERCENT ERROR.
Table. Consecutive Random Forests, 300 Trees. MTRY 4. Columns: SAMPLE #; VOTES (N, Y); VOTES (N, Y); TOTAL PERCENT ERROR.
Table. Consecutive Random Forests, 500 Trees. MTRY 4. Columns: SAMPLE #; VOTES (N, Y); VOTES (N, Y); TOTAL PERCENT ERROR.
Table. Consecutive Random Forests, 1000 Trees. MTRY 4. Columns: SAMPLE #; VOTES (N, Y); VOTES (N, Y); TOTAL PERCENT ERROR.
Table 9. Consecutive Larger Trees. MTRY 4. Row groups: 5 consecutive runs of 2000 trees; 3 consecutive runs of 3000 trees; 2 consecutive runs of 5000 trees. Columns: NUMBER TREES; VOTES (N, Y); VOTES (N, Y); TOTAL PERCENT ERROR.
Table 10. Single Tree Results, All Patients, All Variables, Any Recurrence
Columns: TOTAL SET; TOTAL PERCENT; NUMBER OF ACTUAL, AND PERCENT IN RANDOM SAMPLES; PERCENT PREDICTED WITH >=.50, >=.15, >=.10, >=.05
Row blocks: RESULTS WITH >=.50, >=.15, >=.10, >=.05; each block lists FULL TREE, k=5, k=30 and TOTAL

Table 11. Single Tree Results, All Patients, All Variables, Recurrence More Than Local
Columns: TOTAL SET; TOTAL PERCENT; NUMBER OF ACTUAL, AND PERCENT IN RANDOM SAMPLES; PERCENT PREDICTED WITH >=.50, >=.15, >=.10, >=.05
Row blocks: RESULTS WITH >=.50, >.15, >.1, >.05; each block lists FULL TREE, k=5, k=30 and TOTAL

Table 12. Single Tree Results, All Patients, All Variables, Recurrence More Than Local, 2nd Run
Columns: TOTAL SET; TOTAL PERCENT; NUMBER OF ACTUAL, AND PERCENT IN RANDOM SAMPLES; PERCENT PREDICTED WITH >=.50, >=.15, >=.10, >=.05
Row blocks: RESULTS WITH >=.50, >.15, >.10, >.05; each block lists FULL TREE, k=5, k=30, TOTAL and M.S.E

Table 13. Single Tree Results, All Patients, Leave Out Variables, Any Recurrence
Columns: TOTAL SET; PERCENT; NUMBER OF ACTUAL AND PERCENT IN RANDOM SAMPLES; PERCENT PREDICTED WITH >=.50, >=.15, >=.10, >=.05
Row blocks: RESULTS WITH >=.50, >=.15, >=.1, >=.05; each block lists FULL TREE, k=5, k=30

Table 14. Single Tree Results, All Patients, Leave Out Variables, Any Recurrence, 2nd Run
Columns: TOTAL SET; PERCENT; NUMBER OF ACTUAL AND PERCENT IN RANDOM SAMPLES; PERCENT PREDICTED WITH >=.50, >=.15, >=.10, >=.05
Row blocks: RESULTS WITH >=.5, >=.15, >=.10, >=.05; each block lists FULL TREE, k=5, k=30

Table 15. Single Tree Results, All Patients, Leave Out Variables, Recurrence More Than Local
Columns: TOTAL SET; PERCENT; NUMBER OF ACTUAL AND PERCENT IN RANDOM SAMPLES; PERCENT PREDICTED WITH >=.50, >=.15, >=.10, >=.05
Row blocks: RESULTS WITH >=.5, >=.15, >=.1, >=.05; each block lists FULL TREE, k=5, k=30

Table 16. Single Tree Results, All Patients, Leave Out Variables, Recurrence More Than Local, 2nd Run
Columns: TOTAL SET; PERCENT; NUMBER OF ACTUAL AND PERCENT IN RANDOM SAMPLE; PERCENT PREDICTED WITH >=.50, >=.15, >=.10, >=.05
Row blocks: RESULTS WITH >=.50, >=.15, >=.10, >=.05; each block lists FULL TREE, k=5, k=30

Table 17. Single Tree Results, Limited Patients, All Variables, Any Recurrence
Columns: TOTAL SET; TOTAL PERCENT; NUMBER OF ACTUAL, AND PERCENT IN RANDOM SAMPLES; PERCENT PREDICTED WITH >=.50, >=.15, >=.10, >=.05
Row blocks: RESULTS WITH >=.5, >=.15, >=.1, >=.05; each block lists FULL TREE, k=5, k=30, TOTAL and M.S.E

Table 18. Single Tree Results, Limited Patients, All Variables, Recurrence More Than Local
Columns: TOTAL SET; TOTAL PERCENT; NUMBER OF ACTUAL, AND PERCENT IN RANDOM SAMPLES; PERCENT PREDICTED WITH >=.50, >=.15, >=.10, >=.05
Row blocks: RESULTS WITH >=.5, >=.15, >=.1, >=.05; each block lists FULL TREE, k=5, k=30, TOTAL and M.S.E
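The single-tree tables report results at predicted-probability cutoffs of .50, .15, .10 and .05. As a rough illustration of what such a cutoff does, the Python sketch below labels a case "Y" (recurrence) when its predicted probability meets the cutoff and tallies the resulting errors. The probabilities and outcomes are made-up toy values, and the function names are hypothetical; none of this reproduces the study's data or its exact tabulation.

```python
def classify(probs, cutoff):
    """Label each case 'Y' if its predicted probability is >= cutoff, else 'N'."""
    return ["Y" if p >= cutoff else "N" for p in probs]


def error_summary(actual, predicted):
    """Count false negatives, false positives, and the overall percent error."""
    false_neg = sum(a == "Y" and p == "N" for a, p in zip(actual, predicted))
    false_pos = sum(a == "N" and p == "Y" for a, p in zip(actual, predicted))
    percent_error = 100 * (false_neg + false_pos) / len(actual)
    return {
        "false_neg": false_neg,
        "false_pos": false_pos,
        "percent_error": percent_error,
    }


# Toy data (hypothetical): predicted recurrence probabilities and true outcomes.
probs = [0.02, 0.07, 0.12, 0.20, 0.60]
actual = ["N", "N", "Y", "N", "Y"]

if __name__ == "__main__":
    for cutoff in (0.50, 0.15, 0.10, 0.05):
        print(cutoff, error_summary(actual, classify(probs, cutoff)))
```

Lowering the cutoff trades missed recurrences (false negatives) for more false alarms (false positives), which is presumably why several cutoffs are tabulated side by side.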
Table 19. RF, All Patients, All Variables, Any Recurrence. TREES 300.
Panels: MTRY = 15, 13, 10, 7, 4, 2, 1. Columns: T.N.; CUTOFF; N; Y; TOTAL ERROR.

Table 20. RF, All Patients, All Variables, Recurrence More Than Local. TREES 300.
Panels: MTRY = 15, 13, 10, 7, 4, 2, 1. Columns: T.N.; CUTOFF; N; Y; M.S.E; TOTAL ERROR.

Table 21. RF, All Patients, Leave Out STAGE GROUP, DEXTENT and TYPCAS, Any Recurrence. TREES 300.
Panels: MTRY = 12, 10, 4, 2. Columns: T.N.; CUTOFF; N; Y; TOTAL ERROR.

Table 22. RF, All Patients, Leave Out STAGE GROUP, DEXTENT and TYPCAS, Recurrence More Than Local. TREES 300.
Panels: MTRY = 12, 10. Columns: T.N.; CUTOFF; N; Y; TOTAL ERROR.
Statistical Consulting Topics Classification and Regression Trees (CART)
Statistical Consulting Topics Classification and Regression Trees (CART) Suppose the main goal in a data analysis is the prediction of a categorical variable outcome. Such as in the examples below. Given
More informationClassification using stochastic ensembles
July 31, 2014 Topics Introduction Topics Classification Application and classfication Classification and Regression Trees Stochastic ensemble methods Our application: USAID Poverty Assessment Tools Topics
More informationCOMPSTAT2010 in Paris. Hiroki Motogaito. Masashi Goto
COMPSTAT2010 in Paris Ensembled Multivariate Adaptive Regression Splines with Nonnegative Garrote Estimator t Hiroki Motogaito Osaka University Masashi Goto Biostatistical Research Association, NPO. JAPAN
More informationUnsupervised Learning with Random Forest Predictors
Unsupervised Learning with Random Forest Predictors Tao Shi, and Steve Horvath,, Department of Human Genetics, David Geffen School of Medicine, UCLA Department of Biostatistics, School of Public Health,
More informationSupplementary Table 1. The relationship between LncHIFCAR expression and clinicopathologic parameters in OSCC Age (years) Clinicopathologic parameters LncHIFCAR expression Number High Low of cases P value
More informationWALD LECTURE II LOOKING INSIDE THE BLACK BOX. Leo Breiman UCB Statistics
1 WALD LECTURE II LOOKING INSIDE THE BLACK BOX Leo Breiman UCB Statistics leo@stat.berkeley.edu ORIGIN OF BLACK BOXES 2 Statistics uses data to explore problems. Think of the data as being generated by
More informationGrowing a Large Tree
STAT 5703 Fall, 2004 Data Mining Methodology I Decision Tree I Growing a Large Tree Contents 1 A Single Split 2 1.1 Node Impurity.................................. 2 1.2 Computation of i(t)................................
More informationMethods for generating vegetation maps from remotely
Mapping Ecological Systems with a Random Forest Model: Tradeoffs between Errors and Bias Emilie Grossmann 1, Janet Ohmann 2, James Kagan 3, Heather May 1 and Matthew Gregory 1 1 Forest Ecosystems and Society,
More informationday month year documentname/initials 1
ECE471-571 Pattern Recognition Lecture 13 Decision Tree Hairong Qi, Gonzalez Family Professor Electrical Engineering and Computer Science University of Tennessee, Knoxville http://www.eecs.utk.edu/faculty/qi
More informationConditional variable importance in R package extendedforest
Conditional variable importance in R package extendedforest Stephen J. Smith, Nick Ellis, C. Roland Pitcher February 10, 2011 Contents 1 Introduction 1 2 Methods 2 2.1 Conditional permutation................................
More informationProbabilistic Random Forests: Predicting Data Point Specific Misclassification Probabilities ; CU- CS
University of Colorado, Boulder CU Scholar Computer Science Technical Reports Computer Science Spring 5-1-23 Probabilistic Random Forests: Predicting Data Point Specific Misclassification Probabilities
More informationA new strategy for meta-analysis of continuous covariates in observational studies with IPD. Willi Sauerbrei & Patrick Royston
A new strategy for meta-analysis of continuous covariates in observational studies with IPD Willi Sauerbrei & Patrick Royston Overview Motivation Continuous variables functional form Fractional polynomials
More informationSF2930 Regression Analysis
SF2930 Regression Analysis Alexandre Chotard Tree-based regression and classication 20 February 2017 1 / 30 Idag Overview Regression trees Pruning Bagging, random forests 2 / 30 Today Overview Regression
More informationAnalysis and correction of bias in Total Decrease in Node Impurity measures for tree-based algorithms
Analysis and correction of bias in Total Decrease in Node Impurity measures for tree-based algorithms Marco Sandri and Paola Zuccolotto University of Brescia - Department of Quantitative Methods C.da Santa
More informationWALD III SOFTWARE FOR THE MASSES (AND AN EXAMPLE) Leo Breiman UCB Statistics
1 WALD III SOFTWARE FOR THE MASSES (AND AN EXAMPLE) Leo Breiman UCB Statistics leo@stat.berkeley.edu 2 IS THERE AN OBLIGATION? Tens of thousands of statisticians around the world are using statistical
More informationVariable importance in RF. 1 Start. p < Conditional variable importance in RF. 2 n = 15 y = (0.4, 0.6) Other variable importance measures
n = y = (.,.) n = 8 y = (.,.89) n = 8 > 8 n = y = (.88,.8) > > n = 9 y = (.8,.) n = > > > > n = n = 9 y = (.,.) y = (.,.889) > 8 > 8 n = y = (.,.8) n = n = 8 y = (.889,.) > 8 n = y = (.88,.8) n = y = (.8,.)
More informationStatistical aspects of prediction models with high-dimensional data
Statistical aspects of prediction models with high-dimensional data Anne Laure Boulesteix Institut für Medizinische Informationsverarbeitung, Biometrie und Epidemiologie February 15th, 2017 Typeset by
More informationRandom Forests for Ordinal Response Data: Prediction and Variable Selection
Silke Janitza, Gerhard Tutz, Anne-Laure Boulesteix Random Forests for Ordinal Response Data: Prediction and Variable Selection Technical Report Number 174, 2014 Department of Statistics University of Munich
More informationA note on R 2 measures for Poisson and logistic regression models when both models are applicable
Journal of Clinical Epidemiology 54 (001) 99 103 A note on R measures for oisson and logistic regression models when both models are applicable Martina Mittlböck, Harald Heinzl* Department of Medical Computer
More informationIdentifying representative trees from ensembles
Research Article Received 26 February 2009, Accepted 7 November 20 Published online 3 February 202 in Wiley Online Library (wileyonlinelibrary.com) DOI: 0.002/sim.4492 Identifying representative trees
More informationThe influence of categorising survival time on parameter estimates in a Cox model
The influence of categorising survival time on parameter estimates in a Cox model Anika Buchholz 1,2, Willi Sauerbrei 2, Patrick Royston 3 1 Freiburger Zentrum für Datenanalyse und Modellbildung, Albert-Ludwigs-Universität
More informationRelative-risk regression and model diagnostics. 16 November, 2015
Relative-risk regression and model diagnostics 16 November, 2015 Relative risk regression More general multiplicative intensity model: Intensity for individual i at time t is i(t) =Y i (t)r(x i, ; t) 0
More informationSupplementary material for Intervention in prediction measure: a new approach to assessing variable importance for random forests
Supplementary material for Intervention in prediction measure: a new approach to assessing variable importance for random forests Irene Epifanio Dept. Matemàtiques and IMAC Universitat Jaume I Castelló,
More information( t) Cox regression part 2. Outline: Recapitulation. Estimation of cumulative hazards and survival probabilites. Ørnulf Borgan
Outline: Cox regression part 2 Ørnulf Borgan Department of Mathematics University of Oslo Recapitulation Estimation of cumulative hazards and survival probabilites Assumptions for Cox regression and check
More informationFormula for the t-test
Formula for the t-test: How the t-test Relates to the Distribution of the Data for the Groups Formula for the t-test: Formula for the Standard Error of the Difference Between the Means Formula for the
More informationVapnik-Chervonenkis Dimension of Axis-Parallel Cuts arxiv: v2 [math.st] 23 Jul 2012
Vapnik-Chervonenkis Dimension of Axis-Parallel Cuts arxiv:203.093v2 [math.st] 23 Jul 202 Servane Gey July 24, 202 Abstract The Vapnik-Chervonenkis (VC) dimension of the set of half-spaces of R d with frontiers
More informationBuilding a Prognostic Biomarker
Building a Prognostic Biomarker Noah Simon and Richard Simon July 2016 1 / 44 Prognostic Biomarker for a Continuous Measure On each of n patients measure y i - single continuous outcome (eg. blood pressure,
More informationFaculty of Health Sciences. Regression models. Counts, Poisson regression, Lene Theil Skovgaard. Dept. of Biostatistics
Faculty of Health Sciences Regression models Counts, Poisson regression, 27-5-2013 Lene Theil Skovgaard Dept. of Biostatistics 1 / 36 Count outcome PKA & LTS, Sect. 7.2 Poisson regression The Binomial
More informationSTATISTICAL COMPUTING USING R/S. John Fox McMaster University
STATISTICAL COMPUTING USING R/S John Fox McMaster University The S statistical programming language and computing environment has become the defacto standard among statisticians and has made substantial
More informationDyadic Classification Trees via Structural Risk Minimization
Dyadic Classification Trees via Structural Risk Minimization Clayton Scott and Robert Nowak Department of Electrical and Computer Engineering Rice University Houston, TX 77005 cscott,nowak @rice.edu Abstract
More informationMulti-state models: prediction
Department of Medical Statistics and Bioinformatics Leiden University Medical Center Course on advanced survival analysis, Copenhagen Outline Prediction Theory Aalen-Johansen Computational aspects Applications
More informationChapter 6. Ensemble Methods
Chapter 6. Ensemble Methods Wei Pan Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 55455 Email: weip@biostat.umn.edu PubH 7475/8475 c Wei Pan Introduction
More informationBoulesteix: Maximally selected chi-square statistics and binary splits of nominal variables
Boulesteix: Maximally selected chi-square statistics and binary splits of nominal variables Sonderforschungsbereich 386, Paper 449 (2005) Online unter: http://epub.ub.uni-muenchen.de/ Projektpartner Maximally
More informationRandom projection ensemble classification
Random projection ensemble classification Timothy I. Cannings Statistics for Big Data Workshop, Brunel Joint work with Richard Samworth Introduction to classification Observe data from two classes, pairs
More informationMultivariable Fractional Polynomials
Multivariable Fractional Polynomials Axel Benner May 17, 2007 Contents 1 Introduction 1 2 Inventory of functions 1 3 Usage in R 2 3.1 Model selection........................................ 3 4 Example
More informationRegression tree methods for subgroup identification I
Regression tree methods for subgroup identification I Xu He Academy of Mathematics and Systems Science, Chinese Academy of Sciences March 25, 2014 Xu He (AMSS, CAS) March 25, 2014 1 / 34 Outline The problem
More informationPubH 7405: REGRESSION ANALYSIS INTRODUCTION TO LOGISTIC REGRESSION
PubH 745: REGRESSION ANALYSIS INTRODUCTION TO LOGISTIC REGRESSION Let Y be the Dependent Variable Y taking on values and, and: π Pr(Y) Y is said to have the Bernouilli distribution (Binomial with n ).
More informationRegression and Classification Trees
Regression and Classification Trees 1 Regression Trees The basic idea behind regression trees is the following: Group the n subjects into a bunch of groups based solely on the explanatory variables. Prediction
More informationβ j = coefficient of x j in the model; β = ( β1, β2,
Regression Modeling of Survival Time Data Why regression models? Groups similar except for the treatment under study use the nonparametric methods discussed earlier. Groups differ in variables (covariates)
More informationEvaluation of the predictive capacity of a biomarker
Evaluation of the predictive capacity of a biomarker Bassirou Mboup (ISUP Université Paris VI) Paul Blanche (Université Bretagne Sud) Aurélien Latouche (Institut Curie & Cnam) GDR STATISTIQUE ET SANTE,
More informationLecture 11. Interval Censored and. Discrete-Time Data. Statistics Survival Analysis. Presented March 3, 2016
Statistics 255 - Survival Analysis Presented March 3, 2016 Motivating Dan Gillen Department of Statistics University of California, Irvine 11.1 First question: Are the data truly discrete? : Number of
More informationREGRESSION TREE CREDIBILITY MODEL
LIQUN DIAO AND CHENGGUO WENG Department of Statistics and Actuarial Science, University of Waterloo Advances in Predictive Analytics Conference, Waterloo, Ontario Dec 1, 2017 Overview Statistical }{{ Method
More informationApplied Survival Analysis Lab 10: Analysis of multiple failures
Applied Survival Analysis Lab 10: Analysis of multiple failures We will analyze the bladder data set (Wei et al., 1989). A listing of the dataset is given below: list if id in 1/9 +---------------------------------------------------------+
More informationMODELING MISSING COVARIATE DATA AND TEMPORAL FEATURES OF TIME-DEPENDENT COVARIATES IN TREE-STRUCTURED SURVIVAL ANALYSIS
MODELING MISSING COVARIATE DATA AND TEMPORAL FEATURES OF TIME-DEPENDENT COVARIATES IN TREE-STRUCTURED SURVIVAL ANALYSIS by Meredith JoAnne Lotz B.A., St. Olaf College, 2004 Submitted to the Graduate Faculty
More informationJunction-Explorer Help File
Junction-Explorer Help File Dongrong Wen, Christian Laing, Jason T. L. Wang and Tamar Schlick Overview RNA junctions are important structural elements of three or more helices in the organization of the
More informationMachine Learning - TP
Machine Learning - TP Nathalie Villa-Vialaneix - nathalie.villa@univ-paris1.fr http://www.nathalievilla.org IUT STID (Carcassonne) & SAMM (Université Paris 1) Formation INRA, Niveau 3 Formation INRA (Niveau
More informationADVANCED STATISTICAL ANALYSIS OF EPIDEMIOLOGICAL STUDIES. Cox s regression analysis Time dependent explanatory variables
ADVANCED STATISTICAL ANALYSIS OF EPIDEMIOLOGICAL STUDIES Cox s regression analysis Time dependent explanatory variables Henrik Ravn Bandim Health Project, Statens Serum Institut 4 November 2011 1 / 53
More informationAn asymmetric entropy measure for decision trees
An asymmetric entropy measure for decision trees Simon Marcellin Laboratoire ERIC Université Lumière Lyon 2 5 av. Pierre Mendès-France 69676 BRON Cedex France simon.marcellin@univ-lyon2.fr Djamel A. Zighed
More informationThe Design and Analysis of Benchmark Experiments Part II: Analysis
The Design and Analysis of Benchmark Experiments Part II: Analysis Torsten Hothorn Achim Zeileis Friedrich Leisch Kurt Hornik Friedrich Alexander Universität Erlangen Nürnberg http://www.imbe.med.uni-erlangen.de/~hothorn/
Variable importance measures in regression and classification methods
MASTER THESIS Variable importance measures in regression and classification methods Institute for Statistics and Mathematics Vienna University of Economics and Business under the supervision of Univ.Prof.
Technical Report - 7/87 AN APPLICATION OF COX REGRESSION MODEL TO THE ANALYSIS OF GROUPED PULMONARY TUBERCULOSIS SURVIVAL DATA
Technical Report - 7/87 AN APPLICATION OF COX REGRESSION MODEL TO THE ANALYSIS OF GROUPED PULMONARY TUBERCULOSIS SURVIVAL DATA P. VENKATESAN* K. VISWANATHAN + R. PRABHAKAR* * Tuberculosis Research Centre,
Model Testing for Future Reintroductions of Desert Bighorn Sheep at Capitol Reef National Park
University of Wyoming National Park Service Research Center Annual Report Volume 13 13th Annual Report, 1989 Article 7 1-1-1989 Model Testing for Future Reintroductions of Desert Bighorn Sheep at Capitol
BAGGING PREDICTORS AND RANDOM FOREST
BAGGING PREDICTORS AND RANDOM FOREST DANA KANER M.SC. SEMINAR IN STATISTICS, MAY 2017 BAGGING PREDICTORS / LEO BREIMAN, 1996 RANDOM FORESTS / LEO BREIMAN, 2001 THE ELEMENTS OF STATISTICAL LEARNING (CHAPTERS
Machine Learning. Nathalie Villa-Vialaneix - Formation INRA, Niveau 3
Machine Learning Nathalie Villa-Vialaneix - nathalie.villa@univ-paris1.fr http://www.nathalievilla.org IUT STID (Carcassonne) & SAMM (Université Paris 1) Formation INRA, Niveau 3 Formation INRA (Niveau
RANDOM FORESTS FOR CLASSIFICATION IN ECOLOGY
Ecology, 88(11), 2007, pp. 2783–2792 © 2007 by the Ecological Society of America RANDOM FORESTS FOR CLASSIFICATION IN ECOLOGY D. RICHARD CUTLER, 1,7 THOMAS C. EDWARDS, JR., 2 KAREN H. BEARD, 3 ADELE CUTLER,
Classification of Longitudinal Data Using Tree-Based Ensemble Methods
Classification of Longitudinal Data Using Tree-Based Ensemble Methods W. Adler, and B. Lausen 29.06.2009 Overview 1 Ensemble classification of dependent observations 2 3 4 Classification of dependent observations
PubH 7470: STATISTICS FOR TRANSLATIONAL & CLINICAL RESEARCH
PubH 7470: STATISTICS FOR TRANSLATIONAL & CLINICAL RESEARCH The First Step: SAMPLE SIZE DETERMINATION THE ULTIMATE GOAL The most important, ultimate step of any clinical research is to draw inferences;
Holdout and Cross-Validation Methods Overfitting Avoidance
Holdout and Cross-Validation Methods Overfitting Avoidance Decision Trees Reduce error pruning Cost-complexity pruning Neural Networks Early stopping Adjusting Regularizers via Cross-Validation Nearest
Generalized Additive Models
By Trevor Hastie and R. Tibshirani Regression models play an important role in many applied settings, by enabling predictive analysis, revealing classification rules, and providing data-analytic tools
RECSM Working Paper Number 54 January 2018
Machine learning for propensity score matching and weighting: comparing different estimation techniques and assessing different balance diagnostics Massimo Cannas Department of Economic and Business Sciences,
Personalities. Charles Darwin John Maynard Smith Alan Turing John von Neumann Sewall Wright
Personalities Charles Darwin John Maynard Smith Alan Turing John von Neumann Sewall Wright Good questions Major transitions [JMSmith & ESzathmary95] Replicating molecules Chromosomes linkage among self-replicating
TREE-BASED METHODS FOR SURVIVAL ANALYSIS AND HIGH-DIMENSIONAL DATA. Ruoqing Zhu
TREE-BASED METHODS FOR SURVIVAL ANALYSIS AND HIGH-DIMENSIONAL DATA Ruoqing Zhu A dissertation submitted to the faculty of the University of North Carolina at Chapel Hill in partial fulfillment of the requirements
BINARY TREE-STRUCTURED PARTITION AND CLASSIFICATION SCHEMES
BINARY TREE-STRUCTURED PARTITION AND CLASSIFICATION SCHEMES DAVID MCDIARMID Abstract Binary tree-structured partition and classification schemes are a class of nonparametric tree-based approaches to classification
Regression techniques provide statistical analysis of relationships. Research designs may be classified as experimental or observational; regression
LOGISTIC REGRESSION Regression techniques provide statistical analysis of relationships. Research designs may be classified as experimental or observational; regression analyses are applicable to both types.
Linear Recurrent Subsequences of Meta-Fibonacci Sequences
Linear Recurrent Subsequences of Meta-Fibonacci Sequences Nathan Fox arXiv:1508.01840v1 [math.NT] 7 Aug 2015 Abstract In a recent paper, Frank Ruskey asked whether every linear recurrent sequence can occur
Multimodal Deep Learning for Predicting Survival from Breast Cancer
Multimodal Deep Learning for Predicting Survival from Breast Cancer Heather Couture Deep Learning Journal Club Nov. 16, 2016 Outline Background on tumor histology & genetic data Background on survival
Logic Regression. Ingo Ruczinski. Department of Biostatistics Johns Hopkins University.
Logic Regression Ingo Ruczinski Department of Biostatistics Johns Hopkins University Email: ingo@jhu.edu http://biosun.biostat.jhsph.edu/~iruczins With Charles Kooperberg, Michael LeBlanc, FHCRC Introduction Motivation
Variable Selection in Random Forest with Application to Quantitative Structure-Activity Relationship
Variable Selection in Random Forest with Application to Quantitative Structure-Activity Relationship Vladimir Svetnik, Andy Liaw, and Christopher Tong Biometrics Research, Merck & Co., Inc. P.O. Box 2000
The coxvc_1-1-1 package
Appendix A The coxvc_1-1-1 package A.1 Introduction The coxvc_1-1-1 package is a set of functions for survival analysis that run under R2.1.1 [81]. This package contains a set of routines to fit Cox models
Multivariable Fractional Polynomials
Multivariable Fractional Polynomials Axel Benner September 7, 2015 Contents 1 Introduction 2 Inventory of functions 3 Usage in R 3.1 Model selection 4 Example
FACTORIZATION MACHINES AS A TOOL FOR HEALTHCARE CASE STUDY ON TYPE 2 DIABETES DETECTION
SunLab Enlighten the World FACTORIZATION MACHINES AS A TOOL FOR HEALTHCARE CASE STUDY ON TYPE 2 DIABETES DETECTION Ioakeim (Kimis) Perros and Jimeng Sun perros@gatech.edu, jsun@cc.gatech.edu COMPUTATIONAL
Multi-state Models: An Overview
Multi-state Models: An Overview Andrew Titman Lancaster University 14 April 2016 Overview Introduction to multi-state modelling Examples of applications Continuously observed processes Intermittently observed
Censoring Unbiased Regression Trees and Ensembles
Johns Hopkins University, Dept. of Biostatistics Working Papers 1-31-216 Censoring Unbiased Regression Trees and Ensembles Jon Arni Steingrimsson Department of Biostatistics, Johns Hopkins Bloomberg School
Nonlinear Knowledge-Based Classification
Nonlinear Knowledge-Based Classification Olvi L. Mangasarian Edward W. Wild Abstract Prior knowledge over general nonlinear sets is incorporated into nonlinear kernel classification problems as linear
Statistics and learning: Big Data
Statistics and learning: Big Data Learning Decision Trees and an Introduction to Boosting Sébastien Gadat Toulouse School of Economics February 2017 S. Gadat (TSE) SAD 2013 1 / 30 Keywords Decision trees
Supervised Learning: Algorithm Implementations. Inferring Rudimentary Rules and Decision Trees
Supervised Learning: Algorithm Implementations. Inferring Rudimentary Rules and Decision Trees. Summary. Input Knowledge representation. Preparing data for learning. Input: Concept, Instances, Attributes
Data Mining and Knowledge Discovery: Practice Notes
Data Mining and Knowledge Discovery: Practice Notes dr. Petra Kralj Novak Petra.Kralj.Novak@ijs.si 7.11.2017 1 Course Prof. Bojan Cestnik Data preparation Prof. Nada Lavrač: Data mining overview Advanced
Advanced Statistical Methods: Beyond Linear Regression
Advanced Statistical Methods: Beyond Linear Regression John R. Stevens Utah State University Notes 3. Statistical Methods II Mathematics Educators Workshop 28 March 2009 1 http://www.stat.usu.edu/~jrstevens/pcmi
Anatrytone logan. Species Distribution Model (SDM) assessment metrics and metadata Common name: Delaware Skipper Date: 17 Nov 2017 Code: anatloga
Anatrytone logan Species Distribution Model (SDM) assessment metrics and metadata Common name: Delaware Skipper Date: 17 Nov 2017 Code: anatloga fair TSS=0.74 ability to find new sites This SDM incorporates
Does Modeling Lead to More Accurate Classification?
Does Modeling Lead to More Accurate Classification? A Comparison of the Efficiency of Classification Methods Yoonkyung Lee* Department of Statistics The Ohio State University *joint work with Rui Wang
arXiv: v1 [stat.ME] 5 Dec 2018
Joint latent class trees: A Tree-Based Approach to Joint Modeling of Time-to-event and Longitudinal Data Ningshan Zhang and Jeffrey S. Simonoff IOMS Department, Leonard N. Stern School of Business, New
A Study of Relative Efficiency and Robustness of Classification Methods
A Study of Relative Efficiency and Robustness of Classification Methods Yoonkyung Lee* Department of Statistics The Ohio State University *joint work with Rui Wang April 28, 2011 Department of Statistics
Defining Statistically Significant Spatial Clusters of a Target Population using a Patient-Centered Approach within a GIS
Defining Statistically Significant Spatial Clusters of a Target Population using a Patient-Centered Approach within a GIS Efforts to Improve Quality of Care Stephen Jones, PhD Bio-statistical Research
Methods for Predicting an Ordinal Response with High-Throughput Genomic Data
Virginia Commonwealth University VCU Scholars Compass Theses and Dissertations Graduate School 2016 Methods for Predicting an Ordinal Response with High-Throughput Genomic Data Kyle L. Ferber Virginia
Influence measures for CART
Jean-Michel Poggi Orsay, Paris Sud & Paris Descartes Joint work with Avner Bar-Hen Servane Gey (MAP5, Paris Descartes ) CART CART Classification And Regression Trees, Breiman et al. (1984) Learning set
Ensemble Methods. Charles Sutton Data Mining and Exploration Spring Friday, 27 January 12
Ensemble Methods Charles Sutton Data Mining and Exploration Spring 2012 Bias and Variance Consider a regression problem $Y = f(x) + N(0, \sigma^2)$ With an estimated regression function $\hat{f}$, e.g., $\hat{f}(x) = w^\top x$ Suppose
Variable importance in binary regression trees and forests
Electronic Journal of Statistics Vol. 1 (2007) 519–537 ISSN: 1935-7524 DOI: 10.1214/07-EJS039 Variable importance in binary regression trees and forests Hemant Ishwaran Department of Quantitative Health
BET: Bayesian Ensemble Trees for Clustering and Prediction in Heterogeneous Data
BET: Bayesian Ensemble Trees for Clustering and Prediction in Heterogeneous Data arXiv:1408.4140v1 [stat.ML] 18 Aug 2014 Leo L. Duan 1, John P. Clancy 2 and Rhonda D. Szczesniak 3 4 Summary We propose
Definitions and examples Simple estimation and testing Regression models Goodness of fit for the Cox model. Recap of Part 1. Per Kragh Andersen
Recap of Part 1 Per Kragh Andersen Section of Biostatistics, University of Copenhagen DSBS Course Survival Analysis in Clinical Trials January 2018 1 / 65 Overview Definitions and examples Simple estimation
Title. Citation Remote Sensing Letters, 5(2): Issue Date Doc URL http://hdl.handle.net/2115/ Type.
Title Random Forest classification of crop type usin Author(s) Sonobe, Rei; Tani, Hiroshi; Wang, Xiufeng; Kob Citation Remote Sensing Letters, 5(2): 157-164 Issue Date 2014-02 Doc URL http://hdl.handle.net/2115/57984
Package penalized. February 21, 2018
Version 0.9-50 Date 2017-02-01 Package penalized February 21, 2018 Title L1 (Lasso and Fused Lasso) and L2 (Ridge) Penalized Estimation in GLMs and in the Cox Model Author Jelle Goeman, Rosa Meijer, Nimisha
Checking model assumptions with regression diagnostics
@graemeleehickey www.glhickey.com graeme.hickey@liverpool.ac.uk Checking model assumptions with regression diagnostics Graeme L. Hickey University of Liverpool Conflicts of interest None Assistant Editor
FULL LIKELIHOOD INFERENCES IN THE COX MODEL
October 20, 2007 FULL LIKELIHOOD INFERENCES IN THE COX MODEL BY JIAN-JIAN REN 1 AND MAI ZHOU 2 University of Central Florida and University of Kentucky Abstract We use the empirical likelihood approach
Forecasting Casino Gaming Traffic with a Data Mining Alternative to Croston's Method
Forecasting Casino Gaming Traffic with a Data Mining Alternative to Croston's Method Barry King Abstract Other researchers have used Croston's method to forecast traffic at casino game tables. Our data
First Aid Kit for Survival. Hypoxia cohort. Goal. DFS=Clinical + Marker 1/21/2015. Two analyses to exemplify some concepts of survival techniques
First Aid Kit for Survival Melania Pintilie pintilie@uhnres.utoronto.ca Two analyses to exemplify some concepts of survival techniques Checking linearity Checking proportionality of hazards Predicted curves:
Maximally selected chi-square statistics for at least ordinal scaled variables
Maximally selected chi-square statistics for at least ordinal scaled variables Anne-Laure Boulesteix anne-laure.boulesteix@stat.uni-muenchen.de Department of Statistics, University of Munich, Akademiestrasse
Generalization to Multi-Class and Continuous Responses. STA Data Mining I
Generalization to Multi-Class and Continuous Responses STA 5703 - Data Mining I 1. Categorical Responses (a) Splitting Criterion Outline Goodness-of-split Criterion Chi-square Tests and Twoing Rule (b)
Log-linearity for Cox's regression model. Thesis for the Degree Master of Science
Log-linearity for Cox's regression model Thesis for the Degree Master of Science Zaki Amini Master's Thesis, Spring 2015 i Abstract Cox's regression model is one of the most applied methods in medical
Regression tree-based diagnostics for linear multilevel models
Regression tree-based diagnostics for linear multilevel models Jeffrey S. Simonoff New York University May 11, 2011 Longitudinal and clustered data Panel or longitudinal data, in which we observe many
Probabilistic Graphical Models
School of Computer Science Probabilistic Graphical Models Gaussian graphical models and Ising models: modeling networks Eric Xing Lecture 0, February 7, 04 Reading: See class website Eric Xing @ CMU, 005-04