Conditional variable importance in random forests

Outline:
- Variable importance in RF
- Conditional variable importance in RF
- Other variable importance measures
[Figure residue: example classification trees, each node annotated with its size n and class proportions y = (·, ·); the numeric values were lost in transcription.]

Measuring variable importance in random forests
A Comparison of Different Importance Measures

Carolin Strobl (LMU München)
Wien, Jänner 2009

Measuring variable importance in random forests:

- Gini importance: mean Gini gain produced by X_j over all trees (can be severely biased due to estimation bias and multiple testing; Strobl et al., 2007)
- permutation importance: mean decrease in classification accuracy after permuting X_j over all trees (unbiased when subsampling is used; Strobl et al., 2007)

The permutation importance, within each tree t:

VI^{(t)}(x_j) = \frac{\sum_{i \in \bar{B}^{(t)}} I\left(y_i = \hat{y}_i^{(t)}\right)}{|\bar{B}^{(t)}|} - \frac{\sum_{i \in \bar{B}^{(t)}} I\left(y_i = \hat{y}_{i,\pi_j}^{(t)}\right)}{|\bar{B}^{(t)}|}

where \bar{B}^{(t)} is the out-of-bag sample of tree t,
\hat{y}_i^{(t)} = f^{(t)}(x_i) = predicted class before permuting X_j,
\hat{y}_{i,\pi_j}^{(t)} = f^{(t)}(x_{i,\pi_j}) = predicted class after permuting X_j, with
x_{i,\pi_j} = (x_{i,1}, \ldots, x_{i,j-1}, x_{\pi_j(i),j}, x_{i,j+1}, \ldots, x_{i,p}).

Note: VI^{(t)}(x_j) = 0 by definition if X_j is not in tree t.

Over all trees: VI(x_j) = \frac{1}{ntree} \sum_{t=1}^{ntree} VI^{(t)}(x_j)

What kind of independence corresponds to this kind of permutation?

obs   Y     X_j               Z
1     y_1   x_{\pi_j(1),j}    z_1
...   ...   ...               ...
i     y_i   x_{\pi_j(i),j}    z_i
...   ...   ...               ...
n     y_n   x_{\pi_j(n),j}    z_n

H_0: X_j \perp Y, Z, or equivalently X_j \perp Y and X_j \perp Z:

P(Y, X_j, Z) \stackrel{H_0}{=} P(Y, Z) \cdot P(X_j)
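The within-tree computation can be sketched in a few lines: compare out-of-bag accuracy before and after shuffling column j. This is a minimal language-neutral illustration (the talk itself uses R's party package); the `stump` predictor and the data here are hypothetical stand-ins, not the package's implementation:

```python
import random

def permutation_importance(predict, X_oob, y_oob, j, rng):
    """VI^(t)(x_j): OOB accuracy before minus after permuting column j."""
    n = len(y_oob)
    acc_before = sum(predict(x) == y for x, y in zip(X_oob, y_oob)) / n
    # permute column j among the OOB observations
    perm = list(range(n))
    rng.shuffle(perm)
    X_perm = [row[:j] + [X_oob[perm[i]][j]] + row[j + 1:]
              for i, row in enumerate(X_oob)]
    acc_after = sum(predict(x) == y for x, y in zip(X_perm, y_oob)) / n
    return acc_before - acc_after

# toy "tree": a single split on x_0 that ignores x_1 entirely
stump = lambda x: 1 if x[0] > 0.5 else 0

rng = random.Random(1)
X = [[rng.random(), rng.random()] for _ in range(200)]
y = [stump(x) for x in X]  # x_0 fully determines y here

vi0 = permutation_importance(stump, X, y, 0, rng)
vi1 = permutation_importance(stump, X, y, 1, rng)
# vi1 is exactly 0: the stump never uses x_1, matching VI^(t)(x_j) = 0 for unused X_j
```

Permuting the unused column leaves every prediction unchanged, so its importance is exactly zero, while permuting the split variable degrades OOB accuracy.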
The original permutation scheme reflects independence of X_j from both Y and the remaining predictor variables Z. A high variable importance can therefore result from a violation of either one!

Suggestion: conditional permutation scheme — permute X_j within groups defined by Z.

obs   Y     X_j                      Z
1     y_1   x_{\pi_{j|Z=a}(1),j}     z_1 = a
2     y_2   x_{\pi_{j|Z=a}(2),j}     z_2 = a
3     y_3   x_{\pi_{j|Z=a}(3),j}     z_3 = a
4     y_4   x_{\pi_{j|Z=b}(4),j}     z_4 = b
5     y_5   x_{\pi_{j|Z=b}(5),j}     z_5 = b
6     y_6   x_{\pi_{j|Z=b}(6),j}     z_6 = b
...   ...   ...                      ...

H_0: X_j \perp Y \mid Z:

P(Y, X_j \mid Z) \stackrel{H_0}{=} P(Y \mid Z) \cdot P(X_j \mid Z), or equivalently P(Y \mid X_j, Z) \stackrel{H_0}{=} P(Y \mid Z)

Technically: use any partition of the feature space for conditioning.
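The conditional scheme differs from the unconditional one only in where the shuffling happens: X_j is permuted within each cell of the conditioning partition, so its relation to Z is preserved and only the conditional relation to Y is broken. A minimal sketch, where grouping by a single discrete Z stands in for the partition learned by the tree:

```python
import random
from collections import defaultdict

def conditional_permutation(x_j, z, rng):
    """Permute x_j within each level of z (the conditioning grid)."""
    by_level = defaultdict(list)  # indices belonging to each level of Z
    for i, level in enumerate(z):
        by_level[level].append(i)
    out = list(x_j)
    for idx in by_level.values():
        shuffled = idx[:]
        rng.shuffle(shuffled)
        for i_old, i_new in zip(idx, shuffled):
            out[i_old] = x_j[i_new]
    return out

rng = random.Random(7)
z = ["a", "a", "a", "b", "b", "b"]
x_j = [1, 2, 3, 10, 20, 30]
x_perm = conditional_permutation(x_j, z, rng)
# values only move inside their own Z-level: {1,2,3} stay with Z=a, {10,20,30} with Z=b
```

Because the within-group multisets are preserved, the empirical joint distribution of (X_j, Z) is unchanged, which is exactly the H_0 stated above.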
Technically: use any partition of the feature space for conditioning — here, the binary partition already learned by the tree. For each tree:

- determine the variables to condition on (via a threshold)
- extract their cutpoints
- generate the partition using the cutpoints as bisectors

Toy example: spurious correlation between shoe size and reading skills in school children.

> mycf <- cforest(score ~ ., data = readingskills,
+                 control = cforest_unbiased(mtry = 2))
> varimp(mycf)
nativespeaker           age      shoesize
> varimp(mycf, conditional = TRUE)
nativespeaker           age      shoesize

(Strobl et al., 2008; computed with the party package — the numeric importance values were lost in transcription)

[Figure residue: simulation results and peptide-binding data — unconditional vs. conditional importance for several mtry values; predictor labels hy8, flex8, pol; numeric values lost.]
Other variable importance measures

- partial correlation / standardized beta: conditional effect of X_j given all other variables in the model
- random forest permutation importance: averaging over trees
- averaging over orderings for linear models (relaimpo; Grömping, 2006):
  - LMG (Lindeman, Merenda, and Gold, 1980)
  - PMVD (Feldman, 2005)
- unconditional varimp (randomForest, party; Breiman et al.; Hothorn et al., 2008)
- conditional varimp (party; Hothorn et al., 2008)
- dominance analysis (Azen and Budescu, 2003)
- hierarchical partitioning (Chevan and Sutherland, 1991); for GLMs: hier.part (Walsh and Mac Nally, 2008)
- elastic net (elasticnet, caret; Zou and Hastie, 2008; Kuhn, 2008) — grouping property: correlated predictors get similar (largest) scores
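LMG's "averaging over orderings" can be made concrete in a few lines: for every ordering of the predictors, record the increase in R² when X_j enters the model, then average those increases over all orderings. A small numpy sketch of the idea (a generic OLS R², not the relaimpo implementation; the data are made up):

```python
import itertools
import numpy as np

def r2(X, y, cols):
    """R^2 of an OLS fit of y on the given columns (plus intercept)."""
    Z = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    return 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

def lmg(X, y):
    """Average each X_j's sequential R^2 gain over all predictor orderings."""
    p = X.shape[1]
    scores = np.zeros(p)
    orderings = list(itertools.permutations(range(p)))
    for order in orderings:
        so_far = []
        for j in order:
            gain = r2(X, y, so_far + [j]) - (r2(X, y, so_far) if so_far else 0.0)
            scores[j] += gain
            so_far.append(j)
    return scores / len(orderings)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X[:, 1] = 0.9 * X[:, 0] + 0.1 * X[:, 1]  # make X_0 and X_1 correlated
y = 2 * X[:, 0] + X[:, 2] + rng.normal(size=200)
scores = lmg(X, y)
# per ordering the gains telescope, so the LMG shares sum to the full-model R^2
```

The telescoping sum per ordering is what makes LMG a "proper decomposition" in the sense of the properties listed below: the shares add up to the model R² by construction.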
Desirable (?) properties of an R² decomposition (Grömping, 2007):

- proper decomposition: scores sum up to the model R² — LMG, PMVD
- non-negativity — LMG, PMVD, RF varimp (in principle)
- exclusion: β_j = 0 ⇒ score = 0 — partial correlation, standardized betas, PMVD, RF conditional varimp (in principle), elasticnet?
- inclusion: β_j ≠ 0 ⇒ score ≠ 0 — all

Simulation study

dgp: y_i = β_1 x_{i,1} + ... + β_p x_{i,p} + ε_i, with ε_i i.i.d. N(0, σ²) and (X_1, ..., X_p) ~ N(0, Σ). [The table of coefficients β_j and the entries of the block-correlation matrix Σ were lost in transcription.]
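A dgp of this form is easy to reproduce in outline: draw correlated predictors from N(0, Σ) and form y as a linear combination plus noise. The concrete β and Σ of the talk did not survive transcription, so the values below are hypothetical placeholders with the same flavour — a correlated block among the first predictors and several null coefficients:

```python
import numpy as np

rng = np.random.default_rng(42)
p, n, rho = 12, 1000, 0.9

# Sigma: first 4 predictors pairwise correlated with rho, the rest independent
Sigma = np.eye(p)
Sigma[:4, :4] = rho
np.fill_diagonal(Sigma, 1.0)

# hypothetical coefficients: influential predictors inside and outside the block
beta = np.array([5, 5, 2, 0, -5, -5, -2, 0, 0, 0, 0, 0], dtype=float)

X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
y = X @ beta + rng.normal(scale=1.0, size=n)  # epsilon_i i.i.d. N(0, sigma^2)
```

A setup like this is what separates the measures: unconditional RF importance inflates the null predictor inside the correlated block, while conditional importance and the exclusion-respecting linear measures do not.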
[Figure residue: simulation results — per-predictor scores from the linear model ((standardized) coefficients), LMG, PMVD, RF unconditional varimp and RF conditional varimp for several mtry values (with and without scaling relative to R²), and the elastic net; only the panel titles survived transcription.]
Now wait a second...

What about elastic net's grouping property?

[Figure residue: elastic net coefficient paths (standardized coefficients vs. fraction of the L1 norm) for several values of lambda; numeric values lost.]
[Figure residue: elastic net coefficient path for a further value of lambda; numeric values lost.]
W.r.t. prediction accuracy, following the exclusion principle seems to be a reasonable rule:

- standardized betas, PMVD (not quite), RF conditional varimp (especially with large mtry), and the elastic net (tuned!) follow it
- RF: not limited to the linear model, interactions included, applicable even if p > n
- if you want the elastic net to group: don't tune!?

References

Azen, R. and D. V. Budescu (2003). The dominance analysis approach for comparing predictors in multiple regression. Psychological Methods 8(2), 129-148.

Breiman, L., A. Cutler, A. Liaw, and M. Wiener. randomForest: Breiman and Cutler's Random Forests for Classification and Regression. R package.

Chevan, A. and M. Sutherland (1991). Hierarchical partitioning. The American Statistician 45(2), 90-96.

Feldman, B. (2005). Relative importance and value. Technical report.

Grömping, U. (2006). relaimpo: Relative Importance of Regressors in Linear Models. R package.

Grömping, U. (2007). Estimators of relative importance for linear regression based on variance decomposition. The American Statistician 61(2), 139-147.
Kuhn, M. (2008). caret: Classification and Regression Training. R package.

Lindeman, R., P. Merenda, and R. Gold (1980). Introduction to Bivariate and Multivariate Analysis. Glenview, IL: Scott, Foresman & Co.

Strobl, C., A.-L. Boulesteix, T. Kneib, T. Augustin, and A. Zeileis (2008). Conditional variable importance for random forests. BMC Bioinformatics 9:307.

Strobl, C., A.-L. Boulesteix, A. Zeileis, and T. Hothorn (2007). Bias in random forest variable importance measures: Illustrations, sources and a solution. BMC Bioinformatics 8:25.

Walsh, C. and R. Mac Nally (2008). hier.part: Hierarchical Partitioning. R package.

Zou, H. and T. Hastie (2008). elasticnet: Elastic-Net for Sparse Estimation and Sparse PCA. R package.
More informationMachine Learning - TP
Machine Learning - TP Nathalie Villa-Vialaneix - nathalie.villa@univ-paris1.fr http://www.nathalievilla.org IUT STID (Carcassonne) & SAMM (Université Paris 1) Formation INRA, Niveau 3 Formation INRA (Niveau
More informationConstructing Prediction Intervals for Random Forests
Senior Thesis in Mathematics Constructing Prediction Intervals for Random Forests Author: Benjamin Lu Advisor: Dr. Jo Hardin Submitted to Pomona College in Partial Fulfillment of the Degree of Bachelor
More informationA simulation study of model fitting to high dimensional data using penalized logistic regression
A simulation study of model fitting to high dimensional data using penalized logistic regression Ellinor Krona Kandidatuppsats i matematisk statistik Bachelor Thesis in Mathematical Statistics Kandidatuppsats
More informationMeasuring the Stability of Results from Supervised Statistical Learning
Measuring the Stability of Results from Supervised Statistical Learning Michel Philipp, Thomas Rusch, Kurt Hornik, Carolin Strobl Research Report Series Report 131, January 2017 Institute for Statistics
More informationCensoring Unbiased Regression Trees and Ensembles
Johns Hopkins University, Dept. of Biostatistics Working Papers 1-31-216 Censoring Unbiased Regression Trees and Ensembles Jon Arni Steingrimsson Department of Biostatistics, Johns Hopkins Bloomberg School
More informationMA 575 Linear Models: Cedric E. Ginestet, Boston University Non-parametric Inference, Polynomial Regression Week 9, Lecture 2
MA 575 Linear Models: Cedric E. Ginestet, Boston University Non-parametric Inference, Polynomial Regression Week 9, Lecture 2 1 Bootstrapped Bias and CIs Given a multiple regression model with mean and
More informationCourse in Data Science
Course in Data Science About the Course: In this course you will get an introduction to the main tools and ideas which are required for Data Scientist/Business Analyst/Data Analyst. The course gives an
More informationEvaluation requires to define performance measures to be optimized
Evaluation Basic concepts Evaluation requires to define performance measures to be optimized Performance of learning algorithms cannot be evaluated on entire domain (generalization error) approximation
More informationRegression Models - Introduction
Regression Models - Introduction In regression models, two types of variables that are studied: A dependent variable, Y, also called response variable. It is modeled as random. An independent variable,
More informationData splitting. INSERM Workshop: Evaluation of predictive models: goodness-of-fit and predictive power #+TITLE:
#+TITLE: Data splitting INSERM Workshop: Evaluation of predictive models: goodness-of-fit and predictive power #+AUTHOR: Thomas Alexander Gerds #+INSTITUTE: Department of Biostatistics, University of Copenhagen
More informationData Mining Stat 588
Data Mining Stat 588 Lecture 02: Linear Methods for Regression Department of Statistics & Biostatistics Rutgers University September 13 2011 Regression Problem Quantitative generic output variable Y. Generic
More information