Multinomial functional regression with application to lameness detection for horses


Department of Mathematical Sciences

Multinomial functional regression with application to lameness detection for horses

Helle Sørensen (helle@math.ku.dk)
Joint with Seyed Nourollah Mousavi

useR! 2015, Ålborg

Background and aim for case study

Data collected by veterinarian Maj Halling Thomsen and her colleagues at the University of Copenhagen.

Problem:
- It is difficult to detect lameness and to identify the injured limb, especially for low-degree lameness
- Visual inspection and clinical examination are highly subjective
- There is large variability in assessments from different veterinarians, and also between repeated assessments by the same vet

Aim: Develop an objective method for lameness detection

Detection of lameness: The movie

[Video shown during the talk]

85 data signals (after preprocessing)

[Figure: average vertical acceleration curves, one panel per group (LF, RF, NO, LH, RH)]

Each curve: average vertical acceleration for one gait cycle
First half: stance on the right-fore/left-hind diagonal

Problem and approach

Classification for functional data: Given a data signal, which group does it belong to?

Our approach: functional multinomial regression using wavelets and LASSO
- Idea: wavelets offer precise approximations with sparse representations; use LASSO to induce sparseness (see the sketch below)
- Extension of the work by Zhao, Ogden, and Reiss (2012)
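
To illustrate the sparseness idea, here is a minimal R sketch (ours, not from the talk) using the wavethresh package: a noisy signal on a dyadic grid is still well approximated after most wavelet coefficients have been set to zero.

library(wavethresh)

set.seed(1)
tt <- seq(0, 1, length.out = 256)                      # 2^8 grid points
x  <- sin(4 * pi * tt) + 0.5 * cos(10 * pi * tt) + rnorm(256, sd = 0.05)

w  <- wd(x, filter.number = 4, family = "DaubLeAsymm") # discrete wavelet transform
wt <- threshold(w, policy = "universal")               # zero out small coefficients
xr <- wr(wt)                                           # reconstruct from the rest

mean(wt$D == 0)            # fraction of detail coefficients set to zero
sqrt(mean((x - xr)^2))     # RMSE of the sparse approximation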

Multinomial functional regression (MFR)

Data: covariate functions $x_i(t)$, categorical outcomes $y_i \in M$

Regression model
- Linear predictor for group m: $\eta_m(x) = \alpha_m + \int \beta_m(t)\, x(t)\, dt$
- Conditional probabilities: $p_m(x) = P(Y = m \mid X = x) = \frac{e^{\eta_m(x)}}{\sum_{l \in M} e^{\eta_l(x)}}$

Unknowns: intercepts $\alpha_m$ and coefficient functions $\beta_m(t)$

Classification of a new curve $x_0$: compute $\hat\eta_m(x_0)$ and $\hat p_m(x_0)$ for all groups, and assign $x_0$ to the group with the largest probability
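
The classification step only needs the fitted linear predictors. A small R sketch (function name and numbers are ours, purely illustrative):

softmax_classify <- function(eta) {
  p <- exp(eta - max(eta))                 # subtract max for numerical stability
  p <- p / sum(p)                          # p_m = exp(eta_m) / sum_l exp(eta_l)
  list(probs = p, group = names(which.max(p)))
}

eta0 <- c(LF = 0.2, LH = -0.5, NO = 1.3, RF = 0.0, RH = -0.1)  # toy values
softmax_classify(eta0)                     # assigns the curve to "NO"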

Estimation and computations

Estimation approach
- Expand the functions $x_i$ in a wavelet basis
- Use the wavelet coefficients as covariates in ordinary multinomial regression with LASSO penalization
- Translate the estimated regression coefficients back to coefficient functions $\hat\beta_m$

Tuning parameters are selected by cross-validation: the resolution level $j_0$ in the wavelet basis and the penalty parameter $\lambda$

Computations
- fda package for preprocessing
- wavethresh package for wavelet computations
- glmnet package for penalized multinomial regression
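
A compact sketch of the pipeline with wavethresh and glmnet. This is our reconstruction, not the authors' code: it assumes curves of length 2^J in the rows of a matrix, uses all wavelet coefficients as covariates (selection of j0 is sketched on the next slide), and lets cv.glmnet choose lambda.

library(wavethresh)
library(glmnet)

## Wavelet coefficients of one curve: scaling coefficient at level 0,
## then detail coefficients level by level (coarse to fine)
dwt_coefs <- function(x) {
  w <- wd(x, filter.number = 4, family = "DaubLeAsymm")
  J <- log2(length(x))
  c(accessC(w, level = 0),
    unlist(lapply(0:(J - 1), function(j) accessD(w, level = j))))
}

## LASSO-penalized multinomial regression on the wavelet coefficients
fit_mfr <- function(curves, y) {
  X <- t(apply(curves, 1, dwt_coefs))
  cv.glmnet(X, y, family = "multinomial", type.measure = "deviance")
}

## Toy data just to make the sketch runnable: 85 curves on 128 grid points
curves <- matrix(rnorm(85 * 128), nrow = 85)
y <- factor(sample(c("LF", "LH", "NO", "RF", "RH"), 85, replace = TRUE))
fit <- fit_mfr(curves, y)
predict(fit, newx = t(apply(curves, 1, dwt_coefs)), s = "lambda.min", type = "class")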

Cross-validation results

[Figure: mean cross-validated deviance (CVM) against log(lambda) for resolution levels j0 = 0, ..., 7; panel shown for the left-fore group]
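
A hedged sketch of how the two tuning parameters could be chosen jointly, in the spirit of the figure: for each coarsest resolution level j0, keep the scaling coefficients at level j0 plus all finer detail coefficients, let cv.glmnet handle lambda, and pick the pair with the smallest cross-validated deviance. Function names are ours.

## Coefficients for a given coarsest level j0: scaling coefficients at j0
## plus detail coefficients at levels j0, ..., J-1
coefs_at <- function(x, j0) {
  w <- wd(x, filter.number = 4, family = "DaubLeAsymm")
  J <- log2(length(x))
  c(accessC(w, level = j0),
    unlist(lapply(j0:(J - 1), function(j) accessD(w, level = j))))
}

cv_over_j0 <- function(curves, y, j0s = 0:4) {
  fits <- lapply(j0s, function(j0) {
    X <- t(apply(curves, 1, coefs_at, j0 = j0))
    cv.glmnet(X, y, family = "multinomial", type.measure = "deviance")
  })
  best <- which.min(sapply(fits, function(f) min(f$cvm)))  # smallest CV deviance
  list(j0 = j0s[best], fit = fits[[best]])
}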

Estimated coefficient functions: $\hat\beta_m - \hat\beta_{\mathrm{no}}$

[Figure: four panels showing the estimated coefficient functions relative to the normal group: left fore − normal, right fore − normal, left hind − normal, right hind − normal]
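
The translation back from fitted wavelet-domain coefficients to a coefficient function $\hat\beta_m(t)$ is just an inverse wavelet transform. A sketch under the same assumptions as above (the coefs_at layout with j0 = 0; b is the glmnet coefficient vector for group m, without the intercept):

beta_function <- function(b) {
  n <- length(b)                                   # one coefficient per grid point
  w <- wd(rep(0, n), filter.number = 4, family = "DaubLeAsymm")
  w <- putC(w, level = 0, v = b[1])                # scaling coefficient
  pos <- 2
  for (j in 0:(log2(n) - 1)) {                     # detail coefficients, coarse to fine
    w <- putD(w, level = j, v = b[pos:(pos + 2^j - 1)])
    pos <- pos + 2^j
  }
  wr(w)                                            # beta_hat_m evaluated on the grid
}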

... and 100 bootstrap samples

[Figure: the same four panels (left fore − normal, right fore − normal, left hind − normal, right hind − normal), with coefficient functions from 100 bootstrap samples overlaid]

Classification results

Leave-one-out classification (predicted group by true group):

True \ Predicted   LF   LH   NO   RF   RH
LF                 13    0    2    0    1
LH                  1   12    2    1    0
NO                  1    1   19    1    1
RF                  0    2    1   13    0
RH                  1    0    2    0   11

68 curves were classified to the correct group (80%); five more curves to the correct diagonal; only one curve to the wrong diagonal.

PC-LDA and the method by Ferraty and Vieu (2003): 65% correct
Simulation study with similar data: 89% correct classifications
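
A leave-one-out table of this kind can be produced with a simple refit-and-predict loop; a sketch reusing fit_mfr and dwt_coefs from the earlier slides (slow, since the full cross-validated fit is repeated once per curve):

loo_classify <- function(curves, y) {
  pred <- character(nrow(curves))
  for (i in seq_len(nrow(curves))) {
    fit <- fit_mfr(curves[-i, , drop = FALSE], y[-i])    # refit without curve i
    pred[i] <- predict(fit, newx = rbind(dwt_coefs(curves[i, ])),
                       s = "lambda.min", type = "class")
  }
  table(True = y, Predicted = pred)
}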

Discussion

- The estimated $\hat\beta_m$'s satisfy, to some extent, natural symmetry restrictions. It is possible to build those restrictions in explicitly.
- MFR performs at least as well as the other methods we have tested, also in an application to speech recognition
- There are many possible variations which we did not pursue
- The data come from only eight horses! This calls for a random effect of horse, at least for inference about the coefficient functions.

Thank you for your attention
Questions and comments are most welcome: helle@math.ku.dk

References

Mousavi, S. N., and Sørensen, H. (2015). Multinomial functional regression with wavelets and LASSO penalization. Submitted.

Zhao, Y., Ogden, R. T., and Reiss, P. T. (2012). Wavelet-based LASSO in functional linear regression. Journal of Computational and Graphical Statistics, 21:600–617.

Friedman, J., Hastie, T., and Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, 33(1):1–22.

Ferraty, F., and Vieu, P. (2003). Curves discrimination: a nonparametric functional approach. Computational Statistics & Data Analysis, 44:161–173.

Simulation results

[Figure: misclassification rates (%) for test data and "twins" test data, with group sizes n = (20,20,20,20,20) and n = (40,40,40,40,40), under dense and sparse signals with no, small, and large noise]

Probabilities for classification

[Figure: maximum estimated probability for each curve, by true status (LF, LH, NO, RF, RH), colored by outcome: correct group, correct diagonal but wrong group, or remaining]

Estimated coefficient functions with symmetry restrictions

[Figure: two panels showing the estimated deviations from the NO group for LF, LH, RF, and RH]