Learning From Data Lecture 15 Reflecting on Our Path - Epilogue to Part I


Learning From Data Lecture 15: Reflecting on Our Path - Epilogue to Part I
What We Did · The Machine Learning Zoo · Moving Forward
M. Magdon-Ismail, CSCI 4100/6100

recap: Three Learning Principles

Occam's razor: simpler is better, provided the simpler hypothesis is falsifiable.
[Figure: three scientists fit resistivity ρ versus temperature T; Scientist 1's fit is not falsifiable, while Scientists 2 and 3 produce falsifiable fits.]

Sampling bias: ensure that the training and test distributions are the same, or else acknowledge/account for it. You cannot sample from one bin and use your estimates for another bin.

Data snooping: you are charged for every choice influenced by D, so choose the learning process (usually H) before looking at D. We know the price of choosing g from H.
[Diagram: the hypothesis set H and the data D feed the learning process, which outputs g; your choices are what you are charged for.]

© AML Creator: Malik Magdon-Ismail. Reflecting on Our Path: 2/10

Our Plan

1. What is learning? Output g ≈ f after looking at data (x_n, y_n).
2. Can we do it? E_in ≈ E_out: simple H, finite d_vc, large N. E_in ≈ 0: good H, algorithms.
3. How to do it? Linear models, nonlinear transforms. Algorithms: PLA, pseudoinverse, gradient descent.
4. How to do it well? Overfitting: stochastic and deterministic noise. Cures: regularization, validation.
5. General principles? Occam's razor, sampling bias, data snooping.
6. Advanced techniques.
7. Other learning paradigms.

(Margin labels on the slide: concepts, theory, practice.)
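As a concrete reminder of item 3, the two one-line algorithms can be sketched as follows: the perceptron learning algorithm (PLA) and one-step linear regression via the pseudoinverse. This is a minimal sketch on illustrative toy data; the function names and dataset are not from the lecture.

```python
import numpy as np

def pla(X, y, max_iters=1000):
    """Perceptron Learning Algorithm: repeatedly pick a misclassified
    point and update w <- w + y_n * x_n until none remain."""
    w = np.zeros(X.shape[1])
    for _ in range(max_iters):
        misclassified = np.where(np.sign(X @ w) != y)[0]
        if misclassified.size == 0:
            break
        n = misclassified[0]
        w = w + y[n] * X[n]
    return w

def linear_regression(X, y):
    """One-step linear regression: w = X^+ y, the pseudoinverse solution."""
    return np.linalg.pinv(X) @ y

# Toy linearly separable data with a bias coordinate x_0 = 1.
X = np.array([[1.0, 2.0, 2.0], [1.0, 3.0, 3.0],
              [1.0, -2.0, -2.0], [1.0, -3.0, -1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
w_pla = pla(X, y)
w_reg = linear_regression(X, y)
```

On separable data like this, PLA converges to a separating w, and the regression weights (thresholded through sign) also classify correctly, which is why regression is often used to initialize classification.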

Learning From Data: It's A Jungle Out There

[Word cloud of topics:] overfitting, stochastic noise, K-means, augmented error, ill-posed, distribution-free learning, stochastic gradient descent, exploration, Gaussian processes, Lloyd's algorithm, deterministic noise, bootstrapping, data snooping, linear regression, bagging, unlabelled data, expectation-maximization, logistic regression, CART, Rademacher complexity, transfer learning, VC dimension, reinforcement learning, learning curve, exploitation, Q-learning, nonlinear transformation, sampling bias, support vectors, neural networks, Markov Chain Monte Carlo (MCMC), Mercer's theorem, linear models, ordinal regression, AdaBoost, decision trees, SVM, graphical models, bioinformatics, training versus testing, Gibbs sampling, extrapolation, no free lunch, cross validation, HMMs, RBF, bias-variance tradeoff, data contamination, perceptron learning, deep learning, PAC-learning, error measures, biometrics, active learning, multiclass, MDL, types of learning, one versus all, unsupervised learning, weak learning, is learning feasible?, momentum, RKHS, conjugate gradients, online learning, Occam's razor, noisy targets, mixture of experts, kernel methods, ranking, multi-agent systems, boosting, AIC, ensemble methods, classification, PCA, kernel-PCA, permutation complexity, LLE, regularization, primal-dual, collaborative filtering, semi-supervised learning, clustering, Levenberg-Marquardt, weight decay, Big Data, Boltzmann machine

Navigating the Jungle: Theory

THEORY: VC analysis, bias-variance, Rademacher complexity, SRM

Navigating the Jungle: Techniques

THEORY: VC analysis, bias-variance, Rademacher complexity, SRM
TECHNIQUES: Models, Methods

Navigating the Jungle: Models

THEORY: VC analysis, bias-variance, Rademacher complexity, SRM
TECHNIQUES:
- Models: linear, neural networks, SVM, similarity, Gaussian processes, graphical models, bilinear/SVD
- Methods

Navigating the Jungle: Methods

THEORY: VC analysis, bias-variance, Rademacher complexity, SRM
TECHNIQUES:
- Models: linear, neural networks, SVM, similarity, Gaussian processes, graphical models, bilinear/SVD
- Methods: regularization, validation, aggregation, preprocessing
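Two of the methods above, regularization and validation, work naturally together: fit with weight decay on a training split, then pick the regularization parameter by validation error. A hedged toy sketch; the polynomial transform, split sizes, and lambda grid are illustrative assumptions, not from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

def ridge_fit(Z, y, lam):
    """Weight decay: w = (Z^T Z + lam I)^{-1} Z^T y."""
    d = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + lam * np.eye(d), Z.T @ y)

# Noisy 1-D target passed through a polynomial (nonlinear) transform.
x = rng.uniform(-1, 1, size=30)
y = x**2 + 0.1 * rng.normal(size=30)
Z = np.vander(x, 8)                      # degree-7 polynomial features

# Holdout validation: train on 20 points, validate on 10, choose lambda.
Z_train, y_train = Z[:20], y[:20]
Z_val, y_val = Z[20:], y[20:]
errors = {}
for lam in [0.0, 1e-3, 1e-1, 1.0]:
    w = ridge_fit(Z_train, y_train, lam)
    errors[lam] = np.mean((Z_val @ w - y_val) ** 2)
best_lam = min(errors, key=errors.get)
```

The design choice mirrors the course's message: regularization constrains the fit, and validation (not the training error) decides how much constraint to apply.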

Navigating the Jungle: Paradigms

THEORY: VC analysis, bias-variance, Rademacher complexity, SRM
TECHNIQUES:
- Models: linear, neural networks, SVM, similarity, Gaussian processes, graphical models, bilinear/SVD
- Methods: regularization, validation, aggregation, preprocessing
PARADIGMS: supervised, unsupervised, reinforcement, active, online, unlabeled, transfer learning, big data

Moving Forward

1. What is learning? Output g ≈ f after looking at data (x_n, y_n).
2. Can we do it? E_in ≈ E_out: simple H, finite d_vc, large N. E_in ≈ 0: good H, algorithms.
3. How to do it? Linear models, nonlinear transforms. Algorithms: PLA, pseudoinverse, gradient descent.
4. How to do it well? Overfitting: stochastic and deterministic noise. Cures: regularization, validation.
5. General principles? Occam's razor, sampling bias, data snooping.
6. Advanced techniques: similarity, neural networks, SVMs, preprocessing and aggregation.
7. Other learning paradigms: unsupervised, reinforcement.

(Margin labels on the slide: concepts, theory, practice.)
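Of the three algorithms in item 3, gradient descent is the one that carries forward beyond linear regression. A minimal sketch on logistic regression with the cross-entropy error; the step size, iteration count, and toy data are illustrative assumptions.

```python
import numpy as np

def cross_entropy_error(w, X, y):
    """E_in(w) = (1/N) sum_n ln(1 + exp(-y_n w^T x_n))."""
    return np.mean(np.log(1.0 + np.exp(-y * (X @ w))))

def gradient_descent(X, y, eta=0.5, iters=500):
    """Fixed-step gradient descent on the cross-entropy error."""
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        # Gradient: -(1/N) sum_n y_n x_n / (1 + exp(y_n w^T x_n))
        s = y * (X @ w)
        grad = -((y / (1.0 + np.exp(s)))[:, None] * X).mean(axis=0)
        w -= eta * grad
    return w

# Toy separable data with a bias coordinate x_0 = 1.
X = np.array([[1.0, 2.0], [1.0, 1.0], [1.0, -1.0], [1.0, -2.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
w = gradient_descent(X, y)
```

Unlike PLA and the pseudoinverse, the same loop applies to any differentiable error measure, which is why it reappears in the advanced-techniques part of the course (e.g. for neural networks).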