Detecting Statistical Interactions from Neural Network Weights

1 Detecting Statistical Interactions from Neural Network Weights. Michael Tsang. Joint work with Dehua Cheng and Yan Liu.

2 Motivation: We seek assurance that a neural network learned the longitude × latitude interaction for predicting housing price. [Figure: price markers of one to four dollar signs]

3 Problem: Can we detect statistical interactions in data by interpreting the trained weights of a multilayer perceptron (MLP)? Doing so would also let the complex behavior of MLPs be better understood.

4 Statistical Interaction (Sorokina et al. 2008): non-additive groupings of variables in $F(\mathbf{x})$. For example, $F(\mathbf{x}) = \sin(x_1 + x_2 + x_3) + x_3 x_4 + x_5$ contains the interactions {1,2,3} and {3,4}, whereas $F(\mathbf{x}) = \log(x_1 x_2) = \log x_1 + \log x_2$ contains no interaction.
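The definition above can be checked numerically at a single point. The sketch below assumes the slide's example reads F(x) = sin(x1 + x2 + x3) + x3 x4 + x5 (a reconstruction of the garbled transcript) and uses a mixed finite difference, which is an illustrative test chosen here and not the detection method of the talk. The helper name `mixed_difference` and the evaluation point are hypothetical.

```python
import math

def F(x):
    # Assumed reading of the slide example: interactions {1,2,3} and {3,4}.
    x1, x2, x3, x4, x5 = x
    return math.sin(x1 + x2 + x3) + x3 * x4 + x5

def G(x):
    # log(x1 * x2) = log(x1) + log(x2): additive, so no interaction.
    x1, x2 = x
    return math.log(x1 * x2)

def mixed_difference(f, x, i, j, h=1e-3):
    """Central finite-difference estimate of d^2 f / (dx_i dx_j), 0-based indices."""
    def shifted(di, dj):
        y = list(x)
        y[i] += di
        y[j] += dj
        return f(y)
    return (shifted(h, h) - shifted(h, -h) - shifted(-h, h) + shifted(-h, -h)) / (4 * h * h)

point = [0.3, 0.5, 0.7, 1.1, 1.3]
d12 = mixed_difference(F, point, 0, 1)        # x1, x2 share the sine term
d34 = mixed_difference(F, point, 2, 3)        # x3, x4 share the product term
d15 = mixed_difference(F, point, 0, 4)        # x5 enters additively
dlog = mixed_difference(G, [2.0, 3.0], 0, 1)  # additive after taking the log
print(d12, d34, d15, dlog)
```

A mixed difference that is clearly nonzero at some point signals that the two variables interact; values near zero at one point are only suggestive, since the interaction could vanish locally.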

6 Core Insight: In nonlinear networks, the interaction {1,3} should exist. Assumption: first-layer hidden units are especially good at modeling interactions.

7 Neural Interaction Detection (NID) Framework: 1. Train an MLP with regularization. 2. Rank interactions by interpreting weights. 3. Find a cutoff on the ranking (if desired).

8 Rank Interactions by Interpreting Weights: interaction strength per hidden unit (for hidden unit i); approximation of hidden unit influence.
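The slide's equations did not survive transcription. As a hedged sketch, the code below follows the definitions given in the accompanying NID paper as best recalled here: the influence of first-layer hidden units is approximated by multiplying the absolute values of all weight matrices above the first layer, and the strength of an interaction at hidden unit i is that influence times the minimum absolute first-layer weight over the interaction's features. The function names and the toy weights are hypothetical.

```python
def abs_matmul(a, b):
    """Product |a| @ |b| for matrices stored as lists of rows."""
    return [[sum(abs(a[r][k]) * abs(b[k][c]) for k in range(len(b)))
             for c in range(len(b[0]))]
            for r in range(len(a))]

def hidden_unit_influence(upper_weights):
    """Approximate influence z of each first-layer hidden unit on the output.

    upper_weights lists the matrices above the first hidden layer, output first:
    [w_y (1 x p_L), W_L (p_L x p_{L-1}), ..., W_2 (p_2 x p_1)].
    """
    z = [[abs(v) for v in upper_weights[0][0]]]
    for w in upper_weights[1:]:
        z = abs_matmul(z, w)
    return z[0]

def unit_interaction_strength(z_i, w_row, interaction):
    """Strength of `interaction` (0-based feature indices) at one hidden unit."""
    return z_i * min(abs(w_row[j]) for j in interaction)

# Toy network: 3 inputs, 2 first-layer units, 2 second-layer units, 1 output.
w1 = [[0.9, -0.8, 0.0],
      [0.1, 0.5, -0.7]]
w2 = [[1.0, -2.0],
      [0.5, 0.0]]
w_y = [[0.5, -2.0]]

z = hidden_unit_influence([w_y, w2])
s_unit0 = unit_interaction_strength(z[0], w1[0], (0, 1))
s_unit1 = unit_interaction_strength(z[1], w1[1], (0, 1))
print(z, s_unit0, s_unit1)
```

Taking the minimum means a feature with a near-zero weight into a hidden unit contributes almost nothing to any interaction scored at that unit.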

12 Ranking Pairwise Interactions [Figure: inputs $x_1, x_2, x_3, x_4$]
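A pairwise ranking can be sketched by scoring every feature pair at every first-layer hidden unit and summing the scores. This assumes the per-unit strength is the unit's influence times the smaller of the two absolute input weights, as in the NID paper; the influence values `z` and first-layer weights `w1` below are made-up inputs, and `rank_pairwise` is a hypothetical helper name.

```python
from itertools import combinations

def rank_pairwise(z, w1):
    """Rank all feature pairs by summed strength over first-layer hidden units.

    z  : influence of each first-layer hidden unit (non-negative floats)
    w1 : first-layer weights, one row per hidden unit, one column per feature
    """
    n_features = len(w1[0])
    scores = {}
    for a, b in combinations(range(n_features), 2):
        scores[(a, b)] = sum(z_i * min(abs(row[a]), abs(row[b]))
                             for z_i, row in zip(z, w1))
    # Strongest pairs first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

z = [1.5, 1.0]
w1 = [[0.9, -0.8, 0.0],
      [0.1, 0.5, -0.7]]
pair_ranking = rank_pairwise(z, w1)
for pair, strength in pair_ranking:
    print(pair, round(strength, 3))
```

Feature indices are 0-based here, so the pair printed as (0, 1) corresponds to {1,2} in the slides' notation.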

26 Ranking Higher-Order Interactions
Inputs $x_1, x_2, x_3, x_4$; first-layer hidden units $h_1, h_2, h_3, h_4$.
Weight ordering at $h_1$: $w_1 > w_2 > w_3 > w_4$. Weight ordering at $h_2$: $w_3 > w_1 > w_2 > w_4$.
Interactions: Strengths
{1,2}: $z_1 \min(w_1, w_2) = z_1 w_2$
{1,2,3}: $z_1 w_3 + z_2 w_2$
{1,2,3,4}: $z_1 w_4 + z_2 w_4$
{1,3}: $z_2 w_1$
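The build-up on this slide can be read as a greedy procedure: at each hidden unit, sort the features by weight magnitude, treat every prefix of two or more features as a candidate interaction, score it by the unit's influence times the smallest weight in the prefix, and add up scores for identical candidates across units. The sketch below implements that reading; the numeric weights and influence values are invented to match the orderings shown for $h_1$ and $h_2$, and `rank_higher_order` is a hypothetical name.

```python
def rank_higher_order(z, w1):
    """Greedy higher-order interaction ranking from first-layer weights.

    z  : influence of each first-layer hidden unit
    w1 : first-layer weights, one row per hidden unit
    Returns (frozenset of 0-based feature indices, strength), strongest first.
    """
    strengths = {}
    for z_i, row in zip(z, w1):
        order = sorted(range(len(row)), key=lambda j: abs(row[j]), reverse=True)
        for size in range(2, len(order) + 1):
            candidate = frozenset(order[:size])
            # The minimum |w| in the prefix is the weight of the last feature added.
            smallest = abs(row[order[size - 1]])
            strengths[candidate] = strengths.get(candidate, 0.0) + z_i * smallest
    return sorted(strengths.items(), key=lambda kv: kv[1], reverse=True)

z = [1.0, 0.8]
w1 = [[0.9, 0.7, 0.5, 0.1],   # h1: w1 > w2 > w3 > w4
      [0.6, 0.4, 0.8, 0.2]]   # h2: w3 > w1 > w2 > w4
higher_ranking = rank_higher_order(z, w1)
for features, strength in higher_ranking:
    print(sorted(f + 1 for f in features), round(strength, 3))  # 1-based labels
```

Only prefixes of each unit's sorted feature list are scored, so the number of candidates grows linearly with the number of features per hidden unit instead of exponentially.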

36 Sample Interaction Ranking (interactions by strength): {1,2,3}, {1,2,3,4}, {1,2}, {1,3}

37 Find a Cutoff on the Ranking: Use a generalized additive model with interactions (MLP-Cutoff). [Figure: error versus cutoff K]
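One way to turn an error-versus-K curve into a cutoff is to pick the smallest K whose error is already close to the best error observed. The rule and the 1% tolerance below are assumptions made for illustration, since the slide only shows the curve; the error values are invented and `find_cutoff` is a hypothetical helper.

```python
def find_cutoff(errors_by_k, tolerance=0.01):
    """Smallest K whose validation error is within `tolerance` (relative) of the best.

    errors_by_k[k] is the error of a model that uses the top-k ranked interactions.
    Raises ValueError if the list is empty.
    """
    best = min(errors_by_k)
    for k, err in enumerate(errors_by_k):
        if err <= best * (1 + tolerance):
            return k

# Invented validation errors for K = 0, 1, ..., 5 interactions.
errors = [0.40, 0.25, 0.12, 0.119, 0.121, 0.120]
cutoff = find_cutoff(errors)
print(cutoff)
```

With these numbers the error stops improving meaningfully after the second interaction is added, so the interactions ranked below that point would be discarded.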

38 Test Suite of Data-Generating Functions: Complex functions are used in our evaluation.

39 AUC of Pairwise Interaction Strengths. Footnotes: 1 Fisher 1925, 2 Bien et al. 2013, 3 Sorokina et al. 2008. *$F_6$ plays an important role for this result.
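For a synthetic function with known interactions, AUC measures how often a true interacting pair is scored above a non-interacting one. The following is a minimal sketch of that computation; the scores and the ground-truth set are invented, ties count as half a win (a common convention assumed here), and `pairwise_auc` is a hypothetical name.

```python
def pairwise_auc(scores, true_pairs):
    """AUC of pairwise interaction strengths against ground-truth pairs.

    scores     : dict mapping (i, j) feature pairs to a strength
    true_pairs : set of pairs that truly interact
    """
    pos = [s for p, s in scores.items() if p in true_pairs]
    neg = [s for p, s in scores.items() if p not in true_pairs]
    if not pos or not neg:
        raise ValueError("need at least one true and one false pair")
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

scores = {(0, 1): 1.3, (1, 2): 0.5, (0, 2): 0.1, (2, 3): 0.6}
true_pairs = {(0, 1), (1, 2)}
auc = pairwise_auc(scores, true_pairs)
print(auc)
```

An AUC of 1.0 would mean every true pair outranks every false pair, while 0.5 corresponds to a ranking no better than chance.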

40 Higher-Order Interaction Detection for Synthetic Data

41 Higher-Order Interaction Detection versus Baseline: Similar detection performance at varying noise levels. Runtime is orders of magnitude faster.

44 Back to our housing problem [Figure: price markers of one to four dollar signs]

45 Pairwise Heat-Maps for Real-World Data: {1,2}: longitude and latitude! {4,7}: hour and working day. Footnotes: 1 Pace et al. 1997, 2 Fanaee-T et al. 2014, 3 Adam-Bourdarios et al. 2014, 4 Frey et al. 1991.

47 Higher-Order Interaction Detection for Real-World Data: Reached the cutoff point and obtained informative interactions.

48 Summary: Proposed Neural Interaction Detection (NID), which detects interactions from neural network weights. NID takes orders of magnitude less time to reach performance similar to the state-of-the-art baseline.

49 References
Adam-Bourdarios, Claire, et al. "Learning to discover: the Higgs boson machine learning challenge." URL (2014).
Bien, Jacob, Jonathan Taylor, and Robert Tibshirani. "A lasso for hierarchical interactions." Annals of Statistics 41.3 (2013).
Fanaee-T, Hadi, and Joao Gama. "Event labeling combining ensemble detectors and background knowledge." Progress in Artificial Intelligence (2014).
Fisher, Ronald Aylmer. "Statistical methods for research workers." Breakthroughs in Statistics. Springer, New York, NY.
Frey, Peter W., and David J. Slate. "Letter recognition using Holland-style adaptive classifiers." Machine Learning 6.2 (1991).
Pace, R. Kelley, and Ronald Barry. "Sparse spatial autoregressions." Statistics & Probability Letters 33.3 (1997).
Sorokina, Daria, et al. "Detecting statistical interactions with additive groves of trees." Proceedings of the 25th International Conference on Machine Learning. ACM, 2008.


More information

Confidence Estimation Methods for Neural Networks: A Practical Comparison

Confidence Estimation Methods for Neural Networks: A Practical Comparison , 6-8 000, Confidence Estimation Methods for : A Practical Comparison G. Papadopoulos, P.J. Edwards, A.F. Murray Department of Electronics and Electrical Engineering, University of Edinburgh Abstract.

More information

Data Dependence in Combining Classifiers

Data Dependence in Combining Classifiers in Combining Classifiers Mohamed Kamel, Nayer Wanas Pattern Analysis and Machine Intelligence Lab University of Waterloo CANADA ! Dependence! Dependence Architecture! Algorithm Outline Pattern Recognition

More information

Hierarchical Boosting and Filter Generation

Hierarchical Boosting and Filter Generation January 29, 2007 Plan Combining Classifiers Boosting Neural Network Structure of AdaBoost Image processing Hierarchical Boosting Hierarchical Structure Filters Combining Classifiers Combining Classifiers

More information

An Introduction to Statistical and Probabilistic Linear Models

An Introduction to Statistical and Probabilistic Linear Models An Introduction to Statistical and Probabilistic Linear Models Maximilian Mozes Proseminar Data Mining Fakultät für Informatik Technische Universität München June 07, 2017 Introduction In statistical learning

More information

Engineering Part IIB: Module 4F10 Statistical Pattern Processing Lecture 6: Multi-Layer Perceptrons I

Engineering Part IIB: Module 4F10 Statistical Pattern Processing Lecture 6: Multi-Layer Perceptrons I Engineering Part IIB: Module 4F10 Statistical Pattern Processing Lecture 6: Multi-Layer Perceptrons I Phil Woodland: pcw@eng.cam.ac.uk Michaelmas 2012 Engineering Part IIB: Module 4F10 Introduction In

More information

A Deep Interpretation of Classifier Chains

A Deep Interpretation of Classifier Chains A Deep Interpretation of Classifier Chains Jesse Read and Jaakko Holmén http://users.ics.aalto.fi/{jesse,jhollmen}/ Aalto University School of Science, Department of Information and Computer Science and

More information

Machine learning approaches to the Higgs boson self coupling

Machine learning approaches to the Higgs boson self coupling Machine learning approaches to the Higgs boson self coupling 6 JUNE ICHEP 2018 @ COEX SEOUL JUBIN PARK (CHONNAM NATIONAL UNIVERSITY) COLLABORATED WITH JUNG CHANG, KINGMAN CHEUNG, JAE SIK LEE, CHIH-TING

More information

Sample questions for Fundamentals of Machine Learning 2018

Sample questions for Fundamentals of Machine Learning 2018 Sample questions for Fundamentals of Machine Learning 2018 Teacher: Mohammad Emtiyaz Khan A few important informations: In the final exam, no electronic devices are allowed except a calculator. Make sure

More information

Supervised Machine Learning: Learning SVMs and Deep Learning. Klaus-Robert Müller!!et al.!!

Supervised Machine Learning: Learning SVMs and Deep Learning. Klaus-Robert Müller!!et al.!! Supervised Machine Learning: Learning SVMs and Deep Learning Klaus-Robert Müller!!et al.!! Today s Tutorial Machine Learning introduction: ingredients for ML Kernel Methods and Deep networks with explaining

More information

Artificial Intelligence (AI) Common AI Methods. Training. Signals to Perceptrons. Artificial Neural Networks (ANN) Artificial Intelligence

Artificial Intelligence (AI) Common AI Methods. Training. Signals to Perceptrons. Artificial Neural Networks (ANN) Artificial Intelligence Artificial Intelligence (AI) Artificial Intelligence AI is an attempt to reproduce intelligent reasoning using machines * * H. M. Cartwright, Applications of Artificial Intelligence in Chemistry, 1993,

More information

Introduction to Machine Learning Lecture 13. Mehryar Mohri Courant Institute and Google Research

Introduction to Machine Learning Lecture 13. Mehryar Mohri Courant Institute and Google Research Introduction to Machine Learning Lecture 13 Mehryar Mohri Courant Institute and Google Research mohri@cims.nyu.edu Multi-Class Classification Mehryar Mohri - Introduction to Machine Learning page 2 Motivation

More information

Last update: October 26, Neural networks. CMSC 421: Section Dana Nau

Last update: October 26, Neural networks. CMSC 421: Section Dana Nau Last update: October 26, 207 Neural networks CMSC 42: Section 8.7 Dana Nau Outline Applications of neural networks Brains Neural network units Perceptrons Multilayer perceptrons 2 Example Applications

More information

Influence of weight initialization on multilayer perceptron performance

Influence of weight initialization on multilayer perceptron performance Influence of weight initialization on multilayer perceptron performance M. Karouia (1,2) T. Denœux (1) R. Lengellé (1) (1) Université e Compiègne U.R.A. CNRS 817 Heuiasyc BP 649 - F-66 Compiègne ceex -

More information

Errors, and What to Do. CS 188: Artificial Intelligence Fall What to Do About Errors. Later On. Some (Simplified) Biology

Errors, and What to Do. CS 188: Artificial Intelligence Fall What to Do About Errors. Later On. Some (Simplified) Biology CS 188: Artificial Intelligence Fall 2011 Lecture 22: Perceptrons and More! 11/15/2011 Dan Klein UC Berkeley Errors, and What to Do Examples of errors Dear GlobalSCAPE Customer, GlobalSCAPE has partnered

More information

Online Estimation of Discrete Densities using Classifier Chains

Online Estimation of Discrete Densities using Classifier Chains Online Estimation of Discrete Densities using Classifier Chains Michael Geilke 1 and Eibe Frank 2 and Stefan Kramer 1 1 Johannes Gutenberg-Universtität Mainz, Germany {geilke,kramer}@informatik.uni-mainz.de

More information

Multitask Learning of Environmental Spatial Data

Multitask Learning of Environmental Spatial Data 9th International Congress on Environmental Modelling and Software Brigham Young University BYU ScholarsArchive 6th International Congress on Environmental Modelling and Software - Leipzig, Germany - July

More information

CS 188: Artificial Intelligence Fall 2011

CS 188: Artificial Intelligence Fall 2011 CS 188: Artificial Intelligence Fall 2011 Lecture 22: Perceptrons and More! 11/15/2011 Dan Klein UC Berkeley Errors, and What to Do Examples of errors Dear GlobalSCAPE Customer, GlobalSCAPE has partnered

More information

A Bayesian Approach to Concept Drift

A Bayesian Approach to Concept Drift A Bayesian Approach to Concept Drift Stephen H. Bach Marcus A. Maloof Department of Computer Science Georgetown University Washington, DC 20007, USA {bach, maloof}@cs.georgetown.edu Abstract To cope with

More information

Artificial Neural Networks. Introduction to Computational Neuroscience Tambet Matiisen

Artificial Neural Networks. Introduction to Computational Neuroscience Tambet Matiisen Artificial Neural Networks Introduction to Computational Neuroscience Tambet Matiisen 2.04.2018 Artificial neural network NB! Inspired by biology, not based on biology! Applications Automatic speech recognition

More information

ARTIFICIAL NEURAL NETWORK PART I HANIEH BORHANAZAD

ARTIFICIAL NEURAL NETWORK PART I HANIEH BORHANAZAD ARTIFICIAL NEURAL NETWORK PART I HANIEH BORHANAZAD WHAT IS A NEURAL NETWORK? The simplest definition of a neural network, more properly referred to as an 'artificial' neural network (ANN), is provided

More information

Deep Learning Architecture for Univariate Time Series Forecasting

Deep Learning Architecture for Univariate Time Series Forecasting CS229,Technical Report, 2014 Deep Learning Architecture for Univariate Time Series Forecasting Dmitry Vengertsev 1 Abstract This paper studies the problem of applying machine learning with deep architecture

More information

Note on Algorithm Differences Between Nonnegative Matrix Factorization And Probabilistic Latent Semantic Indexing

Note on Algorithm Differences Between Nonnegative Matrix Factorization And Probabilistic Latent Semantic Indexing Note on Algorithm Differences Between Nonnegative Matrix Factorization And Probabilistic Latent Semantic Indexing 1 Zhong-Yuan Zhang, 2 Chris Ding, 3 Jie Tang *1, Corresponding Author School of Statistics,

More information

Multivariate Methods in Statistical Data Analysis

Multivariate Methods in Statistical Data Analysis Multivariate Methods in Statistical Data Analysis Web-Site: http://tmva.sourceforge.net/ See also: "TMVA - Toolkit for Multivariate Data Analysis, A. Hoecker, P. Speckmayer, J. Stelzer, J. Therhaag, E.

More information