Data Dependence in Combining Classifiers


1 Data Dependence in Combining Classifiers Mohamed Kamel, Nayer Wanas, Pattern Analysis and Machine Intelligence Lab, University of Waterloo, Canada

2 Outline ! Data Dependence ! Data Dependence Architecture ! Training Algorithm

3 Pattern Recognition Systems ! Best possible classification rates. ! Increased efficiency and accuracy. Multiple Classifier Systems ! Evidence of improving performance. ! Problems decompose naturally when using various sensors. ! Avoid making commitments to arbitrary initial conditions or parameters.

4 Categorization of MCS ! Architecture ! Input/Output Mapping ! Representation ! Types of classifiers

5 Categorization of MCS (cntd) Architecture ! Parallel [Dasarathy, 94]: each input is processed by its own classifier (Input 1 → Classifier 1, ..., Input N → Classifier N), and all outputs feed a fusion stage that produces the final output. ! Serial [Dasarathy, 94]: classifiers are applied in sequence (Input → Classifier 1 → Classifier 2 → ... → Classifier N → Output).

6 Categorization of MCS (cntd) Architectures [Lam, 00] ! Conditional Topology: once a classifier is unable to classify the input, the following classifier is deployed. ! Hierarchical Topology: classifiers are applied in succession, with various levels of generalization. ! Hybrid Topology: the choice of the classifier to use is based on the input pattern (selection). ! Multiple (Parallel) Topology.

7 Categorization of MCS (cntd) Input/Output Mapping ! Linear Mapping: Sum Rule; Weighted Average [Hashem, 97]. ! Non-linear Mapping: Maximum; Product; Hierarchical Mixture of Experts [Jordan and Jacobs, 94]; Stacked Generalization [Wolpert, 92].

8 Categorization of MCS (cntd) Representation ! Similar representations: the classifiers need to be different. ! Different representations: use of different sensors, or different features extracted from the same data set [Ho, 98; Skurichina & Duin, 02].

9 Categorization of MCS (cntd) Types of Classifiers ! Specialized classifiers: encourage specialization in areas of the feature space; all classifiers must contribute to achieve a final decision; e.g., Hierarchical Mixture of Experts [Jordan and Jacobs, 94], Cooperative Modular Neural Networks [Auda and Kamel, 98]. ! Ensemble of classifiers: a set of redundant classifiers. ! Competitive versus cooperative combining [Sharkey, 99].

10 Categorization of MCS (cntd) Data Dependence ! Classifiers are inherently dependent on the data. ! Describes how the final aggregation uses the information present in the input pattern. ! Describes the relationship between the final output Q(x) and the pattern under classification x.

11 Data Dependence ! Data Independent ! Implicitly Dependent ! Explicitly Dependent

12 Data Independence Rely solely on the output of the classifiers to determine the final classification output: Q(x) = argmax_j F_j(C_j(x)) where Q(x) is the final class assigned to pattern x; C_j(x) is the vector of outputs of the various classifiers in the ensemble, {c_1j, c_2j, ..., c_Nj}, for a given class y_j; c_ij is the confidence classifier i has in pattern x belonging to class y_j; and the mapping F_j can be linear or non-linear.

13 Data Independence (cntd) Simple voting techniques are data independent: ! Average ! Maximum ! Majority These are susceptible to incorrect estimates of the confidence.
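A minimal NumPy sketch of these three data-independent rules, assuming C is the matrix of ensemble confidences c_ij for one pattern (function and variable names are illustrative, not from the slides):

```python
import numpy as np

def combine_independent(C, rule="average"):
    """Data-independent combining: Q(x) = argmax_j F(C_j(x)).

    C is an (n_classifiers, n_classes) array with C[i, j] = c_ij, the
    confidence of classifier i that pattern x belongs to class y_j.
    A minimal sketch; names are illustrative, not from the slides.
    """
    if rule == "average":
        support = C.mean(axis=0)                 # average rule
    elif rule == "maximum":
        support = C.max(axis=0)                  # maximum rule
    elif rule == "majority":
        votes = C.argmax(axis=1)                 # each classifier's top class
        support = np.bincount(votes, minlength=C.shape[1])
    else:
        raise ValueError(f"unknown rule: {rule}")
    return int(np.argmax(support))               # final class Q(x)
```

For example, combine_independent(np.array([[0.7, 0.3], [0.4, 0.6], [0.8, 0.2]]), "majority") returns 0, since two of the three classifiers vote for class 0.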

14 Implicit Data Dependence Train the combiner on the global performance of the data: Q(x) = argmax_j F_j(W(C_j(x)), C_j(x)) where W(C_j(x)) is the weighting matrix composed of elements w_ij, and w_ij is the weight assigned to class j in classifier i.

15 Implicit Data Dependence (cntd) Implicitly data dependent approaches include: ! Weighted Average [Hashem, 97] ! Fuzzy Measures [Gader et al., 96] ! Belief Theory [Xu et al., 92] ! Behavior Knowledge Space (BKS) [Huang et al., 95] ! Decision Templates [Kuncheva et al., 01] ! Modular approaches [Auda and Kamel, 98] ! Stacked Generalization [Wolpert, 92] ! Boosting [Schapire, 90] These lack consideration for the local superiority of classifiers.
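As one concrete reading of the implicit case, the sketch below learns a fixed weight matrix W from global validation accuracy and applies a weighted-average rule. The listed methods (BKS, decision templates, boosting, ...) each define W and F differently, so this is only an assumption-laden illustration:

```python
import numpy as np

def fit_global_weights(val_outputs, y_val):
    """Learn w_ij from global performance: here, the accuracy of
    classifier i on validation samples of class j. One simple stand-in
    for W(C(x)); names and the accuracy-based choice are assumptions.

    val_outputs: (n_classifiers, n_samples, n_classes) confidences.
    y_val: (n_samples,) true labels.
    """
    n_clf, _, n_classes = val_outputs.shape
    preds = val_outputs.argmax(axis=2)            # hard decisions
    W = np.zeros((n_clf, n_classes))
    for j in range(n_classes):
        mask = y_val == j
        if mask.any():
            W[:, j] = (preds[:, mask] == j).mean(axis=1)
    return W

def combine_implicit(C, W):
    """Q(x) = argmax_j F_j(W, C_j(x)) with a weighted-average F:
    support_j = sum_i w_ij * c_ij. W is fixed per ensemble, not per
    pattern, which is what makes the dependence implicit."""
    return int(np.argmax((W * C).sum(axis=0)))
```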

16 Explicit Data Dependence Classifier selection or combining is performed based on the sub-space to which the input pattern belongs; the final classification is dependent on the pattern being classified: Q(x) = argmax_j F_j(W(x), C_j(x))

17 Explicit Data Dependence (cntd) Explicitly data dependent approaches include: ! Dynamic Classifier Selection (DCS): DCS with Local Accuracy (DCS_LA) [Woods et al., 97]; DCS based on Multiple Classifier Behavior (DCS_MCB) [Giacinto and Roli, 01] ! Hierarchical Mixture of Experts [Jordan and Jacobs, 94] ! Feature-based approach [Wanas et al., 99] The weights demonstrate dependence on the input pattern, so these methods should intuitively perform better than the others.
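A minimal sketch in the spirit of DCS with local accuracy [Woods et al., 97], not the exact published algorithm: the classifier that is most accurate on the k validation samples nearest to x decides alone, making the weights an explicit function of x. The scikit-learn-style .predict() interface and the parameter k are assumptions:

```python
import numpy as np

def dcs_la(x, classifiers, X_val, y_val, k=10):
    """Pick the classifier with the best accuracy in the local region
    of x (its k nearest validation neighbours) and let it classify x
    alone; the local accuracies play the role of the explicit W(x)."""
    dist = np.linalg.norm(X_val - x, axis=1)       # distances from x
    nn = np.argsort(dist)[:k]                      # k nearest neighbours
    local_acc = np.array([
        (clf.predict(X_val[nn]) == y_val[nn]).mean() for clf in classifiers
    ])
    best = int(local_acc.argmax())                 # locally most accurate
    return classifiers[best].predict(x[None, :])[0]
```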

18 Architectures A methodology to incorporate multiple classifiers in a dynamically adapting system, where the aggregation adapts to the behavior of the ensemble: ! Detectors generate weights for each classifier that reflect the degree of confidence in that classifier for a given input. ! A trained aggregation learns to combine the different decisions.

19 Architectures (cntd) Architecture I: N. Wanas, M. Kamel, G. Auda, and F. Karray, Decision Aggregation in Modular Neural Network Classifiers, Pattern Recognition Letters, 20(11-13), 1999.

20 Architectures (cntd) Classifiers ! Each individual classifier, C_i, produces some output representing its interpretation of the input x. ! Utilizes sub-optimal classifiers. ! The collection of classifier outputs for class y_j is represented as C_j(x). Detector ! Detector D_l is a classifier that uses the input features to extract information useful for aggregation. ! It does not aim to solve the classification problem. ! Detector output d_lg(x) is the probability that input pattern x is categorized into group g. ! The output of all the detectors is represented by D(x).

21 Architectures (cntd) Aggregation ! A fusion layer for all the classifiers. ! Trained to adapt to the behavior of the various modules. ! Explicitly data dependent: Q(x) = argmax_j F_j(D(x), C_j(x)) The weights depend on the input pattern being classified.
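A sketch of Architecture I's decision rule under stated assumptions: the detector maps the input features alone to one weight per classifier, and a weighted sum stands in for the trained fusion layer F:

```python
import numpy as np

def architecture_one(x, classifiers, detector):
    """Sketch of Architecture I: Q(x) = argmax_j F_j(D(x), C_j(x)).

    classifiers: callables mapping x to a confidence vector over classes.
    detector:    callable mapping x (input features only) to one weight
                 per classifier, D(x).
    The weighted sum stands in for the trained aggregation F of the
    slides; all names and the choice of F are assumptions.
    """
    C = np.stack([clf(x) for clf in classifiers])  # (n_classifiers, n_classes)
    d = detector(x)                                # D(x): one weight per classifier
    support = d @ C                                # fuse: sum_i d_i * c_ij
    return int(np.argmax(support))
```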

22 Architectures (cntd) Architecture II

23 Architectures (cntd) Classifiers ! Each individual classifier, C_i, produces some output representing its interpretation of the input x. ! Utilizes sub-optimal classifiers. ! The collection of classifier outputs for class y_j is represented as C_j(x). Detector ! Appends the input to the output of the classifier ensemble. ! Produces a weighting factor, w_ij, for each class in a classifier output. ! The dependence of the weights on both the classifier output and the input pattern is represented by W(x, C_j(x)).

24 Architectures (cntd) Aggregation ! A fusion layer for all the classifiers. ! Trained to adapt to the behavior of the various modules. ! Combines implicit and explicit data dependence: Q(x) = argmax_j F_j(W(x, C_j(x)), C_j(x)) The weights depend on the input pattern and on the performance of the classifiers.
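A matching sketch for Architecture II: the detector now sees the input pattern appended to the ensemble outputs, so the resulting weights w_ij carry both explicit and implicit dependence. The elementwise weighting again stands in for the trained F, and all names are assumptions:

```python
import numpy as np

def architecture_two(x, classifiers, detector):
    """Sketch of Architecture II: Q(x) = argmax_j F_j(W(x, C(x)), C(x)).

    detector: callable mapping the concatenation of x and the flattened
    ensemble outputs to a flat vector of n_classifiers * n_classes
    weights (an assumed interface, not the authors' code).
    """
    C = np.stack([clf(x) for clf in classifiers])  # (n_classifiers, n_classes)
    features = np.concatenate([x, C.ravel()])      # input appended to C(x)
    W = detector(features).reshape(C.shape)        # w_ij = W(x, C(x))
    support = (W * C).sum(axis=0)                  # sum_i w_ij * c_ij
    return int(np.argmax(support))
```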

25 ! Five one-hidden-layer BP classifiers using partially disjoint data sets. ! No optimization is performed on the trained networks. ! The same network parameters are maintained for all the classifiers trained. ! Three data sets: 20 Class Gaussian, Satimages, Clouds.

26 (cntd) Results table: classification error, mean ± std, on the 20 Class, Clouds, and Satimages data sets for the Singlenet and Oracle references; the data-independent approaches Majority, Maximum, Average, and Borda; the implicitly data dependent approaches Weighted Average, Bayesian, and Fuzzy Integral; and the explicitly data dependent Feature-based approach. Numeric entries not recoverable.

27 Training each component independently: ! Optimizing individual components may not lead to overall improvement. ! Collinearity: high correlation between classifiers. ! Components may be under-trained or over-trained.

28 (cntd) Adaptive training ! Selective: reduces correlation between components. ! Focused: re-training focuses on misclassified patterns. ! Efficient: controls the duration of training.

29 Adaptive Training: Main Loop ! Increase diversity among the ensemble. ! Incremental learning. ! Evaluation of training to determine the re-training set.

30 Adaptive Training: ! Save a classifier if it performs well on the evaluation set. ! Determine when to terminate training for each module.

31 Adaptive Training: Evaluation ! Train the aggregation modules. ! Evaluate the training sets for each classifier. ! Compose new training data.

32 Adaptive Training: Data Selection New training data are composed by concatenating: ! Error_i: the misclassified entries of the training data for classifier i. ! Correct_i: a random choice of a fraction P_ratio of the correctly classified entries of the training data for classifier i.
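The sketch below implements this data-selection step and wraps it in the main loop of the preceding slides (incremental training, checkpointing on the evaluation set, and focused re-training). The p_ratio value, the round count, and the scikit-learn-style partial_fit/predict interface are assumptions:

```python
import copy
import numpy as np

rng = np.random.default_rng(0)

def select_training_data(X, y, preds, p_ratio=0.3):
    """New training data for a classifier: Error_i (all misclassified
    entries) concatenated with a random fraction p_ratio of Correct_i.
    p_ratio's value is an assumption; the slides leave it open."""
    wrong = preds != y
    error_idx = np.flatnonzero(wrong)                      # Error_i
    correct_idx = np.flatnonzero(~wrong)                   # Correct_i
    n_keep = int(p_ratio * correct_idx.size)
    keep = rng.choice(correct_idx, size=n_keep, replace=False)
    idx = np.concatenate([error_idx, keep])
    return X[idx], y[idx]

def adaptive_training(classifiers, X, y, X_eval, y_eval, rounds=20):
    """Main-loop sketch: train each module incrementally, checkpoint it
    when it improves on the evaluation set, then re-compose its
    training data from its own mistakes (focused re-training)."""
    best = [(0.0, None)] * len(classifiers)
    data = [(X, y)] * len(classifiers)
    for _ in range(rounds):
        for i, clf in enumerate(classifiers):
            clf.partial_fit(*data[i])                      # incremental learning
            score = (clf.predict(X_eval) == y_eval).mean()
            if score > best[i][0]:                         # save if it performs
                best[i] = (score, copy.deepcopy(clf))      # well on evaluation
            preds = clf.predict(X)                         # evaluate training set
            data[i] = select_training_data(X, y, preds)    # new re-training set
    return [saved if saved is not None else clf
            for (_, saved), clf in zip(best, classifiers)]
```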

33 ! Five one-hidden-layer BP classifiers using partially disjoint data sets. ! No optimization is performed on the trained networks. ! The same network parameters are maintained for all the classifiers trained. ! Three data sets: 20 Class Gaussian, Satimages, Clouds.

34 (cntd) Results table: classification error, mean ± std, of the Singlenet, Best Classifier, and Oracle on the 20 Class, Clouds, and Satimages data sets under normal training, with the architecture trained adaptively, and with the ensemble trained adaptively using the weighted average (WA) as the evaluation function. Numeric entries not recoverable.

35 Conclusions Categorization of the various combining approaches based on data dependence: ! Independent: vulnerable to incorrect confidence estimates. ! Implicitly dependent: does not take into account the local superiority of classifiers. ! Explicitly dependent: the literature focuses on selection rather than combining.

36 (cntd) Feature-based approach ! Combines implicit and explicit data dependence. ! Uses an evolving training algorithm to enhance diversity amongst classifiers. ! Reduces harmful correlation. ! Determines the duration of training. ! Improved classification accuracy.

37 References
[Kittler et al., 98] J. Kittler, M. Hatef, R. Duin, and J. Matas, On Combining Classifiers, IEEE Trans. PAMI, 20(3), 1998.
[Dasarathy, 94] B. Dasarathy, Decision Fusion, IEEE Computer Soc. Press, 1994.
[Lam, 00] L. Lam, Classifier Combinations: Implementations and Theoretical Issues, MCS 2000, LNCS 1857, 77-86, 2000.
[Hashem, 97] S. Hashem, Algorithms for Optimal Linear Combinations of Neural Networks, Int. Conf. on Neural Networks, Vol. 1, 1997.
[Jordan and Jacobs, 94] M. Jordan and R. Jacobs, Hierarchical Mixtures of Experts and the EM Algorithm, Neural Computation, 1994.
[Wolpert, 92] D. Wolpert, Stacked Generalization, Neural Networks, Vol. 5, 1992.
[Auda and Kamel, 98] G. Auda and M. Kamel, Modular Neural Network Classifiers: A Comparative Study, J. Int. Rob. Sys., Vol. 21, 1998.
[Gader et al., 96] P. Gader, M. Mohamed, and J. Keller, Fusion of Handwritten Word Classifiers, Patt. Reco. Lett., 17(6), 1996.
[Xu et al., 92] L. Xu, A. Krzyzak, and C. Suen, Methods of Combining Multiple Classifiers and their Applications to Handwritten Recognition, IEEE Trans. Sys. Man and Cyb., 22(3), 1992.
[Kuncheva et al., 01] L. Kuncheva, J. Bezdek, and R. Duin, Decision Templates for Multiple Classifier Fusion: An Experimental Comparison, Patt. Reco., Vol. 34, 2001.
[Huang et al., 95] Y. Huang, K. Liu, and C. Suen, The Combination of Multiple Classifiers by a Neural Network Approach, J. Patt. Reco. and Art. Int., Vol. 9, 1995.
[Schapire, 90] R. Schapire, The Strength of Weak Learnability, Mach. Lear., Vol. 5, 1990.
[Giacinto and Roli, 01] G. Giacinto and F. Roli, Dynamic Classifier Selection based on Multiple Classifier Behavior, Patt. Reco., Vol. 34, 2001.
[Wanas et al., 99] N. Wanas, M. Kamel, G. Auda, and F. Karray, Decision Aggregation in Modular Neural Network Classifiers, Patt. Reco. Lett., 20(11-13), 1999.

