A Bayes Algorithm for the Multitask Pattern Recognition Problem: Direct Approach


Edward Puchala
Wroclaw University of Technology, Chair of Systems and Computer Networks, Wybrzeze Wyspianskiego 27, 50-370 Wroclaw, Poland
puchala@zssk.pwr.wroc.pl

Abstract. The paper presents algorithms of multitask recognition for the direct approach: first, an algorithm with full probabilistic information, and second, an algorithm with a learning sequence. The algorithm with full probabilistic information works on the basis of Bayes decision theory. Full probabilistic information in a pattern recognition task denotes knowledge of the class probabilities and of the class-conditional probability density functions. The optimal algorithm for a selected loss function will be presented, and some tests of the algorithm with learning were performed.

1 Introduction

The classical pattern recognition problem is concerned with the assignment of a given pattern to one and only one class from a given set of classes. The multitask classification problem refers to a situation in which an object undergoes several classification tasks. Each task denotes recognition from a different point of view and with respect to a different set of classes. Such a situation is typical, for example, of compound medical decision problems, where the first classification answers the question about the kind of disease, the next task is recognition of the stage of the disease, the third one determines the kind of therapy, etc.

Let us consider non-Hodgkin lymphoma, a common dilemma in hematology practice. For this medical problem we can utilise multitask classification; this is suggested by the structure of the decision process, which leads to the following scheme. In the first task of recognition, we arrive at a decision i_1 about the lymphoma type. After the type of lymphoma has been determined, it is essential for diagnosis and therapy to recognize its stage. The values of decision i_2 denote the first, the second, the third and the fourth stage of lymphoma development, respectively. Apart from that, each stage of lymphoma may assume two forms.
Which of these forms occurs is determined by decision i_3: for the first value of i_3, the lymphoma assumes form A (there are no additional symptoms); for the second value of i_3, the lymphoma takes on form B (there are other symptoms as well). Decision i_4 determines the therapy, that is, one of the known schemes of treatment, e.g. CHOP, BCVP, COMBA, MEVA, COP-BLAM-I. A therapy scheme of treatment cannot be used in its original form in every case. Because of the side

P.M.A. Sloot et al. (Eds.): ICCS 2003, LNCS 2659, pp. 3-10, 2003. Springer-Verlag Berlin Heidelberg 2003

effects of cytostatic treatment, it is necessary to modify such a scheme. The decision about this modification is i_5.

In the present paper I have focused my attention on the concept of multitask pattern recognition. In particular, the so-called direct approach to the solution of this problem will be taken into consideration.

2 Direct Approach to the Multitask Pattern Recognition Algorithm

Let us consider the N-task pattern recognition problem. We shall assume that the vector of features x_k ∈ X_k and the class number j_k ∈ M_k for the k-th recognition task of the pattern being recognized are observed values of the random variables x_k and j_k, respectively [5]. When the a priori probabilities of the whole random vector (j_1, j_2, ..., j_N), denoted p(j_1, j_2, ..., j_N), and the class-conditional probability density functions f(x_1, x_2, ..., x_N / j_1, j_2, ..., j_N) of (x_1, x_2, ..., x_N) are known, we can derive the optimal Bayes recognition algorithm minimizing the risk function [3], [4]:

    R = E{ L[(i_1, ..., i_N), (j_1, ..., j_N)] },    (1)

i.e. the expected value of the loss incurred when a pattern from the classes (j_1, ..., j_N) is assigned to the classes (i_1, ..., i_N).

In the case of multitask classification we can define an action of the recognizer which leads to the so-called direct approach [1]. In that instance, classification is a single action: the object is classified to the classes (i_1, ..., i_N) on the basis of the full feature vector x = (x_1, x_2, ..., x_N) simultaneously, as shown in Fig. 1.

Fig. 1. Block scheme of the direct multitask pattern recognition algorithm: the full feature vector (x_1, x_2, ..., x_N) is fed to a single classifier Ψ(x), which produces all decisions (i_1, ..., i_N) at once.
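The risk defined above can also be checked empirically for any candidate rule: average the loss between the algorithm's decisions and the true class tuples over labelled samples drawn from the joint distribution. A minimal sketch (the function names and the toy rule, loss and data in the example are illustrative, not from the paper):

```python
def empirical_risk(rule, loss, samples):
    """Monte Carlo estimate of the risk R = E L[(i_1,...,i_N), (j_1,...,j_N)]:
    the average loss of the rule's decisions over labelled samples.

    rule    : maps a full feature vector x to a class tuple (i_1, ..., i_N)
    loss    : L(i, j) on pairs of class tuples
    samples : iterable of (x, (j_1, ..., j_N)) pairs from the joint distribution
    """
    samples = list(samples)
    return sum(loss(rule(x), j) for x, j in samples) / len(samples)


def counting_loss(i, j):
    """Number of tasks classified incorrectly (pairs with i_k != j_k)."""
    return sum(a != b for a, b in zip(i, j))
```

For instance, a rule that always outputs (0, 1), evaluated on three samples with true labels (0, 1), (1, 1) and (0, 0), incurs an average counting loss of 2/3.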

Let Ψ(x) denote the direct pattern recognition algorithm:

    Ψ(x) = Ψ(x_1, x_2, ..., x_N) = (i_1, ..., i_N),   x_k ∈ X_k, i_k ∈ M_k.    (2)

Minimization of the risk function R,

    R[Ψ(x)] = E{ L[(i_1, ..., i_N), (j_1, ..., j_N)] },    (3)

where L denotes the loss function, leads to the optimal algorithm Ψ*:

    R(Ψ*) = min_Ψ R(Ψ).    (4)

The average risk (3) is expressed by the formula

    R(Ψ) = ∫_X Σ_{j_1 ∈ M_1} Σ_{j_2 ∈ M_2} ... Σ_{j_N ∈ M_N} L[(i_1, ..., i_N), (j_1, ..., j_N)] p(j_1, ..., j_N / x) f(x) dx,    (5)

where

    p(j_1, ..., j_N / x) = p(j_1, ..., j_N) f(x / j_1, ..., j_N) / f(x)    (6)

denotes the a posteriori probability for the set of classes (j_1, ..., j_N). As we can easily show, the formula

    r(i_1, ..., i_N, x) = E[ L[(i_1, ..., i_N), (j_1, ..., j_N)] / x ]
                        = Σ_{j_1 ∈ M_1} ... Σ_{j_N ∈ M_N} L[(i_1, ..., i_N), (j_1, ..., j_N)] p(j_1, ..., j_N / x)    (7)

presents the average conditional risk. Hence the Bayes algorithm for multitask pattern recognition in the direct approach may be derived; as we can see, it is the result of

solving optimization problem (4). Thus, we obtain the optimal algorithm:

    Ψ*(x_1, ..., x_N) = (i_1, ..., i_N)  if
    r(i_1, ..., i_N, x) = min_{(k_1, ..., k_N) ∈ M_1 × ... × M_N} r(k_1, ..., k_N, x),    (8)

that is,

    Ψ*(x) = (i_1, ..., i_N)  if
    Σ_{j_1 ∈ M_1} ... Σ_{j_N ∈ M_N} L[(i_1, ..., i_N), (j_1, ..., j_N)] p(j_1, ..., j_N) f(x / j_1, ..., j_N)
      = min_{(k_1, ..., k_N)} Σ_{j_1 ∈ M_1} ... Σ_{j_N ∈ M_N} L[(k_1, ..., k_N), (j_1, ..., j_N)] p(j_1, ..., j_N) f(x / j_1, ..., j_N).    (9)

Let us consider a characteristic form of the loss function L whose value depends on the number of misclassified decisions:

    L[(i_1, ..., i_N), (j_1, ..., j_N)] = n,    (10)

where n denotes the number of pairs of algorithm decision i_k and real class j_k for which i_k ≠ j_k. In this case the average conditional risk has the following form:

    r(i_1, ..., i_N, x) = N − [ p(i_1 / x) + p(i_2 / x) + ... + p(i_N / x) ].    (11)

Because the number of tasks N is constant for each practical problem and we are looking for the minimum of the average conditional risk, the optimal multitask pattern recognition algorithm for the so-called direct approach can be written as follows:

    Ψ*(x) = (i_1, ..., i_N)  if  p(i_k / x) = max_{j_k ∈ M_k} p(j_k / x),   k = 1, ..., N,    (12)

where p(i_k / x) denotes the marginal a posteriori probability of class i_k in the k-th task.
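When the joint a posteriori probabilities p(j_1, ..., j_N / x) are available as a table, rule (12) amounts to marginalizing out the other tasks and taking the argmax per task. A minimal NumPy sketch (the function name is illustrative, not from the paper):

```python
import numpy as np

def direct_multitask_map(joint_posterior):
    """Per-task MAP decision, as in rule (12), from a table of the joint
    posterior p(j_1,...,j_N / x): axis k of the array indexes the classes
    of the k-th task."""
    p = np.asarray(joint_posterior, dtype=float)
    p = p / p.sum()  # normalize, so an unnormalized table may also be passed
    decisions = []
    for k in range(p.ndim):
        # marginal posterior of task k: sum the joint over all other tasks
        other_axes = tuple(a for a in range(p.ndim) if a != k)
        marginal = p.sum(axis=other_axes)
        decisions.append(int(np.argmax(marginal)))
    return tuple(decisions)
```

Note that the per-task decisions are coupled only through the joint table, and they need not coincide with the jointly most probable tuple: for the table [[0.05, 0.25], [0.4, 0.3]] the rule returns (1, 1), although the single most probable tuple is (1, 0).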

The average risk function for the loss function (10) is the sum of the probabilities of incorrect classification in the individual tasks:

    R[Ψ] = Σ_{n=1}^{N} P_e(n) = Σ_{n=1}^{N} [1 − P_c(n)]
         = N − Σ_{n=1}^{N} Σ_{j_1 ∈ M_1} ... Σ_{j_N ∈ M_N} q(j_n / j_1, ..., j_N) p(j_1, ..., j_N),    (13)

where q(j_n / j_1, ..., j_N) is the probability of correct classification in the n-th task of an object from the classes (j_1, ..., j_N):

    q(j_n / j_1, ..., j_N) = ∫_{D_{j_n}} f(x / j_1, ..., j_N) dx,    (14)

and D_{j_n} denotes the decision area of the algorithm Ψ(x) corresponding to class j_n.

3 Multitask Recognition with Learning

In the real world there is often a lack of exact knowledge of the a priori probabilities and the class-conditional probability density functions. For instance, there are situations in which only a learning sequence

    S_L = ((x_1, j_1), (x_2, j_2), ..., (x_m, j_m)),    (15)

where

    x_k = (x_k^(1), ..., x_k^(N)) ∈ X,   j_k = (j_k^(1), ..., j_k^(N)) ∈ M,   k = 1, ..., m,    (16)

is known as a set of correctly classified samples. In this case we can use the algorithms known from conventional pattern recognition, but now the algorithm must be formulated in the version corresponding to the above concept. As an example, let us consider the α-nearest-neighbour (α-NN) multitask recognition algorithm for the direct approach.

Let us denote by

    p_m(j) = I_j / m    (17)

the estimator of the a priori class probability, where I_j is the number of objects from class j in the learning sequence and m is the number of objects in the learning sequence, and by

    f_m(x / j) = α_j(x) / (I_j V(x))    (18)

the estimator of the density function for the α-NN algorithm, where α_j(x) is the number of objects from class j among the α nearest neighbours of x, i.e. among the objects contained in the area around x of volume V(x). On the basis of (12) and (17), (18), the final form of the multitask pattern recognition algorithm with learning is:

    Ψ_m(x_1, x_2, ..., x_N) = (i_1, ..., i_N)  if
    α_{i_k}(x) / V(x) = max_{j_k ∈ M_k} α_{j_k}(x) / V(x),   k = 1, ..., N.    (19)

Fig. 2 shows the probability of correct classification for the α-nearest-neighbour multitask recognition algorithm in the direct approach as a function of the length of the learning sequence. The probability of correct classification rises as the number of elements in the learning sequence rises; for m = 350 or more, the probability takes values between 0.8 and 0.9. These results were obtained for a computer-generated (simulated) set of correctly classified samples.
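Because p_m(i_k) f_m(x / i_k) is proportional to α_{i_k}(x), rule (19) reduces to a majority vote per task among the α nearest neighbours of x. A minimal sketch with Euclidean distances (function and variable names are illustrative, not from the paper):

```python
from collections import Counter

import numpy as np

def alpha_nn_multitask(X_train, Y_train, x, alpha=5):
    """α-nearest-neighbour multitask classifier in the spirit of rule (19):
    for each task, return the majority class among the α nearest neighbours.

    X_train : (m, d) array, feature vectors of the learning sequence
    Y_train : (m, N) array, class labels of each object for the N tasks
    x       : (d,) query feature vector
    """
    X_train = np.asarray(X_train, dtype=float)
    Y_train = np.asarray(Y_train)
    # Euclidean distance from x to every object of the learning sequence
    dist = np.linalg.norm(X_train - np.asarray(x, dtype=float), axis=1)
    nearest = np.argsort(dist)[:alpha]          # indices of the α neighbours
    decisions = []
    for k in range(Y_train.shape[1]):           # one majority vote per task
        votes = Counter(Y_train[nearest, k].tolist())
        decisions.append(votes.most_common(1)[0][0])
    return tuple(decisions)
```

Choosing α odd avoids ties in two-class tasks; all N decisions are read from the same neighbourhood of the full feature vector, which is what distinguishes the direct approach from running N independent classifiers.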

Fig. 2. Probability of correct classification as a function of the learning sequence's length (m = 0 to 500), for various numbers of neighbours α (α = 3, 5, 7) of the α-NN algorithm.

The superiority of the multitask α-NN algorithm in the direct version over the classical pattern recognition one demonstrates the effectiveness of this concept in those multitask classification problems for which decomposition is necessary from the functional or computational point of view, e.g. in medical diagnosis. The direct approach to multitask recognition gives better results than the decomposed approach, because the direct algorithms take into consideration the correlations between the individual classification problems.

Acknowledgement. The work presented in this paper is a part of the project The Artificial Intelligence Methods for Decision Support Systems: Analysis and Practical Applications, realized at the Higher State School of Professional Education in Legnica.

References

1. Kurzynski, M., Puchala, E.: Algorithms of the multiperspective recognition. Proc. of the 11th Int. Conf. on Pattern Recognition, The Hague, 1992
2. Puchala, E., Kurzynski, M.: A branch-and-bound algorithm for optimization of the multiperspective classifier. Proceedings of the 12th IAPR, Jerusalem, Israel, 1994, 35-39

3. Parzen, E.: On estimation of a probability density function and mode. Ann. Math. Statist., 1962, Vol. 33, 1065-1076
4. Duda, R., Hart, P.: Pattern Classification and Scene Analysis. John Wiley & Sons, New York, 1973
5. Fukunaga, K.: Introduction to Statistical Pattern Recognition. Academic Press, New York, 1972