Additive Preference Model with Piecewise Linear Components Resulting from Dominance-based Rough Set Approximations


Krzysztof Dembczyński¹, Wojciech Kotłowski¹, and Roman Słowiński¹,²

¹ Institute of Computing Science, Poznań University of Technology, 60-965 Poznań, Poland
{kdembczynski, wkotlowski}@cs.put.poznan.pl
² Institute for Systems Research, Polish Academy of Sciences, 01-447 Warsaw, Poland

Abstract. Dominance-based Rough Set Approach (DRSA) has been proposed for multi-criteria classification problems in order to handle inconsistencies in the input information with respect to the dominance principle. The end result of DRSA is a decision rule model of Decision Maker preferences. In this paper, we consider an additive function model resulting from dominance-based rough approximations. The presented approach is similar to the UTA and UTADIS methods. However, we define the goal function of the optimization problem in a similar way as it is done in Support Vector Machines (SVM). The problem may also be defined as one of searching for linear value functions in a transformed feature space obtained by exhaustive binarization of criteria.

1 Introduction

The rough set approach has often proved to be an interesting tool for solving a classification problem that consists in assigning objects from a set A, described by condition attributes, to pre-defined decision classes Cl_t, where t ∈ T and T is a finite set of numerically coded labels. In order to solve the problem (i.e., to classify all objects from A), a decision rule model is induced from a set of reference (training) objects U ⊆ A. The rough set analysis starts with computing lower and upper rough approximations of the decision classes. The lower approximation of a decision class contains objects (from U) certainly belonging to the decision class, without any inconsistency. The upper approximation of a decision class contains objects possibly belonging to the decision class, which may cause inconsistencies. In the simplest case, an inconsistency is a situation where two objects described by the same values of condition attributes (such objects are said to be indiscernible) are assigned to different classes. In the next step, decision rules are induced from the lower and upper rough approximations. These rules represent, respectively, certain and possible patterns explaining relationships between conditions and decisions. The model in the form of decision rules permits classifying all objects from A. Greco, Matarazzo and Słowiński [5, 6, 13] have introduced a rough set approach, called Dominance-based Rough Set Approach (DRSA), for solving the problem of multi-criteria classification.
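Before moving to the dominance-based setting, the classical, indiscernibility-based approximations described above can be illustrated with a minimal Python sketch (not part of the paper; the data and function names are illustrative): it groups objects into indiscernibility classes and collects the lower and upper approximation of one decision class.

```python
from collections import defaultdict

def rough_approximations(objects, labels, target_class):
    """Classical (indiscernibility-based) lower and upper approximations
    of one decision class; objects are tuples of attribute values."""
    # Group objects into indiscernibility classes (identical descriptions).
    groups = defaultdict(list)
    for idx, obj in enumerate(objects):
        groups[obj].append(idx)
    lower, upper = set(), set()
    for members in groups.values():
        member_labels = {labels[i] for i in members}
        if member_labels == {target_class}:   # consistently in the class
            lower.update(members)
        if target_class in member_labels:     # possibly in the class
            upper.update(members)
    return lower, upper

# Illustrative data: the 2nd and 3rd objects are indiscernible but labeled differently.
objects = [(1, 0), (2, 1), (2, 1), (3, 2)]
labels  = ["good", "good", "bad", "bad"]
print(rough_approximations(objects, labels, "good"))   # ({0}, {0, 1, 2})
```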

In this problem, it is additionally assumed that the domains (scales) of the attributes are preference-ordered. The decision classes are also preference-ordered, according to an increasing order of class labels, i.e. for all r, s ∈ T such that r > s, the objects from Cl_r are strictly preferred to the objects from Cl_s. The condition attributes are then often referred to as condition criteria. DRSA extends the classical approach by substituting the indiscernibility relation with a dominance relation, which permits taking the preference order into account. Inconsistency is defined in view of the dominance principle, which requires that any object x, having not worse evaluations than any object y on the considered set of criteria, cannot be assigned to a worse class than y. Moreover, unlike in the classical rough set approach, there is no need in DRSA to discretize numerical attributes.

The preference model is a necessary component of a decision support system for multi-criteria classification. Construction of a preference model requires some preference information from the Decision Maker (DM). Classically, these are substitution rates among criteria, importance weights, comparisons of lotteries, etc. Acquisition of this preference information from the DM is not easy. In this situation, a preference model induced from decision examples provided by the DM has clear advantages over the classical approaches. DRSA, but also UTA [8] and UTADIS [9, 15], follows the paradigm of inductive learning (in Multi-Criteria Decision Analysis referred to as the preference-disaggregation approach). It is very often underlined by Słowiński, Greco and Matarazzo (see, for example, [13]) that a decision rule model has another advantage over other models: it is intelligible and speaks the language of the DM. However, in the case of many numerical criteria and decision classes, the set of decision rules may be huge and may lose its intelligibility. In such situations, an additive model composed of marginal value (utility) functions, as in UTA and UTADIS, may be helpful. The marginal value functions are usually presented graphically to the DM in order to support her/his intuition.

In the following, we present an extension of DRSA in which, after computing rough approximations, additive value functions are constructed instead of a set of decision rules. The additive value function is composed of piecewise linear marginal value functions. It is constructed by solving a mathematical programming problem similar to that formulated in UTA and UTADIS. The main difference is that we define the goal function of the optimization problem in a similar way as it is done in Support Vector Machines (SVM) [14]. However, the obtained additive value functions, for lower and upper rough approximations of decision classes, may not cover accurately all objects belonging to the corresponding rough approximations. This is caused by the limited capacity of an additive model based on piecewise linear functions to represent preferences, as proved in [7, 12]. The problem may also be defined as one of searching for linear value functions in a transformed feature space obtained by exhaustive binarization of criteria.

The paper is organized as follows. In Section 2, DRSA involving piecewise linear marginal value functions is presented. Section 3 contains first experimental results of the methodology. The last section concludes the paper.

Table 1. Decision table: q_1 and q_2 indicate criteria, d the class label. The last two columns present the range of the generalized decision function; objects x_2 and x_3 are inconsistent. (Columns: U, q_1, q_2, d(x), l(x), u(x); rows: x_1, ..., x_4.)

2 Piecewise Linear Marginal Value Functions and Dominance-based Rough Set Approach

Assume we have a set of objects A described by n criteria. We assign to each object x a vector x = (q_1(x), ..., q_n(x)), where the i-th coordinate q_i(x) is the value (evaluation) of object x on criterion q_i, i = 1, ..., n. For simplicity, it is assumed that the domains of criteria are numerically coded with an increasing order of preference. The objective of the multi-criteria classification problem is to build a preference model according to which a class label d(x) from a finite set T is assigned to every object from A. Here, for simplicity, it is assumed that T = {−1, 1}, and that the objects from Cl_1 are strictly preferred to the objects from Cl_{−1}. We assume that the DM provides preference information concerning a set of reference objects U ⊆ A, assigning to each object x ∈ U a label d(x) ∈ T. Reference objects described by criteria and class labels are often presented in a decision table. An example of such a decision table is presented in Table 1.

The criteria aggregation model (preference model) is assumed to be an additive value function:

$$ \Phi(x) = \sum_{i=1}^{n} w_i \, \phi_i(q_i(x)) \qquad (1) $$

where φ_i(q_i(x)), i = 1, ..., n, are non-decreasing marginal value functions, normalized between 0 and 1, and w_i is the weight of φ_i(q_i(x)). A similar aggregation model was used in [8] within the UTA method (for ranking problems) and in UTADIS [9, 15] (for multi-criteria classification problems), where the marginal value functions were assumed to be piecewise linear. The use of this aggregation model for classification requires the existence of a threshold φ_0 such that d(x) = 1 if Φ(x) ≥ φ_0 and d(x) = −1 otherwise (so d(x) = sgn(Φ(x) − φ_0)). The error, which is the sum of the differences |Φ(x) − φ_0| over misclassified objects, is minimized.

Assume, however, that objects can be inconsistent. By inconsistency we mean a violation of the dominance principle, requiring that any object x, having not worse evaluations than any object y on the considered set of criteria, cannot be assigned to a worse class than y. If such inconsistencies occur, the UTA method is not able to find any additive value function compatible with this information, whatever the complexity of the marginal functions (number of breakpoints) is, since no monotonic function can model this information. Within DRSA, such inconsistencies can be handled by using the concepts of lower and upper approximations of classes.
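Before turning to those approximations, the aggregation model (1) and its threshold rule d(x) = sgn(Φ(x) − φ_0) can be illustrated with a short Python sketch; the marginal functions, weights and threshold below are assumed for the example and are not taken from the paper.

```python
def additive_value(x, marginals, weights):
    # Phi(x) = sum_i w_i * phi_i(q_i(x)), Eq. (1)
    return sum(w * phi(q) for phi, w, q in zip(marginals, weights, x))

def classify(x, marginals, weights, threshold):
    # d(x) = sgn(Phi(x) - phi_0): class 1 when the value reaches the threshold
    return 1 if additive_value(x, marginals, weights) >= threshold else -1

# Illustrative non-decreasing marginal value functions scaled to [0, 1]
marginals = [lambda q: min(max(q, 0.0), 1.0),        # phi_1: identity, clipped
             lambda q: min(max(q, 0.0), 1.0) ** 2]   # phi_2: convex, clipped
weights = [0.6, 0.4]
print(classify((0.25, 0.3), marginals, weights, threshold=0.5))   # -> -1
print(classify((0.9, 0.8), marginals, weights, threshold=0.5))    # -> 1
```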

It was shown in [3] that this corresponds to the generalized decision function δ for an object x ∈ U:

$$ \delta(x) = \langle l(x), u(x) \rangle \qquad (2) $$

where

$$ l(x) = \min\{d(y) : y D x,\ y \in U\}, \qquad u(x) = \max\{d(y) : x D y,\ y \in U\} \qquad (3) $$

and D is the dominance relation defined as x D y ⟺ q_i(x) ≥ q_i(y) for all i ∈ {1, ..., n}. In other words, given the preference information, for object x a range of decision classes to which x may belong is determined. This range results from taking into account the inconsistencies caused by x. Remark that, without inconsistencies, l(x) = u(x) for all x ∈ U. Moreover, if we assign to each x ∈ U the class index l(x) (instead of d(x)), the decision table becomes consistent (similarly if we assign the class index u(x) to all x ∈ U). Thus, one can deal with an inconsistent set U by considering two consistent sets with two different labelings. The values of the generalized decision function are also presented in Table 1. In terms of further classification, the response of such a model is a range of classes to which an object may belong.

For the two consistent decision tables it is possible to derive compatible value functions Φ^l(x) and Φ^u(x), respectively, with corresponding marginal value functions φ_i^l(q_i(x)) and φ_i^u(q_i(x)). We assume that both φ_i^l(q_i(x)) and φ_i^u(q_i(x)) have a piecewise linear form:

$$ \phi_i(q_i(x)) = \sum_{r=1}^{k-1} c_i^r + \frac{c_i^k}{h_i^k - h_i^{k-1}} \left( q_i(x) - h_i^{k-1} \right), \quad \text{for } h_i^{k-1} \le q_i(x) \le h_i^k \qquad (4) $$

where h_i^k is the location of the k-th breakpoint on the i-th criterion (k = 1, ..., κ_i, where κ_i is the number of breakpoints on the i-th criterion), and c_i^k is the increment of the marginal value function between breakpoints h_i^{k−1} and h_i^k, i.e. φ_i(h_i^k) − φ_i(h_i^{k−1}). Equation (4) states that the function φ_i evaluated at q_i(x) equals the sum of the increments over all intervals to the left of q_i(x) plus the linearly interpolated increment in the interval where q_i(x) is located. An example is shown in Figure 1.

In the simplest form, the corresponding optimization problem can be formulated for the lower bound of the generalized decision (function Φ^l(x)) as follows:

$$ \min \ \sum_{j=1}^{m} \sigma_j^l \qquad (5) $$

subject to the constraints:

$$ \Phi^l(x_j) \le \phi_0^l + \sigma_j^l \quad \forall x_j : l(x_j) = -1 \qquad (6) $$
$$ \Phi^l(x_j) \ge \phi_0^l - \sigma_j^l \quad \forall x_j : l(x_j) = 1 \qquad (7) $$
$$ \sigma_j^l \ge 0 \quad \forall x_j \qquad (8) $$
$$ \phi_i^l(\bar{z}_i) = 1 \quad i \in \{1, \ldots, n\} \qquad (9) $$
$$ \phi_i^l(\underline{z}_i) = 0 \quad i \in \{1, \ldots, n\} \qquad (10) $$
$$ c_i^k \ge 0 \quad i \in \{1, \ldots, n\},\ k \in \{1, \ldots, \kappa_i\} \qquad (11) $$

where m is the number of reference objects, the σ_j^l are the possible errors, and $\bar{z}_i$ and $\underline{z}_i$ are the highest and the lowest value on the i-th criterion. Constraints (6) and (7) ensure correct separation, (9) and (10) control scaling, and (11) preserves monotonicity.
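The generalized decision bounds (2)-(3) that serve as labels in this program can be computed directly from the dominance relation. The sketch below (Python, hypothetical reference objects; the values are illustrative and not those of Table 1) does exactly that and exposes an inconsistent pair through l(x) < u(x).

```python
def dominates(x, y):
    # x D y  iff  q_i(x) >= q_i(y) for every criterion i
    return all(xi >= yi for xi, yi in zip(x, y))

def generalized_decision(X, d):
    # l(x) = min{ d(y) : y D x },  u(x) = max{ d(y) : x D y },  Eqs. (2)-(3)
    bounds = []
    for x in X:
        l = min(d[j] for j, y in enumerate(X) if dominates(y, x))
        u = max(d[j] for j, y in enumerate(X) if dominates(x, y))
        bounds.append((l, u))
    return bounds

# Illustrative reference objects: the 3rd dominates the 2nd but has a worse label.
X = [(0.25, 0.3), (0.5, 0.7), (0.6, 0.75), (1.0, 0.65)]
d = [-1, 1, -1, 1]
print(generalized_decision(X, d))   # [(-1, -1), (-1, 1), (-1, 1), (1, 1)]
```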

Fig. 1. Piecewise linear marginal value function φ_i defined by Equation (4). The increments are: c_i^1 = 0.1, c_i^2 = 0.5, c_i^3 = 0.3, c_i^4 = 0.1.

An analogous problem can be formulated for the upper bound of the generalized decision (function Φ^u(x)). It is worth noting that the method does not guarantee that all errors become zero as the complexity of φ_i^l(q_i(x)) and φ_i^u(q_i(x)) grows; however, it avoids errors caused by inconsistencies. If all σ_j^l and σ_j^u become 0, the obtained model is concordant with DRSA in the sense that all objects belonging to the lower or upper approximations will be reassigned by the obtained functions to these approximations.

It is worth introducing some measure of the complexity of the marginal functions and minimizing it, in order to avoid building overly complex models. Notice that, as the function φ_i(q_i(x)) is multiplied in (1) by the weight w_i, we can introduce new coefficients w_i^k = c_i^k w_i in order to keep the problem linear. Control of the complexity is now done by minimizing a regularization term:

$$ \|w\|^2 = \sum_{i=1}^{n} \sum_{k=1}^{\kappa_i} (w_i^k)^2 \qquad (12) $$

instead of controlling the scale of the functions in (9) and (10). Minimizing this term alone may lead to a rescaling of the value function, so that all values of φ_i(q_i(x)) decrease almost to zero. To avoid that, the constraints are modified by introducing a unit margin around the threshold, in which no object may appear without error. Thus, we rewrite equations (6) and (7) as:

$$ (\Phi^l(x_j) - \phi_0^l)\, l(x_j) \ge 1 - \sigma_j^l \qquad (13) $$

The objective of the optimization is now:

$$ \min \ \|w^l\|^2 + C \sum_{j=1}^{m} \sigma_j^l \qquad (14) $$

where C is the complexity constant.
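Returning to the marginal functions themselves, a piecewise linear marginal value function such as the one in Figure 1 can be evaluated by accumulating increments, as in Equation (4). The sketch below assumes breakpoints that include both endpoints of the criterion scale (a convention choice) and reuses the increments quoted in the figure caption.

```python
import bisect

def piecewise_linear_marginal(q, breakpoints, increments):
    """Evaluate phi_i at q for breakpoints h^0 < h^1 < ... and increments
    c^1, c^2, ... (Eq. 4): sum of the full increments to the left of q plus
    the linearly interpolated part of the current interval."""
    if q <= breakpoints[0]:
        return 0.0
    if q >= breakpoints[-1]:
        return sum(increments)
    k = bisect.bisect_right(breakpoints, q)          # q lies in (h^{k-1}, h^k]
    left, right = breakpoints[k - 1], breakpoints[k]
    return sum(increments[:k - 1]) + increments[k - 1] * (q - left) / (right - left)

# Increments as in Figure 1; breakpoint locations are assumed for the example
breakpoints = [0.0, 0.25, 0.5, 0.75, 1.0]
increments  = [0.1, 0.5, 0.3, 0.1]
print(piecewise_linear_marginal(0.6, breakpoints, increments))   # 0.72
```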

Analogous reasoning may be carried out for Φ^u(x). Such a problem resembles the maximal margin classifier and Support Vector Machines [14]. We will now bring it even closer to that form. Let us first modify Φ(x) to be Φ(x) = Σ_{i=1}^{n} w_i φ_i(q_i(x)) − φ_0. Now, we decompose each function φ_i(q_i(x)) into κ_i functions φ_{i,k}(q_i(x)) in the following way:

$$ \phi_i(q_i(x)) = \sum_{k=1}^{\kappa_i} c_i^k \, \phi_{i,k}(q_i(x)). \qquad (15) $$

An example of such a decomposition is shown in Figure 2. One can treat the family of functions {φ_{1,1}(q_1(x)), ..., φ_{n,κ_n}(q_n(x))} as a transformation of the space of criteria. Namely, there is a map T : A → R^s, where s = Σ_{i=1}^{n} κ_i, such that T(x) = (φ_{1,1}(q_1(x)), ..., φ_{n,κ_n}(q_n(x))). By substituting w_i^k = w_i c_i^k and denoting w = (w_1^1, ..., w_n^{κ_n}), the function (1) becomes:

$$ \Phi(x) = \langle w, T(x) \rangle - \phi_0 \qquad (16) $$

where ⟨·, ·⟩ is the canonical dot product in R^s.

Fig. 2. Functions φ_{i,1}(q_i(x)), ..., φ_{i,4}(q_i(x)) obtained by decomposing φ_i(q_i(x)). Notice that φ_i(q_i(x)) = 0.1 φ_{i,1}(q_i(x)) + 0.5 φ_{i,2}(q_i(x)) + 0.3 φ_{i,3}(q_i(x)) + 0.1 φ_{i,4}(q_i(x)).
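The transformation T can be sketched as follows: each criterion contributes one ramp-shaped feature per interval between consecutive breakpoints (cf. Figure 2), so that Φ(x) = ⟨w, T(x)⟩ − φ_0 with non-negative weights is a non-decreasing value function. The breakpoint locations below are assumptions made for the example.

```python
def ramp(q, left, right):
    # phi_{i,k}: 0 below its interval, 1 above it, linear inside (cf. Fig. 2)
    if q <= left:
        return 0.0
    if q >= right:
        return 1.0
    return (q - left) / (right - left)

def feature_map(x, all_breakpoints):
    """T(x) = (phi_{1,1}(q_1(x)), ..., phi_{n,kappa_n}(q_n(x))), Eqs. (15)-(16):
    one ramp feature per interval between consecutive breakpoints of a criterion."""
    features = []
    for q, bps in zip(x, all_breakpoints):
        features.extend(ramp(q, bps[k], bps[k + 1]) for k in range(len(bps) - 1))
    return features

# Assumed breakpoints for two criteria
all_breakpoints = [[0.0, 0.25, 0.5, 0.75, 1.0], [0.0, 0.3, 0.65, 0.7]]
print(feature_map((0.6, 0.5), all_breakpoints))
# [1.0, 1.0, 0.4, 0.0, 1.0, 0.571..., 0.0]
```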

Finally, after reformulating the problem, we obtain:

$$ \min \ \|w^l\|^2 + C \sum_{j=1}^{m} \sigma_j^l \qquad (17) $$

subject to the constraints:

$$ (\langle w^l, T(x_j) \rangle - \phi_0^l)\, l(x_j) \ge 1 - \sigma_j^l \quad \forall x_j \in U \qquad (18) $$
$$ \sigma_j^l \ge 0 \quad j \in \{1, \ldots, m\} \qquad (19) $$
$$ w_i^k \ge 0 \quad i \in \{1, \ldots, n\},\ k \in \{1, \ldots, \kappa_i\} \qquad (20) $$

Notice that the regularization term is just the squared Euclidean norm of the weight vector in the new feature space. Motivated by the above result, we introduce a kernel function k : A × A → R, defined as:

$$ k(x, y) = \sum_{i=1}^{n} k_i(x, y) \qquad (21) $$

where

$$ k_i(x, y) = \sum_{j=1}^{\kappa_i} \phi_{i,j}(q_i(x))\, \phi_{i,j}(q_i(y)). \qquad (22) $$

Notice that k(x, y) = ⟨T(x), T(y)⟩. Assume that we have a breakpoint at each evaluation point of all objects x ∈ U, so that the set of breakpoints for the i-th criterion is {q_i(x_1), ..., q_i(x_m)}. Then the computation of the marginal kernel function k_i(x, y) boils down to:

$$ k_i(x, y) = \min\{\mathrm{rank}_i(x), \mathrm{rank}_i(y)\} \qquad (23) $$

where rank_i(x) is the position (in the ranking) of the value q_i(x) on the i-th criterion. Thus, the problem may be formulated in a dual form. The most essential advantage of such an approach is the reduction in the number of variables, irrespective of the number of breakpoints of the marginal functions. As the complexity of the marginal functions increases, the optimization problem remains the same and only the computation of the kernel function becomes harder. However, a problem remains of how to ensure monotonicity of the resulting utility function: in the dual formulation the information about each criterion is lost, thus not all the weights may turn out non-negative.

Let us remark that the transformed criteria space obtained as the image of the mapping T(A) (i.e., (φ_{1,1}(q_1(x)), ..., φ_{n,κ_n}(q_n(x))), x ∈ A) may also be seen as the result of a binarization of criteria. This type of binarization should be called an exhaustive one, by analogy to other approaches well known in rough set theory or in the logical analysis of data (see, for example, [1]). The exhaustive binarization proceeds by choosing cut points at each evaluation point of all objects x ∈ U. More precisely, the binarization of the i-th criterion is accomplished in a straightforward way by associating with each value v on this criterion, for which there exists an object x such that q_i(x) = v, a boolean attribute q_{iv} such that:

$$ q_{iv}(x) = \begin{cases} 1 & \text{if } q_i(x) \ge v \\ 0 & \text{if } q_i(x) < v. \end{cases} \qquad (24) $$
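The boolean attributes (24) connect directly to the kernel (23): with one cut per observed value, the dot product of two binarized rows counts the cuts cleared by both objects, i.e. the minimum of their ranks. The sketch below (illustrative values, our indexing convention, not the paper's code) verifies this equality numerically.

```python
import numpy as np

def binarize(values):
    """Exhaustive binarization of one criterion (Eq. 24): one boolean
    attribute q_{iv} per observed value v, equal to 1 iff q_i(x) >= v."""
    cuts = sorted(set(values))
    return np.array([[1 if v >= cut else 0 for cut in cuts] for v in values])

def rank_kernel(values):
    """Marginal kernel of Eq. (23): k_i(x, y) = min(rank_i(x), rank_i(y)),
    with rank_i(x) = position of q_i(x) among the observed values."""
    cuts = sorted(set(values))
    ranks = np.array([cuts.index(v) + 1 for v in values])
    return np.minimum.outer(ranks, ranks)

# Hypothetical evaluations on one criterion
values = [0.25, 0.5, 0.75, 1.0]
B = binarize(values)
print(B)                                               # 0/1 attributes of Eq. (24)
print(np.array_equal(B @ B.T, rank_kernel(values)))    # True
```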

Table 2. Decision table from Table 1 with binarized criteria (columns: U, q_{1,0.25}, q_{1,0.5}, q_{1,0.75}, ..., q_{2,0.3}, q_{2,0.6}, q_{2,0.65}, q_{2,0.7}, d; rows: x_1, ..., x_4).

Table 2 shows the exhaustive binarization of the criteria from Table 1. Moreover, let us remark that the binarized decision table contains almost the same information as the dominance matrix introduced in [2]. The dominance matrix DM is defined as follows: DM = {dm(x, y) : x, y ∈ U}, where

$$ dm(x, y) = \{ q_i \in Q : q_i(x) \ge q_i(y) \} \qquad (25) $$

and Q = {q_i, i = 1, ..., n}. The dominance matrix DM is usually implemented as a 3-dimensional binary cube C, defined by c_{jki} = 1 if q_i ∈ dm(x_j, x_k), and c_{jki} = 0 otherwise, where j, k = 1, ..., m and i = 1, ..., n. Such a structure is very useful in a procedure for generating the exhaustive set of decision rules [2], because all computations may be quickly carried out as bitwise operations. It is a counterpart of the discernibility matrix [11] well known in the classical rough set approach. It is easy to see that the following holds: c_{jki} = 1 ⟺ q_{i,q_i(x_k)}(x_j) = 1, for all x_j, x_k ∈ U.

3 Experimental results

We performed a computational experiment on the Wisconsin breast cancer (BCW) data obtained from the UCI Machine Learning Repository [10]. This problem was chosen since it is known to have a monotonic relationship between the values on the condition attributes and the decision labels. Thus, all attributes can be interpreted as criteria, enabling DRSA. BCW consists of 699 instances described by 9 integer-valued attributes, of which 16 instances have missing values. Each instance is assigned to one of two classes (malignant and benign).

Several approaches have been compared with the methodology presented in the previous section, which will be referred to as Piecewise Linear DRSA (PL-DRSA). These are k-Nearest Neighbours, linear Support Vector Machines, Logistic Regression, J48 Decision Trees and Naive Bayes. The WEKA [4] software was used for the experiment. For all algorithms, a criteria selection was performed by using backward elimination.
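As a rough illustration of how the two PL-DRSA models could be fitted in such an experiment, the sketch below addresses problem (17)-(20): it replaces the hinge constraints (18)-(19) with a squared-hinge penalty so that a smooth bound-constrained optimizer (SciPy's L-BFGS-B) can enforce the non-negativity (20) directly. This smoothing and the names used (fit_plm and its arguments) are our assumptions, not the paper's procedure.

```python
import numpy as np
from scipy.optimize import minimize

def fit_plm(features, labels, C=1.0):
    """Rough sketch of problem (17)-(20): minimize ||w||^2 + C * sum of slacks
    with non-negative weights. The hinge constraints are folded into a squared
    hinge penalty (our smoothing assumption) so a bound-constrained solver works."""
    X = np.asarray(features, dtype=float)      # rows are T(x_j)
    y = np.asarray(labels, dtype=float)        # l(x_j) or u(x_j), in {-1, +1}
    m, s = X.shape

    def objective(params):
        w, phi0 = params[:s], params[s]
        margins = y * (X @ w - phi0)
        slack = np.maximum(0.0, 1.0 - margins)
        return w @ w + C * np.sum(slack ** 2)

    # Weights constrained to be non-negative (monotonicity); threshold is free.
    bounds = [(0.0, None)] * s + [(None, None)]
    result = minimize(objective, x0=np.zeros(s + 1), bounds=bounds, method="L-BFGS-B")
    return result.x[:s], result.x[s]           # w, phi_0

# Usage with the feature map sketched earlier (hypothetical training data):
# w, phi0 = fit_plm([feature_map(x, all_breakpoints) for x in X_train], l_labels)
# prediction: 1 if np.dot(feature_map(x, all_breakpoints), w) - phi0 >= 0 else -1
```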

Table 3. Experimental results for Wisconsin breast cancer data (columns: Algorithm, Number of criteria, loo estimate; rows: k-NN, linear SVM, J48, Logistic Regression, Naive Bayes, PL-DRSA).

The number of criteria left and the leave-one-out (loo) accuracy estimate are shown in Table 3. Apparently, all the accuracy estimates are similar. PL-DRSA was run with breakpoints set at each evaluation point of all objects on each criterion. Two models were created, one for the lower bound of the decision range and the second for the upper bound. The classification rule was the following: if Φ^l(x) + Φ^u(x) ≥ 0, then assign x to class Cl_1, otherwise assign x to Cl_{−1}. The achieved accuracy was one of the best, competitive with the other methods. However, the marginal value functions constructed in PL-DRSA may be presented graphically and easily interpreted. Moreover, PL-DRSA reveals the inconsistent data both in the learning and the classification stage.

4 Conclusions

Within the DRSA framework, a decision rule model has always been considered for multi-criteria decision analysis. We presented an alternative method, related to an additive aggregation model similar to the one used in the UTA method. The described approach has several advantages. First, it is flexible and allows various shapes of the separating function to be obtained. The marginal value functions may also be presented graphically and interpreted by the DM. Finally, PL-DRSA can control the complexity of the additive function by fixing the number of breakpoints and minimizing the slopes at each breakpoint. DRSA plays an important role in handling the inconsistencies which affect the data; ignoring them may cause errors and, therefore, generate a wrong decision model. The method can also be interpreted in terms of a criteria transformation (in a specific case, also in terms of binarization) and Support Vector Machines.

Acknowledgements. The authors wish to acknowledge financial support from the Ministry of Education and Science (grant no. 3TF 227).

References

1. Boros, E., Hammer, P. L., Ibaraki, T., Kogan, A., Mayoraz, E., Muchnik, I.: An Implementation of Logical Analysis of Data. IEEE Trans. on Knowledge and Data Engineering 12 (2000)

2. Dembczyński, K., Pindur, R., Susmaga, R.: Generation of Exhaustive Set of Rules within Dominance-based Rough Set Approach. Electronic Notes in Theoretical Computer Science 82 (4) (2003)

3. Dembczyński, K., Greco, S., Słowiński, R.: Second-order Rough Approximations in Multi-criteria Classification with Imprecise Evaluations and Assignments. LNAI 3641 (2005)

4. Frank, E., Witten, I. H.: Data Mining: Practical Machine Learning Tools and Techniques, 2nd Edition. Morgan Kaufmann, San Francisco (2005)

5. Greco, S., Matarazzo, B., Słowiński, R.: Rough approximation of a preference relation by dominance relations. European Journal of Operational Research 117 (1999)

6. Greco, S., Matarazzo, B., Słowiński, R.: Rough sets theory for multicriteria decision analysis. European Journal of Operational Research 129 (2001)

7. Greco, S., Matarazzo, B., Słowiński, R.: Axiomatic characterization of a general utility function and its particular cases in terms of conjoint measurement and rough-set decision rules. European Journal of Operational Research 158 (2004)

8. Jacquet-Lagrèze, E., Siskos, Y.: Assessing a set of additive utility functions for multicriteria decision making: The UTA method. European Journal of Operational Research 10 (1982)

9. Jacquet-Lagrèze, E.: An application of the UTA discriminant model for the evaluation of R&D projects. In: Pardalos, P. M., Siskos, Y., Zopounidis, C. (eds.): Advances in Multicriteria Analysis. Kluwer Academic Publishers, Dordrecht (1995)

10. Newman, D. J., Hettich, S., Blake, C. L., Merz, C. J.: UCI Repository of Machine Learning Databases. University of California, Department of Information and Computer Science, Irvine, CA (1998)

11. Skowron, A., Rauszer, C.: The discernibility matrices and functions in information systems. In: Słowiński, R. (ed.): Intelligent Decision Support. Handbook of Applications and Advances of the Rough Set Theory. Kluwer Academic Publishers, Dordrecht (1992)

12. Słowiński, R., Greco, S., Matarazzo, B.: Axiomatization of utility, outranking and decision-rule preference models for multiple-criteria classification problems under partial inconsistency with the dominance principle. Control & Cybernetics 31 (2002)

13. Słowiński, R., Greco, S., Matarazzo, B.: Rough Set Based Decision Support. Chapter 16 in: Burke, E., Kendall, G. (eds.): Introductory Tutorials on Optimization, Search and Decision Support Methodologies. Springer-Verlag, Boston (2005)

14. Vapnik, V.: The Nature of Statistical Learning Theory. Springer-Verlag, New York (1995)

15. Zopounidis, C., Doumpos, M.: PREFDIS: a multicriteria decision support system for sorting decision problems. Computers & Operations Research 27 (2000)
