Linear Programming-based Data Mining Techniques And Credit Card Business Intelligence
1 Linear Programming-based Data Mining Techniques And Credit Card Business Intelligence. Yong Shi, the Charles W. and Margre H. Durham Distinguished Professor of Information Technology, University of Nebraska, USA
2 Contents: Introduction (Data Mining, Process, Methodology, Mathematical Tools); Linear System Approaches (Linear Programming Methods, Multiple Criteria Linear Programming Methods); Credit Card Intelligence (Real-life Credit Card Portfolio Management, Many Others)
3 Data Mining: Introduction. A powerful information technology (IT) tool in today's competitive business world; an area at the intersection of human intervention, machine learning, mathematical modeling, and databases.
4 Introduction. Process: Selecting, Transforming, Mining, Interpreting.
5 Introduction. Methodology: Association, Clustering, Classification, Prediction, Sequential Patterns, Similar Time Sequences.
6 Introduction. Mathematical Tools: statistics, decision trees, neural networks, fuzzy logic, linear programming.
7 Introduction: Classification. Use a training data set with predetermined classes; develop a separation model with rules on the training set; apply the model to classify unknown objects; discover knowledge. [Diagram: training set → model → unknown objects → knowledge]
8 Linear System Approaches: Linear Programming. Linear programming has been used for classification in data mining. Given two attributes {a_1, a_2} and two groups {G_1, G_2}, with observations A_i = (A_i1, A_i2), we want to find a scalar b and a nonzero vector X = (x_1, x_2) such that the constraints A_i X ≥ b for A_i ∈ G_1 and A_i X ≤ b for A_i ∈ G_2 have the fewest violations.
9 Linear System Approaches. Let
α_i = the overlapping of the two-group (class) boundary for case A_i (external measurement);
α = the maximum overlapping of the two-group (class) boundary over all cases A_i (α_i < α);
β_i = the distance of case A_i from its adjusted boundary (internal measurement);
β = the minimum distance over all cases A_i to the adjusted boundary (β_i > β);
h_i = the penalty for α_i (cost of misclassification);
k_i = the penalty for β_i (cost of misclassification).
10 Linear System Approaches. [Table: cases 1, …, k in G_1 and k+1, …, n in G_2; columns a_1 and a_2 hold the attribute values A_11, A_12, …, A_k1, A_k2, A_k+1,1, A_k+1,2, …, A_n1, A_n2; a final column holds the linearly transformed scores A_1 X*, …, A_n X* against the boundary A_i X = b.]
11 Linear System Approaches. Example: consider a credit-rating problem with two variables and two cases, where a_1 = salary and a_2 = age. Let the boundary b = 10. [Table: for cases A_1 and A_2, the a_1 and a_2 values, the best coefficients x_1 and x_2 found, and the resulting LP scores against boundary b = 10.]
12 Linear System Approaches. Find the best (x_1*, x_2*) to compare A_i X* = a_1 x_1* + a_2 x_2* with b = 10. We see that A_1 X* = 7.2 is Bad (< 10) and A_2 X* = 10.4 is Good (> 10). [Diagram: Bad cases at distances β_i below and Good cases at distances β_i above the boundary A_i X* = 10; perfect separation (α = 0).]
13 Linear System Approaches. Example: overlapping. [Diagram: boundaries A_i X* = b − α and A_i X* = b + α around A_i X* = b; Bad and Good cases with distances β_i and overlaps α_i falling inside the band of width 2α.]
14 Linear System Approaches. Simple Models (Freed and Glover 1981): Minimize Σ_i h_i α_i subject to A_i X ≤ b + α_i for A_i ∈ Bad and A_i X ≥ b − α_i for A_i ∈ Good, where the A_i are given, X and b are unrestricted, and α_i ≥ 0.
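As a sketch, this first simple model can be set up directly with scipy.optimize.linprog; the toy data, the equal penalties h_i = 1, and the fixed boundary b = 10 are all illustrative assumptions (b is fixed in advance because, with b free, the all-zero solution is trivially optimal — one of the LP issues noted later in this deck).

```python
import numpy as np
from scipy.optimize import linprog

# Toy two-attribute data (hypothetical values, not from the slides' sample).
bad = np.array([[1.0, 1.0], [2.0, 1.0]])    # group "Bad"
good = np.array([[5.0, 5.0], [6.0, 4.0]])   # group "Good"
b = 10.0        # boundary fixed in advance, as in the earlier example
h = 1.0         # equal misclassification penalty h_i for every case

n_bad = len(bad)
n = n_bad + len(good)

# Decision variables: x1, x2 (unrestricted), then alpha_i >= 0 per case.
# Objective: minimize sum_i h_i * alpha_i.
c = np.concatenate([np.zeros(2), h * np.ones(n)])

# Constraints:
#   Bad:  A_i X <= b + alpha_i   ->   A_i X - alpha_i <= b
#   Good: A_i X >= b - alpha_i   ->  -A_i X - alpha_i <= -b
A_ub = np.zeros((n, 2 + n))
b_ub = np.zeros(n)
for i, a in enumerate(bad):
    A_ub[i, :2] = a
    A_ub[i, 2 + i] = -1.0
    b_ub[i] = b
for j, a in enumerate(good):
    i = n_bad + j
    A_ub[i, :2] = -a
    A_ub[i, 2 + i] = -1.0
    b_ub[i] = -b

bounds = [(None, None), (None, None)] + [(0, None)] * n
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
scores = np.vstack([bad, good]) @ res.x[:2]
print(res.fun, scores)
```

With separable toy data the optimal objective is 0: a vector X exists under which every Bad case scores at or below b and every Good case at or above it.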
15 Linear System Approaches. Simple Models (Freed and Glover 1981), alternatively: Maximize Σ_i β_i subject to A_i X ≤ b − β_i for A_i ∈ Bad and A_i X ≥ b + β_i for A_i ∈ Good, where the A_i are given, X and b are unrestricted, and β_i ≥ 0.
16 Linear System Approaches. Hybrid Model (Glover 1990): Minimize hα + Σ_i h_i α_i − kβ − Σ_i k_i β_i subject to A_i X = b + α + α_i − β − β_i for A_i ∈ Bad and A_i X = b − α − α_i + β + β_i for A_i ∈ Good, where the A_i are given, X and b are unrestricted, and α, α_i, β, β_i ≥ 0.
17 Linear System Approaches. Mixed Integer Model (Koehler and Erenguc 1990): Minimize Σ_i I_1i + Σ_i I_2i subject to A_i X ≤ b + M·I_1i for A_i ∈ Bad and A_i X ≥ b − M·I_2i for A_i ∈ Good, where the A_i are given, X ≠ 0 and b are unrestricted, I_1i = 1 if A_i X > b for A_i ∈ Bad (0 otherwise), and I_2i = 1 if A_i X < b for A_i ∈ Good (0 otherwise).
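For intuition about what the integer variables count, a brute-force sketch (made-up data, a hypothetical grid of coefficient vectors) evaluates the mixed-integer objective — the number of misclassified cases — directly, instead of solving the MIP:

```python
import itertools

# Hypothetical two-attribute cases; Bads should score below b, Goods above.
bad = [(1.0, 1.0), (2.0, 1.0)]
good = [(5.0, 5.0), (6.0, 4.0)]
b = 10.0

def misclassified(x):
    """MIP objective: I_1i counts Bads scoring above b, I_2i Goods below b."""
    errs = sum(1 for a in bad if a[0] * x[0] + a[1] * x[1] > b)
    errs += sum(1 for a in good if a[0] * x[0] + a[1] * x[1] < b)
    return errs

# Search a coarse grid of coefficient vectors instead of solving the MIP.
grid = [i / 4.0 for i in range(-8, 9)]
best_x = min(itertools.product(grid, grid), key=misclassified)
print(best_x, misclassified(best_x))
```

On this separable toy set the grid contains a perfect classifier, e.g. x = (1, 1), so the best count is 0; a real solver replaces the grid search with branch-and-bound over the I variables.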
18 Linear System Approaches. Three-group Model (Freed and Glover 1981): Minimize h_1 α_1 + h_2 α_2 subject to b_L1 ≤ A_i X ≤ b_U1 for A_i ∈ G_1; b_L2 ≤ A_i X ≤ b_U2 for A_i ∈ G_2; b_L3 ≤ A_i X ≤ b_U3 for A_i ∈ G_3; b_U1 + ε ≤ b_L2 + α_1; b_U2 + ε ≤ b_L3 + α_2; where the A_i and ε are given, X, the b_Lj (lower bounds) and the b_Uj (upper bounds) are unrestricted, and the α_j (group overlapping) ≥ 0.
19 Linear System Approaches. [Diagram: Three-group LP Model in the (a_1, a_2) plane, with projection direction X and intervals [b_L1, b_U1], [b_L2, b_U2], [b_L3, b_U3] for G_1, G_2, G_3.]
20 Linear System Approaches. Multi-group Model (Freed and Glover 1981): Minimize Σ_j h_j α_j subject to b_Lj ≤ A_i X ≤ b_Uj for A_i ∈ G_j, j = 1, …, m; b_Uj + ε ≤ b_Lj+1 + α_j, j = 1, …, m−1; where the A_i and ε are given, X, the b_Lj (lower bounds) and the b_Uj (upper bounds) are unrestricted, and the α_j (group overlapping) ≥ 0.
21 Linear System Approaches. Multi-group Model (Freed and Glover 1981), alternatively: Minimize Σ_j h_j α_j subject to b_Lj − α_j ≤ A_i X ≤ b_Uj + α_j for A_i ∈ G_j, j = 1, …, m; b_Uj ≤ b_Lj+1, j = 1, …, m−1; where the A_i are given, X, the b_Lj (lower bounds) and the b_Uj (upper bounds) are unrestricted, and the α_j (group overlapping) ≥ 0.
22 Linear System Approaches. Problems and Challenges of LP Approaches: Different normalizations give different solutions for X and b (the boundary). Choosing a proper value of b may lead to a nice separation. Integer variables may improve the misclassification rate of the LP model. The penalty costs of misclassification change the classifier results. The simple model has been verified as a useful alternative to the logistic discriminant (classification) model; both are better than the linear and quadratic discriminant functions. There is as yet no comparison between decision tree induction and LP approaches.
23 Linear System Approaches: Multi-Criteria Linear Programming. Multi-criteria linear programming simultaneously minimizes the total overlapping and maximizes the total distance from the boundary of the two groups: Minimize Σ_i α_i and Maximize Σ_i β_i subject to A_i X = b + α_i − β_i for A_i ∈ B and A_i X = b − α_i + β_i for A_i ∈ G, where the A_i are given, X and b are unrestricted, and α_i, β_i ≥ 0.
24 Linear System Approaches: Multi-Criteria Linear Programming. To utilize the computational power of commercial software for LP and non-LP problems, we can find the compromise solution of the separation problem (Yu 1973, Yu 1985, and Shi and Yu 1989): Let α* = the ideal value of −Σ_i α_i and β* = the ideal value of Σ_i β_i. Then define the regret function through the deviations:
d_α+ = −(α* + Σ_i α_i) if −Σ_i α_i > α*, and 0 otherwise;
d_α− = α* + Σ_i α_i if −Σ_i α_i < α*, and 0 otherwise;
d_β+ = Σ_i β_i − β* if Σ_i β_i > β*, and 0 otherwise;
d_β− = β* − Σ_i β_i if Σ_i β_i < β*, and 0 otherwise.
25 [Diagram: compromise solution in the (−Σ_i α_i, Σ_i β_i) plane, minimizing (d_α+ + d_α−)^p + (d_β+ + d_β−)^p relative to the ideal point (α*, β*).]
26 Linear System Approaches: Multi-Criteria Linear Programming. Thus the multi-criteria separation problem becomes (Shi and Peng 2001): Minimize (d_α+ + d_α−)^p + (d_β+ + d_β−)^p subject to α* + Σ_i α_i = d_α− − d_α+; β* − Σ_i β_i = d_β− − d_β+; A_i X = b + α_i − β_i for A_i ∈ B; A_i X = b − α_i + β_i for A_i ∈ G; where the A_i, α*, and β* are given, X and b are unrestricted, and α_i, β_i, d_α−, d_α+, d_β−, d_β+ ≥ 0.
27 R(d; ∞) ≤ R(d; 2) ≤ R(d; 1), where for p ≥ 1, R(d; p) = [Σ_k (d_k+ + d_k−)^p]^(1/p), and for p = ∞, R(d; ∞) = min_X max {d_k+ + d_k− : k = 1, …, q}.
28 Linear System Approaches: Multi-Criteria Linear Programming. Multi-criteria separation model for three groups (Shi, Peng, Xu and Tang 2001): Given groups (G_1, G_2, G_3), let b_1 = the boundary between G_1 and G_2; b_2 = the boundary between G_2 and G_3; α_i1 = the overlapping of G_1 and G_2 for case A_i; β_i1 = the distance of case A_i from its adjusted boundary between G_1 and G_2; α_i2 = the overlapping of G_2 and G_3 for case A_i; β_i2 = the distance of case A_i from its adjusted boundary between G_2 and G_3.
29 Linear System Approaches: Multi-Criteria Linear Programming. Minimize Σ_i (α_i1 + α_i2) and Maximize Σ_i (β_i1 + β_i2) subject to A_i X = b_1 − α_i1 + β_i1 for A_i ∈ G_1; A_i X = 0.5(b_1 + α_i1 − β_i1 + b_2 − α_i2 + β_i2) for A_i ∈ G_2; A_i X = b_2 + α_i2 − β_i2 for A_i ∈ G_3; b_1 + α_i1 < b_2 − α_i2; where the A_i are given, X, b_1 and b_2 are unrestricted, and α_i1, β_i1, α_i2, β_i2 ≥ 0.
30 Linear System Approaches: Multi-Criteria Linear Programming. [Diagram: Three-group MC Model, with boundaries A_i X = b_1 and A_i X = b_2, overlap bands from b_1 − α_1 to b_1 + α_1 and from b_2 − α_2 to b_2 + α_2, and distances β_i separating G_1, G_2, G_3.]
31 Linear System Approaches: Multi-Criteria Linear Programming. The compromise model for three groups can be: Minimize (d_α1+ + d_α1−)^p + (d_α2+ + d_α2−)^p + (d_β1+ + d_β1−)^p + (d_β2+ + d_β2−)^p subject to α*_1 + Σ_i α_i1 = d_α1− − d_α1+; β*_1 − Σ_i β_i1 = d_β1− − d_β1+; α*_2 + Σ_i α_i2 = d_α2− − d_α2+; β*_2 − Σ_i β_i2 = d_β2− − d_β2+; A_i X = b_1 − α_i1 + β_i1 for A_i ∈ G_1; A_i X = 0.5(b_1 + α_i1 − β_i1 + b_2 − α_i2 + β_i2) for A_i ∈ G_2; A_i X = b_2 + α_i2 − β_i2 for A_i ∈ G_3; b_1 + α_i1 < b_2 − α_i2; where the A_i, α*_1, α*_2, β*_1 and β*_2 are given, X, b_1 and b_2 are unrestricted, and d_αj+, d_αj−, d_βj+, d_βj−, α_i1, β_i1, α_i2, β_i2 ≥ 0.
32 Linear System Approaches: Multi-Criteria Linear Programming. Similarly, the multi-criteria classification model for four groups (Kou, Peng, Shi, Wise and Xu 2002): Given groups (G_1, G_2, G_3, G_4), we have: Minimize Σ_i (α_i1 + α_i2 + α_i3) and Maximize Σ_i (β_i1 + β_i2 + β_i3) subject to A_i X = b_1 − α_i1 + β_i1 for A_i ∈ G_1; A_i X = 0.5(b_1 + α_i1 − β_i1 + b_2 − α_i2 + β_i2) for A_i ∈ G_2; A_i X = 0.5(b_2 + α_i2 − β_i2 + b_3 − α_i3 + β_i3) for A_i ∈ G_3; A_i X = b_3 + α_i3 − β_i3 for A_i ∈ G_4; b_1 + α_i1 < b_2 − α_i2; b_2 + α_i2 < b_3 − α_i3; where the A_i are given, X, b_1, b_2 and b_3 are unrestricted, and α_i1, β_i1, α_i2, β_i2, α_i3, β_i3 ≥ 0.
33 Linear System Approaches: Multi-Criteria Linear Programming. Generally, given groups (G_1, G_2, …, G_s), the multi-criteria classification model for s groups is: Minimize Σ_i Σ_j α_ij and Maximize Σ_i Σ_j β_ij subject to A_i X = b_1 − α_i1 + β_i1 for A_i ∈ G_1; A_i X = 0.5(b_k−1 + α_i,k−1 − β_i,k−1 + b_k − α_ik + β_ik) for A_i ∈ G_k, k = 2, …, s−1; A_i X = b_s−1 + α_i,s−1 − β_i,s−1 for A_i ∈ G_s; b_k−1 + α_i,k−1 < b_k − α_ik, k = 2, …, s−1; where the A_i are given, X and the b_j are unrestricted, and α_ij, β_ij ≥ 0, j = 1, …, s−1.
34 Linear System Approaches: Multi-Criteria Linear Programming Algorithm. Step 1: Use ReadCHD to convert both Training and Verifying data into data matrices. Step 2: Use GroupDef to divide the observations within the Training data set into s groups: G1, G2, …, Gs. Step 3: Use sgmodel to perform the separation task on the Training data; here, PROC LP is called to solve the MCLP model for the best s-group classifier given the values of the control parameters. Step 4: Use Score to produce graphical representations of the training results; Steps 3-4 repeat until the best training result is found. Step 5: Use Predict to mine the s groups from the Verifying data set.
35 Credit Card Portfolio Management: Introduction. Data mining for credit card portfolio management decisions classifies the different cardholder behaviors in terms of their payments to the credit card companies, such as banks and mortgage loan firms. In reality, the common categories of credit card variables are balance, purchase, payment and cash advance. Some credit card companies may consider residence state and job security as special variables. In the case of FDC (First Data Corporation), there are 38 original variables drawn from the common variables over the past seven months. A set of derived variables is then internally generated from the 38 variables to perform precise data mining.
36 Individual Bankruptcy Filing ( )
37 Real-life Applications: Credit Card Portfolio Management. The objective of this research is to search for an alternative modeling approach (preferably a linear approach) that could outperform the current approaches (Shi, Wise, Luo and Lin 2001): (1) Behavior Score by FICO, (2) Credit Bureau Score by FICO, (3) FDC Proprietary Bankruptcy Score, (4) SE Decision Tree.
38 Research Methodology. Using the 65 variables (Char 1-65) in FDR and a small development sample, we want to determine the coefficients for an appropriate subset of the 65 derived variables, X = (x_1, ..., x_r), and a boundary value b to separate two groups, G (Goods) and B (Bads); that is, A_i X ≥ b for A_i ∈ G and A_i X ≤ b for A_i ∈ B, where the A_i are the vector values of the variables.
39 Two-group MC Model for the SAS Algorithm. Minimize d_α− + d_α+ + d_β− + d_β+ subject to α* + Σ_i α_i = d_α− − d_α+; β* − Σ_i β_i = d_β− − d_β+; A_i X = b − α_i + β_i for A_i ∈ G; A_i X = b + α_i − β_i for A_i ∈ B; where the A_i, α*, and β* are given, X and b are unrestricted, and α_i, β_i, d_α−, d_α+, d_β−, d_β+ ≥ 0.
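A minimal sketch of this two-group compromise model (the p = 1 case) with scipy.optimize.linprog, keeping the convention that Bads sit below and Goods above the boundary b; the toy data, the fixed boundary b = 10, and the ideal values α* = 0 and β* = 15 are assumptions chosen so the regret can reach zero on this set.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical two-attribute credit data.
bad = np.array([[1.0, 1.0], [2.0, 1.0]])    # group B
good = np.array([[5.0, 5.0], [6.0, 4.0]])   # group G
b = 10.0                            # boundary fixed in advance
alpha_star, beta_star = 0.0, 15.0   # chosen ideal values for this toy set

cases = np.vstack([bad, good])
n = len(cases)
# Variables: [x1, x2, alpha_1..n, beta_1..n, d_a-, d_a+, d_b-, d_b+]
nv = 2 + 2 * n + 4
c = np.zeros(nv)
c[-4:] = 1.0                        # minimize d_a- + d_a+ + d_b- + d_b+

A_eq = np.zeros((n + 2, nv))
b_eq = np.zeros(n + 2)
for i, a in enumerate(cases):
    is_bad = i < len(bad)
    A_eq[i, :2] = a
    # B: A_i X = b + alpha_i - beta_i  ->  A_i X - alpha_i + beta_i = b
    # G: A_i X = b - alpha_i + beta_i  ->  A_i X + alpha_i - beta_i = b
    A_eq[i, 2 + i] = -1.0 if is_bad else 1.0
    A_eq[i, 2 + n + i] = 1.0 if is_bad else -1.0
    b_eq[i] = b
# Goal constraints: alpha* + sum(alpha) = d_a- - d_a+ ;
#                   beta* - sum(beta)  = d_b- - d_b+
A_eq[n, 2:2 + n] = 1.0
A_eq[n, -4], A_eq[n, -3] = -1.0, 1.0
b_eq[n] = -alpha_star
A_eq[n + 1, 2 + n:2 + 2 * n] = -1.0
A_eq[n + 1, -2], A_eq[n + 1, -1] = -1.0, 1.0
b_eq[n + 1] = -beta_star

bounds = [(None, None)] * 2 + [(0, None)] * (nv - 2)
res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
scores = cases @ res.x[:2]
print(res.fun, scores)
```

Here the minimal regret is 0 (e.g. X = (1, 1) gives Σα = 0 and Σβ = 15 = β*), which forces all α_i to zero, so the solved X separates the toy Goods and Bads around b = 10.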
40 Comparison Method. A comparison of the different methods can be made with the Kolmogorov-Smirnov (KS) value, which measures the largest separation between the cumulative distributions of Goods and Bads (Conover 1999): KS = max |Cumulative distribution of Goods − Cumulative distribution of Bads|.
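As an illustration, the KS value can be computed directly from two lists of scored cases; the scores below are made up, not from the FDC sample.

```python
def ks_statistic(good_scores, bad_scores):
    """Largest gap between the cumulative score distributions of Goods and Bads."""
    cuts = sorted(set(good_scores) | set(bad_scores))
    ks = 0.0
    for t in cuts:
        cum_good = sum(s <= t for s in good_scores) / len(good_scores)
        cum_bad = sum(s <= t for s in bad_scores) / len(bad_scores)
        ks = max(ks, abs(cum_good - cum_bad))
    return ks

# Made-up scores: Bads concentrate at the low end, Goods at the high end.
bads = [1, 2, 3, 6]
goods = [5, 7, 8, 9]
print(ks_statistic(goods, bads))   # → 0.75
```

A higher KS means the classifier's score pushes the two cumulative distributions further apart, which is why it is used here to rank the scoring methods.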
41 Comparison Results KS values on a sample of 1000 cases: (i) KS(Behavior Score) = 55.26; (ii) KS(Credit Bureau Score) = 45.55; (iii) KS(FDC Bankruptcy Score) = 59.16; (iv) KS(SE Score) = 60.22; (v) KS(MCLP) = 59.49
42 Comparison Results. On a sample of 2000 cases, KS(MCLP) = 60.1 outperforms the other methods. From sample sizes 3000 to 6000, the KS(MCLP) deviation is only 0.38 ( ).
43 MCLP Model Learning Experience. [Table: columns Versions, # of variables, Sample Size, KS value, Note; each version has an Optimal row and a Cross-Validation row: (M8); (M9) with 30 variables (log. reg.); (M10) with 29 variables (log. reg.); (M11) with 27 variables (expert).]
44 [Chart: KS(MCLP) on 1000 cases; cumulative distributions CUMGOOD and CUMBAD, 0.00% to 100.00%.]
45 [Chart: KS(MCLP) on 2000 cases; cumulative distributions CUMGOOD and CUMBAD, 0.00% to 100.00%.]
46 [Chart: KS(MCLP) on 6000 cases; cumulative distributions CUMGOOD and CUMBAD, 0.00% to 100.00%.]
47 Research Findings. The MCLP model is fully controlled by its formulation; a sample size of 3000 is stable enough for robustness in the separation process; the MCLP model can easily be adapted to multi-group separation problems.
48 Three-group MC Model. [Chart: 3-group cumulative distributions (Training): cumpct1, cumpct2, cumpct3 by score interval.]
49 Three-group MC Model. [Chart: 3-group cumulative distributions (Verifying): cumpct1, cumpct2, cumpct3 by score interval.]
50 Four-group MC Model. [Chart: 4-group cumulative distributions (Training): cumpct1 through cumpct4 by score interval.]
51 Four-group MC Model. [Chart: 4-group cumulative distributions (Verifying): cumpct1 through cumpct4 by score interval.]
52 Five-group MC Model. [Chart: 5-group MCLP separation (Verifying): cumpct1 through cumpct5 by score interval.]
53 Other Applications. Linear programming-based data mining technology can also be used for: (1) bank and firm bankruptcy analyses, (2) fraud management, (3) financial risk management, (4) medical clinic analyses, (5) marketing promotion, and many others.
More informationCS570 Data Mining. Anomaly Detection. Li Xiong. Slide credits: Tan, Steinbach, Kumar Jiawei Han and Micheline Kamber.
CS570 Data Mining Anomaly Detection Li Xiong Slide credits: Tan, Steinbach, Kumar Jiawei Han and Micheline Kamber April 3, 2011 1 Anomaly Detection Anomaly is a pattern in the data that does not conform
More informationAn overview of Boosting. Yoav Freund UCSD
An overview of Boosting Yoav Freund UCSD Plan of talk Generative vs. non-generative modeling Boosting Alternating decision trees Boosting and over-fitting Applications 2 Toy Example Computer receives telephone
More informationSupport vector machines Lecture 4
Support vector machines Lecture 4 David Sontag New York University Slides adapted from Luke Zettlemoyer, Vibhav Gogate, and Carlos Guestrin Q: What does the Perceptron mistake bound tell us? Theorem: The
More informationCPSC 340: Machine Learning and Data Mining. More PCA Fall 2017
CPSC 340: Machine Learning and Data Mining More PCA Fall 2017 Admin Assignment 4: Due Friday of next week. No class Monday due to holiday. There will be tutorials next week on MAP/PCA (except Monday).
More informationML (cont.): SUPPORT VECTOR MACHINES
ML (cont.): SUPPORT VECTOR MACHINES CS540 Bryan R Gibson University of Wisconsin-Madison Slides adapted from those used by Prof. Jerry Zhu, CS540-1 1 / 40 Support Vector Machines (SVMs) The No-Math Version
More informationFirm Failure Timeline Prediction: Math Programming Approaches
2016 49th Hawaii International Conference on System Sciences Firm Failure Timeline Prediction: Math Programming Approaches Young U. Ryu School of Management The University of Texas at Dallas ryoung@utdallas.edu
More informationData classification (II)
Lecture 4: Data classification (II) Data Mining - Lecture 4 (2016) 1 Outline Decision trees Choice of the splitting attribute ID3 C4.5 Classification rules Covering algorithms Naïve Bayes Classification
More informationA Posteriori Corrections to Classification Methods.
A Posteriori Corrections to Classification Methods. Włodzisław Duch and Łukasz Itert Department of Informatics, Nicholas Copernicus University, Grudziądzka 5, 87-100 Toruń, Poland; http://www.phys.uni.torun.pl/kmk
More informationOperations Research Lecture 1: Linear Programming Introduction
Operations Research Lecture 1: Linear Programming Introduction Notes taken by Kaiquan Xu@Business School, Nanjing University 25 Feb 2016 1 Some Real Problems Some problems we may meet in practice or academy:
More informationA Hybrid Method of CART and Artificial Neural Network for Short-term term Load Forecasting in Power Systems
A Hybrid Method of CART and Artificial Neural Network for Short-term term Load Forecasting in Power Systems Hiroyuki Mori Dept. of Electrical & Electronics Engineering Meiji University Tama-ku, Kawasaki
More informationNon-Bayesian Classifiers Part II: Linear Discriminants and Support Vector Machines
Non-Bayesian Classifiers Part II: Linear Discriminants and Support Vector Machines Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Fall 2018 CS 551, Fall
More informationNearest Neighbors Methods for Support Vector Machines
Nearest Neighbors Methods for Support Vector Machines A. J. Quiroz, Dpto. de Matemáticas. Universidad de Los Andes joint work with María González-Lima, Universidad Simón Boĺıvar and Sergio A. Camelo, Universidad
More informationFinal Overview. Introduction to ML. Marek Petrik 4/25/2017
Final Overview Introduction to ML Marek Petrik 4/25/2017 This Course: Introduction to Machine Learning Build a foundation for practice and research in ML Basic machine learning concepts: max likelihood,
More informationData Mining. 3.6 Regression Analysis. Fall Instructor: Dr. Masoud Yaghini. Numeric Prediction
Data Mining 3.6 Regression Analysis Fall 2008 Instructor: Dr. Masoud Yaghini Outline Introduction Straight-Line Linear Regression Multiple Linear Regression Other Regression Models References Introduction
More informationImproved Classification and Discrimination by Successive Hyperplane and Multi-Hyperplane Separation
Improved Classification and Discrimination by Successive Hyperplane and Multi-Hyperplane Separation Fred Glover and Marco Better OptTek Systems, Inc. 2241 17 th Street Boulder, CO 80302 Abstract We propose
More informationECS289: Scalable Machine Learning
ECS289: Scalable Machine Learning Cho-Jui Hsieh UC Davis Oct 18, 2016 Outline One versus all/one versus one Ranking loss for multiclass/multilabel classification Scaling to millions of labels Multiclass
More informationLecture 4: Feed Forward Neural Networks
Lecture 4: Feed Forward Neural Networks Dr. Roman V Belavkin Middlesex University BIS4435 Biological neurons and the brain A Model of A Single Neuron Neurons as data-driven models Neural Networks Training
More informationCLUe Training An Introduction to Machine Learning in R with an example from handwritten digit recognition
CLUe Training An Introduction to Machine Learning in R with an example from handwritten digit recognition Ad Feelders Universiteit Utrecht Department of Information and Computing Sciences Algorithmic Data
More informationInternational Journal "Information Theories & Applications" Vol.14 /
International Journal "Information Theories & Applications" Vol.4 / 2007 87 or 2) Nˆ t N. That criterion and parameters F, M, N assign method of constructing sample decision function. In order to estimate
More informationAnnouncements Kevin Jamieson
Announcements My office hours TODAY 3:30 pm - 4:30 pm CSE 666 Poster Session - Pick one First poster session TODAY 4:30 pm - 7:30 pm CSE Atrium Second poster session December 12 4:30 pm - 7:30 pm CSE Atrium
More informationGaussian and Linear Discriminant Analysis; Multiclass Classification
Gaussian and Linear Discriminant Analysis; Multiclass Classification Professor Ameet Talwalkar Slide Credit: Professor Fei Sha Professor Ameet Talwalkar CS260 Machine Learning Algorithms October 13, 2015
More informationEXAM IN STATISTICAL MACHINE LEARNING STATISTISK MASKININLÄRNING
EXAM IN STATISTICAL MACHINE LEARNING STATISTISK MASKININLÄRNING DATE AND TIME: August 30, 2018, 14.00 19.00 RESPONSIBLE TEACHER: Niklas Wahlström NUMBER OF PROBLEMS: 5 AIDING MATERIAL: Calculator, mathematical
More informationOptimization Methods in Finance
Optimization Methods in Finance 1 PART 1 WELCOME 2 Welcome! My name is Friedrich Eisenbrand Assistants of the course: Thomas Rothvoß, Nicolai Hähnle How to contact me: Come to see me during office ours:
More informationAdversarial Machine Learning: Big Data Meets Cyber Security
Adversarial Machine Learning: Big Data Meets Cyber Security Bowei Xi Department of Statistics Purdue University With Murat Kantarcioglu Introduction Many adversarial learning problems in practice. Image
More informationText Mining. Dr. Yanjun Li. Associate Professor. Department of Computer and Information Sciences Fordham University
Text Mining Dr. Yanjun Li Associate Professor Department of Computer and Information Sciences Fordham University Outline Introduction: Data Mining Part One: Text Mining Part Two: Preprocessing Text Data
More informationRobust Pareto Design of GMDH-type Neural Networks for Systems with Probabilistic Uncertainties
. Hybrid GMDH-type algorithms and neural networks Robust Pareto Design of GMDH-type eural etworks for Systems with Probabilistic Uncertainties. ariman-zadeh, F. Kalantary, A. Jamali, F. Ebrahimi Department
More informationLinear classifiers Lecture 3
Linear classifiers Lecture 3 David Sontag New York University Slides adapted from Luke Zettlemoyer, Vibhav Gogate, and Carlos Guestrin ML Methodology Data: labeled instances, e.g. emails marked spam/ham
More informationDECISION MAKING SUPPORT AND EXPERT SYSTEMS
325 ITHEA DECISION MAKING SUPPORT AND EXPERT SYSTEMS UTILITY FUNCTION DESIGN ON THE BASE OF THE PAIRED COMPARISON MATRIX Stanislav Mikoni Abstract: In the multi-attribute utility theory the utility functions
More informationAPPLYING FOR ADMISSION TO COURSES OR DEGREES HONOURS IN MATHEMATICAL STATISTICS
APPLYING FOR ADMISSION TO COURSES OR DEGREES Application forms may be obtained from http://www.wits.ac.za/prospective/postgraduate or from the Student Enrolment centre, ground floor, Senate House. The
More informationChapter 11. Regression with a Binary Dependent Variable
Chapter 11 Regression with a Binary Dependent Variable 2 Regression with a Binary Dependent Variable (SW Chapter 11) So far the dependent variable (Y) has been continuous: district-wide average test score
More informationDecision Trees (Cont.)
Decision Trees (Cont.) R&N Chapter 18.2,18.3 Side example with discrete (categorical) attributes: Predicting age (3 values: less than 30, 30-45, more than 45 yrs old) from census data. Attributes (split
More informationIntro. ANN & Fuzzy Systems. Lecture 15. Pattern Classification (I): Statistical Formulation
Lecture 15. Pattern Classification (I): Statistical Formulation Outline Statistical Pattern Recognition Maximum Posterior Probability (MAP) Classifier Maximum Likelihood (ML) Classifier K-Nearest Neighbor
More informationRadial Basis Functions Networks to hybrid neuro-genetic RBFΝs in Financial Evaluation of Corporations
Radial Basis Functions Networks to hybrid neuro-genetic RBFΝs in Financial Evaluation of Corporations Loukeris Nikolaos University of Essex Email: nikosloukeris@gmail.com Abstract:- Financial management
More information