PROFIT AGENT CLASSIFICATION USING FEATURE SELECTION EIGENVECTOR CENTRALITY

International Journal of Mechanical Engineering and Technology (IJMET), Volume 10, Issue 3, March 2019, Article ID: IJMET_10_03_062. IAEME Publication, Scopus Indexed.

PROFIT AGENT CLASSIFICATION USING FEATURE SELECTION EIGENVECTOR CENTRALITY

Zidni Nurrobi Agam, Computer Science Department, Bina Nusantara University, Jakarta, Indonesia
Sani M. Isa, Computer Science Department, Bina Nusantara University, Jakarta, Indonesia

Abstract. Classification is a method that groups data into related categories according to their similarities. High-dimensional data can make the classification process suboptimal because it contains large amounts of otherwise meaningless data. In this paper, we classify profit agents of PT. XYZ and identify the features that have the largest impact on agent profitability. Feature selection is one method that can optimize a dataset for the classification process. We apply a graph-based feature selection method that identifies the most important nodes in relation to their neighboring nodes. Eigenvector centrality estimates the importance of a feature relative to its neighbors; ranking features by eigenvector centrality yields the candidate features used in the classification method and identifies the best features for classifying the agent data. Support Vector Machines (SVM) are then used to test whether feature selection with eigenvector centrality further improves classification accuracy.

Keywords: Classification, Support Vector Machines, Feature Selection, Eigenvector Centrality, Graph-based.

Cite this Article: Zidni Nurrobi Agam and Sani M. Isa, Profit Agent Classification Using Feature Selection Eigenvector Centrality, International Journal of Mechanical Engineering and Technology (IJMET) 10(3), 2019.

1. INTRODUCTION
In this era data is a very important commodity used in almost all existing technologies, and researchers examine ever more data in order to find hidden patterns that can be used as information. However, as the amount of data grows, datasets also contain many irrelevant and redundant variables, which lowers data quality. Feature selection is a method that selects a subset of variables from the input which can efficiently describe the input data while reducing the effects of noise or irrelevant variables and still provide good prediction results [1]. Feature selection usually operates by ranking or by subset selection [2][3] to obtain the most relational or most important values from a dataset. With n the total number of features, the goal of feature selection is to select an optimal subset of I features with I < n. Processing data with feature selection improves the overall prediction because the dataset has been optimized. We apply a graph-based feature selection that ranks features by eigenvector centrality (ECFS). In graph theory, eigenvector centrality measures how strongly a node influences the other nodes in the network: all nodes in the network are assigned relative scores based on the concept that connections to high-scoring nodes contribute more to the score of the node in question than equal connections to low-scoring nodes [4]. A high eigenvector score means that a node is connected to many nodes which themselves have high scores, so the relationship between features (nodes) is measured by the weight of the connection between nodes. The feature subset selection problem refers to the task of identifying and selecting a useful subset of attributes to represent patterns from a larger set of often mutually redundant, possibly irrelevant, attributes with different associated measurement costs and/or risks [5]. We use ECFS to find the most influential features for predicting the profit agent. There are several studies on eigenvector centrality. Nicholas J. Bryan and Ge Wang [6] investigated how music with many features can create an influence network between songs, used to describe patterns of musical influence in sample-based music suitable for musicological analysis, and [7] analyzes and ranks influential features between music genres with eigenvector centrality. In 2016, Giorgio Roffo and Simone Melzi studied feature ranking via eigenvector centrality: they identify the most important features by mapping an arbitrary set of cues to a graph in which the features are the nodes and assessing the importance of the nodes through an indicator of centrality; to build the graph and weight the distance between nodes they use the Fisher criterion. The goal of this paper is to apply Chi-Square and ECFS feature selection and compare both on different datasets. Both feature selection methods are tested on the HCC and profit agent datasets, validated with k-fold cross-validation to test the models' ability, and evaluated with a confusion matrix to measure misclassification. Based on the ECFS results we determine which attributes of the profit agent dataset have a major impact on the other attributes.
2. RESEARCH METHOD
This section discusses feature selection and ranking based on a graph network [8]. To build the graph, we first have to define how it is designed and constructed.

2.1. Graph Design
Define the graph G = (V, E), where V is the set of vertices corresponding one by one to each variable x in the feature set $X = \{x^{(1)}, x^{(2)}, \ldots, x^{(n)}\}$, and E is the set of (weighted) edges between the nodes (features). The relationships between the nodes are represented in an adjacency matrix with entries

$a_{ij} = l(x^{(i)}, x^{(j)})$  (1)

When the graph is unweighted, the entries of the adjacency matrix are all 0 or 1 (there are no weights on the edges or multiple edges) and the diagonal entries are all zero [9]; when the graph is weighted, the adjacency matrix is filled not with 0s and 1s but with the edge weights, as illustrated in Figure 1.

Figure 1 Matrix A with weights and matrix A without weights

Designing the graph includes weighting it according to how strong the relationship between two features in the dataset is. We apply the Fisher linear discriminant [10] to the class means and standard deviations; this method finds a linear combination of features which characterizes or separates two or more classes of objects or events:

$f = \dfrac{(m_1 - m_2)^2}{s_1^2 + s_2^2}$  (2)

where m represents a class mean, s represents a class standard deviation, and the subscripts denote the two classes. After measuring the relationship between the two classes and obtaining the weights from the Fisher linear discriminant, we apply eigenvector centrality to rank and filter the features using the correlation weights generated from the relationship between each pair of nodes (features). For G = (V, E) with adjacency matrix $A = (a_{v,t})$, the relative centrality score of a vertex v can be defined as

$x_v = \dfrac{1}{\lambda} \sum_{t \in M(v)} x_t = \dfrac{1}{\lambda} \sum_{t \in G} a_{v,t}\, x_t$  (3)

where M(v) is the set of neighbors of v and $\lambda$ is a constant. With a small rearrangement this can be rewritten in vector notation as the eigenvector equation

$A\mathbf{x} = \lambda \mathbf{x}$  (4)

As we count longer and longer paths, this measure of accessibility converges to an index known as the eigenvector centrality measure (EC).
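
As an illustration of equations (1)-(4), the sketch below builds a weighted feature graph from per-feature Fisher scores and ranks the features by the dominant eigenvector of its adjacency matrix, obtained by power iteration. This is a minimal sketch, not the authors' implementation: the pairwise weight (the product of the two per-feature Fisher scores) and the function names are illustrative assumptions, and labels are assumed to be +1/-1.

```python
import numpy as np

def fisher_score(f, y):
    """Per-feature Fisher criterion f = (m1 - m2)^2 / (s1^2 + s2^2) for labels +1/-1."""
    a, b = f[y == 1], f[y == -1]
    return (a.mean() - b.mean()) ** 2 / (a.var() + b.var() + 1e-12)

def ecfs_ranking(X, y, n_iter=100):
    """Rank features by eigenvector centrality of a Fisher-weighted feature graph."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    n_features = X.shape[1]
    scores = np.array([fisher_score(X[:, i], y) for i in range(n_features)])
    # Illustrative pairwise weight a_ij: product of the two per-feature Fisher scores.
    A = np.outer(scores, scores)
    np.fill_diagonal(A, 0.0)            # no self-loops, cf. equation (1)
    # Power iteration converges to the dominant eigenvector of A, cf. equation (4).
    x = np.ones(n_features) / n_features
    for _ in range(n_iter):
        x = A @ x
        x /= np.linalg.norm(x)
    return np.argsort(-x)               # feature indices, most central first
```

Calling ecfs_ranking(X, y) on a feature matrix X and a label vector y returns the column indices ordered by centrality, from which a top-ranked subset can be passed to the classifier.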

An example of the nodes and adjacency matrix [9] is described in Figure 2 and Table 1.

Figure 2 Nodes of the agent data of company XYZ

Table 1 Example adjacency matrix over the attributes Age, Gender, City, Balance and Total Trans

2.2. SVM Classification Method
SVM classification is a classification analysis that falls into the category of supervised learning algorithms. Given a set of training examples, each marked as belonging to one of two categories, a support vector machine builds a model that assigns new examples to one category or the other, making it a non-probabilistic binary linear classifier. Its basic idea is to map the data into a high-dimensional space and find a separating hyperplane with the maximal margin. Given a training dataset of n points of the form

$(\mathbf{x}_1, y_1), \ldots, (\mathbf{x}_n, y_n)$  (5)

where the $y_i$ are 1 or -1 [11], each $y_i$ indicates the class to which the point $\mathbf{x}_i$ belongs and each $\mathbf{x}_i$ is a p-dimensional real vector. We want to find the "maximum-margin hyperplane" that divides the group of points for which $y_i = 1$ from the group of points for which $y_i = -1$, defined so that the distance between the hyperplane and the nearest point from either group is maximized.
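
The following toy example, assuming scikit-learn's SVC as a stand-in for the SVM described above (the paper's experiments use MATLAB's FITCSVM), shows a linear maximum-margin classifier fitted to points labelled +1 and -1.

```python
import numpy as np
from sklearn.svm import SVC

# Toy training set: each row is a p-dimensional point, labels are +1 / -1.
X = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 3.0],
              [6.0, 5.0], [7.0, 8.0], [8.0, 8.0]])
y = np.array([-1, -1, -1, 1, 1, 1])

# A linear SVM finds the separating hyperplane w.x + b = 0 with the maximal margin.
clf = SVC(kernel="linear", C=1.0).fit(X, y)
print("w =", clf.coef_[0], "b =", clf.intercept_[0])
print("predicted class for (4, 4):", clf.predict([[4.0, 4.0]])[0])
```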

Figure 3 SVM tries to find the best hyperplane to split the classes -1 and 1

2.3. Confusion Matrix
To measure how well the SVM classification optimized with eigenvector centrality performs, we use a confusion matrix. A confusion matrix is a table that is often used to describe the performance of a classification model (or "classifier") on a set of test data for which the true values are known. The confusion matrix, shown in Figure 4, contains TP (True Positives), TN (True Negatives), FP (False Positives) and FN (False Negatives).

Figure 4 Confusion matrix

True Positives: we predicted the positive class and the instance is actually positive. False Positives: we predicted the positive class but the instance is actually negative. True Negatives: we predicted the negative class and the instance is actually negative. False Negatives: we predicted the negative class but the instance is actually positive.
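
A small worked sketch of how the four cells and the accuracy are obtained from true and predicted labels follows; the helper name confusion_counts is illustrative, not part of any cited library.

```python
import numpy as np

def confusion_counts(y_true, y_pred, positive=1):
    """Return TP, FP, TN, FN for a binary problem with the given positive label."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = int(np.sum((y_pred == positive) & (y_true == positive)))
    fp = int(np.sum((y_pred == positive) & (y_true != positive)))
    tn = int(np.sum((y_pred != positive) & (y_true != positive)))
    fn = int(np.sum((y_pred != positive) & (y_true == positive)))
    return tp, fp, tn, fn

y_true = [1, 1, -1, -1, 1, -1]
y_pred = [1, -1, -1, -1, 1, 1]
tp, fp, tn, fn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / (tp + fp + tn + fn)
print(tp, fp, tn, fn, accuracy)  # 2 1 2 1 0.666...
```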

3. RESULT & ANALYSIS
In this step we discuss the results of the comparison between eigenvector centrality FS and Chi-Square FS; Support Vector Machine classification is then used to calculate and compare the accuracy obtained with each feature selection method.

3.1. Dataset
The PT. XYZ dataset is chosen to analyze feature selection with eigenvector centrality under the following scenario. First, we describe the dataset used for feature selection and how many features are used for prediction. The agent data are categorical data collected from transaction data, with the following attributes used for analysis:

1. Type Application. Description: the device used by the agent for transactions. Categorical: 1. EDC, 2. Android.
2. Age. Description: age of the PT. XYZ agent. Categorical: 1. <= 23 years, 2. > 23 years and <= 29 years, 3. > 29 years.
3. City. Description: the city where the agent is located. Categorical: every city is converted to a number.
4. Balance Agent. Description: the agent's wallet balance on PT. XYZ. Categorical: 1. <= Rp ..., 2. > Rp ... and <= Rp ..., 3. > Rp ...
5. Transaction. Description: transactions made by the agent every day. Categorical: number of transactions per agent per day.
6. Joined. Description: how long the agent has been with PT. XYZ. Categorical: number of days from joining until now.
7. Gender. Description: agent gender. Categorical: 1 = L (male), 0 = P (female).
8. PulsaPrabayar, PLNPrabayar, TVBerbayar, PDAM, PLNPasca, Telpon, Speedy, BPJS, Cashin, Asuransi, Gopay, and TiketKereta. Description: each detailed transaction type from the PT. XYZ application. Categorical: count per day of each detailed transaction type per agent.

Number of data: 558. Number of profit: ...

3.2. FS Comparison Approach
In this step we compare eigenvector centrality feature selection against chi-square feature selection. Chi-square is a numerical test that measures deviation under the assumption that the feature event is independent of the class value [12]. The comparison between eigenvector centrality and chi-square feature selection uses SVM classification on the agent dataset and on the HCC (hepatocellular carcinoma) survival public dataset, which has 49 attributes and 2 classes (Survive, Not Survive). HCC is the most common type of primary liver cancer; it occurs most often in people with chronic liver diseases, such as cirrhosis caused by hepatitis B or hepatitis C infection. The entire dataset, containing 165 records with numerical categorical characteristics, is used in our analysis. The results of ECFS and Chi-Square on the agent and HCC data are analyzed to determine whether there are significant differences between the two methods: the correlation between attributes can influence the classification result [13], and accidentally eliminating crucial features can reduce it.
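
As a hedged sketch of the chi-square baseline (not the authors' code), the snippet below ranks non-negative categorical attributes with scikit-learn's chi2 test; the random matrix simply stands in for the 558-record, 19-attribute agent dataset described above.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2

# Placeholder data standing in for the agent dataset: non-negative categorical
# codes (device type, age bucket, city code, balance bucket, counts per day, ...)
# and a binary profit / non-profit label.
rng = np.random.default_rng(1)
X = rng.integers(0, 5, size=(558, 19))
y = rng.integers(0, 2, size=558)

# chi2 scores measure how far each attribute deviates from independence of the class.
selector = SelectKBest(score_func=chi2, k=10).fit(X, y)
ranking = np.argsort(-selector.scores_)      # attribute indices, highest score first
print("chi-square ranking:", ranking[:10])
```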

Table 2 includes further details of the considered datasets: the number of samples and variables and the number of classes, together with the accuracy results from classification [14].

Table 2 Datasets used in the comparison of feature selection.
No  Dataset        Samples  Variables  Classes
1   Agent Profit   558      19         2
2   HCC            165      49         2

3.3. Test Models & Performance Analysis
In this step we test the accuracy of both feature selection methods using SVM classification and measure it with a confusion matrix. For eigenvector centrality, the dataset is split into training and test data and into two classes: the profit class is labelled -1 and the non-profit class 1. The class means are used to compute the mutual information; the linear discriminant measures the mutual information between the two classes and finds the best values for building the graph in eigenvector centrality. Eigenvector centrality then ranks the features by how strongly each node is connected to the other nodes, and the 10 most central attributes (nodes) are selected to build the prediction model with SVM; the SVM model used for this classification is FITCSVM. The models are validated with 10-fold cross-validation, with a fixed random seed (1) to control random number generation for every feature selection method. Table 3 shows the results obtained by comparing the accuracy on the different datasets. The HCC results cover attribute counts from 9 to 45; when only 1 to 8 attributes are used, the accuracy of chi-square decreases to 60%, so we do not display the accuracies for 1-9 attributes on HCC. The Agent Profit results cover attribute counts from 10 to 19; when only 1 to 9 attributes are used, the accuracy of chi-square decreases to 60%, so we do not display the accuracies for 1-10 attributes on Agent Profit. The best attributes are found by manual reduction to obtain the best accuracy at each iteration.

Table 3 Performance analysis of the HCC dataset with Chi-Square and ECFS (classification accuracy per number of selected attributes)
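
The evaluation loop described above can be sketched as follows, assuming a feature ranking (for instance from the illustrative ecfs_ranking above or from the chi-square ranking) and scikit-learn's SVC in place of FITCSVM: for each subset size the top-ranked attributes are kept and the accuracy is estimated with 10-fold cross-validation under a fixed random seed.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import StratifiedKFold, cross_val_score

def accuracy_per_subset(X, y, ranking, seed=1):
    """Mean 10-fold CV accuracy of a linear SVM on the top-k ranked features."""
    X, ranking = np.asarray(X), np.asarray(ranking)
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
    results = {}
    for k in range(X.shape[1], 0, -1):       # from all attributes down to 1
        cols = ranking[:k]                    # keep the k best-ranked attributes
        scores = cross_val_score(SVC(kernel="linear"), X[:, cols], y, cv=cv)
        results[k] = scores.mean()
    return results
```

The returned dictionary maps each attribute count to its cross-validated accuracy, which is, in spirit, how the iteration-versus-accuracy curves reported in Tables 3 and 4 are produced.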

Figure 5 Performance analysis of the HCC dataset with Chi-Square and ECFS

For chi-square, the highest accuracy was obtained at iterations 32 and 33, reaching 71.6176%, while ECFS reached its highest accuracy, 71.61%, at iteration 19. Chi-square reached its maximum accuracy at 33 iterations, whereas ECFS achieved its maximum accuracy at the 19th iteration. Both feature selection methods were tested on the dataset with 45 attributes; the test was then carried out on the agent profit dataset, which has 19 attributes, with the results shown in Table 4 and Figure 6.

Table 4 Performance analysis of the agent profit dataset with Chi-Square and ECFS (classification accuracy per number of selected attributes)

Figure 6 Performance analysis of the agent profit dataset with Chi-Square and ECFS

The accuracy produced by ECFS on the agent profit dataset, 90.48%, is better than that of chi-square, which produces a maximum accuracy of 89.18%; the maximum accuracy is obtained by ECFS at 11 iterations, while chi-square reaches its maximum at iteration 9. The overall performance of both feature selection methods indicates that ECFS is more robust than chi-square, based on the results obtained on the HCC and agent profit datasets, but chi-square is better when there are more than 20 attributes, where it can reach its maximum accuracy at iterations 32 and 33. When fewer than 20 attributes of the HCC dataset remain, ECFS reaches its maximum accuracy at iteration 19. When the attribute count becomes small, such as 9 attributes, chi-square accuracy decreases significantly on both the HCC and agent profit datasets, while ECFS is more robust: even when attributes are removed, its accuracy does not decrease significantly. ECFS shows its largest increase in performance when attributes are reduced [15] on the agent profit dataset, because many attributes have strong relationships with the others. ECFS ranks every attribute according to how well it discriminates between the two classes, and removing attributes that have no major impact on the other attributes improves the result.

3.4. Evaluating & Analysis Attribute
From this comparison, each feature selection method has its own best accuracy, evaluated with the confusion matrices shown in Table 5; the confusion matrices show only the single best result for ECFS and for Chi-Square. For ECFS, the best 10-fold accuracy on the HCC dataset is 71.61% and on the profit agent dataset 90.48%. For Chi-Square, the best 10-fold accuracy on the HCC dataset is 71.61% and on the profit agent dataset 89.18%.

Table 5 Confusion matrices for the best accuracies

ECFS (agent profit dataset)
            Profit  NonProfit
Profit      40      4
NonProfit   5       51

Chi-Square (agent profit dataset)
            Profit  NonProfit
Profit      ...     ...
NonProfit   1       55

ECFS (HCC dataset)
            Profit  NonProfit
Profit      2       4
NonProfit   1       9

Chi-Square (HCC dataset)
            Profit  NonProfit
Profit      2       4
NonProfit   1       9

There are 12 attributes of the agent dataset that have a major impact on the other attributes after testing with ECFS and that may matter for the analysis of profit agents, as shown in Figure 7.

Figure 7 Attribute analysis of the profit agent dataset with ECFS

From Figure 7 we see that the attribute with the largest impact on the other attributes is City: many profit agents are determined by where the agent performs transactions. This is a plausible finding from ECFS, because the city is an attribute with a major impact in real conditions: people in big cities and in small cities have different habits, such as knowledge of technology and culture.

4. CONCLUSION & FUTURE WORK
In this paper we build a model using eigenvector centrality feature selection, which uses the eigenvector to weight the connections between nodes, finds the best nodes, and ranks the features by importance. The comparison between ECFS and Chi-Square shows that ECFS is more robust than Chi-Square once the attribute set has been reduced, while Chi-Square can reach its maximum accuracy when more attributes are available than ECFS needs. When Chi-Square is tested with a reduced attribute set, such as fewer than 9 attributes, its accuracy decreases significantly more than that of ECFS. With ECFS we can identify the attributes that have a major impact on agent profit. Future work is to boost the performance of the feature selection itself when the dataset has many attributes, to test ECFS on other datasets with more attributes, and to implement ECFS in other tools or frameworks so that it can be developed further and is easily accessible to many researchers.

REFERENCES
[1] I. Guyon and A. Elisseeff, An Introduction to Variable and Feature Selection, J. Mach. Learn. Res., vol. 3, no. 3.
[2] R. Wan, L. Vegas, M. Carlo, and P. Ii, Computational Methods of Feature Selection.
[3] P. S. Bradley, Feature Selection via Concave Minimization and Support Vector Machines, no. 6.
[4] L. Solá, M. Romance, R. Criado, J. Flores, A. García del Amo, and S. Boccaletti, Eigenvector centrality of nodes in multiplex networks, Chaos, vol. 23, no. 3, pp. 1-11.
[5] J. Yang and V. Honavar, Feature Subset Selection Using a Genetic Algorithm, IEEE Intell. Syst., vol. 13.
[6] N. Bryan and G. Wang, Musical Influence Network Analysis and Rank of Sample-Based Music, Proc. 12th Int. Soc. Music Inf. Retr. Conf. (ISMIR).
[7] G. Roffo and S. Melzi, Ranking to learn: Feature ranking and selection via eigenvector centrality, Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics).
[8] F. R. Pitts, A graph theoretic approach to historical geography.
[9] F. Harary, S. Review, and N. Jul, no title, vol. 4, no. 3, 2007.
[10] B. Scholkopft and K. Mullert, Fisher Discriminant Analysis With.
[11] Y. Bazi, S. Member, F. Melgani, and S. Member, Toward an Optimal SVM Classification System for Hyperspectral Remote Sensing Images, December 2013.
[12] S. Thaseen and C. A. Kumar, Intrusion Detection Model Using fusion of Chi-square feature selection and multi class, J. King Saud Univ. - Comput. Inf. Sci., 2016.
[13] I. Sumaiya Thaseen and C. Aswani Kumar, Intrusion detection model using fusion of chi-square feature selection and multi class SVM, J. King Saud Univ. - Comput. Inf. Sci., vol. 29, no. 4.
[14] D. Ballabio, F. Grisoni, and R. Todeschini, Multivariate comparison of classification performance measures, Chemom. Intell. Lab. Syst., vol. 174, March.
[15] E. M. Hand and R. Chellappa, Attributes for Improved Attributes: A Multi-Task Network for Attribute Classification.
