Informa(on theory in ML
|
|
- Julianna Nash
- 6 years ago
- Views:
Transcription
1 Informa(on theory in ML
2 Feature Selec+on For efficiency of the classifier and to suppress noise choose subset of all possible features. Selected features should be frequent to avoid overfitting the classifier to the training data, but not too frequent in order to be characteristic. Features should be good discriminators between classes (i.e. frequent/characteristic in one class but infrequent in other classes). Approach: - compute measure of discrimination for each feature - select the top k most discriminative features in greedy manner tf*idf is usually not a good discrimination measure, and may give undue weight to terms with high idf value (leading to the danger of overfitting)
3 Entropy:idea Die Toten Hosen: Sieben fuhren nach Düsseldorf und einer fuhr nach Köln 7 P (D) = 8 P (K) = 8 ( log2 + 0 log2 0) = 0 X 7 7 H = pi log2 pi = log2 + log i Hwater = log2 + log2 = lim pi!0 pi log2 pi = 0 Hkoelsch = Halt =
4 Example for Feature Selec+on f f2 f3 f4 f5 f6 f7 f8 d: d2: d3: d4: d5: d6: d7: d8: d9: d0: d: d2: Class Tree: Entertainment Calculus Math training docs: d, d2, d3, d4 Entertainment d5, d6, d7, d8 Calculus d9, d0, d, d2 Algebra Algebra
5 Discrimina+ve Classifiers: Decision Trees given: a multiset of m-dimensional training data records dom(a)... dom(am) with numerical, ordinal, or categorial attributes Ai (e.g. term occurrence frequencies N 0... N 0 ) and with class labels wanted: a tree with attribute value conditions of the form Ai value for numerical or ordinal attributes or Ai value set or Ai value set = for categorial attributes or linear combinations of this type k i A i for several numerical attributes as inner nodes and labeled classes as leaf nodes value
6 Examples for Decision Trees () tf(homomorphism) 2 tf(vector) 3 tf(limit) 2 Lineare Algebra T T F T F F Algebra Calculus Other has read Tolkien T has read Eco T F intellectual uneducated F boring salary T credit worthy university degree & salary T F credit worthy F not credit worthy
7 Top- Down Construc+on of Decision Tree Input: decision tree node k that represents one partition D of dom(a)... dom(am) Output: decision tree with root k ) BuildTree (root, dom(a)... dom(am)) 2) PruneTree: reduce tree to appropriate size with: procedure BuildTree (k, D): if k contains only training data of the same class then terminate; determine split dimension Ai; determine split value x for most suitable partitioning of D into D = D {d d.ai x} and D2= D {d d.ai > x}; create children k and k2 of k; BuildTree (k, D); BuildTree (k2, D2);
8 Split Criterion: Informa+on Gain Goal is to split current node such that the resulting partitions are as pure as possible w.r.t. class labels of the corresponding training data. Thus we aim to minimize the impurity of the partitions. An approach to define impurity is via the entropy-based (statistical) information gain (referring to the distribution of class labels within a partition) G (k, k, k2) = H(k) ( p*h(k) + p2*h(k2) ) where: n k : # training data records in k n k,j : # training data records in k that belong to class j p = n k / n k and p2 = n k2 / n k nk, j nk, j H( k ) = log2 n n j k k
9 Example for Decision Tree for Text Classifica+on C: Algebra C2: Calculus C3: Stochastics f f2 f3 f4 f5 f6 f7 f8 d: d2: d3: d4: d5: d6: f2>0 Algebra f7> Stochastics Calculus G = H(k) ( 2/6*H(k) + 4/6*H(k2) ) H(k) = /3 log 3 + /3 log 3 + /3 log 3 H(k) = log H(k2) = 0 + /2 log 2 + /2 log 2 G = log 3 0 2/3*,6 0,66 = 0,94
10 Simple (Class- unspecific) Criteria for Feature Selec+on Document Frequency Thresholding: Consider for class Cj only terms ti that occur in at least δ training documents of Cj. Term Strength: For decision between classes C,..., Ck select (binary) features Xi with the highest value of s( X ) : P[ X occurs in doc d X occurs in similar doc d' i = i To this end the set of similar doc pairs (d, d ) is obtained by thresholding on pairwise similarity or by clustering/grouping the training docs. i + further possible criteria along these lines
11 Feature Selec+on Based on Informa+on Gain Information gain: For discriminating classes c,..., ck select the (binary) features Xi with the largest gain in entropy G( Xi ) = k P[ c j j= log2 P[ c j P[ P[ X i X i k P[ c j j= k P[ c j j= X i X i log2 log2 P[ c j P[ c j Xi Xi Generalization for non-binary features is straightforward..
12 Feature Selec+on Based on Mutual Informa+on Mutual information (Kullback-Leibler distance, relative entropy): for class cj select those (binary) features Xi with the largest value of MI( X i,c j ) = X { X i,x i } C { c j,c j } P[ X C log P[ X C P[ X P[ C and for discriminating classes c,..., ck: MI( X i ) = k j= P[ c j MI( X i, c j ) Generalization for non-binary features is straightforward
13 Condi(onal mutual informa(on Idea: construct the set of features itera+vely (e.g. by adding features one by one) For a new candidate, compute score s(n, k) which depends of ist individual discrimina+ve power, given class labels and previously chosen features. By taking the feature Xn with the maximum score s(n, k) we ensure that the new feature is both informa+ve and different than the preceding ones - at least in terms of predic+ng the class label Y. Instrument for construc+ng s(n, k): condi+onal mutual informa+on.
14 Comparing distribu(ons We are given: two probability distribu+ons P and Q. Ques+on: how (dis)similar are they to each other? Common example: distribu+ons of two documents over (some) latent topics, Expressed by means of mul+nomials.. We can compute MI (P,Q) or MI (Q,P) but they are both asymmetric Workaround: take as a dissimilarity measure between the value of DIS (P,Q) = DIS (Q,P) = 0.5 * MI (P,Q) * MI (Q,P) or JSD (P,Q) = JSD (Q,P) = 0.5 * MI (P, M) * MI (Q,M) M = 0.5 * (P+Q) JSD also known as Jensen- Shannon divergence Not yet a real distance (JS is not a proper metric), but indicator Becomes a metric with further regulariza+on (e.g. square root of JSD)..
Decision Trees Lecture 12
Decision Trees Lecture 12 David Sontag New York University Slides adapted from Luke Zettlemoyer, Carlos Guestrin, and Andrew Moore Machine Learning in the ER Physician documentation Triage Information
More information2018 CS420, Machine Learning, Lecture 5. Tree Models. Weinan Zhang Shanghai Jiao Tong University
2018 CS420, Machine Learning, Lecture 5 Tree Models Weinan Zhang Shanghai Jiao Tong University http://wnzhang.net http://wnzhang.net/teaching/cs420/index.html ML Task: Function Approximation Problem setting
More informationDecision Tree Learning
Decision Tree Learning Berlin Chen Department of Computer Science & Information Engineering National Taiwan Normal University References: 1. Machine Learning, Chapter 3 2. Data Mining: Concepts, Models,
More informationMachine Learning 3. week
Machine Learning 3. week Entropy Decision Trees ID3 C4.5 Classification and Regression Trees (CART) 1 What is Decision Tree As a short description, decision tree is a data classification procedure which
More informationCS145: INTRODUCTION TO DATA MINING
CS145: INTRODUCTION TO DATA MINING 4: Vector Data: Decision Tree Instructor: Yizhou Sun yzsun@cs.ucla.edu October 10, 2017 Methods to Learn Vector Data Set Data Sequence Data Text Data Classification Clustering
More informationCS 6140: Machine Learning Spring What We Learned Last Week 2/26/16
Logis@cs CS 6140: Machine Learning Spring 2016 Instructor: Lu Wang College of Computer and Informa@on Science Northeastern University Webpage: www.ccs.neu.edu/home/luwang Email: luwang@ccs.neu.edu Sign
More informationARBEITSPAPIERE ZUR MATHEMATISCHEN WIRTSCHAFTSFORSCHUNG
ARBEITSPAPIERE ZUR MATHEMATISCHEN WIRTSCHAFTSFORSCHUNG Some Remarks about the Usage of Asymmetric Correlation Measurements for the Induction of Decision Trees Andreas Hilbert Heft 180/2002 Institut für
More informationthe tree till a class assignment is reached
Decision Trees Decision Tree for Playing Tennis Prediction is done by sending the example down Prediction is done by sending the example down the tree till a class assignment is reached Definitions Internal
More informationClassification and Regression Trees
Classification and Regression Trees Ryan P Adams So far, we have primarily examined linear classifiers and regressors, and considered several different ways to train them When we ve found the linearity
More informationInformal Definition: Telling things apart
9. Decision Trees Informal Definition: Telling things apart 2 Nominal data No numeric feature vector Just a list or properties: Banana: longish, yellow Apple: round, medium sized, different colors like
More informationCse537 Ar*fficial Intelligence Short Review 1 for Midterm 2. Professor Anita Wasilewska Computer Science Department Stony Brook University
Cse537 Ar*fficial Intelligence Short Review 1 for Midterm 2 Professor Anita Wasilewska Computer Science Department Stony Brook University Data Mining Process Ques*ons: Describe and discuss all stages of
More informationLecture 7 Decision Tree Classifier
Machine Learning Dr.Ammar Mohammed Lecture 7 Decision Tree Classifier Decision Tree A decision tree is a simple classifier in the form of a hierarchical tree structure, which performs supervised classification
More informationData Mining Classification: Basic Concepts and Techniques. Lecture Notes for Chapter 3. Introduction to Data Mining, 2nd Edition
Data Mining Classification: Basic Concepts and Techniques Lecture Notes for Chapter 3 by Tan, Steinbach, Karpatne, Kumar 1 Classification: Definition Given a collection of records (training set ) Each
More informationDecision Tree Learning Lecture 2
Machine Learning Coms-4771 Decision Tree Learning Lecture 2 January 28, 2008 Two Types of Supervised Learning Problems (recap) Feature (input) space X, label (output) space Y. Unknown distribution D over
More informationCS 6375 Machine Learning
CS 6375 Machine Learning Decision Trees Instructor: Yang Liu 1 Supervised Classifier X 1 X 2. X M Ref class label 2 1 Three variables: Attribute 1: Hair = {blond, dark} Attribute 2: Height = {tall, short}
More informationPredictive Modeling: Classification. KSE 521 Topic 6 Mun Yi
Predictive Modeling: Classification Topic 6 Mun Yi Agenda Models and Induction Entropy and Information Gain Tree-Based Classifier Probability Estimation 2 Introduction Key concept of BI: Predictive modeling
More informationChapter ML:III. III. Decision Trees. Decision Trees Basics Impurity Functions Decision Tree Algorithms Decision Tree Pruning
Chapter ML:III III. Decision Trees Decision Trees Basics Impurity Functions Decision Tree Algorithms Decision Tree Pruning ML:III-34 Decision Trees STEIN/LETTMANN 2005-2017 Splitting Let t be a leaf node
More informationMachine Learning & Data Mining
Group M L D Machine Learning M & Data Mining Chapter 7 Decision Trees Xin-Shun Xu @ SDU School of Computer Science and Technology, Shandong University Top 10 Algorithm in DM #1: C4.5 #2: K-Means #3: SVM
More informationMultilayer Neural Networks
Multilayer Neural Networks Multilayer Neural Networks Discriminant function flexibility NON-Linear But with sets of linear parameters at each layer Provably general function approximators for sufficient
More informationLearning Decision Trees
Learning Decision Trees Machine Learning Fall 2018 Some slides from Tom Mitchell, Dan Roth and others 1 Key issues in machine learning Modeling How to formulate your problem as a machine learning problem?
More informationCS6375: Machine Learning Gautam Kunapuli. Decision Trees
Gautam Kunapuli Example: Restaurant Recommendation Example: Develop a model to recommend restaurants to users depending on their past dining experiences. Here, the features are cost (x ) and the user s
More informationDecision Tree Learning
Topics Decision Tree Learning Sattiraju Prabhakar CS898O: DTL Wichita State University What are decision trees? How do we use them? New Learning Task ID3 Algorithm Weka Demo C4.5 Algorithm Weka Demo Implementation
More informationFinal Exam, Machine Learning, Spring 2009
Name: Andrew ID: Final Exam, 10701 Machine Learning, Spring 2009 - The exam is open-book, open-notes, no electronics other than calculators. - The maximum possible score on this exam is 100. You have 3
More informationInduction of Decision Trees
Induction of Decision Trees Peter Waiganjo Wagacha This notes are for ICS320 Foundations of Learning and Adaptive Systems Institute of Computer Science University of Nairobi PO Box 30197, 00200 Nairobi.
More informationDecision Trees. Lewis Fishgold. (Material in these slides adapted from Ray Mooney's slides on Decision Trees)
Decision Trees Lewis Fishgold (Material in these slides adapted from Ray Mooney's slides on Decision Trees) Classification using Decision Trees Nodes test features, there is one branch for each value of
More informationC4.5 - pruning decision trees
C4.5 - pruning decision trees Quiz 1 Quiz 1 Q: Is a tree with only pure leafs always the best classifier you can have? A: No. Quiz 1 Q: Is a tree with only pure leafs always the best classifier you can
More informationFeature selection. c Victor Kitov August Summer school on Machine Learning in High Energy Physics in partnership with
Feature selection c Victor Kitov v.v.kitov@yandex.ru Summer school on Machine Learning in High Energy Physics in partnership with August 2015 1/38 Feature selection Feature selection is a process of selecting
More informationInformation Theory & Decision Trees
Information Theory & Decision Trees Jihoon ang Sogang University Email: yangjh@sogang.ac.kr Decision tree classifiers Decision tree representation for modeling dependencies among input variables using
More informationModern Information Retrieval
Modern Information Retrieval Chapter 8 Text Classification Introduction A Characterization of Text Classification Unsupervised Algorithms Supervised Algorithms Feature Selection or Dimensionality Reduction
More informationHoldout and Cross-Validation Methods Overfitting Avoidance
Holdout and Cross-Validation Methods Overfitting Avoidance Decision Trees Reduce error pruning Cost-complexity pruning Neural Networks Early stopping Adjusting Regularizers via Cross-Validation Nearest
More informationSymbolic methods in TC: Decision Trees
Symbolic methods in TC: Decision Trees ML for NLP Lecturer: Kevin Koidl Assist. Lecturer Alfredo Maldonado https://www.cs.tcd.ie/kevin.koidl/cs0/ kevin.koidl@scss.tcd.ie, maldonaa@tcd.ie 01-017 A symbolic
More informationDecision Tree And Random Forest
Decision Tree And Random Forest Dr. Ammar Mohammed Associate Professor of Computer Science ISSR, Cairo University PhD of CS ( Uni. Koblenz-Landau, Germany) Spring 2019 Contact: mailto: Ammar@cu.edu.eg
More informationMultilayer Neural Networks
Multilayer Neural Networks Introduction Goal: Classify objects by learning nonlinearity There are many problems for which linear discriminants are insufficient for minimum error In previous methods, the
More informationData Mining and Analysis: Fundamental Concepts and Algorithms
Data Mining and Analysis: Fundamental Concepts and Algorithms dataminingbook.info Mohammed J. Zaki 1 Wagner Meira Jr. 2 1 Department of Computer Science Rensselaer Polytechnic Institute, Troy, NY, USA
More informationGenerative MaxEnt Learning for Multiclass Classification
Generative Maximum Entropy Learning for Multiclass Classification A. Dukkipati, G. Pandey, D. Ghoshdastidar, P. Koley, D. M. V. S. Sriram Dept. of Computer Science and Automation Indian Institute of Science,
More informationbrainlinksystem.com $25+ / hr AI Decision Tree Learning Part I Outline Learning 11/9/2010 Carnegie Mellon
I Decision Tree Learning Part I brainlinksystem.com $25+ / hr Illah Nourbakhsh s version Chapter 8, Russell and Norvig Thanks to all past instructors Carnegie Mellon Outline Learning and philosophy Induction
More informationData Mining Classification: Basic Concepts, Decision Trees, and Model Evaluation
Data Mining Classification: Basic Concepts, Decision Trees, and Model Evaluation Lecture Notes for Chapter 4 Part I Introduction to Data Mining by Tan, Steinbach, Kumar Adapted by Qiang Yang (2010) Tan,Steinbach,
More informationClassification: Decision Trees
Classification: Decision Trees These slides were assembled by Byron Boots, with grateful acknowledgement to Eric Eaton and the many others who made their course materials freely available online. Feel
More informationCHAPTER-17. Decision Tree Induction
CHAPTER-17 Decision Tree Induction 17.1 Introduction 17.2 Attribute selection measure 17.3 Tree Pruning 17.4 Extracting Classification Rules from Decision Trees 17.5 Bayesian Classification 17.6 Bayes
More information3. If a choice is broken down into two successive choices, the original H should be the weighted sum of the individual values of H.
Appendix A Information Theory A.1 Entropy Shannon (Shanon, 1948) developed the concept of entropy to measure the uncertainty of a discrete random variable. Suppose X is a discrete random variable that
More informationClassification Using Decision Trees
Classification Using Decision Trees 1. Introduction Data mining term is mainly used for the specific set of six activities namely Classification, Estimation, Prediction, Affinity grouping or Association
More informationRandomized Decision Trees
Randomized Decision Trees compiled by Alvin Wan from Professor Jitendra Malik s lecture Discrete Variables First, let us consider some terminology. We have primarily been dealing with real-valued data,
More informationCS 6140: Machine Learning Spring 2016
CS 6140: Machine Learning Spring 2016 Instructor: Lu Wang College of Computer and Informa?on Science Northeastern University Webpage: www.ccs.neu.edu/home/luwang Email: luwang@ccs.neu.edu Logis?cs Assignment
More informationCS 6140: Machine Learning Spring What We Learned Last Week. Survey 2/26/16. VS. Model
Logis@cs CS 6140: Machine Learning Spring 2016 Instructor: Lu Wang College of Computer and Informa@on Science Northeastern University Webpage: www.ccs.neu.edu/home/luwang Email: luwang@ccs.neu.edu Assignment
More informationDecision Tree Learning
Topics Decision Tree Learning Sattiraju Prabhakar CS898O: DTL Wichita State University What are decision trees? How do we use them? New Learning Task ID3 Algorithm Weka Demo C4.5 Algorithm Weka Demo Implementation
More informationCS-E4830 Kernel Methods in Machine Learning
CS-E4830 Kernel Methods in Machine Learning Lecture 5: Multi-class and preference learning Juho Rousu 11. October, 2017 Juho Rousu 11. October, 2017 1 / 37 Agenda from now on: This week s theme: going
More informationCSE 417T: Introduction to Machine Learning. Final Review. Henry Chai 12/4/18
CSE 417T: Introduction to Machine Learning Final Review Henry Chai 12/4/18 Overfitting Overfitting is fitting the training data more than is warranted Fitting noise rather than signal 2 Estimating! "#$
More informationPrinciples of Pattern Recognition. C. A. Murthy Machine Intelligence Unit Indian Statistical Institute Kolkata
Principles of Pattern Recognition C. A. Murthy Machine Intelligence Unit Indian Statistical Institute Kolkata e-mail: murthy@isical.ac.in Pattern Recognition Measurement Space > Feature Space >Decision
More informationArtificial Intelligence Decision Trees
Artificial Intelligence Decision Trees Andrea Torsello Decision Trees Complex decisions can often be expressed in terms of a series of questions: What to do this Weekend? If my parents are visiting We
More informationDecision Trees. Tirgul 5
Decision Trees Tirgul 5 Using Decision Trees It could be difficult to decide which pet is right for you. We ll find a nice algorithm to help us decide what to choose without having to think about it. 2
More informationClassification: Decision Trees
Classification: Decision Trees Outline Top-Down Decision Tree Construction Choosing the Splitting Attribute Information Gain and Gain Ratio 2 DECISION TREE An internal node is a test on an attribute. A
More informationDay 3: Classification, logistic regression
Day 3: Classification, logistic regression Introduction to Machine Learning Summer School June 18, 2018 - June 29, 2018, Chicago Instructor: Suriya Gunasekar, TTI Chicago 20 June 2018 Topics so far Supervised
More informationStatistical Theory 1
Statistical Theory 1 Set Theory and Probability Paolo Bautista September 12, 2017 Set Theory We start by defining terms in Set Theory which will be used in the following sections. Definition 1 A set is
More informationLecture 3: Decision Trees
Lecture 3: Decision Trees Cognitive Systems - Machine Learning Part I: Basic Approaches of Concept Learning ID3, Information Gain, Overfitting, Pruning last change November 26, 2014 Ute Schmid (CogSys,
More informationDecision Trees. CS57300 Data Mining Fall Instructor: Bruno Ribeiro
Decision Trees CS57300 Data Mining Fall 2016 Instructor: Bruno Ribeiro Goal } Classification without Models Well, partially without a model } Today: Decision Trees 2015 Bruno Ribeiro 2 3 Why Trees? } interpretable/intuitive,
More informationRule Generation using Decision Trees
Rule Generation using Decision Trees Dr. Rajni Jain 1. Introduction A DT is a classification scheme which generates a tree and a set of rules, representing the model of different classes, from a given
More informationMachine Learning 2nd Edi7on
Lecture Slides for INTRODUCTION TO Machine Learning 2nd Edi7on CHAPTER 9: Decision Trees ETHEM ALPAYDIN The MIT Press, 2010 Edited and expanded for CS 4641 by Chris Simpkins alpaydin@boun.edu.tr h1p://www.cmpe.boun.edu.tr/~ethem/i2ml2e
More informationIs cross-validation valid for small-sample microarray classification?
Is cross-validation valid for small-sample microarray classification? Braga-Neto et al. Bioinformatics 2004 Topics in Bioinformatics Presented by Lei Xu October 26, 2004 1 Review 1) Statistical framework
More informationJeffrey D. Ullman Stanford University
Jeffrey D. Ullman Stanford University 3 We are given a set of training examples, consisting of input-output pairs (x,y), where: 1. x is an item of the type we want to evaluate. 2. y is the value of some
More informationDecision Trees.
. Machine Learning Decision Trees Prof. Dr. Martin Riedmiller AG Maschinelles Lernen und Natürlichsprachliche Systeme Institut für Informatik Technische Fakultät Albert-Ludwigs-Universität Freiburg riedmiller@informatik.uni-freiburg.de
More information26 Chapter 4 Classification
26 Chapter 4 Classification The preceding tree cannot be simplified. 2. Consider the training examples shown in Table 4.1 for a binary classification problem. Table 4.1. Data set for Exercise 2. Customer
More informationTutorial 6. By:Aashmeet Kalra
Tutorial 6 By:Aashmeet Kalra AGENDA Candidate Elimination Algorithm Example Demo of Candidate Elimination Algorithm Decision Trees Example Demo of Decision Trees Concept and Concept Learning A Concept
More informationLecture 3: Decision Trees
Lecture 3: Decision Trees Cognitive Systems II - Machine Learning SS 2005 Part I: Basic Approaches of Concept Learning ID3, Information Gain, Overfitting, Pruning Lecture 3: Decision Trees p. Decision
More information.. Cal Poly CSC 466: Knowledge Discovery from Data Alexander Dekhtyar.. for each element of the dataset we are given its class label.
.. Cal Poly CSC 466: Knowledge Discovery from Data Alexander Dekhtyar.. Data Mining: Classification/Supervised Learning Definitions Data. Consider a set A = {A 1,...,A n } of attributes, and an additional
More informationDecision Trees. Machine Learning 10701/15781 Carlos Guestrin Carnegie Mellon University. February 5 th, Carlos Guestrin 1
Decision Trees Machine Learning 10701/15781 Carlos Guestrin Carnegie Mellon University February 5 th, 2007 2005-2007 Carlos Guestrin 1 Linear separability A dataset is linearly separable iff 9 a separating
More informationDecision Trees / NLP Introduction
Decision Trees / NLP Introduction Dr. Kevin Koidl School of Computer Science and Statistic Trinity College Dublin ADAPT Research Centre The ADAPT Centre is funded under the SFI Research Centres Programme
More informationDecision Tree Analysis for Classification Problems. Entscheidungsunterstützungssysteme SS 18
Decision Tree Analysis for Classification Problems Entscheidungsunterstützungssysteme SS 18 Supervised segmentation An intuitive way of thinking about extracting patterns from data in a supervised manner
More informationDecision trees COMS 4771
Decision trees COMS 4771 1. Prediction functions (again) Learning prediction functions IID model for supervised learning: (X 1, Y 1),..., (X n, Y n), (X, Y ) are iid random pairs (i.e., labeled examples).
More informationDecision Support. Dr. Johan Hagelbäck.
Decision Support Dr. Johan Hagelbäck johan.hagelback@lnu.se http://aiguy.org Decision Support One of the earliest AI problems was decision support The first solution to this problem was expert systems
More informationIntelligent Data Analysis Lecture Notes on Document Mining
Intelligent Data Analysis Lecture Notes on Document Mining Peter Tiňo Representing Textual Documents as Vectors Our next topic will take us to seemingly very different data spaces - those of textual documents.
More informationDecision Trees. Machine Learning CSEP546 Carlos Guestrin University of Washington. February 3, 2014
Decision Trees Machine Learning CSEP546 Carlos Guestrin University of Washington February 3, 2014 17 Linear separability n A dataset is linearly separable iff there exists a separating hyperplane: Exists
More informationEmpirical Risk Minimization, Model Selection, and Model Assessment
Empirical Risk Minimization, Model Selection, and Model Assessment CS6780 Advanced Machine Learning Spring 2015 Thorsten Joachims Cornell University Reading: Murphy 5.7-5.7.2.4, 6.5-6.5.3.1 Dietterich,
More informationGraphical Models. Lecture 3: Local Condi6onal Probability Distribu6ons. Andrew McCallum
Graphical Models Lecture 3: Local Condi6onal Probability Distribu6ons Andrew McCallum mccallum@cs.umass.edu Thanks to Noah Smith and Carlos Guestrin for some slide materials. 1 Condi6onal Probability Distribu6ons
More informationLinear Discrimination Functions
Laurea Magistrale in Informatica Nicola Fanizzi Dipartimento di Informatica Università degli Studi di Bari November 4, 2009 Outline Linear models Gradient descent Perceptron Minimum square error approach
More informationDecision Trees. Nicholas Ruozzi University of Texas at Dallas. Based on the slides of Vibhav Gogate and David Sontag
Decision Trees Nicholas Ruozzi University of Texas at Dallas Based on the slides of Vibhav Gogate and David Sontag Supervised Learning Input: labelled training data i.e., data plus desired output Assumption:
More informationECE521 Lectures 9 Fully Connected Neural Networks
ECE521 Lectures 9 Fully Connected Neural Networks Outline Multi-class classification Learning multi-layer neural networks 2 Measuring distance in probability space We learnt that the squared L2 distance
More informationDan Roth 461C, 3401 Walnut
CIS 519/419 Applied Machine Learning www.seas.upenn.edu/~cis519 Dan Roth danroth@seas.upenn.edu http://www.cis.upenn.edu/~danroth/ 461C, 3401 Walnut Slides were created by Dan Roth (for CIS519/419 at Penn
More informationQuestion of the Day. Machine Learning 2D1431. Decision Tree for PlayTennis. Outline. Lecture 4: Decision Tree Learning
Question of the Day Machine Learning 2D1431 How can you make the following equation true by drawing only one straight line? 5 + 5 + 5 = 550 Lecture 4: Decision Tree Learning Outline Decision Tree for PlayTennis
More informationModern Information Retrieval
Text Classification, Modern Information Retrieval, Addison Wesley, 2009 p. 1/141 Modern Information Retrieval Chapter 5 Text Classification Introduction A Characterization of Text Classification Algorithms
More informationDECISION TREE LEARNING. [read Chapter 3] [recommended exercises 3.1, 3.4]
1 DECISION TREE LEARNING [read Chapter 3] [recommended exercises 3.1, 3.4] Decision tree representation ID3 learning algorithm Entropy, Information gain Overfitting Decision Tree 2 Representation: Tree-structured
More informationDecision Tree. Decision Tree Learning. c4.5. Example
Decision ree Decision ree Learning s of systems that learn decision trees: c4., CLS, IDR, ASSISA, ID, CAR, ID. Suitable problems: Instances are described by attribute-value couples he target function has
More informationday month year documentname/initials 1
ECE471-571 Pattern Recognition Lecture 13 Decision Tree Hairong Qi, Gonzalez Family Professor Electrical Engineering and Computer Science University of Tennessee, Knoxville http://www.eecs.utk.edu/faculty/qi
More informationDecision Trees.
. Machine Learning Decision Trees Prof. Dr. Martin Riedmiller AG Maschinelles Lernen und Natürlichsprachliche Systeme Institut für Informatik Technische Fakultät Albert-Ludwigs-Universität Freiburg riedmiller@informatik.uni-freiburg.de
More informationMachine Learning and Data Mining. Decision Trees. Prof. Alexander Ihler
+ Machine Learning and Data Mining Decision Trees Prof. Alexander Ihler Decision trees Func-onal form f(x;µ): nested if-then-else statements Discrete features: fully expressive (any func-on) Structure:
More informationIntroduction to ML. Two examples of Learners: Naïve Bayesian Classifiers Decision Trees
Introduction to ML Two examples of Learners: Naïve Bayesian Classifiers Decision Trees Why Bayesian learning? Probabilistic learning: Calculate explicit probabilities for hypothesis, among the most practical
More informationInformation Theory, Statistics, and Decision Trees
Information Theory, Statistics, and Decision Trees Léon Bottou COS 424 4/6/2010 Summary 1. Basic information theory. 2. Decision trees. 3. Information theory and statistics. Léon Bottou 2/31 COS 424 4/6/2010
More informationClassification and Prediction
Classification Classification and Prediction Classification: predict categorical class labels Build a model for a set of classes/concepts Classify loan applications (approve/decline) Prediction: model
More informationRestricted Boltzmann Machines
Restricted Boltzmann Machines Boltzmann Machine(BM) A Boltzmann machine extends a stochastic Hopfield network to include hidden units. It has binary (0 or 1) visible vector unit x and hidden (latent) vector
More informationGeneralization to Multi-Class and Continuous Responses. STA Data Mining I
Generalization to Multi-Class and Continuous Responses STA 5703 - Data Mining I 1. Categorical Responses (a) Splitting Criterion Outline Goodness-of-split Criterion Chi-square Tests and Twoing Rule (b)
More informationData Mining. CS57300 Purdue University. Bruno Ribeiro. February 8, 2018
Data Mining CS57300 Purdue University Bruno Ribeiro February 8, 2018 Decision trees Why Trees? interpretable/intuitive, popular in medical applications because they mimic the way a doctor thinks model
More informationPROBABILITY AND INFORMATION THEORY. Dr. Gjergji Kasneci Introduction to Information Retrieval WS
PROBABILITY AND INFORMATION THEORY Dr. Gjergji Kasneci Introduction to Information Retrieval WS 2012-13 1 Outline Intro Basics of probability and information theory Probability space Rules of probability
More informationLearning Decision Trees
Learning Decision Trees Machine Learning Spring 2018 1 This lecture: Learning Decision Trees 1. Representation: What are decision trees? 2. Algorithm: Learning decision trees The ID3 algorithm: A greedy
More informationDissimilarity, Quasimetric, Metric
Dissimilarity, Quasimetric, Metric Ehtibar N. Dzhafarov Purdue University and Swedish Collegium for Advanced Study Abstract Dissimilarity is a function that assigns to every pair of stimuli a nonnegative
More informationPATTERN CLASSIFICATION
PATTERN CLASSIFICATION Second Edition Richard O. Duda Peter E. Hart David G. Stork A Wiley-lnterscience Publication JOHN WILEY & SONS, INC. New York Chichester Weinheim Brisbane Singapore Toronto CONTENTS
More informationUVA CS 4501: Machine Learning
UVA CS 4501: Machine Learning Lecture 21: Decision Tree / Random Forest / Ensemble Dr. Yanjun Qi University of Virginia Department of Computer Science Where are we? è Five major sections of this course
More information1 Handling of Continuous Attributes in C4.5. Algorithm
.. Spring 2009 CSC 466: Knowledge Discovery from Data Alexander Dekhtyar.. Data Mining: Classification/Supervised Learning Potpourri Contents 1. C4.5. and continuous attributes: incorporating continuous
More informationLecture VII: Classification I. Dr. Ouiem Bchir
Lecture VII: Classification I Dr. Ouiem Bchir 1 Classification: Definition Given a collection of records (training set ) Each record contains a set of attributes, one of the attributes is the class. Find
More informationLecture 7: DecisionTrees
Lecture 7: DecisionTrees What are decision trees? Brief interlude on information theory Decision tree construction Overfitting avoidance Regression trees COMP-652, Lecture 7 - September 28, 2009 1 Recall:
More informationGenerative and Discriminative Approaches to Graphical Models CMSC Topics in AI
Generative and Discriminative Approaches to Graphical Models CMSC 35900 Topics in AI Lecture 2 Yasemin Altun January 26, 2007 Review of Inference on Graphical Models Elimination algorithm finds single
More informationMachine Learning on temporal data
Machine Learning on temporal data Classification rees for ime Series Ahlame Douzal (Ahlame.Douzal@imag.fr) AMA, LIG, Université Joseph Fourier Master 2R - MOSIG (2011) Plan ime Series classification approaches
More information