Incremental Construction of Complex Aggregates: Counting over a Secondary Table
Clément Charnay, Nicolas Lachiche, and Agnès Braud

ICube, Université de Strasbourg, CNRS
300 Bd Sébastien Brant - CS 10413, F Illkirch Cedex
{charnay,nicolas.lachiche,agnes.braud}@unistra.fr

Abstract. In this paper, we discuss the integration of complex aggregates in the construction of logical decision trees. We review the use of complex aggregates in TILDE, which is based on an exhaustive search of the complex aggregate space. As opposed to such a combinatorial search, we introduce a hill-climbing approach to build complex aggregates incrementally.

1 Introduction and Context

Relational data mining deals with data represented by several tables. We focus on the typical setting where one table, the primary table, contains the target column, i.e. the attribute whose value is to be predicted, and has a one-to-many relationship with a secondary table. A possible way of handling such relationships is to use complex aggregates, i.e. to aggregate the objects of the secondary table which meet a given condition, using an aggregate function on the objects themselves (the count function) or on a numerical attribute of the objects (e.g. the max and average functions). For instance, we may want to classify molecules. Molecules have atoms. They can be represented as a table of molecules and a table of atoms, with a foreign key in the table of atoms indicating the molecule each atom belongs to. Then, the class of a molecule may depend on the comparison between the average charge of the carbon atoms of the molecule and some threshold value. This example also shows what a complex aggregate relies on: an aggregate function (here the average), a condition to select the objects to aggregate (here we select only the carbon atoms), the attribute to aggregate on (here the charge of the atoms), and an operator and a threshold for a comparison with the result of the aggregation. Previous work showed that the expressivity of complex
aggregates can be useful to solve problems such as urban blocks classification [1,2], where complex aggregates were introduced in propositionalisation. But their use increases the size of the feature space too much, and they cannot be fully handled there. This is the reason why we focus on introducing them in the learning step. To our knowledge, only one relational learner, TILDE [3], has implemented complex aggregates, but its exhaustive approach is not adapted to overly complex problems. The main motivation for our work is to elaborate new heuristics to handle complex aggregates in
relational models. This article presents a logical decision tree learner which uses complex aggregates to deal with secondary tables, using a hill-climbing heuristic to build them incrementally. We introduce this heuristic in the context of logical decision trees, but it could be applied to other approaches. In this article, we focus on counting over a secondary table, i.e. the aggregate function considered will be the count function. The rest of this paper is organized as follows. In Sect. 2, we review the use of complex aggregates in TILDE. In Sect. 3, we describe our heuristic to explore the complex aggregate space. Finally, in Sect. 4, we detail future work.

2 TILDE and Complex Aggregates

TILDE [3] is a first-order decision tree learner. It uses a top-down approach to choose, node by node, from root to leaves, the best refinement to split the training examples according to their class values, using gain ratio as a metric to guide the search. TILDE relies on a language bias: the user specifies the literals which can be added to the conjunction at a node. In this relational context, to deal with secondary tables, the initial version of TILDE introduces new variables from these secondary tables using an existential quantifier. TILDE has then been extended [4] to allow the use of complex aggregates, and a heuristic has been developed to explore the search space [5]. This heuristic is based on the idea of a refinement cube, where the refinement spaces for the aggregate condition, the aggregate function and the threshold are the dimensions of the cube. This cube is explored in a general-to-specific way, using monotone paths along the different dimensions: when a complex aggregate (a point in the refinement cube) is too specific (i.e. it fails for every training example), the search does not restart from this point. However, the implementation does not allow more than two conjuncts in the aggregate query due to memory problems [6, p. 32], which limits
the search space. Numerical attributes are handled by a comparison to a threshold in problems such as geographical ones. It is possible to discretize numerical attributes beforehand and to define predicates to make those comparisons between the values of the attributes and the thresholds given by the discretization. Nevertheless, enumerating all the combinations of thresholds over all the numerical attributes takes space in memory, and hence the approach is not tractable. To summarize, this combinatorial approach to handling complex aggregates in TILDE has limitations. We intend to overcome these limitations by not trying to explore the search space exhaustively, but by finding a heuristic to explore the search space and to build complex aggregates incrementally.

3 Incremental Construction of Complex Aggregates with Hill-Climbing

Our goal is to build a logical decision tree, like TILDE, which uses complex aggregates to deal with secondary tables. To explore the refinement cube of complex aggregates, we choose to use a hill-climbing method. This section details the method. We use the general notation function(condition) {operator} threshold to refer to a complex aggregate.

3.1 Refinement of Complex Aggregates

To find the best aggregate, a hill-climbing is performed from a starting aggregate. In this paper, we limit ourselves to the count function to aggregate secondary tables. The starting conditions are detailed below. We then try to refine the starting aggregate with hill-climbing, allowing a non-strict climbing: if there is no strictly improving refinement, we allow picking a refinement with the same score as in the previous step, provided it has not been visited before. To achieve that, we store all previous refinements chosen along the hill-climbing path. From the current aggregate, there are several ways to modify it. Firstly, given the examples, we compute the possible results of the aggregate function, which will serve as possible thresholds. To these values, we add one threshold depending on the operator: strictly lower than the other values if the operator is ≤ (so that the complex aggregate is true for none of the examples) and strictly higher if the operator is ≥. Such thresholds are chosen to be respectively the opposite of MAX_DOUBLE and MAX_DOUBLE itself. The current threshold is then set to the closest possible threshold if it is lower than the minimum or higher than the maximum of the possible thresholds. For instance, if the current refinement to try is "the count of atoms in the molecule is less than or equal to MAX_DOUBLE" and, in the training set, there are between 13 and 42 atoms in a molecule, the threshold will immediately be set to 42. Then, we can refine the aggregate by increasing or decreasing the threshold (among the possible thresholds). Other possibilities are to remove a literal from the aggregate condition, or to add one.

The Starting Conditions. Given this, if we refer to the
maximum threshold as max, two starting conditions will be count(true) ≤ max and count(true) ≥ MAX_DOUBLE. From the point of view of the examples considered, they are opposite: if one succeeds for an example, the other will fail. The former is the most general (i.e. it succeeds for every example), the latter the most specific. We then observe that refinements for one will have the same effect on information gain as for the other (the final branch of the examples will be inverted between both), so on the training set they are equivalent, and will be refined equivalently. In the end, we get two conditions count(condition) ≤ some_threshold and count(condition) ≥ next_threshold which are opposite. To conclude on this point, considering both starting conditions is not necessary, since they will be refined following similar paths and will yield the same information gain after the hill-climbing process. Of course, the same reasoning applies for the starting conditions count(true) ≤ −MAX_DOUBLE and count(true) ≥ min; this is the
reason why we consider only two starting conditions, arbitrarily those with the ≤ operator.

Moving in the Complex Aggregate Space. We now discuss our method for refining the aggregate. If we add a literal to the aggregate condition, it will yield a specialization, and fewer objects will be selected. The threshold range discussed above will not be the same, and the current threshold, associated with the previous, more general condition, will not be relevant since it may be too high. Hence, the refinement will yield a poor gain ratio and will not be chosen. For instance, suppose that in the training set there are between 13 and 42 atoms in a molecule, but only between 5 and 20 carbon atoms. If the current aggregate states that the count of atoms is less than or equal to 30, and we try to refine it to the count of carbon atoms being less than or equal to 30, this last aggregate will be true for every example, yielding zero gain, and hence will not be chosen. Of course, the problem will be similar the other way around, i.e. if we drop a literal from the aggregate condition, yielding a generalization. To avoid modifying the aggregate condition without modifying the threshold, we proceed as follows. When modifying the aggregate condition, we consider the number n1 of possible thresholds given by the current aggregate condition, and the number n2 of thresholds given by the next (after modification) aggregate condition. We sort those thresholds in increasing order, such that the current threshold is in position c1, with indices going from 0 to n1 − 1. The next threshold, in position c2 between 0 and n2 − 1, will be picked such that c2 / (n2 − 1) is closest to c1 / (n1 − 1). Mathematically: c2 = round(c1 (n2 − 1) / (n1 − 1)). Since we add a threshold to a list which already contains at least one element, there are always at least two possible thresholds, and hence the case n1 = 1 is not an issue.

3.2 Dealing with Empty Sets

We finally discuss a problem that
will occur with aggregate functions other than count: the computation of a value for empty sets. Indeed, an aggregate condition might select no object, and aggregation over a numerical attribute is not possible in this case. For instance, how can the mean of the charge of the oxygen atoms in a molecule be computed when the molecule does not have any oxygen atom? Only the count function can deal in a natural way with empty sets, while another solution has to be chosen for numerical aggregate functions. Some possibilities to deal with this issue have been discussed in [7]:

- Fixing an arbitrary value as the result.
- Using a value depending on the aggregate condition, as close as possible to the values for examples for which the aggregate condition does not result in an empty set, or as far as possible.
- Failing the aggregate when the aggregate function cannot be applied.
- Discarding the aggregate from being chosen as a refinement if the aggregate function cannot be applied for at least one example.
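To make the empty-set issue concrete, here is a minimal Python sketch (our own illustration, not code from any implementation discussed here; the helper names eval_aggregate, count_fn and avg_charge are hypothetical). It evaluates a complex aggregate function(condition){operator}threshold on the atoms of one molecule, and implements the third option above: failing the aggregate when the function cannot be applied to an empty selection.

```python
def count_fn(objs):
    # count handles empty sets naturally: the count of an empty set is 0
    return len(objs)

def avg_charge(objs):
    # undefined on an empty selection (division by zero)
    return sum(a["charge"] for a in objs) / len(objs)

def eval_aggregate(atoms, condition, aggregate, operator, threshold):
    """Evaluate aggregate(atoms satisfying condition) {operator} threshold.

    Implements option 3 above: the aggregate "fails" (returns False) when
    the condition selects no object and the function cannot be applied.
    """
    selected = [a for a in atoms if condition(a)]
    if not selected and aggregate is not count_fn:
        return False  # empty set: numerical aggregate cannot be applied
    return operator(aggregate(selected), threshold)

# atoms of one molecule (hypothetical data)
molecule = [
    {"element": "c", "charge": 0.8},
    {"element": "c", "charge": 0.4},
    {"element": "h", "charge": 0.1},
]
is_carbon = lambda a: a["element"] == "c"
is_oxygen = lambda a: a["element"] == "o"
ge = lambda x, y: x >= y

print(eval_aggregate(molecule, is_carbon, avg_charge, ge, 0.5))  # True: avg = 0.6
print(eval_aggregate(molecule, is_oxygen, avg_charge, ge, 0.5))  # False: empty set, aggregate fails
print(eval_aggregate(molecule, is_oxygen, count_fn, ge, 1))      # False: count of an empty set is 0
```

Note how the failure of the avg aggregate on the empty set conflates two different situations, exactly the objection raised below: "no oxygen atom at all" and "oxygen atoms with too low an average charge" both yield False.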
In our opinion, the first two options are not easy to apply for every function. Indeed, for the functions min or max, one can choose values as low or as high as possible, so that the aggregate always succeeds or always fails when the function cannot be applied directly. But for the average function, using a fixed value such as 0 is not relevant if the attribute can take both positive and negative values, and neither is choosing positive or negative infinity. Moreover, the fact that the set to aggregate is empty can be meaningful, and by assigning a result to the aggregate function we lose this significance. The third option also gives the empty set a meaning we do not necessarily want it to have: the failure of the aggregate should mean that the inequality between the result of the aggregation and the threshold is wrong. However, this is not the meaning of the empty set. Actually, it is implied by the failure of the existential quantifier for the aggregate condition, i.e. count(condition) ≥ 1 fails. We then see two ways to address the issue of empty sets. The first is to consider them as a third branch in our decision trees: since they correspond neither to a success nor to a failure of the inequality, they are a third possibility. However, this option breaks the binary structure of the tree, which is not necessarily a problem. Nevertheless, we present another option to preserve the binary tree structure: the principle is to create a node with the count(condition) ≥ 1 aggregate before adding a node with an aggregate whose function may not be applicable. In the left branch, we can then add the aggregate function(condition) {operator} threshold without the empty-set problem, since we know from the parent node that condition will select at least one object. This adapts the three-branch idea to preserve the binary tree structure, using two nodes instead of one. An example is shown in Fig. 1, where the aggregate the average charge of the carbon atoms in the molecule is
greater than or equal to 0.542 is protected by the existential quantifier which tests the presence of at least one carbon atom in the molecule, i.e. the aggregate the count of carbon atoms in the molecule is greater than or equal to 1. If the latter succeeds, then the former can be evaluated, because it is meaningful to compute the average value of a non-empty set. If the existential quantifier fails, then the average is not computed, because it would be meaningless.

4 Conclusion and Future Work

The method described in Sect. 3 is implemented and will be evaluated. The next step in this work is to consider other aggregate functions, on numerical attributes of the secondary objects, and to allow the aggregate function to change during the refining process, by taking advantage of the ordering of the functions as in [5]. Then, another step will be to allow recursion in the aggregates, i.e. to create complex aggregates which have complex aggregates in their aggregate condition, in order to use the whole database. However, this will inevitably raise new issues, since it will add further levels of refinement. A more complex problem will be to handle many-to-many relationships, since the complex aggregates can
be formed both ways with such relationships, which can possibly lead to loops in the recursion discussed above.

[Fig. 1. Example of three-branch structure to deal with empty sets. The root node tests count(b, (atom(a,b), atom_type(b,carbon)), Res1), Res1 ≥ 1; its true branch tests avg(c, (atom(a,b), atom_type(b,carbon), atom_charge(a,c)), Res2), Res2 ≥ 0.542, each node with true and false branches.]

References

1. El Jelali, S., Braud, A., Lachiche, N.: Propositionalisation of continuous attributes beyond simple aggregation. In: Riguzzi, F., Železný, F. (eds.): ILP. Volume 7842 of Lecture Notes in Computer Science, Springer (2012)
2. Puissant, A., Lachiche, N., Skupinski, G., Braud, A., Perret, J., Mas, A.: Classification et évolution des tissus urbains à partir de données vectorielles. Revue Internationale de Géomatique 21(4) (2011)
3. Blockeel, H., De Raedt, L.: Top-down induction of first-order logical decision trees. Artif. Intell. 101(1-2) (1998)
4. Van Assche, A., Vens, C., Blockeel, H., Džeroski, S.: First order random forests: Learning relational classifiers with complex aggregates. Machine Learning 64(1-3) (2006)
5. Vens, C., Ramon, J., Blockeel, H.: Refining aggregate conditions in relational learning. In: Fürnkranz, J., Scheffer, T., Spiliopoulou, M. (eds.): PKDD. Volume 4213 of Lecture Notes in Computer Science, Springer (2006)
6. Blockeel, H., Dehaspe, L., Ramon, J., Struyf, J., Van Assche, A., Vens, C., Fierens, D.: The ACE Data Mining System (March 2009)
7. Vens, C.: Complex aggregates in relational learning. PhD thesis, Informatics Section, Department of Computer Science, Faculty of Engineering Science (March 2007). Blockeel, Hendrik (supervisor)
More informationLogic: Bottom-up & Top-down proof procedures
Logic: Bottom-up & Top-down proof procedures Alan Mackworth UBC CS 322 Logic 3 March 4, 2013 P & M Textbook 5.2 Lecture Overview Recap: Soundness, Completeness, Bottom-up proof procedure Bottom-up Proof
More informationDecision trees. Special Course in Computer and Information Science II. Adam Gyenge Helsinki University of Technology
Decision trees Special Course in Computer and Information Science II Adam Gyenge Helsinki University of Technology 6.2.2008 Introduction Outline: Definition of decision trees ID3 Pruning methods Bibliography:
More informationLecture 7 Decision Tree Classifier
Machine Learning Dr.Ammar Mohammed Lecture 7 Decision Tree Classifier Decision Tree A decision tree is a simple classifier in the form of a hierarchical tree structure, which performs supervised classification
More informationIntroduction to Logic in Computer Science: Autumn 2006
Introduction to Logic in Computer Science: Autumn 2006 Ulle Endriss Institute for Logic, Language and Computation University of Amsterdam Ulle Endriss 1 Plan for Today Today s class will be an introduction
More informationLecture 1: The arithmetic hierarchy
MODEL THEORY OF ARITHMETIC Lecture 1: The arithmetic hierarchy Tin Lok Wong 8 October, 2014 [These theorems] go a long way to explaining why recursion theory is relevant to the study of models of arithmetic.
More informationDesign of Distributed Systems Melinda Tóth, Zoltán Horváth
Design of Distributed Systems Melinda Tóth, Zoltán Horváth Design of Distributed Systems Melinda Tóth, Zoltán Horváth Publication date 2014 Copyright 2014 Melinda Tóth, Zoltán Horváth Supported by TÁMOP-412A/1-11/1-2011-0052
More informationClassification Based on Logical Concept Analysis
Classification Based on Logical Concept Analysis Yan Zhao and Yiyu Yao Department of Computer Science, University of Regina, Regina, Saskatchewan, Canada S4S 0A2 E-mail: {yanzhao, yyao}@cs.uregina.ca Abstract.
More informationMarkov Decision Processes
Markov Decision Processes Lecture notes for the course Games on Graphs B. Srivathsan Chennai Mathematical Institute, India 1 Markov Chains We will define Markov chains in a manner that will be useful to
More informationClassification and Regression Trees
Classification and Regression Trees Ryan P Adams So far, we have primarily examined linear classifiers and regressors, and considered several different ways to train them When we ve found the linearity
More informationNonlinear Classification
Nonlinear Classification INFO-4604, Applied Machine Learning University of Colorado Boulder October 5-10, 2017 Prof. Michael Paul Linear Classification Most classifiers we ve seen use linear functions
More informationMachine Learning: Exercise Sheet 2
Machine Learning: Exercise Sheet 2 Manuel Blum AG Maschinelles Lernen und Natürlichsprachliche Systeme Albert-Ludwigs-Universität Freiburg mblum@informatik.uni-freiburg.de Manuel Blum Machine Learning
More informationData Mining and Knowledge Discovery: Practice Notes
Data Mining and Knowledge Discovery: Practice Notes dr. Petra Kralj Novak Petra.Kralj.Novak@ijs.si 7.11.2017 1 Course Prof. Bojan Cestnik Data preparation Prof. Nada Lavrač: Data mining overview Advanced
More informationDecision Tree And Random Forest
Decision Tree And Random Forest Dr. Ammar Mohammed Associate Professor of Computer Science ISSR, Cairo University PhD of CS ( Uni. Koblenz-Landau, Germany) Spring 2019 Contact: mailto: Ammar@cu.edu.eg
More informationLinear Classifiers: Expressiveness
Linear Classifiers: Expressiveness Machine Learning Spring 2018 The slides are mainly from Vivek Srikumar 1 Lecture outline Linear classifiers: Introduction What functions do linear classifiers express?
More informationIC3 and Beyond: Incremental, Inductive Verification
IC3 and Beyond: Incremental, Inductive Verification Aaron R. Bradley ECEE, CU Boulder & Summit Middle School IC3 and Beyond: Incremental, Inductive Verification 1/62 Induction Foundation of verification
More informationChapter 9. PSPACE: A Class of Problems Beyond NP. Slides by Kevin Wayne Pearson-Addison Wesley. All rights reserved.
Chapter 9 PSPACE: A Class of Problems Beyond NP Slides by Kevin Wayne. Copyright @ 2005 Pearson-Addison Wesley. All rights reserved. 1 Geography Game Geography. Alice names capital city c of country she
More informationUVA CS 4501: Machine Learning
UVA CS 4501: Machine Learning Lecture 21: Decision Tree / Random Forest / Ensemble Dr. Yanjun Qi University of Virginia Department of Computer Science Where are we? è Five major sections of this course
More informationFrank C Porter and Ilya Narsky: Statistical Analysis Techniques in Particle Physics Chap. c /9/9 page 331 le-tex
Frank C Porter and Ilya Narsky: Statistical Analysis Techniques in Particle Physics Chap. c15 2013/9/9 page 331 le-tex 331 15 Ensemble Learning The expression ensemble learning refers to a broad class
More informationDecision Trees: Overfitting
Decision Trees: Overfitting Emily Fox University of Washington January 30, 2017 Decision tree recap Loan status: Root 22 18 poor 4 14 Credit? Income? excellent 9 0 3 years 0 4 Fair 9 4 Term? 5 years 9
More information1 Handling of Continuous Attributes in C4.5. Algorithm
.. Spring 2009 CSC 466: Knowledge Discovery from Data Alexander Dekhtyar.. Data Mining: Classification/Supervised Learning Potpourri Contents 1. C4.5. and continuous attributes: incorporating continuous
More informationExtracting Provably Correct Rules from Artificial Neural Networks
Extracting Provably Correct Rules from Artificial Neural Networks Sebastian B. Thrun University of Bonn Dept. of Computer Science III Römerstr. 64, D-53 Bonn, Germany E-mail: thrun@cs.uni-bonn.de thrun@cmu.edu
More informationOpleiding Informatica
Opleiding Informatica Tape-quantifying Turing machines in the arithmetical hierarchy Simon Heijungs Supervisors: H.J. Hoogeboom & R. van Vliet BACHELOR THESIS Leiden Institute of Advanced Computer Science
More informationCondensed Graph of Reaction: considering a chemical reaction as one single pseudo molecule
Condensed Graph of Reaction: considering a chemical reaction as one single pseudo molecule Frank Hoonakker 1,3, Nicolas Lachiche 2, Alexandre Varnek 3, and Alain Wagner 3,4 1 Chemoinformatics laboratory,
More informationDecision Tree Learning
Topics Decision Tree Learning Sattiraju Prabhakar CS898O: DTL Wichita State University What are decision trees? How do we use them? New Learning Task ID3 Algorithm Weka Demo C4.5 Algorithm Weka Demo Implementation
More informationVersion Spaces.
. Machine Learning Version Spaces Prof. Dr. Martin Riedmiller AG Maschinelles Lernen und Natürlichsprachliche Systeme Institut für Informatik Technische Fakultät Albert-Ludwigs-Universität Freiburg riedmiller@informatik.uni-freiburg.de
More informationLearning Invariants using Decision Trees and Implication Counterexamples
Learning Invariants using Decision Trees and Implication Counterexamples Pranav Garg Daniel Neider P. Madhusudan Dan Roth University of Illinois at Urbana-Champaign, USA {garg11, neider2, madhu, danr}@illinois.edu
More informationAnalysis of a Large Structure/Biological Activity. Data Set Using Recursive Partitioning and. Simulated Annealing
Analysis of a Large Structure/Biological Activity Data Set Using Recursive Partitioning and Simulated Annealing Student: Ke Zhang MBMA Committee: Dr. Charles E. Smith (Chair) Dr. Jacqueline M. Hughes-Oliver
More informationComputational Complexity of Bayesian Networks
Computational Complexity of Bayesian Networks UAI, 2015 Complexity theory Many computations on Bayesian networks are NP-hard Meaning (no more, no less) that we cannot hope for poly time algorithms that
More informationCS 220: Discrete Structures and their Applications. Mathematical Induction in zybooks
CS 220: Discrete Structures and their Applications Mathematical Induction 6.4 6.6 in zybooks Why induction? Prove algorithm correctness (CS320 is full of it) The inductive proof will sometimes point out
More informationChapter 18. Decision Trees and Ensemble Learning. Recall: Learning Decision Trees
CSE 473 Chapter 18 Decision Trees and Ensemble Learning Recall: Learning Decision Trees Example: When should I wait for a table at a restaurant? Attributes (features) relevant to Wait? decision: 1. Alternate:
More informationCS540 ANSWER SHEET
CS540 ANSWER SHEET Name Email 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14. 15. 16. 17. 18. 19. 20. 1 2 Final Examination CS540-1: Introduction to Artificial Intelligence Fall 2016 20 questions, 5 points
More informationEntailment with Conditional Equality Constraints (Extended Version)
Entailment with Conditional Equality Constraints (Extended Version) Zhendong Su Alexander Aiken Report No. UCB/CSD-00-1113 October 2000 Computer Science Division (EECS) University of California Berkeley,
More informationUnderstanding IC3. Aaron R. Bradley. ECEE, CU Boulder & Summit Middle School. Understanding IC3 1/55
Understanding IC3 Aaron R. Bradley ECEE, CU Boulder & Summit Middle School Understanding IC3 1/55 Further Reading This presentation is based on Bradley, A. R. Understanding IC3. In SAT, June 2012. http://theory.stanford.edu/~arbrad
More informationComputational Learning Theory
09s1: COMP9417 Machine Learning and Data Mining Computational Learning Theory May 20, 2009 Acknowledgement: Material derived from slides for the book Machine Learning, Tom M. Mitchell, McGraw-Hill, 1997
More information10.3 Matroids and approximation
10.3 Matroids and approximation 137 10.3 Matroids and approximation Given a family F of subsets of some finite set X, called the ground-set, and a weight function assigning each element x X a non-negative
More informationVC Dimension Review. The purpose of this document is to review VC dimension and PAC learning for infinite hypothesis spaces.
VC Dimension Review The purpose of this document is to review VC dimension and PAC learning for infinite hypothesis spaces. Previously, in discussing PAC learning, we were trying to answer questions about
More information