Incremental Construction of Complex Aggregates: Counting over a Secondary Table


Clément Charnay, Nicolas Lachiche, and Agnès Braud

ICube, Université de Strasbourg, CNRS, 300 Bd Sébastien Brant, CS 10413, Illkirch Cedex, France
{charnay,nicolas.lachiche,agnes.braud}@unistra.fr

Abstract. In this paper, we discuss the integration of complex aggregates in the construction of logical decision trees. We review the use of complex aggregates in TILDE, which is based on an exhaustive search in the complex aggregate space. As opposed to such a combinatorial search, we introduce a hill-climbing approach to build complex aggregates incrementally.

1 Introduction and Context

Relational data mining deals with data represented by several tables. We focus on the typical setting where one table, the primary table, contains the target column, i.e. the attribute whose value is to be predicted, and has a one-to-many relationship with a secondary table. A possible way of handling such relationships is to use complex aggregates, i.e. to aggregate the objects of the secondary table which meet a given condition, using an aggregate function on the objects themselves (the count function) or on a numerical attribute of the objects (e.g. the max and average functions).

For instance, we may want to classify molecules. Molecules have atoms, so they can be represented as a table of molecules and a table of atoms, with a foreign key in the table of atoms indicating the molecule each atom belongs to. The class of a molecule may then depend on the comparison between the average charge of the carbon atoms of the molecule and some threshold value. This example also shows what a complex aggregate relies on: an aggregate function (here the average), a condition to select the objects to aggregate (here we select only the carbon atoms), the attribute to aggregate on (here the charge of the atoms), and an operator and a threshold to compare with the result of the aggregation.

Previous work showed that the expressivity of complex aggregates can be useful to solve problems such as urban block classification [1,2]. [1] introduced complex aggregates in propositionalisation, but their use increases the size of the feature space too much, and they cannot be fully handled there. This is the reason why we focus on introducing them in the learning step. To our knowledge, only one relational learner, TILDE [3], has implemented complex aggregates, but its exhaustive approach is not suited to overly complex problems. The main motivation for our work is to elaborate new heuristics to handle complex aggregates in relational models.

This article presents a logical decision tree learner which uses complex aggregates to deal with secondary tables, using a hill-climbing heuristic to build them incrementally. We introduce this heuristic in the context of logical decision trees, but it could be applied to other approaches. In this article, we focus on counting over a secondary table, i.e. the aggregate function considered will be the count function.

The rest of this paper is organized as follows. In Sect. 2, we review the use of complex aggregates in TILDE. In Sect. 3, we describe our heuristic to explore the complex aggregate space. Finally, in Sect. 4, we detail future work.
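Before moving on, the molecule example can be made concrete with a minimal sketch of evaluating a complex aggregate over an in-memory secondary table. The table layout and column names (molecule_id, element, charge) are illustrative assumptions, not part of the learner described here:

```python
from statistics import mean
from operator import ge

# Toy in-memory "secondary table"; column names are hypothetical.
atoms = [
    {"molecule_id": 1, "element": "c", "charge": 0.12},
    {"molecule_id": 1, "element": "o", "charge": -0.38},
    {"molecule_id": 2, "element": "c", "charge": 0.61},
]

def complex_aggregate(molecule_id, function, condition, attribute, operator, threshold):
    """Evaluate function(condition) {operator} threshold for one primary object."""
    selected = [a[attribute] for a in atoms
                if a["molecule_id"] == molecule_id and condition(a)]
    if not selected:            # empty selection: see Sect. 3.2
        return None
    return operator(function(selected), threshold)

# "The average charge of the carbon atoms of the molecule is >= 0.1":
print(complex_aggregate(1, mean, lambda a: a["element"] == "c", "charge", ge, 0.1))
```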

2 TILDE and Complex Aggregates

TILDE [3] is a first-order decision tree learner. It uses a top-down approach to choose, node by node, from root to leaves, the best refinement to split the training examples according to their class values, using gain ratio as the metric to guide the search. TILDE relies on a language bias: the user specifies the literals which can be added to the conjunction at a node. In this relational context, to deal with secondary tables, the initial version of TILDE introduces new variables from these secondary tables using an existential quantifier.

TILDE was then extended [4] to allow the use of complex aggregates, and a heuristic was developed to explore the search space [5]. This heuristic is based on the idea of a refinement cube, where the refinement spaces for the aggregate condition, the aggregate function and the threshold are the dimensions of the cube. This cube is explored in a general-to-specific way, using monotone paths along the different dimensions: when a complex aggregate (a point in the refinement cube) is too specific (i.e. it fails for every training example), the search does not restart from this point. However, the implementation does not allow more than two conjuncts in the aggregate query due to memory problems [6, p. 32], which limits the search space.

Numerical attributes are handled by a comparison to a threshold in problems such as geographical ones. It is possible to discretize numerical attributes beforehand and to define predicates that compare the values of the attributes to the thresholds given by the discretization. Nevertheless, enumerating all the combinations of thresholds over all the numerical attributes takes too much memory, and hence the approach is not tractable.

To summarize, this combinatorial approach to handling complex aggregates in TILDE has limitations. We intend to overcome them not by exploring the search space exhaustively, but by finding a heuristic to explore it and to build complex aggregates incrementally.
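For reference, the split metric mentioned above can be sketched as follows. This is a generic gain-ratio computation for a boolean test (e.g. a complex aggregate succeeding or failing on each example), not TILDE's actual implementation:

```python
import math

def entropy(values):
    """Shannon entropy of a list of labels (class values or split outcomes)."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n)
                for c in (values.count(v) for v in set(values)))

def gain_ratio(labels, split):
    """Gain ratio of a boolean split: information gain normalized by
    the intrinsic information of the split itself."""
    n = len(labels)
    branches = [[l for l, s in zip(labels, split) if s],
                [l for l, s in zip(labels, split) if not s]]
    remainder = sum(len(b) / n * entropy(b) for b in branches if b)
    split_info = entropy(split)
    return (entropy(labels) - remainder) / split_info if split_info else 0.0

labels = ["pos", "pos", "neg", "neg"]   # class values of the examples
split  = [True, True, True, False]      # outcome of a candidate test
print(gain_ratio(labels, split))        # ~0.38
```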

3 Incremental Construction of Complex Aggregates with Hill-Climbing

Our goal is to build a logical decision tree, like TILDE, which uses complex aggregates to deal with secondary tables. To explore the refinement cube of complex aggregates, we choose a hill-climbing method. This section details it. We use the general notation function(condition) {operator} threshold to refer to a complex aggregate.

3.1 Refinement of Complex Aggregates

To find the best aggregate at a node, a hill-climbing search is performed from a starting aggregate. In this paper, we limit ourselves to the count function to aggregate secondary tables. The starting conditions are detailed below. We refine the starting aggregate with hill-climbing, allowing a non-strict climb: if there is no strictly improving refinement, we allow picking a refinement with the same score as in the previous step, provided it has not been visited before. To achieve that, we store all the refinements previously chosen on the hill-climbing path.

There are several ways to modify the current aggregate. Firstly, given the examples, we compute the possible results of the aggregate function, which serve as possible thresholds. To these values, we add one sentinel threshold depending on the operator: strictly lower than the other values if the operator is ≤, and strictly higher if the operator is ≥, so that the complex aggregate is true for none of the examples. These sentinels are chosen to be the opposite of MAX_DOUBLE and MAX_DOUBLE itself, respectively. The current threshold is then set to the closest possible threshold if it is lower than the minimum or higher than the maximum of the possible thresholds. For instance, if the current refinement to try is "the count of atoms in the molecule is less than or equal to MAX_DOUBLE" and, in the training set, there are between 13 and 42 atoms in a molecule, the threshold is immediately set to 42. We can then refine the aggregate by increasing or decreasing the threshold (among the possible thresholds). The other possibilities are to remove a literal from the aggregate condition, or to add one.

The Starting Conditions. Given this, if we refer to the maximum possible threshold as max, two starting conditions are count(true) ≤ max and count(true) ≥ MAX_DOUBLE. From the point of view of the examples considered, they are opposite: if one succeeds for an example, the other fails. The former is the most general (i.e. it succeeds for every example), the latter the most specific. We then observe that a refinement of one has the same effect on information gain as the corresponding refinement of the other (the examples simply end up in the opposite branches), so on the training set they are equivalent and are refined equivalently. In the end, we get two opposite conditions count(condition) ≤ someThreshold and count(condition) ≥ nextThreshold. To conclude on this point, considering both starting conditions is not necessary, since they are refined along mirrored paths and yield the same information gain after the hill-climbing process. Of course, the same reasoning applies to the starting conditions count(true) ≤ −MAX_DOUBLE and count(true) ≥ min; this is the reason why we consider only two starting conditions, arbitrarily those with the operator ≤.
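The refinement loop just described can be rendered as the following minimal sketch. The helper names are hypothetical: `neighbours` would enumerate the threshold moves and the literal additions/removals, and `score` would be the gain ratio on the training set:

```python
def hill_climb(start, neighbours, score):
    """Non-strict hill-climbing over complex aggregates.

    Equal-scoring refinements are accepted if unvisited; the set of
    visited refinements prevents cycling along the climbing path.
    """
    current, visited = start, {start}
    while True:
        best, best_score = None, None
        for cand in neighbours(current):
            if cand in visited:
                continue
            s = score(cand)
            if best_score is None or s > best_score:
                best, best_score = cand, s
        # Stop only when every unvisited neighbour is strictly worse.
        if best is None or best_score < score(current):
            return current
        current = best
        visited.add(current)

# Toy usage: climb integers towards 10 by +/-1 steps.
print(hill_climb(0, lambda x: (x - 1, x + 1), lambda x: -abs(x - 10)))  # -> 10
```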

Moving in the Complex Aggregate Space. We now discuss our method for refining the aggregate. Adding a literal to the aggregate condition yields a specialization: fewer objects are selected. The threshold range discussed above is then no longer the same, and the current threshold, associated with the previous, more general condition, is no longer relevant, since it may be too high. Hence, the refinement would yield a poor gain ratio and would not be chosen. For instance, suppose that in the training set there are between 13 and 42 atoms in a molecule, but only between 5 and 20 carbon atoms. If the current aggregate states that the count of atoms is less than or equal to 30, and we try to refine it to "the count of carbon atoms is less than or equal to 30", this new aggregate is true for every example, yields zero gain, and hence is not chosen. Of course, the problem is similar the other way around, i.e. when we drop a literal from the aggregate condition, yielding a generalization.

To avoid modifying the aggregate condition without adapting the threshold, we proceed as follows. When modifying the aggregate condition, we consider the number n1 of possible thresholds given by the current aggregate condition, and the number n2 of possible thresholds given by the next (after modification) aggregate condition. We sort both sets of thresholds in increasing order. If the current threshold is at position c1, with indices going from 0 to n1 − 1, then the next threshold, at position c2 between 0 and n2 − 1, is picked such that c2 / (n2 − 1) is closest to c1 / (n1 − 1). Mathematically: c2 = round(c1 (n2 − 1) / (n1 − 1)). Since we add a sentinel threshold to a list which already contains at least one element, there are always at least two possible thresholds, and hence the case n1 = 1 is not an issue.
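A minimal sketch of this re-mapping, using the formula above (`remap_threshold` is an illustrative helper; the threshold lists stand for the possible counts under the old and new conditions):

```python
def remap_threshold(thresholds_cur, thresholds_next, current):
    """Map the current threshold to the same relative position in the
    new sorted threshold list: c2 = round(c1 * (n2 - 1) / (n1 - 1))."""
    cur_sorted = sorted(thresholds_cur)
    nxt_sorted = sorted(thresholds_next)
    c1 = cur_sorted.index(current)
    n1, n2 = len(cur_sorted), len(nxt_sorted)
    c2 = round(c1 * (n2 - 1) / (n1 - 1))
    return nxt_sorted[c2]

# Counts of atoms (13..42) vs counts of carbon atoms (5..20):
# "count of atoms <= 30" becomes "count of carbon atoms <= 14".
print(remap_threshold(list(range(13, 43)), list(range(5, 21)), 30))  # -> 14
```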

3.2 Dealing with Empty Sets

We finally discuss a problem that occurs with aggregate functions other than count: the computation of a value for empty sets. Indeed, an aggregate condition might select no object, and aggregation over a numerical attribute is not possible in this case. For instance, how can the mean charge of the oxygen atoms in a molecule be computed when the molecule does not have any oxygen atom? Only the count function deals with empty sets in a natural way; another solution has to be chosen for numerical aggregate functions. Several possibilities to deal with this issue have been discussed in [7]:

- fixing an arbitrary value as the result;
- using a value depending on the aggregate condition, as close as possible to the values of the examples for which the aggregate condition does not result in an empty set, or as far as possible;
- failing the aggregate when the aggregate function cannot be applied;
- discarding the aggregate from being chosen as a refinement if the aggregate function cannot be applied for at least one example.

In our opinion, the first two options are not easy to apply for every function. For the min and max functions, one can indeed choose values as low or as high as possible so that the aggregate always succeeds or always fails when the function cannot be applied directly. But for the average function, using a fixed value such as 0 is not relevant if the attribute can take both positive and negative values, and neither is choosing positive or negative infinity. Moreover, the fact that the set to aggregate is empty can be meaningful, and by assigning a result to the aggregate function we lose this information. The third option also gives the empty set a meaning we do not necessarily want it to have: the failure of the aggregate should mean that the inequality between the result of the aggregation and the threshold does not hold. However, this is not what an empty set expresses: emptiness is implied by the failure of the existential quantifier for the aggregate condition, i.e. count(condition) ≥ 1 fails.

We thus see two ways to address the issue of empty sets. The first is to consider them as a third branch in our decision trees: since they correspond neither to a success nor to a failure of the inequality, they are a third possibility. This option breaks the binary structure of the tree, which is not necessarily a problem. Nevertheless, we present another option that preserves the binary tree structure: the principle is to create a node with the count(condition) ≥ 1 aggregate before adding a node with an aggregate whose function may not be applicable. In the left branch, we can then add the aggregate function(condition) {operator} threshold without the empty-set problem, since we know from the parent node that the condition selects at least one object. This adapts the three-branch idea to the binary tree structure, using two nodes instead of one. An example is shown in Fig. 1, where the aggregate "the average charge of the carbon atoms in the molecule is greater than or equal to 0.542" is protected by the existential quantifier which tests the presence of at least one carbon atom in the molecule, i.e. the aggregate "the count of carbon atoms in the molecule is greater than or equal to 1". If the latter succeeds, then the former can be evaluated, because it is meaningful to compute the average of a non-empty set. If the existential quantifier fails, then the average is not computed, because it would be meaningless.
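A small sketch of this guard-node idea, following the structure shown in Fig. 1 below (the data representation is illustrative, not the learner's actual structures):

```python
from statistics import mean

def classify(molecule_atoms):
    """Two-node guard from Fig. 1: test count >= 1 before the average.
    `molecule_atoms` is a hypothetical list of (element, charge) pairs."""
    carbons = [charge for element, charge in molecule_atoms if element == "carbon"]
    if len(carbons) >= 1:              # existential guard: count(condition) >= 1
        if mean(carbons) >= 0.542:     # safe: the set is known to be non-empty
            return "leaf: avg test succeeds"
        return "leaf: avg test fails"
    return "leaf: no carbon atom"      # false branch of the guard node

print(classify([("carbon", 0.6), ("oxygen", -0.3)]))  # avg 0.6 >= 0.542
print(classify([("oxygen", -0.3)]))                   # guard fails
```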

[Fig. 1. Example of the three-branch structure to deal with empty sets: the root node tests count(b, (atom(a,b), atom_type(b,carbon)), Res1), Res1 ≥ 1; its true branch leads to the node avg(c, (atom(a,b), atom_type(b,carbon), atom_charge(a,c)), Res2), Res2 ≥ 0.542, each node having a true and a false branch.]

4 Conclusion and Future Work

The method described in Sect. 3 is implemented and will be evaluated. The next step in this work is to consider other aggregate functions, over numerical attributes of the secondary objects, and to allow the aggregate function to change during the refinement process, taking advantage of the ordering of the functions as in [5]. Another step will be to allow recursion in the aggregates, i.e. to create complex aggregates which have complex aggregates in their aggregate condition, so as to use the whole database. However, this will inevitably raise new issues, since it adds further levels of refinement. A more complex problem will be to handle many-to-many relationships, since complex aggregates can be formed in both directions across such relationships, which can lead to loops in the recursion discussed above.

References

1. El Jelali, S., Braud, A., Lachiche, N.: Propositionalisation of continuous attributes beyond simple aggregation. In: Riguzzi, F., Železný, F. (eds.): ILP. Volume 7842 of Lecture Notes in Computer Science. Springer (2012)
2. Puissant, A., Lachiche, N., Skupinski, G., Braud, A., Perret, J., Mas, A.: Classification et évolution des tissus urbains à partir de données vectorielles. Revue Internationale de Géomatique 21(4) (2011)
3. Blockeel, H., De Raedt, L.: Top-down induction of first-order logical decision trees. Artificial Intelligence 101(1-2) (1998)
4. Van Assche, A., Vens, C., Blockeel, H., Džeroski, S.: First order random forests: Learning relational classifiers with complex aggregates. Machine Learning 64(1-3) (2006)
5. Vens, C., Ramon, J., Blockeel, H.: Refining aggregate conditions in relational learning. In: Fürnkranz, J., Scheffer, T., Spiliopoulou, M. (eds.): PKDD. Volume 4213 of Lecture Notes in Computer Science. Springer (2006)
6. Blockeel, H., Dehaspe, L., Ramon, J., Struyf, J., Van Assche, A., Vens, C., Fierens, D.: The ACE Data Mining System (March 2009)
7. Vens, C.: Complex aggregates in relational learning. PhD thesis, Department of Computer Science, Faculty of Engineering Science, Katholieke Universiteit Leuven (March 2007). Supervisor: Hendrik Blockeel
