Lazy Rule Learning
Nikolaus Korfhage
TU Darmstadt, 19 January 2012

Outline
- Introduction
- Lazy Rule Learning Algorithm
- Possible Improvements
- Improved Lazy Rule Learning Algorithm
- Implementation
- Evaluation and Results

Rule Learning
- Learns a classifier once on the training data, then uses it to classify all test instances
- The classifier is a rule set
- Separate-and-conquer: repeatedly add rules that cover many positive examples

Lazy Learning
- The training data is consulted individually for each query instance; no classifier is built in advance
- Learning and classification happen simultaneously, at query time
- More time is needed for classification

k-nn w i = 1 d(testinstance,x i ) 19. Januar 2012 TU-Darmstadt Nikolaus Korfhage 5

Lazy Rule Learning
- Combines lazy learning and rule learning; produces many context-less rules
- Learn one rule per test instance:
  - The rule consists of conditions taken from the test instance
  - The rule should classify the test instance correctly
- Example:
  - Test instance: <rainy, 68, 80, FALSE>
  - Rule: play = yes :- windy = FALSE.
- # rules = # instances to classify

LAZYRULE

    LAZYRULE(Instance, Examples)
        InitialRule = empty rule
        BestRule = InitialRule
        for Class ∈ Classes
            Conditions = POSSIBLECONDITIONS(Instance)
            NewRule = REFINERULE(Instance, Conditions, InitialRule, Class)
            if NewRule > BestRule
                BestRule = NewRule
        return BestRule

POSSIBLECONDITIONS

    POSSIBLECONDITIONS(Instance)
        Conditions = ∅
        for Attribute ∈ Attributes
            Value = ATTRIBUTEVALUE(Attribute, Instance)
            if Value is not missing
                Conditions = Conditions ∪ {(Attribute = Value)}
        return Conditions

REFINERULE

    REFINERULE(Instance, Conditions, Rule, Class)
        if Conditions ≠ ∅
            BestRule = Rule
            BestCondition = BESTCONDITION(Rule, Conditions)
            Refinement = Rule ∪ {BestCondition}
            Evaluation = EVALUATERULE(Refinement)
            NewRule = <Evaluation, Refinement>
            if NewRule > BestRule
                BestRule = REFINERULE(Instance, Conditions \ BestCondition, NewRule, Class)
        return BestRule
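Putting the three procedures together, here is a minimal, self-contained Python sketch of the whole algorithm. This is an illustrative reading, not the SECO/Weka implementation from the talk: instances are dicts, a rule is a frozenset of (attribute, value) conditions, EVALUATERULE is instantiated with the Laplace estimate (the best-scoring heuristic on a later slide), and the recursion in REFINERULE is unrolled into an equivalent greedy loop.

```python
def possible_conditions(instance):
    """POSSIBLECONDITIONS: one (attribute = value) test per
    non-missing attribute of the test instance."""
    return {(a, v) for a, v in instance.items() if v is not None}

def evaluate_rule(rule, target_class, examples):
    """EVALUATERULE, instantiated with the Laplace estimate
    (p + 1) / (p + n + 2) over the examples the rule covers."""
    covered = [y for x, y in examples
               if all(x.get(a) == v for a, v in rule)]
    p = sum(1 for y in covered if y == target_class)
    return (p + 1) / (len(covered) + 2)

def refine_rule(conditions, rule, target_class, examples):
    """REFINERULE: greedily add the best remaining condition
    while it improves the rule's evaluation."""
    best_eval, best_rule = evaluate_rule(rule, target_class, examples), rule
    while conditions:
        new_eval, best_cond = max(
            (evaluate_rule(best_rule | {c}, target_class, examples), c)
            for c in conditions)
        if new_eval <= best_eval:
            break
        best_eval, best_rule = new_eval, best_rule | {best_cond}
        conditions = conditions - {best_cond}
    return best_eval, best_rule

def lazy_rule(instance, examples):
    """LAZYRULE: learn one rule tailored to this instance and
    return the predicted class together with the rule body."""
    best = (-1.0, frozenset(), None)  # (evaluation, rule, class)
    for cls in {y for _, y in examples}:
        ev, rule = refine_rule(possible_conditions(instance),
                               frozenset(), cls, examples)
        if ev > best[0]:
            best = (ev, rule, cls)
    return best[2], best[1]
```

On a weather-style example, lazy_rule({"outlook": "rainy", "windy": False}, examples) would return a (class, rule) pair such as ("yes", {("windy", False)}).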

Numeric Attributes
- Test instance: <sunny, 85, 85, FALSE>
- The condition outlook = sunny covers some training examples, but the condition temperature = 85 covers no training example
- Solution: infer two interval conditions instead, e.g. temperature >= 80 and temperature < 90
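One plausible way to derive such an interval, sketched in Python: take midpoints between the test value and the nearest training values below and above it. The exact bound-selection scheme in the talk may differ; this only illustrates the idea.

```python
def numeric_conditions(attribute, test_value, examples):
    """Replace the exact test 'attribute = test_value' (which typically
    covers no training example) with interval conditions around it."""
    values = sorted(x[attribute] for x, _ in examples
                    if x.get(attribute) is not None)
    below = [v for v in values if v < test_value]
    above = [v for v in values if v > test_value]
    conditions = []
    if below:  # lower bound: midpoint to the nearest smaller training value
        conditions.append((attribute, ">=", (below[-1] + test_value) / 2))
    if above:  # upper bound: midpoint to the nearest larger training value
        conditions.append((attribute, "<", (test_value + above[0]) / 2))
    return conditions
```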

Numeric Attributes [figure]

Example of Learned Rules
(read "head :- body" as "if body then head")

    play = yes :- humidity < 88, humidity >= 70, windy = FALSE.
    play = yes :- temperature >= 72, temperature < 84.
    play = yes :- outlook = overcast.
    play = yes :- humidity >= 70, humidity < 82.5, windy = FALSE.
    play = no  :- outlook = sunny, temperature >= 70.5, temperature < 80.5.

Heuristics
- LAZYRULE was evaluated with the heuristics available in SECO
- Laplace is significantly better on most datasets

Results (average accuracy, %):
    Laplace            76.17
    Linear Regression  72.44
    F-Measure          70.62
    Linear Cost        69.17
    m-estimate         68.81
    Foil Gain          65.99
    ...
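For reference, the winning Laplace heuristic is just a smoothed precision; a one-line sketch with illustrative numbers of my own:

```python
def laplace(p, n):
    """Laplace estimate of rule quality: p covered positives,
    n covered negatives, smoothed toward 1/2."""
    return (p + 1) / (p + n + 2)

# A rule covering 6 positives and 1 negative:
# laplace(6, 1) = 7/9 ≈ 0.78, vs. raw precision 6/7 ≈ 0.86.
# The correction penalizes rules with little coverage.
```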

Complexity
- # rules to check for one test instance: O(c · a²)
- # rules for all instances: O(c · a² · d)
- # instances to check on the first call of REFINERULE: c · a · t
- Goal: decrease a or t

where c = # classes, a = # attributes, d = # instances to classify, t = # training instances
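As a worked example (assuming the standard 14-instance weather data that the earlier test instances come from, with c = 2 classes and a = 4 attributes): the first call of REFINERULE for a single test instance already touches c · a · t = 2 · 4 · 14 = 112 condition/instance checks, which is why decreasing a or t is the lever that pays off.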

Possible Improvements
- Increase accuracy:
  - Beam search
- Reduce execution time:
  - Consider less data: random subset of the training data
  - Preselect attributes
- Increase accuracy and decrease execution time:
  - Learn rules on the k-nearest neighbors

LAZYRULENN
- Learn the rule on the k-nearest neighbors of the test instance (sketched below)
- Less training data to learn the rule on → faster
- Only useful (nearby) instances are considered → higher accuracy
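A compact sketch combining the pieces above (it reuses lazy_rule from the earlier sketch; mixed_distance is a stand-in heterogeneous distance, 0/1 for nominal mismatches and absolute difference for numerics, not necessarily the one used in the thesis):

```python
def mixed_distance(x, y):
    """Simple heterogeneous distance over attribute dicts:
    0/1 mismatch for nominal values, |u - v| for numeric ones
    (numerics assumed pre-normalized to comparable scales)."""
    d = 0.0
    for a in x:
        u, v = x.get(a), y.get(a)
        if isinstance(u, (int, float)) and isinstance(v, (int, float)):
            d += abs(u - v)
        else:
            d += 0.0 if u == v else 1.0
    return d

def lazy_rule_nn(instance, examples, k=5):
    """LAZYRULENN: learn the per-instance rule only on the
    k nearest neighbors, then classify with it."""
    neighbors = sorted(examples,
                       key=lambda xy: mixed_distance(instance, xy[0]))[:k]
    return lazy_rule(instance, neighbors)
```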

Computation Time [figure]

Accuracy [figure]

Learned Rules
- Shorter rules for small k
- More empty rules

                         LAZYRULE   LAZYRULENN, k = 5   LAZYRULENN>, k = 5
    Accuracy (%)         75.41      82.86               82.86
    Average Rule Length  2.88       0.89                19.56
    Empty Rules (%)      0.01       54.41               1.78

Implementation
- Based on the SECO framework: rules, heuristics, evaluation
- Weka: interface, kNN

Weka Interface [figure]

Evaluation
- 37 datasets
- Evaluating possible improvements:
  - Weka: ten-fold CV
  - Corrected paired Student's t-test
  - Leave-one-out cross-validation
- Comparing algorithms:
  - Weka: ten-fold CV
  - Friedman test with post-hoc Nemenyi test

LAZYRULENN and Other Algorithms
Compared to:
- Decision tree algorithm J48 (C4.5)
- Separate-and-conquer rule learning algorithm JRip (RIPPER)
- k-nearest neighbor
- Weighted k-nearest neighbor
with k = 1, 2, 3, 5, 10, 15, 25

Results: average accuracy (%)

                     k = 1   k = 2   k = 3   k = 5   k = 10  k = 15  k = 25
    LAZYRULENN       83.31   83.02   84.00   83.90   82.94   82.47   82.09
    kNN              83.35   82.75   83.73   83.47   81.73   80.23   78.18
    kNN, weighted    83.35   83.29   84.16   84.27   83.69   83.29   82.14
    JRip             83.09
    J48              83.37

Results [figure]

Summary
- Combines lazy learning and rule learning
- The improved lazy rule learning algorithm uses kNN
- Not significantly worse than the other learning algorithms considered
- Learns many context-less rules (one for each instance)
- May be useful for other projects (e.g. Learn-a-LOD)