CSCE 478/878 Lecture 2: Concept Learning and the General-to-Specific Ordering


Outline

(Stephen D. Scott; adapted from Tom Mitchell's slides)

[read Chapter 2]
- Learning from examples
- General-to-specific ordering over hypotheses
- Version spaces and the candidate elimination algorithm
- Picking new examples (making queries)
- The need for inductive bias

A Concept Learning Task: EnjoySport

Sky    AirTemp  Humid   Wind    Water  Forecast  EnjoySport
Sunny  Warm     Normal  Strong  Warm   Same      Yes
Sunny  Warm     High    Strong  Warm   Same      Yes
Rainy  Cold     High    Strong  Warm   Change    No
Sunny  Warm     High    Strong  Cool   Change    Yes

Goal: output a hypothesis to predict the labels of future examples. Note: this simple approach assumes no noise; it illustrates the key concepts.

How to Represent the Hypothesis?

There are many possible representations. Here, a hypothesis h will be a conjunction of constraints on attributes. Each constraint can be:
- a specific value (e.g., Water = Warm)
- "don't care" (i.e., Water = ?)
- no value allowed (i.e., Water = Ø)

E.g., <Sunny, ?, ?, Strong, ?, Same> means: if Sky = Sunny and Wind = Strong and Forecast = Same, then predict Yes, else predict No.

Prototypical Concept Learning Task

Given:
- Instance space X, e.g., possible days, each described by the attributes Sky, AirTemp, Humidity, Wind, Water, and Forecast [all possible attribute values are listed in the textbook's table]
- Hypothesis class H, e.g., conjunctions of literals, such as <?, Cold, High, ?, ?, ?>
- Training examples D: positive and negative examples of the target function c: <x_1, c(x_1)>, ..., <x_m, c(x_m)>, where x_i ∈ X and c : X → {0, 1}, e.g., c = EnjoySport

Determine: a hypothesis h ∈ H such that h(x) = c(x) for all x ∈ X.

Typically X is exponentially or infinitely large, so in general we can never be sure that h(x) = c(x) for all x ∈ X (this is possible only in special restricted, theoretical cases). Instead, we settle for a good approximation, e.g., h(x) = c(x) for all x ∈ D.

The inductive learning hypothesis: any hypothesis found to approximate the target function well over a sufficiently large set of training examples D will also approximate the target function well over other unobserved examples.
We will study this more quantitatively later.
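As a concreteness check, the hypothesis representation above can be sketched in Python. This is a minimal sketch, not from the lecture: the tuple encoding ('?' for "don't care", None for the empty constraint Ø) and the names `predict`, `h`, `x_pos`, `x_neg` are illustrative assumptions.

```python
# A hypothesis is a tuple of attribute constraints over
# (Sky, AirTemp, Humid, Wind, Water, Forecast):
#   '?'  = don't care, any value satisfies the constraint
#   None = empty constraint Ø, satisfied by no value

def predict(h, x):
    """Return True (Yes) iff instance x satisfies every constraint in h."""
    # None never equals a value, so an Ø constraint correctly rejects all x
    return all(c == "?" or c == v for c, v in zip(h, x))

# <Sunny, ?, ?, Strong, ?, Same> from the slide
h = ("Sunny", "?", "?", "Strong", "?", "Same")

x_pos = ("Sunny", "Warm", "Normal", "Strong", "Warm", "Same")   # labeled Yes
x_neg = ("Rainy", "Cold", "High", "Strong", "Warm", "Change")   # labeled No

print(predict(h, x_pos))  # True
print(predict(h, x_neg))  # False
```

Note that `predict` encodes exactly the "if Sky = Sunny and Wind = Strong and Forecast = Same then Yes else No" reading: every non-'?' constraint must match.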

The More-General-Than Relation

Definition: h_j ≥_g h_k iff (∀x ∈ X) [(h_k(x) = 1) → (h_j(x) = 1)]. So ≥_g induces a partial order on the hypotheses in H; >_g (strictly more general) is defined similarly.

E.g., consider the hypotheses
h_1 = <Sunny, ?, ?, Strong, ?, ?>
h_2 = <Sunny, ?, ?, ?, ?, ?>
h_3 = <Sunny, ?, ?, ?, Cool, ?>
and the instances
x_1 = <Sunny, Warm, High, Strong, Cool, Same>
x_2 = <Sunny, Warm, High, Light, Warm, Same>
Then h_2 ≥_g h_1 and h_2 ≥_g h_3, while h_1 and h_3 are incomparable.

Find-S Algorithm (Find Maximally Specific Hypothesis)

1. Initialize h to <Ø, Ø, Ø, Ø, Ø, Ø>, the most specific hypothesis in H
2. For each positive training instance x:
   For each attribute constraint a_i in h:
     If the constraint a_i in h is satisfied by x, then do nothing
     Else replace a_i in h by the next more general constraint that is satisfied by x
3. Output hypothesis h

Hypothesis Space Search by Find-S

Training examples:
x_1 = <Sunny, Warm, Normal, Strong, Warm, Same>, +
x_2 = <Sunny, Warm, High, Strong, Warm, Same>, +
x_3 = <Rainy, Cold, High, Strong, Warm, Change>, -
x_4 = <Sunny, Warm, High, Strong, Cool, Change>, +

Hypothesis sequence:
h_0 = <Ø, Ø, Ø, Ø, Ø, Ø>
h_1 = <Sunny, Warm, Normal, Strong, Warm, Same>
h_2 = <Sunny, Warm, ?, Strong, Warm, Same>
h_3 = <Sunny, Warm, ?, Strong, Warm, Same>
h_4 = <Sunny, Warm, ?, Strong, ?, ?>

Why can we ignore the negative examples?

Complaints about Find-S

- Assuming there exists some function in H consistent with D, Find-S will find one. But Find-S cannot detect whether there are other consistent hypotheses, or how many there are. In other words, if c ∈ H, has Find-S found it?
- Is a maximally specific hypothesis really the best one? Depending on H, there might be several maximally specific hypotheses, and Find-S doesn't backtrack.
- It is not robust against errors or noise, and it ignores negative examples.

Version Spaces

A hypothesis h is consistent with a set of training examples D of target concept c if and only if h(x) = c(x) for each training example <x, c(x)> in D:

Consistent(h, D) ≡ (∀<x, c(x)> ∈ D) h(x) = c(x)

The version space VS_{H,D}, with respect to hypothesis space H and training examples D, is the subset of hypotheses from H consistent with all training examples in D:

VS_{H,D} ≡ {h ∈ H : Consistent(h, D)}

The List-Then-Eliminate Algorithm

1. VersionSpace ← a list containing every hypothesis in H
2. For each training example <x, c(x)>: remove from VersionSpace any hypothesis h for which h(x) ≠ c(x)
3. Output the list of hypotheses in VersionSpace

Problem: this requires Ω(|H|) time to enumerate all hypotheses. We can address many of these concerns by tracking the entire set of consistent hypotheses more compactly.
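The Find-S algorithm from a few slides back is simple enough to sketch directly. This is a minimal sketch under the same assumed tuple encoding ('?' for "don't care", None for Ø); the function name is illustrative.

```python
# Find-S: start at the most specific hypothesis and generalize just enough
# to cover each positive example, ignoring negatives.

def find_s(examples):
    """examples: list of (instance_tuple, is_positive) pairs."""
    n = len(examples[0][0])
    h = [None] * n                      # h_0 = <Ø, Ø, ..., Ø>
    for x, positive in examples:
        if not positive:                # Find-S ignores negative examples
            continue
        for i, v in enumerate(x):
            if h[i] is None:            # Ø -> the observed value
                h[i] = v
            elif h[i] != v and h[i] != "?":
                h[i] = "?"              # next more general constraint
    return tuple(h)

train = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), True),
]
print(find_s(train))  # ('Sunny', 'Warm', '?', 'Strong', '?', '?')
```

The output matches h_4 from the hypothesis-space-search trace above.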

Representing Version Spaces

The general boundary G of version space VS_{H,D} is the set of its maximally general members. The specific boundary S is the set of its maximally specific members. Every member of the version space lies between these boundaries:

VS_{H,D} = {h ∈ H : (∃s ∈ S)(∃g ∈ G)(g ≥_g h ≥_g s)}

Example version space:
S: {<Sunny, Warm, ?, Strong, ?, ?>}
G: {<Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>}

Candidate Elimination Algorithm

G ← set of maximally general hypotheses in H
S ← set of maximally specific hypotheses in H
For each training example d ∈ D, do:
  If d is a positive example:
    Remove from G any hypothesis inconsistent with d
    For each hypothesis s ∈ S that is not consistent with d:
      Remove s from S
      Add to S all minimal generalizations h of s such that
        1. h is consistent with d, and
        2. some member of G is more general than h
      Remove from S any hypothesis that is more general than another hypothesis in S
  If d is a negative example:
    Remove from S any hypothesis inconsistent with d
    For each hypothesis g ∈ G that is not consistent with d:
      Remove g from G
      Add to G all minimal specializations h of g such that
        1. h is consistent with d, and
        2. some member of S is more specific than h
      Remove from G any hypothesis that is less general than another hypothesis in G

Example Trace

S_0: {<Ø, Ø, Ø, Ø, Ø, Ø>}
G_0: {<?, ?, ?, ?, ?, ?>}

Training examples:
1. <Sunny, Warm, Normal, Strong, Warm, Same>, EnjoySport = Yes
   S_1: {<Sunny, Warm, Normal, Strong, Warm, Same>}
   G_1 = G_0: {<?, ?, ?, ?, ?, ?>}
2. <Sunny, Warm, High, Strong, Warm, Same>, EnjoySport = Yes
   S_2: {<Sunny, Warm, ?, Strong, Warm, Same>}
   G_2 = G_1: {<?, ?, ?, ?, ?, ?>}
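The candidate elimination loop above can be sketched for the conjunctive language. This is a sketch under stated assumptions, not the lecture's code: the per-attribute value sets in `domains` are assumed (the lecture defers them to the textbook's table), and all function and variable names are illustrative.

```python
# Candidate elimination over conjunctive hypotheses:
#   '?'  = don't care, None = empty constraint Ø.

def matches(h, x):
    return all(c == "?" or c == v for c, v in zip(h, x))

def more_general_eq(h1, h2):
    """h1 >=_g h2 in the conjunctive language."""
    return all(a == "?" or b is None or a == b for a, b in zip(h1, h2))

def min_generalization(s, x):
    """The unique minimal generalization of s that covers positive x."""
    return tuple(v if c is None else (c if c == v else "?")
                 for c, v in zip(s, x))

def candidate_elimination(examples, domains):
    n = len(domains)
    S = [tuple([None] * n)]             # maximally specific boundary
    G = [tuple(["?"] * n)]              # maximally general boundary
    for x, positive in examples:
        if positive:
            G = [g for g in G if matches(g, x)]
            S = [min_generalization(s, x) for s in S]
            # keep only generalizations still bounded above by some g in G
            S = [s for s in S if any(more_general_eq(g, s) for g in G)]
            S = [s for s in S
                 if not any(t != s and more_general_eq(s, t) for t in S)]
        else:
            S = [s for s in S if not matches(s, x)]
            new_G = []
            for g in G:
                if not matches(g, x):
                    new_G.append(g)
                    continue
                for i in range(n):      # minimal specializations of g
                    if g[i] != "?":
                        continue
                    for v in domains[i] - {x[i]}:
                        h = g[:i] + (v,) + g[i + 1:]
                        # keep only if some member of S is more specific
                        if any(more_general_eq(h, s) for s in S):
                            new_G.append(h)
            G = [g for g in new_G
                 if not any(h != g and more_general_eq(h, g) for h in new_G)]
    return S, G

train = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), True),
]
domains = [                             # assumed per-attribute value sets
    {"Sunny", "Cloudy", "Rainy"},       # Sky
    {"Warm", "Cold"},                   # AirTemp
    {"Normal", "High"},                 # Humid
    {"Strong", "Light"},                # Wind
    {"Warm", "Cool"},                   # Water
    {"Same", "Change"},                 # Forecast
]
S, G = candidate_elimination(train, domains)
print(S)        # [('Sunny', 'Warm', '?', 'Strong', '?', '?')]
print(sorted(G))
```

Running this reproduces the trace: after the negative example, G holds the three hypotheses shown for G_3 (the candidate <?, ?, Normal, ?, ?, ?> is dropped because no member of S is more specific than it), and the final boundaries match S_4 and G_4.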

Example Trace (continued)

3. <Rainy, Cold, High, Strong, Warm, Change>, EnjoySport = No
   S_3 = S_2: {<Sunny, Warm, ?, Strong, Warm, Same>}
   G_3: {<Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>, <?, ?, ?, ?, ?, Same>}
4. <Sunny, Warm, High, Strong, Cool, Change>, EnjoySport = Yes
   S_4: {<Sunny, Warm, ?, Strong, ?, ?>}
   G_4: {<Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>}

Why does G_3 contain only three hypotheses? E.g., why isn't <?, ?, Normal, ?, ?, ?> ∈ G_3?

Final Version Space

S: {<Sunny, Warm, ?, Strong, ?, ?>}
G: {<Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>}

Aside: Asking Queries

What if the learner can ask queries, i.e., present an example x and have a teacher (oracle) give its classification? [This is like running experiments.] Why is <Sunny, Warm, Normal, Light, Warm, Same> a good query to make? In general, what is a good query strategy?

Generalizing Beyond Training Data

Given the final S and G above, consider classifying:
(1) <Sunny, Warm, Normal, Strong, Cool, Change> [unanimous Yes over the version space]
(2) <Rainy, Cool, Normal, Light, Warm, Same> [unanimous No over the version space]
(3) <Sunny, Warm, Normal, Light, Warm, Same> [half No, half Yes]
Why do we believe we can accurately classify (1) and (2), but not (3)?

An UNBiased Learner

What if we assumed nothing about the structure of c? Then learning becomes rote memorization. E.g., suppose c can be any boolean function over three variables, with
D = {<000, +>, <011, +>, <100, ->, <101, ->}.
Then the version space is defined by the boundaries S = {(000) ∨ (011)} and G = {¬((100) ∨ (101))}.

Originally VS = 2^X, the power set of X; now it is the set of truth tables consistent with the following:

x    c(x)
000  +
001  ?
010  ?
011  +
100  -
101  -
110  ?
111  ?

Since there are 4 holes, |VS| = 2^4 = 16, the number of ways to fill the holes, and for any as-yet-unclassified example, exactly half of the hypotheses in VS classify it as + and half as -. Thus, we cannot generalize without bias!
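The unbiased-learner count above can be verified by brute force. This sketch (names assumed) enumerates all truth tables over the 8 instances of three boolean variables, keeps those consistent with D, and checks the size and vote splits.

```python
from itertools import product

# the 8 instances over three boolean variables: '000', '001', ..., '111'
instances = ["".join(bits) for bits in product("01", repeat=3)]

# labeled data from the slide: 1 = positive, 0 = negative
D = {"000": 1, "011": 1, "100": 0, "101": 0}

# with no bias, a hypothesis is an arbitrary labeling of all 8 instances;
# the version space is every labeling that agrees with D
version_space = []
for labels in product([0, 1], repeat=len(instances)):
    h = dict(zip(instances, labels))
    if all(h[x] == y for x, y in D.items()):
        version_space.append(h)

print(len(version_space))  # 16, i.e. 2^4 ways to fill the four holes

# every unlabeled instance gets a perfect split of positive votes
for x in instances:
    if x not in D:
        votes = sum(h[x] for h in version_space)
        print(x, votes, "of", len(version_space))  # always 8 of 16
```

Each unlabeled instance corresponds to a free bit in the truth table, so exactly half of the surviving labelings set it to +: the version space gives no guidance at all off the training data.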

Inductive Bias

Consider a concept learning algorithm L, instances X, a target concept c, and training examples D_c = {<x, c(x)>}. Let L(x_i, D_c) denote the classification assigned to instance x_i by L after training on data D_c.

Definition: the inductive bias of L is any minimal set of assertions B such that for any target concept c and corresponding training examples D_c,

(∀x_i ∈ X) [(B ∧ D_c ∧ x_i) ⊢ L(x_i, D_c)]

where y ⊢ z means y logically entails z.

Inductive Systems and Equivalent Deductive Systems

Inductive system: training examples and a new instance go into the candidate elimination algorithm using hypothesis space H, which outputs a classification of the new instance, or "don't know".

Equivalent deductive system: the same training examples and new instance, together with the assertion "H contains the target concept", go into a theorem prover, which outputs a classification of the new instance, or "don't know". Here the inductive bias is made explicit.

Three Learners with Different Biases

1. Rote learner: store the examples; classify x iff it matches a previously observed example.
2. Version space candidate elimination algorithm.
3. Find-S.

Generally, a stronger bias means the ability to generalize on more examples from X, but the correctness of the learner depends on the correctness of the bias!
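The contrast between the first and third learners above can be made concrete. This is a small self-contained illustration (the tuple encoding, Find-S restated here for self-containment, and all names are assumptions): the rote learner's empty bias yields no generalization, while Find-S's conjunctive bias commits to an answer on an unseen instance.

```python
def rote_classify(memory, x):
    """No bias: answer only for instances seen verbatim in training."""
    for xi, label in memory:
        if xi == x:
            return "Yes" if label else "No"
    return "don't know"

def find_s(examples):
    """Conjunctive bias: assumes the target concept is a conjunction."""
    h = None
    for x, positive in examples:
        if positive:
            h = x if h is None else tuple(
                a if a == b else "?" for a, b in zip(h, x))
    return h

def conj_classify(h, x):
    return "Yes" if all(a == "?" or a == v for a, v in zip(h, x)) else "No"

train = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), True),
]
# the query instance on which the version space splits evenly
x_new = ("Sunny", "Warm", "Normal", "Light", "Warm", "Same")

print(rote_classify(train, x_new))           # don't know
print(conj_classify(find_s(train), x_new))   # No (Wind = Light violates h)
```

Find-S commits to "No" here even though the version space is split on this instance; candidate elimination, with its weaker bias, would answer "don't know". Whether that commitment is a feature or a bug depends on whether the conjunctive bias is correct for c.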