Chapter 4.5 Association Rules CSCI 347, Data Mining

Mining Association Rules
- Can be highly computationally complex
- One method:
  1. Determine item sets
  2. Build rules from those item sets

Vocabulary from before
- Coverage (support) of a rule: the number of instances it predicts correctly; written p in formulas
- t: the total number of instances to which the rule applies
- Accuracy (confidence) of a rule: the number of instances it predicts correctly divided by the total number of instances to which the rule applies, p/t

Input to Mining Association Rules
Two inputs:
- Minimum coverage (example: 2 instances)
- Minimum accuracy (example: 100%)

Vocabulary from before
- Coverage (support) of a rule: the number of instances it predicts correctly; written p in formulas
- t: the total number of instances to which the rule applies
- Accuracy (confidence) of a rule: the number of instances it predicts correctly divided by the total number of instances to which the rule applies, p/t
- Note: for 100% accuracy, p = t. The support of such a rule is therefore the number of instances that contain every item in its item set.
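
The definitions of p, t and p/t can be checked with a few lines of Python. This is a minimal sketch, not part of the slides: it assumes the standard 14-instance weather data set, and the names `ATTRS`, `ROWS`, `DATA`, `matches` and `coverage_and_accuracy` are made up for illustration.

```python
from fractions import Fraction

# Assumed data: the standard 14-instance weather data set (all values as strings).
ATTRS = ("Outlook", "Temp", "Humidity", "Windy", "Play")
ROWS = [
    ("Sunny", "Hot", "High", "False", "No"),
    ("Sunny", "Hot", "High", "True", "No"),
    ("Overcast", "Hot", "High", "False", "Yes"),
    ("Rainy", "Mild", "High", "False", "Yes"),
    ("Rainy", "Cool", "Normal", "False", "Yes"),
    ("Rainy", "Cool", "Normal", "True", "No"),
    ("Overcast", "Cool", "Normal", "True", "Yes"),
    ("Sunny", "Mild", "High", "False", "No"),
    ("Sunny", "Cool", "Normal", "False", "Yes"),
    ("Rainy", "Mild", "Normal", "False", "Yes"),
    ("Sunny", "Mild", "Normal", "True", "Yes"),
    ("Overcast", "Mild", "High", "True", "Yes"),
    ("Overcast", "Hot", "Normal", "False", "Yes"),
    ("Rainy", "Mild", "High", "True", "No"),
]
DATA = [dict(zip(ATTRS, row)) for row in ROWS]


def matches(instance, items):
    """True if the instance has every attribute = value pair in items."""
    return all(instance[attr] == value for attr, value in items.items())


def coverage_and_accuracy(antecedent, consequent, data):
    """Return (p, t, p/t) for the rule 'if antecedent then consequent'.

    t counts the instances the rule applies to (antecedent holds);
    p counts those among them that the rule predicts correctly.
    """
    applies = [x for x in data if matches(x, antecedent)]
    t = len(applies)
    p = sum(matches(x, consequent) for x in applies)
    return p, t, (Fraction(p, t) if t else None)


# If Temp = Cool then Humidity = Normal: every applicable instance is correct.
print(coverage_and_accuracy({"Temp": "Cool"}, {"Humidity": "Normal"}, DATA))
# If Windy = False then Play = Yes: 6 of the 8 applicable instances are correct.
print(coverage_and_accuracy({"Windy": "False"}, {"Play": "Yes"}, DATA))
```

For the first rule p = t = 4, which is the 100% accuracy case described in the note; for the second, p/t = 6/8.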

Item Sets
- Item: one attribute-value pair
  Example: outlook = rainy
- Item set: a set of items
  Example: outlook = rainy, temperature = cool, play = yes
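
Weather Data

Outlook   Temp  Humidity  Windy  Play
Sunny     Hot   High      False  No
Sunny     Hot   High      True   No
Overcast  Hot   High      False  Yes
Rainy     Mild  High      False  Yes
Rainy     Cool  Normal    False  Yes
Rainy     Cool  Normal    True   No
Overcast  Cool  Normal    True   Yes
Sunny     Mild  High      False  No
Sunny     Cool  Normal    False  Yes
Rainy     Mild  Normal    False  Yes
Sunny     Mild  Normal    True   Yes
Overcast  Mild  High      True   Yes
Overcast  Hot   Normal    False  Yes
Rainy     Mild  High      True   No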

Item Sets for Weather Data
In total (with a minimum support of two): 12 one-item sets, 47 two-item sets, 39 three-item sets, 6 four-item sets and 0 five-item sets.
Sample item sets, with support in parentheses:
- One-item sets:
    Outlook = Sunny (5)
    Temperature = Cool (4)
- Two-item sets:
    Outlook = Sunny, Temperature = Hot (2)
    Outlook = Sunny, Humidity = High (3)
- Three-item sets:
    Outlook = Sunny, Temperature = Hot, Humidity = High (2)
    Outlook = Sunny, Humidity = High, Windy = False (2)
- Four-item sets:
    Outlook = Sunny, Temperature = Hot, Humidity = High, Play = No (2)
    Outlook = Rainy, Temperature = Mild, Windy = False, Play = Yes (2)
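
The item-set counts above can be reproduced by brute force. The sketch below is an illustration only: it assumes the standard 14-instance weather data set, simply counts every subset of every instance rather than using a level-wise search such as Apriori, and the names `ATTRS`, `ROWS` and `frequent_item_sets` are hypothetical. Brute force is workable here only because each instance has just five attributes.

```python
from collections import Counter
from itertools import combinations

# Assumed data: the standard 14-instance weather data set (all values as strings).
ATTRS = ("Outlook", "Temperature", "Humidity", "Windy", "Play")
ROWS = [
    ("Sunny", "Hot", "High", "False", "No"),
    ("Sunny", "Hot", "High", "True", "No"),
    ("Overcast", "Hot", "High", "False", "Yes"),
    ("Rainy", "Mild", "High", "False", "Yes"),
    ("Rainy", "Cool", "Normal", "False", "Yes"),
    ("Rainy", "Cool", "Normal", "True", "No"),
    ("Overcast", "Cool", "Normal", "True", "Yes"),
    ("Sunny", "Mild", "High", "False", "No"),
    ("Sunny", "Cool", "Normal", "False", "Yes"),
    ("Rainy", "Mild", "Normal", "False", "Yes"),
    ("Sunny", "Mild", "Normal", "True", "Yes"),
    ("Overcast", "Mild", "High", "True", "Yes"),
    ("Overcast", "Hot", "Normal", "False", "Yes"),
    ("Rainy", "Mild", "High", "True", "No"),
]


def frequent_item_sets(rows, min_support=2):
    """Map each item set (a frozenset of (attribute, value) pairs) to its support.

    Every non-empty subset of every instance is counted once per instance;
    only sets reaching min_support are kept.
    """
    counts = Counter()
    for row in rows:
        items = tuple(zip(ATTRS, row))
        for k in range(1, len(items) + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1
    return {s: c for s, c in counts.items() if c >= min_support}


freq = frequent_item_sets(ROWS, min_support=2)
sizes = Counter(len(s) for s in freq)
print(sorted(sizes.items()))  # [(1, 12), (2, 47), (3, 39), (4, 6)]
print(freq[frozenset({("Outlook", "Sunny"), ("Humidity", "High")})])  # 3
```

No five-item set appears in the result because no two instances in the data are identical.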

Generating Rules from an Item Set
Once all item sets with minimum support have been generated, we can turn them into rules.
Example item set: Humidity = Normal, Windy = False, Play = Yes (4)
Seven ($2^N - 1$, here with N = 3 items) potential rules, each shown with its accuracy p/t:
  If Humidity = Normal and Windy = False then Play = Yes                  4/4
  If Humidity = Normal and Play = Yes then Windy = False                  4/6
  If Windy = False and Play = Yes then Humidity = Normal                  4/6
  If Humidity = Normal then Windy = False and Play = Yes                  4/7
  If Windy = False then Humidity = Normal and Play = Yes                  4/8
  If Play = Yes then Humidity = Normal and Windy = False                  4/9
  If (nothing) then Humidity = Normal and Windy = False and Play = Yes    4/14
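
The enumeration of the $2^N - 1$ candidate rules can be sketched as follows. This is an illustrative sketch, assuming the standard 14-instance weather data set and the item set Humidity = Normal, Windy = False, Play = Yes; `ATTRS`, `ROWS`, `support`, `rules_from_item_set` and `show` are hypothetical helper names. Every non-empty subset of the item set is used once as the consequent, and the remaining items form the antecedent.

```python
from itertools import combinations

# Assumed data: the standard 14-instance weather data set (all values as strings).
ATTRS = ("Outlook", "Temperature", "Humidity", "Windy", "Play")
ROWS = [
    ("Sunny", "Hot", "High", "False", "No"),
    ("Sunny", "Hot", "High", "True", "No"),
    ("Overcast", "Hot", "High", "False", "Yes"),
    ("Rainy", "Mild", "High", "False", "Yes"),
    ("Rainy", "Cool", "Normal", "False", "Yes"),
    ("Rainy", "Cool", "Normal", "True", "No"),
    ("Overcast", "Cool", "Normal", "True", "Yes"),
    ("Sunny", "Mild", "High", "False", "No"),
    ("Sunny", "Cool", "Normal", "False", "Yes"),
    ("Rainy", "Mild", "Normal", "False", "Yes"),
    ("Sunny", "Mild", "Normal", "True", "Yes"),
    ("Overcast", "Mild", "High", "True", "Yes"),
    ("Overcast", "Hot", "Normal", "False", "Yes"),
    ("Rainy", "Mild", "High", "True", "No"),
]


def support(items, rows=ROWS):
    """Number of instances containing every (attribute, value) pair in items."""
    return sum(all(row[ATTRS.index(a)] == v for a, v in items) for row in rows)


def rules_from_item_set(item_set, rows=ROWS):
    """Return all 2^N - 1 rules as (antecedent, consequent, p, t) tuples."""
    items = list(item_set)
    p = support(items, rows)  # instances on which the whole item set holds
    rules = []
    # Antecedent sizes N-1 down to 0, so the consequent is never empty.
    for k in range(len(items) - 1, -1, -1):
        for antecedent in combinations(items, k):
            consequent = tuple(i for i in items if i not in antecedent)
            t = support(antecedent, rows)  # an empty antecedent applies to all rows
            rules.append((antecedent, consequent, p, t))
    return rules


def show(items):
    return " and ".join(f"{a} = {v}" for a, v in items) or "(nothing)"


rules = rules_from_item_set(
    [("Humidity", "Normal"), ("Windy", "False"), ("Play", "Yes")]
)
for antecedent, consequent, p, t in rules:
    print(f"If {show(antecedent)} then {show(consequent)}: {p}/{t}")
```

Only the first rule printed has p = t, so it is the only one of the seven that would survive a 100% minimum accuracy.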

Rules for Weather Data
Rules with support > 1 and confidence = 100%:

No.  Association rule                                         Sup.  Conf.
1    Humidity = Normal, Windy = False => Play = Yes           4     100%
2    Temperature = Cool => Humidity = Normal                  4     100%
3    Outlook = Overcast => Play = Yes                         4     100%
4    Temperature = Cool, Play = Yes => Humidity = Normal      3     100%
...
58   Outlook = Sunny, Temperature = Hot => Humidity = High    2     100%

In total: 3 rules with support four, 5 with support three, 50 with support two.
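
Putting the two steps together (find item sets, then build rules) gives the rule totals on this slide. The sketch below is one possible way to do it, not the method from the course: it assumes the standard 14-instance weather data set, counts item sets by brute force, and uses the hypothetical names `ATTRS`, `ROWS` and `mine_rules`.

```python
from collections import Counter
from itertools import combinations

# Assumed data: the standard 14-instance weather data set (all values as strings).
ATTRS = ("Outlook", "Temperature", "Humidity", "Windy", "Play")
ROWS = [
    ("Sunny", "Hot", "High", "False", "No"),
    ("Sunny", "Hot", "High", "True", "No"),
    ("Overcast", "Hot", "High", "False", "Yes"),
    ("Rainy", "Mild", "High", "False", "Yes"),
    ("Rainy", "Cool", "Normal", "False", "Yes"),
    ("Rainy", "Cool", "Normal", "True", "No"),
    ("Overcast", "Cool", "Normal", "True", "Yes"),
    ("Sunny", "Mild", "High", "False", "No"),
    ("Sunny", "Cool", "Normal", "False", "Yes"),
    ("Rainy", "Mild", "Normal", "False", "Yes"),
    ("Sunny", "Mild", "Normal", "True", "Yes"),
    ("Overcast", "Mild", "High", "True", "Yes"),
    ("Overcast", "Hot", "Normal", "False", "Yes"),
    ("Rainy", "Mild", "High", "True", "No"),
]


def mine_rules(rows, min_support=2, min_accuracy=1.0):
    """Return rules as (antecedent, consequent, support) meeting both thresholds."""
    # Step 1: count the support of every item set that occurs in the data.
    counts = Counter()
    for row in rows:
        items = tuple(zip(ATTRS, row))
        for k in range(1, len(items) + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1

    # Step 2: split each sufficiently supported item set into antecedent => consequent.
    rules = []
    for item_set, p in counts.items():
        if p < min_support:
            continue
        for k in range(len(item_set)):  # k = 0 is the empty antecedent
            for combo in combinations(sorted(item_set), k):
                antecedent = frozenset(combo)
                t = counts[antecedent] if antecedent else len(rows)
                if p >= min_accuracy * t:  # accuracy p/t meets the threshold
                    rules.append((antecedent, item_set - antecedent, p))
    return rules


rules = mine_rules(ROWS, min_support=2, min_accuracy=1.0)
by_support = Counter(p for _, _, p in rules)
print(len(rules), sorted(by_support.items(), reverse=True))
# 58 [(4, 3), (3, 5), (2, 50)]
```

Lowering `min_accuracy` or `min_support` makes the rule list grow quickly, which is one reason the two thresholds are required inputs.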

Example Rules from the Same Set
Item set: Temperature = Cool, Humidity = Normal, Windy = False, Play = Yes (2)
Resulting rules (all with 100% confidence):
  Temperature = Cool, Windy = False => Humidity = Normal, Play = Yes
  Temperature = Cool, Windy = False, Humidity = Normal => Play = Yes
  Temperature = Cool, Windy = False, Play = Yes => Humidity = Normal
due to the following frequent item sets:
  Temperature = Cool, Windy = False (2)
  Temperature = Cool, Humidity = Normal, Windy = False (2)
  Temperature = Cool, Windy = False, Play = Yes (2)