Information Entropy: Illustrating Example. Etymology of Entropy. Definitions. Shannon Entropy. Entropy = randomness. Amount of uncertainty.

Information Entropy: Illustrating Example
Etymology of Entropy

Andrew Kusiak
Intelligent Systems Laboratory
2139 Seamans Center
The University of Iowa
Iowa City, Iowa 52242-1527
andrew-kusiak@uiowa.edu
http://www.icaen.uiowa.edu/~ankusiak
Tel: 319-335-5934
Fax: 319-335-5669

Entropy = randomness, the amount of uncertainty.

Shannon Entropy
Let S be a final probability space composed of two disjoint events E1 and E2 with probabilities p1 = p and p2 = 1 - p, respectively. The Shannon entropy is defined as
H(S) = H(p1, p2) = -p log p - (1 - p) log(1 - p)

Definitions: information content, entropy, information gain
For a set of s examples with s1, s2, ..., sm examples in classes 1, ..., m, the information content is
I(s1, s2, ..., sm) = -Σ_j (sj/s) log2(sj/s)
For an attribute A that partitions the set into subsets with class counts (s1j, ..., smj), the entropy (expected information) is
E(A) = Σ_j ((s1j + ... + smj)/s) I(s1j, ..., smj)
The information gain of attribute A is
Gain(A) = I(s1, s2, ..., sm) - E(A)
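The definitions above can be checked numerically. Below is a minimal Python sketch of I(...), E(A), and Gain(A); the function names info, expected_info, and gain are illustrative choices, not taken from the slides.

import math

def info(counts):
    # I(s1, ..., sm): information content of a node with class counts sj
    s = sum(counts)
    return -sum((c / s) * math.log2(c / s) for c in counts if c > 0)

def expected_info(partitions):
    # E(A): weighted average of I over the subsets produced by attribute A
    s = sum(sum(p) for p in partitions)
    return sum(sum(p) / s * info(p) for p in partitions)

def gain(parent_counts, partitions):
    # Gain(A) = I(s1, ..., sm) - E(A)
    return info(parent_counts) - expected_info(partitions)

# Shannon entropy of two disjoint events with p1 = p and p2 = 1 - p
p = 0.5
print(-p * math.log2(p) - (1 - p) * math.log2(1 - p))   # 1.0 bit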

Example: 2 classes, 2 nodes (Case 1)
D1 = number of examples in class 1, D2 = number of examples in class 2.
The parent node has D1 = 4 and D2 = 4; feature F1 splits the 8 examples into two nodes with class counts (4, 0) and (0, 4).
I(D1, D2) = -4/8 log2(4/8) - 4/8 log2(4/8) = 1
For D11 = 4, D21 = 0: I(D11, D21) = -4/4 log2(4/4) = 0
For D12 = 0, D22 = 4: I(D12, D22) = -4/4 log2(4/4) = 0
E(F1) = 4/8 I(D11, D21) + 4/8 I(D12, D22) = 0
Gain(F1) = I(D1, D2) - E(F1) = 1

Example: 3 classes, 2 nodes (Case 2)
The parent node has D1 = 2, D2 = 3, D3 = 3; F1 splits the 8 examples into two nodes with class counts (2, 2, 0) and (0, 1, 3).
I(D1, D2, D3) = -2/8 log2(2/8) - 3/8 log2(3/8) - 3/8 log2(3/8) = 1.56
For D11 = 2, D21 = 2, D31 = 0: I(D11, D21) = -2/4 log2(2/4) - 2/4 log2(2/4) = 1
For D12 = 0, D22 = 1, D32 = 3: I(D22, D32) = -1/4 log2(1/4) - 3/4 log2(3/4) = 0.81
E(F1) = 4/8 I(D11, D21) + 4/8 I(D22, D32) = 0.905
Gain(F1) = I(D1, D2, D3) - E(F1) = 0.655

Example: 3 classes, 2 nodes (Case 3)
The parent node has D1 = 1, D2 = 3, D3 = 4; F1 splits the 8 examples into two nodes with class counts (1, 3, 0) and (0, 0, 4).
I(D1, D2, D3) = -1/8 log2(1/8) - 3/8 log2(3/8) - 4/8 log2(4/8) = 1.41
For D11 = 1, D21 = 3, D31 = 0: I(D11, D21) = -1/4 log2(1/4) - 3/4 log2(3/4) = 0.81
For D12 = 0, D22 = 0, D32 = 4: I(D32) = -4/4 log2(4/4) = 0
E(F1) = 4/8 I(D11, D21) + 4/8 I(D32) = 0.41
Gain(F1) = I(D1, D2, D3) - E(F1) = 1

Example: 3 classes, 3 nodes (Case 4)
The parent node has D1 = 2, D2 = 3, D3 = 3; F1 takes three values (the third is labeled Green in the slide) and splits the 8 examples into nodes with class counts (2, 0, 0), (0, 3, 0), and (0, 0, 3).
I(D1, D2, D3) = -2/8 log2(2/8) - 3/8 log2(3/8) - 3/8 log2(3/8) = 1.56
For D11 = 2, D21 = 0, D31 = 0: I(D11) = -2/2 log2(2/2) = 0
For D12 = 0, D22 = 3, D32 = 0: I(D22) = -3/3 log2(3/3) = 0
For the Green node, D13 = 0, D23 = 0, D33 = 3: I(D33) = -3/3 log2(3/3) = 0
E(F1) = 2/8 I(D11) + 3/8 I(D22) + 3/8 I(D33) = 0
Every node is pure, so E(F1) = 0 and the gain equals the full parent information:
Gain(F1) = I(D1, D2, D3) - E(F1) = 1.56
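Using the helper functions from the sketch above, Case 2 can be reproduced directly; the class counts are those given on the slide.

# Case 2: parent class counts (2, 3, 3); F1 splits the 8 examples into
# nodes with class counts (2, 2, 0) and (0, 1, 3)
parent = [2, 3, 3]
split = [[2, 2, 0], [0, 1, 3]]
print(info(parent))            # 1.561..., shown as 1.56 on the slide
print(expected_info(split))    # 0.905...
print(gain(parent, split))     # 0.655...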

Example: 4 classes, 3 nodes (Case 5)
The parent node has D1 = D2 = D3 = D4 = 2; F1 splits the 8 examples into nodes with class counts (1, 0, 0, 0), (0, 2, 0, 0), and (1, 0, 2, 2) (the third node is labeled Green in the slide).
I(D1, D2, D3, D4) = -2/8 log2(2/8) - 2/8 log2(2/8) - 2/8 log2(2/8) - 2/8 log2(2/8) = 2
For D11 = 1, D21 = 0, D31 = 0, D41 = 0: I(D11) = -1/1 log2(1/1) = 0
For D12 = 0, D22 = 2, D32 = 0, D42 = 0: I(D22) = -2/2 log2(2/2) = 0
For the Green node, D13 = 1, D23 = 0, D33 = 2, D43 = 2: I(D13, D33, D43) = -1/5 log2(1/5) - 2/5 log2(2/5) - 2/5 log2(2/5) = 1.52
E(F1) = 1/8 I(D11) + 2/8 I(D22) + 5/8 I(D13, D33, D43) = 0.95
Gain(F1) = I(D1, D2, D3, D4) - E(F1) = 1.05

Summary
Case    E(F1)    Gain(F1)
1       0        1
2       0.905    0.655
3       0.41     1
4       0        1.56
5       0.95     1.05

Continuous values as splits: see http://www.icaen.uiowa.edu/%7ecop/public/kantardzic.pdf

Transformation between log2(x) and log10(x):
log2(x) = log10(x)/log10(2) = 3.322 log10(x)
log10(x) = log2(x)/log2(10) = 0.301 log2(x)
Formulas in Excel: for log2(x), use the function =log(x,2); for log10(x), use =log(x,10) or =log10(x).

Play Tennis: Training Data Set
Each column is a feature (attribute); the entries are feature values. The last column (Play tennis) is the decision.

Outlook    Temperature  Humidity  Wind    Play tennis
sunny      hot          high      weak    no
sunny      hot          high      strong  no
overcast   hot          high      weak    yes
rain       mild         high      weak    yes
rain       cool         normal    weak    yes
rain       cool         normal    strong  no
overcast   cool         normal    strong  yes
sunny      mild         high      weak    no
sunny      cool         normal    weak    yes
rain       mild         normal    weak    yes
sunny      mild         normal    strong  yes
overcast   mild         high      strong  yes
overcast   hot          normal    weak    yes
rain       mild         high      strong  no
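As a quick check of the base-change formula, and to put the Play Tennis table into a form the earlier gain functions can consume, here is a short Python sketch; the DATA name and the tuple layout are illustrative choices, not from the slides.

import math

# log2(x) = log10(x) / log10(2); the factor 3.322 above is 1 / log10(2)
x = 14.0
print(math.log2(x), math.log10(x) / math.log10(2))   # both 3.807...

# Play Tennis training set: (outlook, temperature, humidity, wind, play)
DATA = [
    ("sunny",    "hot",  "high",   "weak",   "no"),
    ("sunny",    "hot",  "high",   "strong", "no"),
    ("overcast", "hot",  "high",   "weak",   "yes"),
    ("rain",     "mild", "high",   "weak",   "yes"),
    ("rain",     "cool", "normal", "weak",   "yes"),
    ("rain",     "cool", "normal", "strong", "no"),
    ("overcast", "cool", "normal", "strong", "yes"),
    ("sunny",    "mild", "high",   "weak",   "no"),
    ("sunny",    "cool", "normal", "weak",   "yes"),
    ("rain",     "mild", "normal", "weak",   "yes"),
    ("sunny",    "mild", "normal", "strong", "yes"),
    ("overcast", "mild", "high",   "strong", "yes"),
    ("overcast", "hot",  "normal", "weak",   "yes"),
    ("rain",     "mild", "high",   "strong", "no"),
]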

Feature Selection
Information gain of each feature on the full training set S:
Gain(S, outlook) = 0.246
Gain(S, humidity) = 0.151
Gain(S, wind) = 0.048
Gain(S, temperature) = 0.029

Constructing the Decision Tree
Outlook has the largest gain, so it becomes the root. It partitions the 14 training examples into three branches: Sunny (5 examples), Overcast (4 examples, all Yes), and Rain (5 examples). The Overcast branch is pure and becomes a Yes leaf; the Sunny and Rain branches are split further.

Complete Decision Tree
Outlook = Sunny: test Humidity (High: No, Normal: Yes)
Outlook = Overcast: Yes
Outlook = Rain: test Wind (Strong: No, Weak: Yes)

From Decision Tree to Rules
If Outlook = Overcast
OR (Outlook = Sunny AND Humidity = Normal)
OR (Outlook = Rain AND Wind = Weak)
THEN Play tennis = Yes
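The four gains listed under Feature Selection can be recomputed with the info/gain helpers and the DATA table from the earlier sketches (this snippet assumes those definitions are already in scope; the ATTRS order simply mirrors the tuple layout assumed above).

from collections import Counter, defaultdict

ATTRS = ["outlook", "temperature", "humidity", "wind"]

def class_counts(rows):
    # counts of yes/no decisions in a subset of rows
    return list(Counter(r[-1] for r in rows).values())

for i, name in enumerate(ATTRS):
    groups = defaultdict(list)
    for row in DATA:
        groups[row[i]].append(row)    # partition S by the attribute's values
    partitions = [class_counts(g) for g in groups.values()]
    print(name, gain(class_counts(DATA), partitions))

# Prints roughly: outlook 0.247, temperature 0.029, humidity 0.152, wind 0.048
# (the slide's 0.246 and 0.151 reflect slightly different rounding)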

Decision Tree: Key Characteristics
Complete space of finite discrete-valued functions.
Maintains a single hypothesis.
No backtracking in the search.
All training examples are used at each step.

Avoiding Overfitting the Data
Compare accuracy on the training data set with accuracy on the testing data set, and monitor the size of the tree.

Reference
J. R. Quinlan, Induction of decision trees, Machine Learning, 1, 1986, 81-106.