Information Entropy: Illustrating Example. Etymology of Entropy. Definitions. Shannon Entropy. Entropy = randomness. Amount of uncertainty.

Information Entropy: Illustrating Example
Etymology of Entropy

Andrew Kusiak
Intelligent Systems Laboratory
2139 Seamans Center
The University of Iowa
Iowa City, Iowa 52242-1527
andrew-kusiak@uiowa.edu
http://www.icaen.uiowa.edu/~ankusiak
Tel: 319-335-5934
Fax: 319-335-5669

Entropy = randomness, the amount of uncertainty.

Shannon Entropy
Let S be a final probability space composed of two disjoint events E1 and E2 with probabilities p1 = p and p2 = 1 - p, respectively. The Shannon entropy is defined as
H(S) = H(p1, p2) = -p log p - (1 - p) log(1 - p)

Definitions: information content, entropy, information gain
For a set of s examples with s1, s2, ..., sm examples in classes 1, ..., m, the information content is
I(s1, s2, ..., sm) = -Σ_j (sj/s) log2(sj/s)
For an attribute A that partitions the set into subsets with class counts (s1j, ..., smj), the entropy (expected information) is
E(A) = Σ_j ((s1j + ... + smj)/s) I(s1j, ..., smj)
The information gain of attribute A is
Gain(A) = I(s1, s2, ..., sm) - E(A)
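The definitions above can be checked numerically. Below is a minimal Python sketch of I(...), E(A), and Gain(A); the function names info, expected_info, and gain are illustrative choices, not taken from the slides.

import math

def info(counts):
    # I(s1, ..., sm): information content of a node with class counts sj
    s = sum(counts)
    return -sum((c / s) * math.log2(c / s) for c in counts if c > 0)

def expected_info(partitions):
    # E(A): weighted average of I over the subsets produced by attribute A
    s = sum(sum(p) for p in partitions)
    return sum(sum(p) / s * info(p) for p in partitions)

def gain(parent_counts, partitions):
    # Gain(A) = I(s1, ..., sm) - E(A)
    return info(parent_counts) - expected_info(partitions)

# Shannon entropy of two disjoint events with p1 = p and p2 = 1 - p
p = 0.5
print(-p * math.log2(p) - (1 - p) * math.log2(1 - p))   # 1.0 bit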

Example: 2 classes, 2 nodes (Case 1)
D1 = number of examples in class 1, D2 = number of examples in class 2.
The parent node has D1 = 4 and D2 = 4; feature F1 splits the 8 examples into two nodes with class counts (4, 0) and (0, 4).
I(D1, D2) = -4/8 log2(4/8) - 4/8 log2(4/8) = 1
For D11 = 4, D21 = 0: I(D11, D21) = -4/4 log2(4/4) = 0
For D12 = 0, D22 = 4: I(D12, D22) = -4/4 log2(4/4) = 0
E(F1) = 4/8 I(D11, D21) + 4/8 I(D12, D22) = 0
Gain(F1) = I(D1, D2) - E(F1) = 1

Example: 3 classes, 2 nodes (Case 2)
The parent node has D1 = 2, D2 = 3, D3 = 3; F1 splits the 8 examples into two nodes with class counts (2, 2, 0) and (0, 1, 3).
I(D1, D2, D3) = -2/8 log2(2/8) - 3/8 log2(3/8) - 3/8 log2(3/8) = 1.56
For D11 = 2, D21 = 2, D31 = 0: I(D11, D21) = -2/4 log2(2/4) - 2/4 log2(2/4) = 1
For D12 = 0, D22 = 1, D32 = 3: I(D22, D32) = -1/4 log2(1/4) - 3/4 log2(3/4) = 0.81
E(F1) = 4/8 I(D11, D21) + 4/8 I(D22, D32) = 0.905
Gain(F1) = I(D1, D2, D3) - E(F1) = 0.655

Example: 3 classes, 2 nodes (Case 3)
The parent node has D1 = 1, D2 = 3, D3 = 4; F1 splits the 8 examples into two nodes with class counts (1, 3, 0) and (0, 0, 4).
I(D1, D2, D3) = -1/8 log2(1/8) - 3/8 log2(3/8) - 4/8 log2(4/8) = 1.41
For D11 = 1, D21 = 3, D31 = 0: I(D11, D21) = -1/4 log2(1/4) - 3/4 log2(3/4) = 0.81
For D12 = 0, D22 = 0, D32 = 4: I(D32) = -4/4 log2(4/4) = 0
E(F1) = 4/8 I(D11, D21) + 4/8 I(D32) = 0.41
Gain(F1) = I(D1, D2, D3) - E(F1) = 1

Example: 3 classes, 3 nodes (Case 4)
The parent node has D1 = 2, D2 = 3, D3 = 3; F1 takes three values (the third is labeled Green in the slide) and splits the 8 examples into nodes with class counts (2, 0, 0), (0, 3, 0), and (0, 0, 3).
I(D1, D2, D3) = -2/8 log2(2/8) - 3/8 log2(3/8) - 3/8 log2(3/8) = 1.56
For D11 = 2, D21 = 0, D31 = 0: I(D11) = -2/2 log2(2/2) = 0
For D12 = 0, D22 = 3, D32 = 0: I(D22) = -3/3 log2(3/3) = 0
For the Green node, D13 = 0, D23 = 0, D33 = 3: I(D33) = -3/3 log2(3/3) = 0
E(F1) = 2/8 I(D11) + 3/8 I(D22) + 3/8 I(D33) = 0
Every node is pure, so E(F1) = 0 and the gain equals the full parent information:
Gain(F1) = I(D1, D2, D3) - E(F1) = 1.56
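Using the helper functions from the sketch above, Case 2 can be reproduced directly; the class counts are those given on the slide.

# Case 2: parent class counts (2, 3, 3); F1 splits the 8 examples into
# nodes with class counts (2, 2, 0) and (0, 1, 3)
parent = [2, 3, 3]
split = [[2, 2, 0], [0, 1, 3]]
print(info(parent))            # 1.561..., shown as 1.56 on the slide
print(expected_info(split))    # 0.905...
print(gain(parent, split))     # 0.655...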

Example: 4 classes, 3 nodes (Case 5)
The parent node has D1 = D2 = D3 = D4 = 2; F1 splits the 8 examples into nodes with class counts (1, 0, 0, 0), (0, 2, 0, 0), and (1, 0, 2, 2) (the third node is labeled Green in the slide).
I(D1, D2, D3, D4) = -2/8 log2(2/8) - 2/8 log2(2/8) - 2/8 log2(2/8) - 2/8 log2(2/8) = 2
For D11 = 1, D21 = 0, D31 = 0, D41 = 0: I(D11) = -1/1 log2(1/1) = 0
For D12 = 0, D22 = 2, D32 = 0, D42 = 0: I(D22) = -2/2 log2(2/2) = 0
For the Green node, D13 = 1, D23 = 0, D33 = 2, D43 = 2: I(D13, D33, D43) = -1/5 log2(1/5) - 2/5 log2(2/5) - 2/5 log2(2/5) = 1.52
E(F1) = 1/8 I(D11) + 2/8 I(D22) + 5/8 I(D13, D33, D43) = 0.95
Gain(F1) = I(D1, D2, D3, D4) - E(F1) = 1.05

Summary
Case    E(F1)    Gain(F1)
1       0        1
2       0.905    0.655
3       0.41     1
4       0        1.56
5       0.95     1.05

Continuous values as splits: see http://www.icaen.uiowa.edu/%7ecop/public/kantardzic.pdf

Transformation between log2(x) and log10(x):
log2(x) = log10(x)/log10(2) = 3.322 log10(x)
log10(x) = log2(x)/log2(10) = 0.301 log2(x)
Formulas in Excel: for log2(x), use the function =log(x,2); for log10(x), use =log(x,10) or =log10(x).

Play Tennis: Training Data Set
Each column is a feature (attribute); the entries are feature values. The last column (Play tennis) is the decision.

Outlook    Temperature  Humidity  Wind    Play tennis
sunny      hot          high      weak    no
sunny      hot          high      strong  no
overcast   hot          high      weak    yes
rain       mild         high      weak    yes
rain       cool         normal    weak    yes
rain       cool         normal    strong  no
overcast   cool         normal    strong  yes
sunny      mild         high      weak    no
sunny      cool         normal    weak    yes
rain       mild         normal    weak    yes
sunny      mild         normal    strong  yes
overcast   mild         high      strong  yes
overcast   hot          normal    weak    yes
rain       mild         high      strong  no
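As a quick check of the base-change formula, and to put the Play Tennis table into a form the earlier gain functions can consume, here is a short Python sketch; the DATA name and the tuple layout are illustrative choices, not from the slides.

import math

# log2(x) = log10(x) / log10(2); the factor 3.322 above is 1 / log10(2)
x = 14.0
print(math.log2(x), math.log10(x) / math.log10(2))   # both 3.807...

# Play Tennis training set: (outlook, temperature, humidity, wind, play)
DATA = [
    ("sunny",    "hot",  "high",   "weak",   "no"),
    ("sunny",    "hot",  "high",   "strong", "no"),
    ("overcast", "hot",  "high",   "weak",   "yes"),
    ("rain",     "mild", "high",   "weak",   "yes"),
    ("rain",     "cool", "normal", "weak",   "yes"),
    ("rain",     "cool", "normal", "strong", "no"),
    ("overcast", "cool", "normal", "strong", "yes"),
    ("sunny",    "mild", "high",   "weak",   "no"),
    ("sunny",    "cool", "normal", "weak",   "yes"),
    ("rain",     "mild", "normal", "weak",   "yes"),
    ("sunny",    "mild", "normal", "strong", "yes"),
    ("overcast", "mild", "high",   "strong", "yes"),
    ("overcast", "hot",  "normal", "weak",   "yes"),
    ("rain",     "mild", "high",   "strong", "no"),
]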

Feature Selection
Information gain of each feature on the full training set S:
Gain(S, outlook) = 0.246
Gain(S, humidity) = 0.151
Gain(S, wind) = 0.048
Gain(S, temperature) = 0.029

Constructing the Decision Tree
Outlook has the largest gain, so it becomes the root. It partitions the 14 training examples into three branches: Sunny (5 examples), Overcast (4 examples, all Yes), and Rain (5 examples). The Overcast branch is pure and becomes a Yes leaf; the Sunny and Rain branches are split further.

Complete Decision Tree
Outlook = Sunny: test Humidity (High: No, Normal: Yes)
Outlook = Overcast: Yes
Outlook = Rain: test Wind (Strong: No, Weak: Yes)

From Decision Tree to Rules
If Outlook = Overcast
OR (Outlook = Sunny AND Humidity = Normal)
OR (Outlook = Rain AND Wind = Weak)
THEN Play tennis = Yes
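The four gains listed under Feature Selection can be recomputed with the info/gain helpers and the DATA table from the earlier sketches (this snippet assumes those definitions are already in scope; the ATTRS order simply mirrors the tuple layout assumed above).

from collections import Counter, defaultdict

ATTRS = ["outlook", "temperature", "humidity", "wind"]

def class_counts(rows):
    # counts of yes/no decisions in a subset of rows
    return list(Counter(r[-1] for r in rows).values())

for i, name in enumerate(ATTRS):
    groups = defaultdict(list)
    for row in DATA:
        groups[row[i]].append(row)    # partition S by the attribute's values
    partitions = [class_counts(g) for g in groups.values()]
    print(name, gain(class_counts(DATA), partitions))

# Prints roughly: outlook 0.247, temperature 0.029, humidity 0.152, wind 0.048
# (the slide's 0.246 and 0.151 reflect slightly different rounding)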

Decision Tree: Key Characteristics
Complete space of finite discrete-valued functions.
Maintains a single hypothesis.
No backtracking in the search.
All training examples are used at each step.

Avoiding Overfitting the Data
Compare accuracy on the training data set with accuracy on the testing data set, and monitor the size of the tree.

Reference
J. R. Quinlan, Induction of decision trees, Machine Learning, 1, 1986, 81-106.