Statistical Analysis of Environmental Data - Academic Year Prof. Fernando Sansò CLUSTER ANALYSIS


1 Statistical Analysis of Environmental Data - Academic Year Prof. Fernando Sansò

EXERCISES - PART: CLUSTER ANALYSIS

Classification of clustering methods:

  Supervised
    Deterministic: Discriminant Analysis
    Stochastic: Bayesian
  Unsupervised
    Deterministic: Hierarchical methods (AGNES, DIANA)
    Stochastic: Optimization approach methods (PAM, FANNY)

Table of Contents:
  Hierarchical Methods
    AGNES
    AGNES example
    The silhouette index criterion
    DIANA
    DIANA example
  Optimization Method (PAM and FANNY)
    PAM
    PAM example
    FANNY
    FANNY example
  Supervised classification
    Bayesian approach
    Discriminant analysis
    Supervised classification example
  APPENDIX

Update: 6/0/08. Authors: A. Molteni, L. Pertusini, M. Reguzzoni

2 Hierarchical Methods: the idea is to group data in clusters without using training samples, but defining a similarity/dissimilarity table (see Fig. 1) based on a distance concept. These methods are called hierarchical because they require many steps, and the choices taken at each step are never changed in the subsequent steps.

A distance is a functional(1) d of two vectors x and y that fulfils the following properties:

  d(x, y) >= 0  and  d(x, y) = 0  <=>  x = y
  d(x, y) = d(y, x)
  d(x, y) <= d(x, z) + d(z, y)

Distance between points:

  Euclidean distance:   d(P1, P2) = sqrt( Σ_i (x_1i - x_2i)² )
  City-block distance:  d(P1, P2) = Σ_i |x_1i - x_2i|

Fig. 1a. Examples of distance between points P1 and P2.

Distance between clusters:

  Mean distance:     D(A1, A2) = (1 / (n_A1 · n_A2)) Σ_{P_i ∈ A1, P_j ∈ A2} d(x_i, x_j)
  Minimum distance:  D(A1, A2) = min_{P_i ∈ A1, P_j ∈ A2} d(x_i, x_j)

Fig. 1b. Examples of distance between clusters A1 and A2.

(1) A function on vectors whose values are scalars.
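The distances above can be sketched in a few lines of code. This is a minimal Python illustration (the course material uses Matlab; the function names here are our own):

```python
import math

def euclidean(x, y):
    # Euclidean distance between two points given as coordinate lists
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def city_block(x, y):
    # City-block (Manhattan) distance
    return sum(abs(a - b) for a, b in zip(x, y))

def mean_distance(A1, A2, d=euclidean):
    # Mean distance between clusters: average over all pairs of points
    return sum(d(x, y) for x in A1 for y in A2) / (len(A1) * len(A2))

def min_distance(A1, A2, d=euclidean):
    # Minimum distance between clusters: distance of the closest pair
    return min(d(x, y) for x in A1 for y in A2)
```

Any functional satisfying the three properties above can be passed as `d`; the Euclidean distance is the default here only for convenience.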

3 AGNES (AGglomerative NESting)

It is an agglomerative method, starting from as many clusters as the number of data and ending with a unique cluster.

Flowchart:
  - Starting point: define one cluster for each element
  - Create the dissimilarity table
  - Find the minimum distance and merge the closest clusters
  - Are all elements in a single cluster? If NO, update the dissimilarity table and repeat; if YES, choose the number of clusters. END.

AGNES example

Consider the following 8 observations (in one dimension), provided in ascending order: X1, X2, ..., X8. Group the data using AGNES and then choose a reasonable number of clusters.

At the beginning we define 8 clusters composed of one element only. Then we create the dissimilarity table on the basis of the Euclidean distance (see Fig. 1a), namely d_ij = |X_i - X_j|. Note that the dissimilarity table is a symmetric matrix with zero values on the diagonal.

Fig. 2. Dissimilarity table (1st step).

4 The minimum distance in the table above (Fig. 2) is d(X7, X8); thus the first cluster, formed by X7 and X8, is created. Now we update the dissimilarity table by computing distances with respect to the cluster {7,8} using the concept of mean distance (see Fig. 1b), for example d(X1, {7,8}) = (d(X1,X7) + d(X1,X8)) / 2.

Fig. 3. Dissimilarity table (2nd step).

The minimum distance in the table above (Fig. 3) is d(X6, {7,8}); thus the element X6 is included in the cluster {7,8}, creating the cluster {6,7,8}. Again we update the dissimilarity table by computing distances with respect to the cluster {6,7,8}.

Fig. 4. Dissimilarity table (3rd step).

The minimum distance in the table above (Fig. 4) is d(X1, X2); thus a new cluster, formed by X1 and X2, is created.

Fig. 5. Dissimilarity table (4th step).

The minimum distance in the table above (Fig. 5) is d(X3, X4); thus a new cluster, formed by X3 and X4, is created.

5 Fig. 6. Dissimilarity table (5th step).

The minimum distance in the table above (Fig. 6) is d(X5, {6,7,8}); thus the element X5 is included in the cluster {6,7,8}, creating the cluster {5,6,7,8}.

Fig. 7. Dissimilarity table (6th step).

The minimum distance in the table above (Fig. 7) is d({1,2}, {3,4}); thus the cluster {1,2,3,4} is created.

Fig. 8. Dissimilarity table (7th step).

Finally, a unique cluster is obtained: cluster {1,2,3,4,5,6,7,8}. In order to choose a reasonable number of clusters (a unique cluster with all the elements is obviously not meaningful), a stop criterion for the agglomerative procedure has to be defined, such as:
  - the maximum distance criterion, defining a threshold on the maximum distance between clusters (see flag diagram in Fig. 9a);
  - the maximum gradient criterion, defining a threshold on the maximum variation of the agglomerative distance (see graph in Fig. 9b);
  - the silhouette index criterion.
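The agglomerative loop of the example (repeatedly merge the two clusters at minimum mean distance until the chosen number of clusters is reached) can be sketched as follows. This is an illustrative Python version with made-up data, not the course's Matlab code:

```python
def agnes(points, n_clusters):
    """Agglomerative nesting on 1-D data with the mean (average-linkage)
    distance of Fig. 1b; a sketch of the procedure in the example."""
    clusters = [[p] for p in points]  # start: one cluster per element

    def mean_dist(a, b):
        return sum(abs(x - y) for x in a for y in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # find the pair of clusters at minimum mean distance
        _, i, j = min((mean_dist(clusters[i], clusters[j]), i, j)
                      for i in range(len(clusters))
                      for j in range(i + 1, len(clusters)))
        # merge cluster j into cluster i (choices are never undone)
        clusters[i].extend(clusters[j])
        del clusters[j]
    return clusters
```

For instance, `agnes([0.0, 0.1, 1.0, 1.1, 5.0, 5.1, 5.2, 5.3], 2)` first merges the closest singletons and ends with a four-element cluster on each side of the large gap.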

6 Fig. 9a. Flag diagram (elements X1, ..., X8 versus agglomerative distance).

Fig. 9b. Maximum gradient criterion (acceptable versus unacceptable gradient).

7 The silhouette index criterion

The mean distance between the element x_i and the cluster A_m is generally defined as:

  d(x_i, A_m) = (1 / n_Am) Σ_{P_j ∈ A_m} d(x_i, x_j)

We have to define the parameters a_i, b_i and the silhouette index s_i for each element x_i ∈ A_l:

  a_i = d(x_i, A_l): distance within its own cluster (intra-cluster distance);
  b_i = min_{m ≠ l} d(x_i, A_m): distance with respect to the closest other cluster A_m (inter-cluster distance);
  s_i = (b_i - a_i) / max(a_i, b_i).

If s_i ≈ -1, then a_i >> b_i, thus the clustering of the element x_i is very bad. If s_i ≈ 1, then a_i << b_i, thus the clustering of the element x_i is optimal.

Moreover, it is possible to evaluate the quality of a cluster by computing the average of the silhouette indexes of all the elements forming the cluster:

  s(A_l) = (1/n) Σ_{x_i ∈ A_l} s_i,  n = n_Al.

As an example, let us consider the 6th step of the AGNES exercise, with clusters {X1,X2}, {X3,X4}, {X5,X6,X7,X8}, and evaluate the quality of the cluster {X1,X2} using the silhouette index:

  a(X1) = d(X1, {X2}),  b(X1) = min[ d(X1, {X3,X4}), d(X1, {X5,X6,X7,X8}) ],  s(X1) = (b(X1) - a(X1)) / max(a(X1), b(X1))

and similarly for X2. Since a << b for both elements, s(X1) and s(X2) are close to 1 and

  s({X1,X2}) = (s(X1) + s(X2)) / 2 ≈ 1: good cluster classification!
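The definitions of a_i, b_i and s_i translate directly into code. A minimal Python sketch (the names are ours, and the default distance is the 1-D absolute difference used in the example):

```python
def silhouette(element, own, others, d=lambda a, b: abs(a - b)):
    """Silhouette index s_i = (b_i - a_i) / max(a_i, b_i) for one element;
    `own` is its cluster, `others` is the list of the remaining clusters."""
    # a_i: mean distance to the other members of its own cluster
    a = sum(d(element, x) for x in own if x != element) / (len(own) - 1)
    # b_i: mean distance to the closest of the other clusters
    b = min(sum(d(element, x) for x in c) / len(c) for c in others)
    return (b - a) / max(a, b)
```

Averaging `silhouette` over the members of a cluster gives the cluster quality index s(A_l) of the text.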

8 DIANA (DIvisive ANAlysis)

It is a divisive method, starting from a unique cluster composed of all the data and ending with many clusters composed of one element only.

Flowchart:
  - Starting point: include all the elements in a unique cluster
  - Create the dissimilarity table
  - Find the element with the maximum distance and create a new cluster
  - Create the dissimilarity table of the remaining elements of the original cluster
  - Find the element with the maximum distance
  - Is the element closer to the new cluster? If YES, include the element into the new cluster and repeat; if NO, choose the cluster with the maximum diameter and divide it
  - Does each cluster contain one element only? If NO, repeat; if YES, choose the number of clusters. END.

9 DIANA example

Consider the same 8 observations of the AGNES example, but now group the data using DIANA. The starting point is a unique cluster composed of all the elements. The dissimilarity table is the same as in Fig. 2.

Fig. 10. Dissimilarity table.

The initialization can be summarized as follows:

  A0 = {1,2,3,4,5,6,7,8},  B0 = ∅

Fig. 11. Step 0.

The first step consists in finding the most dissimilar element within A0; therefore we calculate the mean value of each row of the dissimilarity table (Fig. 10), for example d(X1, A0) = (d_12 + d_13 + ... + d_18) / 7.

Fig. 12. Dissimilarity table within A0.

The maximum distance in the table above (Fig. 12) is d(X1, A0); thus the cluster {1} is created:

  A1 = {2,3,4,5,6,7,8},  B1 = {1}

Fig. 13. Step 1.

Then we have to find the most dissimilar element within the cluster A1; therefore we calculate the mean value of each row of the dissimilarity table (Fig. 10), disregarding the X1-column and the X1-row, for example d(X2, A1) = (d_23 + d_24 + ... + d_28) / 6.

10 Fig. 14. Dissimilarity table within A1.

The maximum distance in the table above (Fig. 14) is d(X2, A1). Now the following test is performed:
  - if d(X2, A1) > d(X2, B1), the element X2 is included in B;
  - if d(X2, A1) < d(X2, B1), the element X2 remains in A and the cluster division is stopped.

In this case d(X2, A1) > d(X2, B1) = d(X2, X1) (see Fig. 10), thus the element X2 is included in B:

  A2 = {3,4,5,6,7,8},  B2 = {1,2}

Fig. 15. Step 2.

NOTE that the elements can only shift from cluster A to cluster B and not vice versa!

The procedure is then repeated:

Fig. 16. Dissimilarity table within A2.

In this case d(X3, A2) > d(X3, B2) = (d_31 + d_32)/2, thus the element X3 is included in B:

  A3 = {4,5,6,7,8},  B3 = {1,2,3}

Fig. 17. Step 3.

Fig. 18. Dissimilarity table within A3.

In this case d(X4, A3) > d(X4, B3) = (d_41 + d_42 + d_43)/3, thus the element X4 is included in B:

  A4 = {5,6,7,8},  B4 = {1,2,3,4}

Fig. 19. Step 4.

Fig. 20. Dissimilarity table within A4.

In this case d(X5, A4) < d(X5, B4) = (d_51 + d_52 + d_53 + d_54)/4, thus the element X5 has not to be included in B, and the process stops.

11 In order to choose which cluster to divide, it is necessary to compute the diameters of A4 and B4, looking at the table in Fig. 10. The cluster to be divided is the one with the largest diameter:

  diam(A4) = max_{Xi,Xj ∈ A4} d(i,j) = d(X5, X8)
  diam(B4) = max_{Xi,Xj ∈ B4} d(i,j) = d(X1, X4)

Since diam(B4) > diam(A4), we are going to divide the cluster B4, and the cluster A4 is now frozen.

Fig. 21. Dissimilarity table within B4.

The maximum distance in the table above (Fig. 21) is d(X4, B4); thus the cluster {4} is created:

  A5 = {5,6,7,8},  B5 = {1,2,3},  C5 = {4}

Fig. 22. Step 5.

Fig. 23. Dissimilarity table within B5.

In this case d(X3, B5) > d(X3, C5) (see Fig. 10), thus the element X3 is included in C:

  A6 = {5,6,7,8},  B6 = {1,2},  C6 = {3,4}

Fig. 24. Step 6.

Fig. 25. Dissimilarity table within B6.

In this case d(X1, B6) = d(X2, B6), so the candidate element to be moved can be X1 or X2. We arbitrarily choose X2. Since d(X2, C6) > d(X2, B6), the element X2 cannot be included in C, and the process stops.

  diam(A6) = d(X5, X8),  diam(B6) = d(X1, X2),  diam(C6) = d(X3, X4)

diam(A6) > diam(C6) > diam(B6), thus we are going to divide the cluster A6, creating the cluster {5}, because the maximum distance in the dissimilarity table of Fig. 10 is d(X5, A6):

12   A7 = {6,7,8},  B7 = {1,2},  C7 = {3,4},  D7 = {5}

Fig. 26. Step 7.

Fig. 27. Dissimilarity table within A7.

The maximum distance is d(X6, A7). Since d(X6, D7) > d(X6, A7), the element X6 cannot be included in D, and the process stops. We have to compute again the diameters:

  diam(A7) = d(X6, X8),  diam(B7) = d(X1, X2),  diam(C7) = d(X3, X4)

diam(C7) > diam(A7) > diam(B7), thus we are going to divide the cluster C7, creating the clusters {3} and {4}:

  A8 = {6,7,8},  B8 = {1,2},  C8 = {3},  D8 = {5},  E8 = {4}

Fig. 28. Step 8.

In order to choose which cluster to divide (cluster A or B), it is necessary to calculate their diameters, looking at the table in Fig. 10:

  diam(A8) = d(X6, X8),  diam(B8) = d(X1, X2)

diam(A8) > diam(B8), thus we are going to divide the cluster A8, creating the cluster {6}, because the maximum distance in the dissimilarity table of Fig. 27 is d(X6, A8):

  A9 = {7,8},  B9 = {1,2},  C9 = {3},  D9 = {5},  E9 = {4},  F9 = {6}

Fig. 29. Step 9.

13 Fig. 30. Dissimilarity table within A9.

The maximum distance is d(X7, A9) = d(X8, A9). Since d(X7, F9) > d(X7, A9), the element X7 cannot be included in F, and the process stops. We have to compute again the diameters to choose which cluster has to be divided:

  diam(A9) = d(X7, X8),  diam(B9) = d(X1, X2)

diam(B9) > diam(A9), thus we are going to divide the cluster B9, creating the clusters {1} and {2}:

  A10 = {7,8},  B10 = {1},  C10 = {3},  D10 = {5},  E10 = {4},  F10 = {6},  G10 = {2}

Fig. 31. Step 10.

The final step is to divide the cluster A10 into two clusters of one element: cluster {7} and cluster {8}. So, the final result is eight clusters of one element each.

Fig. 32. Step 11.
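The splitting step repeated throughout the example (move the most dissimilar element of A into the splinter group B until the candidate is closer to A) can be sketched as follows. An illustrative Python version on 1-D data, not the course's Matlab code:

```python
def diana_split(cluster, d=lambda a, b: abs(a - b)):
    """One DIANA splitting step: repeatedly shift the element of A with the
    largest mean distance to the rest of A into the splinter group B, and
    stop as soon as the candidate is closer to A than to B."""
    A, B = list(cluster), []
    while len(A) > 1:
        # mean distance of each element of A to the rest of A
        def to_A(x):
            return sum(d(x, y) for y in A if y != x) / (len(A) - 1)
        cand = max(A, key=to_A)
        if B:
            to_B = sum(d(cand, y) for y in B) / len(B)
            if to_A(cand) <= to_B:  # candidate closer to A: stop the shift
                break
        A.remove(cand)  # elements shift only from A to B, never back
        B.append(cand)
    return A, B
```

The full DIANA procedure then repeatedly applies this step to the cluster with the largest diameter.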

14 We can use the Matlab code called DIANA.m in order to visualize the results in a better way:

Matlab code SYNTAX:
>> data = [ ]';
>> [label] = DIANA(data);

15 Optimization Method (PAM and FANNY)

The idea is to optimize the following target function:

  Φ_{α,p}(w, m) = Σ_i Σ_c w_ic^α d^p(x_i, m_c)

with fixed values for α and p, where m_c is the representative of the cluster c and w_ic is the membership, that is the ownership index of the element x_i with respect to the cluster c. The following conditions have to be satisfied:

  0 <= w_ic <= 1,  Σ_c w_ic = 1

In other words, the aim of the optimization method is to minimize the sum of the distances between each element and the cluster representative, weighted by the ownership index. The number of clusters has to be defined in advance.

PAM: Partitioning Around Medoids (α = 1 and p = 1)

We need to minimize Φ(w, m) = Σ_i Σ_c w_ic d(x_i, m_c):

a) fix m and minimize with respect to w: the minimum is reached for w_ic* = 1, with c* = argmin_c d(x_i, m_c), and w_ic = 0 for c ≠ c*. In other words this is a hard classification.

b) fix w and minimize with respect to m: the minimum is reached for m_c = X_i*, an element of the cluster. In other words the representative of the cluster is an element of the cluster (called medoid or centroid).

The algorithm for the clustering is divided into two phases: 1) Build, 2) Swap.

1) In this stage the clusters are built. The first medoid is the element with the minimum distance with respect to the other elements:

  i*: d(X_i*, A) = min_i d(X_i, A),  m_1 = X_i*

Each element is a candidate medoid, and the contribution C_ij to the target function of the generic element X_j if the element X_i is chosen as medoid is evaluated:

  Gain: C_ij = max[ d(X_j, m_1) - d(X_j, X_i), 0 ]
  Total gain: G_i = Σ_j C_ij

The maximum value indicates the new medoid:

  i**: G_i** = max_i G_i,  m_2 = X_i**

16 The second step is repeated, calculating the gains with respect to the previous medoids, with i ≠ i*, i** and j ≠ i*, i**:

  D_j = min[ d(X_j, m_1), d(X_j, m_2) ]   (distance of X_j from the nearest medoid)
  C_ij = max[ D_j - d(X_j, X_i), 0 ]
  G_i = Σ_j C_ij,  i***: G_i*** = max_i G_i,  m_3 = X_i***

2) In this stage the medoids are exchanged. The previous configuration is changed by swapping the medoids and evaluating the impact on the target function. The swap that produces the best gain is chosen.

  C_jih = contribution of the generic element X_j to the target function after the swap between the old medoid X_i and the new medoid X_h (chosen out of the non-medoid elements);
  D_j = minimum of d(X_j, ·) over the medoids (i.e. the distance of X_j from the nearest medoid);
  E_j = minimum of d(X_j, ·) over the medoids different from the nearest one (i.e. the distance of X_j from the second nearest medoid).

There are various cases:

1) d(X_j, m*) < d(X_j, X_i), d(X_j, X_h): there is a medoid m* that is nearer to X_j than both X_i and X_h:
     C_jih = 0
2) d(X_j, X_i) = D_j: the medoid X_i is the nearest to X_j:
   a) if d(X_j, X_h) < E_j:  C_jih = d(X_j, X_h) - d(X_j, X_i), which can be positive or negative;
   b) if d(X_j, X_h) > E_j:  C_jih = E_j - D_j > 0
3) D_j = d(X_j, m*) < d(X_j, X_i): X_i is not the medoid nearest to X_j, but X_h is nearer to X_j than the current medoid m*, i.e. d(X_j, X_h) < D_j:
     C_jih = d(X_j, X_h) - D_j < 0

In any case, we compute T_ih = Σ_j C_jih and we test the following condition: if min_{i,h} T_ih < 0, then SWAP X_i and X_h; else STOP.
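The two phases can be sketched compactly in Python (illustrative names, not the course's Matlab code). For brevity, the build phase here greedily minimizes the total cost directly, which has the same effect as maximizing the total gain G_i through the gain tables above, and the swap phase tries every medoid/non-medoid exchange instead of building the C_jih cost tables explicitly:

```python
def pam(points, k, d=lambda a, b: abs(a - b)):
    """Simplified sketch of PAM on 1-D data: greedy build, then swap
    medoids while the target function decreases."""
    def cost(meds):
        # target function: sum of distances to the nearest medoid
        return sum(min(d(x, m) for m in meds) for x in points)
    # build phase: first medoid minimizes the total distance to all elements
    medoids = [min(points, key=lambda c: sum(d(c, x) for x in points))]
    while len(medoids) < k:
        medoids.append(min((c for c in points if c not in medoids),
                           key=lambda c: cost(medoids + [c])))
    # swap phase: exchange a medoid with a non-medoid if the cost improves
    improved = True
    while improved:
        improved = False
        for i in range(len(medoids)):
            for h in points:
                if h in medoids:
                    continue
                trial = medoids[:i] + [h] + medoids[i + 1:]
                if cost(trial) < cost(medoids):
                    medoids, improved = trial, True
    return medoids
```

Since the cost strictly decreases at every accepted swap, the loop terminates; the result is a local minimum of the target function, as for PAM in general.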

17 PAM example

Consider the same 8 observations of the AGNES example, but now group the data into 2 clusters using PAM, with the Matlab code PAM_OU:

Matlab code SYNTAX:
>> data = [ ]';
>> [label, med] = PAM_OU(data, 2);

DISSIMILARITY TABLE: the same as in Fig. 2.

The first medoid is the element with the minimum distance ( min_i Σ_{P_j ∈ A} d(x_i, x_j) ) with respect to the other elements; here we can arbitrarily choose X4 or X5, with a minimum total distance of 6.0570: MEDOID X5 [i* = 5].

The other medoid is chosen by computing the gain table:

GAIN TABLE:  C_ij = max[ d(X_j, X_5) - d(X_j, X_i), 0 ],  G_i = Σ_j C_ij,

for example C_12 = max[ d(X_2, X_5) - d(X_2, X_1), 0 ].

The maximum total gain is obtained for i = 3: MEDOID X3 [i** = 3].

18 COST TABLE (iteration 1): the best improvement for the target function is reached by swapping the medoids 5 → 6.

COST TABLE (iteration 2): all the possible swaps do not produce any improvement to the target function; therefore the algorithm is stopped.

BUILD and SWAP summary: starting from the build medoids X5 and X3, the swap phase replaces X5 with X6.

The final medoids are m_1 = X6 and m_2 = X3. The final memberships are hard:

  w_ic = 1  if  d(x_i, m_c) = min_k d(x_i, m_k),  w_ic = 0  otherwise.

19 FANNY: Fuzzy Analysis Clustering (α = 2 and p = 2)

The target function is now a non-linear function. The conditions on the memberships remain the same:

  Φ(w, m) = Σ_i Σ_c w_ic² d²(x_i, m_c),  0 <= w_ic <= 1,  Σ_c w_ic = 1

Using the Lagrange multipliers for the search of the minimum, we build

  Φ~ = Σ_i Σ_c w_ic² d²(x_i, m_c) + Σ_i λ_i (Σ_c w_ic - 1)

and we obtain:

  ∂Φ~/∂m_c = 0  =>  m_c = Σ_i w_ic² x_i / Σ_i w_ic²

  ∂Φ~/∂w_ic = 0  =>  2 w_ic d²(x_i, m_c) + λ_i = 0  =>  w_ic = d⁻²(x_i, m_c) / Σ_l d⁻²(x_i, m_l)

where the last expression follows because we know that Σ_c w_ic = 1.

Using an iterative procedure we obtain a relative minimum that depends on the starting point (the result is not the absolute minimum!). The iterative procedure is stopped when the differences between the previous and the current estimates are negligible.
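One iteration of the two update equations just derived can be sketched in Python (illustrative names, 1-D data; a small guard avoids division by zero when an element coincides with a representative):

```python
def fanny_step(points, w):
    """One iteration of the FANNY updates (alpha = p = 2) on 1-D data:
    m_c = sum_i w_ic^2 x_i / sum_i w_ic^2, and w_ic proportional to
    1 / d^2(x_i, m_c), normalized so that the memberships sum to 1."""
    k = len(w[0])
    # update the cluster representatives from the squared memberships
    m = [sum(w[i][c] ** 2 * x for i, x in enumerate(points)) /
         sum(w[i][c] ** 2 for i in range(len(points)))
         for c in range(k)]
    # update the memberships from the inverse squared distances
    new_w = []
    for x in points:
        inv = [1.0 / max((x - m[c]) ** 2, 1e-12) for c in range(k)]
        s = sum(inv)
        new_w.append([v / s for v in inv])
    return m, new_w
```

Iterating `fanny_step` until the estimates stabilize reproduces the procedure of the example below.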

20 FANNY example

Consider the same observations of the AGNES example, but now group the data into 2 clusters using FANNY. In order to initialize the iterative process, consider as starting point the PAM classification, namely the clusters {X1,X2,X3,X4} and {X5,X6,X7,X8}, with hard memberships w_i1 = 1, w_i2 = 0 for the first four elements and w_i1 = 0, w_i2 = 1 for the others.

Compute the cluster representative values:

  m_c = Σ_i w_ic² x_i / Σ_i w_ic²

  m_1 = (X1 + X2 + X3 + X4) / 4,  m_2 = (X5 + X6 + X7 + X8) / 4

Compute the memberships for each element X_i:

  w_ic = d⁻²(x_i, m_c) / Σ_l d⁻²(x_i, m_l)

For example, in the case i = 1:

  w_11 = (X1 - m_1)⁻² / [ (X1 - m_1)⁻² + (X1 - m_2)⁻² ]

The results of the first iteration (step 1) are summarized in a table of the updated values of m_1, m_2 and of the memberships w_ic.

21 Repeating the iterative process, the estimates of m_1 and m_2 change less and less between step 2 and step 3; then STOP! The estimates are stabilized.

The following images are the result of the FANNY application (with the same data of our example and stopping the iterations when the differences in the estimates are smaller than 0.01) using the Matlab code FUZZY_OU.m (see also FUZZY.m). Note that the starting values of the memberships are random values sampled from a uniform distribution in the interval [0,1].

Matlab code SYNTAX:
>> data = [ ]';
>> [label, w, med] = FUZZY(data, 2, 0.01);

(Figures: hard classification and memberships of cluster 1 and cluster 2 at each step.)

22 Supervised classification: the idea is to group data in clusters using training samples, namely data for which the corresponding cluster is assumed to be known. Among the various supervised classification methods, we consider the Bayesian approach (based on an a-priori knowledge of the stochastic distribution of the data) and the discriminant analysis (which does not require any a-priori hypothesis on the data distribution).

Bayesian approach

The data are considered as a sample drawn from a random variable (X, L) where:
  X is a continuous n-dimensional variable;
  L is a discrete variable (the label): l = 1, 2, ..., m, where m = number of clusters.

The joint distribution is given by:

  f_{X,L}(x, l) = f(x | l) p(l)

What is really available is the marginal distribution of X (the labels are unknown):

  f_X(x) = Σ_{l=1}^{m} f(x | l) p(l)

For example, in the case m = 3, we have:

  f_X(x) = f_1(x) p_1 + f_2(x) p_2 + f_3(x) (1 - p_1 - p_2)

This marginal distribution is called mixture, and its meaning is represented in the diagrams of the individual densities f_1, f_2, f_3 (weighted by p_1, p_2, p_3) and of the resulting mixture.

The classification problem consists of two phases:
  1. the evaluation of the conditional distributions f(x | l) (e.g. considering a Gaussian distribution, it is possible to evaluate f(x | l) by estimating mean and variance values on the basis of training samples);
  2. the proper classification (i.e. the evaluation of the label values) by applying the Bayes theorem(2):

  p(l | x) = f(x | l) p(l) / f_X(x)

(2) P(A_i | B) = P(B | A_i) P(A_i) / Σ_j P(B | A_j) P(A_j)
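The Bayes classification rule just stated can be sketched as follows for a Gaussian mixture (a Python illustration; the parameters are assumed already estimated from the training samples, and the function names are ours):

```python
import math

def gauss(x, mu, var):
    # Gaussian density N(mu, var) evaluated at x
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def posterior(x, mus, variances, priors):
    """Bayes theorem for a Gaussian mixture:
    p(l | x) = f(x | l) p(l) / f_X(x), for each label l."""
    joint = [gauss(x, m, v) * p for m, v, p in zip(mus, variances, priors)]
    total = sum(joint)  # the mixture density f_X(x)
    return [j / total for j in joint]
```

The returned vector is the soft classification; taking its argmax gives the hard classification discussed next.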

23 In the case m = 2:

  P(l = 1 | x) = f_1(x) p / [ f_1(x) p + f_2(x) (1 - p) ]
  P(l = 2 | x) = f_2(x) (1 - p) / [ f_1(x) p + f_2(x) (1 - p) ]

This is a soft (or fuzzy) classification. In order to strictly decide which label to assign to the value x (hard classification), it is also possible to compute the so-called likelihood ratio R:

  R(x) = f_1(x) p / [ f_2(x) (1 - p) ];  if R > 1 then l = 1, else l = 2.

Discriminant analysis

According to the Bayesian approach, the value x is assigned to the cluster l = 1 if

  G(x) = p(l = 1 | x) - p(l = 2 | x) > 0

The discriminant function G(x) is non-linear and depends on the chosen distributions f_1(x) and f_2(x). The idea of the discriminant analysis is to approximate G(x) with a linear function L(x):

  L(x) = a'x - b

in such a way that E{ [L(x) - G(x)]² } = min over a, b.

In the case m = 2, it results that (see the APPENDIX):

  â = 2 p (1 - p) C_XX⁻¹ (μ_1 - μ_2),  b̂ = â'μ - (2p - 1)

where μ and C_XX are the mean vector and the covariance matrix of the mixture distribution.

Supervised classification example

Consider the following 30 values. Part of the values are extracted from N[μ = 5, σ²], while the others are extracted from N[μ = 8, σ²], but this information can NOT be used in order to solve the exercise! The first 5 values {5.47, 4.09, 5.03, 4.38, 5.53} have to be considered as training samples for the first cluster, and the values from the 11th to the 15th {10.03, 4.93, 7.74, 8.77, 9.03} as training samples for the second cluster.

Assuming that the data of each cluster are extracted from a normal distribution, estimate the parameters of the mixture and classify the data:
  - using the likelihood ratio (Bayesian approach);
  - using the discriminant analysis.

Mean (μ_1) and standard deviation (σ_1) for the first cluster:

  μ̂_1 = (1/5) Σ_{i=1}^{5} x_i = 4.9

24   σ̂_1² = (1/4) Σ_{i=1}^{5} (x_i - μ̂_1)² = 0.645

Mean (μ_2) and standard deviation (σ_2) for the second cluster:

  μ̂_2 = (1/5) Σ_{i=11}^{15} x_i,  σ̂_2² = (1/4) Σ_{i=11}^{15} (x_i - μ̂_2)² = 1.950

The joint distribution of data and label is given by f_{X,L}(x, l) = f(x | l) p(l), while the mixture is:

  f_X(x) = N(μ_1, σ_1²)(x) p + N(μ_2, σ_2²)(x) (1 - p)

where X is the 1-D random variable from which the data are extracted and L is the discrete random variable of the labels. Therefore, on the basis of the training samples, we assume that:

  X | l=1 ~ N[μ̂_1 = 4.9, σ̂_1² = 0.645],  X | l=2 ~ N[μ̂_2, σ̂_2² = 1.950]

Now we need to estimate the parameter p_1 = p (and p_2 = 1 - p); we can do it by:
  a. the moment method;
  b. looking at the data histogram.

Using the first moment (mean value), we have:

  ∫ x f_X(x) dx = p ∫ x f_1(x) dx + (1 - p) ∫ x f_2(x) dx = p μ_1 + (1 - p) μ_2 = p (μ_1 - μ_2) + μ_2

Computing the sample mean of all data:

  μ̂ = (1/N) Σ_{i=1}^{30} x_i = 7.05

we can find the estimate of p by solving the following equation:

  μ̂ = p (μ̂_1 - μ̂_2) + μ̂_2  =>  p̂ = (μ̂ - μ̂_2) / (μ̂_1 - μ̂_2) = 0.38

Using the second moment, we have:

  σ² = ∫ (x - μ)² f_X(x) dx = p ∫ x² f_1(x) dx + (1 - p) ∫ x² f_2(x) dx - μ²

25 which leads to:

  σ² = p (σ_1² + μ_1²) + (1 - p) (σ_2² + μ_2²) - μ² = p σ_1² + (1 - p) σ_2² + p (1 - p) (μ_1 - μ_2)²

Computing the sample variance of all data:

  σ̂² = (1/N) Σ_{i=1}^{N} x_i² - μ̂²

we can find the estimate of p by solving the following equation, which is quadratic in p:

  σ̂² = p σ̂_1² + (1 - p) σ̂_2² + p (1 - p) (μ̂_1 - μ̂_2)²

  (μ̂_1 - μ̂_2)² p² - [ (μ̂_1 - μ̂_2)² + (σ̂_1² - σ̂_2²) ] p + (σ̂² - σ̂_2²) = 0

Of the two roots p_(1,2), only one is acceptable; note that it must be 0 <= p <= 1!

Data histogram

Separate the data into classes and build the histogram (N_i = absolute frequency of the class i). This histogram is an approximation of the mixture distribution.

  Class 1: values between 1 and 3
  Class 2: values between 3 and 5
  Class 3: values between 5 and 7
  Class 4: values between 7 and 9
  Class 5: values between 9 and 11
  Class 6: values between 11 and 13

26 The relative frequency N_i/N of the class i is an approximation of the probability of the data to fall in the corresponding interval [a_i, a_{i+1}]:

  N_i/N ≈ P{x ∈ [a_i, a_{i+1}]} = p P_1{x ∈ [a_i, a_{i+1}]} + (1 - p) P_2{x ∈ [a_i, a_{i+1}]}
        = p ∫_{a_i}^{a_{i+1}} N(μ̂_1, σ̂_1²) dx + (1 - p) ∫_{a_i}^{a_{i+1}} N(μ̂_2, σ̂_2²) dx

In order to solve these integrals, let us consider the standard normal variable z = (x - μ)/σ and fill a table with, for each class and for both clusters, the standardized bounds z_inf and z_sup and the corresponding probabilities P(z_inf < z < z_sup), read from a table of the standard normal distribution.

The previous equations based on the relative frequencies can be used to define the terms of the Least Squares problem:

  y_0 = A p + b + ν

where the observation vector y_0 collects the relative frequencies N_i/N of the six classes, the design matrix A collects the differences P_1{x ∈ [a_i, a_{i+1}]} - P_2{x ∈ [a_i, a_{i+1}]}, and the known term b collects the probabilities P_2{x ∈ [a_i, a_{i+1}]}:

  p̂ = (AᵀA)⁻¹ Aᵀ (y_0 - b)

27 Actually, there is a problem: separating the data into 6 classes with a different number of elements, the cofactor matrix Q of the observations is not equal to I (because the relative frequencies are observations of the probabilities with different accuracies). The solution is to build classes with the same number of elements (here 5 elements each):

  Class 1: values between 2.5 and 5
  Class 2: values between 5 and 5.54
  Class 3: values between 5.54 and 7.1
  Class 4: values between 7.1 and 7.8
  Class 5: values between 7.8 and 9
  Class 6: values between 9 and 11

Repeating the same reasoning as before (standardizing the class bounds and reading the probabilities from the table of the standard normal distribution), the terms of the new Least Squares problem (where the use of Q = I is correct) are:

  y_0 = A p + b + ν,  y_0 = [1/6, 1/6, 1/6, 1/6, 1/6, 1/6]ᵀ

  p̂ = (AᵀA)⁻¹ Aᵀ (y_0 - b)
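The two moment-method estimates of p derived above can be sketched as follows (a Python illustration with synthetic data and illustrative names; the least-squares histogram fit is omitted):

```python
def moment_estimates_p(data, mu1, mu2, var1, var2):
    """Estimates of the mixture weight p from the sample moments:
    first moment:  mean = p*mu1 + (1-p)*mu2;
    second moment: var = p*var1 + (1-p)*var2 + p*(1-p)*(mu1-mu2)**2,
    which is quadratic in p (only roots with 0 <= p <= 1 are acceptable)."""
    n = len(data)
    mean = sum(data) / n
    var = sum(x ** 2 for x in data) / n - mean ** 2
    p_first = (mean - mu2) / (mu1 - mu2)
    # second moment: solve d2*p^2 - (d2 + var1 - var2)*p + (var - var2) = 0
    d2 = (mu1 - mu2) ** 2
    A, B, C = d2, -(d2 + var1 - var2), var - var2
    disc = (B ** 2 - 4 * A * C) ** 0.5
    roots = [(-B - disc) / (2 * A), (-B + disc) / (2 * A)]
    p_second = [r for r in roots if 0 <= r <= 1]
    return p_first, p_second
```

Note that the second-moment equation may admit two acceptable roots, in which case additional information (e.g. the first-moment estimate or the histogram) is needed to choose between them.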

28 We can now classify the data, both using the likelihood ratio and the discriminant analysis.

Likelihood ratio

  R(x) = [ p / (√(2π) σ_1) ] exp( -(x - μ_1)² / (2σ_1²) ) / { [ (1 - p) / (√(2π) σ_2) ] exp( -(x - μ_2)² / (2σ_2²) ) } > 1  =>  cluster 1

By forcing the likelihood ratio to be equal to 1, one can get the thresholds to discriminate between the first and the second cluster:

  ln p - ln σ_1 - (x - μ_1)² / (2σ_1²) = ln(1 - p) - ln σ_2 - (x - μ_2)² / (2σ_2²)

which is a quadratic equation in x:

  x² (1/σ_1² - 1/σ_2²) - 2x (μ_1/σ_1² - μ_2/σ_2²) + μ_1²/σ_1² - μ_2²/σ_2² + 2 ln(σ_1/σ_2) - 2 ln( p/(1 - p) ) = 0

Substituting the estimated values of μ_1, μ_2, σ_1, σ_2 and p (from the first moment method), we get two roots x_(1,2); since σ_1 < σ_2, the values of x between the two thresholds are assigned to cluster 1, the others to cluster 2.

Discriminant analysis

  â = 2 p (1 - p) (μ̂_1 - μ̂_2) / σ̂²,  b̂ = â μ̂ - (2p - 1)

  L(x) = â x - b̂ = 0  =>  threshold x_0 = b̂ / â

Since μ̂_1 < μ̂_2, â < 0 and L(x) > 0 for x < x_0: the values x < x_0 are assigned to cluster 1, the values x > x_0 to cluster 2.
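The 1-D discriminant rule of the example can be sketched in a few lines (a Python illustration assuming the mixture parameters are known; the function names are ours):

```python
def linear_discriminant(mu1, mu2, sigma2, p):
    """1-D linear discriminant L(x) = a*x - b approximating
    G(x) = p(l=1|x) - p(l=2|x):
    a = 2*p*(1-p)*(mu1 - mu2)/sigma2, b = a*mu - (2p - 1),
    with mu the mixture mean and sigma2 the mixture variance."""
    mu = p * mu1 + (1 - p) * mu2
    a = 2 * p * (1 - p) * (mu1 - mu2) / sigma2
    b = a * mu - (2 * p - 1)
    return a, b

def classify(x, a, b):
    # L(x) > 0: cluster 1, otherwise cluster 2
    return 1 if a * x - b > 0 else 2
```

For equal priors (p = 0.5) the threshold b/a reduces to the midpoint between the two cluster means, as expected.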

29 APPENDIX: Discriminant Analysis Formula Proof

Given two clusters with labels l = 1 and l = 2, we can define the discriminant function G(x) of the data as:

  G(x) = p(l = 1 | x) - p(l = 2 | x)

so that the optimal classification according to the Bayesian approach is:

  G(x) > 0  =>  l = 1
  G(x) < 0  =>  l = 2

We want to approximate G(x) with a linear function L(x) = a'x - b, using the following principle:

  E{ [L(x) - G(x)]² } = E{ [a'x - b - G(x)]² } = min over a, b

In order to minimize the expression above, we compute the partial derivatives with respect to a and b, and then force them equal to zero:

  ∂/∂b E{ [a'x - b - G(x)]² } = -2 E{ a'x - b - G(x) } = 0   =>  a'E{x} - b - E{G(x)} = 0
  ∂/∂a E{ [a'x - b - G(x)]² } = 2 E{ x [a'x - b - G(x)] } = 0  =>  E{xx'}a - b E{x} - E{x G(x)} = 0

Recalling that the mixture distribution is given by f(x) = p_1 f_1(x) + p_2 f_2(x), with p_1 + p_2 = 1, we can write:

  E{x} = ∫ x f(x) dx = p_1 μ_1 + p_2 μ_2 = μ

so that the system becomes:

  μ'a - b - E{G(x)} = 0
  E{xx'}a - b μ - E{x G(x)} = 0

Multiplying the first equation by μ and subtracting it from the second equation, we get:

  [E{xx'} - μμ'] a = E{x G(x)} - μ E{G(x)} = E{ (x - μ) G(x) }

where E{xx'} - μμ' = C, because C = E{ (x - μ)(x - μ)' } = E{xx'} - μμ'.

30 Moreover:

  E{ (x - μ) G(x) } = E{ (x - μ) p(l = 1 | x) } - E{ (x - μ) p(l = 2 | x) }
                    = ∫ (x - μ) [ p_1 f_1(x) / f(x) ] f(x) dx - ∫ (x - μ) [ p_2 f_2(x) / f(x) ] f(x) dx
                    = p_1 ( ∫ x f_1(x) dx - μ ∫ f_1(x) dx ) - p_2 ( ∫ x f_2(x) dx - μ ∫ f_2(x) dx )
                    = p_1 (μ_1 - μ) - p_2 (μ_2 - μ) = 2 p_1 p_2 (μ_1 - μ_2)

since μ_1 - μ = p_2 (μ_1 - μ_2) and μ_2 - μ = p_1 (μ_2 - μ_1). Therefore the equation for the estimate of a is given by:

  C â = 2 p_1 p_2 (μ_1 - μ_2)  =>  â = 2 p_1 p_2 C⁻¹ (μ_1 - μ_2)

where the covariance matrix C of the mixture can be written as:

  C = ∫ x x' f(x) dx - μμ' = p_1 ∫ x x' f_1(x) dx + p_2 ∫ x x' f_2(x) dx - μμ'
    = p_1 (C_1 + μ_1 μ_1') + p_2 (C_2 + μ_2 μ_2') - μμ' = C̄ + p_1 p_2 (μ_1 - μ_2)(μ_1 - μ_2)'

with C̄ = p_1 C_1 + p_2 C_2.

From the first equation of the system we have:

  b̂ = μ'â - E{G(x)}

where:

  E{G(x)} = ∫ G(x) f(x) dx = ∫ [ p_1 f_1(x) - p_2 f_2(x) ] dx = p_1 - p_2

so that:

  b̂ = μ'â - (p_1 - p_2)


More information

Mean Field / Variational Approximations

Mean Field / Variational Approximations Mean Feld / Varatonal Appromatons resented by Jose Nuñez 0/24/05 Outlne Introducton Mean Feld Appromaton Structured Mean Feld Weghted Mean Feld Varatonal Methods Introducton roblem: We have dstrbuton but

More information

MACHINE APPLIED MACHINE LEARNING LEARNING. Gaussian Mixture Regression

MACHINE APPLIED MACHINE LEARNING LEARNING. Gaussian Mixture Regression 11 MACHINE APPLIED MACHINE LEARNING LEARNING MACHINE LEARNING Gaussan Mture Regresson 22 MACHINE APPLIED MACHINE LEARNING LEARNING Bref summary of last week s lecture 33 MACHINE APPLIED MACHINE LEARNING

More information

Chapter Newton s Method

Chapter Newton s Method Chapter 9. Newton s Method After readng ths chapter, you should be able to:. Understand how Newton s method s dfferent from the Golden Secton Search method. Understand how Newton s method works 3. Solve

More information

JSM Survey Research Methods Section. Is it MAR or NMAR? Michail Sverchkov

JSM Survey Research Methods Section. Is it MAR or NMAR? Michail Sverchkov JSM 2013 - Survey Researh Methods Seton Is t MAR or NMAR? Mhal Sverhkov Bureau of Labor Statsts 2 Massahusetts Avenue, NE, Sute 1950, Washngton, DC. 20212, Sverhkov.Mhael@bls.gov Abstrat Most methods that

More information

Chapter 3 Differentiation and Integration

Chapter 3 Differentiation and Integration MEE07 Computer Modelng Technques n Engneerng Chapter Derentaton and Integraton Reerence: An Introducton to Numercal Computatons, nd edton, S. yakowtz and F. zdarovsky, Mawell/Macmllan, 990. Derentaton

More information

Mathematical Economics MEMF e ME. Filomena Garcia. Topic 2 Calculus

Mathematical Economics MEMF e ME. Filomena Garcia. Topic 2 Calculus Mathematcal Economcs MEMF e ME Flomena Garca Topc 2 Calculus Mathematcal Economcs - www.seg.utl.pt/~garca/economa_matematca . Unvarate Calculus Calculus Functons : X Y y ( gves or each element X one element

More information

Kernel Methods and SVMs Extension

Kernel Methods and SVMs Extension Kernel Methods and SVMs Extenson The purpose of ths document s to revew materal covered n Machne Learnng 1 Supervsed Learnng regardng support vector machnes (SVMs). Ths document also provdes a general

More information

Sampling Theory MODULE V LECTURE - 17 RATIO AND PRODUCT METHODS OF ESTIMATION

Sampling Theory MODULE V LECTURE - 17 RATIO AND PRODUCT METHODS OF ESTIMATION Samplng Theory MODULE V LECTURE - 7 RATIO AND PRODUCT METHODS OF ESTIMATION DR. SHALABH DEPARTMENT OF MATHEMATICS AND STATISTICS INDIAN INSTITUTE OF TECHNOLOG KANPUR Propertes of separate rato estmator:

More information

APPROXIMATE OPTIMAL CONTROL OF LINEAR TIME-DELAY SYSTEMS VIA HAAR WAVELETS

APPROXIMATE OPTIMAL CONTROL OF LINEAR TIME-DELAY SYSTEMS VIA HAAR WAVELETS Journal o Engneerng Sene and ehnology Vol., No. (6) 486-498 Shool o Engneerng, aylor s Unversty APPROIAE OPIAL CONROL OF LINEAR IE-DELAY SYSES VIA HAAR WAVELES AKBAR H. BORZABADI*, SOLAYAN ASADI Shool

More information

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur Module 3 LOSSY IMAGE COMPRESSION SYSTEMS Verson ECE IIT, Kharagpur Lesson 6 Theory of Quantzaton Verson ECE IIT, Kharagpur Instructonal Objectves At the end of ths lesson, the students should be able to:

More information

Learning Theory: Lecture Notes

Learning Theory: Lecture Notes Learnng Theory: Lecture Notes Lecturer: Kamalka Chaudhur Scrbe: Qush Wang October 27, 2012 1 The Agnostc PAC Model Recall that one of the constrants of the PAC model s that the data dstrbuton has to be

More information

On Generalized Fractional Hankel Transform

On Generalized Fractional Hankel Transform Int. ournal o Math. nalss Vol. 6 no. 8 883-896 On Generaled Fratonal ankel Transorm R. D. Tawade Pro.Ram Meghe Insttute o Tehnolog & Researh Badnera Inda rajendratawade@redmal.om. S. Gudadhe Dept.o Mathemats

More information

Complex Variables. Chapter 18 Integration in the Complex Plane. March 12, 2013 Lecturer: Shih-Yuan Chen

Complex Variables. Chapter 18 Integration in the Complex Plane. March 12, 2013 Lecturer: Shih-Yuan Chen omplex Varables hapter 8 Integraton n the omplex Plane March, Lecturer: Shh-Yuan hen Except where otherwse noted, content s lcensed under a BY-N-SA. TW Lcense. ontents ontour ntegrals auchy-goursat theorem

More information

Correlation and Regression without Sums of Squares. (Kendall's Tau) Rudy A. Gideon ABSTRACT

Correlation and Regression without Sums of Squares. (Kendall's Tau) Rudy A. Gideon ABSTRACT Correlaton and Regson wthout Sums of Squa (Kendall's Tau) Rud A. Gdeon ABSTRACT Ths short pee provdes an ntroduton to the use of Kendall's τ n orrelaton and smple lnear regson. The error estmate also uses

More information

A linear imaging system with white additive Gaussian noise on the observed data is modeled as follows:

A linear imaging system with white additive Gaussian noise on the observed data is modeled as follows: Supplementary Note Mathematcal bacground A lnear magng system wth whte addtve Gaussan nose on the observed data s modeled as follows: X = R ϕ V + G, () where X R are the expermental, two-dmensonal proecton

More information

The Gaussian classifier. Nuno Vasconcelos ECE Department, UCSD

The Gaussian classifier. Nuno Vasconcelos ECE Department, UCSD he Gaussan classfer Nuno Vasconcelos ECE Department, UCSD Bayesan decson theory recall that we have state of the world X observatons g decson functon L[g,y] loss of predctng y wth g Bayes decson rule s

More information

Population element: 1 2 N. 1.1 Sampling with Replacement: Hansen-Hurwitz Estimator(HH)

Population element: 1 2 N. 1.1 Sampling with Replacement: Hansen-Hurwitz Estimator(HH) Chapter 1 Samplng wth Unequal Probabltes Notaton: Populaton element: 1 2 N varable of nterest Y : y1 y2 y N Let s be a sample of elements drawn by a gven samplng method. In other words, s s a subset of

More information

The corresponding link function is the complementary log-log link The logistic model is comparable with the probit model if

The corresponding link function is the complementary log-log link The logistic model is comparable with the probit model if SK300 and SK400 Lnk funtons for bnomal GLMs Autumn 08 We motvate the dsusson by the beetle eample GLMs for bnomal and multnomal data Covers the followng materal from hapters 5 and 6: Seton 5.6., 5.6.3,

More information

Statistical analysis using matlab. HY 439 Presented by: George Fortetsanakis

Statistical analysis using matlab. HY 439 Presented by: George Fortetsanakis Statstcal analyss usng matlab HY 439 Presented by: George Fortetsanaks Roadmap Probablty dstrbutons Statstcal estmaton Fttng data to probablty dstrbutons Contnuous dstrbutons Contnuous random varable X

More information

9 : Learning Partially Observed GM : EM Algorithm

9 : Learning Partially Observed GM : EM Algorithm 10-708: Probablstc Graphcal Models 10-708, Sprng 2014 9 : Learnng Partally Observed GM : EM Algorthm Lecturer: Erc P. Xng Scrbes: Rohan Ramanath, Rahul Goutam 1 Generalzed Iteratve Scalng In ths secton,

More information

DOAEstimationforCoherentSourcesinBeamspace UsingSpatialSmoothing

DOAEstimationforCoherentSourcesinBeamspace UsingSpatialSmoothing DOAEstmatonorCoherentSouresneamspae UsngSpatalSmoothng YnYang,ChunruWan,ChaoSun,QngWang ShooloEletralandEletronEngneerng NanangehnologalUnverst,Sngapore,639798 InsttuteoAoustEngneerng NorthwesternPoltehnalUnverst,X

More information

MIMA Group. Chapter 2 Bayesian Decision Theory. School of Computer Science and Technology, Shandong University. Xin-Shun SDU

MIMA Group. Chapter 2 Bayesian Decision Theory. School of Computer Science and Technology, Shandong University. Xin-Shun SDU Group M D L M Chapter Bayesan Decson heory Xn-Shun Xu @ SDU School of Computer Scence and echnology, Shandong Unversty Bayesan Decson heory Bayesan decson theory s a statstcal approach to data mnng/pattern

More information

MATH 829: Introduction to Data Mining and Analysis The EM algorithm (part 2)

MATH 829: Introduction to Data Mining and Analysis The EM algorithm (part 2) 1/16 MATH 829: Introducton to Data Mnng and Analyss The EM algorthm (part 2) Domnque Gullot Departments of Mathematcal Scences Unversty of Delaware Aprl 20, 2016 Recall 2/16 We are gven ndependent observatons

More information

Clustering through Mixture Models

Clustering through Mixture Models lusterng through Mxture Models General referenes: Lndsay B.G. 995 Mxture models: theory geometry and applatons FS- BMS Regonal onferene Seres n Probablty and Statsts. MLahlan G.J. Basford K.E. 988 Mxture

More information

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification E395 - Pattern Recognton Solutons to Introducton to Pattern Recognton, Chapter : Bayesan pattern classfcaton Preface Ths document s a soluton manual for selected exercses from Introducton to Pattern Recognton

More information

Structure and Drive Paul A. Jensen Copyright July 20, 2003

Structure and Drive Paul A. Jensen Copyright July 20, 2003 Structure and Drve Paul A. Jensen Copyrght July 20, 2003 A system s made up of several operatons wth flow passng between them. The structure of the system descrbes the flow paths from nputs to outputs.

More information

The Expectation-Maximization Algorithm

The Expectation-Maximization Algorithm The Expectaton-Maxmaton Algorthm Charles Elan elan@cs.ucsd.edu November 16, 2007 Ths chapter explans the EM algorthm at multple levels of generalty. Secton 1 gves the standard hgh-level verson of the algorthm.

More information

Spectral Clustering. Shannon Quinn

Spectral Clustering. Shannon Quinn Spectral Clusterng Shannon Qunn (wth thanks to Wllam Cohen of Carnege Mellon Unverst, and J. Leskovec, A. Raaraman, and J. Ullman of Stanford Unverst) Graph Parttonng Undrected graph B- parttonng task:

More information

Chapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems

Chapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems Numercal Analyss by Dr. Anta Pal Assstant Professor Department of Mathematcs Natonal Insttute of Technology Durgapur Durgapur-713209 emal: anta.bue@gmal.com 1 . Chapter 5 Soluton of System of Lnear Equatons

More information

Calculation of time complexity (3%)

Calculation of time complexity (3%) Problem 1. (30%) Calculaton of tme complexty (3%) Gven n ctes, usng exhaust search to see every result takes O(n!). Calculaton of tme needed to solve the problem (2%) 40 ctes:40! dfferent tours 40 add

More information

Numerical Methods Solution of Nonlinear Equations

Numerical Methods Solution of Nonlinear Equations umercal Methods Soluton o onlnear Equatons Lecture Soluton o onlnear Equatons Root Fndng Prolems Dentons Classcaton o Methods Analytcal Solutons Graphcal Methods umercal Methods Bracketng Methods Open

More information

Support Vector Machines

Support Vector Machines CS 2750: Machne Learnng Support Vector Machnes Prof. Adrana Kovashka Unversty of Pttsburgh February 17, 2016 Announcement Homework 2 deadlne s now 2/29 We ll have covered everythng you need today or at

More information

General Tips on How to Do Well in Physics Exams. 1. Establish a good habit in keeping track of your steps. For example, when you use the equation

General Tips on How to Do Well in Physics Exams. 1. Establish a good habit in keeping track of your steps. For example, when you use the equation General Tps on How to Do Well n Physcs Exams 1. Establsh a good habt n keepng track o your steps. For example when you use the equaton 1 1 1 + = d d to solve or d o you should rst rewrte t as 1 1 1 = d

More information

MLE and Bayesian Estimation. Jie Tang Department of Computer Science & Technology Tsinghua University 2012

MLE and Bayesian Estimation. Jie Tang Department of Computer Science & Technology Tsinghua University 2012 MLE and Bayesan Estmaton Je Tang Department of Computer Scence & Technology Tsnghua Unversty 01 1 Lnear Regresson? As the frst step, we need to decde how we re gong to represent the functon f. One example:

More information

1 GSW Iterative Techniques for y = Ax

1 GSW Iterative Techniques for y = Ax 1 for y = A I m gong to cheat here. here are a lot of teratve technques that can be used to solve the general case of a set of smultaneous equatons (wrtten n the matr form as y = A), but ths chapter sn

More information

Fall 2012 Analysis of Experimental Measurements B. Eisenstein/rev. S. Errede

Fall 2012 Analysis of Experimental Measurements B. Eisenstein/rev. S. Errede Fall 0 Analyss of Expermental easurements B. Esensten/rev. S. Errede We now reformulate the lnear Least Squares ethod n more general terms, sutable for (eventually extendng to the non-lnear case, and also

More information

APPROXIMATE PRICES OF BASKET AND ASIAN OPTIONS DUPONT OLIVIER. Premia 14

APPROXIMATE PRICES OF BASKET AND ASIAN OPTIONS DUPONT OLIVIER. Premia 14 APPROXIMAE PRICES OF BASKE AND ASIAN OPIONS DUPON OLIVIER Prema 14 Contents Introducton 1 1. Framewor 1 1.1. Baset optons 1.. Asan optons. Computng the prce 3. Lower bound 3.1. Closed formula for the prce

More information

Which Separator? Spring 1

Which Separator? Spring 1 Whch Separator? 6.034 - Sprng 1 Whch Separator? Mamze the margn to closest ponts 6.034 - Sprng Whch Separator? Mamze the margn to closest ponts 6.034 - Sprng 3 Margn of a pont " # y (w $ + b) proportonal

More information

CS 3710: Visual Recognition Classification and Detection. Adriana Kovashka Department of Computer Science January 13, 2015

CS 3710: Visual Recognition Classification and Detection. Adriana Kovashka Department of Computer Science January 13, 2015 CS 3710: Vsual Recognton Classfcaton and Detecton Adrana Kovashka Department of Computer Scence January 13, 2015 Plan for Today Vsual recognton bascs part 2: Classfcaton and detecton Adrana s research

More information

b ), which stands for uniform distribution on the interval a x< b. = 0 elsewhere

b ), which stands for uniform distribution on the interval a x< b. = 0 elsewhere Fall Analyss of Epermental Measurements B. Esensten/rev. S. Errede Some mportant probablty dstrbutons: Unform Bnomal Posson Gaussan/ormal The Unform dstrbuton s often called U( a, b ), hch stands for unform

More information

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity LINEAR REGRESSION ANALYSIS MODULE IX Lecture - 31 Multcollnearty Dr. Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur 6. Rdge regresson The OLSE s the best lnear unbased

More information

PHYS 705: Classical Mechanics. Calculus of Variations II

PHYS 705: Classical Mechanics. Calculus of Variations II 1 PHYS 705: Classcal Mechancs Calculus of Varatons II 2 Calculus of Varatons: Generalzaton (no constrant yet) Suppose now that F depends on several dependent varables : We need to fnd such that has a statonary

More information

Solutions to exam in SF1811 Optimization, Jan 14, 2015

Solutions to exam in SF1811 Optimization, Jan 14, 2015 Solutons to exam n SF8 Optmzaton, Jan 4, 25 3 3 O------O -4 \ / \ / The network: \/ where all lnks go from left to rght. /\ / \ / \ 6 O------O -5 2 4.(a) Let x = ( x 3, x 4, x 23, x 24 ) T, where the varable

More information

International Mathematical Olympiad. Preliminary Selection Contest 2012 Hong Kong. Outline of Solutions

International Mathematical Olympiad. Preliminary Selection Contest 2012 Hong Kong. Outline of Solutions Internatonal Mathematcal Olympad Prelmnary Selecton ontest Hong Kong Outlne of Solutons nswers: 7 4 7 4 6 5 9 6 99 7 6 6 9 5544 49 5 7 4 6765 5 6 6 7 6 944 9 Solutons: Snce n s a two-dgt number, we have

More information

ASSESSMENT OF UNCERTAINTY IN ESTIMATION OF STORED AND RECOVERABLE THERMAL ENERGY IN GEOTHERMAL RESERVOIRS BY VOLUMETRIC METHODS

ASSESSMENT OF UNCERTAINTY IN ESTIMATION OF STORED AND RECOVERABLE THERMAL ENERGY IN GEOTHERMAL RESERVOIRS BY VOLUMETRIC METHODS PROCEEDINGS, Thrty-Fourth Workshop on Geothermal Reservor Engneerng Stanord Unversty, Stanord, Calorna, February 9-11, 009 SGP-TR-187 ASSESSMENT OF UNCERTAINTY IN ESTIMATION OF STORED AND RECOVERABLE THERMAL

More information

Here is the rationale: If X and y have a strong positive relationship to one another, then ( x x) will tend to be positive when ( y y)

Here is the rationale: If X and y have a strong positive relationship to one another, then ( x x) will tend to be positive when ( y y) Secton 1.5 Correlaton In the prevous sectons, we looked at regresson and the value r was a measurement of how much of the varaton n y can be attrbuted to the lnear relatonshp between y and x. In ths secton,

More information

The Karush-Kuhn-Tucker. Nuno Vasconcelos ECE Department, UCSD

The Karush-Kuhn-Tucker. Nuno Vasconcelos ECE Department, UCSD e Karus-Kun-ucker condtons and dualt Nuno Vasconcelos ECE Department, UCSD Optmzaton goal: nd mamum or mnmum o a uncton Denton: gven unctons, g, 1,...,k and, 1,...m dened on some doman Ω R n mn w, w Ω

More information

10-701/ Machine Learning, Fall 2005 Homework 3

10-701/ Machine Learning, Fall 2005 Homework 3 10-701/15-781 Machne Learnng, Fall 2005 Homework 3 Out: 10/20/05 Due: begnnng of the class 11/01/05 Instructons Contact questons-10701@autonlaborg for queston Problem 1 Regresson and Cross-valdaton [40

More information

Mr.Said Anwar Shah, Dr. Noor Badshah,

Mr.Said Anwar Shah, Dr. Noor Badshah, Internatonal Journal of Sentf & Engneerng Researh Volume 5 Issue 5 Ma-04 56 ISS 9-558 Level Set Method for Image Segmentaton and B Feld Estmaton usng Coeffent of Varaton wth Loal Statstal Informaton Mr.Sad

More information

Discriminative Estimation (Maxent models and perceptron)

Discriminative Estimation (Maxent models and perceptron) srmnatve Estmaton Maxent moels an pereptron Generatve vs. srmnatve moels Many sles are aapte rom sles by hrstopher Mannng Introuton So ar we ve looke at generatve moels Nave Bayes But there s now muh use

More information

GEL 446: Applied Environmental Geology

GEL 446: Applied Environmental Geology GE 446: ppled Envronmental Geology Watershed Delneaton and Geomorphology Watershed Geomorphology Watersheds are fundamental geospatal unts that provde a physal and oneptual framewor wdely used by sentsts,

More information

A MODIFIED METHOD FOR SOLVING SYSTEM OF NONLINEAR EQUATIONS

A MODIFIED METHOD FOR SOLVING SYSTEM OF NONLINEAR EQUATIONS Journal of Mathematcs and Statstcs 9 (1): 4-8, 1 ISSN 1549-644 1 Scence Publcatons do:1.844/jmssp.1.4.8 Publshed Onlne 9 (1) 1 (http://www.thescpub.com/jmss.toc) A MODIFIED METHOD FOR SOLVING SYSTEM OF

More information

: 5: ) A

: 5: ) A Revew 1 004.11.11 Chapter 1: 1. Elements, Varable, and Observatons:. Type o Data: Qualtatve Data and Quanttatve Data (a) Qualtatve data may be nonnumerc or numerc. (b) Quanttatve data are always numerc.

More information

Shuai Dong. Isaac Newton. Gottfried Leibniz

Shuai Dong. Isaac Newton. Gottfried Leibniz Computatonal pyscs Sua Dong Isaac Newton Gottred Lebnz Numercal calculus poston dervatve ntegral v velocty dervatve ntegral a acceleraton Numercal calculus Numercal derentaton Numercal ntegraton Roots

More information

Computational Biology Lecture 8: Substitution matrices Saad Mneimneh

Computational Biology Lecture 8: Substitution matrices Saad Mneimneh Computatonal Bology Lecture 8: Substtuton matrces Saad Mnemneh As we have ntroduced last tme, smple scorng schemes lke + or a match, - or a msmatch and -2 or a gap are not justable bologcally, especally

More information

XII.3 The EM (Expectation-Maximization) Algorithm

XII.3 The EM (Expectation-Maximization) Algorithm XII.3 The EM (Expectaton-Maxzaton) Algorth Toshnor Munaata 3/7/06 The EM algorth s a technque to deal wth varous types of ncoplete data or hdden varables. It can be appled to a wde range of learnng probles

More information

9.913 Pattern Recognition for Vision. Class IV Part I Bayesian Decision Theory Yuri Ivanov

9.913 Pattern Recognition for Vision. Class IV Part I Bayesian Decision Theory Yuri Ivanov 9.93 Class IV Part I Bayesan Decson Theory Yur Ivanov TOC Roadmap to Machne Learnng Bayesan Decson Makng Mnmum Error Rate Decsons Mnmum Rsk Decsons Mnmax Crteron Operatng Characterstcs Notaton x - scalar

More information

p 1 c 2 + p 2 c 2 + p 3 c p m c 2

p 1 c 2 + p 2 c 2 + p 3 c p m c 2 Where to put a faclty? Gven locatons p 1,..., p m n R n of m houses, want to choose a locaton c n R n for the fre staton. Want c to be as close as possble to all the house. We know how to measure dstance

More information

Image classification. Given the bag-of-features representations of images from different classes, how do we learn a model for distinguishing i them?

Image classification. Given the bag-of-features representations of images from different classes, how do we learn a model for distinguishing i them? Image classfcaton Gven te bag-of-features representatons of mages from dfferent classes ow do we learn a model for dstngusng tem? Classfers Learn a decson rule assgnng bag-offeatures representatons of

More information

Prof. Paolo Colantonio a.a

Prof. Paolo Colantonio a.a Pro. Paolo olantono a.a. 3 4 Let s consder a two ports network o Two ports Network o L For passve network (.e. wthout nternal sources or actve devces), a general representaton can be made by a sutable

More information

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA 4 Analyss of Varance (ANOVA) 5 ANOVA 51 Introducton ANOVA ANOVA s a way to estmate and test the means of multple populatons We wll start wth one-way ANOVA If the populatons ncluded n the study are selected

More information

Maximum Likelihood Estimation

Maximum Likelihood Estimation Maxmum Lkelhood Estmaton INFO-2301: Quanttatve Reasonng 2 Mchael Paul and Jordan Boyd-Graber MARCH 7, 2017 INFO-2301: Quanttatve Reasonng 2 Paul and Boyd-Graber Maxmum Lkelhood Estmaton 1 of 9 Why MLE?

More information

CHAPTER 7 CONSTRAINED OPTIMIZATION 2: SQP AND GRG

CHAPTER 7 CONSTRAINED OPTIMIZATION 2: SQP AND GRG Chapter 7: Constraned Optmzaton CHAPER 7 CONSRAINED OPIMIZAION : SQP AND GRG Introducton In the prevous chapter we eamned the necessary and suffcent condtons for a constraned optmum. We dd not, however,

More information

Probability Density Function Estimation by different Methods

Probability Density Function Estimation by different Methods EEE 739Q SPRIG 00 COURSE ASSIGMET REPORT Probablty Densty Functon Estmaton by dfferent Methods Vas Chandraant Rayar Abstract The am of the assgnment was to estmate the probablty densty functon (PDF of

More information

EEE 241: Linear Systems

EEE 241: Linear Systems EEE : Lnear Systems Summary #: Backpropagaton BACKPROPAGATION The perceptron rule as well as the Wdrow Hoff learnng were desgned to tran sngle layer networks. They suffer from the same dsadvantage: they

More information

Errors for Linear Systems

Errors for Linear Systems Errors for Lnear Systems When we solve a lnear system Ax b we often do not know A and b exactly, but have only approxmatons  and ˆb avalable. Then the best thng we can do s to solve ˆx ˆb exactly whch

More information

ORDINARY DIFFERENTIAL EQUATIONS EULER S METHOD

ORDINARY DIFFERENTIAL EQUATIONS EULER S METHOD Numercal Analss or Engneers German Jordanan Unverst ORDINARY DIFFERENTIAL EQUATIONS We wll eplore several metods o solvng rst order ordnar derental equatons (ODEs and we wll sow ow tese metods can be appled

More information

Bayesian Learning. Smart Home Health Analytics Spring Nirmalya Roy Department of Information Systems University of Maryland Baltimore County

Bayesian Learning. Smart Home Health Analytics Spring Nirmalya Roy Department of Information Systems University of Maryland Baltimore County Smart Home Health Analytcs Sprng 2018 Bayesan Learnng Nrmalya Roy Department of Informaton Systems Unversty of Maryland Baltmore ounty www.umbc.edu Bayesan Learnng ombnes pror knowledge wth evdence to

More information

COMPUTER SCIENCE 349A SAMPLE EXAM QUESTIONS WITH SOLUTIONS PARTS 1, 2

COMPUTER SCIENCE 349A SAMPLE EXAM QUESTIONS WITH SOLUTIONS PARTS 1, 2 COMPUTE SCIENCE 49A SAMPLE EXAM QUESTIONS WITH SOLUTIONS PATS, PAT.. a Dene he erm ll-ondoned problem. b Gve an eample o a polynomal ha has ll-ondoned zeros.. Consder evaluaon o anh, where e e anh. e e

More information

CHAPTER 4d. ROOTS OF EQUATIONS

CHAPTER 4d. ROOTS OF EQUATIONS CHAPTER 4d. ROOTS OF EQUATIONS A. J. Clark School o Engneerng Department o Cvl and Envronmental Engneerng by Dr. Ibrahm A. Assakka Sprng 00 ENCE 03 - Computaton Methods n Cvl Engneerng II Department o

More information

MULTICRITERION OPTIMIZATION OF LAMINATE STACKING SEQUENCE FOR MAXIMUM FAILURE MARGINS

MULTICRITERION OPTIMIZATION OF LAMINATE STACKING SEQUENCE FOR MAXIMUM FAILURE MARGINS MLTICRITERION OPTIMIZATION OF LAMINATE STACKING SEENCE FOR MAXIMM FAILRE MARGINS Petr Kere and Juhan Kos Shool of Engneerng, Natonal nversty of ruguay J. Herrera y Ressg 565, Montevdeo, ruguay Appled Mehans,

More information

Maximal Margin Classifier

Maximal Margin Classifier CS81B/Stat41B: Advanced Topcs n Learnng & Decson Makng Mamal Margn Classfer Lecturer: Mchael Jordan Scrbes: Jana van Greunen Corrected verson - /1/004 1 References/Recommended Readng 1.1 Webstes www.kernel-machnes.org

More information

Statistics and Probability Theory in Civil, Surveying and Environmental Engineering

Statistics and Probability Theory in Civil, Surveying and Environmental Engineering Statstcs and Probablty Theory n Cvl, Surveyng and Envronmental Engneerng Pro. Dr. Mchael Havbro Faber ETH Zurch, Swtzerland Contents o Todays Lecture Overvew o Uncertanty Modelng Random Varables - propertes

More information

Unified Subspace Analysis for Face Recognition

Unified Subspace Analysis for Face Recognition Unfed Subspace Analyss for Face Recognton Xaogang Wang and Xaoou Tang Department of Informaton Engneerng The Chnese Unversty of Hong Kong Shatn, Hong Kong {xgwang, xtang}@e.cuhk.edu.hk Abstract PCA, LDA

More information