Kernel Maximum a Posteriori Classification with Error Bound Analysis

Zenglin Xu, Kaizhu Huang, Jianke Zhu, Irwin King, and Michael R. Lyu

Dept. of Computer Science and Engineering, The Chinese Univ. of Hong Kong, Shatin, N.T., Hong Kong
{zlxu,kzhuang,jkzhu,king,lyu}@cse.cuhk.edu.hk

Abstract. Kernel methods have been widely used in data classification. Many kernel-based classifiers, like the Kernel Support Vector Machine (KSVM), assume that data can be separated by a hyperplane in the feature space. These methods do not consider the data distribution. This paper proposes a novel Kernel Maximum A Posteriori (KMAP) classification method, which adopts a Gaussian density assumption in the feature space and can be regarded as a more generalized classification method than other kernel-based classifiers such as Kernel Fisher Discriminant Analysis (KFDA). We also adopt robust methods for parameter estimation. In addition, the error bound analysis for KMAP indicates the effectiveness of the Gaussian density assumption in the feature space. Furthermore, KMAP achieves very promising results on eight UCI benchmark data sets against the competitive methods.

1 Introduction

Recently, kernel methods have been regarded as state-of-the-art classification approaches [1]. The basic idea of kernel methods in supervised learning is to map data from an input space to a high-dimensional feature space in order to make the data more separable. Classical kernel-based classifiers include the Kernel Support Vector Machine (KSVM) [2], Kernel Fisher Discriminant Analysis (KFDA) [3], and the Kernel Minimax Probability Machine [4,5]. The rationale behind them is that linear discriminant functions in the feature space can represent complex separating surfaces when mapped back to the original input space. However, one drawback of KSVM is that it does not consider the data distribution and cannot directly output probabilities or confidences for classification. It is therefore hard to apply in systems that reason under uncertainty.

On the other hand, in statistical pattern recognition, probability densities can be estimated from data. Future examples are then assigned to the class with the Maximum A Posteriori probability (MAP) [6]. One typical probability density function is the Gaussian, which is easy to handle. However, the Gaussian assumption cannot be easily satisfied in the input space, where it is hard to deal with non-linearly separable problems.

M. Ishikawa et al. (Eds.): ICONIP 2007, Part I, LNCS 4984, pp. 841-850, 2008. (c) Springer-Verlag Berlin Heidelberg 2008

To solve these problems, we propose a Kernel Maximum a Posteriori (KMAP) classification method under a Gaussianity assumption in the feature space. Different from KSVM, we make the Gaussian density assumption, which implies that data can be separated by more complex surfaces in the feature space. Generally, distributions other than the Gaussian can also be assumed in the feature space. However, under a distribution with a complex form, it is hard to obtain a closed-form solution and easy to fall into over-fitting. Moreover, with the Gaussian assumption, a kernelized version of our model can be derived without knowing the explicit form of the mapping functions. In addition, to indicate the effectiveness of our assumption, we calculate a separability measure and the error bound for bi-category data sets. The error bound analysis shows that the Gaussian density assumption can be more easily satisfied in the feature space.

This paper is organized as follows. Section 2 derives the MAP decision rules in the feature space and analyzes their separability measures and upper error bounds. Section 3 presents the experiments against other classifiers. Section 4 reviews the related work. Section 5 draws conclusions and lists possible future research directions.

2 Main Results

In this section, our MAP classification model is derived. Then, we adopt a special regularization to estimate the parameters. The kernel trick is used to compute our model. Last, the separability measure and the error bound are calculated in the kernel-induced feature space.

2.1 Model Formulation

Under the Gaussian distribution assumption, the conditional density function for each class C_i (1 \le i \le m) is written as

p(\Phi(x) \mid C_i) = \frac{1}{(2\pi)^{N/2} |\Sigma_i|^{1/2}} \exp\left\{ -\frac{1}{2} (\Phi(x) - \mu_i)^T \Sigma_i^{-1} (\Phi(x) - \mu_i) \right\},    (1)

where \Phi(x) is the image of x in the feature space, N is the dimension of the feature space (N could be infinite), \mu_i and \Sigma_i are the mean and the covariance matrix of C_i, respectively, and |\Sigma_i| is the determinant of the covariance matrix.

According to Bayes' theorem, the posterior probability of class C_i is calculated by

P(C_i \mid x) = \frac{p(x \mid C_i)\, P(C_i)}{\sum_{j=1}^{m} p(x \mid C_j)\, P(C_j)}.    (2)

Based on Eq. (2), the decision rule can be formulated as

x \in C_w \quad \text{if} \quad P(C_w \mid x) = \max_{1 \le j \le m} P(C_j \mid x).    (3)

This means that a test data point is assigned to the class with the maximum posterior P(C_w | x), i.e., the MAP. Since the MAP is computed in the kernel-induced feature space, the resulting model is named the KMAP classification.
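
To make Eqs. (1)-(3) concrete, here is a minimal sketch (ours, not part of the paper) of the MAP rule for a finite-dimensional feature space; the function name and the toy means, covariances, and priors are illustrative assumptions.

```python
import numpy as np
from scipy.stats import multivariate_normal

def map_classify(phi_x, means, covs, priors):
    """Assign a feature-space point phi_x to the class with the maximum
    posterior probability, following Eqs. (1)-(3)."""
    # Class-conditional Gaussian densities p(Phi(x) | C_i), Eq. (1)
    likelihoods = np.array([multivariate_normal.pdf(phi_x, mean=m, cov=S)
                            for m, S in zip(means, covs)])
    # Posterior P(C_i | x) by Bayes' theorem, Eq. (2)
    posteriors = likelihoods * priors
    posteriors /= posteriors.sum()
    # Decision rule, Eq. (3): the class with the largest posterior wins
    return int(np.argmax(posteriors)), posteriors

# Toy usage: two Gaussian classes in a 2-D "feature space"
means = [np.zeros(2), np.array([2.0, 2.0])]
covs = [np.eye(2), 0.5 * np.eye(2)]
priors = np.array([0.5, 0.5])
label, post = map_classify(np.array([1.8, 1.9]), means, covs, priors)
print(label, post)  # expected: class 1 with high confidence
```

The posterior vector here is exactly the kind of confidence KMAP exposes; thresholding it implements the rejection option described next.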

KMAP provides not only a class label but also the probability of a data point belonging to that class. This probability can be viewed as a confidence in classifying new data points and can be used in statistical systems that reason under uncertainty: if the confidence is lower than some specified threshold, the system can refuse to make an inference. Many kernel learning methods, including KSVM, cannot output such probabilities.

The decision rule can be further formulated as minimizing the discriminant

g_i(\Phi(x)) = (\Phi(x) - \mu_i)^T \Sigma_i^{-1} (\Phi(x) - \mu_i) + \log |\Sigma_i|.    (4)

The intuitive meaning of this function is that a class is more likely to be assigned to an unlabeled data point when the Mahalanobis distance from the data point to the class center is smaller.

2.2 Parameter Estimation

In order to compute the Mahalanobis distance function, the mean vector and the covariance matrix of each class must be estimated. Typically, the mean vector \mu_i and the within-class covariance matrix \Sigma_i are obtained by maximum likelihood estimation. In the feature space, they are formulated as

\mu_i = \frac{1}{n_i} \sum_{j=1}^{n_i} \Phi(x_j),    (5)

\Sigma_i = S_i = \frac{1}{n_i} \sum_{j=1}^{n_i} (\Phi(x_j) - \mu_i)(\Phi(x_j) - \mu_i)^T,    (6)

where n_i is the cardinality of the set of data points belonging to C_i. Directly employing S_i as the covariance matrix generates quadratic discriminant functions in the feature space; in this case, KMAP is denoted KMAP-M.

However, the covariance estimation problem is clearly ill-posed, because the number of data points in each class is usually much smaller than the number of dimensions of the kernel-induced feature space. The treatment of this ill-posed problem is to introduce regularization, and there are several kinds of regularization methods. One of them is to replace each individual within-class covariance matrix by the average, i.e., \Sigma_i = S = \frac{1}{m} \sum_{i=1}^{m} S_i + rI, where I is the identity matrix and r is a regularization coefficient. This method substantially reduces the number of free parameters to be estimated. Moreover, it reduces the discriminant function between two classes to a linear one, so a linear discriminant analysis method is obtained.
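
The estimators of Eqs. (5)-(6) and the averaged-covariance regularizer can be sketched as follows with an explicit finite-dimensional feature map; the paper never forms Phi explicitly, so this is only a sanity check under that simplifying assumption, with function names of our choosing.

```python
import numpy as np

def class_statistics(Phi):
    """ML mean and covariance of one class, Eqs. (5)-(6).
    Phi has shape (n_i, N): one mapped sample Phi(x_j) per row."""
    mu = Phi.mean(axis=0)            # Eq. (5)
    D = Phi - mu
    S = D.T @ D / Phi.shape[0]       # Eq. (6): divide by n_i (the MLE)
    return mu, S

def averaged_covariance(S_list, r):
    """Shared covariance S = (1/m) sum_i S_i + r I; using it for every
    class turns the discriminant between two classes into a linear one."""
    S = sum(S_list) / len(S_list)
    return S + r * np.eye(S.shape[0])

# Toy usage with two classes in a 3-D feature space
rng = np.random.default_rng(0)
Phi1 = rng.normal(size=(50, 3))
Phi2 = 1.0 + rng.normal(size=(40, 3))
(mu1, S1), (mu2, S2) = class_statistics(Phi1), class_statistics(Phi2)
S_shared = averaged_covariance([S1, S2], r=1e-3)
```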

Alternatively, we can estimate the covariance matrix by combining the above linear discriminant function with the quadratic one. Instead of estimating the covariance matrix in the input space [7], we apply this method in the feature space. The formulation in the feature space is as follows:

\tilde{\Sigma}_i = (1 - \eta)\, \hat{\Sigma}_i + \frac{\eta}{n}\, \mathrm{trace}(\hat{\Sigma}_i)\, I,    (7)

where \hat{\Sigma}_i = (1 - \theta) S_i + \theta S. In these equations, \theta (0 \le \theta \le 1) is a coefficient balancing the linear discriminant term and the quadratic one, while \eta (0 \le \eta \le 1) determines the shrinkage towards a multiple of the identity matrix. This approach is more flexible in adjusting the effect of the regularization. The corresponding KMAP is denoted KMAP-R.

2.3 Kernel Calculation

We now derive methods to calculate the Mahalanobis distance (Eq. (4)) using the kernel trick, i.e., we only need to formulate the function in inner-product form, regardless of the explicit mapping function. To do this, the spectral representation of the covariance matrix, \Sigma_i = \sum_{j=1}^{N} \Lambda_{ij} \Omega_{ij} \Omega_{ij}^T, is utilized, where \Lambda_{ij} \in \mathbb{R} is the j-th eigenvalue of \Sigma_i and \Omega_{ij} \in \mathbb{R}^N is the eigenvector associated with \Lambda_{ij}. However, the small eigenvalues degrade the performance of the function overwhelmingly, because they are underestimated due to the small number of examples. In this paper, we therefore estimate only the k largest eigenvalues and replace each remaining eigenvalue by a nonnegative number h_i. Thus Eq. (4) can be reformulated as

g_i(\Phi(x)) = \frac{1}{h_i} \left[ g_{i1}(\Phi(x)) - g_{i2}(\Phi(x)) \right] + g_{i3}(\Phi(x)),    (8)

where

g_{i1}(\Phi(x)) = \sum_{j=1}^{N} [\Omega_{ij}^T (\Phi(x) - \mu_i)]^2,
g_{i2}(\Phi(x)) = \sum_{j=1}^{k} \left( 1 - \frac{h_i}{\Lambda_{ij}} \right) [\Omega_{ij}^T (\Phi(x) - \mu_i)]^2,
g_{i3}(\Phi(x)) = \log \left( h_i^{N-k} \prod_{j=1}^{k} \Lambda_{ij} \right).

In the following, we show that g_{i1}(\Phi(x)), g_{i2}(\Phi(x)), and g_{i3}(\Phi(x)) can all be written in kernel form. To formulate these expressions, we need to calculate the eigenvalues \Lambda_{ij} and eigenvectors \Omega_{ij}. The eigenvectors lie in the space spanned by the training samples, i.e., each eigenvector \Omega_{ij} can be written as a linear combination of the training samples:

\Omega_{ij} = \sum_{l=1}^{n_i} \gamma_{ij}^{(l)} \Phi(x_l) = U_i \gamma_{ij},    (9)

where \gamma_{ij} = (\gamma_{ij}^{(1)}, \gamma_{ij}^{(2)}, \ldots, \gamma_{ij}^{(n_i)})^T is an n_i-dimensional column vector and U_i = (\Phi(x_1), \ldots, \Phi(x_{n_i})).
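
As the next paragraph notes, the pairs (\Lambda_{ij}, \gamma_{ij}) of Eq. (9) come from the class kernel block, so they can be computed kernel-PCA style. A sketch under our assumptions (an RBF kernel and the standard normalization that makes each \Omega_{ij} unit length; function names are ours):

```python
import numpy as np

def rbf_kernel(X, Y, gamma):
    """K(x, y) = exp(-gamma * ||x - y||^2)."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def class_eigensystem(X_i, gamma):
    """Eigenvalues Lambda_ij of the class covariance in the feature space
    and the expansion coefficients gamma_ij of Eq. (9), computed from the
    centered kernel block of class C_i."""
    n = X_i.shape[0]
    K = rbf_kernel(X_i, X_i, gamma)
    H = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    Kc = H @ K @ H                           # centered kernel block
    lam, alpha = np.linalg.eigh(Kc / n)      # Lambda_ij and unit vectors
    lam, alpha = lam[::-1], alpha[:, ::-1]   # sort in descending order
    # Rescale so that Omega_ij = U_i gamma_ij has unit norm: with gamma = alpha,
    # ||Omega_ij||^2 = alpha_j^T Kc alpha_j = n * Lambda_ij, so divide each
    # alpha_j by sqrt(n * Lambda_ij).
    gammas = np.zeros_like(alpha)
    pos = lam > 1e-12
    gammas[:, pos] = alpha[:, pos] / np.sqrt(n * lam[pos])
    return lam, gammas
```

Only the k leading pairs are kept downstream; the remaining eigenvalues are replaced by h_i, as in Eq. (8).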

It is easy to prove that \gamma_{ij} and \Lambda_{ij} are actually an eigenvector and the corresponding eigenvalue of \frac{1}{n_i} G^{(i)}, where G^{(i)} is the block of the kernel matrix G relevant to class C_i. We omit the proof due to space limitations. Accordingly, we can express g_{i1}(\Phi(x)) in kernel form:

g_{i1}(\Phi(x)) = \sum_{j=1}^{n_i} \gamma_{ij}^T U_i^T (\Phi(x) - \mu_i)(\Phi(x) - \mu_i)^T U_i \gamma_{ij}
               = \sum_{j=1}^{n_i} \left[ \gamma_{ij}^T \left( K_x - \frac{1}{n_i} \sum_{l=1}^{n_i} K_{x_l} \right) \right]^2
               = \left\| K_x - \frac{1}{n_i} \sum_{l=1}^{n_i} K_{x_l} \right\|^2,    (10)

where K_x = (K(x_1, x), \ldots, K(x_{n_i}, x))^T. In the same way, g_{i2}(\Phi(x)) can be formulated as

g_{i2}(\Phi(x)) = \sum_{j=1}^{k} \left( 1 - \frac{h_i}{\Lambda_{ij}} \right) \Omega_{ij}^T (\Phi(x) - \mu_i)(\Phi(x) - \mu_i)^T \Omega_{ij}.    (11)

Substituting (9) into the above, we have

g_{i2}(\Phi(x)) = \sum_{j=1}^{k} \left( 1 - \frac{h_i}{\Lambda_{ij}} \right) \gamma_{ij}^T \left( K_x - \frac{1}{n_i} \sum_{l=1}^{n_i} K_{x_l} \right) \left( K_x - \frac{1}{n_i} \sum_{l=1}^{n_i} K_{x_l} \right)^T \gamma_{ij}.    (12)

Now the Mahalanobis distance function g_i(\Phi(x)) in the feature space can finally be written in kernel form, where N in g_{i3}(\Phi(x)) is substituted by the cardinality n_i. The time complexity of KMAP is dominated by the eigenvalue decomposition, which scales as O(n^3). Thus KMAP has the same complexity as KFDA.
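
Combining Eqs. (8) and (10)-(12) gives the fully kernelized discriminant. A hedged sketch, reusing rbf_kernel and class_eigensystem from the previous sketch; k and h are free parameters here (Section 3.2 of the paper sets h_i = \Lambda_{i,k+1}):

```python
import numpy as np

def kmap_score(x, X_i, gamma, k, h, lam, gammas):
    """Kernelized Mahalanobis discriminant g_i(Phi(x)) of Eq. (8), with
    lam, gammas produced by class_eigensystem(X_i, gamma). A test point
    is assigned to the class whose score is smallest (cf. Eq. (4))."""
    n = X_i.shape[0]
    K = rbf_kernel(X_i, X_i, gamma)
    Kx = rbf_kernel(X_i, x[None, :], gamma).ravel()
    v = Kx - K.mean(axis=1)                    # K_x - (1/n_i) sum_l K_{x_l}
    g1 = v @ v                                 # Eq. (10)
    proj = gammas[:, :k].T @ v                 # gamma_ij^T (...), Eq. (12)
    g2 = ((1.0 - h / lam[:k]) * proj ** 2).sum()
    g3 = (n - k) * np.log(h) + np.log(lam[:k]).sum()  # N replaced by n_i
    return (g1 - g2) / h + g3
```

Classifying a point then amounts to evaluating kmap_score once per class and taking the argmin, which mirrors the MAP rule under equal priors.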

2.4 Connection to Other Kernel Methods

In the following, we show the connection between KMAP and other kernel-based methods. In the regularization scheme of Eq. (7), other kernel-based classification methods can be derived by varying the settings of \theta and \eta. When (\theta = 0, \eta = 0), the KMAP model is a quadratic discriminant method in the kernel-induced feature space; when (\theta = 1, \eta = 0), it is a kernel discriminant method; and when (\theta = 0, \eta = 1) or (\theta = 1, \eta = 1), it is the nearest mean classifier. Therefore, by varying \theta and \eta, different models can be generated from combinations of the quadratic discriminant, linear discriminant, and nearest mean methods.

We now consider the special case \theta = 1 and \eta = 0. If both classes of a binary problem are assumed to share the same covariance structure, i.e., \Sigma = \frac{\Sigma_1 + \Sigma_2}{2}, a linear discriminant function results. Assuming all classes have the same prior probabilities, g_i(\Phi(x)) can be derived as

g_i(\Phi(x)) = (\Phi(x) - \mu_i)^T \left( \frac{\Sigma_1 + \Sigma_2}{2} \right)^{-1} (\Phi(x) - \mu_i), \quad i = 1, 2.

Dropping the quadratic term common to both classes, we can rewrite this as g_i(\Phi(x)) = w_i \Phi(x) + b_i, where w_i = -4 \mu_i^T (\Sigma_1 + \Sigma_2)^{-1} and b_i = 2 \mu_i^T (\Sigma_1 + \Sigma_2)^{-1} \mu_i. The decision hyperplane is given by f(\Phi(x)) = g_1(\Phi(x)) - g_2(\Phi(x)), which, up to a scaling constant that does not affect the decision, is

f(\Phi(x)) = (\mu_1 - \mu_2)^T (\Sigma_1 + \Sigma_2)^{-1} \Phi(x) - \frac{1}{2} (\mu_1 - \mu_2)^T (\Sigma_1 + \Sigma_2)^{-1} (\mu_1 + \mu_2).    (13)

Eq. (13) is exactly the solution of KFDA [3]. Therefore, KFDA can be viewed as a special case of KMAP in which all classes share the same covariance structure.

Remark. KMAP provides a rich class of kernel-based classification algorithms under different regularization settings. This makes KMAP a flexible framework for classification, adaptive to the data distribution.

2.5 Separability Measures and Error Bounds

To measure the separability of different classes of data in the feature space, the Kullback-Leibler divergence (a.k.a. K-L distance) between two Gaussians is adopted. The K-L divergence is defined as

d_{KL}[p_i(\Phi(x)), p_j(\Phi(x))] = \int p_i(\Phi(x)) \ln \frac{p_i(\Phi(x))}{p_j(\Phi(x))} \, d\Phi(x).    (14)

Since the K-L divergence is not symmetric, a two-way divergence is used to measure the distance between two distributions:

d_{ij} = d_{KL}[p_i(\Phi(x)), p_j(\Phi(x))] + d_{KL}[p_j(\Phi(x)), p_i(\Phi(x))].    (15)

Following [6], it can be proved that

d_{ij} = \frac{1}{2} (\mu_i - \mu_j)^T (\Sigma_i^{-1} + \Sigma_j^{-1}) (\mu_i - \mu_j) + \frac{1}{2} \mathrm{trace}(\Sigma_i^{-1} \Sigma_j + \Sigma_j^{-1} \Sigma_i - 2I),    (16)

which can be evaluated using the kernel trick of Section 2.3.

The Bayesian decision rule guarantees the lowest average error rate, as presented in the following:

P(\text{correct}) = \sum_{i=1}^{m} \int_{R_i} p(\Phi(x) \mid C_i) P(C_i) \, d\Phi(x),    (17)

where R_i is the decision region of class C_i. We implement the Bhattacharyya bound in the feature space for the Gaussian density. Following [6], we have

P(\text{error}) \le \sqrt{P(C_1) P(C_2)} \, \exp\{-q(0.5)\},    (18)

where

q(0.5) = \frac{1}{8} (\mu_2 - \mu_1)^T \left( \frac{\Sigma_1 + \Sigma_2}{2} \right)^{-1} (\mu_2 - \mu_1) + \frac{1}{2} \ln \frac{\left| \frac{\Sigma_1 + \Sigma_2}{2} \right|}{\sqrt{|\Sigma_1|\,|\Sigma_2|}}.    (19)
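
For two Gaussians with known parameters, Eqs. (16), (18), and (19) reduce to a few lines. The sketch below (ours) works directly with explicit means and covariances, whereas the paper evaluates the same quantities in the feature space through the kernel forms of Section 2.3.

```python
import numpy as np

def symmetric_kl(mu_i, S_i, mu_j, S_j):
    """Two-way K-L divergence d_ij between two Gaussians, Eq. (16)."""
    d = mu_i - mu_j
    Si_inv, Sj_inv = np.linalg.inv(S_i), np.linalg.inv(S_j)
    quad = 0.5 * d @ (Si_inv + Sj_inv) @ d
    tr = 0.5 * np.trace(Si_inv @ S_j + Sj_inv @ S_i - 2.0 * np.eye(len(d)))
    return quad + tr

def bhattacharyya_bound(mu1, S1, mu2, S2, p1=0.5, p2=0.5):
    """Upper bound on P(error) for two Gaussian classes, Eqs. (18)-(19)."""
    d = mu2 - mu1
    S = 0.5 * (S1 + S2)
    q = (d @ np.linalg.inv(S) @ d) / 8.0 + 0.5 * np.log(
        np.linalg.det(S) / np.sqrt(np.linalg.det(S1) * np.linalg.det(S2)))
    return np.sqrt(p1 * p2) * np.exp(-q)
```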

Using the results in Section 2.3, the Bhattacharyya error bound can be easily calculated in the kernel-induced feature space.

3 Experiments

In this section, we report experiments evaluating the separability measure, the error bound, and the prediction performance of the proposed KMAP.

3.1 Synthetic Data

We compare the separability measure and the error bounds on three synthetic data sets. The description of these data sets can be found in [8]. The data sets are named according to their characteristics and are plotted in Fig. 1. We map the data with an RBF kernel into a feature space where Gaussian distributions are approximately satisfied, and then calculate separability measures on all data sets according to Eq. (16). The separability values for Overlap, Bumpy, and Relevance in the original input space are 14.94, 5.16, and 2.18, respectively; the corresponding values in the feature space are 30.88, 5.87, and 3631, respectively. The results indicate that data become more separable after being mapped into the feature space, especially for the Relevance data set.

For data in the kernel-induced feature space, the error bounds are calculated according to Eq. (18). Figure 1 also plots the prediction rates and the upper error bounds for data in the input space and in the feature space, respectively. It can be observed that the error bounds are more valid in the feature space than in the input space.

3.2 Benchmark Data

Experimental Setup. In this experiment, KSVM, KFDA, the Modified Quadratic Discriminant Function (MQDF) [9], and Kernel Fisher's Quadratic Discriminant Analysis (KFQDA) [10] are employed as the competitive algorithms. We implement two variants of KMAP, i.e., KMAP-M and KMAP-R. The properties of the eight UCI benchmark data sets are described in Table 1. In all kernel methods, a Gaussian RBF kernel is used. The parameter C of KSVM and the parameter \gamma of the RBF kernel are tuned by 10-fold cross-validation. In KMAP, we select the k leading eigenvalue-eigenvector pairs according to their cumulative contribution to the covariance matrix, i.e., k is the smallest index l such that \sum_{q=1}^{l} \Lambda_q / \sum_{q=1}^{n} \Lambda_q \ge \alpha; in MQDF, the range of k is relatively small and we select k by cross-validation. PCA is used as the regularization method in KFQDA and the cumulative decay ratio is set to 99%; the regularization parameter r is set to ... in KFDA.
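
The tuning protocol for the KSVM baseline is a plain grid search under 10-fold cross-validation. A sketch with scikit-learn, which is our choice of tooling (the paper names neither an implementation nor the search grids, so both are illustrative assumptions):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)   # stands in for one of the UCI sets
param_grid = {
    "C": 10.0 ** np.arange(-2, 4),       # illustrative grid for C
    "gamma": 10.0 ** np.arange(-3, 2),   # illustrative grid for the RBF width
}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=10)  # 10-fold CV
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```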

Fig. 1. The data plots of Overlap, Bumpy, and Relevance (panels (a)-(c)) and the comparison of data separability in the input space and the feature space (panel (d): prediction error and error bound (%) across the data sets, with curves input_bound, input_rate, feature_bound, and feature_rate).

Table 1. Data set information: the eight UCI benchmark sets are Iono, Breast, Twonorm, Sonar, Pima, Iris, Wine, and Segment, each listed with its number of samples, features, and classes.

In both KMAP and MQDF, h_i takes the value of \Lambda_{i,k+1}. In KMAP-R, the extra parameters (\theta, \eta) are tuned by cross-validation. All experimental results are obtained over 10 runs, and each run is executed with 10-fold cross-validation on each data set.

Experimental Results. Table 2 reports the average prediction accuracy with standard errors on each data set for all algorithms. It can be observed that both variants of KMAP outperform MQDF, which is an MAP method in the input space. This also empirically validates that the separability among different classes of data becomes larger, and that the upper error bounds get tighter and more accurate, after data are mapped to the high-dimensional feature space. Moreover, the performance of KMAP is competitive with that of other kernel methods. In particular, KMAP-R achieves better prediction accuracy than all other methods on most of the data sets.

The reason is that the regularization methods in KMAP favorably capture the prior distribution of the data, since the Gaussian assumption in the feature space can fit a very complex distribution in the input space.

Table 2. The prediction results of KMAP and other methods: average accuracy (%) with standard errors for KSVM, MQDF, KFDA, KFQDA, KMAP-M, and KMAP-R on the eight data sets, together with the per-method averages (e.g., KSVM attains 94.1% on Iono, 96.5% on Breast, 96.1% on Twonorm, 86.6% on Sonar, 77.9% on Pima, and 98.8% on Wine).

4 Related Work

In statistical pattern recognition, the probability density function can first be estimated from data; future examples are then assigned to the class with the MAP. One typical example is the Quadratic Discriminant Function (QDF) [11], which is derived from the multivariate normal distribution and achieves the minimum mean error rate under the Gaussian distribution. In [9], a Modified Quadratic Discriminant Function (MQDF) less sensitive to estimation error is proposed, and [7] improves the performance of QDF by covariance matrix interpolation.

Unlike QDF, another type of classifier does not assume the probability density functions in advance, but is designed directly on the data samples. An example is Fisher Discriminant Analysis (FDA), which maximizes the between-class covariance while minimizing the within-class variance; it can be derived as a Bayesian classifier under a Gaussian assumption on the data. [3] develops Kernel Fisher Discriminant Analysis (KFDA) by extending FDA to a non-linear space via the kernel trick. To supplement the statistical justification of KFDA, [10] extends the maximum likelihood method and Bayes classification to their kernel generalizations under a Gaussian Hilbert-space assumption. The authors do not directly kernelize the quadratic forms in terms of kernel values; instead, they use an explicit mapping function to map the data to a high-dimensional space, so that the kernel matrix is used as the input data of FDA. The derived model is named Kernel Fisher's Quadratic Discriminant Analysis (KFQDA).

5 Conclusion and Future Work

In this paper, we present a novel kernel classifier named Kernel Maximum a Posteriori (KMAP) classification, which implements a Gaussian distribution assumption in the kernel-induced feature space. Compared to state-of-the-art classifiers, the advantages of KMAP are that prior information about the distribution is incorporated and that it can output a probability or confidence when making a decision.

Moreover, KMAP can be regarded as a more generalized classification method than other kernel-based methods such as KFDA. In addition, the error bound analysis illustrates that the Gaussian distribution assumption is more easily satisfied in the feature space than in the input space. More importantly, KMAP with proper regularization achieves very promising performance. In future work, we plan to incorporate the probability information into both the kernel function and the classifier.

Acknowledgments. The work described in this paper is fully supported by two grants from the Research Grants Council of the Hong Kong Special Administrative Region, China (Project No. CUHK4205/04E and Project No. CUHK4235/04E).

References

1. Schölkopf, B., Smola, A.: Learning with Kernels. MIT Press, Cambridge (2002)
2. Vapnik, V.N.: Statistical Learning Theory. John Wiley & Sons, Chichester (1998)
3. Mika, S., Rätsch, G., Weston, J., Schölkopf, B., Müller, K.: Fisher discriminant analysis with kernels. In: Proceedings of the IEEE Neural Networks for Signal Processing Workshop, pp. 41-48 (1999)
4. Lanckriet, G.R.G., El Ghaoui, L., Bhattacharyya, C., Jordan, M.I.: A robust minimax approach to classification. Journal of Machine Learning Research 3, 555-582 (2002)
5. Huang, K., Yang, H., King, I., Lyu, M.R., Chan, L.: The minimum error minimax probability machine. Journal of Machine Learning Research 5, 1253-1286 (2004)
6. Duda, R.O., Hart, P.E., Stork, D.G.: Pattern Classification, 2nd edn. Wiley-Interscience (2000)
7. Friedman, J.H.: Regularized discriminant analysis. Journal of the American Statistical Association 84(405), 165-175 (1989)
8. Centeno, T.P., Lawrence, N.D.: Optimising kernel parameters and regularisation coefficients for non-linear discriminant analysis. Journal of Machine Learning Research 7, 455-491 (2006)
9. Kimura, F., Takashina, K., Tsuruoka, S., Miyake, Y.: Modified quadratic discriminant functions and the application to Chinese character recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 9(1), 149-153 (1987)
10. Huang, S.Y., Hwang, C.R., Lin, M.H.: Kernel Fisher's discriminant analysis in Gaussian Reproducing Kernel Hilbert Space. Technical report, Academia Sinica, Taiwan, R.O.C. (2005)
11. Fukunaga, K.: Introduction to Statistical Pattern Recognition, 2nd edn. Academic Press, San Diego (1990)
