Sparse Gaussian Processes Using Backward Elimination
Liefeng Bo, Ling Wang, and Licheng Jiao

Institute of Intelligent Information Processing and National Key Laboratory for Radar Signal Processing, Xidian University, Xi'an, China
{blf018, wlp}@163.com

Abstract. Gaussian Processes (GPs) have state-of-the-art performance in regression. In GPs, all the basis functions are required for prediction; hence their test speed is slower than that of other learning algorithms such as support vector machines (SVMs), the relevance vector machine (RVM), adaptive sparseness (AS), etc. To overcome this limitation, we present a backward elimination algorithm, called GPs-BE, that recursively selects the basis functions for GPs until some stop criterion is satisfied. By integrating rank-1 updates, GPs-BE can be implemented at a reasonable cost. Extensive empirical comparisons confirm the feasibility and validity of the proposed algorithm.

1 Introduction

Covariance functions have a great effect on the performance of GPs. The experiments performed by Williams [1] and Rasmussen [2] have shown that the following covariance function works well in practice:

$$C(\mathbf{x}_i, \mathbf{x}_j) = \exp\left(-\sum_{p=1}^{d} \theta_p \left(x_i^p - x_j^p\right)^2\right) \qquad (1.1)$$

where $\theta_p$ is a scaling factor. If some variable is unimportant or irrelevant for regression, the associated scaling factor will be made small; otherwise it will be made large. The key advantage of GPs is that the hyperparameters of the covariance function can be optimized by maximizing the evidence. This is not the case in other kernel-based learning methods such as support vector machines (SVMs) [3]. In SVMs, an extra model selection criterion, e.g. the cross-validation score, is required for choosing hyperparameters, which is intractable when a large number of hyperparameters are involved.
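For instance, the covariance (1.1) can be evaluated for a batch of inputs as follows. This is an illustrative sketch of ours, not code from the paper; the function name and the numeric values are assumptions:

```python
import numpy as np

def ard_covariance(X1, X2, theta):
    """Covariance (1.1): C(x_i, x_j) = exp(-sum_p theta_p (x_i^p - x_j^p)^2),
    with one non-negative scaling factor theta_p per input dimension."""
    diff = X1[:, None, :] - X2[None, :, :]          # pairwise per-dimension differences
    return np.exp(-(diff ** 2 * theta).sum(axis=-1))

X = np.array([[0.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0]])
theta = np.array([1.0, 0.01])   # second variable treated as nearly irrelevant
K = ard_covariance(X, X, theta)
```

With a small scaling factor on the second variable, points that differ only in that variable stay highly correlated (K[0, 2] = exp(-0.01) ≈ 0.99), while the same difference in the first variable decorrelates them (K[0, 1] = exp(-1) ≈ 0.37) — which is how evidence maximization can effectively prune irrelevant inputs.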
Though GPs are very successful, they also have some shortcomings: (1) the computational cost of GPs is $O(l^3)$, where $l$ is the number of training samples, which seems to prohibit the application of GPs to large datasets; (2) all the basis functions are required for prediction, hence the test speed is slower than that of other learning algorithms such as SVMs, the relevance vector machine (RVM) [4], adaptive sparseness (AS) [5], etc. Some researchers have tried to deal with these shortcomings. In 2000, Smola et al. [6] presented sparse greedy Gaussian processes (SGGPs), whose computational

J. Wang et al. (Eds.): ISNN 2006, LNCS 3971, pp. 1083-1088, 2006. Springer-Verlag Berlin Heidelberg 2006
cost is $O(kn^2 l)$, where $n$ is the number of basis functions and $k$ is a constant factor. In 2002, Csató et al. also proposed sparse on-line Gaussian processes (SOGPs) [7], which achieve good sparseness and low complexity simultaneously. However, both SGGPs and SOGPs throw away the key advantage of GPs; as a result, they have difficulties in handling the hyperparameters.

This paper focuses on the second shortcoming of GPs above. We propose a backward elimination algorithm (GPs-BE) that recursively removes the basis function with the smallest leave-one-out score at the current step until some stop criterion is satisfied. GPs-BE has a reasonable computational complexity thanks to a rank-1 update formula. GPs-BE is performed after the GP is trained; hence all the advantages of GPs are preserved. Extensive empirical comparisons show that our method greatly reduces the number of basis functions of GPs almost without sacrificing performance.

2 Gaussian Processes

Let $Z = \{(\mathbf{x}_i, y_i)\}_{i=1}^{l}$ be a set of $l$ empirical samples drawn from

$$y_i = f(\mathbf{x}_i, \mathbf{w}) + \varepsilon_i, \quad i = 1, \ldots, l \qquad (2.1)$$

where the $\varepsilon_i$ are independent samples from some noise process, further assumed to be zero-mean Gaussian with variance $\sigma^2$. We further assume

$$f(\mathbf{x}, \mathbf{w}) = \sum_{i=1}^{l} w_i C(\mathbf{x}, \mathbf{x}_i) \qquad (2.2)$$

According to Bayesian inference, the posterior probability of $\mathbf{w}$ can be expressed as

$$P(\mathbf{w} \mid Z) = \frac{P(Z \mid \mathbf{w})\, P(\mathbf{w})}{P(Z)} \qquad (2.3)$$

Maximizing the log-posterior is equivalent to minimizing the following objective function

$$\hat{\mathbf{w}} = \arg\min_{\mathbf{w}} L(\mathbf{w}) = \arg\min_{\mathbf{w}} \left[ \mathbf{w}^T \left( \mathbf{C}^T \mathbf{C} + \sigma^2 \mathbf{I} \right) \mathbf{w} - 2\, \mathbf{w}^T \mathbf{C}^T \mathbf{y} \right] \qquad (2.4)$$

where $\mathbf{I}$ is the identity matrix. Hyperparameters are chosen by maximizing the following evidence

$$P(Z \mid \boldsymbol{\theta}, \sigma^2) = (2\pi)^{-l/2} \left| \sigma^2 \mathbf{I} + \mathbf{C}\mathbf{C}^T \right|^{-1/2} \exp\left( -\tfrac{1}{2}\, \mathbf{y}^T \left( \sigma^2 \mathbf{I} + \mathbf{C}\mathbf{C}^T \right)^{-1} \mathbf{y} \right) \qquad (2.5)$$

In related Bayesian models this quantity is known as the marginal likelihood, and its maximization as the type-II maximum likelihood method [8]. Williams [9] has demonstrated that this model is equivalent to Gaussian Processes (GPs) with the covariance $\left( \sigma^2 \mathbf{I} + \mathbf{C}\mathbf{C}^T \right)$; hence we call it GPs in this paper.
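As a concrete illustration of (2.2) and (2.4), the MAP weights are obtained by solving a regularized linear system in the kernel basis. This is our sketch with made-up data and hyperparameter values, not an experiment from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(30, 1))           # 30 one-dimensional inputs
y = np.sin(3.0 * X[:, 0]) + 0.05 * rng.standard_normal(30)

theta = np.array([5.0])                             # scaling factor in (1.1), chosen by hand
sigma2 = 0.05 ** 2                                  # assumed noise variance

diff = X[:, None, :] - X[None, :, :]
C = np.exp(-(diff ** 2 * theta).sum(axis=-1))       # l x l matrix of basis functions C(x, x_i)

# Minimizing (2.4) gives w = (C^T C + sigma^2 I)^{-1} C^T y
w = np.linalg.solve(C.T @ C + sigma2 * np.eye(len(X)), C.T @ y)
pred = C @ w                                        # f(x, w) from (2.2) at the training inputs
```

Note that every training point contributes one basis function, which is exactly the test-time cost that GPs-BE sets out to reduce.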
3 Backward Elimination for Gaussian Processes

In GPs, all the basis functions are used for prediction; therefore GPs are inferior to neural networks, SVMs and the RVM in testing speed, which seems to prohibit their application in some fields. Here, GPs-BE is proposed to overcome this problem: it selects the basis functions by a backward elimination technique after the training procedure. GPs-BE is a backward greedy algorithm that recursively removes the basis function with the smallest leave-one-out score at the current step until some stop criterion is satisfied.

For convenience of derivation, we reformulate (2.4) as

$$\mathbf{w} = \mathbf{H}^{-1} \mathbf{b} \qquad (3.1)$$

where $\mathbf{H} = \mathbf{C}^T \mathbf{C} + \sigma^2 \mathbf{I}$ and $\mathbf{b} = \mathbf{C}^T \mathbf{y}$. Let $\Delta f^{(k)}$ be the increment of $L$ with the $k$-th basis function deleted; then the following theorem holds true.

Theorem 3.1: $\Delta f^{(k)} = w_k^2 / R_{kk}$, where $\mathbf{R} = \mathbf{H}^{-1}$ and $R_{kk}$ denotes the $k$-th diagonal element of $\mathbf{H}^{-1}$.

We call $\Delta f^{(k)}$ the leave-one-out score. At each step we remove the basis function with the smallest leave-one-out score. The index of the basis function to be deleted is obtained by

$$s = \arg\min_{k \in P} \left( \Delta f^{(k)} \right) \qquad (3.2)$$

where $P$ is the set of indices of the remaining basis functions. Note that the $(l+1)$-th variable, i.e. the bias, is preserved during the backward elimination process. When one basis function is deleted, we need to update the matrix $\mathbf{R}$ and the vector $\mathbf{w}$. In terms of a rank-1 update, $\mathbf{R}$ and $\mathbf{w}$ can be formulated as

$$(\mathbf{R}')_{ij} = R_{ij} - \frac{R_{is} R_{sj}}{R_{ss}}, \quad i, j \neq s, \qquad (3.3)$$

$$(\mathbf{w}')_i = \sum_{j \neq s} \left( R_{ij} - \frac{R_{is} R_{sj}}{R_{ss}} \right) b_j, \quad i \neq s. \qquad (3.4)$$

Together with $\mathbf{w} = \mathbf{R}\mathbf{b}$, (3.4) simplifies to

$$(\mathbf{w}')_i = w_i - \frac{R_{is}}{R_{ss}}\, w_s, \quad i \neq s. \qquad (3.5)$$

Suppose that $\Delta_t$ is the increment of $f$ at the $t$-th iteration; then we terminate the backward elimination procedure if

$$\Delta_t \geq \varepsilon f \qquad (3.6)$$

where $\varepsilon$ is a small positive threshold. The detailed backward elimination procedure is summarized in Fig. 3.1.
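Theorem 3.1 and the updates (3.3)-(3.5) lend themselves to a direct numerical check, and together they yield the whole elimination loop. The sketch below is ours, not the authors' code: it uses a random positive definite matrix in place of $\mathbf{C}^T\mathbf{C} + \sigma^2\mathbf{I}$, verifies the leave-one-out score against brute-force refitting, and then runs the backward pass; the tolerance `tol` is a placeholder for the stop criterion (3.6), whose exact threshold is not reproduced here:

```python
import numpy as np

def loo_scores(R, w):
    # Theorem 3.1: deleting basis function k raises the minimum of L by w_k^2 / R_kk
    return w ** 2 / np.diag(R)

def eliminate(R, w, s):
    """Rank-1 downdate of R = H^{-1} and w = R b after deleting index s,
    following (3.3) and (3.5)."""
    keep = [i for i in range(len(w)) if i != s]
    w_new = w[keep] - R[keep, s] / R[s, s] * w[s]                               # (3.5)
    R_new = R[np.ix_(keep, keep)] - np.outer(R[keep, s], R[s, keep]) / R[s, s]  # (3.3)
    return R_new, w_new, keep

def gps_be(H, b, tol):
    """Backward elimination loop: repeatedly delete the basis function with
    the smallest leave-one-out score until the best candidate's score
    exceeds tol (a stand-in for stop criterion (3.6))."""
    R = np.linalg.inv(H)
    w = R @ b                       # (3.1)
    idx = list(range(len(b)))       # indices of the remaining basis functions
    while len(idx) > 1:
        scores = loo_scores(R, w)
        s = int(np.argmin(scores))  # (3.2)
        if scores[s] > tol:
            break
        R, w, keep = eliminate(R, w, s)
        idx = [idx[i] for i in keep]
    return idx, w

# --- numerical check of Theorem 3.1 against brute-force refitting ---
rng = np.random.default_rng(1)
l = 8
A = rng.standard_normal((l, l))
H = A @ A.T + 0.1 * np.eye(l)       # positive definite stand-in for C^T C + sigma^2 I
b = rng.standard_normal(l)
R, w = np.linalg.inv(H), np.linalg.solve(H, b)

def min_loss(Hs, bs):
    # minimum of w^T H w - 2 w^T b, attained at w = H^{-1} b; equals -b^T H^{-1} b
    return -bs @ np.linalg.solve(Hs, bs)

k = 3
keep = [i for i in range(l) if i != k]
delta_brute = min_loss(H[np.ix_(keep, keep)], b[keep]) - min_loss(H, b)
assert np.isclose(delta_brute, loo_scores(R, w)[k])

# --- run the full backward pass ---
idx, w_final = gps_be(H, b, tol=np.inf)   # np.inf: eliminate down to one basis function
assert len(idx) == 1
# the downdated w still matches a direct refit on the surviving subset
assert np.allclose(w_final, np.linalg.solve(H[np.ix_(idx, idx)], b[idx]))
```

The rank-1 downdate is what keeps the cost reasonable: once $\mathbf{R} = \mathbf{H}^{-1}$ is available, each deletion costs $O(m^2)$ for $m$ remaining basis functions instead of a fresh $O(m^3)$ inversion.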
Algorithm 1: GPs-BE
1. Compute the index of the basis function to be removed by (3.2);
2. Update the matrix R and the vector w by (3.3) and (3.5);
3. Remove the index resulting from Step 1;
4. If (3.6) is satisfied, stop; otherwise, go to Step 1.

Fig. 3.1. Flow chart of backward elimination

4 Empirical Study

In order to evaluate the performance of GPs-BE, we compare it with GPs, GPs-U, SVMs, RVM and AS on four benchmark datasets, i.e. Friedman1 [10], Boston Housing, Abalone and Computer Activity [11]. GPs-U denotes GPs whose covariance function uses the same scaling factor for all variables. Before the experiments, all the training data are scaled to [-1, 1] and the testing data are adjusted using the same linear transformation. For the Friedman1 and Boston Housing datasets, the results are averaged over 100 random splits of the full datasets. For the Abalone and Computer Activity datasets, the results are averaged over 10 random splits of the mother datasets. The free parameters in GPs, GPs-BE and GPs-U are optimized by maximizing the evidence. The free parameters in RVM, SVMs and AS are selected by a 10-fold cross-validation procedure.

Table 4.1. Characteristics of the four benchmark datasets

Abbr.  Problem            Attributes  Total Size  Training Size  Testing Size
FRI    Friedman1
BOH    Boston Housing
ABA    Abalone
COA    Computer Activity

Table 4.2. Mean of the testing errors of the six algorithms

Problem  GPs  GPs-BE  GPs-U  RVM  SVMs  AS
FRI
BOH
ABA
COA

Table 4.3. Mean of the number of basis functions of the six algorithms on the benchmark datasets

Problem  GPs  GPs-BE  GPs-U  RVM  SVMs  AS
FRI
BOH
ABA
COA
Table 4.4. Runtime of the six algorithms on the benchmark datasets

Problem  GPs  GPs-BE  GPs-U  RVM  SVMs  AS
FRI
BOH
ABA
COA

From Table 4.2 we see that GPs-BE and GPs obtain similar generalization performance and are significantly better than GPs-U, RVM, SVMs and AS on two of the regression tasks, i.e. Friedman1 and Computer Activity. On the remaining two tasks, all six approaches perform similarly. Since GPs-U is often superior to SGGPs and SOGPs in terms of generalization performance, GPs-BE is expected to generalize better than SGGPs and SOGPs. Table 4.3 shows that the number of basis functions of GPs-BE approaches that of RVM and AS, and is significantly smaller than that of GPs, GPs-U and SVMs. Table 4.4 shows that the runtime of GPs-BE approaches that of GPs, GPs-U and AS, and is significantly smaller than that of RVM and SVMs.

An alternative is to select the basis functions by the forward selection proposed in [12-13]. Table 4.5 compares our method with forward selection under the same stop criterion.

Table 4.5. Comparison of backward elimination and forward selection

Problem          Backward Elimination  Forward Selection
FRI
BOH
ABA
COA
Normalized Mean

Table 4.5 shows that backward elimination outperforms forward selection in both performance and number of basis functions under the same stop criterion. In summary, GPs-BE greatly reduces the number of basis functions of GPs almost without sacrificing performance or increasing the runtime. Moreover, GPs-BE is better than GPs-U in performance, which further indicates that GPs-BE performs better than SGGPs and SOGPs. GPs-BE is better than SVMs in all three aspects. GPs-BE is also better than RVM and AS in performance, with a similar number of basis functions and runtime. Finally, backward elimination outperforms forward selection under the same stop criterion.

5 Conclusion

This paper presents a backward elimination algorithm to select the basis functions for GPs.
By integrating rank-1 updates, we can implement GPs-BE at a reasonable cost. The results show that GPs-BE greatly reduces the number of basis functions of GPs
almost without sacrificing performance or increasing the runtime. Comparisons with forward selection show that GPs-BE obtains better performance and fewer basis functions under the same stop criterion.

This research is supported by the National Natural Science Foundation of China under grants … and …, and by National 973 Project grant 2001CB…

References

1. Williams, C. K. I., Rasmussen, C. E.: Gaussian Processes for Regression. Advances in Neural Information Processing Systems 8 (1996)
2. Rasmussen, C. E.: Evaluation of Gaussian Processes and Other Methods for Non-linear Regression. Ph.D. thesis, Dept. of Computer Science, University of Toronto (1996)
3. Vapnik, V.: The Nature of Statistical Learning Theory. Springer-Verlag, New York (1995)
4. Tipping, M. E.: Sparse Bayesian Learning and the Relevance Vector Machine. Journal of Machine Learning Research 1 (2001)
5. Figueiredo, M. A. T.: Adaptive Sparseness for Supervised Learning. IEEE Trans. Pattern Analysis and Machine Intelligence 25 (2003)
6. Smola, A. J., Bartlett, P. L.: Sparse Greedy Gaussian Process Regression. Advances in Neural Information Processing Systems 13 (2000)
7. Csató, L., Opper, M.: Sparse On-line Gaussian Processes. Neural Computation 14 (2002)
8. Berger, J. O.: Statistical Decision Theory and Bayesian Analysis. 2nd edn. Springer (1985)
9. Williams, C. K. I.: Prediction with Gaussian Processes: from Linear Regression to Linear Prediction and Beyond. Learning and Inference in Graphical Models (1998)
10. Friedman, J. H.: Multivariate Adaptive Regression Splines. Annals of Statistics 19 (1991)
11. Blake, C. L., Merz, C. J.: UCI Repository of Machine Learning Databases. Technical Report, University of California, Department of Information and Computer Science, Irvine, CA (1998)
12. Chen, S., Cowan, C. F. N., Grant, P. M.: Orthogonal Least Squares Learning Algorithm for Radial Basis Function Networks. IEEE Trans. Neural Networks 2 (1991)
13. Bo, L. F., Wang, L., Jiao, L. C.: Sparse Bayesian Learning Based on an Efficient Subset Selection. Lecture Notes in Computer Science 3173 (2004)
More informationMAXIMUM A POSTERIORI TRANSDUCTION
MAXIMUM A POSTERIORI TRANSDUCTION LI-WEI WANG, JU-FU FENG School of Mathematcal Scences, Peng Unversty, Bejng, 0087, Chna Center for Informaton Scences, Peng Unversty, Bejng, 0087, Chna E-MIAL: {wanglw,
More information1 Motivation and Introduction
Instructor: Dr. Volkan Cevher EXPECTATION PROPAGATION September 30, 2008 Rce Unversty STAT 63 / ELEC 633: Graphcal Models Scrbes: Ahmad Beram Andrew Waters Matthew Nokleby Index terms: Approxmate nference,
More informationEEE 241: Linear Systems
EEE : Lnear Systems Summary #: Backpropagaton BACKPROPAGATION The perceptron rule as well as the Wdrow Hoff learnng were desgned to tran sngle layer networks. They suffer from the same dsadvantage: they
More informationImprove Multi-Instance Neural Networks through Feature Selection
Improve Mult-Instance Neural Networks through Feature Selecton Mn-Lng Zhang and Zh-Hua Zhou* State Key Laboratory for Novel Software Technology, Nanjng Unversty, Nanjng 210093, Chna Abstract. Mult-nstance
More informationMarkov Chain Monte Carlo (MCMC), Gibbs Sampling, Metropolis Algorithms, and Simulated Annealing Bioinformatics Course Supplement
Markov Chan Monte Carlo MCMC, Gbbs Samplng, Metropols Algorthms, and Smulated Annealng 2001 Bonformatcs Course Supplement SNU Bontellgence Lab http://bsnuackr/ Outlne! Markov Chan Monte Carlo MCMC! Metropols-Hastngs
More informationRegression Using Support Vector Machines: Basic Foundations
Regresson Usng Support Vector Machnes: Basc Foundatons Techncal Report December 004 Aly Farag and Refaat M Mohamed Computer Vson and Image Processng Laboratory Electrcal and Computer Engneerng Department
More informationEvaluation of simple performance measures for tuning SVM hyperparameters
Evaluaton of smple performance measures for tunng SVM hyperparameters Kabo Duan, S Sathya Keerth, Aun Neow Poo Department of Mechancal Engneerng, Natonal Unversty of Sngapore, 0 Kent Rdge Crescent, 960,
More informationComposite Hypotheses testing
Composte ypotheses testng In many hypothess testng problems there are many possble dstrbutons that can occur under each of the hypotheses. The output of the source s a set of parameters (ponts n a parameter
More informationResearch Article Green s Theorem for Sign Data
Internatonal Scholarly Research Network ISRN Appled Mathematcs Volume 2012, Artcle ID 539359, 10 pages do:10.5402/2012/539359 Research Artcle Green s Theorem for Sgn Data Lous M. Houston The Unversty of
More informationUsing T.O.M to Estimate Parameter of distributions that have not Single Exponential Family
IOSR Journal of Mathematcs IOSR-JM) ISSN: 2278-5728. Volume 3, Issue 3 Sep-Oct. 202), PP 44-48 www.osrjournals.org Usng T.O.M to Estmate Parameter of dstrbutons that have not Sngle Exponental Famly Jubran
More informationNumber of cases Number of factors Number of covariates Number of levels of factor i. Value of the dependent variable for case k
ANOVA Model and Matrx Computatons Notaton The followng notaton s used throughout ths chapter unless otherwse stated: N F CN Y Z j w W Number of cases Number of factors Number of covarates Number of levels
More informationA New Refinement of Jacobi Method for Solution of Linear System Equations AX=b
Int J Contemp Math Scences, Vol 3, 28, no 17, 819-827 A New Refnement of Jacob Method for Soluton of Lnear System Equatons AX=b F Naem Dafchah Department of Mathematcs, Faculty of Scences Unversty of Gulan,
More informationLINEAR REGRESSION ANALYSIS. MODULE VIII Lecture Indicator Variables
LINEAR REGRESSION ANALYSIS MODULE VIII Lecture - 7 Indcator Varables Dr. Shalabh Department of Maematcs and Statstcs Indan Insttute of Technology Kanpur Indcator varables versus quanttatve explanatory
More informationChapter 15 Student Lecture Notes 15-1
Chapter 15 Student Lecture Notes 15-1 Basc Busness Statstcs (9 th Edton) Chapter 15 Multple Regresson Model Buldng 004 Prentce-Hall, Inc. Chap 15-1 Chapter Topcs The Quadratc Regresson Model Usng Transformatons
More informationComputation of Higher Order Moments from Two Multinomial Overdispersion Likelihood Models
Computaton of Hgher Order Moments from Two Multnomal Overdsperson Lkelhood Models BY J. T. NEWCOMER, N. K. NEERCHAL Department of Mathematcs and Statstcs, Unversty of Maryland, Baltmore County, Baltmore,
More informationStatistical Foundations of Pattern Recognition
Statstcal Foundatons of Pattern Recognton Learnng Objectves Bayes Theorem Decson-mang Confdence factors Dscrmnants The connecton to neural nets Statstcal Foundatons of Pattern Recognton NDE measurement
More informationAdmin NEURAL NETWORKS. Perceptron learning algorithm. Our Nervous System 10/25/16. Assignment 7. Class 11/22. Schedule for the rest of the semester
0/25/6 Admn Assgnment 7 Class /22 Schedule for the rest of the semester NEURAL NETWORKS Davd Kauchak CS58 Fall 206 Perceptron learnng algorthm Our Nervous System repeat untl convergence (or for some #
More informationConjugacy and the Exponential Family
CS281B/Stat241B: Advanced Topcs n Learnng & Decson Makng Conjugacy and the Exponental Famly Lecturer: Mchael I. Jordan Scrbes: Bran Mlch 1 Conjugacy In the prevous lecture, we saw conjugate prors for the
More informationPROPERTIES I. INTRODUCTION. Finite element (FE) models are widely used to predict the dynamic characteristics of aerospace
FINITE ELEMENT MODEL UPDATING USING BAYESIAN FRAMEWORK AND MODAL PROPERTIES Tshldz Marwala 1 and Sbusso Sbs I. INTRODUCTION Fnte element (FE) models are wdely used to predct the dynamc characterstcs of
More informationComputing MLE Bias Empirically
Computng MLE Bas Emprcally Kar Wa Lm Australan atonal Unversty January 3, 27 Abstract Ths note studes the bas arses from the MLE estmate of the rate parameter and the mean parameter of an exponental dstrbuton.
More informationUnified Subspace Analysis for Face Recognition
Unfed Subspace Analyss for Face Recognton Xaogang Wang and Xaoou Tang Department of Informaton Engneerng The Chnese Unversty of Hong Kong Shatn, Hong Kong {xgwang, xtang}@e.cuhk.edu.hk Abstract PCA, LDA
More information8/25/17. Data Modeling. Data Modeling. Data Modeling. Patrice Koehl Department of Biological Sciences National University of Singapore
8/5/17 Data Modelng Patrce Koehl Department of Bologcal Scences atonal Unversty of Sngapore http://www.cs.ucdavs.edu/~koehl/teachng/bl59 koehl@cs.ucdavs.edu Data Modelng Ø Data Modelng: least squares Ø
More information