Adaptive evolutionary Monte Carlo algorithm for optimization with applications to sensor placement problems

Stat Comput (2008) 18

Adaptive evolutionary Monte Carlo algorithm for optimization with applications to sensor placement problems

Yuan Ren · Yu Ding · Faming Liang

Received: 28 April 2008 / Accepted: 13 June 2008 / Published online: 11 July 2008
© Springer Science+Business Media, LLC 2008

Abstract  In this paper, we present an adaptive evolutionary Monte Carlo algorithm (AEMC), which combines a tree-based predictive model with an evolutionary Monte Carlo sampling procedure for the purpose of global optimization. Our development is motivated by sensor placement applications in engineering, which require optimizing a complicated black-box objective function. The proposed method is able to enhance the optimization efficiency and effectiveness as compared to a few alternative strategies. AEMC falls into the category of adaptive Markov chain Monte Carlo (MCMC) algorithms and is the first adaptive MCMC algorithm that simulates multiple Markov chains in parallel. A theorem about the ergodicity property of the AEMC algorithm is stated and proven. We demonstrate the advantages of the proposed method by applying it to a sensor placement problem in a manufacturing process, as well as to a standard Griewank test function.

Keywords  Global optimization · Adaptive MCMC · Evolutionary Monte Carlo · Data mining

Y. Ren · Y. Ding, Department of Industrial and Systems Engineering, Texas A&M University, College Station, TX, USA. Y. Ding e-mail: yuding@iemail.tamu.edu
F. Liang, Department of Statistics, Texas A&M University, College Station, TX, USA. e-mail: fliang@stat.tamu.edu

1 Introduction

Optimization problems arise in many engineering applications. Engineers often need to optimize a black-box objective function, i.e., a function that can only be evaluated by running a computer program. These problems are generally difficult to solve because of the complexity of the objective function and the large number of decision variables involved. Two categories of statistical methodologies, one based on random sampling and the other based on predictive modeling, have made great contributions to solving optimization problems of this nature. In this article, we propose an adaptive evolutionary Monte Carlo (AEMC) method, which enhances the efficiency and effectiveness of solving such engineering optimization problems.

A real example that motivates this research is the sensor placement problem. Simply put, in a sensor placement problem one needs to determine the number and locations of multiple sensors so that certain design criteria are optimized within a given budget. Sensor placement issues arise in various applications, such as manufacturing quality control (Mandroli et al. 2006), structural health monitoring (Bukkapatnam et al. 2005), transportation management (Čivilis et al. 2005), and security surveillance (Brooks et al. 2003). Depending on the application, the design criteria to be optimized include, among others, sensitivity, detection probability, and coverage. A design criterion is a function of the number and locations of sensors, and this function is usually complicated and nonlinear. Evaluating the design criterion requires running a computer program, qualifying it as a black-box objective function. Mathematically, a sensor placement problem can be formulated as a constrained optimization problem,

\min_{w \in \mathcal{W}} H(w) \quad \text{subject to } G(w) \le 0,    (1)

where W ⊂ R^d, d is the number of sensors, w is a vector of decision variables (i.e., sensor locations), H: W → R is a user-specified design criterion to be optimized, and G(·) ≤ 0 represents the physical constraints associated with the engineering system. Taking sensor placement in an assembly process as an example, G(·) ≤ 0 means that sensors can only be installed on the surface of subassemblies, and H(·) is an E-optimality design criterion (Mandroli et al. 2006). In Sect. 4, we will revisit this sensor placement problem in more detail.

When the physical constraints are complicated and difficult to handle in an optimization routine, engineers can discretize the solution space W and create a finite (yet possibly huge) number of solution candidates that satisfy the constraints (see Kim and Ding 2005, Sect. 1 for an example). For sensor placement problems, this means identifying all the viable sensor locations a priori; this can be done relatively easily because individual sensors are located in a low (less than or equal to three) dimensional space. One should use a high enough resolution for the discretization so that good sensor locations are not lost. Suppose we perform such a discretization. Then the formulation (1) becomes an unconstrained optimization problem,

\min_{x \in \mathcal{X}} H(x),    (2)

where X is the sample space containing the finite number of candidate sensor locations. Clearly, X ⊂ Z^d, the set of d-dimensional vectors with integer elements. Note that H(·) in (2) is still calculated according to the same design criterion as in (1) but is now defined on X. Recall that H(·) is of black-box type with potentially plenty of local optima, due to the complex nature of engineering systems.

Solving this discrete optimization problem might seem mathematically trivial, because one just needs to enumerate all potential solutions exhaustively and select the best one. In most real-world applications, however, there is an overwhelmingly large number of potential solutions to be evaluated, especially when a high-resolution discretization has been performed.

Two categories of statistical methodologies exist for solving this type of optimization problem. The first category is the sampling-based methods: they start with a set of random samples, generate new samples according to some pre-specified mechanism based on the current samples, and probabilistically accept or reject the new samples for subsequent iterations. Many well-known optimization methods, such as simulated annealing (Bertsimas and Tsitsiklis 1993), genetic algorithms (Holland 1992), and Markov chain Monte Carlo (MCMC) methods (Wong and Liang 1997), fall into this category; the differences among them come from the specific mechanism an algorithm uses to generate and accept new samples. These methods can handle complicated response surfaces well and have been widely applied to engineering optimization. Their shortcoming is that they generally require a large number of function evaluations before reaching a good solution.

The second category is the metamodel-based methods. They also start with a set of solution samples {x}. A metamodel is a predictive model fitted using the historical solution pairs {x, H(x)}. With this predictive model, new solutions are generated based on the model's prediction of where one is more likely to find good solutions. Subsequently, the predictive model is updated as more solutions are collected. The model is labeled a metamodel because H(x) is the computational output of a computer model. The metamodel-based methods originate from the research on computer experiments (Chen et al. 2006; Fang et al. 2006; Sacks et al. 1989; Simpson et al. 1997).
This strategy is also called the data-mining guided method, especially when the predictive model used therein is a classification tree model (Liu and Igusa 2007; Kim and Ding 2005; Schwabacher et al. 2001), since the tree model is a typical data-mining tool. For the metamodel-based or data-mining guided methods, the major shortcoming is their ineffectiveness in handling complicated response surfaces; as a result, they tend to find only local optima.

This paper proposes an optimization algorithm that combines the sampling-based and metamodel-based methods. Specifically, the proposed algorithm combines evolutionary Monte Carlo (EMC) (Liang and Wong 2000, 2001) and a tree-based predictive model. The advantage of such a hybrid is that it incorporates strengths from both EMC sampling and predictive metamodeling: the tree-based predictive model adaptively learns informative rules from past solutions, so that the new solutions generated from these rules are expected to have better objective function values than those generated by blind sampling operations, while the EMC mechanism allows the search to traverse the whole sample space and guides the solutions toward the global optimum. We thus label the proposed algorithm adaptive evolutionary Monte Carlo (AEMC). We will further elaborate the intuition behind AEMC at the beginning of Sect. 3, after we review the two existing methodologies in more detail in Sect. 2.

The remainder of this paper is organized as follows. Section 2 provides details of the two relevant methodologies. Section 3 describes the general idea and implementation details of the AEMC algorithm; we also prove that the AEMC algorithm preserves the ergodicity property of the Markov chain samples. In Sect. 4, we employ AEMC to solve a sensor placement problem, and we provide additional numerical examples to show AEMC's performance in optimization as well as its potential use for sampling. We conclude the paper in Sect. 5.

2 Related work

2.1 Sampling-based methods

Among the sampling-based methods, simulated annealing and genetic algorithms have been used to solve optimization problems for quite some time. They use different techniques to generate new random samples. Simulated annealing works by simulating a sequence of distributions determined by a temperature ladder. It draws samples according to these distributions and probabilistically accepts or rejects them. Geman and Geman (1984) showed that if the temperature decreases sufficiently slowly (at a logarithmic rate), simulated annealing reaches the global optimum of H(x) with probability 1. However, no one can afford such a slow cooling schedule in practice. People generally use a linearly or geometrically decreasing cooling schedule, but when doing so the global optimum is no longer guaranteed.

Genetic algorithms use evolutionary operators such as crossover and mutation to construct new samples. Mimicking natural selection, crossover operators are applied to two parental samples to produce an offspring that inherits characteristics of both parents, while mutation operators are occasionally used to bring variation to the new samples. A genetic algorithm selects new samples according to their fitness (for example, their objective function values can be used as a measure of fitness). Genetic algorithms are known to converge to a good solution rather slowly and lack rigorous theory to support their convergence to the global optimum.

The MCMC methods have also been used to solve optimization problems (Wong and Liang 1997; Liang 2005; Liang et al. 2007; Neal 1996). Even though the typical application of MCMC is to draw samples from complicated probability distributions, the sampling operations can readily be utilized for optimization. Consider a Boltzmann distribution p(x) ∝ exp(−H(x)/τ) for some τ > 0. MCMC methods can be used to generate samples from p(x); as a result, an MCMC method has a higher chance of obtaining samples with lower H(x) values. If we keep generating samples according to p(x), we will eventually find samples close enough to the global minimum of H(x). The MCMC methods perform random walks in the whole sample space and thus may potentially escape from local optima given long enough run time.

Liang and Wong (2000, 2001) proposed a method called evolutionary Monte Carlo (EMC), which incorporates many attractive features of simulated annealing and genetic algorithms into an MCMC framework. It has been shown that EMC is effective both for sampling from high-dimensional distributions and for optimization problems (Liang and Wong 2000, 2001). Because EMC is an MCMC procedure, it guarantees the ergodicity of the Markov chain samples in the long run. Nonetheless, there is still a need and room to further improve the convergence rate of an EMC procedure.

Recently, adaptive proposals have been used to improve the convergence rate of traditional MCMC algorithms. For example, Gilks et al. (1998) and Brockwell and Kadane (2005) proposed to use regenerative Markov chains and update the proposal parameters at regeneration times; Haario et al. (2001) proposed an adaptive Metropolis algorithm which updates the covariance matrix of the proposal distribution using all past samples. Important theoretical advances on the ergodicity of adaptive MCMC methods have been made by Haario et al. (2001), Andrieu and Robert (2002), Atchadé and Rosenthal (2005), and Roberts and Rosenthal (2007).
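To make the sampling-for-optimization idea of this subsection concrete, the following minimal Python sketch runs a random-walk Metropolis chain targeting the Boltzmann distribution p(x) ∝ exp(−H(x)/τ) on an integer grid and keeps the best state visited; the toy objective and the ±1 neighborhood move are illustrative assumptions, not part of the paper (whose implementations were in MATLAB).

import numpy as np

def metropolis_minimize(H, x0, tau=1.0, n_iter=10_000, lower=0, upper=99, seed=None):
    """Random-walk Metropolis targeting p(x) ~ exp(-H(x)/tau) on an integer grid.

    Low-H(x) states receive exponentially more probability mass, so the chain
    tends to visit good solutions; the best state seen is returned.
    """
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=int)
    hx = H(x)
    best_x, best_h = x.copy(), hx
    for _ in range(n_iter):
        # Propose a move of one coordinate by +/-1 (illustrative neighborhood).
        y = x.copy()
        i = rng.integers(len(x))
        y[i] = np.clip(y[i] + rng.choice([-1, 1]), lower, upper)
        hy = H(y)
        # Metropolis acceptance for the Boltzmann target.
        if np.log(rng.random()) < -(hy - hx) / tau:
            x, hx = y, hy
            if hx < best_h:
                best_x, best_h = x.copy(), hx
    return best_x, best_h

if __name__ == "__main__":
    # Toy black-box objective (an assumption, for illustration only).
    H = lambda x: float(np.sum((x - 37) ** 2) + 10 * np.sin(x).sum())
    print(metropolis_minimize(H, x0=[50, 50, 50], tau=5.0))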
2.2 Metamodel-based methods

In essence, metamodel-based methods are not much different from other sequential sampling procedures guided by a predictive model, e.g., the response surface methodology used for physical experiments (Box and Wilson 1951). The metamodel-based methods grew out of the design and analysis of computer experiments (Chen et al. 2006; Fang et al. 2006; Sacks et al. 1989; Simpson et al. 1997), where the so-called metamodel is an inexpensive surrogate or substitute for a computer model that is oftentimes computationally expensive to run (e.g., the computer model could be a finite element model of a civil structure). Various statistical predictive methods have been used as metamodels; according to the survey by Chen et al. (2006), these include neural networks, tree-based methods, splines, and spatial correlation models.

During the past few years, a number of research developments have emerged under the label of data-mining guided engineering design (Guikema et al. 2004; Huyet 2006; Liu and Igusa 2007; Kim and Ding 2005; Michalski 2000; Schwabacher et al. 2001). The data-mining guided methods are basically one form of metamodel-based methods because they also use a statistical predictive model to guide the selection of design solutions. The predictive models used in data-mining guided designs include regression, classification trees, and clustering methods. When looking for an optimal solution, the predictive model is used as follows: after fitting a metamodel (or simply a model, in the case of physical experiments), one uses it to predict where good solutions are more likely to be found and selects subsequent samples accordingly. This sampling-modeling-prediction procedure is considered a data-mining operation. Liu and Igusa (2007) and Kim and Ding (2005) demonstrated that the data-mining operation can greatly speed up computation under the right circumstances. Compared with the slowly converging sampling-based methods, the metamodel-based methods can be especially useful when one has a limited number of data samples; this happens when physical experiments or computer simulations are expensive to conduct. But the metamodel-based methods are greedy search methods and can easily be entrapped in local optima.

3 Adaptive evolutionary Monte Carlo

3.1 General idea of AEMC

The strengths as well as the limitations of the sampling-based and metamodel-based search methods motivate us to combine the two schemes and develop the AEMC algorithm. The intuition behind how AEMC works is as follows.

A critical shortcoming of the metamodel-based methods is that their effectiveness depends highly on how representative the sampled solutions are of the optimal regions of the sample space. Without representative data, the resulting metamodel can mislead the search to non-optimal regions, and the subsequent sampling from those regions will not help the search get out of the trap. In particular, when the sample space is large and good solutions lie only in a small portion of it, data obtained by uniform sampling from the sample space will not be representative enough. Taking the sensor placement problem shown later in this paper as an example, we found that only 5% of the solutions have relatively good objective function values. Under this circumstance a stand-alone metamodeling mechanism can hardly be effective (as shown in the numerical results in Sect. 4), which motivates improving the sample quality for the purpose of establishing a better metamodel. It turns out that sampling-based algorithms (we choose EMC in this paper), though slow as stand-alone optimization tools, are able to improve the quality of the sampled solutions. This is because, when conducting random searches over a sample space, EMC gradually converges in distribution to the Boltzmann distribution in (4), i.e., the smaller the value of H(x), the higher the probability of sampling x (recall that we want to minimize H(x)). In other words, EMC iteratively and stochastically directs the current samples toward the optimal regions, so that the visited solutions become more representative of the optimal regions of the sample space. With the representative samples produced by EMC, a metamodeling operation can generate more accurate predictive models to characterize the promising subregions of the sample space.

The primary tool for improving a sampling-based search is to speed up its convergence rate. As argued in Sect. 2, making an MCMC method adaptive is an effective way of achieving this objective. The metamodel part of AEMC learns the function surface of H(x) and allows us to construct more effective proposal distributions for subsequent sampling operations. As argued in Gilks et al. (1995), the rate of convergence of a Markov chain to the Boltzmann distribution in (4) depends crucially on the relationship between the proposal function and the target function H(x). To the best of our knowledge, AEMC is also the first adaptive MCMC method that simulates multiple Markov chains in parallel, while the existing adaptive MCMC methods are all based on the simulation of a single Markov chain. AEMC can therefore utilize information from multiple chains to improve the convergence rate.

The above discussion explains the benefit of combining the metamodel-based and sampling-based methods and executing them alternately in the fashion shown in Fig. 1. In the sequel, we present the details of the proposed AEMC algorithm.

For the metamodeling (or data-mining) operations, we use classification and regression trees (CART), proposed by Breiman et al. (1984), to fit the predictive models. We choose CART primarily because of its computational efficiency. Our goal of solving an optimization problem requires the data-mining operations to be fast and computationally scalable in order to accommodate large-sized data sets.
Since the data-mining operations are used repeatedly, a complicated and computationally expensive method would unavoidably slow down the optimization process.

Fig. 1 General framework of combining sampling-based and metamodel-based methods

3.2 Evolutionary Monte Carlo

For the convenience of reading this paper, we provide a brief summary of EMC in this section and a description of the EMC operators in Appendix A. Please refer to Liang and Wong (2000, 2001) for more details.

EMC integrates features of simulated annealing and genetic algorithms into an MCMC framework. Similar to simulated annealing, EMC uses a temperature ladder and simultaneously simulates a population of Markov chains, each of which is associated with a different temperature. The chains with high temperatures can easily escape from local optima, while the chains with low temperatures search around local regions and find better solutions faster. The population is updated by crossover and mutation operators, just as in a genetic algorithm, and thereby acquires some level of learning capability, i.e., samples with better fitness have a greater probability of being selected and of passing their good genetic material to the offspring.

A population, as mentioned above, is a set of n solution samples. The state space associated with a population is the product of n sample spaces, namely X^n = X × ··· × X. Denote a population x ∈ X^n by x = {x_1, ..., x_n}, where x_i = {x_{i1}, ..., x_{id}} ∈ X is the i-th d-dimensional solution sample. EMC attaches a different temperature t_i to each sample x_i, and the temperatures form a ladder with the ordering t_1 ≥ ··· ≥ t_n. We denote t = {t_1, ..., t_n}. The Boltzmann density of a sample x_i is then defined as

f_i(x_i) = \frac{1}{Z(t_i)} \exp\{-H(x_i)/t_i\},    (3)

where Z(t_i) is the normalizing constant, Z(t_i) = \sum_{\{x_i\}} \exp\{-H(x_i)/t_i\}.

Assuming that the samples in a population are mutually independent, the Boltzmann distribution of the population is

f(x) = \prod_{i=1}^{n} f_i(x_i) = \frac{1}{Z(t)} \exp\Big\{-\sum_{i=1}^{n} H(x_i)/t_i\Big\},    (4)

where Z(t) = \prod_{i=1}^{n} Z(t_i).

Given an initial population x^{(0)} = {x_1^{(0)}, ..., x_n^{(0)}} and the temperature ladder t = {t_1, ..., t_n}, the n Markov chains are simulated simultaneously. Denote the iteration index by k = 1, 2, .... The k-th iteration of EMC consists of two steps:

1. With probability p_m (the mutation rate), apply a mutation operator to each sample in the population x^{(k)} independently; with probability 1 − p_m, apply a crossover operator to the population x^{(k)}. Accept the new population according to the Metropolis-Hastings rule. Details are given in Appendices A.1 and A.2.
2. Try to exchange n pairs of samples (x_i^{(k)}, x_j^{(k)}), with i chosen uniformly from {1, ..., n} and j = i ± 1 chosen with probability w(x_j^{(k)} | x_i^{(k)}), as described in Appendix A.3.

EMC is a standard MCMC algorithm and thus maintains the ergodicity property of its Markov chains. Because it incorporates features of simulated annealing and genetic algorithms, EMC constructs proposal distributions more effectively and converges faster than traditional MCMC algorithms.
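A compressed sketch of one EMC iteration, with a mutation move on every chain followed by an exchange between neighboring temperatures, might look as follows. The binary coding, the bit-flip move, and the omission of the crossover operator are simplifying assumptions made only for this illustration.

import numpy as np

def emc_step(pop, temps, H, rng):
    """One simplified EMC iteration: a mutation on every chain, then one exchange.

    pop   : (n, d) integer array, one row per chain
    temps : length-n temperature ladder (one temperature per chain)
    """
    n, d = pop.shape
    energies = np.array([H(x) for x in pop], dtype=float)
    # Mutation: flip one randomly chosen coordinate of each chain (illustrative move).
    for i in range(n):
        y = pop[i].copy()
        j = rng.integers(d)
        y[j] = 1 - y[j]                      # binary coding assumed for simplicity
        hy = H(y)
        if np.log(rng.random()) < -(hy - energies[i]) / temps[i]:
            pop[i], energies[i] = y, hy
    # Exchange: swap the states of two chains with neighboring temperatures,
    # accepted with probability min(1, exp{(H_i - H_j)(1/t_i - 1/t_j)}) as in A.3.
    i = rng.integers(n - 1)
    delta = (energies[i] - energies[i + 1]) * (1 / temps[i] - 1 / temps[i + 1])
    if np.log(rng.random()) < delta:
        pop[[i, i + 1]] = pop[[i + 1, i]]
        energies[[i, i + 1]] = energies[[i + 1, i]]
    return pop, energies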
3.3 The AEMC algorithm

In AEMC, we first run a number of iterations of EMC and then use CART to learn a proposal distribution (for generating new samples) from the samples produced by EMC. Denote by D^{(k)} the set of samples retained after iteration k. From D^{(k)}, we define the high performance samples to be those with relatively small H(x) values; they are the representatives of the promising search regions. Denote by H^{(k)}_{(h)} the h-th percentile of the H(x) values in D^{(k)}. The set of high performance samples at iteration k is then

H^{(k)} = { x : x ∈ D^{(k)} and H(x) ≤ H^{(k)}_{(h)} }.

As a result, the samples in D^{(k)} are grouped into two classes, the high performance samples in H^{(k)} and the others. Treating these samples as a training dataset, we fit a CART model for this two-class classification problem.

Using the prediction from the resulting CART model, we can partition the sample space into rectangular regions, some of which have small H(x) values and are therefore deemed the promising regions, while the remaining regions are non-promising. A promising region produced by CART is represented as a_i^{(k)} ≤ x_{ji} ≤ b_i^{(k)}, i = 1, ..., d, j = 1, ..., n. Since X is discrete and finite, there is a lower bound l_i and an upper bound u_i in the i-th dimension of the sample space; clearly l_i ≤ a_i^{(k)} ≤ b_i^{(k)} ≤ u_i. CART may produce multiple promising regions; we denote their number by m^{(k)}. The collection of the promising regions is then specified as

a_{is}^{(k)} ≤ x_{ji} ≤ b_{is}^{(k)},   i = 1, ..., d,  j = 1, ..., n,  s = 1, ..., m^{(k)}.    (5)

As the algorithm goes on, we continuously update D^{(k)}, and hence a_{is}^{(k)} and b_{is}^{(k)}.
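A minimal sketch of this data-mining step, labeling the best h percent of the retained samples as high performance, fitting a CART model, and reading the rectangular promising regions off the tree's leaves, is shown below. scikit-learn is used purely for illustration (the authors' implementation was in MATLAB), and max_leaf_nodes stands in for the tree-size control discussed later in this section.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def promising_regions(X, H_vals, h=10.0, lower=None, upper=None, max_leaf_nodes=6):
    """Fit a CART model and return the rectangular regions predicted as high performance.

    X       : (N, d) array of visited solutions
    H_vals  : (N,) objective values
    h       : percentile defining the high-performance class
    Returns a list of (a, b) pairs giving per-dimension lower/upper bounds.
    """
    d = X.shape[1]
    lower = np.full(d, X.min()) if lower is None else np.asarray(lower, float)
    upper = np.full(d, X.max()) if upper is None else np.asarray(upper, float)
    y = (H_vals <= np.percentile(H_vals, h)).astype(int)   # 1 = high performance
    tree = DecisionTreeClassifier(max_leaf_nodes=max_leaf_nodes).fit(X, y)
    t = tree.tree_
    regions = []

    def walk(node, a, b):
        if t.children_left[node] == -1:                     # leaf node
            if np.argmax(t.value[node][0]) == 1:            # majority class is "high performance"
                regions.append((a.copy(), b.copy()))
            return
        f, thr = t.feature[node], t.threshold[node]
        b_left = b.copy(); b_left[f] = min(b[f], thr)
        walk(t.children_left[node], a.copy(), b_left)
        a_right = a.copy(); a_right[f] = max(a[f], thr)
        walk(t.children_right[node], a_right, b.copy())

    walk(0, lower.copy(), upper.copy())
    return regions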

After we have identified the promising regions, the proposal density is constructed based on the following idea: draw a sample from the promising regions with probability R, and from elsewhere with probability 1 − R. We recommend using a relatively large R value, say R = 0.9. Since there may be multiple promising regions identified by CART, we denote the proposal density associated with each region by q_{ks}(x), s = 1, ..., m^{(k)}.

In this paper, we use a Metropolis-within-Gibbs procedure (Müller 1991) to generate new samples as follows. For j = 1, ..., n, denote the population after the k-th iteration by x^{(k+1, j-1)} = (x_1^{(k+1)}, ..., x_{j-1}^{(k+1)}, x_j^{(k)}, ..., x_n^{(k)}), of which the first j − 1 samples have been updated and the Metropolis-within-Gibbs procedure is about to generate the j-th new sample. Note that x^{(k+1, 0)} = (x_1^{(k)}, ..., x_n^{(k)}).

1. Set S to be randomly chosen from {1, ..., m^{(k)}}. Generate a sample x'_j from the proposal density q_{kS}(·),

q_{kS}(x'_j) = \prod_{i=1}^{d} \left( r \, \frac{I(a_{iS}^{(k)} \le x'_{ji} \le b_{iS}^{(k)})}{b_{iS}^{(k)} - a_{iS}^{(k)}} + (1-r) \, \frac{I(x'_{ji} < a_{iS}^{(k)} \ \text{or} \ x'_{ji} > b_{iS}^{(k)})}{(u_i - l_i) - (b_{iS}^{(k)} - a_{iS}^{(k)})} \right),    (6)

where I(·) is the indicator function. Here r is the probability of sampling uniformly within the range specified by the CART rules in each dimension. Since the dimensions are independent of each other, we have R = r^d.

2. Construct a new population x^{(k+1, j)} by replacing x_j^{(k)} with x'_j, and accept the new population with probability min(1, r_d), where

r_d = \frac{f(x^{(k+1,j)}) \, T(x^{(k+1,j-1)} \mid x^{(k+1,j)})}{f(x^{(k+1,j-1)}) \, T(x^{(k+1,j)} \mid x^{(k+1,j-1)})} = \exp\{-(H(x'_j) - H(x_j^{(k)}))/t_j\} \, \frac{T(x^{(k+1,j-1)} \mid x^{(k+1,j)})}{T(x^{(k+1,j)} \mid x^{(k+1,j-1)})}.    (7)

If the proposal is rejected, x^{(k+1, j)} = x^{(k+1, j-1)}.

The transition probability in (7) is calculated as follows. Since we change only one sample in each Metropolis-within-Gibbs step, the transition probability can be written as T(x_j^{(k)} → x'_j | x^{(k)}_{[-j]}), where x^{(k)}_{[-j]} = (x_1^{(k+1)}, ..., x_{j-1}^{(k+1)}, x_{j+1}^{(k)}, ..., x_n^{(k)}). Then we have

T(x_j^{(k)} → x'_j \mid x^{(k)}_{[-j]}) = \frac{1}{m^{(k)}} \sum_{s=1}^{m^{(k)}} q_{ks}(x'_j).

From (6), it is not difficult to see that as long as 0 < r < 1 the proposal is global, i.e., q_{ks}(x) > 0 for all x ∈ X. Since X is finite, it is natural to assume that f(x) is bounded away from 0 and ∞ on X. Thus the minorisation condition (Mengersen and Tweedie 1996), i.e.,

ω = \sup_{x \in \mathcal{X}} \frac{f(x)}{q_{ks}(x)} < ∞,

is satisfied. As shown in Appendix C, satisfaction of this condition leads to the ergodicity of the AEMC algorithm.

Now we are ready to present a summary of the AEMC algorithm, which consists of two modes: the EMC mode and the data-mining (or metamodeling) mode.

1. Set k = 0. Start with an initial population x^{(0)}, obtained by uniformly sampling n samples over X, and a temperature ladder t = {t_1, ..., t_n}.
2. EMC mode: run EMC until a switching condition is met. Apply mutation, crossover, and exchange operators to the population x^{(k)} and accept the updated population according to the Metropolis-Hastings rule. Set k = k + 1.
3. Data-mining mode: run until a switching condition is met. With probability P_k, use the CART method to update the promising regions, i.e., update the values of a_{is}^{(k+1)} and b_{is}^{(k+1)} in (5); with probability 1 − P_k, do not apply CART and simply let a_{is}^{(k+1)} = a_{is}^{(k)} and b_{is}^{(k+1)} = b_{is}^{(k)}. Generate n new samples following the Metropolis-within-Gibbs procedure described earlier in this section. Set k = k + 1.
4. Alternate between the two modes until a stopping rule is met. The algorithm may terminate when the computational budget (the number of iterations) is consumed or when the change in the best H(x) value does not exceed a given threshold for several iterations.
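Putting eqs. (6) and (7) together, a minimal sketch of one data-mining-mode update (step 3 above) might look as follows. The continuous uniform draws inside and outside the chosen rectangle and the helper names are illustrative assumptions, not the authors' MATLAB implementation; the sketch also assumes each region lies strictly inside the bounds so the densities are well defined.

import numpy as np

def sample_from_region(a, b, lower, upper, r, rng):
    """Draw one solution from q_kS of eq. (6): each coordinate falls inside
    [a_i, b_i] with probability r and in the complement with probability 1 - r."""
    d = len(a)
    x = np.empty(d)
    for i in range(d):
        if rng.random() < r:
            x[i] = rng.uniform(a[i], b[i])
        else:                                   # uniform over [lower, a) U (b, upper]
            left, right = a[i] - lower[i], upper[i] - b[i]
            u = rng.uniform(0, left + right)
            x[i] = lower[i] + u if u < left else b[i] + (u - left)
    return x

def proposal_density(x, regions, lower, upper, r):
    """Mixture proposal (1/m) * sum_s q_kS(x), the transition density used in eq. (7)."""
    dens = []
    for a, b in regions:
        inside = (x >= a) & (x <= b)
        per_dim = np.where(inside, r / (b - a),
                           (1 - r) / ((upper - lower) - (b - a)))
        dens.append(np.prod(per_dim))
    return np.mean(dens)

def mwg_update(x_old, h_old, H, regions, lower, upper, temp, r, rng):
    """Metropolis-within-Gibbs update of one chain using the CART-guided proposal."""
    s = rng.integers(len(regions))
    x_new = sample_from_region(*regions[s], lower, upper, r, rng)
    h_new = H(x_new)
    log_ratio = -(h_new - h_old) / temp \
        + np.log(proposal_density(x_old, regions, lower, upper, r)) \
        - np.log(proposal_density(x_new, regions, lower, upper, r))
    if np.log(rng.random()) < log_ratio:
        return x_new, h_new
    return x_old, h_old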
To effectively implement AEMC, several issues need to be considered.

First, the choice of the EMC parameters n and p_m. We simply follow the recommendations made in the EMC-related research and set n and p_m to values that favor EMC; typically, n = 5–20 and p_m ≈ 0.25 (Liang and Wong 2000).

Second, the choice of P_k. We need P_1 > P_2 > ··· > P_k > ··· and lim_{k→∞} P_k = 0, which ensures the diminishing adaptation condition required for the ergodicity of adaptive MCMC algorithms (Roberts and Rosenthal 2007). As discussed in Appendix C, meeting the diminishing adaptation condition is crucial to the convergence of AEMC. Intuitively, P_k can be considered a learning rate. In the beginning of the algorithm, we do not have much information about the function surface, and therefore we apply the data-mining mode to learn new information with a relatively high probability.

As the algorithm goes on, we accumulate sufficient knowledge about the function surface, and it becomes wasteful to execute the data-mining mode too often. So we make the learning rate decrease over time. Specifically, we set P_k = 1/k^δ, where δ (δ > 0) controls the decreasing speed of P_k: the larger δ is, the faster P_k decreases to 0. We choose δ = 0.1 in this paper.

Third, the construction of the training set D^{(k)}. The question is whether we should use all past samples to construct D^{(k)} or only a subset, for example only the recent samples gathered since the last data-mining operation. If we use all past samples, data mining is performed on a large dataset, which inevitably takes a long time and thus slows down the optimization process. Because EMC is able to sample randomly from the whole sample space, AEMC is unlikely to fall into local optima even if we use only recent samples. Thus the latter is our choice.

Lastly, we discuss the following tuning parameters of AEMC.

Switching condition M. In order to adopt the strengths of the two mechanisms and compensate for their weaknesses, a proper switching condition is needed to select the appropriate mode of operation for the proposed optimization procedure. Because a stand-alone data-mining mode is unable to find representative samples (as will be shown in Sect. 4), it is not beneficial to run the data-mining mode for multiple consecutive iterations. A natural choice is therefore to run the data-mining mode once for every M iterations of EMC. If M is too large, the learning effect from the data-mining mode is overshadowed and AEMC virtually becomes a pure EMC algorithm. If M is too small, EMC may not be able to gather enough representative data, and data mining can hardly be effective. We recommend choosing M based on the value of nM, which is the sample size used in the data-mining mode for establishing the predictive model; from our experience, a moderate value of nM works quite well.

Percentile value h. The proper choice of h varies across problems. For a minimization problem, a large h value brings many uninteresting samples into the set of supposedly high performance solutions and slows down the optimization process. On the other hand, a small value of h increases the chance that the algorithm falls into a local optimum; besides, a very small h value may lead to small promising regions, which could make the acceptance rate of new samples too low. But the danger of falling into local optima is not grave, because the data-mining mode is followed by EMC, which randomizes the population again and makes it possible to escape from local optima. In light of this, we recommend an aggressive choice of h, i.e., h = 5–15%.

Choice of the tree size in CART. Fitting a tree approximates the response surface of H(x). A small tree may not be sophisticated enough to approximate the surface, while a large tree may overfit the data. Since we apply CART multiple times in the entire AEMC procedure, the mechanism of how the trees work in AEMC is actually similar to tree boosting (Hastie et al. 2001), where a series of CART models are put together to produce a result. For tree boosting, Hastie et al. (2001) recommended a tree size of 2 ≤ J ≤ 10. In our problem, however, controlling J alone does not precisely fulfill our objective. Because our goal is to find the global optimum rather than to predict the whole response surface well, we are much more interested in the part of the response surface where the H(x) values are relatively small, corresponding to the class of high performance samples H^{(k)}.
It then makes sense to control the number of terminal nodes associated with H^{(k)}, denoted by J_H. Controlling J_H enables us to fit a CART model that better approximates the high performance portion of the response surface. Note that the set of terminal nodes representing H^{(k)} is a subset of all terminal nodes in the corresponding tree, so the value of J_H is positively correlated with the value of J; controlling J_H therefore also regulates J. The rationale behind the selection of J_H is similar to that for J: a large J_H results in a large J and could lead to overfitting, while too small a J_H leaves too few promising regions for the subsequent actions to search and evolve from, which may cause the procedure to miss some good samples. From the above arguments, we note that H^{(k)} in our problem plays a role analogous to the whole response surface in traditional tree boosting. We believe that the guideline for J can be transferred to J_H, i.e., 2 ≤ J_H ≤ 10.

In Sect. 4, we shall provide a sensitivity analysis of the three tuning parameters, which reveals how the performance of AEMC depends on their choices. From the results, we will see that M and h are the two most important tuning parameters and that AEMC is not sensitive to the value of J_H, provided that it is chosen from the above-recommended range.

As an adaptive MCMC algorithm, AEMC simulates multiple Markov chains in parallel and can therefore utilize the information from different chains to improve the convergence rate. We provide empirical results in support of this claim in Sect. 4, because a theoretical assessment of the convergence rate is too difficult to obtain. But we do investigate the ergodicity of AEMC. We are able to show that AEMC is ergodic as long as the proposal q_{ks}(·) is global and the data-mining mode is run with probability P_k → 0 as k → ∞. The ergodicity implies that AEMC will reach the global optimum of H(x) given enough run time.

This property of the AEMC algorithm is stated in Theorem 1, and its proof is included in Appendix C.

Theorem 1  If 0 < r < 1, lim_{k→∞} P_k = 0, and X is compact, then AEMC is ergodic, i.e., the samples x^{(k)} converge in distribution to f(x).

A final note is that we assume the sample space X to be discrete and bounded in this paper. Yet the AEMC algorithm can easily be extended to an optimization problem with a continuous and bounded sample space: one just needs to use the mutation and crossover operators proposed in Liang and Wong (2001), and Theorem 1 still holds. For unbounded sample spaces, some constraints can be put on the tails of the distribution f(x) and the proposal distribution to ensure that the minorisation condition holds. Refer to Roberts and Tweedie (1996), Rosenthal (1995), and Roberts and Rosenthal (2004) for more discussion of this issue.

4 Numerical results

To illustrate the effectiveness of the AEMC algorithm, we use it to solve three problems: the first two are optimization problems and the third is a sampling problem. The first example is a sensor placement problem in an assembly process; in the second example we optimize the Griewank function (Griewank 1981), a widely used test function in the area of global optimization; in the third example we use AEMC to sample from a mixture Gaussian distribution and examine how AEMC, as an adaptive MCMC method, can help the sampling process.

For the two optimization examples, we compare AEMC with EMC, the stand-alone CART guided method, and the standard genetic algorithm. As to the parameters in AEMC, we chose n = 5, p_m = 0.25, M = 60, J_H = 6, and h = 10%. For the standard genetic algorithm, we let the population size be 100, the crossover rate be 0.9, and the mutation rate be 0.01. All optimization algorithms were implemented in the MATLAB environment, and all reported performance statistics are averages over 10 trials. The performance indices for comparison include the best function value found by an algorithm and the number of times H(·) has been evaluated (also called the number of function evaluations hereinafter). Using the number of function evaluations as a performance measure makes good sense for many engineering design problems, where the objective function H(·) is complex and time consuming to evaluate, so that the time spent on function evaluations essentially dominates the entire computational cost. In Sect. 4.3, we will also present a sensitivity analysis of the three tuning parameters: the switching condition M, the percentile value h, and the tree size J_H.

4.1 Sensor placement example

In this section, we attempt to find an optimal sensor placement strategy in a three-station two-dimensional (2-D) assembly process (Fig. 2). Coordinate sensors are distributed throughout the assembly process to monitor the dimensional quality of the final assembly and/or of the intermediate subassemblies. M_1–M_5 are five coordinate sensors that are currently in place on the three stations; this is simply one instance out of hundreds of thousands of other possible sensor placements. The goal of having these coordinate sensors is to estimate the dimensional deviations at the fixture locators on the different stations, labeled P_i, i = 1, ..., 8, in Fig. 2. Researchers have established physical models connecting the sensor measurements to the deviations associated with the fixture locators (Jin and Shi 1999; Mandroli et al. 2006). Such a relationship can be expressed as a linear model, mathematically equivalent to a linear regression model.
Thus, the design of sensor placement becomes very similar to the optimal design problem in experimentation, and the problem is to decide where the sensors should be placed on the respective assembly stations so that the estimation of the parameters (i.e., the fixturing deviations) is achieved with the minimum variance.

Fig. 2 Illustration: a multi-station assembly process. The process proceeds as follows: (i) at station I, part 1 and part 2 are assembled; (ii) at station II, the subassembly consisting of part 1 and part 2 receives part 3 and part 4; and (iii) at station III, no assembly operation is performed but the final assembly is inspected. The 4-way pins constrain the part motion in both the x- and the z-axes, and the 2-way pins constrain the part motion in the z-axis.
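To make this minimum-variance objective concrete, the short sketch below evaluates an E-optimality value, the smallest eigenvalue of the information matrix of the linear model, for a candidate placement, and negates it so that minimization corresponds to maximizing sensitivity as done in Sect. 4.1. The function that builds the design matrix from sensor locations is a hypothetical placeholder for the station models of Jin and Shi (1999) and Mandroli et al. (2006), which are not reproduced here.

import numpy as np

def e_optimality(design_matrix):
    """E-optimality of a linear model y = Phi * theta + noise: the smallest
    eigenvalue of the information matrix Phi' Phi.  Maximizing it minimizes
    the worst-case variance of the parameter estimates."""
    info = design_matrix.T @ design_matrix
    return np.linalg.eigvalsh(info).min()

def H(sensor_locations, build_design_matrix):
    """Objective used by AEMC: negated sensitivity, so that minimizing H
    maximizes the E-optimality criterion.  `build_design_matrix` is a
    hypothetical stand-in for the multi-station assembly model that maps
    fixture deviations to sensor readings for a given placement."""
    return -e_optimality(build_design_matrix(sensor_locations))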

Similar to optimal experimental designs, one chooses to optimize an alphabetic optimality criterion (such as D-optimality or E-optimality) of an information matrix that is determined by the corresponding sensor placement. In this paper, we use the E-optimality design criterion as a measure of the sensor system sensitivity, the same as in Liu et al. (2005); the AEMC algorithm is certainly applicable to other design criteria as well. Due to the complexity of this sensor placement problem, one needs to run a set of MATLAB codes to calculate the sensitivity response for a given placement of sensors. For more details of the physical process and its modeling, please refer to Mandroli et al. (2006). We want to maximize the sensitivity, which is equivalent to minimizing the maximum variance of the parameter estimates.

To facilitate the application of the AEMC algorithm, we discretize the geometric area of each part viable for sensor placement using a resolution of 10 mm (which is the size of a locator's diameter); this treatment is the same as in Kim and Ding (2005) and Liu et al. (2005). The discretization procedure also ensures that all constraints are incorporated into the candidate samples, so that we can solve the unconstrained optimization (2) for sensor placement. The discretization results in N_c candidate locations for any single sensor, so the sample space for (2) is X = [1, N_c]^d ∩ Z^d. For the assembly process shown in Fig. 2, the 10-mm resolution level gives the numbers of candidate sensor locations on the four parts as n_1 = 6,650, n_2 = 7,480, n_3 = 2,600, and n_4 = 2,600. Because parts 1 and 2 appear on all three stations while parts 3 and 4 appear on the second and third stations, there are in total N_c = 3 × (n_1 + n_2) + 2 × (n_3 + n_4) = 52,790 candidate locations for each sensor. Suppose that d = 9, meaning that nine sensors are to be installed; the total number of solution candidates is then C(52,790, 9), where C(a, b) is the combinatorial operator. Evidently, the number of solution candidates is overwhelmingly large. Moreover, we want to maximize the sensitivity objective function (i.e., the more sensitive a sensor system, the better), while AEMC solves a minimization problem. For this reason, we let H(x) in the AEMC algorithm equal the sensitivity response of x multiplied by −1, where x represents an instance of sensor placement.

We solve the optimal sensor placement problem for nine sensors and for 20 sensors, respectively. For the scenario of d = 9, each algorithm was run for 10^5 function evaluations. The results of the various methods are presented in Fig. 3 and demonstrate a clear advantage of AEMC over the other algorithms. EMC and the genetic algorithm have similar performances. After roughly 4 × 10^4 function evaluations, AEMC reaches an H(x) value of about −1.20, which is the best value found by EMC and the genetic algorithm only after 10^5 function evaluations; this translates into a 2.5-fold improvement in terms of CPU time.

Fig. 3 Performances of the various algorithms for nine sensors

Fig. 4 Uncertainty of different algorithms for nine sensors

Figure 4 gives a boxplot of the best sensitivity values found at the end of each algorithm. AEMC finds a sensitivity value that is, on average, 10% better than EMC and the genetic algorithm, and AEMC also has smaller uncertainty than EMC. From the two figures, it is worth noting that the stand-alone CART guided method performs much worse than the other algorithms in this example.
We believe this happens mainly because the stand-alone CART guided method fails to gather representative data in the sample space associated with the problem. Figure 5 presents the best (i.e., yielding the largest sensitivity) sensor placement strategy found in this example.

We also test the AEMC method in a higher-dimensional case, i.e., d = 20. All the algorithms were again run for 10^5 function evaluations. The algorithm performance curves are presented in Fig. 6, where we observe that the sensitivity value found by AEMC after roughly one third of the 10^5 function evaluations is the same as that found by EMC after the full 10^5 function evaluations. This translates into a 3-fold improvement in terms of CPU time.

Fig. 5 Best sensor placement for nine sensors

Fig. 6 Performances of the various algorithms for 20 sensors

Fig. 7 Uncertainty of different algorithms for 20 sensors

Fig. 8 Best sensor placement for 20 sensors

Interestingly, this improvement is greater than that in the 9-sensor case. The final sensitivity value attained by AEMC is, on average, 7% better than EMC and the genetic algorithm. Again, the stand-alone CART guided method fails to compete with the other algorithms; we feel that as the dimensionality of the sample space gets higher, the performance of the stand-alone CART guided method deteriorates relative to the others. Figure 7 shows the uncertainty of each algorithm; in this case AEMC has slightly higher uncertainty than EMC, but the average results of AEMC are still better. Figure 8 presents the best sensor placement strategy found in this example.

4.2 Griewank test function

In order to show the potential applicability of AEMC to other optimization problems, we test it on a well-known test function. The Griewank function (Griewank 1981) has been used as a test function for global optimization algorithms in a broad body of literature. The function is defined as

H_G(x) = \sum_{i=1}^{d} \frac{x_i^2}{4000} - \prod_{i=1}^{d} \cos\Big(\frac{x_i}{\sqrt{i}}\Big) + 1,   −600 ≤ x_i ≤ 600,  i = 1, ..., d.

The global minimum is located at the origin and the global minimum value is 0. The function has a very large number of local minima, increasing exponentially with d. Here we set d = 50.
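The definition above translates directly into a few lines of Python; the sketch below is merely a restatement of the formula, not the authors' test code.

import numpy as np

def griewank(x):
    """Griewank test function: sum(x_i^2)/4000 - prod(cos(x_i / sqrt(i))) + 1,
    defined for -600 <= x_i <= 600; the global minimum value 0 is at the origin."""
    x = np.asarray(x, dtype=float)
    i = np.arange(1, x.size + 1)
    return np.sum(x ** 2) / 4000.0 - np.prod(np.cos(x / np.sqrt(i))) + 1.0

print(griewank(np.zeros(50)))   # 0.0 at the global minimum, with d = 50 as in Sect. 4.2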

Fig. 9 Performances of the various algorithms for the Griewank function

Figure 9 presents the performances of the various algorithms. All algorithms were run for 10^5 function evaluations. Please note that we have truncated the y-axis to the range 0–400 so as to show the early-stage performances of the different algorithms more clearly. AEMC clearly outperforms the other algorithms, especially in the beginning stage, as one can observe that AEMC converges much faster. As explained earlier, this fast convergence is an appealing property for engineering design problems. In this example, the stand-alone CART guided method converges faster than the genetic algorithm and EMC, but is entrapped in a local optimum at an early stage. AEMC appears to level off after 65,000 function evaluations; however, as assured by Theorem 1, AEMC will eventually reach the global optimum if given enough computational effort.

Fig. 10 Uncertainty of different algorithms for the Griewank function

Figure 10 gives a boxplot of the best H_G(x) values found at the end of the algorithms. Compared to the other methods, the AEMC algorithm not only improves the average performance but also reduces the uncertainty. Although the stand-alone CART guided method has found good solutions, the uncertainty of this algorithm is much higher than that of AEMC and the other two.

4.3 Sensitivity analysis

We run an ANOVA analysis to investigate how sensitive the performance of AEMC is to the tuning parameters: the switching condition M, the percentile value h, and the tree size J_H. The value of M is chosen from five levels (10, 30, 60, 90, 120), h from three levels (1%, 10%, 20%), and J_H from three levels (3, 6, 12). A full factorial design with 45 cases is then constructed. For the sensor placement example, we use the 9-sensor case for the ANOVA analysis. AEMC was run for 10^5 function evaluations, and we recorded the best function value found as the output; for each factor-level combination this was done five times. The ANOVA table is shown in Table 1. We can see that the main effects of M and h are significant at the 0.05 level. Our study also revealed that relatively small M and h are favored.

Table 1 ANOVA analysis for the sensor placement example (sources: M, h, J_H, M×h, M×J_H, h×J_H, Error, Total; columns: Sum sq., D.f., Mean sq., F, Prob > F)

For the Griewank example, we ran AEMC for a fixed budget of function evaluations and recorded the best function value found as the output; for each factor-level combination this was done five times. The ANOVA table is shown in Table 2. At the 0.05 level, the main effects of M and h are again significant, and smaller h and M are favored as well.

Table 2 ANOVA analysis for the Griewank example (sources: M, h, J_H, M×h, M×J_H, h×J_H, Error, Total; columns: Sum sq., D.f., Mean sq., F, Prob > F)

Based on our sensitivity analysis for both examples, we understand that the switching condition M and the percentile value h are the important factors affecting the performance of AEMC. To choose suitable values for M and h, users may follow the general guidelines outlined in Sect. 3.3 and further tune the values for their specific problems.

4.4 Sampling from a mixture Gaussian distribution

AEMC falls into the category of adaptive MCMC methods, and thus can also be used to draw samples from a given target distribution. As shown by Theorem 1, the distribution of those samples will asymptotically converge to the target distribution. We test AEMC on a five-dimensional mixture Gaussian distribution

π(x) = (1/3) N_5(0, I_5) + (2/3) N_5(5, I_5),

where 0 = (0, 0, 0, 0, 0) and 5 = (5, 5, 5, 5, 5). This example is used in Liang and Wong (2001). Since in this paper we assume the sample space to be bounded, we set the sample space to [−10, 10]^5 here. The distance between the two modes is 5√5, which makes it difficult to jump from one mode to the other. We compare the performance of AEMC with the Metropolis algorithm and EMC.

Fig. 11 Convergence rate of different algorithms

Each algorithm was used to obtain 10^5 samples, and all numerical results were averages of 10 runs. The Metropolis algorithm was applied with a uniform proposal distribution U[x − 2, x + 2]^5; its acceptance rate was 0.22, and it could not escape from the mode in which it started.

We then compare AEMC with EMC. We only look at samples of the first dimension, since the dimensions are independent of each other. Since the true histogram of the distribution is known, we can calculate the L_2 distance between the estimated mass vector and the true distribution. Specifically, we divide the interval [−10, 10] into 40 intervals (with a resolution of 0.5) and calculate the true and estimated probability mass in each of the intervals. All EMC-related parameters are set following Liang and Wong (2001). In AEMC, we set h = 25% so that samples from both modes can be obtained; if h is too small, AEMC will focus only on the peaks of the function, and thus only samples around the mode at 5 can be obtained (because the probability of sampling around the mode at 5 is twice as large as around the mode at 0). In EMC, we employ the mutation and crossover operators used in Liang and Wong (2001). The acceptance rates of the mutation and crossover operators were 0.22 and 0.44, respectively; in AEMC, the acceptance rates of the mutation, crossover, and data-mining operators were 0.23, 0.54, and 0.10, respectively. Figure 11 shows the L_2 distance versus the number of samples for the three methods in comparison. AEMC converges faster than EMC and the Metropolis algorithm; its sampling quality is far better than that of the Metropolis algorithm, and it also achieves better sampling quality than EMC.
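For reference, the L_2-distance measure reported in Fig. 11 can be computed as in the following sketch: the first coordinate of the draws is binned on [−10, 10] at resolution 0.5 and compared with the true mass of the marginal mixture (1/3)N(0, 1) + (2/3)N(5, 1) in each bin. The use of scipy's normal CDF is an implementation choice for this illustration, not taken from the paper.

import numpy as np
from scipy.stats import norm

def l2_distance(samples_dim1, lo=-10.0, hi=10.0, width=0.5):
    """L2 distance between the empirical and true probability mass over the bins."""
    edges = np.arange(lo, hi + width, width)            # 40 bins of width 0.5
    est, _ = np.histogram(samples_dim1, bins=edges)
    est = est / len(samples_dim1)
    # True mass of the first marginal, (1/3) N(0,1) + (2/3) N(5,1), in each bin.
    cdf = lambda z: (norm.cdf(z, 0, 1) + 2 * norm.cdf(z, 5, 1)) / 3
    true = cdf(edges[1:]) - cdf(edges[:-1])
    return np.sqrt(np.sum((est - true) ** 2))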
5 Conclusions

In this paper, we have presented the AEMC algorithm for optimization problems with black-box objective functions, which are often encountered in engineering design (e.g., the sensor placement problem in an assembly process). Our experience indicates that hybridizing a predictive model with an evolutionary Monte Carlo method can improve the convergence rate of an optimization procedure. We have also shown that the algorithm maintains ergodicity, implying its convergence to the global optimum. Numerical studies are used to compare the proposed AEMC method with several alternatives; all methods in comparison are used to solve a sensor placement problem and to optimize a Griewank function. In these studies, AEMC outperforms the alternatives and shows a much enhanced convergence rate.

This paper focuses mainly on the application of AEMC to optimization problems. Yet, considering that the AEMC algorithm is an adaptive MCMC method, it should also be useful for sampling from complicated probability distributions. The EMC algorithm has already been shown to be a powerful sampling tool, and we believe that the data-mining component can further improve the convergence rate to the target distribution without destroying the ergodicity of the algorithm. We demonstrated the effectiveness of AEMC for sampling purposes using a mixture Gaussian distribution.

In the current version of AEMC, we use CART as the metamodeling method; consequently the sample space is sliced into rectangles. When the function surface is complex, a rectangular partition may not be sufficient and a more sophisticated partition may be required. However, a good replacement for CART may not be straightforward to find, because any viable candidate must be computationally efficient so as not to slow down the optimization process. Some other data-mining or predictive modeling methods, such as neural networks, may have more learning power, but they are computationally much more expensive and are therefore less likely to be good candidates for the AEMC algorithm.

Acknowledgements  The authors would like to gratefully acknowledge financial support from the NSF under grants CMMI … and CMMI …. The authors also appreciate the editors and the referees for their valuable comments and suggestions.

Appendix

We first describe the crossover, mutation, and exchange operators used in the EMC algorithm in Appendix A. In Appendix B, we give a brief summary of the published results on the convergence of adaptive MCMC algorithms. In Appendix C, we prove the ergodicity of the AEMC algorithm.

Appendix A: Operators in the EMC algorithm

A.1 Crossover

From the current population x^{(k)} = {x_1^{(k)}, ..., x_n^{(k)}}, we first select one parental pair, say x_i^{(k)} and x_j^{(k)} (i ≠ j). The first parental sample is chosen according to a roulette-wheel procedure with Boltzmann weights; the second parental sample is then chosen randomly from the rest of the population. So the probability of selecting the pair (x_i^{(k)}, x_j^{(k)}) is

P((x_i^{(k)}, x_j^{(k)}) \mid x^{(k)}) = \frac{1}{(n-1) G(x^{(k)})} \left[ \exp\{-H(x_i^{(k)})/\tau_s\} + \exp\{-H(x_j^{(k)})/\tau_s\} \right],    (8)

where G(x^{(k)}) = \sum_{i=1}^{n} \exp\{-H(x_i^{(k)})/\tau_s\} and τ_s is the selection temperature. Two offspring are generated by some crossover operator; the offspring with the smaller fitness value is denoted y_i and the other y_j. All the crossover operators used in genetic algorithms, e.g., 1-point crossover, 2-point crossover, and real crossover, can be used here. Then the new population y = {x_1^{(k)}, ..., y_i, ..., y_j, ..., x_n^{(k)}} is accepted with probability min(1, r_c), where

r_c = \frac{f(y) \, T(x^{(k)} \mid y)}{f(x^{(k)}) \, T(y \mid x^{(k)})} = \exp\{-(H(y_i) - H(x_i^{(k)}))/t_i - (H(y_j) - H(x_j^{(k)}))/t_j\} \, \frac{T(x^{(k)} \mid y)}{T(y \mid x^{(k)})},

where T(· | ·) is the transition probability between populations, and T(y | x) = P((x_i, x_j) | x) · P((y_i, y_j) | (x_i, x_j)) for any two populations x and y. If the proposal is accepted, the population x^{(k+1)} = y; otherwise x^{(k+1)} = x^{(k)}. Note that all the crossover operators are symmetric, i.e., P((y_i, y_j) | (x_i, x_j)) = P((x_i, x_j) | (y_i, y_j)). So T(x | y)/T(y | x) = P((y_i, y_j) | y)/P((x_i, x_j) | x), which can be calculated according to (8).

Following the above selection procedure, samples with better H(·) values have a higher probability of being selected, and the offspring generated by these parents are likely to be good as well; in other words, the offspring have learned from the good parents. The crossover operator thus allows us to construct better proposal distributions, and the new samples it generates are more likely to have better objective values H(·).

A.2 Mutation

A sample, say x_i^{(k)}, is randomly selected from the current population x^{(k)} and mutated to a new sample y_i by reversing the values of some randomly chosen bits. The new population y = {x_1^{(k)}, ..., y_i, ..., x_n^{(k)}} is accepted with probability min(1, r_m), where

r_m = \frac{f(y) \, T(x^{(k)} \mid y)}{f(x^{(k)}) \, T(y \mid x^{(k)})} = \exp\{-(H(y_i) - H(x_i^{(k)}))/t_i\} \, \frac{T(x^{(k)} \mid y)}{T(y \mid x^{(k)})}.

The 1-point, 2-point, and uniform mutation operators are all symmetric, and thus T(y | x^{(k)}) = T(x^{(k)} | y). If the proposal is accepted, the population x^{(k+1)} = y; otherwise x^{(k+1)} = x^{(k)}.

A.3 Exchange

Given the current population x^{(k)} and the temperature ladder t, we try to change (x^{(k)}, t) = (x_1^{(k)}, t_1, ..., x_i^{(k)}, t_i, ..., x_j^{(k)}, t_j, ..., x_n^{(k)}, t_n) to (x', t) = (x_1^{(k)}, t_1, ..., x_j^{(k)}, t_i, ..., x_i^{(k)}, t_j, ..., x_n^{(k)}, t_n). The new population x' is accepted with probability min(1, r_e), where

r_e = \frac{f(x') \, T(x^{(k)} \mid x')}{f(x^{(k)}) \, T(x' \mid x^{(k)})} = \exp\left\{ (H(x_i^{(k)}) - H(x_j^{(k)})) \Big( \frac{1}{t_i} - \frac{1}{t_j} \Big) \right\} \frac{T(x^{(k)} \mid x')}{T(x' \mid x^{(k)})}.

If this proposal is accepted, x^{(k+1)} = x'; otherwise x^{(k+1)} = x^{(k)}. Typically, the exchange is performed only on states with neighboring temperature values, i.e., |i − j| = 1. Let p(x_i^{(k)}) be the probability that x_i^{(k)} is chosen to exchange with another state, and w(x_j^{(k)} | x_i^{(k)}) the probability that x_j^{(k)} is chosen to exchange with x_i^{(k)}. So j = i ± 1, with w(x_{i+1}^{(k)} | x_i^{(k)}) = w(x_{i−1}^{(k)} | x_i^{(k)}) = 0.5 for interior i, and w(x_2^{(k)} | x_1^{(k)}) = w(x_{n−1}^{(k)} | x_n^{(k)}) = 1. The transition probability is T(x' | x^{(k)}) = p(x_i^{(k)}) w(x_j^{(k)} | x_i^{(k)}) + p(x_j^{(k)}) w(x_i^{(k)} | x_j^{(k)}), and thus T(x' | x^{(k)}) = T(x^{(k)} | x').
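As a small illustration of the selection step in A.1 (eq. (8)), the fragment below draws the first parent by a Boltzmann-weighted roulette wheel and the second uniformly from the remaining chains; the Python code is illustrative only, with τ_s the selection temperature.

import numpy as np

def select_crossover_pair(energies, tau_s, rng):
    """Pick the parental pair (i, j) as in eq. (8): i by a Boltzmann-weighted
    roulette wheel, j uniformly from the remaining chains."""
    w = np.exp(-np.asarray(energies, dtype=float) / tau_s)
    w /= w.sum()                       # normalization by G(x)
    i = rng.choice(len(energies), p=w)
    j = rng.choice([k for k in range(len(energies)) if k != i])
    return i, j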
Appendix B: Published results on the convergence of adaptive MCMC algorithms

In this section, we briefly review the results presented in Roberts and Rosenthal (2007). Let f(·) be a target probability distribution on a state space X^n, with B_{X^n} = B_X × ··· × B_X the σ-algebra generated by measurable rectangles. Let {K_γ}_{γ ∈ Y} be a collection of Markov chain kernels on X^n, each of which has f(·) as a stationary distribution: (f K_γ)(·) = f(·). Assume K_γ is φ-irreducible and aperiodic; then K_γ is ergodic for f(·), i.e., for all x ∈ X^n, lim_{k→∞} ||K_γ^k(x, ·) − f(·)|| = 0, where ||μ(·) − ν(·)|| = sup_{B ∈ B_{X^n}} |μ(B) − ν(B)| is the usual total variation distance.


Feature Selection: Part 1

Feature Selection: Part 1 CSE 546: Machne Learnng Lecture 5 Feature Selecton: Part 1 Instructor: Sham Kakade 1 Regresson n the hgh dmensonal settng How do we learn when the number of features d s greater than the sample sze n?

More information

2016 Wiley. Study Session 2: Ethical and Professional Standards Application

2016 Wiley. Study Session 2: Ethical and Professional Standards Application 6 Wley Study Sesson : Ethcal and Professonal Standards Applcaton LESSON : CORRECTION ANALYSIS Readng 9: Correlaton and Regresson LOS 9a: Calculate and nterpret a sample covarance and a sample correlaton

More information

A PROBABILITY-DRIVEN SEARCH ALGORITHM FOR SOLVING MULTI-OBJECTIVE OPTIMIZATION PROBLEMS

A PROBABILITY-DRIVEN SEARCH ALGORITHM FOR SOLVING MULTI-OBJECTIVE OPTIMIZATION PROBLEMS HCMC Unversty of Pedagogy Thong Nguyen Huu et al. A PROBABILITY-DRIVEN SEARCH ALGORITHM FOR SOLVING MULTI-OBJECTIVE OPTIMIZATION PROBLEMS Thong Nguyen Huu and Hao Tran Van Department of mathematcs-nformaton,

More information

Economics 130. Lecture 4 Simple Linear Regression Continued

Economics 130. Lecture 4 Simple Linear Regression Continued Economcs 130 Lecture 4 Contnued Readngs for Week 4 Text, Chapter and 3. We contnue wth addressng our second ssue + add n how we evaluate these relatonshps: Where do we get data to do ths analyss? How do

More information

LOW BIAS INTEGRATED PATH ESTIMATORS. James M. Calvin

LOW BIAS INTEGRATED PATH ESTIMATORS. James M. Calvin Proceedngs of the 007 Wnter Smulaton Conference S G Henderson, B Bller, M-H Hseh, J Shortle, J D Tew, and R R Barton, eds LOW BIAS INTEGRATED PATH ESTIMATORS James M Calvn Department of Computer Scence

More information

Comparison of the Population Variance Estimators. of 2-Parameter Exponential Distribution Based on. Multiple Criteria Decision Making Method

Comparison of the Population Variance Estimators. of 2-Parameter Exponential Distribution Based on. Multiple Criteria Decision Making Method Appled Mathematcal Scences, Vol. 7, 0, no. 47, 07-0 HIARI Ltd, www.m-hkar.com Comparson of the Populaton Varance Estmators of -Parameter Exponental Dstrbuton Based on Multple Crtera Decson Makng Method

More information

VQ widely used in coding speech, image, and video

VQ widely used in coding speech, image, and video at Scalar quantzers are specal cases of vector quantzers (VQ): they are constraned to look at one sample at a tme (memoryless) VQ does not have such constrant better RD perfomance expected Source codng

More information

Linear Approximation with Regularization and Moving Least Squares

Linear Approximation with Regularization and Moving Least Squares Lnear Approxmaton wth Regularzaton and Movng Least Squares Igor Grešovn May 007 Revson 4.6 (Revson : March 004). 5 4 3 0.5 3 3.5 4 Contents: Lnear Fttng...4. Weghted Least Squares n Functon Approxmaton...

More information

Design and Optimization of Fuzzy Controller for Inverse Pendulum System Using Genetic Algorithm

Design and Optimization of Fuzzy Controller for Inverse Pendulum System Using Genetic Algorithm Desgn and Optmzaton of Fuzzy Controller for Inverse Pendulum System Usng Genetc Algorthm H. Mehraban A. Ashoor Unversty of Tehran Unversty of Tehran h.mehraban@ece.ut.ac.r a.ashoor@ece.ut.ac.r Abstract:

More information

Maximizing Overlap of Large Primary Sampling Units in Repeated Sampling: A comparison of Ernst s Method with Ohlsson s Method

Maximizing Overlap of Large Primary Sampling Units in Repeated Sampling: A comparison of Ernst s Method with Ohlsson s Method Maxmzng Overlap of Large Prmary Samplng Unts n Repeated Samplng: A comparson of Ernst s Method wth Ohlsson s Method Red Rottach and Padrac Murphy 1 U.S. Census Bureau 4600 Slver Hll Road, Washngton DC

More information

Lecture 10 Support Vector Machines II

Lecture 10 Support Vector Machines II Lecture 10 Support Vector Machnes II 22 February 2016 Taylor B. Arnold Yale Statstcs STAT 365/665 1/28 Notes: Problem 3 s posted and due ths upcomng Frday There was an early bug n the fake-test data; fxed

More information

Errors for Linear Systems

Errors for Linear Systems Errors for Lnear Systems When we solve a lnear system Ax b we often do not know A and b exactly, but have only approxmatons  and ˆb avalable. Then the best thng we can do s to solve ˆx ˆb exactly whch

More information

Uncertainty as the Overlap of Alternate Conditional Distributions

Uncertainty as the Overlap of Alternate Conditional Distributions Uncertanty as the Overlap of Alternate Condtonal Dstrbutons Olena Babak and Clayton V. Deutsch Centre for Computatonal Geostatstcs Department of Cvl & Envronmental Engneerng Unversty of Alberta An mportant

More information

Lecture 12: Classification

Lecture 12: Classification Lecture : Classfcaton g Dscrmnant functons g The optmal Bayes classfer g Quadratc classfers g Eucldean and Mahalanobs metrcs g K Nearest Neghbor Classfers Intellgent Sensor Systems Rcardo Guterrez-Osuna

More information

COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS

COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS Avalable onlne at http://sck.org J. Math. Comput. Sc. 3 (3), No., 6-3 ISSN: 97-537 COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS

More information

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur Analyss of Varance and Desgn of Experment-I MODULE VII LECTURE - 3 ANALYSIS OF COVARIANCE Dr Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur Any scentfc experment s performed

More information

BOOTSTRAP METHOD FOR TESTING OF EQUALITY OF SEVERAL MEANS. M. Krishna Reddy, B. Naveen Kumar and Y. Ramu

BOOTSTRAP METHOD FOR TESTING OF EQUALITY OF SEVERAL MEANS. M. Krishna Reddy, B. Naveen Kumar and Y. Ramu BOOTSTRAP METHOD FOR TESTING OF EQUALITY OF SEVERAL MEANS M. Krshna Reddy, B. Naveen Kumar and Y. Ramu Department of Statstcs, Osmana Unversty, Hyderabad -500 007, Inda. nanbyrozu@gmal.com, ramu0@gmal.com

More information

Composite Hypotheses testing

Composite Hypotheses testing Composte ypotheses testng In many hypothess testng problems there are many possble dstrbutons that can occur under each of the hypotheses. The output of the source s a set of parameters (ponts n a parameter

More information

Gaussian Mixture Models

Gaussian Mixture Models Lab Gaussan Mxture Models Lab Objectve: Understand the formulaton of Gaussan Mxture Models (GMMs) and how to estmate GMM parameters. You ve already seen GMMs as the observaton dstrbuton n certan contnuous

More information

On an Extension of Stochastic Approximation EM Algorithm for Incomplete Data Problems. Vahid Tadayon 1

On an Extension of Stochastic Approximation EM Algorithm for Incomplete Data Problems. Vahid Tadayon 1 On an Extenson of Stochastc Approxmaton EM Algorthm for Incomplete Data Problems Vahd Tadayon Abstract: The Stochastc Approxmaton EM (SAEM algorthm, a varant stochastc approxmaton of EM, s a versatle tool

More information

MMA and GCMMA two methods for nonlinear optimization

MMA and GCMMA two methods for nonlinear optimization MMA and GCMMA two methods for nonlnear optmzaton Krster Svanberg Optmzaton and Systems Theory, KTH, Stockholm, Sweden. krlle@math.kth.se Ths note descrbes the algorthms used n the author s 2007 mplementatons

More information

Module 9. Lecture 6. Duality in Assignment Problems

Module 9. Lecture 6. Duality in Assignment Problems Module 9 1 Lecture 6 Dualty n Assgnment Problems In ths lecture we attempt to answer few other mportant questons posed n earler lecture for (AP) and see how some of them can be explaned through the concept

More information

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction ECONOMICS 5* -- NOTE (Summary) ECON 5* -- NOTE The Multple Classcal Lnear Regresson Model (CLRM): Specfcaton and Assumptons. Introducton CLRM stands for the Classcal Lnear Regresson Model. The CLRM s also

More information

Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore

Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore Sesson Outlne Introducton to classfcaton problems and dscrete choce models. Introducton to Logstcs Regresson. Logstc functon and Logt functon. Maxmum Lkelhood Estmator (MLE) for estmaton of LR parameters.

More information

PHYS 450 Spring semester Lecture 02: Dealing with Experimental Uncertainties. Ron Reifenberger Birck Nanotechnology Center Purdue University

PHYS 450 Spring semester Lecture 02: Dealing with Experimental Uncertainties. Ron Reifenberger Birck Nanotechnology Center Purdue University PHYS 45 Sprng semester 7 Lecture : Dealng wth Expermental Uncertantes Ron Refenberger Brck anotechnology Center Purdue Unversty Lecture Introductory Comments Expermental errors (really expermental uncertantes)

More information

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA 4 Analyss of Varance (ANOVA) 5 ANOVA 51 Introducton ANOVA ANOVA s a way to estmate and test the means of multple populatons We wll start wth one-way ANOVA If the populatons ncluded n the study are selected

More information

Assortment Optimization under MNL

Assortment Optimization under MNL Assortment Optmzaton under MNL Haotan Song Aprl 30, 2017 1 Introducton The assortment optmzaton problem ams to fnd the revenue-maxmzng assortment of products to offer when the prces of products are fxed.

More information

DETERMINATION OF UNCERTAINTY ASSOCIATED WITH QUANTIZATION ERRORS USING THE BAYESIAN APPROACH

DETERMINATION OF UNCERTAINTY ASSOCIATED WITH QUANTIZATION ERRORS USING THE BAYESIAN APPROACH Proceedngs, XVII IMEKO World Congress, June 7, 3, Dubrovn, Croata Proceedngs, XVII IMEKO World Congress, June 7, 3, Dubrovn, Croata TC XVII IMEKO World Congress Metrology n the 3rd Mllennum June 7, 3,

More information

Solving Nonlinear Differential Equations by a Neural Network Method

Solving Nonlinear Differential Equations by a Neural Network Method Solvng Nonlnear Dfferental Equatons by a Neural Network Method Luce P. Aarts and Peter Van der Veer Delft Unversty of Technology, Faculty of Cvlengneerng and Geoscences, Secton of Cvlengneerng Informatcs,

More information

Chapter Newton s Method

Chapter Newton s Method Chapter 9. Newton s Method After readng ths chapter, you should be able to:. Understand how Newton s method s dfferent from the Golden Secton Search method. Understand how Newton s method works 3. Solve

More information

Chapter - 2. Distribution System Power Flow Analysis

Chapter - 2. Distribution System Power Flow Analysis Chapter - 2 Dstrbuton System Power Flow Analyss CHAPTER - 2 Radal Dstrbuton System Load Flow 2.1 Introducton Load flow s an mportant tool [66] for analyzng electrcal power system network performance. Load

More information

Statistics II Final Exam 26/6/18

Statistics II Final Exam 26/6/18 Statstcs II Fnal Exam 26/6/18 Academc Year 2017/18 Solutons Exam duraton: 2 h 30 mn 1. (3 ponts) A town hall s conductng a study to determne the amount of leftover food produced by the restaurants n the

More information

Lecture 4. Instructor: Haipeng Luo

Lecture 4. Instructor: Haipeng Luo Lecture 4 Instructor: Hapeng Luo In the followng lectures, we focus on the expert problem and study more adaptve algorthms. Although Hedge s proven to be worst-case optmal, one may wonder how well t would

More information

A Hybrid Variational Iteration Method for Blasius Equation

A Hybrid Variational Iteration Method for Blasius Equation Avalable at http://pvamu.edu/aam Appl. Appl. Math. ISSN: 1932-9466 Vol. 10, Issue 1 (June 2015), pp. 223-229 Applcatons and Appled Mathematcs: An Internatonal Journal (AAM) A Hybrd Varatonal Iteraton Method

More information

Comparison of Regression Lines

Comparison of Regression Lines STATGRAPHICS Rev. 9/13/2013 Comparson of Regresson Lnes Summary... 1 Data Input... 3 Analyss Summary... 4 Plot of Ftted Model... 6 Condtonal Sums of Squares... 6 Analyss Optons... 7 Forecasts... 8 Confdence

More information

College of Computer & Information Science Fall 2009 Northeastern University 20 October 2009

College of Computer & Information Science Fall 2009 Northeastern University 20 October 2009 College of Computer & Informaton Scence Fall 2009 Northeastern Unversty 20 October 2009 CS7880: Algorthmc Power Tools Scrbe: Jan Wen and Laura Poplawsk Lecture Outlne: Prmal-dual schema Network Desgn:

More information

Chapter 12 Analysis of Covariance

Chapter 12 Analysis of Covariance Chapter Analyss of Covarance Any scentfc experment s performed to know somethng that s unknown about a group of treatments and to test certan hypothess about the correspondng treatment effect When varablty

More information

Convergence of random processes

Convergence of random processes DS-GA 12 Lecture notes 6 Fall 216 Convergence of random processes 1 Introducton In these notes we study convergence of dscrete random processes. Ths allows to characterze phenomena such as the law of large

More information

P R. Lecture 4. Theory and Applications of Pattern Recognition. Dept. of Electrical and Computer Engineering /

P R. Lecture 4. Theory and Applications of Pattern Recognition. Dept. of Electrical and Computer Engineering / Theory and Applcatons of Pattern Recognton 003, Rob Polkar, Rowan Unversty, Glassboro, NJ Lecture 4 Bayes Classfcaton Rule Dept. of Electrcal and Computer Engneerng 0909.40.0 / 0909.504.04 Theory & Applcatons

More information

Laboratory 3: Method of Least Squares

Laboratory 3: Method of Least Squares Laboratory 3: Method of Least Squares Introducton Consder the graph of expermental data n Fgure 1. In ths experment x s the ndependent varable and y the dependent varable. Clearly they are correlated wth

More information

DUE: WEDS FEB 21ST 2018

DUE: WEDS FEB 21ST 2018 HOMEWORK # 1: FINITE DIFFERENCES IN ONE DIMENSION DUE: WEDS FEB 21ST 2018 1. Theory Beam bendng s a classcal engneerng analyss. The tradtonal soluton technque makes smplfyng assumptons such as a constant

More information

Ensemble Methods: Boosting

Ensemble Methods: Boosting Ensemble Methods: Boostng Ncholas Ruozz Unversty of Texas at Dallas Based on the sldes of Vbhav Gogate and Rob Schapre Last Tme Varance reducton va baggng Generate new tranng data sets by samplng wth replacement

More information

Bayesian predictive Configural Frequency Analysis

Bayesian predictive Configural Frequency Analysis Psychologcal Test and Assessment Modelng, Volume 54, 2012 (3), 285-292 Bayesan predctve Confgural Frequency Analyss Eduardo Gutérrez-Peña 1 Abstract Confgural Frequency Analyss s a method for cell-wse

More information

3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X

3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X Statstcs 1: Probablty Theory II 37 3 EPECTATION OF SEVERAL RANDOM VARIABLES As n Probablty Theory I, the nterest n most stuatons les not on the actual dstrbuton of a random vector, but rather on a number

More information

First Year Examination Department of Statistics, University of Florida

First Year Examination Department of Statistics, University of Florida Frst Year Examnaton Department of Statstcs, Unversty of Florda May 7, 010, 8:00 am - 1:00 noon Instructons: 1. You have four hours to answer questons n ths examnaton.. You must show your work to receve

More information

Difference Equations

Difference Equations Dfference Equatons c Jan Vrbk 1 Bascs Suppose a sequence of numbers, say a 0,a 1,a,a 3,... s defned by a certan general relatonshp between, say, three consecutve values of the sequence, e.g. a + +3a +1

More information

Supporting Information

Supporting Information Supportng Informaton The neural network f n Eq. 1 s gven by: f x l = ReLU W atom x l + b atom, 2 where ReLU s the element-wse rectfed lnear unt, 21.e., ReLUx = max0, x, W atom R d d s the weght matrx to

More information

Chapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems

Chapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems Numercal Analyss by Dr. Anta Pal Assstant Professor Department of Mathematcs Natonal Insttute of Technology Durgapur Durgapur-713209 emal: anta.bue@gmal.com 1 . Chapter 5 Soluton of System of Lnear Equatons

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 12 10/21/2013. Martingale Concentration Inequalities and Applications

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 12 10/21/2013. Martingale Concentration Inequalities and Applications MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.65/15.070J Fall 013 Lecture 1 10/1/013 Martngale Concentraton Inequaltes and Applcatons Content. 1. Exponental concentraton for martngales wth bounded ncrements.

More information

Support Vector Machines. Vibhav Gogate The University of Texas at dallas

Support Vector Machines. Vibhav Gogate The University of Texas at dallas Support Vector Machnes Vbhav Gogate he Unversty of exas at dallas What We have Learned So Far? 1. Decson rees. Naïve Bayes 3. Lnear Regresson 4. Logstc Regresson 5. Perceptron 6. Neural networks 7. K-Nearest

More information

Resource Allocation with a Budget Constraint for Computing Independent Tasks in the Cloud

Resource Allocation with a Budget Constraint for Computing Independent Tasks in the Cloud Resource Allocaton wth a Budget Constrant for Computng Independent Tasks n the Cloud Wemng Sh and Bo Hong School of Electrcal and Computer Engneerng Georga Insttute of Technology, USA 2nd IEEE Internatonal

More information

ANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U)

ANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U) Econ 413 Exam 13 H ANSWERS Settet er nndelt 9 deloppgaver, A,B,C, som alle anbefales å telle lkt for å gøre det ltt lettere å stå. Svar er gtt . Unfortunately, there s a prntng error n the hnt of

More information

Some modelling aspects for the Matlab implementation of MMA

Some modelling aspects for the Matlab implementation of MMA Some modellng aspects for the Matlab mplementaton of MMA Krster Svanberg krlle@math.kth.se Optmzaton and Systems Theory Department of Mathematcs KTH, SE 10044 Stockholm September 2004 1. Consdered optmzaton

More information

Cathy Walker March 5, 2010

Cathy Walker March 5, 2010 Cathy Walker March 5, 010 Part : Problem Set 1. What s the level of measurement for the followng varables? a) SAT scores b) Number of tests or quzzes n statstcal course c) Acres of land devoted to corn

More information

COS 521: Advanced Algorithms Game Theory and Linear Programming

COS 521: Advanced Algorithms Game Theory and Linear Programming COS 521: Advanced Algorthms Game Theory and Lnear Programmng Moses Charkar February 27, 2013 In these notes, we ntroduce some basc concepts n game theory and lnear programmng (LP). We show a connecton

More information

Computing Correlated Equilibria in Multi-Player Games

Computing Correlated Equilibria in Multi-Player Games Computng Correlated Equlbra n Mult-Player Games Chrstos H. Papadmtrou Presented by Zhanxang Huang December 7th, 2005 1 The Author Dr. Chrstos H. Papadmtrou CS professor at UC Berkley (taught at Harvard,

More information

Simultaneous Optimization of Berth Allocation, Quay Crane Assignment and Quay Crane Scheduling Problems in Container Terminals

Simultaneous Optimization of Berth Allocation, Quay Crane Assignment and Quay Crane Scheduling Problems in Container Terminals Smultaneous Optmzaton of Berth Allocaton, Quay Crane Assgnment and Quay Crane Schedulng Problems n Contaner Termnals Necat Aras, Yavuz Türkoğulları, Z. Caner Taşkın, Kuban Altınel Abstract In ths work,

More information

A New Evolutionary Computation Based Approach for Learning Bayesian Network

A New Evolutionary Computation Based Approach for Learning Bayesian Network Avalable onlne at www.scencedrect.com Proceda Engneerng 15 (2011) 4026 4030 Advanced n Control Engneerng and Informaton Scence A New Evolutonary Computaton Based Approach for Learnng Bayesan Network Yungang

More information

U.C. Berkeley CS294: Beyond Worst-Case Analysis Luca Trevisan September 5, 2017

U.C. Berkeley CS294: Beyond Worst-Case Analysis Luca Trevisan September 5, 2017 U.C. Berkeley CS94: Beyond Worst-Case Analyss Handout 4s Luca Trevsan September 5, 07 Summary of Lecture 4 In whch we ntroduce semdefnte programmng and apply t to Max Cut. Semdefnte Programmng Recall that

More information

4DVAR, according to the name, is a four-dimensional variational method.

4DVAR, according to the name, is a four-dimensional variational method. 4D-Varatonal Data Assmlaton (4D-Var) 4DVAR, accordng to the name, s a four-dmensonal varatonal method. 4D-Var s actually a drect generalzaton of 3D-Var to handle observatons that are dstrbuted n tme. The

More information

Lecture 12: Discrete Laplacian

Lecture 12: Discrete Laplacian Lecture 12: Dscrete Laplacan Scrbe: Tanye Lu Our goal s to come up wth a dscrete verson of Laplacan operator for trangulated surfaces, so that we can use t n practce to solve related problems We are mostly

More information

Department of Quantitative Methods & Information Systems. Time Series and Their Components QMIS 320. Chapter 6

Department of Quantitative Methods & Information Systems. Time Series and Their Components QMIS 320. Chapter 6 Department of Quanttatve Methods & Informaton Systems Tme Seres and Ther Components QMIS 30 Chapter 6 Fall 00 Dr. Mohammad Zanal These sldes were modfed from ther orgnal source for educatonal purpose only.

More information

Originated from experimental optimization where measurements are very noisy Approximation can be actually more accurate than

Originated from experimental optimization where measurements are very noisy Approximation can be actually more accurate than Surrogate (approxmatons) Orgnated from expermental optmzaton where measurements are ver nos Approxmaton can be actuall more accurate than data! Great nterest now n applng these technques to computer smulatons

More information

CIS526: Machine Learning Lecture 3 (Sept 16, 2003) Linear Regression. Preparation help: Xiaoying Huang. x 1 θ 1 output... θ M x M

CIS526: Machine Learning Lecture 3 (Sept 16, 2003) Linear Regression. Preparation help: Xiaoying Huang. x 1 θ 1 output... θ M x M CIS56: achne Learnng Lecture 3 (Sept 6, 003) Preparaton help: Xaoyng Huang Lnear Regresson Lnear regresson can be represented by a functonal form: f(; θ) = θ 0 0 +θ + + θ = θ = 0 ote: 0 s a dummy attrbute

More information

/ n ) are compared. The logic is: if the two

/ n ) are compared. The logic is: if the two STAT C141, Sprng 2005 Lecture 13 Two sample tests One sample tests: examples of goodness of ft tests, where we are testng whether our data supports predctons. Two sample tests: called as tests of ndependence

More information

Parametric fractional imputation for missing data analysis. Jae Kwang Kim Survey Working Group Seminar March 29, 2010

Parametric fractional imputation for missing data analysis. Jae Kwang Kim Survey Working Group Seminar March 29, 2010 Parametrc fractonal mputaton for mssng data analyss Jae Kwang Km Survey Workng Group Semnar March 29, 2010 1 Outlne Introducton Proposed method Fractonal mputaton Approxmaton Varance estmaton Multple mputaton

More information

1. Inference on Regression Parameters a. Finding Mean, s.d and covariance amongst estimates. 2. Confidence Intervals and Working Hotelling Bands

1. Inference on Regression Parameters a. Finding Mean, s.d and covariance amongst estimates. 2. Confidence Intervals and Working Hotelling Bands Content. Inference on Regresson Parameters a. Fndng Mean, s.d and covarance amongst estmates.. Confdence Intervals and Workng Hotellng Bands 3. Cochran s Theorem 4. General Lnear Testng 5. Measures of

More information

CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE

CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE Analytcal soluton s usually not possble when exctaton vares arbtrarly wth tme or f the system s nonlnear. Such problems can be solved by numercal tmesteppng

More information

CSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography

CSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography CSc 6974 and ECSE 6966 Math. Tech. for Vson, Graphcs and Robotcs Lecture 21, Aprl 17, 2006 Estmatng A Plane Homography Overvew We contnue wth a dscusson of the major ssues, usng estmaton of plane projectve

More information

On the Multicriteria Integer Network Flow Problem

On the Multicriteria Integer Network Flow Problem BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 5, No 2 Sofa 2005 On the Multcrtera Integer Network Flow Problem Vassl Vasslev, Marana Nkolova, Maryana Vassleva Insttute of

More information

princeton univ. F 17 cos 521: Advanced Algorithm Design Lecture 7: LP Duality Lecturer: Matt Weinberg

princeton univ. F 17 cos 521: Advanced Algorithm Design Lecture 7: LP Duality Lecturer: Matt Weinberg prnceton unv. F 17 cos 521: Advanced Algorthm Desgn Lecture 7: LP Dualty Lecturer: Matt Wenberg Scrbe: LP Dualty s an extremely useful tool for analyzng structural propertes of lnear programs. Whle there

More information

NP-Completeness : Proofs

NP-Completeness : Proofs NP-Completeness : Proofs Proof Methods A method to show a decson problem Π NP-complete s as follows. (1) Show Π NP. (2) Choose an NP-complete problem Π. (3) Show Π Π. A method to show an optmzaton problem

More information

Winter 2008 CS567 Stochastic Linear/Integer Programming Guest Lecturer: Xu, Huan

Winter 2008 CS567 Stochastic Linear/Integer Programming Guest Lecturer: Xu, Huan Wnter 2008 CS567 Stochastc Lnear/Integer Programmng Guest Lecturer: Xu, Huan Class 2: More Modelng Examples 1 Capacty Expanson Capacty expanson models optmal choces of the tmng and levels of nvestments

More information

Psychology 282 Lecture #24 Outline Regression Diagnostics: Outliers

Psychology 282 Lecture #24 Outline Regression Diagnostics: Outliers Psychology 282 Lecture #24 Outlne Regresson Dagnostcs: Outlers In an earler lecture we studed the statstcal assumptons underlyng the regresson model, ncludng the followng ponts: Formal statement of assumptons.

More information

Basically, if you have a dummy dependent variable you will be estimating a probability.

Basically, if you have a dummy dependent variable you will be estimating a probability. ECON 497: Lecture Notes 13 Page 1 of 1 Metropoltan State Unversty ECON 497: Research and Forecastng Lecture Notes 13 Dummy Dependent Varable Technques Studenmund Chapter 13 Bascally, f you have a dummy

More information

Grover s Algorithm + Quantum Zeno Effect + Vaidman

Grover s Algorithm + Quantum Zeno Effect + Vaidman Grover s Algorthm + Quantum Zeno Effect + Vadman CS 294-2 Bomb 10/12/04 Fall 2004 Lecture 11 Grover s algorthm Recall that Grover s algorthm for searchng over a space of sze wors as follows: consder the

More information

Information Geometry of Gibbs Sampler

Information Geometry of Gibbs Sampler Informaton Geometry of Gbbs Sampler Kazuya Takabatake Neuroscence Research Insttute AIST Central 2, Umezono 1-1-1, Tsukuba JAPAN 305-8568 k.takabatake@ast.go.jp Abstract: - Ths paper shows some nformaton

More information