A marginal mixture model for discovering motifs in sequences

Size: px
Start display at page:

Download "A marginal mixture model for discovering motifs in sequences"

Transcription

1 A margna mxture mode for dscoverng motfs n sequences E Voudgar and Konstantnos Bekas Abstract. In ths study we present a margna mxture mode for dscoverng probabstc motfs n categorca sequences. The proposed method s based on a genera framework for deveopng, extendng and margnazng expressve motf modes that encapsuates spata nformaton. Two aternatve schemes for constructng expressve modes are descrbed, the extend and the margna approach. The EM agorthm s apped for estmatng the mode parameters, whe an ntazaton procedure s used based on the known K-means agorthm. Numerca experments on varous smuated and rea sets of sequences demonstrate the advantages of the proposed approach n comparson wth the basc maxmum kehood wth the EM agorthm scheme and the Gbbs sampng approach. Introducton Dscoverng motfs n sequences s an mportant and attractve pattern recognton probem found n severa appcaton areas, such as bonformatcs, web mnng, etc. Gven a set of dscrete sequences, a probabstc motf (or pattern) can be seen as a common substrng that s nosy repeated n dfferent ocatons n sequences. The motfs are hghy conserved resdues present n actve stes of sequences and thus they have powerfu dscrmnatve abtes n the sense of creatng features n cassfcaton tasks [4, ]. In the terature there are varous methods that have been ntroduced for sovng ths probem, cassfed accordng to the type of motfs they produce. The Gbbs sampng [9, ], the MEME [], the SAM [8], the BoProspector [], the Greedy EM [3] and the LOGOS [3], consttute representatve statstca approaches for dscoverng shared motfs n a set of sequences. Most of them use probabstc generatve modes to mode motfs as stochastc strng patterns randomy embedded n a smpe background. In such a settng, the motf dscovery probem can be seen as a standard mssng-vaue nference and beng converted nto a parameter estmaton probem. In partcuar, a methods formuate the probem usng ether mxture of mutnoma modes or hdden Markov modes, and appy standard methods such as the Expectaton-Maxmzaton (EM) agorthm [6, 2] or Gbbs sampng[9, ] schemes to estmate the motf mode parameters. The most common way for stochastcay representng a motf s through the poston weght matrx (PWM) that gves the dstrbuton of aphabet characters at every poston assumng that the postons wthn a motf are ndependent. Nevertheess, the smpe PWM mode descrpton s senstve to nose and random or trva recurrent patterns (repettons of short substrngs), and s unabe to capture potenta ste dependences nsde the motf [3]. Varous methods Department of Computer Scence, Unversty of Ioannna, 4 Ioannna, Greece, ema: {evoudga, kbekas}@cs.uo.gr have been deveoped to ncorporate spata nformaton n motfs but, mosty, they are motf specfc and hande ony speca shape motfs [3]. Our am n ths study s to deveoore compact expressve motf modes that encoses more naturay the mportant spata nformaton of motf postons. Ths s done by ntroducng a margna mxture mode under two schemes that encompass the motf neghborhood. In the frst approach (the extended mode), the PWM s expanded for capturng the entre neghborhood of a motf and not a snge K-mer. Aternatvey (the margna mode), the PWM not ony can ft the motf occurrences, but aso can partay ft the substrngs that overap wth them and beong to the same motf neghborhood. These new mode capabtes strengthen the expresson eve of a motf mode by ndrecty ntroducng usefu spata operators. Fnay, another advantage s that the proposed approach s ess senstve to the ntazaton that makes the EM agorthm apped for estmatng mode parameters to be ess affected on parameter ntazaton. We have made experments wth smuated sets of sequences as we as known boogca benchmarks. Comparsons have been aso made wth rva methods: the basc maxmum kehood approach wth the EM agorthm [] and the Gbbs sampng [9, ], where we set up three evauaton crtera. In secton 2 we present the basc methodoogy for dscoverng probabstc motfs wth mxture modes and the EM agorthm for estmatng mode parameters, and then the proposed margna mxture mode and ts expressve motf modes. eca care s aso gven to the ntazaton where we present a scheme based on the k- means custerng agorthm over the mode parametrc space. Secton 3 presents expermenta resuts obtaned by our method and comparatve approaches n varous datasets. Fnay, we summarze and gve some concuded remarks n secton 4. 2 Appyng mxture modes for motfs dscoverng Consder a fnte set Σ {c,..., c M } consstng of M ndvdua characters. An arbtrary strng over the set Σ s any sequence S j {s jk } L j k of ength Lj, where s jk Σ denotes the character at the k-th poston of the j-th sequence. Now, et S {S,..., S N } be a set of N strngs of ength L,..., L N, respectvey. A common subsequence of ength K that s repeated at dfferent stes among the nput sequences of set S s caed motf. In our study we w consdered that the motf ength K s known. The appcaton of mxture modes needs a necessary preprocessng step to be made frst, where we coect a the possbe substrngs over S of ength equa to K. Ths can be done by sdng a wndow of sze K to every sequence S j, obtanng a set of L j K + substrngs. Each substrng ndcates the startng poston

2 of a possbe motf copy over sequences. In such way, we obtan a set of n substrngs X {x } n, n N (Lj K + ), that consttutes the set of observatons n our probem. 2. The basc approach In the basc approach, we ft a two-component mxture of mutnomas mode to the set of nput substrngs X. Here, we assume that each substrng x has been generated from ether a motf mode (y ) or a background mode (y ), as gven by the hdden (bnary) varabe y. The motf mode component has a pror probabty P (y ) π, whe the second component, the background, captures a the non-motf nformaton wth a pror probabty P (y ) π. The densty functon of the mxture mode for an observaton x s gven by f(x Θ) π(x θ) + ( π)p b (x b), () where Θ {π, θ, b} s the set of the (unknown) mode parameters: the pror probabty ( π ) and the two component densty parameters. A fexbe and most commony used way for descrbng a motf mode s through a poston weght matrx (PWM) θ [θ k ] of sze K M. Each eement θ k denotes the probabty of character c Σ to be found n the k-th poston of the motf, where for each poston k t hods θ k. On the other hand, the background dstrbuton s represented wth an M-vector of probabtes b [b,..., b M ] that s common for any substrng poston ( b ). Foowng the mutnoma dstrbuton and assumng ndependence among motf postons, the probabty densty functons of the motf (x θ) and the background mode p b (x b) are K (x θ) p(x y, θ) θ δ k k, (2) p b (x b) p(x y, b) k K δ k k b, (3) where δ k s the Kronecker deta,.e. f character c s found at the k-th poston of substrng x, otherwse. Based on the above formuaton, the motf dscovery probem can now be converted nto a maxmum kehood (ML) estmaton probem, where the og-kehood functon s gven by L(Θ) og p(x Θ) og f(x Θ). (4) The EM agorthm [6] consttutes an effcent framework for estmatng the mode parameters Θ { π, {θ k } and {b } }. It teratvey performs two man steps. At frst the E-step where the condtona expectaton z p(y x, Θ (t) ) of the hdden varabes y s computed p(y x, Θ (t) ) π (t) (x θ (t) ) π (t) (x θ (t) ) + ( π (t) )p b (x b (t) ), () and aso the expectaton of the compete data og-kehood s estabshed: Q(Θ, Θ (t) ) {og π + og (x θ)} + +( ){og( π) + og p b (x b)}.(6) The above Q-functon s maxmzed next durng the M-step over the mxture mode parameters. Ths gves the foowng update equatons: π (t+) θ (t+) k b (t+) n, (7) δ k, (8) M δ k ( ) K K k ( ) δ k. (9) The EM agorthm guarantees the convergence of the kehood functon to a oca maxmum where smutaneousy satsfes a the constrants of the parameters. At the end of the process, the occurrences of the estmated motf can be found by seectng the substrngs x whose posteror probabty vaue z s above a threshod (e.g..8). 2.2 The proposed approach Nevertheess, a sgnfcant drawback from the above basc approach appears due to the convenent..d. assumpton of the observatons x. Ths prevents nto takng nto account the vauabe spata nformaton of data. As a resut substrngs that are found n the neghborhood of a motf occurrence may be consdered as motf copes.e. to have hgh posteror probabty vaues z to be a motf. Ths phenomenon, whch s mosty common n recurrent type of motfs where a short part of the motf s repeated, may be substantay modfy the expressve motf mutnoma mode to a non-nformatve unform mode. Suppose, for exampe, that the motf we want to dscover s the T AT AT AT AT A, where the 2-gram T A s repeated tmes (K ). Suppose aso a copy of ths motf x T AT AT AT AT A. Then, as shown n Tabe, there are substrngs n ts neghborhood (whch overap wth the x ) that w contan a sgnfcant part of the motf. Ths w make ther posteror probabty vaues (Eq. ) to be sgnfcant arge, and thus ther contrbuton to the update rue of motf mode parameters θ k of Eq. 8 w be hgh. Ths w cause nto a step by step dramatc reducton of the hgher probabtes vaues n every motf poston (that correspond to the characters of the motf) durng the EM procedure. Under ths crcumstance, the EM agorthm w end up to a unform dstrbuton of M characters n K motf postons and thus the motf dscovery w be faed. In the terature there are varous methods that try to overcome ths occurrence by ndrecty or drecty adoptng spata nformaton to the mode. In [] for exampe, a normazaton of the posteror vaue z of the adjacent sequences s performed so that guarantees n any wndow of ength K the sum of z vaues remans ess than or equa to. Another approach was presented n [2] by ntroducng spata constrants to the mode through the use of a Markov Random Fed (MRF) pror over the motf abes. In ths study we work more systematcay over spataty by ncorporatng ths usefu knd 2

3 Tabe. Substrngs n the neghborhood of a motf occurrence x. ce they a have a sgnfcant overap wth the rea motf and thus hgh posteror vaues z, they w contrbute consderaby to the EM updated rue of the motf parameters. x 4 x 2 x x +2 x +4 ****TATATA **TATATATA TATATATATA TATATATA** TATATA**** of nformaton from observatons more naturay and drecty to the expressve motf mode The extended mode In the frst scheme, the orgna poston weght matrx θ s extended by K more nes to both sdes, so as to ncude the entre neghborhood of a motf. The motf mode s now descrbed wth a new matrx of dmenson (3K 2) M and the fttng procedure can now be seen as searchng over the new matrx mode θ to fnd the best suted bock matrx of K sze (number of nes). Therefore, ths matrx expanson suggests 2K (overappng) mutnoma dstrbutons equa to the number of a possbe startng postons wthn motf neghborhood (frst 2K nes of matrx θ), wth the foowng densty functon ( j,..., 2K ) φ j (x θ) p(x y j, θ) K θ δ k j+k,. () k The hdden varabe y now defnes the startng poston of the substrng x wthn the neghborhood of the motf. Note that when there s not any such poston (y 2K), the substrng s generated by the smpest background mode as prevous. The mxture mode densty functon now becomes f(x Θ) 2K 2K π jφ j(x θ) + ( π j)p b (x b). () In the exampe of Tabe the ntroducton of the new matrx θ aow the neghborng substrngs (x 4, x 2, x +2, x +4 ) that are overappng wth the rea motf copy (x ) to be ftted wth one of these mutnoma dstrbuton and not destroy the dstrbuton of the orgna motf. We w refer to ths scheme as the extended mode. Lkewse, the EM agorthm can now be apped for estmatng the mode parameters. It requres the cacuaton of the posteror probabtes of hdden varabes durng the E-step j P (y j x, Θ (t) ) π(t) j φ j(x θ (t) ), (2) f(x Θ (t) ) and the maxmzaton of the correspondng Q-functon durng the M-step Q(Θ, Θ (t) ) 2K +( { 2K j {og π j + og φ j (x θ)} + 2K j ){og( π j ) + og p b (x b)}}. (3) Ths gves the foowng update rues θ (t+) k k π j n z(t) j n j δ,k j+, k k jk K+ j jk K+ 2K j δ,k j+, k jk K+ 2K b (t+) jk K+ j j δ,k j+, j, (4), f k < K, f K k < 2K n ( 2K j ) K k δ k, f 2K k < 3K () K n ( 2K j ). (6) As t s cear n the update equaton for the motf extended matrx vaues θ k (Eq. ), every ne contrbutes more than one tme (except for the borders) to the computaton of the posteror probabtes z j), snce they beong to the 2K mutnomas severa tmes. Moreover, t must be noted that n practce t does not aways happen the rea motf to be found n the centra bock matrx (between the K and 2K nes) of the extended matrx θ. After convergence of the EM agorthm, the fna motf w be found by searchng for the K contnuous nes of the matrx that have the greater sum of ther maxmum probabty vaues θ k. Fnay, there are two other mportant advantages of the above scheme that must be dscussed. At frst t s ess senstvty to the ntazaton of the motf parameters. Even f we have not perfecty ntazed the motf and a part of t s mssng, the matrx extenson w fnay manage to fx t and ncorporate the mssng part to ts neghborng nes. Aso, the extended mode s not very senstve to the ength of motf K, snce the new expressve matrx has a suffcent space for motfs of overestmated or underestmated ength The margna mode In the second approach, the poston weght matrx θ remans the same as the basc approach (sze K M), but now t has a dfferent behavor. Here, we consder that n ths mode ony a part of t s used for fttng nput substrngs x, whe the remanng part s treated as background. Ths can be seen as appyng a shftng operator to the examned substrng x across the matrx θ so as to fnd the optma agnment between them n terms of fttng. There are two cases. Ether the frst j postons of x are modeed by the background and the rest K j postons are modeed from the frst j nes of the matrx θ, or the opposte,.e. the ast j nes of θ are responsbe for modeng the frst j motf postons whe the rest are consdered as background. In the negatve case, the substrng s consdered entrey as background nformaton (Eq. 3). Therefore, we have agan 2K, 3

4 mutnoma dstrbutons, equa to the number of a possbe combnatons of fttng the observaton wth the margna mode. Every densty functon s now gven by K j φ j (x θ) b 2K j k k δ k j k θ δ k j K+k, θ δ,k+k j, k b K k2k j+ δ k, j K,K + j 2K (7) In ths case the hdden varabe y defnes the part (number of ast or frst postons) of any observaton x that beongs to the motf mode. Ths scheme w be referred next as margna mode. The appcaton of the EM agorthm for estmatng the parameters of the margna mode eads to dfferent update equatons at the M- step, n comparson wth the extended mode. These are: θ (t+) k b (t+) K+k jk K { j δ,k j+k, K+k jk K j j j δ k + k 2K jk+ K 2K { (K j) j + j jk+ K k2k j+ (j K) j } (8) δ k } (9) At the end of the EM agorthm, the motf can be found by searchng for the frst (k ) and the ast (k 2 ) poston (ne of the matrx θ) among the totay K postons, where ther maxmum probabty vaue are above a threshod (e.g..8). Ths segment [k k 2] corresponds ony to a part of the true motf. Its rest part can be substantay obtaned by frst fndng the substrngs x wth hgh posteror probabty vaues (z k2 >.8) that capture the segment [k k 2 ], and then by the suffcent statstcs of ther neghborng substrngs n ether drecton. 2.3 Intazaton of motf modes The major drawback of the EM agorthm s ts great dependence on the nta vaues of the mode parameters that may cause nto gettng stuck n oca maxma of the kehood functon [2]. In our study, ths probem s many concentrated on the ntazaton of the motf mode parameters θ, snce the background densty b s aways ntazed by the reatve frequences of characters n sequences. Varous approaches have been proposed n the terature to overcome ths probem. In [] for exampe, a dynamc programmng approach s used whch estmates the goodness of many possbe startng ponts based on the kehood of the mode after one EM teraton. Another method presented n [3] appes a dvsve herarchca custerng approach that generates a motf modes parametrc search space. In ths study we have apped an ntazaton strategy that s based on a custerng scheme usng the cassca k-means agorthm. In the genera case the k-means agorthm ams at fndng a partton of M dsjont custers to a set of n observatons, so as to mnmze the overa sum of ther dstances wth the custer centers µ j. 4 Dependng on the type of observatons, one must determne an approprate dstance functon and aso a method for cacuatng custer centers. In our study, we ntay map every nput substrng x nto a poston weght matrx ϑ, where ts vaues were taken from the Kronecker deta vaues ϑ k δ k. In ths mode parametrc search space we perform then a custerng procedure by usng the Manhattan dstance functon between two subsequences x, x j: d(, j) K k M ϑ k µ jk ϑ µ j. Fnay, the custer centers µ j {µ jk } are estmated by the suffcent statstcs of the substrngs that currenty beong to jth custer (reatve frequences of characters c n every poston k). The number of custers for searchng was set to k n/n. When fnshng, we ntaze the motf mode wth the center µ j from the custer j that has the mnmum ntracuster dstance,.e. average dstance between a custer members and ts center. The expermenta study has shown that ths custerng scheme provdes satsfactory nta vaues of mode parameters θ very fast. 3 Expermenta resuts Severa experments were performed usng both artfca and rea datasets n an attempt to study the effectveness of the proposed approach. Durng a experments the custerng scheme of k-means was frst executed twenty (2) tmes and the optmum souton was kept to ntaze the motf mode. The pror probabtes π j were a ntay set to π /(2K). We have tested both versons of the proposed margna mxture mode, the extend () and the margna (). Comparatve resuts have been aso obtaned usng the basc ML approach (basc), as we as the Gbbs sampng () [9, ]. The method teratvey performs two steps unt kehood convergence: It randomy seects frst a sequence S and re-estmates the motf mode θ (t+) usng the current motf postons of a sequences but S. Then a new startng poston of the motf mode s seected n S (among the L K + possbe) by sampng from the posteror dstrbuton. Obvousy, ths verson of the assumes that each sequence has a unque occurrence of the pattern, whe our approach suggests an arbtrary number of motf copes n each sequence. Athough ths s not far, n our study we have seected datasets havng a snge motf copy n every sequence. Fnay, t must be mentoned that a methods were ntazed wth the same way, except from the method whch s better to randomy ntazed. The ast was executed 2 successve tmes wth dfferent seed vaue and the best souton found was kept. The generaton mechansm of the artfca sets used n our experments was the foowng: Usng a dscrete aphabet Σ wth M characters, we unformy produce a number N 2 sequences of varabe ength (wth a mean ength L ). Then, n each sequence we randomy seect a poston for pacng a nosy copy of a preseected seed motf of ength K, accordng to a mutaton probabty vaue (common to every motf poston). Eght dfferent vaues for the nose parameter were used, and for each vaue we generated dfferent sets of artfca sequences and kept statstcs (mean vaues and stds) of the performance of a the comparatve methods. We have used the foowng three performance crtera for evauatng each method: θ: Manhattan dstance between the estmated and the true motf

5 mode (dstance dstrbutons) θ θ θ true K M k θ k θ true k, 2 2 where true pattern densty θ true was estmated from the reatve frequences of characters of the nosy motf copes. : percentage of the predcted postons of a true motf copes n the set of nput sequences X. : percentage of the rea copes of a motf found havng detected at east 33% of ther postons. In must be noted that for the cacuaton of and quanttes, when a method found overappng substrngs n a motf copy neghborhood, ony ther mean vaue of the predcted number of postons woud have been kept for that copy. We have made experments n a varety of smuated sets of sequences (varous seed motfs, ength K, aphabet sze M and nose parameter ). Fgure ustrates the depcted resuts we obtaned wth four seed motfs and an aphabet of sze M 8. Each dagram represents the mean vaue and the standard devaton of the evauaton measurements ( θ,, ) as cacuated by each one of the four comparatve approaches usng dfferent nosy eves (). Both methods, extend and margna, showed a better performance n comparson wth the other two approaches. In amost a cases both schemes of our method had managed to estmate better the true mode accordng to the θ crteron. And ths s of great nterest. Aso, as t was expected, n the case of recurrent motfs (Fg. (c), (d)) the basc approach competey fas to dscover them. On the other hand, our approach yeded satsfactory resuts and comparabe to these of the method. The resuts were the same wth severa other experments wth smuated data created from aphabet of dfferent sze. We have aso evauated our method n a rea dataset generated by the motf nformaton of Eschercha co ReguonDB [7]. Ths database conssts of varous sets of DNA sequences 2 havng a snge motf copy per sequence wth symmetrc margns (background) on both ste of t. The study of these benchmarks s very nterestng and attractve, snce the ground truth (ocatons of motf occurrences) s known a pror and aso the degree of smarty of motf occurrences s very ow (extremey nosy motfs). More detas about these data can be found n [7]. In our experments we have used a part of eght (8) such data sets presented n Tabe 2, where we have consdered two margn szes: 2 and (a) seed motf:p AMEP ARAKAT W, wth no repettons, K 2, M (b) seed motf:p AMEP ARAKAT W P AMET W RA wth no repettons, K 2, M (c) seed motf:t AT AT AT AT AT A, wth repettons, K 2, M 8 Tabe 2. The descrpton of the rea data used n our expermenta study Fename Motf ength (K) Number of sequences (N) DnaA 9 7 FruR 4 Fur 9 2 LexA 2 CytR 4 PurR 76 SoxS 78 7 TyrR Fgure 2 summarzes the resuts obtaned by the four comparng methods. In partcuar, t ustrates the mean vaues of both measurements and, as cacuated after 2 dfferent runs n any of the 2 Data can be downoaded from (d) seed motf:t AT AT AT AT AT AT AT AT AT A, wth repettons, K 2, M 8 Fgure. Comparatve resuts wth smuated data. For each probem we present three dagrams wth the cacuated mean vaues and stds of the three performance measurements.

6 (a) 2 margn sze (b) margn sze Fgure 2. Comparatve resuts on the rea datasets of Tabe 2 n the case of 2 (a) and (b) margn sze. The vertca bars represent the mean vaues of the two cacuated measurements, after executons of each method. seected data sets. Athough these knd of motfs are not recurrent, on average, we obtaned better resuts wth our approach n comparson wth the basc ML method. Ths renforce the proposed margna mxture mode snce t manages to detect quatatvey better motfs n extremey nose parameter. Fnay, athough the comparson wth the verson of the method n unfar, our method yeded comparabe resuts especay n motfs of sma ength K. An expanaton of ths s that when searchng for motfs of arge ength, for exampe K 76, the seected margn sze of 2 or s not enough for capturng the neghborhood of motfs wthn the expressve mode and thus the proposed approaches cannot be normay apped. 83, (99). [2] K. Bekas, A mxture mode based Markov random feds for dscoverng probabstc patterns n sequences, n Panheenc Conference n Artfca Integence (SETN-26) (Lecture Notes n Artfca Integence), voume 39, pp. 2 34, (26). [3] K. Bekas, D.I. Fotads, and A. Lkas, Greedy mxture earnng for mutpe motf dscoverng n boogca sequences, Bonformatcs, 9(), 67 67, (23). [4] A. Brāzma, I. Jonasses, I. Edhammer, and D. Gbert, Approaches to the automatc dscovery of patterns n bosequences, Journa of Computatona Boogy, (2), , (998). [] B. Bréjova, C. DMarco, T. Vna r, S.R. Hdago, G. Hogun, and C. Patten. Fndng patterns n boogca sequences. Project Report for CS798g, Unversty of Wateroo, 2. [6] A.P. Dempster, N.M. Lard, and D.B. Rubn, Maxmum kehood from ncompete data va the EM agorthm, J. Roy. Statst. Soc. B, 39, 38, (977). [7] J. Hu, B. L, and D. Khara, Lmtatons and potentas of current motf dscovery agorthms, Nucec Acds Research, 33(), , (2). [8] R. Hughey and A. Krogh, Hdden Markov modes for sequence anayss: enson and anayss of the basc method, CABIOS, 2(2), 9 7, (996). [9] C.E. Lawrence, S.F. Atschu, M.S. Bogusk, J.S. Lu, A.F. Neuwand, and J.C. Wootton, Detectng subte sequence sgnas: a Gbbs sampng strategy for mutpe agnment, Scence, 226, 28 24, (993). [] J.S. Lu, A.F. Neuwad, and C.E. Lawrence, Bayesan modes for mutpe oca sequence agnment and Gbbs sampng strateges, J. Amer. Statstca Assoc, 9, 6 69, (99). [] X. Lu, D.L. Brutag, and J.S. Lu, BoProspector: dscoverng conserved DNA motfs n upstream reguatory regons of co-expressed genes, n Pac. Symp. Bocomput, pp , (2). [2] G.M. McLachan and D. Pee, Fnte Mxture Modes, New York: John Wey & Sons, Inc., 2. [3] E.P. Xng, W. Wu, M.I. Jordan, and R.M. Karp, LOGOS: A moduar Bayesan mode for de novo motf detecton, Journa of Bonformatcs and Computatona Boogy, 2(), 27 4, (24). 4 Concusons We have presented a margna mxture mode for dscoverng probabstc motfs n categorca sequences that ncorporates an advanced and more nformatve expressve motf mode. Two smar versons of that mode were proposed, one wth an expanson and another one wth a demtaton of bounds wthn poston weght matrx. Ths aows to ft the neghborhood of a motf that eads to an effcent and consstent nference of motf ocatons. Another sgnfcant advantage of our method s that the EM agorthm that s used to estmate the mode parameters, s ess depended on the mode parameters ntazaton than wth the cassca approach. Experments on a varety of artfca and rea data sets have shown mproved performance of the proposed scheme and ts abty to dentfy quatatvey better motfs, n comparson wth the basc ML approach and the method. We are pannng to further nvestgate the performance of the proposed method to other expermenta data sets and aso to desgn more compex expressve motf modes that can smutaneousy hande gaps among motf postons. REFERENCES [] T.L. Baey and C. Ekan, Unsupervsed earnng of mutpe motfs n Bopoymers usng Expectaton Maxmzaton, Machne Learnng, 2, 6

MARKOV CHAIN AND HIDDEN MARKOV MODEL

MARKOV CHAIN AND HIDDEN MARKOV MODEL MARKOV CHAIN AND HIDDEN MARKOV MODEL JIAN ZHANG JIANZHAN@STAT.PURDUE.EDU Markov chan and hdden Markov mode are probaby the smpest modes whch can be used to mode sequenta data,.e. data sampes whch are not

More information

Associative Memories

Associative Memories Assocatve Memores We consder now modes for unsupervsed earnng probems, caed auto-assocaton probems. Assocaton s the task of mappng patterns to patterns. In an assocatve memory the stmuus of an ncompete

More information

Image Classification Using EM And JE algorithms

Image Classification Using EM And JE algorithms Machne earnng project report Fa, 2 Xaojn Sh, jennfer@soe Image Cassfcaton Usng EM And JE agorthms Xaojn Sh Department of Computer Engneerng, Unversty of Caforna, Santa Cruz, CA, 9564 jennfer@soe.ucsc.edu

More information

Multispectral Remote Sensing Image Classification Algorithm Based on Rough Set Theory

Multispectral Remote Sensing Image Classification Algorithm Based on Rough Set Theory Proceedngs of the 2009 IEEE Internatona Conference on Systems Man and Cybernetcs San Antono TX USA - October 2009 Mutspectra Remote Sensng Image Cassfcaton Agorthm Based on Rough Set Theory Yng Wang Xaoyun

More information

Neural network-based athletics performance prediction optimization model applied research

Neural network-based athletics performance prediction optimization model applied research Avaabe onne www.jocpr.com Journa of Chemca and Pharmaceutca Research, 04, 6(6):8-5 Research Artce ISSN : 0975-784 CODEN(USA) : JCPRC5 Neura networ-based athetcs performance predcton optmzaton mode apped

More information

Nested case-control and case-cohort studies

Nested case-control and case-cohort studies Outne: Nested case-contro and case-cohort studes Ørnuf Borgan Department of Mathematcs Unversty of Oso NORBIS course Unversty of Oso 4-8 December 217 1 Radaton and breast cancer data Nested case contro

More information

Supplementary Material: Learning Structured Weight Uncertainty in Bayesian Neural Networks

Supplementary Material: Learning Structured Weight Uncertainty in Bayesian Neural Networks Shengyang Sun, Changyou Chen, Lawrence Carn Suppementary Matera: Learnng Structured Weght Uncertanty n Bayesan Neura Networks Shengyang Sun Changyou Chen Lawrence Carn Tsnghua Unversty Duke Unversty Duke

More information

Research on Complex Networks Control Based on Fuzzy Integral Sliding Theory

Research on Complex Networks Control Based on Fuzzy Integral Sliding Theory Advanced Scence and Technoogy Letters Vo.83 (ISA 205), pp.60-65 http://dx.do.org/0.4257/ast.205.83.2 Research on Compex etworks Contro Based on Fuzzy Integra Sdng Theory Dongsheng Yang, Bngqng L, 2, He

More information

Example: Suppose we want to build a classifier that recognizes WebPages of graduate students.

Example: Suppose we want to build a classifier that recognizes WebPages of graduate students. Exampe: Suppose we want to bud a cassfer that recognzes WebPages of graduate students. How can we fnd tranng data? We can browse the web and coect a sampe of WebPages of graduate students of varous unverstes.

More information

A finite difference method for heat equation in the unbounded domain

A finite difference method for heat equation in the unbounded domain Internatona Conerence on Advanced ectronc Scence and Technoogy (AST 6) A nte derence method or heat equaton n the unbounded doman a Quan Zheng and Xn Zhao Coege o Scence North Chna nversty o Technoogy

More information

COXREG. Estimation (1)

COXREG. Estimation (1) COXREG Cox (972) frst suggested the modes n whch factors reated to fetme have a mutpcatve effect on the hazard functon. These modes are caed proportona hazards (PH) modes. Under the proportona hazards

More information

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur Module 3 LOSSY IMAGE COMPRESSION SYSTEMS Verson ECE IIT, Kharagpur Lesson 6 Theory of Quantzaton Verson ECE IIT, Kharagpur Instructonal Objectves At the end of ths lesson, the students should be able to:

More information

Optimum Selection Combining for M-QAM on Fading Channels

Optimum Selection Combining for M-QAM on Fading Channels Optmum Seecton Combnng for M-QAM on Fadng Channes M. Surendra Raju, Ramesh Annavajjaa and A. Chockangam Insca Semconductors Inda Pvt. Ltd, Bangaore-56000, Inda Department of ECE, Unversty of Caforna, San

More information

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification E395 - Pattern Recognton Solutons to Introducton to Pattern Recognton, Chapter : Bayesan pattern classfcaton Preface Ths document s a soluton manual for selected exercses from Introducton to Pattern Recognton

More information

Chapter 6 Hidden Markov Models. Chaochun Wei Spring 2018

Chapter 6 Hidden Markov Models. Chaochun Wei Spring 2018 896 920 987 2006 Chapter 6 Hdden Markov Modes Chaochun We Sprng 208 Contents Readng materas Introducton to Hdden Markov Mode Markov chans Hdden Markov Modes Parameter estmaton for HMMs 2 Readng Rabner,

More information

Delay tomography for large scale networks

Delay tomography for large scale networks Deay tomography for arge scae networks MENG-FU SHIH ALFRED O. HERO III Communcatons and Sgna Processng Laboratory Eectrca Engneerng and Computer Scence Department Unversty of Mchgan, 30 Bea. Ave., Ann

More information

On the Power Function of the Likelihood Ratio Test for MANOVA

On the Power Function of the Likelihood Ratio Test for MANOVA Journa of Mutvarate Anayss 8, 416 41 (00) do:10.1006/jmva.001.036 On the Power Functon of the Lkehood Rato Test for MANOVA Dua Kumar Bhaumk Unversty of South Aabama and Unversty of Inos at Chcago and Sanat

More information

Motion Perception Under Uncertainty. Hongjing Lu Department of Psychology University of Hong Kong

Motion Perception Under Uncertainty. Hongjing Lu Department of Psychology University of Hong Kong Moton Percepton Under Uncertanty Hongjng Lu Department of Psychology Unversty of Hong Kong Outlne Uncertanty n moton stmulus Correspondence problem Qualtatve fttng usng deal observer models Based on sgnal

More information

Predicting Model of Traffic Volume Based on Grey-Markov

Predicting Model of Traffic Volume Based on Grey-Markov Vo. No. Modern Apped Scence Predctng Mode of Traffc Voume Based on Grey-Marov Ynpeng Zhang Zhengzhou Muncpa Engneerng Desgn & Research Insttute Zhengzhou 5005 Chna Abstract Grey-marov forecastng mode of

More information

Kernel Methods and SVMs Extension

Kernel Methods and SVMs Extension Kernel Methods and SVMs Extenson The purpose of ths document s to revew materal covered n Machne Learnng 1 Supervsed Learnng regardng support vector machnes (SVMs). Ths document also provdes a general

More information

Lecture Notes on Linear Regression

Lecture Notes on Linear Regression Lecture Notes on Lnear Regresson Feng L fl@sdueducn Shandong Unversty, Chna Lnear Regresson Problem In regresson problem, we am at predct a contnuous target value gven an nput feature vector We assume

More information

Cyclic Codes BCH Codes

Cyclic Codes BCH Codes Cycc Codes BCH Codes Gaos Feds GF m A Gaos fed of m eements can be obtaned usng the symbos 0,, á, and the eements beng 0,, á, á, á 3 m,... so that fed F* s cosed under mutpcaton wth m eements. The operator

More information

CS 2750 Machine Learning. Lecture 5. Density estimation. CS 2750 Machine Learning. Announcements

CS 2750 Machine Learning. Lecture 5. Density estimation. CS 2750 Machine Learning. Announcements CS 750 Machne Learnng Lecture 5 Densty estmaton Mlos Hauskrecht mlos@cs.ptt.edu 539 Sennott Square CS 750 Machne Learnng Announcements Homework Due on Wednesday before the class Reports: hand n before

More information

A DIMENSION-REDUCTION METHOD FOR STOCHASTIC ANALYSIS SECOND-MOMENT ANALYSIS

A DIMENSION-REDUCTION METHOD FOR STOCHASTIC ANALYSIS SECOND-MOMENT ANALYSIS A DIMESIO-REDUCTIO METHOD FOR STOCHASTIC AALYSIS SECOD-MOMET AALYSIS S. Rahman Department of Mechanca Engneerng and Center for Computer-Aded Desgn The Unversty of Iowa Iowa Cty, IA 52245 June 2003 OUTLIE

More information

The Application of BP Neural Network principal component analysis in the Forecasting the Road Traffic Accident

The Application of BP Neural Network principal component analysis in the Forecasting the Road Traffic Accident ICTCT Extra Workshop, Bejng Proceedngs The Appcaton of BP Neura Network prncpa component anayss n Forecastng Road Traffc Accdent He Mng, GuoXucheng &LuGuangmng Transportaton Coege of Souast Unversty 07

More information

Clustering gene expression data & the EM algorithm

Clustering gene expression data & the EM algorithm CG, Fall 2011-12 Clusterng gene expresson data & the EM algorthm CG 08 Ron Shamr 1 How Gene Expresson Data Looks Entres of the Raw Data matrx: Rato values Absolute values Row = gene s expresson pattern

More information

Lower Bounding Procedures for the Single Allocation Hub Location Problem

Lower Bounding Procedures for the Single Allocation Hub Location Problem Lower Boundng Procedures for the Snge Aocaton Hub Locaton Probem Borzou Rostam 1,2 Chrstoph Buchhem 1,4 Fautät für Mathemat, TU Dortmund, Germany J. Faban Meer 1,3 Uwe Causen 1 Insttute of Transport Logstcs,

More information

GENERATIVE AND DISCRIMINATIVE CLASSIFIERS: NAIVE BAYES AND LOGISTIC REGRESSION. Machine Learning

GENERATIVE AND DISCRIMINATIVE CLASSIFIERS: NAIVE BAYES AND LOGISTIC REGRESSION. Machine Learning CHAPTER 3 GENERATIVE AND DISCRIMINATIVE CLASSIFIERS: NAIVE BAYES AND LOGISTIC REGRESSION Machne Learnng Copyrght c 205. Tom M. Mtche. A rghts reserved. *DRAFT OF September 23, 207* *PLEASE DO NOT DISTRIBUTE

More information

Homework Assignment 3 Due in class, Thursday October 15

Homework Assignment 3 Due in class, Thursday October 15 Homework Assgnment 3 Due n class, Thursday October 15 SDS 383C Statstcal Modelng I 1 Rdge regresson and Lasso 1. Get the Prostrate cancer data from http://statweb.stanford.edu/~tbs/elemstatlearn/ datasets/prostate.data.

More information

Item calibration in incomplete testing designs

Item calibration in incomplete testing designs Pscoógca (20), 32, 07-32 Item cabraton n ncompete testng desgns Theo JHM Eggen * & Norman D Verhest** *Cto/Unversty of Twente, The etherands **Cto, The etherands Ths study dscusses the justfabty of tem

More information

Discriminating Fuzzy Preference Relations Based on Heuristic Possibilistic Clustering

Discriminating Fuzzy Preference Relations Based on Heuristic Possibilistic Clustering Mutcrtera Orderng and ankng: Parta Orders, Ambgutes and Apped Issues Jan W. Owsńsk and aner Brüggemann, Edtors Dscrmnatng Fuzzy Preerence eatons Based on Heurstc Possbstc Custerng Dmtr A. Vattchenn Unted

More information

3. Stress-strain relationships of a composite layer

3. Stress-strain relationships of a composite layer OM PO I O U P U N I V I Y O F W N ompostes ourse 8-9 Unversty of wente ng. &ech... tress-stran reatonshps of a composte ayer - Laurent Warnet & emo Aerman.. tress-stran reatonshps of a composte ayer Introducton

More information

Space of ML Problems. CSE 473: Artificial Intelligence. Parameter Estimation and Bayesian Networks. Learning Topics

Space of ML Problems. CSE 473: Artificial Intelligence. Parameter Estimation and Bayesian Networks. Learning Topics /7/7 CSE 73: Artfcal Intellgence Bayesan - Learnng Deter Fox Sldes adapted from Dan Weld, Jack Breese, Dan Klen, Daphne Koller, Stuart Russell, Andrew Moore & Luke Zettlemoyer What s Beng Learned? Space

More information

Inthem-machine flow shop problem, a set of jobs, each

Inthem-machine flow shop problem, a set of jobs, each THE ASYMPTOTIC OPTIMALITY OF THE SPT RULE FOR THE FLOW SHOP MEAN COMPLETION TIME PROBLEM PHILIP KAMINSKY Industra Engneerng and Operatons Research, Unversty of Caforna, Bereey, Caforna 9470, amnsy@eor.bereey.edu

More information

Parametric fractional imputation for missing data analysis. Jae Kwang Kim Survey Working Group Seminar March 29, 2010

Parametric fractional imputation for missing data analysis. Jae Kwang Kim Survey Working Group Seminar March 29, 2010 Parametrc fractonal mputaton for mssng data analyss Jae Kwang Km Survey Workng Group Semnar March 29, 2010 1 Outlne Introducton Proposed method Fractonal mputaton Approxmaton Varance estmaton Multple mputaton

More information

DETERMINATION OF UNCERTAINTY ASSOCIATED WITH QUANTIZATION ERRORS USING THE BAYESIAN APPROACH

DETERMINATION OF UNCERTAINTY ASSOCIATED WITH QUANTIZATION ERRORS USING THE BAYESIAN APPROACH Proceedngs, XVII IMEKO World Congress, June 7, 3, Dubrovn, Croata Proceedngs, XVII IMEKO World Congress, June 7, 3, Dubrovn, Croata TC XVII IMEKO World Congress Metrology n the 3rd Mllennum June 7, 3,

More information

A MIN-MAX REGRET ROBUST OPTIMIZATION APPROACH FOR LARGE SCALE FULL FACTORIAL SCENARIO DESIGN OF DATA UNCERTAINTY

A MIN-MAX REGRET ROBUST OPTIMIZATION APPROACH FOR LARGE SCALE FULL FACTORIAL SCENARIO DESIGN OF DATA UNCERTAINTY A MIN-MAX REGRET ROBST OPTIMIZATION APPROACH FOR ARGE SCAE F FACTORIA SCENARIO DESIGN OF DATA NCERTAINTY Travat Assavapokee Department of Industra Engneerng, nversty of Houston, Houston, Texas 7704-4008,

More information

Hidden Markov Models

Hidden Markov Models Hdden Markov Models Namrata Vaswan, Iowa State Unversty Aprl 24, 204 Hdden Markov Model Defntons and Examples Defntons:. A hdden Markov model (HMM) refers to a set of hdden states X 0, X,..., X t,...,

More information

MACHINE APPLIED MACHINE LEARNING LEARNING. Gaussian Mixture Regression

MACHINE APPLIED MACHINE LEARNING LEARNING. Gaussian Mixture Regression 11 MACHINE APPLIED MACHINE LEARNING LEARNING MACHINE LEARNING Gaussan Mture Regresson 22 MACHINE APPLIED MACHINE LEARNING LEARNING Bref summary of last week s lecture 33 MACHINE APPLIED MACHINE LEARNING

More information

NUMERICAL DIFFERENTIATION

NUMERICAL DIFFERENTIATION NUMERICAL DIFFERENTIATION 1 Introducton Dfferentaton s a method to compute the rate at whch a dependent output y changes wth respect to the change n the ndependent nput x. Ths rate of change s called the

More information

EM and Structure Learning

EM and Structure Learning EM and Structure Learnng Le Song Machne Learnng II: Advanced Topcs CSE 8803ML, Sprng 2012 Partally observed graphcal models Mxture Models N(μ 1, Σ 1 ) Z X N N(μ 2, Σ 2 ) 2 Gaussan mxture model Consder

More information

Lower bounds for the Crossing Number of the Cartesian Product of a Vertex-transitive Graph with a Cycle

Lower bounds for the Crossing Number of the Cartesian Product of a Vertex-transitive Graph with a Cycle Lower bounds for the Crossng Number of the Cartesan Product of a Vertex-transtve Graph wth a Cyce Junho Won MIT-PRIMES December 4, 013 Abstract. The mnmum number of crossngs for a drawngs of a gven graph

More information

The Entire Solution Path for Support Vector Machine in Positive and Unlabeled Classification 1

The Entire Solution Path for Support Vector Machine in Positive and Unlabeled Classification 1 Abstract The Entre Souton Path for Support Vector Machne n Postve and Unabeed Cassfcaton 1 Yao Lmn, Tang Je, and L Juanz Department of Computer Scence, Tsnghua Unversty 1-308, FIT, Tsnghua Unversty, Bejng,

More information

More metrics on cartesian products

More metrics on cartesian products More metrcs on cartesan products If (X, d ) are metrc spaces for 1 n, then n Secton II4 of the lecture notes we defned three metrcs on X whose underlyng topologes are the product topology The purpose of

More information

Psychology 282 Lecture #24 Outline Regression Diagnostics: Outliers

Psychology 282 Lecture #24 Outline Regression Diagnostics: Outliers Psychology 282 Lecture #24 Outlne Regresson Dagnostcs: Outlers In an earler lecture we studed the statstcal assumptons underlyng the regresson model, ncludng the followng ponts: Formal statement of assumptons.

More information

G : Statistical Mechanics

G : Statistical Mechanics G25.2651: Statstca Mechancs Notes for Lecture 11 I. PRINCIPLES OF QUANTUM STATISTICAL MECHANICS The probem of quantum statstca mechancs s the quantum mechanca treatment of an N-partce system. Suppose the

More information

COMBINING SPATIAL COMPONENTS IN SEISMIC DESIGN

COMBINING SPATIAL COMPONENTS IN SEISMIC DESIGN Transactons, SMRT- COMBINING SPATIAL COMPONENTS IN SEISMIC DESIGN Mchae O Leary, PhD, PE and Kevn Huberty, PE, SE Nucear Power Technooges Dvson, Sargent & Lundy, Chcago, IL 6060 ABSTRACT Accordng to Reguatory

More information

Markov Chain Monte Carlo Lecture 6

Markov Chain Monte Carlo Lecture 6 where (x 1,..., x N ) X N, N s called the populaton sze, f(x) f (x) for at least one {1, 2,..., N}, and those dfferent from f(x) are called the tral dstrbutons n terms of mportance samplng. Dfferent ways

More information

Stanford University CS359G: Graph Partitioning and Expanders Handout 4 Luca Trevisan January 13, 2011

Stanford University CS359G: Graph Partitioning and Expanders Handout 4 Luca Trevisan January 13, 2011 Stanford Unversty CS359G: Graph Parttonng and Expanders Handout 4 Luca Trevsan January 3, 0 Lecture 4 In whch we prove the dffcult drecton of Cheeger s nequalty. As n the past lectures, consder an undrected

More information

LECTURE 21 Mohr s Method for Calculation of General Displacements. 1 The Reciprocal Theorem

LECTURE 21 Mohr s Method for Calculation of General Displacements. 1 The Reciprocal Theorem V. DEMENKO MECHANICS OF MATERIALS 05 LECTURE Mohr s Method for Cacuaton of Genera Dspacements The Recproca Theorem The recproca theorem s one of the genera theorems of strength of materas. It foows drect

More information

[WAVES] 1. Waves and wave forces. Definition of waves

[WAVES] 1. Waves and wave forces. Definition of waves 1. Waves and forces Defnton of s In the smuatons on ong-crested s are consdered. The drecton of these s (μ) s defned as sketched beow n the goba co-ordnate sstem: North West East South The eevaton can

More information

Negative Binomial Regression

Negative Binomial Regression STATGRAPHICS Rev. 9/16/2013 Negatve Bnomal Regresson Summary... 1 Data Input... 3 Statstcal Model... 3 Analyss Summary... 4 Analyss Optons... 7 Plot of Ftted Model... 8 Observed Versus Predcted... 10 Predctons...

More information

Laboratory 1c: Method of Least Squares

Laboratory 1c: Method of Least Squares Lab 1c, Least Squares Laboratory 1c: Method of Least Squares Introducton Consder the graph of expermental data n Fgure 1. In ths experment x s the ndependent varable and y the dependent varable. Clearly

More information

A new P system with hybrid MDE- k -means algorithm for data. clustering. 1 Introduction

A new P system with hybrid MDE- k -means algorithm for data. clustering. 1 Introduction Wesun, Lasheng Xang, Xyu Lu A new P system wth hybrd MDE- agorthm for data custerng WEISUN, LAISHENG XIANG, XIYU LIU Schoo of Management Scence and Engneerng Shandong Norma Unversty Jnan, Shandong CHINA

More information

XII.3 The EM (Expectation-Maximization) Algorithm

XII.3 The EM (Expectation-Maximization) Algorithm XII.3 The EM (Expectaton-Maxzaton) Algorth Toshnor Munaata 3/7/06 The EM algorth s a technque to deal wth varous types of ncoplete data or hdden varables. It can be appled to a wde range of learnng probles

More information

Time-Varying Systems and Computations Lecture 6

Time-Varying Systems and Computations Lecture 6 Tme-Varyng Systems and Computatons Lecture 6 Klaus Depold 14. Januar 2014 The Kalman Flter The Kalman estmaton flter attempts to estmate the actual state of an unknown dscrete dynamcal system, gven nosy

More information

3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X

3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X Statstcs 1: Probablty Theory II 37 3 EPECTATION OF SEVERAL RANDOM VARIABLES As n Probablty Theory I, the nterest n most stuatons les not on the actual dstrbuton of a random vector, but rather on a number

More information

Finite Mixture Models and Expectation Maximization. Most slides are from: Dr. Mario Figueiredo, Dr. Anil Jain and Dr. Rong Jin

Finite Mixture Models and Expectation Maximization. Most slides are from: Dr. Mario Figueiredo, Dr. Anil Jain and Dr. Rong Jin Fnte Mxture Models and Expectaton Maxmzaton Most sldes are from: Dr. Maro Fgueredo, Dr. Anl Jan and Dr. Rong Jn Recall: The Supervsed Learnng Problem Gven a set of n samples X {(x, y )},,,n Chapter 3 of

More information

IV. Performance Optimization

IV. Performance Optimization IV. Performance Optmzaton A. Steepest descent algorthm defnton how to set up bounds on learnng rate mnmzaton n a lne (varyng learnng rate) momentum learnng examples B. Newton s method defnton Gauss-Newton

More information

Global Sensitivity. Tuesday 20 th February, 2018

Global Sensitivity. Tuesday 20 th February, 2018 Global Senstvty Tuesday 2 th February, 28 ) Local Senstvty Most senstvty analyses [] are based on local estmates of senstvty, typcally by expandng the response n a Taylor seres about some specfc values

More information

Estimation: Part 2. Chapter GREG estimation

Estimation: Part 2. Chapter GREG estimation Chapter 9 Estmaton: Part 2 9. GREG estmaton In Chapter 8, we have seen that the regresson estmator s an effcent estmator when there s a lnear relatonshp between y and x. In ths chapter, we generalzed the

More information

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4)

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4) I. Classcal Assumptons Econ7 Appled Econometrcs Topc 3: Classcal Model (Studenmund, Chapter 4) We have defned OLS and studed some algebrac propertes of OLS. In ths topc we wll study statstcal propertes

More information

ON AUTOMATIC CONTINUITY OF DERIVATIONS FOR BANACH ALGEBRAS WITH INVOLUTION

ON AUTOMATIC CONTINUITY OF DERIVATIONS FOR BANACH ALGEBRAS WITH INVOLUTION European Journa of Mathematcs and Computer Scence Vo. No. 1, 2017 ON AUTOMATC CONTNUTY OF DERVATONS FOR BANACH ALGEBRAS WTH NVOLUTON Mohamed BELAM & Youssef T DL MATC Laboratory Hassan Unversty MORO CCO

More information

Laboratory 3: Method of Least Squares

Laboratory 3: Method of Least Squares Laboratory 3: Method of Least Squares Introducton Consder the graph of expermental data n Fgure 1. In ths experment x s the ndependent varable and y the dependent varable. Clearly they are correlated wth

More information

Short-Term Load Forecasting for Electric Power Systems Using the PSO-SVR and FCM Clustering Techniques

Short-Term Load Forecasting for Electric Power Systems Using the PSO-SVR and FCM Clustering Techniques Energes 20, 4, 73-84; do:0.3390/en40073 Artce OPEN ACCESS energes ISSN 996-073 www.mdp.com/journa/energes Short-Term Load Forecastng for Eectrc Power Systems Usng the PSO-SVR and FCM Custerng Technques

More information

REDUCTION OF CORRELATION COMPUTATION IN THE PERMUTATION OF THE FREQUENCY DOMAIN ICA BY SELECTING DOAS ESTIMATED IN SUBARRAYS

REDUCTION OF CORRELATION COMPUTATION IN THE PERMUTATION OF THE FREQUENCY DOMAIN ICA BY SELECTING DOAS ESTIMATED IN SUBARRAYS 15th European Sgna Processng Conference (EUSIPCO 27), Poznan, Poand, September 3-7, 27, copyrght by EURASIP REDUCTION OF CORRELATION COMPUTATION IN THE PERMUTATION OF THE FREQUENCY DOMAIN ICA BY SELECTING

More information

Research Article Green s Theorem for Sign Data

Research Article Green s Theorem for Sign Data Internatonal Scholarly Research Network ISRN Appled Mathematcs Volume 2012, Artcle ID 539359, 10 pages do:10.5402/2012/539359 Research Artcle Green s Theorem for Sgn Data Lous M. Houston The Unversty of

More information

Classification as a Regression Problem

Classification as a Regression Problem Target varable y C C, C,, ; Classfcaton as a Regresson Problem { }, 3 L C K To treat classfcaton as a regresson problem we should transform the target y nto numercal values; The choce of numercal class

More information

Aggregation of Social Networks by Divisive Clustering Method

Aggregation of Social Networks by Divisive Clustering Method ggregaton of Socal Networks by Dvsve Clusterng Method mne Louat and Yves Lechaveller INRI Pars-Rocquencourt Rocquencourt, France {lzennyr.da_slva, Yves.Lechevaller, Fabrce.Ross}@nra.fr HCSD Beng October

More information

we have E Y x t ( ( xl)) 1 ( xl), e a in I( Λ ) are as follows:

we have E Y x t ( ( xl)) 1 ( xl), e a in I( Λ ) are as follows: APPENDICES Aendx : the roof of Equaton (6 For j m n we have Smary from Equaton ( note that j '( ( ( j E Y x t ( ( x ( x a V ( ( x a ( ( x ( x b V ( ( x b V x e d ( abx ( ( x e a a bx ( x xe b a bx By usng

More information

Comparison of Regression Lines

Comparison of Regression Lines STATGRAPHICS Rev. 9/13/2013 Comparson of Regresson Lnes Summary... 1 Data Input... 3 Analyss Summary... 4 Plot of Ftted Model... 6 Condtonal Sums of Squares... 6 Analyss Optons... 7 Forecasts... 8 Confdence

More information

Numerical Investigation of Power Tunability in Two-Section QD Superluminescent Diodes

Numerical Investigation of Power Tunability in Two-Section QD Superluminescent Diodes Numerca Investgaton of Power Tunabty n Two-Secton QD Superumnescent Dodes Matta Rossett Paoo Bardea Ivo Montrosset POLITECNICO DI TORINO DELEN Summary 1. A smpfed mode for QD Super Lumnescent Dodes (SLD)

More information

Week3, Chapter 4. Position and Displacement. Motion in Two Dimensions. Instantaneous Velocity. Average Velocity

Week3, Chapter 4. Position and Displacement. Motion in Two Dimensions. Instantaneous Velocity. Average Velocity Week3, Chapter 4 Moton n Two Dmensons Lecture Quz A partcle confned to moton along the x axs moves wth constant acceleraton from x =.0 m to x = 8.0 m durng a 1-s tme nterval. The velocty of the partcle

More information

Computation of Higher Order Moments from Two Multinomial Overdispersion Likelihood Models

Computation of Higher Order Moments from Two Multinomial Overdispersion Likelihood Models Computaton of Hgher Order Moments from Two Multnomal Overdsperson Lkelhood Models BY J. T. NEWCOMER, N. K. NEERCHAL Department of Mathematcs and Statstcs, Unversty of Maryland, Baltmore County, Baltmore,

More information

Note 2. Ling fong Li. 1 Klein Gordon Equation Probablity interpretation Solutions to Klein-Gordon Equation... 2

Note 2. Ling fong Li. 1 Klein Gordon Equation Probablity interpretation Solutions to Klein-Gordon Equation... 2 Note 2 Lng fong L Contents Ken Gordon Equaton. Probabty nterpretaton......................................2 Soutons to Ken-Gordon Equaton............................... 2 2 Drac Equaton 3 2. Probabty nterpretaton.....................................

More information

Lab 2e Thermal System Response and Effective Heat Transfer Coefficient

Lab 2e Thermal System Response and Effective Heat Transfer Coefficient 58:080 Expermental Engneerng 1 OBJECTIVE Lab 2e Thermal System Response and Effectve Heat Transfer Coeffcent Warnng: though the experment has educatonal objectves (to learn about bolng heat transfer, etc.),

More information

Problem Set 9 Solutions

Problem Set 9 Solutions Desgn and Analyss of Algorthms May 4, 2015 Massachusetts Insttute of Technology 6.046J/18.410J Profs. Erk Demane, Srn Devadas, and Nancy Lynch Problem Set 9 Solutons Problem Set 9 Solutons Ths problem

More information

x = , so that calculated

x = , so that calculated Stat 4, secton Sngle Factor ANOVA notes by Tm Plachowsk n chapter 8 we conducted hypothess tests n whch we compared a sngle sample s mean or proporton to some hypotheszed value Chapter 9 expanded ths to

More information

Lecture 10 Support Vector Machines II

Lecture 10 Support Vector Machines II Lecture 10 Support Vector Machnes II 22 February 2016 Taylor B. Arnold Yale Statstcs STAT 365/665 1/28 Notes: Problem 3 s posted and due ths upcomng Frday There was an early bug n the fake-test data; fxed

More information

Mean Field / Variational Approximations

Mean Field / Variational Approximations Mean Feld / Varatonal Appromatons resented by Jose Nuñez 0/24/05 Outlne Introducton Mean Feld Appromaton Structured Mean Feld Weghted Mean Feld Varatonal Methods Introducton roblem: We have dstrbuton but

More information

On an Extension of Stochastic Approximation EM Algorithm for Incomplete Data Problems. Vahid Tadayon 1

On an Extension of Stochastic Approximation EM Algorithm for Incomplete Data Problems. Vahid Tadayon 1 On an Extenson of Stochastc Approxmaton EM Algorthm for Incomplete Data Problems Vahd Tadayon Abstract: The Stochastc Approxmaton EM (SAEM algorthm, a varant stochastc approxmaton of EM, s a versatle tool

More information

Number of cases Number of factors Number of covariates Number of levels of factor i. Value of the dependent variable for case k

Number of cases Number of factors Number of covariates Number of levels of factor i. Value of the dependent variable for case k ANOVA Model and Matrx Computatons Notaton The followng notaton s used throughout ths chapter unless otherwse stated: N F CN Y Z j w W Number of cases Number of factors Number of covarates Number of levels

More information

Interference Alignment and Degrees of Freedom Region of Cellular Sigma Channel

Interference Alignment and Degrees of Freedom Region of Cellular Sigma Channel 2011 IEEE Internatona Symposum on Informaton Theory Proceedngs Interference Agnment and Degrees of Freedom Regon of Ceuar Sgma Channe Huaru Yn 1 Le Ke 2 Zhengdao Wang 2 1 WINLAB Dept of EEIS Unv. of Sc.

More information

A New Modified Gaussian Mixture Model for Color-Texture Segmentation

A New Modified Gaussian Mixture Model for Color-Texture Segmentation Journa of Computer Scence 7 (): 79-83, 0 ISSN 549-3636 0 Scence Pubcatons A New Modfed Gaussan Mxture Mode for Coor-Texture Segmentaton M. Sujartha and S. Annadura Department of Computer Scence and Engneerng,

More information

QUARTERLY OF APPLIED MATHEMATICS

QUARTERLY OF APPLIED MATHEMATICS QUARTERLY OF APPLIED MATHEMATICS Voume XLI October 983 Number 3 DIAKOPTICS OR TEARING-A MATHEMATICAL APPROACH* By P. W. AITCHISON Unversty of Mantoba Abstract. The method of dakoptcs or tearng was ntroduced

More information

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity LINEAR REGRESSION ANALYSIS MODULE IX Lecture - 30 Multcollnearty Dr. Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur 2 Remedes for multcollnearty Varous technques have

More information

Boundary Value Problems. Lecture Objectives. Ch. 27

Boundary Value Problems. Lecture Objectives. Ch. 27 Boundar Vaue Probes Ch. 7 Lecture Obectves o understand the dfference between an nta vaue and boundar vaue ODE o be abe to understand when and how to app the shootng ethod and FD ethod. o understand what

More information

Lossy Compression. Compromise accuracy of reconstruction for increased compression.

Lossy Compression. Compromise accuracy of reconstruction for increased compression. Lossy Compresson Compromse accuracy of reconstructon for ncreased compresson. The reconstructon s usually vsbly ndstngushable from the orgnal mage. Typcally, one can get up to 0:1 compresson wth almost

More information

Design and Optimization of Fuzzy Controller for Inverse Pendulum System Using Genetic Algorithm

Design and Optimization of Fuzzy Controller for Inverse Pendulum System Using Genetic Algorithm Desgn and Optmzaton of Fuzzy Controller for Inverse Pendulum System Usng Genetc Algorthm H. Mehraban A. Ashoor Unversty of Tehran Unversty of Tehran h.mehraban@ece.ut.ac.r a.ashoor@ece.ut.ac.r Abstract:

More information

Quantum Runge-Lenz Vector and the Hydrogen Atom, the hidden SO(4) symmetry

Quantum Runge-Lenz Vector and the Hydrogen Atom, the hidden SO(4) symmetry Quantum Runge-Lenz ector and the Hydrogen Atom, the hdden SO(4) symmetry Pasca Szrftgser and Edgardo S. Cheb-Terrab () Laboratore PhLAM, UMR CNRS 85, Unversté Le, F-59655, France () Mapesoft Let's consder

More information

Generalized Linear Methods

Generalized Linear Methods Generalzed Lnear Methods 1 Introducton In the Ensemble Methods the general dea s that usng a combnaton of several weak learner one could make a better learner. More formally, assume that we have a set

More information

Using T.O.M to Estimate Parameter of distributions that have not Single Exponential Family

Using T.O.M to Estimate Parameter of distributions that have not Single Exponential Family IOSR Journal of Mathematcs IOSR-JM) ISSN: 2278-5728. Volume 3, Issue 3 Sep-Oct. 202), PP 44-48 www.osrjournals.org Usng T.O.M to Estmate Parameter of dstrbutons that have not Sngle Exponental Famly Jubran

More information

Some results on a cross-section in the tensor bundle

Some results on a cross-section in the tensor bundle Hacettepe Journa of Matematcs and Statstcs Voume 43 3 214, 391 397 Some resuts on a cross-secton n te tensor bunde ydın Gezer and Murat tunbas bstract Te present paper s devoted to some resuts concernng

More information

Efficient MCMC Sampling with Implicit Shape Representations

Efficient MCMC Sampling with Implicit Shape Representations Effcent MCMC Sampng wth Impct Shape Representatons Jason Chang Massachusetts Insttute of Technoogy 32 Vassar St. Cambrdge, MA jchang7@csa.mt.edu John W. Fsher III Massachusetts Insttute of Technoogy 32

More information

Optimization of JK Flip Flop Layout with Minimal Average Power of Consumption based on ACOR, Fuzzy-ACOR, GA, and Fuzzy-GA

Optimization of JK Flip Flop Layout with Minimal Average Power of Consumption based on ACOR, Fuzzy-ACOR, GA, and Fuzzy-GA Journa of mathematcs and computer Scence 4 (05) - 5 Optmzaton of JK Fp Fop Layout wth Mnma Average Power of Consumpton based on ACOR, Fuzzy-ACOR, GA, and Fuzzy-GA Farshd Kevanan *,, A Yekta *,, Nasser

More information

Hidden Markov Models & The Multivariate Gaussian (10/26/04)

Hidden Markov Models & The Multivariate Gaussian (10/26/04) CS281A/Stat241A: Statstcal Learnng Theory Hdden Markov Models & The Multvarate Gaussan (10/26/04) Lecturer: Mchael I. Jordan Scrbes: Jonathan W. Hu 1 Hdden Markov Models As a bref revew, hdden Markov models

More information

Logistic Regression. CAP 5610: Machine Learning Instructor: Guo-Jun QI

Logistic Regression. CAP 5610: Machine Learning Instructor: Guo-Jun QI Logstc Regresson CAP 561: achne Learnng Instructor: Guo-Jun QI Bayes Classfer: A Generatve model odel the posteror dstrbuton P(Y X) Estmate class-condtonal dstrbuton P(X Y) for each Y Estmate pror dstrbuton

More information

MODEL TUNING WITH THE USE OF HEURISTIC-FREE GMDH (GROUP METHOD OF DATA HANDLING) NETWORKS

MODEL TUNING WITH THE USE OF HEURISTIC-FREE GMDH (GROUP METHOD OF DATA HANDLING) NETWORKS MODEL TUNING WITH THE USE OF HEURISTIC-FREE (GROUP METHOD OF DATA HANDLING) NETWORKS M.C. Schrver (), E.J.H. Kerchoffs (), P.J. Water (), K.D. Saman () () Rswaterstaat Drecte Zeeand () Deft Unversty of

More information

8 : Learning in Fully Observed Markov Networks. 1 Why We Need to Learn Undirected Graphical Models. 2 Structural Learning for Completely Observed MRF

8 : Learning in Fully Observed Markov Networks. 1 Why We Need to Learn Undirected Graphical Models. 2 Structural Learning for Completely Observed MRF 10-708: Probablstc Graphcal Models 10-708, Sprng 2014 8 : Learnng n Fully Observed Markov Networks Lecturer: Erc P. Xng Scrbes: Meng Song, L Zhou 1 Why We Need to Learn Undrected Graphcal Models In the

More information

Suppose that there s a measured wndow of data fff k () ; :::; ff k g of a sze w, measured dscretely wth varable dscretzaton step. It s convenent to pl

Suppose that there s a measured wndow of data fff k () ; :::; ff k g of a sze w, measured dscretely wth varable dscretzaton step. It s convenent to pl RECURSIVE SPLINE INTERPOLATION METHOD FOR REAL TIME ENGINE CONTROL APPLICATIONS A. Stotsky Volvo Car Corporaton Engne Desgn and Development Dept. 97542, HA1N, SE- 405 31 Gothenburg Sweden. Emal: astotsky@volvocars.com

More information