BIOINFORMATICS: PAST, PRESENT AND FUTURE. Susan R. Wilson Mathematical Sciences Institute, Australian National University, Australia

Size: px
Start display at page:

Download "BIOINFORMATICS: PAST, PRESENT AND FUTURE. Susan R. Wilson Mathematical Sciences Institute, Australian National University, Australia"

Transcription

1 BIOINFORMATICS: PAST, PRESENT AND FUTURE Susan R. Wlson Mathematcal Scences Insttute, Australan Natonal Unversty, Australa Keywords: Bonformatcs, bologcal sequence analyss, sequence algnment, hdden Markov models, evolutonary models, phylogenetc reconstructon, gene expresson analyss, mcroarray data, desgn and analyss of mcroarray experments, systems bology, federated data ntegraton, bo-grds, proteomcs, functonal genomcs, comparatve genomcs Contents 1. Introducton 2. Bologcal sequence analyss 2.1. Background 2.2. Scorng Systems 2.3. Sequence Algnment 2.4. Assessment of Sgnfcance Overvew of Basc Theory Complcatons and Developments 3. Applcatons of hdden Markov models n bonformatcs 4. Evolutonary models and phylogenetc reconstructon 5. Gene expresson analyss 5.1. Background 5.2. Issues Concernng Outcome Measures 5.3. Expermental Desgn 5.4. Analyss of Mcroarray Data. 6. Statstcal methods n proteomcs 7. Systems bology 8. Federated data ntegraton and bo-grds 9. Dscusson Glossary Bblography Bographcal Sketch Summary Bonformatcs s a newly emergng feld that s ncreasngly beng wdely vewed as a more fundamental dscplne at the ntersecton of the bologcal scences wth the mathematcal, statstcal, and physcal scences and chemstry and nformaton technology. Ths revew chapter touches brefly on those aspects of bonformatcs that wll be of nterest to bometrcans, ncludng bologcal sequence analyss and algnment (Secton 2), applcatons of hdden Markov models n bonformatcs (Secton 3), evolutonary models and phylogenetc reconstructon (Secton 4), gene expresson analyss and mcroarrays (Secton 5), proteomcs (Secton 6) systems bology (Secton 7) and federated data ntegraton and bo-grds (Secton 8), fnshng wth a bref dscusson hghlghtng the exctng future of the feld durng the comng decades.

2 1. Introducton Bonformatcs s an emergng feld that was once consdered to be the part of computatonal bology that explctly dealt wth the management of the ncreasng number of large databases, ncludng methods for data retreval and analyses, and algorthms for sequence smlarty searches, structural predctons, functonal predctons and comparsons and so forth. There has been phenomenal growth of lfe scence databases. For example, the most wdely used nucleotde sequence database s Genbank that s mantaned by the Natonal Center for Botechnology Informaton (NCBI) of the US Natonal Lbrary of Medcne; as of February 2003 t contaned 28.5 bllon nucleotdes from 22.3 mllon sequences. Its sze contnues to grow exponentally as more genomes are beng sequenced. However, there s a very large gap (that wll take a long tme to fll) between our knowledge of the functonng of the genome and the generaton (and storng) of raw genomc data. Bonformatcs nvolves the analyss of bologcal data. So very recently, the feld of bonformatcs has been rapdly evolvng, not only due to the mpact of the varous genome proects, but also wth the development of expermental technologes, such as mcroarrays for gene expresson analyses and mass spectrometry for detecton of proten-proten nteractons. Currently bonformatcs s beng ncreasngly wdely vewed as a more fundamental dscplne that also encompasses mathematcs, statstcs, physcs and chemstry. Further, the feld s already lookng forward to a systems bology approach and to smulatons of whole cells wth ncorporaton of more levels of complexty. A recent edtoral n the ournal Bonformatcs noted that a maor upcomng challenge for the bonformatcs communty [s] to adopt a more statstcal way of thnkng and to nteract more closely wth statstcans. The stated goal for many researchers s for developments n Bonformatcs to be focused at fndng the fundamental laws that govern bologcal systems, as n physcs. However, f such laws exst, they are a long way from beng determned for bologcal systems. Instead the current am s to fnd nsghtful ways to model lmted components of bologcal systems and to create tools whch bologsts can use to analyze data. Examples nclude tools for statstcal assessment of the smlarty between two or more DNA sequences or proten sequences, for fndng genes n genomc DNA, for quanttatve analyss of functonal genomcs data, and for estmatng dfferences n how genes are expressed n say dfferent tssues, for analyss and comparson of genomes from dfferent speces, for phylogenetc analyss, and for DNA sequence analyss and assembly. Tools such as these nvolve statstcal modelng of bologcal systems. Although the most relable way to determne a bologcal molecule s structure or functon s by drect expermentaton, there s much that can be acheved n vtro,.e. by obtanng the DNA sequence of the gene correspondng to an RNA or proten and analyzng t, rather than the more laborous fndng of ts structure or functon by drect expermentaton. Much bologcal data arse from mechansms that have a substantal probablstc component, the most sgnfcant beng the many random processes nherent n bologcal evoluton, and also from randomness n the samplng process used to collect the data. Another source of varablty or randomness s ntroduced by the

3 botechnologcal procedures and experments used to generate the data. So the basc goal s to dstngush the bologcal sgnal from the nose. Today, as expermental technques are beng developed for studyng genome wde patterns, such as expresson arrays, the need to approprately deal wth the nherent varablty has been multpled astronomcally. For example, we have progressed from studyng one or a few genes n comparatve solaton to beng able to evaluate smultaneously thousands of genes (or expressed sequence tags) Not only must methodologes be developed whch scale up to handle the enormous data sets generated n the post-genomc era, they need to become more senstve to the underlyng bologcal knowledge and understandng of the mechansms that generate the data. For bometrcans, research has reached an exctng and challengng stage at the nterface of computatonal statstcs and bology. The need for novel approaches to handle the new genome-wde data (ncludng that generated by mcroarrays) has concded wth a perod of dramatc change n approaches to statstcal methods and thnkng. Ths quantum change has been brought about, or even has been drven by, the potental of ever more ncreasng computng power. What was thought to be ntractable n the past s now feasble, and so new methodologes need to be developed and appled. Unfortunately too many of the current practces n the bologcal scences rely on methods developed when computatonal resources were very lmtng and are often ether (a) smple extensons of methods for workng wth one or a few outcome measures, and do not work well when there are thousands of outcome measures, or (b) ad-hoc methods (that are commonly referred to as statstcal or computatonal, or more recently data mnng! methods) that make many assumptons for whch there s often no (bologcal) ustfcaton. The challenge now s to creatvely combne the power of the computer wth relevant bologcal and stochastc process knowledge to derve novel approaches and models, usng mnmal assumptons, and whch can be appled at genomc wde scales. Such technques comprse the foundaton of bonformatc methods n the future. 2. Bologcal Sequence Analyss 2.1. Background Wth the advent of whole genomes becomng avalable for many speces, as well as many other (usually extremely large) databases, such as proten sequence databases, ncreasngly bologsts are askng questons about whether some query sequence of nterest to her or hm s sgnfcantly smlar to some other sequence/s n one (or more) of these databases. If some of these smlar sequences are lkely to correspond to genes or protens wth known functons, then by assocaton t s nferred that the functon of the query sequence s related. Essentally an evolutonary model s assumed where the two sequences have dverged from some common ancestor by the process of mutaton and selecton. Mutatons can be such as to change one nucleotde to another n a DNA sequence or change one amno acd to another n a proten. Natural selecton s a type of screenng process that favors neutral or advantageous mutatons, nsertons and deletons compared wth more deleterous mutatons, nsertons and deletons.

4 Consder the followng two sequences u1, u2,, un 1 v1, v2,, vn 2 where u and v represent the elements of the set {A,G,C,T} n the case of DNA sequences, and have the letters from the set of 20 amno acds n the case of proten sequences. Comparsons of two sequences usually cannot dstngush between whether a deleton has occurred n one sequence or an nserton n the other. Insertons and deletons are referred to as gaps, and n scorng an algnment these are penalzed by assgnng them a cost. The most wdely used program s BLAST (Basc Local Algnment Search Tool) and ts varants, and the more heurstc algorthm FASTA s also wdely used. Ths method starts wth an ntal word search - fndng the best and longest contnuous set of ungapped words n a database - as a startng pont. The length of ths - the ntal k-tuple or word sze - s the prmary determnant of the speed and senstvty of the search. In the remander of ths secton we overvew sequence analyss from the vewpont of basc BLAST search, separately dscussng the three steps, namely () fndng a scorng system for comparng elements of two sequences () fndng optmal algnments, () assessng sgnfcance Scorng Systems For comparng DNA sequences, smple nucleotde scores can be used, such as + m for a match, m' otherwse (that may be equal to m ), correspondng to a smple evolutonary model n whch all nucleotdes are equally common and all substtutons are equally lkely. However, f the sequences of nterest code for proten, t s usually better to compare the proten translaton to amno acds, snce, after only a small amount of evolutonary change, f smple nucleotde substtutons are used there s less nformaton wth whch to deduce homology. Substtuton matrces are used for amno acd algnments. These are matrces n whch each possble resdue substtuton s gven a score reflectng the probablty that t s related to the correspondng resdue n the query. The algnment score wll be the sum of the scores for each poston n the algned sequence. These scores are obtaned as follows. The null hypothess s that u and v occur ndependently, and hence the probablty of the two sequences s pu p v. The alternatve hypothess s that the two sequences have dverged from the same ancestor. Denotng the probablty that u and v have evolved ndependently from w k as p uv then the probablty for the whole algnment s p uv. The rato of these two probabltes gves the lkelhood rato comparng the two hypotheses, and to obtan an addtve scorng system, logarthms are taken, so su (, v) = log( pu ) v p u p v (1)

5 For protens we have a matrx, and the commonly used scorng matrces are PAM and BLOSUM matrces. Elements n the accepted pont mutaton PAMn matrces are computed from a Markov chan model of the replacement of one amno acd by another followng ts ntal occurrence by mutaton, takng nto account the frequences of the amno acds, ther respectve mutabltes and the probabltes for replacement of amno acds. A PAM1 substtuton matrx has the property that t s derved from a Markov chan for whch the average probablty of a change from one amno acd to another n the chan s A PAMn substtuton matrx s found from the n th power of the Markov chan transton matrx that gave the PAM1 substtuton matrx. Now the elements n the (symmetrc) BLOSUMx matrx are found by clusterng proten sequences nto blocks of algned sequences such that they have x% dentty, and then an estmated and rounded log lkelhood rato s determned. The BLAST web page at NCBI gves the matrces for x = 45, 62 & 80. The 20 dagonal elements are all postve, whle the 190 dstnct off-dagonal elements (.e. parwse comparsons) are generally negatve. Note, the larger n s for a PAM matrx the longer s the evolutonary dstance, whereas for BLOSUM matrces, smaller values of x correspond to longer evolutonary dstance. In choosng to use, say a PAMn (or BLOSUMx) substtuton matrx, an assumpton s beng made about the value of n (or x) and hence an mplct assumpton about how long n the past the most recent common ancestor exsted. Such an assumpton s most unlkely to be correct, especally snce databases contan nformaton about many speces, and the most recent ancestor of these varous speces wth the speces of the query sequence mght vary markedly. Commonly chosen values are x = 62, n = 120 or 250. Problems can arse f too small a value of n s chosen. For example, suppose that wth the correct choce n' the probablty that u and v have evolved ndependently from w k s p ' uv, then the mean score s proportonal to p ' uv log ( p' ) uv p u p v (2), As n ', the mean score becomes negatve, and the more negatve ths value, the more lkely t s that the null hypothess wll be accepted. For example, n the smple symmetrc model (descrbed later), f the value n = 100 s chosen, the mean score s negatve for n' 193. Alternatvely f too small a value of n s chosen the test s less powerful. Note that tryng to overcome ths problem by choosng a varety of substtuton matrces leads to one of the many multple testng problems nherent n the use of BLAST (and dscussed further below). When comparng non-codng DNA sequences, a more complex model than the above smple evolutonary model, n whch transtons are more lkely than transversons, yelds dfferent msmatch scores for transtons and transversons. The best scores to use wll depend on whether one s comparng relatvely dverged or closely related sequences.

6 2.3. Sequence Algnment Gven a scorng system, next we need an algorthm to fnd an optmal algnment for a par of sequences. Algnments can be global or local. To fnd the global algnment of proten sequences, Needleman & Wunch developed a dynamc programmng algorthm, and there have been many extensons. The most wdely used local algnment method s that gven by Smth & Waterman (S-W) by frst constructng a matrx M ndexed by and correspondng to u and v n our sequences. Let S be the score of the optmal algnment between the subsequence up to u, and the subsequence up to v. Then the S- W algorthm s gven by S max(0, S + s( u, v ), S = 1, 1 + gap penalty, 1,, 1 + gap penalty) S The score S can never become negatve, and hence always there wll be areas of smlartes even f there are long msmatches or gaps n between Assessment of Sgnfcance Overvew of Basc Theory Havng obtaned an optmal algnment, the next step s to assess ts sgnfcance, namely s ths a sgnfcantly good algnment,.e. match? In other words how does one test the null hypothess that there s no sgnfcant homology between the two sequences aganst the alternatve hypothess that there s sgnfcant homology? The basc theory underlyng the answer to ths queston was due to Sam Karln, and s based on random walk theory, on renewal theory and on asymptotc dstrbuton theory. Consder Table 1 that gves a smple case of two algned DNA sequences (of equal length). The bold poston numbers are those n whch the same nucleotde occurs n both sequences (.e. matches occur). Suppose we gve a score +m for a match and m where the nucleotdes are dfferent. Comparng the two sequences (startng from the left, wth value zero) we can fnd the accumulated value of the scores, namely m, 2m, m, 0, m,... (gven n bottom row of Table 1; alternatvely a graph representaton could be used). Ths s a smple random walk wth steps ± m. Ladder ponts are those that are lower than any prevously reached pont, and occur when the values 0, m, 2m, are frst reached n ths example at postons L 1 = 4, L 2 = 5, L 3 = 8,... (all shown n bold n the bottom row). (3) The term upwards excurson descrbes that part of the walk between two consecutve ladder ponts, L and L + 1, and nterest s n the maxmum heght startng from the frst ladder pont L. (Where ladder ponts fall at consecutve postons, such as 8 & 9 above, then ths value s zero.) The maxmum heght acheved by the varous excursons s the

7 bass for the test statstc to evaluate the homology of the two sequences. In the above example the heghts of the varous upward excursons are, respectvely, 0, 1, 0, 0, 3, 4, wth a maxmum value of G G T A C T G G G G G G G G C C T T C C m 2m m 0 -m 0 -m -2m -3m -4m A A C T T T T T C C A A C C G G T T A A -3m -2m -m -2m -3m -4m -3m -2m -3m -4m C G G G T A A A A T G G G G T A T C C C -5m -4m -3m -2m -m -2m -3m -4m -5m -6m Table 1: Two algned DNA sequences For comparson of two proten sequences, the scores are obtaned from the approprate (or selected) substtuton matrx, and the random walk s less regular than n the above example. Suppose the substtuton matrx used allocates a score S (, ) to a match of amno acds and. The null hypothess s that the two sequences are random wth respect to one another and under ths hypothess the mean score s, S (, pp ) where p s the frequency of amno acd. For the BLAST procedure t s necessary that ths mean score be negatve, and we assume t s, so that the general trend of the random walk (startng from the left) s downward, passng through a sequence of ncreasngly negatve ladder ponts. The test statstc s the heght Y max of the largest upwards excurson followng a ladder pont relatve the poston of that ladder pont, before the walk reaches the next ladder pont. It can be shown, usng standard random walk theory, that f Y s the maxmum heght acheved by the walk after reachng any ladder pont and before reachng the next, then Pr( Y y) ~ Ce λ y (4) where λ s the unque postve soluton of the moment generatng equaton S(, ) ppe λ = 1 (5), Ths s referred to as a geometrc-lke dstrbuton. For the smple random walk above, λ = log (1 p) p and lettng p (<1/2) be the probablty of a postve step, we obtan [ ] C = 1 e λ. The formula for C for a gven substtuton matrx s more complcated (and

8 not gven here). We are nterested n Y max, the largest of these maxmum heghts. Unfortunately there s no lmtng probablty dstrbuton for Y max, although bounds can be found. Next consder the number of ladder ponts reached by the walk before t fnshes (after n steps have been taken where n s the length of the sequences beng compared). The theory s complex, but for our purposes the number of ladder ponts can be taken as na where A s the mean number of steps taken from one ladder pont to the next (and can be found by random walk theory or by applcaton of Wald's dentty), and also s taken here as gven. Then t can be shown that the P-value assocated wth an observed value y max of Y max s approxmately max ( Ce λ y ) na 1 1 (6) and n BLAST ths s approxmately 1 exp( e s ), where s = λ ymax log nk and λ K = Ce A. When s s large, P-value ~ e s. Note that f the elements n the chosen substtuton matrx are all multpled by some constant k, then t can be shown that the P-value s not altered. Hence the quantty s s called a normalzed score. The BLAST software also gves an Expect value. Ths s the mean number of walks to reach a heght equal to or hgher than the maxmum heght observed when the null hypothess s true. When the P-value s small, ths s close to the P-value, otherwse t s close to log(1 P-value). Note that the P-values are qute senstve to the somewhat arbtrary numercal values of K and λ Bblography TO ACCESS ALL THE 24 PAGES OF THIS CHAPTER, Vst: Bald, P. and Brunak, S. (2001). Bonformatcs: The Machne Learnng Approach, 2 nd Ed., 452 pp, USA: The MIT Press. [Ths gves more background to the materal of Sectons 3 and 4, and covers areas of bonformatcs that have appled the machne learnng approach.] Durbn, R., Eddy, S., Krogh, A. and Mtchson, G. (2000). Bologcal Sequence Analyss: Probablstc Models of Protens and Nuclec Acds, 356 pp, UK: Cambrdge Unversty Press. [Ths concentrates on applcatons of hdden Markov models n bonformatcs.] Elston, R., Olson, J., and Palmer, L. (2002). (eds) Bostatstcal Genetcs and Genetc Epdemology, 831 pp, UK: John Wley. [Ths ncludes artcles on bonformatcs, on DNA sequences, and on sequence analyss; updates wll appear n the Encyclopeda of Bostatstcs 2 nd Edton, 2004.]

9 Ewens W.J. and Grant, G.R. (2001). Statstcal Methods n Bonformatcs. 476 pp, USA: Sprnger- Verlag. [Ths ncludes more background on the materal n Sectons 2, 3 and 4, especally the theory underpnnng BLAST.] Mandonald, J.H., Pttelkow, Y.E. and Wlson, S.R. (2003). Some consderatons for the desgn of mcroarray experments. Scence and Statstcs, Vol. 40 (ed. D.R. Goldsten), USA: Insttute of Mathematcal Statstcs. [Ths dscusses ssues relevant for the desgn of mcroarray experments wth emphass on uses of replcaton and the mportance of dentfyng maor sources of varaton.] Speed, T. (2003). (ed.) Statstcal Analyss of Gene Expresson Mcroarray Data, 222 pp, UK: Chapman & Hall. [Ths s a small collecton of papers manly concerned wth analyss of gene expresson data.] Bographcal Sketch Sue Wlson was born s Sydney, Australa; she obtaned her B.Sc. from the Unversty of Sydney (1969), followed by her Ph.D. from the ANU (awarded 1973). Sue then spent two years as a Lecturer n the Department of Probablty and Statstcs at Sheffeld Unversty. She returned to ANU towards the end of 1974 and has snce held a varety of postons there, both n some of the Statstcal groupngs, as well as at the Natonal Centre for Epdemology and Populaton Health. She s currently Professor, Statstcal Scence Program, Centre for Mathematcs and ts Applcatons, Mathematcal Scences, Insttute and Co- Drector, Centre for Bonformaton Scence (ont wth John Curtn School of Medcal Research), at the Australan Natonal Unversty (ANU). Sue has over 150 publcatons n bometry and appled statstcs, wth a partcular emphass recently on statstcal genetcs/genomcs and bonformatcs. These papers have arsen from her extensve consultng experence n the bologcal, socal, and medcal scences, leadng to statstcal modelng developments to answer substantve research questons n these dscplnes. Sue s an elected member of the Internatonal Statstcal Insttute, a Fellow of the Amercan Statstcal Assocaton and a Fellow of the Insttute of Mathematcal Statstcs (IMS). She was Presdent, Internatonal Bometrc Socety, 1999 & 2000 (Vce Presdent, IBS, 1998, 2001). Currently she s Assocate Edtor, Annals of Human Genetcs; Assocate Edtor, Computatonal Statstcs and Data Analyss; Member, Edtoral Board, Statstcal Methods n Medcal Research; Member, Edtoral Board, Behavor Genetcs.

Computational Biology Lecture 8: Substitution matrices Saad Mneimneh

Computational Biology Lecture 8: Substitution matrices Saad Mneimneh Computatonal Bology Lecture 8: Substtuton matrces Saad Mnemneh As we have ntroduced last tme, smple scorng schemes lke + or a match, - or a msmatch and -2 or a gap are not justable bologcally, especally

More information

be the i th symbol in x and

be the i th symbol in x and 2 Parwse Algnment We represent sequences b strngs of alphetc letters. If we recognze a sgnfcant smlart between a new sequence and a sequence out whch somethng s alread know, we can transfer nformaton out

More information

I529: Machine Learning in Bioinformatics (Spring 2017) Markov Models

I529: Machine Learning in Bioinformatics (Spring 2017) Markov Models I529: Machne Learnng n Bonformatcs (Sprng 217) Markov Models Yuzhen Ye School of Informatcs and Computng Indana Unversty, Bloomngton Sprng 217 Outlne Smple model (frequency & profle) revew Markov chan

More information

x = , so that calculated

x = , so that calculated Stat 4, secton Sngle Factor ANOVA notes by Tm Plachowsk n chapter 8 we conducted hypothess tests n whch we compared a sngle sample s mean or proporton to some hypotheszed value Chapter 9 expanded ths to

More information

Simulated Power of the Discrete Cramér-von Mises Goodness-of-Fit Tests

Simulated Power of the Discrete Cramér-von Mises Goodness-of-Fit Tests Smulated of the Cramér-von Mses Goodness-of-Ft Tests Steele, M., Chaselng, J. and 3 Hurst, C. School of Mathematcal and Physcal Scences, James Cook Unversty, Australan School of Envronmental Studes, Grffth

More information

Maximum Likelihood Estimation

Maximum Likelihood Estimation Multple sequence algnment Parwse sequence algnment ( and ) Substtuton matrces Database searchng Maxmum Lelhood Estmaton Observaton: Data, D (HHHTHHTH) What process generated ths data? Alternatve hypothess:

More information

Search sequence databases 2 10/25/2016

Search sequence databases 2 10/25/2016 Search sequence databases 2 10/25/2016 The BLAST algorthms Ø BLAST fnds local matches between two sequences, called hgh scorng segment pars (HSPs). Step 1: Break down the query sequence and the database

More information

Department of Statistics University of Toronto STA305H1S / 1004 HS Design and Analysis of Experiments Term Test - Winter Solution

Department of Statistics University of Toronto STA305H1S / 1004 HS Design and Analysis of Experiments Term Test - Winter Solution Department of Statstcs Unversty of Toronto STA35HS / HS Desgn and Analyss of Experments Term Test - Wnter - Soluton February, Last Name: Frst Name: Student Number: Instructons: Tme: hours. Ads: a non-programmable

More information

Psychology 282 Lecture #24 Outline Regression Diagnostics: Outliers

Psychology 282 Lecture #24 Outline Regression Diagnostics: Outliers Psychology 282 Lecture #24 Outlne Regresson Dagnostcs: Outlers In an earler lecture we studed the statstcal assumptons underlyng the regresson model, ncludng the followng ponts: Formal statement of assumptons.

More information

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur Module 3 LOSSY IMAGE COMPRESSION SYSTEMS Verson ECE IIT, Kharagpur Lesson 6 Theory of Quantzaton Verson ECE IIT, Kharagpur Instructonal Objectves At the end of ths lesson, the students should be able to:

More information

Comparison of Regression Lines

Comparison of Regression Lines STATGRAPHICS Rev. 9/13/2013 Comparson of Regresson Lnes Summary... 1 Data Input... 3 Analyss Summary... 4 Plot of Ftted Model... 6 Condtonal Sums of Squares... 6 Analyss Optons... 7 Forecasts... 8 Confdence

More information

A PROBABILITY-DRIVEN SEARCH ALGORITHM FOR SOLVING MULTI-OBJECTIVE OPTIMIZATION PROBLEMS

A PROBABILITY-DRIVEN SEARCH ALGORITHM FOR SOLVING MULTI-OBJECTIVE OPTIMIZATION PROBLEMS HCMC Unversty of Pedagogy Thong Nguyen Huu et al. A PROBABILITY-DRIVEN SEARCH ALGORITHM FOR SOLVING MULTI-OBJECTIVE OPTIMIZATION PROBLEMS Thong Nguyen Huu and Hao Tran Van Department of mathematcs-nformaton,

More information

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA 4 Analyss of Varance (ANOVA) 5 ANOVA 51 Introducton ANOVA ANOVA s a way to estmate and test the means of multple populatons We wll start wth one-way ANOVA If the populatons ncluded n the study are selected

More information

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification E395 - Pattern Recognton Solutons to Introducton to Pattern Recognton, Chapter : Bayesan pattern classfcaton Preface Ths document s a soluton manual for selected exercses from Introducton to Pattern Recognton

More information

BOOTSTRAP METHOD FOR TESTING OF EQUALITY OF SEVERAL MEANS. M. Krishna Reddy, B. Naveen Kumar and Y. Ramu

BOOTSTRAP METHOD FOR TESTING OF EQUALITY OF SEVERAL MEANS. M. Krishna Reddy, B. Naveen Kumar and Y. Ramu BOOTSTRAP METHOD FOR TESTING OF EQUALITY OF SEVERAL MEANS M. Krshna Reddy, B. Naveen Kumar and Y. Ramu Department of Statstcs, Osmana Unversty, Hyderabad -500 007, Inda. nanbyrozu@gmal.com, ramu0@gmal.com

More information

Comparison of the Population Variance Estimators. of 2-Parameter Exponential Distribution Based on. Multiple Criteria Decision Making Method

Comparison of the Population Variance Estimators. of 2-Parameter Exponential Distribution Based on. Multiple Criteria Decision Making Method Appled Mathematcal Scences, Vol. 7, 0, no. 47, 07-0 HIARI Ltd, www.m-hkar.com Comparson of the Populaton Varance Estmators of -Parameter Exponental Dstrbuton Based on Multple Crtera Decson Makng Method

More information

Split alignment. Martin C. Frith April 13, 2012

Split alignment. Martin C. Frith April 13, 2012 Splt algnment Martn C. Frth Aprl 13, 2012 1 Introducton Ths document s about algnng a query sequence to a genome, allowng dfferent parts of the query to match dfferent parts of the genome. Here are some

More information

Chapter 5 Multilevel Models

Chapter 5 Multilevel Models Chapter 5 Multlevel Models 5.1 Cross-sectonal multlevel models 5.1.1 Two-level models 5.1.2 Multple level models 5.1.3 Multple level modelng n other felds 5.2 Longtudnal multlevel models 5.2.1 Two-level

More information

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur Analyss of Varance and Desgn of Experment-I MODULE VIII LECTURE - 34 ANALYSIS OF VARIANCE IN RANDOM-EFFECTS MODEL AND MIXED-EFFECTS EFFECTS MODEL Dr Shalabh Department of Mathematcs and Statstcs Indan

More information

Introduction ( Week 1-2) Course introduction A brief introduction to molecular biology A brief introduction to sequence comparison Part I: Algorithms

Introduction ( Week 1-2) Course introduction A brief introduction to molecular biology A brief introduction to sequence comparison Part I: Algorithms Course organzaton 1 Introducton Week 1-2) Course ntroducton A bref ntroducton to molecular bology A bref ntroducton to sequence comparson Part I: Algorthms for Sequence Analyss Week 3-8) Chapter 1-3 Models

More information

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity LINEAR REGRESSION ANALYSIS MODULE IX Lecture - 30 Multcollnearty Dr. Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur 2 Remedes for multcollnearty Varous technques have

More information

COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS

COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS Avalable onlne at http://sck.org J. Math. Comput. Sc. 3 (3), No., 6-3 ISSN: 97-537 COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS

More information

Comparative Studies of Law of Conservation of Energy. and Law Clusters of Conservation of Generalized Energy

Comparative Studies of Law of Conservation of Energy. and Law Clusters of Conservation of Generalized Energy Comparatve Studes of Law of Conservaton of Energy and Law Clusters of Conservaton of Generalzed Energy No.3 of Comparatve Physcs Seres Papers Fu Yuhua (CNOOC Research Insttute, E-mal:fuyh1945@sna.com)

More information

Chapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems

Chapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems Numercal Analyss by Dr. Anta Pal Assstant Professor Department of Mathematcs Natonal Insttute of Technology Durgapur Durgapur-713209 emal: anta.bue@gmal.com 1 . Chapter 5 Soluton of System of Lnear Equatons

More information

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur Analyss of Varance and Desgn of Experment-I MODULE VII LECTURE - 3 ANALYSIS OF COVARIANCE Dr Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur Any scentfc experment s performed

More information

Download the files protein1.txt and protein2.txt from the course website.

Download the files protein1.txt and protein2.txt from the course website. Queston 1 Dot plots Download the fles proten1.txt and proten2.txt from the course webste. Usng the dot plot algnment tool http://athena.boc.uvc.ca/workbench.php?tool=dotter&db=poxvrdae, algn the proten

More information

Basically, if you have a dummy dependent variable you will be estimating a probability.

Basically, if you have a dummy dependent variable you will be estimating a probability. ECON 497: Lecture Notes 13 Page 1 of 1 Metropoltan State Unversty ECON 497: Research and Forecastng Lecture Notes 13 Dummy Dependent Varable Technques Studenmund Chapter 13 Bascally, f you have a dummy

More information

Hidden Markov Models & The Multivariate Gaussian (10/26/04)

Hidden Markov Models & The Multivariate Gaussian (10/26/04) CS281A/Stat241A: Statstcal Learnng Theory Hdden Markov Models & The Multvarate Gaussan (10/26/04) Lecturer: Mchael I. Jordan Scrbes: Jonathan W. Hu 1 Hdden Markov Models As a bref revew, hdden Markov models

More information

Using the estimated penetrances to determine the range of the underlying genetic model in casecontrol

Using the estimated penetrances to determine the range of the underlying genetic model in casecontrol Georgetown Unversty From the SelectedWorks of Mark J Meyer 8 Usng the estmated penetrances to determne the range of the underlyng genetc model n casecontrol desgn Mark J Meyer Neal Jeffres Gang Zheng Avalable

More information

Computation of Higher Order Moments from Two Multinomial Overdispersion Likelihood Models

Computation of Higher Order Moments from Two Multinomial Overdispersion Likelihood Models Computaton of Hgher Order Moments from Two Multnomal Overdsperson Lkelhood Models BY J. T. NEWCOMER, N. K. NEERCHAL Department of Mathematcs and Statstcs, Unversty of Maryland, Baltmore County, Baltmore,

More information

Bayesian predictive Configural Frequency Analysis

Bayesian predictive Configural Frequency Analysis Psychologcal Test and Assessment Modelng, Volume 54, 2012 (3), 285-292 Bayesan predctve Confgural Frequency Analyss Eduardo Gutérrez-Peña 1 Abstract Confgural Frequency Analyss s a method for cell-wse

More information

Sampling Self Avoiding Walks

Sampling Self Avoiding Walks Samplng Self Avodng Walks James Farbanks and Langhao Chen December 3, 204 Abstract These notes present the self testng algorthm for samplng self avodng walks by Randall and Snclar[3] [4]. They are ntended

More information

ANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U)

ANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U) Econ 413 Exam 13 H ANSWERS Settet er nndelt 9 deloppgaver, A,B,C, som alle anbefales å telle lkt for å gøre det ltt lettere å stå. Svar er gtt . Unfortunately, there s a prntng error n the hnt of

More information

ANOVA. The Observations y ij

ANOVA. The Observations y ij ANOVA Stands for ANalyss Of VArance But t s a test of dfferences n means The dea: The Observatons y j Treatment group = 1 = 2 = k y 11 y 21 y k,1 y 12 y 22 y k,2 y 1, n1 y 2, n2 y k, nk means: m 1 m 2

More information

Design and Analysis of Algorithms

Design and Analysis of Algorithms Desgn and Analyss of Algorthms CSE 53 Lecture 4 Dynamc Programmng Junzhou Huang, Ph.D. Department of Computer Scence and Engneerng CSE53 Desgn and Analyss of Algorthms The General Dynamc Programmng Technque

More information

Week3, Chapter 4. Position and Displacement. Motion in Two Dimensions. Instantaneous Velocity. Average Velocity

Week3, Chapter 4. Position and Displacement. Motion in Two Dimensions. Instantaneous Velocity. Average Velocity Week3, Chapter 4 Moton n Two Dmensons Lecture Quz A partcle confned to moton along the x axs moves wth constant acceleraton from x =.0 m to x = 8.0 m durng a 1-s tme nterval. The velocty of the partcle

More information

Course organization. Part II: Algorithms for Network Biology (Week 12-16)

Course organization. Part II: Algorithms for Network Biology (Week 12-16) Course organzaton Introducton Week 1-2) Course ntroducton A bref ntroducton to molecular bology A bref ntroducton to sequence comparson Part I: Algorthms for Sequence Analyss Week 3-11) Chapter 1-3 Models

More information

Methods of Detecting Outliers in A Regression Analysis Model.

Methods of Detecting Outliers in A Regression Analysis Model. Methods of Detectng Outlers n A Regresson Analyss Model. Ogu, A. I. *, Inyama, S. C+, Achugamonu, P. C++ *Department of Statstcs, Imo State Unversty,Owerr +Department of Mathematcs, Federal Unversty of

More information

Limited Dependent Variables

Limited Dependent Variables Lmted Dependent Varables. What f the left-hand sde varable s not a contnuous thng spread from mnus nfnty to plus nfnty? That s, gven a model = f (, β, ε, where a. s bounded below at zero, such as wages

More information

The Study of Teaching-learning-based Optimization Algorithm

The Study of Teaching-learning-based Optimization Algorithm Advanced Scence and Technology Letters Vol. (AST 06), pp.05- http://dx.do.org/0.57/astl.06. The Study of Teachng-learnng-based Optmzaton Algorthm u Sun, Yan fu, Lele Kong, Haolang Q,, Helongang Insttute

More information

Testing for seasonal unit roots in heterogeneous panels

Testing for seasonal unit roots in heterogeneous panels Testng for seasonal unt roots n heterogeneous panels Jesus Otero * Facultad de Economía Unversdad del Rosaro, Colomba Jeremy Smth Department of Economcs Unversty of arwck Monca Gulett Aston Busness School

More information

Composite Hypotheses testing

Composite Hypotheses testing Composte ypotheses testng In many hypothess testng problems there are many possble dstrbutons that can occur under each of the hypotheses. The output of the source s a set of parameters (ponts n a parameter

More information

Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore

Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore Sesson Outlne Introducton to classfcaton problems and dscrete choce models. Introducton to Logstcs Regresson. Logstc functon and Logt functon. Maxmum Lkelhood Estmator (MLE) for estmaton of LR parameters.

More information

/ n ) are compared. The logic is: if the two

/ n ) are compared. The logic is: if the two STAT C141, Sprng 2005 Lecture 13 Two sample tests One sample tests: examples of goodness of ft tests, where we are testng whether our data supports predctons. Two sample tests: called as tests of ndependence

More information

Time-Varying Systems and Computations Lecture 6

Time-Varying Systems and Computations Lecture 6 Tme-Varyng Systems and Computatons Lecture 6 Klaus Depold 14. Januar 2014 The Kalman Flter The Kalman estmaton flter attempts to estmate the actual state of an unknown dscrete dynamcal system, gven nosy

More information

Joint Statistical Meetings - Biopharmaceutical Section

Joint Statistical Meetings - Biopharmaceutical Section Iteratve Ch-Square Test for Equvalence of Multple Treatment Groups Te-Hua Ng*, U.S. Food and Drug Admnstraton 1401 Rockvlle Pke, #200S, HFM-217, Rockvlle, MD 20852-1448 Key Words: Equvalence Testng; Actve

More information

CSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography

CSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography CSc 6974 and ECSE 6966 Math. Tech. for Vson, Graphcs and Robotcs Lecture 21, Aprl 17, 2006 Estmatng A Plane Homography Overvew We contnue wth a dscusson of the major ssues, usng estmaton of plane projectve

More information

Lecture 7: Boltzmann distribution & Thermodynamics of mixing

Lecture 7: Boltzmann distribution & Thermodynamics of mixing Prof. Tbbtt Lecture 7 etworks & Gels Lecture 7: Boltzmann dstrbuton & Thermodynamcs of mxng 1 Suggested readng Prof. Mark W. Tbbtt ETH Zürch 13 März 018 Molecular Drvng Forces Dll and Bromberg: Chapters

More information

Markov Chain Monte Carlo (MCMC), Gibbs Sampling, Metropolis Algorithms, and Simulated Annealing Bioinformatics Course Supplement

Markov Chain Monte Carlo (MCMC), Gibbs Sampling, Metropolis Algorithms, and Simulated Annealing Bioinformatics Course Supplement Markov Chan Monte Carlo MCMC, Gbbs Samplng, Metropols Algorthms, and Smulated Annealng 2001 Bonformatcs Course Supplement SNU Bontellgence Lab http://bsnuackr/ Outlne! Markov Chan Monte Carlo MCMC! Metropols-Hastngs

More information

STAT 405 BIOSTATISTICS (Fall 2016) Handout 15 Introduction to Logistic Regression

STAT 405 BIOSTATISTICS (Fall 2016) Handout 15 Introduction to Logistic Regression STAT 45 BIOSTATISTICS (Fall 26) Handout 5 Introducton to Logstc Regresson Ths handout covers materal found n Secton 3.7 of your text. You may also want to revew regresson technques n Chapter. In ths handout,

More information

A Robust Method for Calculating the Correlation Coefficient

A Robust Method for Calculating the Correlation Coefficient A Robust Method for Calculatng the Correlaton Coeffcent E.B. Nven and C. V. Deutsch Relatonshps between prmary and secondary data are frequently quantfed usng the correlaton coeffcent; however, the tradtonal

More information

Lecture 16 Statistical Analysis in Biomaterials Research (Part II)

Lecture 16 Statistical Analysis in Biomaterials Research (Part II) 3.051J/0.340J 1 Lecture 16 Statstcal Analyss n Bomaterals Research (Part II) C. F Dstrbuton Allows comparson of varablty of behavor between populatons usng test of hypothess: σ x = σ x amed for Brtsh statstcan

More information

Chapter 13: Multiple Regression

Chapter 13: Multiple Regression Chapter 13: Multple Regresson 13.1 Developng the multple-regresson Model The general model can be descrbed as: It smplfes for two ndependent varables: The sample ft parameter b 0, b 1, and b are used to

More information

CHAPTER IV RESEARCH FINDING AND DISCUSSIONS

CHAPTER IV RESEARCH FINDING AND DISCUSSIONS CHAPTER IV RESEARCH FINDING AND DISCUSSIONS A. Descrpton of Research Fndng. The Implementaton of Learnng Havng ganed the whole needed data, the researcher then dd analyss whch refers to the statstcal data

More information

Spatial Modelling of Peak Frequencies of Brain Signals

Spatial Modelling of Peak Frequencies of Brain Signals Malaysan Journal of Mathematcal Scences 3(1): 13-6 (9) Spatal Modellng of Peak Frequences of Bran Sgnals 1 Mahendran Shtan, Hernando Ombao, 1 Kok We Lng 1 Department of Mathematcs, Faculty of Scence, and

More information

See Book Chapter 11 2 nd Edition (Chapter 10 1 st Edition)

See Book Chapter 11 2 nd Edition (Chapter 10 1 st Edition) Count Data Models See Book Chapter 11 2 nd Edton (Chapter 10 1 st Edton) Count data consst of non-negatve nteger values Examples: number of drver route changes per week, the number of trp departure changes

More information

Statistics for Economics & Business

Statistics for Economics & Business Statstcs for Economcs & Busness Smple Lnear Regresson Learnng Objectves In ths chapter, you learn: How to use regresson analyss to predct the value of a dependent varable based on an ndependent varable

More information

Lecture 6 More on Complete Randomized Block Design (RBD)

Lecture 6 More on Complete Randomized Block Design (RBD) Lecture 6 More on Complete Randomzed Block Desgn (RBD) Multple test Multple test The multple comparsons or multple testng problem occurs when one consders a set of statstcal nferences smultaneously. For

More information

LOW BIAS INTEGRATED PATH ESTIMATORS. James M. Calvin

LOW BIAS INTEGRATED PATH ESTIMATORS. James M. Calvin Proceedngs of the 007 Wnter Smulaton Conference S G Henderson, B Bller, M-H Hseh, J Shortle, J D Tew, and R R Barton, eds LOW BIAS INTEGRATED PATH ESTIMATORS James M Calvn Department of Computer Scence

More information

FREQUENCY DISTRIBUTIONS Page 1 of The idea of a frequency distribution for sets of observations will be introduced,

FREQUENCY DISTRIBUTIONS Page 1 of The idea of a frequency distribution for sets of observations will be introduced, FREQUENCY DISTRIBUTIONS Page 1 of 6 I. Introducton 1. The dea of a frequency dstrbuton for sets of observatons wll be ntroduced, together wth some of the mechancs for constructng dstrbutons of data. Then

More information

Chapter 9: Statistical Inference and the Relationship between Two Variables

Chapter 9: Statistical Inference and the Relationship between Two Variables Chapter 9: Statstcal Inference and the Relatonshp between Two Varables Key Words The Regresson Model The Sample Regresson Equaton The Pearson Correlaton Coeffcent Learnng Outcomes After studyng ths chapter,

More information

Designing of Combined Continuous Lot By Lot Acceptance Sampling Plan

Designing of Combined Continuous Lot By Lot Acceptance Sampling Plan Internatonal Journal o Scentc Research Engneerng & Technology (IJSRET), ISSN 78 02 709 Desgnng o Combned Contnuous Lot By Lot Acceptance Samplng Plan S. Subhalakshm 1 Dr. S. Muthulakshm 2 1 Research Scholar,

More information

On the Repeating Group Finding Problem

On the Repeating Group Finding Problem The 9th Workshop on Combnatoral Mathematcs and Computaton Theory On the Repeatng Group Fndng Problem Bo-Ren Kung, Wen-Hsen Chen, R.C.T Lee Graduate Insttute of Informaton Technology and Management Takmng

More information

Linear Approximation with Regularization and Moving Least Squares

Linear Approximation with Regularization and Moving Least Squares Lnear Approxmaton wth Regularzaton and Movng Least Squares Igor Grešovn May 007 Revson 4.6 (Revson : March 004). 5 4 3 0.5 3 3.5 4 Contents: Lnear Fttng...4. Weghted Least Squares n Functon Approxmaton...

More information

Negative Binomial Regression

Negative Binomial Regression STATGRAPHICS Rev. 9/16/2013 Negatve Bnomal Regresson Summary... 1 Data Input... 3 Statstcal Model... 3 Analyss Summary... 4 Analyss Optons... 7 Plot of Ftted Model... 8 Observed Versus Predcted... 10 Predctons...

More information

arxiv: v2 [stat.me] 26 Jun 2012

arxiv: v2 [stat.me] 26 Jun 2012 The Two-Way Lkelhood Rato (G Test and Comparson to Two-Way χ Test Jesse Hoey June 7, 01 arxv:106.4881v [stat.me] 6 Jun 01 1 One-Way Lkelhood Rato or χ test Suppose we have a set of data x and two hypotheses

More information

Understanding Cellular Systems Using Genome Data

Understanding Cellular Systems Using Genome Data Understandng Cellular Systems Usng Genome Data "@? Km Reynolds, UT Southwestern, Sept. 2014 Why s ths problem hard? Detaled nowledge of the molecular players an apparently dense, nterconnected networ.

More information

and V is a p p positive definite matrix. A normal-inverse-gamma distribution.

and V is a p p positive definite matrix. A normal-inverse-gamma distribution. OSR Journal of athematcs (OSR-J) e-ssn: 78-578, p-ssn: 39-765X. Volume 3, ssue 3 Ver. V (ay - June 07), PP 68-7 www.osrjournals.org Comparng The Performance of Bayesan And Frequentst Analyss ethods of

More information

Topic 23 - Randomized Complete Block Designs (RCBD)

Topic 23 - Randomized Complete Block Designs (RCBD) Topc 3 ANOVA (III) 3-1 Topc 3 - Randomzed Complete Block Desgns (RCBD) Defn: A Randomzed Complete Block Desgn s a varant of the completely randomzed desgn (CRD) that we recently learned. In ths desgn,

More information

Chat eld, C. and A.J.Collins, Introduction to multivariate analysis. Chapman & Hall, 1980

Chat eld, C. and A.J.Collins, Introduction to multivariate analysis. Chapman & Hall, 1980 MT07: Multvarate Statstcal Methods Mke Tso: emal mke.tso@manchester.ac.uk Webpage for notes: http://www.maths.manchester.ac.uk/~mkt/new_teachng.htm. Introducton to multvarate data. Books Chat eld, C. and

More information

On an Extension of Stochastic Approximation EM Algorithm for Incomplete Data Problems. Vahid Tadayon 1

On an Extension of Stochastic Approximation EM Algorithm for Incomplete Data Problems. Vahid Tadayon 1 On an Extenson of Stochastc Approxmaton EM Algorthm for Incomplete Data Problems Vahd Tadayon Abstract: The Stochastc Approxmaton EM (SAEM algorthm, a varant stochastc approxmaton of EM, s a versatle tool

More information

Chapter 6. Supplemental Text Material

Chapter 6. Supplemental Text Material Chapter 6. Supplemental Text Materal S6-. actor Effect Estmates are Least Squares Estmates We have gven heurstc or ntutve explanatons of how the estmates of the factor effects are obtaned n the textboo.

More information

More metrics on cartesian products

More metrics on cartesian products More metrcs on cartesan products If (X, d ) are metrc spaces for 1 n, then n Secton II4 of the lecture notes we defned three metrcs on X whose underlyng topologes are the product topology The purpose of

More information

Durban Watson for Testing the Lack-of-Fit of Polynomial Regression Models without Replications

Durban Watson for Testing the Lack-of-Fit of Polynomial Regression Models without Replications Durban Watson for Testng the Lack-of-Ft of Polynomal Regresson Models wthout Replcatons Ruba A. Alyaf, Maha A. Omar, Abdullah A. Al-Shha ralyaf@ksu.edu.sa, maomar@ksu.edu.sa, aalshha@ksu.edu.sa Department

More information

A new construction of 3-separable matrices via an improved decoding of Macula s construction

A new construction of 3-separable matrices via an improved decoding of Macula s construction Dscrete Optmzaton 5 008 700 704 Contents lsts avalable at ScenceDrect Dscrete Optmzaton journal homepage: wwwelsevercom/locate/dsopt A new constructon of 3-separable matrces va an mproved decodng of Macula

More information

Supporting Information

Supporting Information Supportng Informaton The neural network f n Eq. 1 s gven by: f x l = ReLU W atom x l + b atom, 2 where ReLU s the element-wse rectfed lnear unt, 21.e., ReLUx = max0, x, W atom R d d s the weght matrx to

More information

Chapter 12 Analysis of Covariance

Chapter 12 Analysis of Covariance Chapter Analyss of Covarance Any scentfc experment s performed to know somethng that s unknown about a group of treatments and to test certan hypothess about the correspondng treatment effect When varablty

More information

Problem Set 9 Solutions

Problem Set 9 Solutions Desgn and Analyss of Algorthms May 4, 2015 Massachusetts Insttute of Technology 6.046J/18.410J Profs. Erk Demane, Srn Devadas, and Nancy Lynch Problem Set 9 Solutons Problem Set 9 Solutons Ths problem

More information

MAXIMUM A POSTERIORI TRANSDUCTION

MAXIMUM A POSTERIORI TRANSDUCTION MAXIMUM A POSTERIORI TRANSDUCTION LI-WEI WANG, JU-FU FENG School of Mathematcal Scences, Peng Unversty, Bejng, 0087, Chna Center for Informaton Scences, Peng Unversty, Bejng, 0087, Chna E-MIAL: {wanglw,

More information

Note on EM-training of IBM-model 1

Note on EM-training of IBM-model 1 Note on EM-tranng of IBM-model INF58 Language Technologcal Applcatons, Fall The sldes on ths subject (nf58 6.pdf) ncludng the example seem nsuffcent to gve a good grasp of what s gong on. Hence here are

More information

Note 10. Modeling and Simulation of Dynamic Systems

Note 10. Modeling and Simulation of Dynamic Systems Lecture Notes of ME 475: Introducton to Mechatroncs Note 0 Modelng and Smulaton of Dynamc Systems Department of Mechancal Engneerng, Unversty Of Saskatchewan, 57 Campus Drve, Saskatoon, SK S7N 5A9, Canada

More information

= z 20 z n. (k 20) + 4 z k = 4

= z 20 z n. (k 20) + 4 z k = 4 Problem Set #7 solutons 7.2.. (a Fnd the coeffcent of z k n (z + z 5 + z 6 + z 7 + 5, k 20. We use the known seres expanson ( n+l ( z l l z n below: (z + z 5 + z 6 + z 7 + 5 (z 5 ( + z + z 2 + z + 5 5

More information

On the Dirichlet Mixture Model for Mining Protein Sequence Data

On the Dirichlet Mixture Model for Mining Protein Sequence Data On the Drchlet Mxture Model for Mnng Proten Sequence Data Xugang Ye Natonal Canter for Botechnology Informaton Bologsts need to fnd from the raw data lke ths Background Background the nformaton lke ths

More information

CHAPTER IV RESEARCH FINDING AND ANALYSIS

CHAPTER IV RESEARCH FINDING AND ANALYSIS CHAPTER IV REEARCH FINDING AND ANALYI A. Descrpton of Research Fndngs To fnd out the dfference between the students who were taught by usng Mme Game and the students who were not taught by usng Mme Game

More information

Answers Problem Set 2 Chem 314A Williamsen Spring 2000

Answers Problem Set 2 Chem 314A Williamsen Spring 2000 Answers Problem Set Chem 314A Wllamsen Sprng 000 1) Gve me the followng crtcal values from the statstcal tables. a) z-statstc,-sded test, 99.7% confdence lmt ±3 b) t-statstc (Case I), 1-sded test, 95%

More information

III. Econometric Methodology Regression Analysis

III. Econometric Methodology Regression Analysis Page Econ07 Appled Econometrcs Topc : An Overvew of Regresson Analyss (Studenmund, Chapter ) I. The Nature and Scope of Econometrcs. Lot s of defntons of econometrcs. Nobel Prze Commttee Paul Samuelson,

More information

A nonparametric two-sample wald test of equality of variances

A nonparametric two-sample wald test of equality of variances Unversty of Wollongong Research Onlne Centre for Statstcal & Survey Methodology Workng Paper Seres Faculty of Engneerng and Informaton Scences 0 A nonparametrc two-sample wald test of equalty of varances

More information

Non-Mixture Cure Model for Interval Censored Data: Simulation Study ABSTRACT

Non-Mixture Cure Model for Interval Censored Data: Simulation Study ABSTRACT Malaysan Journal of Mathematcal Scences 8(S): 37-44 (2014) Specal Issue: Internatonal Conference on Mathematcal Scences and Statstcs 2013 (ICMSS2013) MALAYSIAN JOURNAL OF MATHEMATICAL SCIENCES Journal

More information

Maximizing Overlap of Large Primary Sampling Units in Repeated Sampling: A comparison of Ernst s Method with Ohlsson s Method

Maximizing Overlap of Large Primary Sampling Units in Repeated Sampling: A comparison of Ernst s Method with Ohlsson s Method Maxmzng Overlap of Large Prmary Samplng Unts n Repeated Samplng: A comparson of Ernst s Method wth Ohlsson s Method Red Rottach and Padrac Murphy 1 U.S. Census Bureau 4600 Slver Hll Road, Washngton DC

More information

Laboratory 3: Method of Least Squares

Laboratory 3: Method of Least Squares Laboratory 3: Method of Least Squares Introducton Consder the graph of expermental data n Fgure 1. In ths experment x s the ndependent varable and y the dependent varable. Clearly they are correlated wth

More information

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction ECONOMICS 5* -- NOTE (Summary) ECON 5* -- NOTE The Multple Classcal Lnear Regresson Model (CLRM): Specfcaton and Assumptons. Introducton CLRM stands for the Classcal Lnear Regresson Model. The CLRM s also

More information

Numerical Heat and Mass Transfer

Numerical Heat and Mass Transfer Master degree n Mechancal Engneerng Numercal Heat and Mass Transfer 06-Fnte-Dfference Method (One-dmensonal, steady state heat conducton) Fausto Arpno f.arpno@uncas.t Introducton Why we use models and

More information

UCLA STAT 13 Introduction to Statistical Methods for the Life and Health Sciences. Chapter 11 Analysis of Variance - ANOVA. Instructor: Ivo Dinov,

UCLA STAT 13 Introduction to Statistical Methods for the Life and Health Sciences. Chapter 11 Analysis of Variance - ANOVA. Instructor: Ivo Dinov, UCLA STAT 3 ntroducton to Statstcal Methods for the Lfe and Health Scences nstructor: vo Dnov, Asst. Prof. of Statstcs and Neurology Chapter Analyss of Varance - ANOVA Teachng Assstants: Fred Phoa, Anwer

More information

Lecture Notes on Linear Regression

Lecture Notes on Linear Regression Lecture Notes on Lnear Regresson Feng L fl@sdueducn Shandong Unversty, Chna Lnear Regresson Problem In regresson problem, we am at predct a contnuous target value gven an nput feature vector We assume

More information

Difference Equations

Difference Equations Dfference Equatons c Jan Vrbk 1 Bascs Suppose a sequence of numbers, say a 0,a 1,a,a 3,... s defned by a certan general relatonshp between, say, three consecutve values of the sequence, e.g. a + +3a +1

More information

Interpolated Markov Models for Gene Finding

Interpolated Markov Models for Gene Finding Interpolated Markov Models for Gene Fndng BMI/CS 776 www.bostat.wsc.edu/bm776/ Sprng 208 Anthony Gtter gtter@bostat.wsc.edu hese sldes, ecludng thrd-party materal, are lcensed under CC BY-NC 4.0 by Mark

More information

An adaptive SMC scheme for ABC. Bayesian Computation (ABC)

An adaptive SMC scheme for ABC. Bayesian Computation (ABC) An adaptve SMC scheme for Approxmate Bayesan Computaton (ABC) (ont work wth Prof. Mke West) Department of Statstcal Scence - Duke Unversty Aprl/2011 Approxmate Bayesan Computaton (ABC) Problems n whch

More information

THEORY OF GENETIC ALGORITHMS WITH α-selection. André Neubauer

THEORY OF GENETIC ALGORITHMS WITH α-selection. André Neubauer THEORY OF GENETIC ALGORITHMS WITH α-selection André Neubauer Informaton Processng Systems Lab Münster Unversty of Appled Scences Stegerwaldstraße 39, D-48565 Stenfurt, Germany Emal: andre.neubauer@fh-muenster.de

More information

Copyright 2017 by Taylor Enterprises, Inc., All Rights Reserved. Adjusted Control Limits for P Charts. Dr. Wayne A. Taylor

Copyright 2017 by Taylor Enterprises, Inc., All Rights Reserved. Adjusted Control Limits for P Charts. Dr. Wayne A. Taylor Taylor Enterprses, Inc. Control Lmts for P Charts Copyrght 2017 by Taylor Enterprses, Inc., All Rghts Reserved. Control Lmts for P Charts Dr. Wayne A. Taylor Abstract: P charts are used for count data

More information

Statistics Chapter 4

Statistics Chapter 4 Statstcs Chapter 4 "There are three knds of les: les, damned les, and statstcs." Benjamn Dsrael, 1895 (Brtsh statesman) Gaussan Dstrbuton, 4-1 If a measurement s repeated many tmes a statstcal treatment

More information