Computational Biology Lecture 8: Substitution matrices Saad Mneimneh
|
|
- Elwin Price
- 5 years ago
- Views:
Transcription
1 Computatonal Bology Lecture 8: Substtuton matrces Saad Mnemneh As we have ntroduced last tme, smple scorng schemes lke + or a match, - or a msmatch and -2 or a gap are not justable bologcally, especally or amno acd sequences (protens). Instead, more elaborated scorng unctons are used. These scores are usually obtaned as a result o analyzng chemcal propertes and statstcal data or amno acds and DNA sequences. For example, t s known that same sze amno acds are more lkely to be substtuted by one another. Smlarly, amno acds wth same anty to water are lkely to serve the same purpose n some cases. On the other hand, some mutatons are not acceptable (may lead to demse o the organsm). PAM and BLOSUM matrces are amongst results o such analyss. We wll see the technques through whch PAM and BLOSUM matrces are obtaned. Substrtuton matrces Chemcal propertes o amno acds govern how the amno acds substtue one another. In prncple, a substrtuton matrx s, where s j s used to score algnng character wth character j, should relect the probablty o two characters substtung one another. The queston s how to buld such a probablty matrx that closely maps realty? Derent strateges result n derent matrces but the central dea s the same. I we go back to the concept o a hgh scorng segment par, theory tells us that the algnment (ungapped) gven by such a segment s governed by a lmtng dstrbuton such that where: s s the substuton matrx used q j = e λsj q j s the probablty o observng character algned wth character j p s the probablty o occurrence o character Thereore, s j = λ ln q j Ths ormula or s j suggests a way to constrcut the matrx s. I hgh scorng algnments are to be real, {q j } represent the desred probabltes o substtons, whle {p } represent the background probabltes o occurrence. By observng related amles o sequences, one could estmate q and p and hence obtan the matrx s by usng some scalng actor λ. Note that λ,j s j =,j ln q j =,j p j ln q j p j Inormaton theory tells us that the above sum s strctly less than 0 p q, whch s a desred property. Also, note that the score S n bts o a gven segment (see prevous lecture) s S = log eλs K Snce S s a sum o terms o the orm λ ln q j, e λs s a product o terms o the orm q j. Thereore, S relect the log lkelhood o observng the algnment due to substtuton (governed probablstcally by q) relatve to smply by chance (governed probablstcally by p). Dvson by the constant K adjusts the score to account or the rate o observng maxmal scorng segments as descrbed n the prevous lecture.
2 Here s another ntutve approach that justes the above scheme. To construct a substtuton matrx to score proten algnments, a amly o protens can be consdered, and a multple algnment o all the proten sequences n the amly s obtaned. Agan, we are consderng algnments wth no gaps; thereore, we assume sequences have the same length (a vald assumpton they are related) and, thereore, the multple algnment s trval. or any par o amno acds and j we need q j, the probablty o observng algned wth j (whch s same as q j ), and p, the probablty o observng an. The queston s, n an algnment (ungapped) o sequences x and y that algns two o ther amno acds and j, dd ths happen by chance or was ndeed because o a mutaton rom to j or vce versa? To capture ths complementary behavors, we consder two models: M, where x and y are related and obtaned accordng to the jont probabltes q j, and R, where x and y are unrelated and obtaned ndependently at random accordng to the ndvdual probabltes p and p j. Consderng ths, now the score s the lkelhood that the sequences are related compared relatve to them beng unrelated. Ths s called the odds rato and s mathematcally expressed as: P (x, y M) score(x, y) = P (x, y R) = q x y p x p. y Ths ormula says that the score o the (ungapped) algnment s the probablty that the symbols o x and y are algned because they are related, relatve to the probablty o ther symbols beng algned just by chance. For two algned amno acds and j, we take s pont o vew, what s the probablty to see a j on the other sde? Well, ths s the probablty that an mutated nto a j, p( j). However, there s a mere chance o p j or a random occurence o a j as well. Hence, the probablty rato p( j) p j relects how much beleves that ths j s related to t. Now, snce q j = = p p( j) (the probablty o observng an and ts mutated orm), we can also express the lkelhood as: q j p( j) = p j. By dong ths or every par o algned symbols n the algnment and ndng the product o the terms, we obtan the ormula above, whch relect how much we beleve that the two entre sequences are related. In all the algnment algorthms we have seen so ar, we reled on the act that the score s addtve, and ths was a key property or the dynamc programmng to work. In ths case - when the score s computed as,j multplcatve. In order to make t addtve we can take the log and compute: log,j =,j log q x y. Ths s called the log-odds rato. Thereore, the values makng up the sum wll be the ndvdual scores ound n the matrx, hence s j = log qj up to some scalng actor. Note that ths s symmetrc, so scorng algned wth j s the same as scorng j algned wth, hence, the drecton o the algnment s not mportant (but one could n prncple make a dstncton needed). Now the mportant queston: how to compute p, p j, and q j? We re gong to look at two ways o computng ths: PAM and BLOSUM matrces. PAM (Pont Accepted Mutatons) matrces q x y q x y - t s PAM stands or Pont Accepted Mutatons. An accepted mutaton s dened as a mutaton that was postvely selected by the envronment and dd not caused the demse o the organsm. A PAM matrx M holds the probablty o beng replaced by j n a certan evolutonary tme perod. The longer the evolutonary perod o tme, the more dcult t s to determne the correct values. The reason beng that could mutate several tmes beore becomng a j, and t wll be hard to capture all these ntermedate mutatons, snce we only observe and j. What we are gong to do s look over mutatons that occurred n a relatvely short evolutonary perod o tme. One unt o evoluton s dened to be the amount o evoluton that changes, on the average, n 00 amno acds. Consderng ths unt, a -PAM matrx s rst computed. Usng ths as a startng pont, a k-pam matrx can be generated rom the -PAM matrx. For a -PAM matrx M, M j s gong to be p( j) scaled by a actor, such that the expected number o mutatons s 0.0; n other words t s the same as havng the probablty o n 00 or a mutaton to occur. The computatonal steps that lead to the -PAM matrx are: 2
3 Compute p or every. Compute p( j) or every par and j and let M j = p( j). Scale M such that the expected number o mutatons p ( M ) s 0.0. M Use s j = 0 log j 0 p j to obtan the addtve scores. s j s rounded to an nteger and here, the scalng actor 0 s used just to provde a better nteger approxmaton. Next we ll take a closer look at each o these steps. Let the requency count j be the number o tmes s algned wth j countng both drectons. Then, let the number o occurrences o, = j j, and the count o all characters =. Now, we can estmate p j = j, whose meanng s smply the rate at whch was ound to be algned wth j. Smlarly, the rate o ndng an occurence o s p =. Now, havng both p j and p determned, the elements o the matrx M are beng computed as: M j = p( j) = pj p. M s ndeed a probablty matrx, and ths can be proved by notng that j M j =. To llustrate ths step o computng a matrx M, let s have a quck example. Let the algnment be: In ths case, the requences are: hence the estmated probabltes: A B A A AB = BA = AA = 2 A = X AX = AB + AA = B = X BX = BA = = X X = 4 The matrx M wll be: The expected number o mutatons s X p AB = AB = 4 p BA = BA = 4 p AA = AA = 2 p A = A = 4 p B = B = 4 p(a B) = p AB p A = p(b A) = p BA p B = M = [ 2 0 p X ( M XX ) = p A ( M AA ) + p B ( M BB ) = 4 ( 2 ) + ( 0) = 0.5 = 50% 4 The next step s the scalng o M such that t s consstent wth the denton o a -PAM matrx: n 00 expected mutatons. Suppose matrx M s elements, M j, are scaled by a actor α. In ths case the new values become M j = αm j. Ths wll change the values o the row sums such that j M j = α. Snce we want a probablty matrx - every row sums up to - a small adjustment s needed: we wll add α to every element on the man dagonal: ] M j = αm j, j M = αm + α Ths wll restore the property o a probablty matrx. Now what should α be? Let s compute the new expected number o mutatons: p ( M ) = p ( αm + α) = α p ( M )
4 Ths s just α multpled by the old expected number o mutatons. Thereore, we can set α approprately. For nstance, n the example above, α = Havng a -PAM matrx computed, the queston s how to compute a 2-PAM matrx? In other words, what s the probablty p 2 ( j) o mutatng nto j n two unts o evoluton. Ths s the probablty o mutatng nto k, or some k, n the rst unt o evoluton, and then, k mutatng nto j n the second unt o evoluton. Mathematcally, ths can be expressed as: p 2 ( j) = k p( k)p(k j) = k M k M kj Ths s the ormula used to obtan the entry correspondng to the par and j when multply M by tsel. Hence, the 2-PAM matrx s just M 2. An analogous step s used to show that the k-pam matrx s the same as M k. When workng wth a k-pam probablty matrx the score wll be computed n the same way: s k j = 0 log M k j 0 p j. The only change s that now the values o M k are plugged nstead o those o M. BLOSUM (BLOCKS Substtuton Matrces) matrces As mentoned earler, BLOSUM are another type o matrces used n scorng sequence algnments. They are ntended to be used or scorng smlartes o proten sequences that are evolutonary ar apart (dstant). Computng ther values s done usng the normaton stored n a database o blocks (called the BLOCKS database) where each block s a multple ungapped algnment o related proten sequences. The sequences o each block are clustered, puttng two sequences nto the same cluster ther percentage o matchng algned resdues - or level o smlarty - s above a certan threshold L%. We dene two sequences to be dstant they all n derent clusters. Thereore, two dstant sequences der by at least (00 L)%. The computaton o BLOSUM-L, or a partcular value o L, s based on countng the number o mutatons among dstant sequences only. Thereore, lower values o L correspond to longer evolutonary tmes, and are applcable or more dstant sequences. As explaned above, n computng a BLOSUM-L matrx s entres, we want to count the number o mutatons between dstant sequences only - the ones that are less than L% smlar. The value ab s the relatve requency o seeng a algned wth b. Whenever such an algnment s observed or two sequences that are n derent clusters, ab s ncremented by n n 2, where n and n 2 are the szes o the two clusters (we scale by the sze o the cluster snce larger clusters are more lkely to contan mutatons). The steps through whch a matrx s computed are: Estmate p = j j ; Estmate q j = k,l j j k,l kl ; BLOSUM-L(,j)= log q j wth some scalng actor λ. Consder an example where sequences are generated at random (so we are not usng the BLOCKS database here) such that p A = p G = p C = p T = 4 and the level o smlarty s 50%,.e. the probablty that two algned resdues are the same s 0.5. Then L = 50%, we expect to have one cluster, where: and p AA = p GG = p CC = p T T = = 8 p AG = p AC = p AT = p GA = p GC = p GT = p CA = p CG = p CT = p T A = p T G = p T C = = 24 Then a match wll have a score and a msmatch wll have a score m = log /8 /4./4 = s = log /24 /4./4 =
5 Reerences Setubal J., Medans, J., Introducton to Molecular Bology, Chapter. Drubn R. et al., Bologcal Sequence Analyss, Chapter 2. 5
Search sequence databases 2 10/25/2016
Search sequence databases 2 10/25/2016 The BLAST algorthms Ø BLAST fnds local matches between two sequences, called hgh scorng segment pars (HSPs). Step 1: Break down the query sequence and the database
More informationbe the i th symbol in x and
2 Parwse Algnment We represent sequences b strngs of alphetc letters. If we recognze a sgnfcant smlart between a new sequence and a sequence out whch somethng s alread know, we can transfer nformaton out
More informationGeneral Tips on How to Do Well in Physics Exams. 1. Establish a good habit in keeping track of your steps. For example, when you use the equation
General Tps on How to Do Well n Physcs Exams 1. Establsh a good habt n keepng track o your steps. For example when you use the equaton 1 1 1 + = d d to solve or d o you should rst rewrte t as 1 1 1 = d
More informationCS 331 DESIGN AND ANALYSIS OF ALGORITHMS DYNAMIC PROGRAMMING. Dr. Daisy Tang
CS DESIGN ND NLYSIS OF LGORITHMS DYNMIC PROGRMMING Dr. Dasy Tang Dynamc Programmng Idea: Problems can be dvded nto stages Soluton s a sequence o decsons and the decson at the current stage s based on the
More informationNote on EM-training of IBM-model 1
Note on EM-tranng of IBM-model INF58 Language Technologcal Applcatons, Fall The sldes on ths subject (nf58 6.pdf) ncludng the example seem nsuffcent to gve a good grasp of what s gong on. Hence here are
More informationSingular Value Decomposition: Theory and Applications
Sngular Value Decomposton: Theory and Applcatons Danel Khashab Sprng 2015 Last Update: March 2, 2015 1 Introducton A = UDV where columns of U and V are orthonormal and matrx D s dagonal wth postve real
More informationCopyright 2017 by Taylor Enterprises, Inc., All Rights Reserved. Adjusted Control Limits for P Charts. Dr. Wayne A. Taylor
Taylor Enterprses, Inc. Control Lmts for P Charts Copyrght 2017 by Taylor Enterprses, Inc., All Rghts Reserved. Control Lmts for P Charts Dr. Wayne A. Taylor Abstract: P charts are used for count data
More informationECEN 5005 Crystals, Nanocrystals and Device Applications Class 19 Group Theory For Crystals
ECEN 5005 Crystals, Nanocrystals and Devce Applcatons Class 9 Group Theory For Crystals Dee Dagram Radatve Transton Probablty Wgner-Ecart Theorem Selecton Rule Dee Dagram Expermentally determned energy
More informationEigenvalues of Random Graphs
Spectral Graph Theory Lecture 2 Egenvalues of Random Graphs Danel A. Spelman November 4, 202 2. Introducton In ths lecture, we consder a random graph on n vertces n whch each edge s chosen to be n the
More informationProblem Set 9 Solutions
Desgn and Analyss of Algorthms May 4, 2015 Massachusetts Insttute of Technology 6.046J/18.410J Profs. Erk Demane, Srn Devadas, and Nancy Lynch Problem Set 9 Solutons Problem Set 9 Solutons Ths problem
More informationNotes on Frequency Estimation in Data Streams
Notes on Frequency Estmaton n Data Streams In (one of) the data streamng model(s), the data s a sequence of arrvals a 1, a 2,..., a m of the form a j = (, v) where s the dentty of the tem and belongs to
More informationChapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems
Numercal Analyss by Dr. Anta Pal Assstant Professor Department of Mathematcs Natonal Insttute of Technology Durgapur Durgapur-713209 emal: anta.bue@gmal.com 1 . Chapter 5 Soluton of System of Lnear Equatons
More informationj) = 1 (note sigma notation) ii. Continuous random variable (e.g. Normal distribution) 1. density function: f ( x) 0 and f ( x) dx = 1
Random varables Measure of central tendences and varablty (means and varances) Jont densty functons and ndependence Measures of assocaton (covarance and correlaton) Interestng result Condtonal dstrbutons
More informationRandom Walks on Digraphs
Random Walks on Dgraphs J. J. P. Veerman October 23, 27 Introducton Let V = {, n} be a vertex set and S a non-negatve row-stochastc matrx (.e. rows sum to ). V and S defne a dgraph G = G(V, S) and a drected
More information2.3 Nilpotent endomorphisms
s a block dagonal matrx, wth A Mat dm U (C) In fact, we can assume that B = B 1 B k, wth B an ordered bass of U, and that A = [f U ] B, where f U : U U s the restrcton of f to U 40 23 Nlpotent endomorphsms
More informationExample: (13320, 22140) =? Solution #1: The divisors of are 1, 2, 3, 4, 5, 6, 9, 10, 12, 15, 18, 20, 27, 30, 36, 41,
The greatest common dvsor of two ntegers a and b (not both zero) s the largest nteger whch s a common factor of both a and b. We denote ths number by gcd(a, b), or smply (a, b) when there s no confuson
More informationSpring Force and Power
Lecture 13 Chapter 9 Sprng Force and Power Yeah, energy s better than orces. What s net? Course webste: http://aculty.uml.edu/andry_danylov/teachng/physcsi IN THIS CHAPTER, you wll learn how to solve problems
More informationUniversity of Washington Department of Chemistry Chemistry 452/456 Summer Quarter 2014
Lecture 16 8/4/14 Unversty o Washngton Department o Chemstry Chemstry 452/456 Summer Quarter 214. Real Vapors and Fugacty Henry s Law accounts or the propertes o extremely dlute soluton. s shown n Fgure
More informationCopyright 2017 by Taylor Enterprises, Inc., All Rights Reserved. Adjusted Control Limits for U Charts. Dr. Wayne A. Taylor
Taylor Enterprses, Inc. Adjusted Control Lmts for U Charts Copyrght 207 by Taylor Enterprses, Inc., All Rghts Reserved. Adjusted Control Lmts for U Charts Dr. Wayne A. Taylor Abstract: U charts are used
More informationDifference Equations
Dfference Equatons c Jan Vrbk 1 Bascs Suppose a sequence of numbers, say a 0,a 1,a,a 3,... s defned by a certan general relatonshp between, say, three consecutve values of the sequence, e.g. a + +3a +1
More informationLecture 2 Solution of Nonlinear Equations ( Root Finding Problems )
Lecture Soluton o Nonlnear Equatons Root Fndng Problems Dentons Classcaton o Methods Analytcal Solutons Graphcal Methods Numercal Methods Bracketng Methods Open Methods Convergence Notatons Root Fndng
More informationSplit alignment. Martin C. Frith April 13, 2012
Splt algnment Martn C. Frth Aprl 13, 2012 1 Introducton Ths document s about algnng a query sequence to a genome, allowng dfferent parts of the query to match dfferent parts of the genome. Here are some
More informationBézier curves. Michael S. Floater. September 10, These notes provide an introduction to Bézier curves. i=0
Bézer curves Mchael S. Floater September 1, 215 These notes provde an ntroducton to Bézer curves. 1 Bernsten polynomals Recall that a real polynomal of a real varable x R, wth degree n, s a functon of
More information/ n ) are compared. The logic is: if the two
STAT C141, Sprng 2005 Lecture 13 Two sample tests One sample tests: examples of goodness of ft tests, where we are testng whether our data supports predctons. Two sample tests: called as tests of ndependence
More informationDesign and Analysis of Algorithms
Desgn and Analyss of Algorthms CSE 53 Lecture 4 Dynamc Programmng Junzhou Huang, Ph.D. Department of Computer Scence and Engneerng CSE53 Desgn and Analyss of Algorthms The General Dynamc Programmng Technque
More informationLecture 3. Ax x i a i. i i
18.409 The Behavor of Algorthms n Practce 2/14/2 Lecturer: Dan Spelman Lecture 3 Scrbe: Arvnd Sankar 1 Largest sngular value In order to bound the condton number, we need an upper bound on the largest
More informationPhysics 2A Chapters 6 - Work & Energy Fall 2017
Physcs A Chapters 6 - Work & Energy Fall 017 These notes are eght pages. A quck summary: The work-energy theorem s a combnaton o Chap and Chap 4 equatons. Work s dened as the product o the orce actng on
More informationStanford University CS359G: Graph Partitioning and Expanders Handout 4 Luca Trevisan January 13, 2011
Stanford Unversty CS359G: Graph Parttonng and Expanders Handout 4 Luca Trevsan January 3, 0 Lecture 4 In whch we prove the dffcult drecton of Cheeger s nequalty. As n the past lectures, consder an undrected
More informationTemperature. Chapter Heat Engine
Chapter 3 Temperature In prevous chapters of these notes we ntroduced the Prncple of Maxmum ntropy as a technque for estmatng probablty dstrbutons consstent wth constrants. In Chapter 9 we dscussed the
More informationA Simple Research of Divisor Graphs
The 29th Workshop on Combnatoral Mathematcs and Computaton Theory A Smple Research o Dvsor Graphs Yu-png Tsao General Educaton Center Chna Unversty o Technology Tape Tawan yp-tsao@cuteedutw Tape Tawan
More informationThe Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction
ECONOMICS 5* -- NOTE (Summary) ECON 5* -- NOTE The Multple Classcal Lnear Regresson Model (CLRM): Specfcaton and Assumptons. Introducton CLRM stands for the Classcal Lnear Regresson Model. The CLRM s also
More informationDownload the files protein1.txt and protein2.txt from the course website.
Queston 1 Dot plots Download the fles proten1.txt and proten2.txt from the course webste. Usng the dot plot algnment tool http://athena.boc.uvc.ca/workbench.php?tool=dotter&db=poxvrdae, algn the proten
More informationChapter 6. Operational Amplifier. inputs can be defined as the average of the sum of the two signals.
6 Operatonal mpler Chapter 6 Operatonal mpler CC Symbol: nput nput Output EE () Non-nvertng termnal, () nvertng termnal nput mpedance : Few mega (ery hgh), Output mpedance : Less than (ery low) Derental
More informationTransfer Functions. Convenient representation of a linear, dynamic model. A transfer function (TF) relates one input and one output: ( ) system
Transfer Functons Convenent representaton of a lnear, dynamc model. A transfer functon (TF) relates one nput and one output: x t X s y t system Y s The followng termnology s used: x y nput output forcng
More informationLecture 10: May 6, 2013
TTIC/CMSC 31150 Mathematcal Toolkt Sprng 013 Madhur Tulsan Lecture 10: May 6, 013 Scrbe: Wenje Luo In today s lecture, we manly talked about random walk on graphs and ntroduce the concept of graph expander,
More information: Numerical Analysis Topic 2: Solution of Nonlinear Equations Lectures 5-11:
764: Numercal Analyss Topc : Soluton o Nonlnear Equatons Lectures 5-: UIN Malang Read Chapters 5 and 6 o the tetbook 764_Topc Lecture 5 Soluton o Nonlnear Equatons Root Fndng Problems Dentons Classcaton
More informationFeature Selection: Part 1
CSE 546: Machne Learnng Lecture 5 Feature Selecton: Part 1 Instructor: Sham Kakade 1 Regresson n the hgh dmensonal settng How do we learn when the number of features d s greater than the sample sze n?
More informationMaximum Likelihood Estimation
Multple sequence algnment Parwse sequence algnment ( and ) Substtuton matrces Database searchng Maxmum Lelhood Estmaton Observaton: Data, D (HHHTHHTH) What process generated ths data? Alternatve hypothess:
More informationWeek 11: Chapter 11. The Vector Product. The Vector Product Defined. The Vector Product and Torque. More About the Vector Product
The Vector Product Week 11: Chapter 11 Angular Momentum There are nstances where the product of two vectors s another vector Earler we saw where the product of two vectors was a scalar Ths was called the
More informationSolutions to Problem Set 6
Solutons to Problem Set 6 Problem 6. (Resdue theory) a) Problem 4.7.7 Boas. n ths problem we wll solve ths ntegral: x sn x x + 4x + 5 dx: To solve ths usng the resdue theorem, we study ths complex ntegral:
More informationLecture 7: Boltzmann distribution & Thermodynamics of mixing
Prof. Tbbtt Lecture 7 etworks & Gels Lecture 7: Boltzmann dstrbuton & Thermodynamcs of mxng 1 Suggested readng Prof. Mark W. Tbbtt ETH Zürch 13 März 018 Molecular Drvng Forces Dll and Bromberg: Chapters
More informationCOMPLEX NUMBERS AND QUADRATIC EQUATIONS
COMPLEX NUMBERS AND QUADRATIC EQUATIONS INTRODUCTION We know that x 0 for all x R e the square of a real number (whether postve, negatve or ero) s non-negatve Hence the equatons x, x, x + 7 0 etc are not
More informationLectures - Week 4 Matrix norms, Conditioning, Vector Spaces, Linear Independence, Spanning sets and Basis, Null space and Range of a Matrix
Lectures - Week 4 Matrx norms, Condtonng, Vector Spaces, Lnear Independence, Spannng sets and Bass, Null space and Range of a Matrx Matrx Norms Now we turn to assocatng a number to each matrx. We could
More informationPhysics 2A Chapter 3 HW Solutions
Phscs A Chapter 3 HW Solutons Chapter 3 Conceptual Queston: 4, 6, 8, Problems: 5,, 8, 7, 3, 44, 46, 69, 70, 73 Q3.4. Reason: (a) C = A+ B onl A and B are n the same drecton. Sze does not matter. (b) C
More informationAnswers Problem Set 2 Chem 314A Williamsen Spring 2000
Answers Problem Set Chem 314A Wllamsen Sprng 000 1) Gve me the followng crtcal values from the statstcal tables. a) z-statstc,-sded test, 99.7% confdence lmt ±3 b) t-statstc (Case I), 1-sded test, 95%
More informationAnnexes. EC.1. Cycle-base move illustration. EC.2. Problem Instances
ec Annexes Ths Annex frst llustrates a cycle-based move n the dynamc-block generaton tabu search. It then dsplays the characterstcs of the nstance sets, followed by detaled results of the parametercalbraton
More informationCS286r Assign One. Answer Key
CS286r Assgn One Answer Key 1 Game theory 1.1 1.1.1 Let off-equlbrum strateges also be that people contnue to play n Nash equlbrum. Devatng from any Nash equlbrum s a weakly domnated strategy. That s,
More informationLecture Nov
Lecture 18 Nov 07 2008 Revew Clusterng Groupng smlar obects nto clusters Herarchcal clusterng Agglomeratve approach (HAC: teratvely merge smlar clusters Dfferent lnkage algorthms for computng dstances
More informationFinite Difference Method
7/0/07 Instructor r. Ramond Rump (9) 747 698 rcrump@utep.edu EE 337 Computatonal Electromagnetcs (CEM) Lecture #0 Fnte erence Method Lecture 0 These notes ma contan coprghted materal obtaned under ar use
More information10-701/ Machine Learning, Fall 2005 Homework 3
10-701/15-781 Machne Learnng, Fall 2005 Homework 3 Out: 10/20/05 Due: begnnng of the class 11/01/05 Instructons Contact questons-10701@autonlaborg for queston Problem 1 Regresson and Cross-valdaton [40
More informationLinear Approximation with Regularization and Moving Least Squares
Lnear Approxmaton wth Regularzaton and Movng Least Squares Igor Grešovn May 007 Revson 4.6 (Revson : March 004). 5 4 3 0.5 3 3.5 4 Contents: Lnear Fttng...4. Weghted Least Squares n Functon Approxmaton...
More informationLecture Space-Bounded Derandomization
Notes on Complexty Theory Last updated: October, 2008 Jonathan Katz Lecture Space-Bounded Derandomzaton 1 Space-Bounded Derandomzaton We now dscuss derandomzaton of space-bounded algorthms. Here non-trval
More informationMASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 12 10/21/2013. Martingale Concentration Inequalities and Applications
MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.65/15.070J Fall 013 Lecture 1 10/1/013 Martngale Concentraton Inequaltes and Applcatons Content. 1. Exponental concentraton for martngales wth bounded ncrements.
More informationx = , so that calculated
Stat 4, secton Sngle Factor ANOVA notes by Tm Plachowsk n chapter 8 we conducted hypothess tests n whch we compared a sngle sample s mean or proporton to some hypotheszed value Chapter 9 expanded ths to
More informationModule 9. Lecture 6. Duality in Assignment Problems
Module 9 1 Lecture 6 Dualty n Assgnment Problems In ths lecture we attempt to answer few other mportant questons posed n earler lecture for (AP) and see how some of them can be explaned through the concept
More informationANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U)
Econ 413 Exam 13 H ANSWERS Settet er nndelt 9 deloppgaver, A,B,C, som alle anbefales å telle lkt for å gøre det ltt lettere å stå. Svar er gtt . Unfortunately, there s a prntng error n the hnt of
More informationCase A. P k = Ni ( 2L i k 1 ) + (# big cells) 10d 2 P k.
THE CELLULAR METHOD In ths lecture, we ntroduce the cellular method as an approach to ncdence geometry theorems lke the Szemeréd-Trotter theorem. The method was ntroduced n the paper Combnatoral complexty
More informationOnline Classification: Perceptron and Winnow
E0 370 Statstcal Learnng Theory Lecture 18 Nov 8, 011 Onlne Classfcaton: Perceptron and Wnnow Lecturer: Shvan Agarwal Scrbe: Shvan Agarwal 1 Introducton In ths lecture we wll start to study the onlne learnng
More informationMixture o f of Gaussian Gaussian clustering Nov
Mture of Gaussan clusterng Nov 11 2009 Soft vs hard lusterng Kmeans performs Hard clusterng: Data pont s determnstcally assgned to one and only one cluster But n realty clusters may overlap Soft-clusterng:
More informationHopfield Training Rules 1 N
Hopfeld Tranng Rules To memorse a sngle pattern Suppose e set the eghts thus - = p p here, s the eght beteen nodes & s the number of nodes n the netor p s the value requred for the -th node What ll the
More informationA PROBABILITY-DRIVEN SEARCH ALGORITHM FOR SOLVING MULTI-OBJECTIVE OPTIMIZATION PROBLEMS
HCMC Unversty of Pedagogy Thong Nguyen Huu et al. A PROBABILITY-DRIVEN SEARCH ALGORITHM FOR SOLVING MULTI-OBJECTIVE OPTIMIZATION PROBLEMS Thong Nguyen Huu and Hao Tran Van Department of mathematcs-nformaton,
More informationStatistics and Quantitative Analysis U4320. Segment 3: Probability Prof. Sharyn O Halloran
Statstcs and Quanttatve Analyss U430 Segment 3: Probablty Prof. Sharyn O Halloran Revew: Descrptve Statstcs Code book for Measures Sample Data Relgon Employed 1. Catholc 0. Unemployed. Protestant 1. Employed
More informationEcon107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4)
I. Classcal Assumptons Econ7 Appled Econometrcs Topc 3: Classcal Model (Studenmund, Chapter 4) We have defned OLS and studed some algebrac propertes of OLS. In ths topc we wll study statstcal propertes
More informationConvergence of random processes
DS-GA 12 Lecture notes 6 Fall 216 Convergence of random processes 1 Introducton In these notes we study convergence of dscrete random processes. Ths allows to characterze phenomena such as the law of large
More informationVapnik-Chervonenkis theory
Vapnk-Chervonenks theory Rs Kondor June 13, 2008 For the purposes of ths lecture, we restrct ourselves to the bnary supervsed batch learnng settng. We assume that we have an nput space X, and an unknown
More information18.1 Introduction and Recap
CS787: Advanced Algorthms Scrbe: Pryananda Shenoy and Shjn Kong Lecturer: Shuch Chawla Topc: Streamng Algorthmscontnued) Date: 0/26/2007 We contnue talng about streamng algorthms n ths lecture, ncludng
More information= z 20 z n. (k 20) + 4 z k = 4
Problem Set #7 solutons 7.2.. (a Fnd the coeffcent of z k n (z + z 5 + z 6 + z 7 + 5, k 20. We use the known seres expanson ( n+l ( z l l z n below: (z + z 5 + z 6 + z 7 + 5 (z 5 ( + z + z 2 + z + 5 5
More informationHere is the rationale: If X and y have a strong positive relationship to one another, then ( x x) will tend to be positive when ( y y)
Secton 1.5 Correlaton In the prevous sectons, we looked at regresson and the value r was a measurement of how much of the varaton n y can be attrbuted to the lnear relatonshp between y and x. In ths secton,
More informationLinear Regression Analysis: Terminology and Notation
ECON 35* -- Secton : Basc Concepts of Regresson Analyss (Page ) Lnear Regresson Analyss: Termnology and Notaton Consder the generc verson of the smple (two-varable) lnear regresson model. It s represented
More informationP R. Lecture 4. Theory and Applications of Pattern Recognition. Dept. of Electrical and Computer Engineering /
Theory and Applcatons of Pattern Recognton 003, Rob Polkar, Rowan Unversty, Glassboro, NJ Lecture 4 Bayes Classfcaton Rule Dept. of Electrcal and Computer Engneerng 0909.40.0 / 0909.504.04 Theory & Applcatons
More informationComposite Hypotheses testing
Composte ypotheses testng In many hypothess testng problems there are many possble dstrbutons that can occur under each of the hypotheses. The output of the source s a set of parameters (ponts n a parameter
More informationFormulas for the Determinant
page 224 224 CHAPTER 3 Determnants e t te t e 2t 38 A = e t 2te t e 2t e t te t 2e 2t 39 If 123 A = 345, 456 compute the matrx product A adj(a) What can you conclude about det(a)? For Problems 40 43, use
More information3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X
Statstcs 1: Probablty Theory II 37 3 EPECTATION OF SEVERAL RANDOM VARIABLES As n Probablty Theory I, the nterest n most stuatons les not on the actual dstrbuton of a random vector, but rather on a number
More informationStructure and Drive Paul A. Jensen Copyright July 20, 2003
Structure and Drve Paul A. Jensen Copyrght July 20, 2003 A system s made up of several operatons wth flow passng between them. The structure of the system descrbes the flow paths from nputs to outputs.
More informationDr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur
Analyss of Varance and Desgn of Experment-I MODULE VII LECTURE - 3 ANALYSIS OF COVARIANCE Dr Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur Any scentfc experment s performed
More informationHidden Markov Models
CM229S: Machne Learnng for Bonformatcs Lecture 12-05/05/2016 Hdden Markov Models Lecturer: Srram Sankararaman Scrbe: Akshay Dattatray Shnde Edted by: TBD 1 Introducton For a drected graph G we can wrte
More informationDepartment of Statistics University of Toronto STA305H1S / 1004 HS Design and Analysis of Experiments Term Test - Winter Solution
Department of Statstcs Unversty of Toronto STA35HS / HS Desgn and Analyss of Experments Term Test - Wnter - Soluton February, Last Name: Frst Name: Student Number: Instructons: Tme: hours. Ads: a non-programmable
More informationStatistical Inference. 2.3 Summary Statistics Measures of Center and Spread. parameters ( population characteristics )
Ismor Fscher, 8//008 Stat 54 / -8.3 Summary Statstcs Measures of Center and Spread Dstrbuton of dscrete contnuous POPULATION Random Varable, numercal True center =??? True spread =???? parameters ( populaton
More informationComplex Variables. Chapter 18 Integration in the Complex Plane. March 12, 2013 Lecturer: Shih-Yuan Chen
omplex Varables hapter 8 Integraton n the omplex Plane March, Lecturer: Shh-Yuan hen Except where otherwse noted, content s lcensed under a BY-N-SA. TW Lcense. ontents ontour ntegrals auchy-goursat theorem
More informationNorms, Condition Numbers, Eigenvalues and Eigenvectors
Norms, Condton Numbers, Egenvalues and Egenvectors 1 Norms A norm s a measure of the sze of a matrx or a vector For vectors the common norms are: N a 2 = ( x 2 1/2 the Eucldean Norm (1a b 1 = =1 N x (1b
More informationDIFFERENTIAL SCHEMES
DIFFERENTIAL SCHEMES RAYMOND T. HOOBLER Dedcated to the memory o Jerry Kovacc 1. schemes All rngs contan Q and are commutatve. We x a d erental rng A throughout ths secton. 1.1. The topologcal space. Let
More informationLECTURE 9 CANONICAL CORRELATION ANALYSIS
LECURE 9 CANONICAL CORRELAION ANALYSIS Introducton he concept of canoncal correlaton arses when we want to quantfy the assocatons between two sets of varables. For example, suppose that the frst set of
More informationQuantum Mechanics I - Session 4
Quantum Mechancs I - Sesson 4 Aprl 3, 05 Contents Operators Change of Bass 4 3 Egenvectors and Egenvalues 5 3. Denton....................................... 5 3. Rotaton n D....................................
More informationRandomness and Computation
Randomness and Computaton or, Randomzed Algorthms Mary Cryan School of Informatcs Unversty of Ednburgh RC 208/9) Lecture 0 slde Balls n Bns m balls, n bns, and balls thrown unformly at random nto bns usually
More informationDr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur
Analyss of Varance and Desgn of Exerments-I MODULE III LECTURE - 2 EXPERIMENTAL DESIGN MODELS Dr. Shalabh Deartment of Mathematcs and Statstcs Indan Insttute of Technology Kanur 2 We consder the models
More informationMath 217 Fall 2013 Homework 2 Solutions
Math 17 Fall 013 Homework Solutons Due Thursday Sept. 6, 013 5pm Ths homework conssts of 6 problems of 5 ponts each. The total s 30. You need to fully justfy your answer prove that your functon ndeed has
More informationCredit Card Pricing and Impact of Adverse Selection
Credt Card Prcng and Impact of Adverse Selecton Bo Huang and Lyn C. Thomas Unversty of Southampton Contents Background Aucton model of credt card solctaton - Errors n probablty of beng Good - Errors n
More informationChapter 3 Differentiation and Integration
MEE07 Computer Modelng Technques n Engneerng Chapter Derentaton and Integraton Reerence: An Introducton to Numercal Computatons, nd edton, S. yakowtz and F. zdarovsky, Mawell/Macmllan, 990. Derentaton
More informationSIMPLE LINEAR REGRESSION
Smple Lnear Regresson and Correlaton Introducton Prevousl, our attenton has been focused on one varable whch we desgnated b x. Frequentl, t s desrable to learn somethng about the relatonshp between two
More informationJournal of Universal Computer Science, vol. 1, no. 7 (1995), submitted: 15/12/94, accepted: 26/6/95, appeared: 28/7/95 Springer Pub. Co.
Journal of Unversal Computer Scence, vol. 1, no. 7 (1995), 469-483 submtted: 15/12/94, accepted: 26/6/95, appeared: 28/7/95 Sprnger Pub. Co. Round-o error propagaton n the soluton of the heat equaton by
More informationU.C. Berkeley CS294: Spectral Methods and Expanders Handout 8 Luca Trevisan February 17, 2016
U.C. Berkeley CS94: Spectral Methods and Expanders Handout 8 Luca Trevsan February 7, 06 Lecture 8: Spectral Algorthms Wrap-up In whch we talk about even more generalzatons of Cheeger s nequaltes, and
More informationIntroduction ( Week 1-2) Course introduction A brief introduction to molecular biology A brief introduction to sequence comparison Part I: Algorithms
Course organzaton 1 Introducton Week 1-2) Course ntroducton A bref ntroducton to molecular bology A bref ntroducton to sequence comparson Part I: Algorthms for Sequence Analyss Week 3-8) Chapter 1-3 Models
More informationBIOINFORMATICS: PAST, PRESENT AND FUTURE. Susan R. Wilson Mathematical Sciences Institute, Australian National University, Australia
BIOINFORMATICS: PAST, PRESENT AND FUTURE Susan R. Wlson Mathematcal Scences Insttute, Australan Natonal Unversty, Australa Keywords: Bonformatcs, bologcal sequence analyss, sequence algnment, hdden Markov
More informationCourse organization. Part II: Algorithms for Network Biology (Week 12-16)
Course organzaton Introducton Week 1-2) Course ntroducton A bref ntroducton to molecular bology A bref ntroducton to sequence comparson Part I: Algorthms for Sequence Analyss Week 3-11) Chapter 1-3 Models
More informationESCI 341 Atmospheric Thermodynamics Lesson 10 The Physical Meaning of Entropy
ESCI 341 Atmospherc Thermodynamcs Lesson 10 The Physcal Meanng of Entropy References: An Introducton to Statstcal Thermodynamcs, T.L. Hll An Introducton to Thermodynamcs and Thermostatstcs, H.B. Callen
More informationErrors for Linear Systems
Errors for Lnear Systems When we solve a lnear system Ax b we often do not know A and b exactly, but have only approxmatons  and ˆb avalable. Then the best thng we can do s to solve ˆx ˆb exactly whch
More informationOn the Repeating Group Finding Problem
The 9th Workshop on Combnatoral Mathematcs and Computaton Theory On the Repeatng Group Fndng Problem Bo-Ren Kung, Wen-Hsen Chen, R.C.T Lee Graduate Insttute of Informaton Technology and Management Takmng
More information1 GSW Iterative Techniques for y = Ax
1 for y = A I m gong to cheat here. here are a lot of teratve technques that can be used to solve the general case of a set of smultaneous equatons (wrtten n the matr form as y = A), but ths chapter sn
More informationStanford University CS254: Computational Complexity Notes 7 Luca Trevisan January 29, Notes for Lecture 7
Stanford Unversty CS54: Computatonal Complexty Notes 7 Luca Trevsan January 9, 014 Notes for Lecture 7 1 Approxmate Countng wt an N oracle We complete te proof of te followng result: Teorem 1 For every
More information7. Products and matrix elements
7. Products and matrx elements 1 7. Products and matrx elements Based on the propertes of group representatons, a number of useful results can be derved. Consder a vector space V wth an nner product ψ
More information