11 Hidden Markov Models


Hidden Markov Models are a popular machine learning approach in bioinformatics. Machine learning algorithms are presented with training data, which are used to derive important insights about the (often hidden) parameters. Once an algorithm has been suitably trained, it can apply these insights to the analysis of a test sample. As the amount of training data increases, the accuracy of the machine learning algorithm typically increases as well. The parameters that are learned during training represent knowledge; application of the algorithm with those parameters to new data (not used in the training phase) represents the algorithm's use of that knowledge. The Hidden Markov Model (HMM) approach, considered in this chapter, learns some unknown probabilistic parameters from training samples and uses these parameters in the framework of dynamic programming (and other algorithmic techniques) to find the best explanation for the experimental data.

11.1 CG-Islands and the Fair Bet Casino

The least frequent dinucleotide in many genomes is CG. The reason for this is that the C within CG is easily methylated, and the resulting methyl-C has a tendency to mutate into T.[1] However, the methylation is often suppressed around genes in areas called CG-islands, in which CG appears relatively frequently. An important problem is to define and locate CG-islands in a long genomic text.

Footnote 1. Cells often biochemically modify DNA and proteins. Methylation is the most common DNA modification and results in the addition of a methyl (CH3) group to a nucleotide position in DNA.

Finding CG-islands can be modeled after the following toy gambling problem. The Fair Bet Casino has a game in which a dealer flips a coin and the player bets on the outcome (heads or tails). The dealer in this (crooked) casino uses either a fair coin (heads or tails are equally likely) or a biased coin that will give heads with a probability of 3/4. For security reasons, the dealer does not like to change coins, so this happens relatively rarely, with a probability of 0.1. Given a sequence of coin tosses, the problem is to find out when the dealer used the biased coin and when he used the fair coin, since this will help you, the player, learn the dealer's psychology and enable you to win money. Obviously, if you observe a long line of heads, it is likely that the dealer used the biased coin, whereas if you see an even distribution of heads and tails, he likely used the fair one. Though you can never be certain that a long string of heads is not just a fluke, you are primarily interested in the most probable explanation of the data. Based on this sensible intuition, we might formulate the problem as follows:

Fair Bet Casino Problem:
Given a sequence of coin tosses, determine when the dealer used a fair coin and when he used a biased coin.
    Input: A sequence $x = x_1 x_2 x_3 \ldots x_n$ of coin tosses (either H or T) made by two possible coins (F or B).
    Output: A sequence $\pi = \pi_1 \pi_2 \pi_3 \ldots \pi_n$, with each $\pi_i$ being either F or B, indicating that $x_i$ is the result of tossing the fair or biased coin, respectively.

Unfortunately, this problem formulation simply makes no sense. The ambiguity is that any sequence of coins could possibly have generated the observed outcomes, so technically $\pi = FFF \ldots FF$ is a valid answer to this problem for every observed sequence of coin flips, as is $\pi = BBB \ldots BB$. We need to incorporate a way to grade different coin sequences as being better answers than others. Below we explain how to turn this ill-defined problem into the Decoding problem based on the HMM paradigm. First, we consider the problem under the assumption that the dealer never changes coins.
In this case, letting 0 denote tails and 1 heads, the question is which of the two coins he used, fair ($p^+(0) = p^+(1) = \frac{1}{2}$) or biased ($p^-(0) = \frac{1}{4}$, $p^-(1) = \frac{3}{4}$). If the resulting sequence of tosses is $x = x_1 \ldots x_n$, then the probability that $x$ was generated by a fair coin is[2]

$$P(x \mid \text{fair coin}) = \prod_{i=1}^{n} p^+(x_i) = \frac{1}{2^n}.$$

On the other hand, the probability that $x$ was generated by a biased coin is

$$P(x \mid \text{biased coin}) = \prod_{i=1}^{n} p^-(x_i) = \left(\frac{1}{4}\right)^{n-k} \left(\frac{3}{4}\right)^{k} = \frac{3^k}{4^n}.$$

Here $k$ is the number of heads in $x$. If $P(x \mid \text{fair coin}) > P(x \mid \text{biased coin})$, then the dealer most likely used a fair coin; if $P(x \mid \text{fair coin}) < P(x \mid \text{biased coin})$, then the dealer most likely used a biased coin. The probabilities $P(x \mid \text{fair coin}) = \frac{1}{2^n}$ and $P(x \mid \text{biased coin}) = \frac{3^k}{4^n}$ become equal at $k = \frac{n}{\log_2 3}$. As a result, when $k < \frac{n}{\log_2 3}$, the dealer most likely used a fair coin, and when $k > \frac{n}{\log_2 3}$, he most likely used a biased coin. We can define the log-odds ratio as follows:

$$\log_2 \frac{P(x \mid \text{fair coin})}{P(x \mid \text{biased coin})} = \sum_{i=1}^{n} \log_2 \frac{p^+(x_i)}{p^-(x_i)} = n - k \log_2 3.$$

Footnote 2. The notation $P(x \mid y)$ is shorthand for the probability of $x$ occurring under the assumption that (some condition) $y$ is true. The notation $\prod_{i=1}^{n} a_i$ means $a_1 \cdot a_2 \cdot a_3 \cdots a_n$.

However, we know that the dealer does change coins, albeit rarely. One approach to making an educated guess as to which coin the dealer used at each point would be to slide a window of some width along the sequence of coin flips and calculate the log-odds ratio of the sequence under each window. In effect, this is considering the log-odds ratio of short regions of the sequence. If the log-odds ratio of the short sequence falls below 0, then the dealer most likely used a biased coin while generating this window of sequence; otherwise the dealer most likely used a fair coin.

Similarly, a naive approach to finding CG-islands in long DNA sequences is to calculate log-odds ratios for a sliding window of some particular length, and to declare windows that receive positive scores to be potential CG-islands. Of course, the disadvantage of this approach is that we do not know the length of CG-islands in advance and that some overlapping windows may classify the same nucleotide differently. HMMs represent a different probabilistic approach to this problem.
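The sliding-window heuristic described above can be sketched in a few lines of Python. This is a minimal illustration rather than code from the chapter: the function names `log_odds` and `window_calls` are hypothetical, tosses are assumed to be given as a string of '0' (tails) and '1' (heads), and the biased coin is assumed to give heads with probability 3/4, so that a window of length n with k heads has log-odds ratio n - k log2(3).

```python
import math

LOG2_3 = math.log2(3)  # assumes the biased coin gives heads with probability 3/4


def log_odds(tosses):
    """Return log2 of P(tosses | fair) / P(tosses | biased), i.e. n - k*log2(3).

    `tosses` is a string of '0' (tails) and '1' (heads).
    """
    n = len(tosses)
    k = tosses.count("1")  # number of heads
    return n - k * LOG2_3


def window_calls(tosses, width):
    """Label every window of the given width as 'F' (fair) or 'B' (biased).

    A window is called biased when its log-odds ratio falls below 0.
    The i-th entry of the result describes the window starting at toss i.
    """
    calls = []
    for start in range(len(tosses) - width + 1):
        score = log_odds(tosses[start:start + width])
        calls.append("B" if score < 0 else "F")
    return calls


if __name__ == "__main__":
    print(window_calls("00001111", 4))
```

For the tosses 00001111 and a window of width 4, the first three windows are called fair and the last two biased. The fifth toss lies in two windows called fair and two called biased, which illustrates the ambiguity of overlapping windows noted above.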

11.2 The Fair Bet Casino and Hidden Markov Models

An HMM can be viewed as an abstract machine that has an ability to produce some output using coin tossing. The operation of the machine proceeds in discrete steps: at the beginning of each step, the machine is in a hidden state, of which there are $k$. During the step, the HMM makes two decisions: (1) "What state will I move to next?" and (2) "What symbol from an alphabet $\Sigma$ will I emit?" The HMM decides on the former by choosing randomly among the $k$ states; it decides on the latter by choosing randomly among the $|\Sigma|$ symbols. The choices that the HMM makes are typically biased, and may follow arbitrary probabilities. Moreover, the probability distributions[3] that govern which state to move to and which symbols to emit change from state to state. In essence, if there are $k$ states, then there are $k$ different "next state" distributions and $k$ different "symbol emission" distributions. An important feature of HMMs is that an observer can see the emitted symbols but has no ability to see what state the HMM is in at any step, hence the name Hidden Markov Models. The goal of the observer is to infer the most likely states of the HMM by analyzing the sequences of emitted symbols. Since an HMM effectively uses dice to emit symbols, the sequence of symbols it produces does not form any readily recognizable pattern.

Footnote 3. A probability distribution is simply an assignment of probabilities to outcomes; in this case, the outcomes are either symbols to emit or states to move to. We have seen probability distributions, in a disguised form, in the context of motif finding. Every column of a profile, when each element is divided by the number of sequences in the sample, forms a probability distribution.

Formally, an HMM $M$ is defined by an alphabet of emitted symbols $\Sigma$, a set of (hidden) states $Q$, a matrix of state transition probabilities $A$, and a matrix of emission probabilities $E$, where

    $\Sigma$ is an alphabet of symbols;
    $Q$ is a set of states, each of which will emit symbols from the alphabet $\Sigma$;
    $A = (a_{kl})$ is a $|Q| \times |Q|$ matrix describing the probability of changing to state $l$ after the HMM is in state $k$; and
    $E = (e_k(b))$ is a $|Q| \times |\Sigma|$ matrix describing the probability of emitting the symbol $b$ during a step in which the HMM is in state $k$.

Each row of the matrix $A$ describes a "state die"[4] with $|Q|$ sides, while each row of the matrix $E$ describes a "symbol die" with $|\Sigma|$ sides. The Fair Bet Casino process corresponds to the following HMM $M(\Sigma, Q, A, E)$ shown in figure 11.1:

    $\Sigma = \{0, 1\}$, corresponding to tails (0) or heads (1)
    $Q = \{F, B\}$, corresponding to a fair (F) or biased (B) coin
    $a_{FF} = a_{BB} = 0.9$, $a_{FB} = a_{BF} = 0.1$
    $e_F(0) = \frac{1}{2}$, $e_F(1) = \frac{1}{2}$, $e_B(0) = \frac{1}{4}$, $e_B(1) = \frac{3}{4}$

Footnote 4. Singular of "dice."

Figure 11.1 The HMM designed for the Fair Bet Casino problem. There are two states: F (fair) and B (biased). From each state, the HMM can emit either heads (H) or tails (T), with the probabilities shown. The HMM will switch between F and B with probability 1/10.

A path $\pi = \pi_1 \ldots \pi_n$ in the HMM $M$ is a sequence of states. For example, if a dealer used the fair coin for the first three and the last three tosses and the biased coin for five tosses in between, the corresponding path $\pi$ would be $\pi = FFFBBBBBFFF$. If the resulting sequence of tosses is 01011101001, then the following shows the matching of $x$ to $\pi$ and the probability of $x_i$ being generated by $\pi_i$ at each flip:

    x               0    1    0    1    1    1    0    1    0    0    1
    π               F    F    F    B    B    B    B    B    F    F    F
    P(x_i | π_i)    1/2  1/2  1/2  3/4  3/4  3/4  1/4  3/4  1/2  1/2  1/2

We write $P(x_i \mid \pi_i)$ to denote the probability that symbol $x_i$ was emitted from state $\pi_i$; these values are given by the matrix $E$. We write $P(\pi_i \to \pi_{i+1})$ to denote the probability of the transition from state $\pi_i$ to $\pi_{i+1}$; these values are given by the matrix $A$.

The path $\pi = FFFBBBBBFFF$ includes only two switches of coins, first from F to B (after the third step), and second from B to F (after the eighth step). The probability of these two switches, $\pi_3 \to \pi_4$ and $\pi_8 \to \pi_9$, is $\frac{1}{10}$, while the probability of all other transitions, $\pi_{i-1} \to \pi_i$, is $\frac{9}{10}$, as shown below:[5]

    x                  0    1     1     1     1     1     0     1     0     0     1
    π                  F    F     F     B     B     B     B     B     F     F     F
    P(x_i | π_i)       1/2  1/2   1/2   3/4   3/4   3/4   1/4   3/4   1/2   1/2   1/2
    P(π_{i-1} → π_i)   1/2  9/10  9/10  1/10  9/10  9/10  9/10  9/10  1/10  9/10  9/10

Here the first row repeats the toss sequence $x = 01011101001$ (that is, 0 1 0 1 1 1 0 1 0 0 1).

Footnote 5. We have added a fictitious term, $P(\pi_0 \to \pi_1) = \frac{1}{2}$, to model the initial condition: the dealer is equally likely to have either a fair or a biased coin before the first flip.

The probability of generating $x$ through the path $\pi$ (assuming for simplicity that in the first moment the dealer is equally likely to have a fair or a biased coin) is roughly $2.66 \times 10^{-6}$ and is computed as:

$$\left(\tfrac{1}{2} \cdot \tfrac{1}{2}\right) \left(\tfrac{1}{2} \cdot \tfrac{9}{10}\right) \left(\tfrac{1}{2} \cdot \tfrac{9}{10}\right) \left(\tfrac{3}{4} \cdot \tfrac{1}{10}\right) \left(\tfrac{3}{4} \cdot \tfrac{9}{10}\right) \left(\tfrac{3}{4} \cdot \tfrac{9}{10}\right) \left(\tfrac{1}{4} \cdot \tfrac{9}{10}\right) \left(\tfrac{3}{4} \cdot \tfrac{9}{10}\right) \left(\tfrac{1}{2} \cdot \tfrac{1}{10}\right) \left(\tfrac{1}{2} \cdot \tfrac{9}{10}\right) \left(\tfrac{1}{2} \cdot \tfrac{9}{10}\right)$$

In the above example, we assumed that we knew $\pi$ and observed $x$. However, in reality we do not have access to $\pi$. If you only observe that $x = 01011101001$, then you might ask yourself whether or not $\pi = FFFBBBBBFFF$ is the best explanation for $x$. Furthermore, if it is not the best explanation, is it possible to reconstruct the best one? It turns out that FFFBBBBBFFF is not the most probable path for $x = 01011101001$: FFFBBBFFFFF is slightly better, with probability of roughly $3.55 \times 10^{-6}$.

    x                  0    1     0     1     1     1     0     1     0     0     1
    π                  F    F     F     B     B     B     F     F     F     F     F
    P(x_i | π_i)       1/2  1/2   1/2   3/4   3/4   3/4   1/2   1/2   1/2   1/2   1/2
    P(π_{i-1} → π_i)   1/2  9/10  9/10  1/10  9/10  9/10  1/10  9/10  9/10  9/10  9/10

The probability that sequence $x$ was generated by the path $\pi$, given the model $M$, is

$$P(x \mid \pi) = P(\pi_0 \to \pi_1) \cdot \prod_{i=1}^{n} P(x_i \mid \pi_i)\, P(\pi_i \to \pi_{i+1}) = a_{\pi_0, \pi_1} \cdot \prod_{i=1}^{n} e_{\pi_i}(x_i) \cdot a_{\pi_i, \pi_{i+1}}.$$
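The path probability above can be checked numerically with a short Python sketch. Everything here is illustrative: the dictionaries `A`, `E`, and `INITIAL` and the function `path_probability` are hypothetical names, the parameters are assumed to be those of the Fair Bet Casino HMM (switch probability 1/10, biased coin giving heads with probability 3/4, initial term 1/2), and the toss sequence 01011101001 is an assumed example with eleven tosses and five tails. As in the worked example, the sketch omits the transition into the fictitious terminal state.

```python
from fractions import Fraction as Fr

# Assumed Fair Bet Casino parameters (hypothetical variable names).
A = {  # A[k][l]: probability of moving from state k to state l
    "F": {"F": Fr(9, 10), "B": Fr(1, 10)},
    "B": {"F": Fr(1, 10), "B": Fr(9, 10)},
}
E = {  # E[k][b]: probability of emitting symbol b in state k
    "F": {"0": Fr(1, 2), "1": Fr(1, 2)},
    "B": {"0": Fr(1, 4), "1": Fr(3, 4)},
}
INITIAL = {"F": Fr(1, 2), "B": Fr(1, 2)}  # fictitious term P(pi_0 -> pi_1)


def path_probability(x, path):
    """Return P(x | path) as an exact fraction.

    `x` is a string of '0'/'1' tosses and `path` a string of 'F'/'B'
    states of the same length. The terminal transition is not included.
    """
    if not x or len(x) != len(path):
        raise ValueError("x and path must be non-empty and of equal length")
    prob = INITIAL[path[0]] * E[path[0]][x[0]]
    for i in range(1, len(x)):
        # One transition into state path[i], then one emission from it.
        prob *= A[path[i - 1]][path[i]] * E[path[i]][x[i]]
    return prob


if __name__ == "__main__":
    x = "01011101001"  # assumed toss sequence
    for path in ("FFFBBBBBFFF", "FFFBBBFFFFF"):
        print(path, float(path_probability(x, path)))
```

Under these assumptions the two paths evaluate to roughly 2.66e-6 and 3.55e-6, and the second is exactly 4/3 times the first, because the two paths differ only in which coin emits the seventh and eighth tosses.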

For convenience, we have introduced $\pi_0$ and $\pi_{n+1}$ as the fictitious initial and terminal states begin and end.

This model defines the probability $P(x \mid \pi)$ for a given sequence $x$ and a given path $\pi$. Since only the dealer knows the real sequence of states $\pi$ that emitted $x$, we say that $\pi$ is hidden and attempt to solve the following Decoding problem:

Decoding Problem:
Find an optimal hidden path of states given observations.
    Input: Sequence of observations $x = x_1 \ldots x_n$ generated by an HMM $M(\Sigma, Q, A, E)$.
    Output: A path that maximizes $P(x \mid \pi)$ over all possible paths $\pi$.

The Decoding problem is an improved formulation of the ill-defined Fair Bet Casino problem.

11.3 Decoding Algorithm

In 1967 Andrew Viterbi used an HMM-inspired analog of the Manhattan grid for the Decoding problem, and described an efficient dynamic programming algorithm for its solution. Viterbi's Manhattan is shown in figure 11.2, with every choice of $\pi_1, \ldots, \pi_n$ corresponding to a path in this graph. One can set the edge weights in this graph so that the product of the edge weights for path $\pi = \pi_1 \ldots \pi_n$ equals $P(x \mid \pi)$. There are $|Q|^2 (n - 1)$ edges in this graph, with the weight of an edge from $(k, i)$ to $(l, i + 1)$ given by $e_l(x_{i+1}) \cdot a_{kl}$. Unlike the alignment approaches covered in chapter 6, where the set of valid directions was restricted to south, east, and southeast edges, the Manhattan built to solve the decoding problem only forces the tourists to move in any eastward direction (e.g., northeast, east, southeast, etc.), and places no additional restrictions (fig. 11.3). To see why the length of the edge between the vertices $(k, i)$ and $(l, i + 1)$ in the corresponding graph is given by $e_l(x_{i+1}) \cdot a_{kl}$, one should compare $p_{k,i}$ [the probability of a path ending in vertex $(k, i)$] with ...
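The graph formulation of the Decoding problem can be sketched as a dynamic program over columns of that graph. The following Python is a hedged illustration, not the chapter's own pseudocode: the names `decode`, `TRANSITION`, `EMISSION`, and `INITIAL` are hypothetical, the parameters are assumed to be those of the Fair Bet Casino HMM (with both coins equally likely before the first flip), and logarithms are used so that the product of edge weights becomes a sum. The terminal transition is omitted.

```python
import math

# Assumed Fair Bet Casino parameters (hypothetical variable names).
STATES = ("F", "B")
TRANSITION = {("F", "F"): 0.9, ("F", "B"): 0.1,
              ("B", "F"): 0.1, ("B", "B"): 0.9}
EMISSION = {("F", "0"): 0.5, ("F", "1"): 0.5,
            ("B", "0"): 0.25, ("B", "1"): 0.75}
INITIAL = {"F": 0.5, "B": 0.5}


def decode(x):
    """Return (path, probability) for a path maximizing P(x | path).

    `x` is a non-empty string of '0' (tails) and '1' (heads).
    """
    if not x:
        raise ValueError("x must be non-empty")
    # score[k]: log-probability of the best path ending in state k
    # at the current column of the graph.
    score = {k: math.log(INITIAL[k] * EMISSION[k, x[0]]) for k in STATES}
    back = []  # back[i][l]: best predecessor of state l at column i + 1
    for symbol in x[1:]:
        new_score, pointers = {}, {}
        for l in STATES:
            best_k = max(STATES,
                         key=lambda k: score[k] + math.log(TRANSITION[k, l]))
            # Edge weight from (best_k, i) to (l, i + 1) is e_l(x_{i+1}) * a_kl.
            new_score[l] = score[best_k] + math.log(
                TRANSITION[best_k, l] * EMISSION[l, symbol])
            pointers[l] = best_k
        score = new_score
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    last = max(STATES, key=score.get)
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return "".join(path), math.exp(score[last])


if __name__ == "__main__":
    print(decode("0" * 10 + "1" * 10))
```

With these assumed parameters, ten tails followed by ten heads decode to ten F states followed by ten B states. Each coin switch multiplies the path probability by 1/10 rather than 9/10, so short runs of heads are usually not enough to justify a switch. The running time is proportional to $|Q|^2 n$, matching the number of edges in the graph.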


Riemann Sums y = f (x)

Riemann Sums y = f (x) Riema Sums Recall that we have previously discussed the area problem I its simplest form we ca state it this way: The Area Problem Let f be a cotiuous, o-egative fuctio o the closed iterval [a, b] Fid

More information

Seunghee Ye Ma 8: Week 5 Oct 28

Seunghee Ye Ma 8: Week 5 Oct 28 Week 5 Summary I Sectio, we go over the Mea Value Theorem ad its applicatios. I Sectio 2, we will recap what we have covered so far this term. Topics Page Mea Value Theorem. Applicatios of the Mea Value

More information

Information-based Feature Selection

Information-based Feature Selection Iformatio-based Feature Selectio Farza Faria, Abbas Kazeroui, Afshi Babveyh Email: {faria,abbask,afshib}@staford.edu 1 Itroductio Feature selectio is a topic of great iterest i applicatios dealig with

More information

Principle Of Superposition

Principle Of Superposition ecture 5: PREIMINRY CONCEP O RUCUR NYI Priciple Of uperpositio Mathematically, the priciple of superpositio is stated as ( a ) G( a ) G( ) G a a or for a liear structural system, the respose at a give

More information

Recursive Algorithm for Generating Partitions of an Integer. 1 Preliminary

Recursive Algorithm for Generating Partitions of an Integer. 1 Preliminary Recursive Algorithm for Geeratig Partitios of a Iteger Sug-Hyuk Cha Computer Sciece Departmet, Pace Uiversity 1 Pace Plaza, New York, NY 10038 USA scha@pace.edu Abstract. This article first reviews the

More information

Some examples of vector spaces

Some examples of vector spaces Roberto s Notes o Liear Algebra Chapter 11: Vector spaces Sectio 2 Some examples of vector spaces What you eed to kow already: The te axioms eeded to idetify a vector space. What you ca lear here: Some

More information

UNIT 2 DIFFERENT APPROACHES TO PROBABILITY THEORY

UNIT 2 DIFFERENT APPROACHES TO PROBABILITY THEORY UNIT 2 DIFFERENT APPROACHES TO PROBABILITY THEORY Structure 2.1 Itroductio Objectives 2.2 Relative Frequecy Approach ad Statistical Probability 2. Problems Based o Relative Frequecy 2.4 Subjective Approach

More information

Recurrence Relations

Recurrence Relations Recurrece Relatios Aalysis of recursive algorithms, such as: it factorial (it ) { if (==0) retur ; else retur ( * factorial(-)); } Let t be the umber of multiplicatios eeded to calculate factorial(). The

More information

Frequentist Inference

Frequentist Inference Frequetist Iferece The topics of the ext three sectios are useful applicatios of the Cetral Limit Theorem. Without kowig aythig about the uderlyig distributio of a sequece of radom variables {X i }, for

More information

Discrete Mathematics and Probability Theory Fall 2016 Walrand Probability: An Overview

Discrete Mathematics and Probability Theory Fall 2016 Walrand Probability: An Overview CS 70 Discrete Mathematics ad Probability Theory Fall 2016 Walrad Probability: A Overview Probability is a fasciatig theory. It provides a precise, clea, ad useful model of ucertaity. The successes of

More information

CS322: Network Analysis. Problem Set 2 - Fall 2009

CS322: Network Analysis. Problem Set 2 - Fall 2009 Due October 9 009 i class CS3: Network Aalysis Problem Set - Fall 009 If you have ay questios regardig the problems set, sed a email to the course assistats: simlac@staford.edu ad peleato@staford.edu.

More information

Probability theory and mathematical statistics:

Probability theory and mathematical statistics: N.I. Lobachevsky State Uiversity of Nizhi Novgorod Probability theory ad mathematical statistics: Law of Total Probability. Associate Professor A.V. Zorie Law of Total Probability. 1 / 14 Theorem Let H

More information

TEACHER CERTIFICATION STUDY GUIDE

TEACHER CERTIFICATION STUDY GUIDE COMPETENCY 1. ALGEBRA SKILL 1.1 1.1a. ALGEBRAIC STRUCTURES Kow why the real ad complex umbers are each a field, ad that particular rigs are ot fields (e.g., itegers, polyomial rigs, matrix rigs) Algebra

More information

Problem Set 2 Solutions

Problem Set 2 Solutions CS271 Radomess & Computatio, Sprig 2018 Problem Set 2 Solutios Poit totals are i the margi; the maximum total umber of poits was 52. 1. Probabilistic method for domiatig sets 6pts Pick a radom subset S

More information

CS / MCS 401 Homework 3 grader solutions

CS / MCS 401 Homework 3 grader solutions CS / MCS 401 Homework 3 grader solutios assigmet due July 6, 016 writte by Jāis Lazovskis maximum poits: 33 Some questios from CLRS. Questios marked with a asterisk were ot graded. 1 Use the defiitio of

More information

Machine Learning Theory Tübingen University, WS 2016/2017 Lecture 11

Machine Learning Theory Tübingen University, WS 2016/2017 Lecture 11 Machie Learig Theory Tübige Uiversity, WS 06/07 Lecture Tolstikhi Ilya Abstract We will itroduce the otio of reproducig kerels ad associated Reproducig Kerel Hilbert Spaces (RKHS). We will cosider couple

More information

OPTIMAL ALGORITHMS -- SUPPLEMENTAL NOTES

OPTIMAL ALGORITHMS -- SUPPLEMENTAL NOTES OPTIMAL ALGORITHMS -- SUPPLEMENTAL NOTES Peter M. Maurer Why Hashig is θ(). As i biary search, hashig assumes that keys are stored i a array which is idexed by a iteger. However, hashig attempts to bypass

More information

Approximations and more PMFs and PDFs

Approximations and more PMFs and PDFs Approximatios ad more PMFs ad PDFs Saad Meimeh 1 Approximatio of biomial with Poisso Cosider the biomial distributio ( b(k,,p = p k (1 p k, k λ: k Assume that is large, ad p is small, but p λ at the limit.

More information

Intro to Learning Theory

Intro to Learning Theory Lecture 1, October 18, 2016 Itro to Learig Theory Ruth Urer 1 Machie Learig ad Learig Theory Comig soo 2 Formal Framework 21 Basic otios I our formal model for machie learig, the istaces to be classified

More information

End-of-Year Contest. ERHS Math Club. May 5, 2009

End-of-Year Contest. ERHS Math Club. May 5, 2009 Ed-of-Year Cotest ERHS Math Club May 5, 009 Problem 1: There are 9 cois. Oe is fake ad weighs a little less tha the others. Fid the fake coi by weighigs. Solutio: Separate the 9 cois ito 3 groups (A, B,

More information

Lecture 1: Basic problems of coding theory

Lecture 1: Basic problems of coding theory Lecture 1: Basic problems of codig theory Error-Correctig Codes (Sprig 016) Rutgers Uiversity Swastik Kopparty Scribes: Abhishek Bhrushudi & Aditya Potukuchi Admiistrivia was discussed at the begiig of

More information

Castiel, Supernatural, Season 6, Episode 18

Castiel, Supernatural, Season 6, Episode 18 13 Differetial Equatios the aswer to your questio ca best be epressed as a series of partial differetial equatios... Castiel, Superatural, Seaso 6, Episode 18 A differetial equatio is a mathematical equatio

More information

Linear Programming and the Simplex Method

Linear Programming and the Simplex Method Liear Programmig ad the Simplex ethod Abstract This article is a itroductio to Liear Programmig ad usig Simplex method for solvig LP problems i primal form. What is Liear Programmig? Liear Programmig is

More information

Statisticians use the word population to refer the total number of (potential) observations under consideration

Statisticians use the word population to refer the total number of (potential) observations under consideration 6 Samplig Distributios Statisticias use the word populatio to refer the total umber of (potetial) observatios uder cosideratio The populatio is just the set of all possible outcomes i our sample space

More information

Hashing and Amortization

Hashing and Amortization Lecture Hashig ad Amortizatio Supplemetal readig i CLRS: Chapter ; Chapter 7 itro; Sectio 7.. Arrays ad Hashig Arrays are very useful. The items i a array are statically addressed, so that isertig, deletig,

More information

Lecture 12: November 13, 2018

Lecture 12: November 13, 2018 Mathematical Toolkit Autum 2018 Lecturer: Madhur Tulsiai Lecture 12: November 13, 2018 1 Radomized polyomial idetity testig We will use our kowledge of coditioal probability to prove the followig lemma,

More information

Introduction to Computational Molecular Biology. Gibbs Sampling

Introduction to Computational Molecular Biology. Gibbs Sampling 18.417 Itroductio to Computatioal Molecular Biology Lecture 19: November 16, 2004 Scribe: Tushara C. Karuarata Lecturer: Ross Lippert Editor: Tushara C. Karuarata Gibbs Samplig Itroductio Let s first recall

More information

Lecture 1 Probability and Statistics

Lecture 1 Probability and Statistics Wikipedia: Lecture 1 Probability ad Statistics Bejami Disraeli, British statesma ad literary figure (1804 1881): There are three kids of lies: lies, damed lies, ad statistics. popularized i US by Mark

More information

Vector Quantization: a Limiting Case of EM

Vector Quantization: a Limiting Case of EM . Itroductio & defiitios Assume that you are give a data set X = { x j }, j { 2,,, }, of d -dimesioal vectors. The vector quatizatio (VQ) problem requires that we fid a set of prototype vectors Z = { z

More information

Advanced Stochastic Processes.

Advanced Stochastic Processes. Advaced Stochastic Processes. David Gamarik LECTURE 2 Radom variables ad measurable fuctios. Strog Law of Large Numbers (SLLN). Scary stuff cotiued... Outlie of Lecture Radom variables ad measurable fuctios.

More information

Pixel Recurrent Neural Networks

Pixel Recurrent Neural Networks Pixel Recurret Neural Networks Aa ro va de Oord, Nal Kalchbreer, Koray Kavukcuoglu Google DeepMid August 2016 Preseter - Neha M Example problem (completig a image) Give the first half of the image, create

More information

CSE 202 Homework 1 Matthias Springer, A Yes, there does always exist a perfect matching without a strong instability.

CSE 202 Homework 1 Matthias Springer, A Yes, there does always exist a perfect matching without a strong instability. CSE 0 Homework 1 Matthias Spriger, A9950078 1 Problem 1 Notatio a b meas that a is matched to b. a < b c meas that b likes c more tha a. Equality idicates a tie. Strog istability Yes, there does always

More information

Math 10A final exam, December 16, 2016

Math 10A final exam, December 16, 2016 Please put away all books, calculators, cell phoes ad other devices. You may cosult a sigle two-sided sheet of otes. Please write carefully ad clearly, USING WORDS (ot just symbols). Remember that the

More information

IP Reference guide for integer programming formulations.

IP Reference guide for integer programming formulations. IP Referece guide for iteger programmig formulatios. by James B. Orli for 15.053 ad 15.058 This documet is iteded as a compact (or relatively compact) guide to the formulatio of iteger programs. For more

More information

Introduction to Probability and Statistics Twelfth Edition

Introduction to Probability and Statistics Twelfth Edition Itroductio to Probability ad Statistics Twelfth Editio Robert J. Beaver Barbara M. Beaver William Medehall Presetatio desiged ad writte by: Barbara M. Beaver Itroductio to Probability ad Statistics Twelfth

More information

MA238 Assignment 4 Solutions (part a)

MA238 Assignment 4 Solutions (part a) (i) Sigle sample tests. Questio. MA38 Assigmet 4 Solutios (part a) (a) (b) (c) H 0 : = 50 sq. ft H A : < 50 sq. ft H 0 : = 3 mpg H A : > 3 mpg H 0 : = 5 mm H A : 5mm Questio. (i) What are the ull ad alterative

More information

1. By using truth tables prove that, for all statements P and Q, the statement

1. By using truth tables prove that, for all statements P and Q, the statement Author: Satiago Salazar Problems I: Mathematical Statemets ad Proofs. By usig truth tables prove that, for all statemets P ad Q, the statemet P Q ad its cotrapositive ot Q (ot P) are equivalet. I example.2.3

More information